Genome Sequence and Comparative Pathogenic Determinants of Multidrug Resistant Uropathogenic Escherichia coli O25b:H4, A Clinical Isolate from Saudi Arabia

Essam J. Alyamani, Anamil M. Khiyami, Rayan Y. Booq, Fayez S. Bahwerth, Benjamin Vaisvil, Daniel P. Schmitt and Vinayak Kapatral

Escherichia coli serotype O25b:H4 is involved in human urinary tract infections (UTIs). In this study, we sequenced and analyzed E. coli O25b:H4 isolated from a patient suffering from recurrent UTIs in an intensive care unit at Hera General Hospital in Makkah, Saudi Arabia. We aimed to determine the virulence genes for pathogenesis and drug resistance of this isolate compared to other E. coli strains. We sequenced the E. coli O25b:H4 Saudi clinical isolate using next-generation sequencing. Using the ERGO genome analysis platform, we performed annotations and identified virulence and antibiotic resistance determinants of this clinical isolate. The E. coli O25b:H4 genome was assembled into four contigs representing a total chromosome size of 5.28 Mb, and three contigs were identified: a 130.9 kb contig (virulence plasmid) bearing the bla-CTX gene, and contigs of 32 kb and 29 kb. In comparing this genome to other uropathogenic E. coli genomes, we identified unique drug resistance and pathogenicity factors. In this work, whole-genome sequencing and targeted comparative analysis of a clinical isolate of uropathogenic Escherichia coli O25b:H4 were performed. This strain encodes virulence genes linked with extraintestinal pathogenic E. coli (ExPEC) that are expressed constitutively in E. coli ST131. We identified the genes responsible for pathogenesis and drug resistance and performed comparative analyses of the virulence and antibiotic resistance determinants with those of other UPEC isolates. This is the first report of genome sequencing and analysis of a UPEC strain from Saudi Arabia.

Published Nov 5, 2016. DOI: 10.22207/JPAM.10.4.01

Genome of Methanocaldococcus (Methanococcus) jannaschii.

Graham DE, Kyrpides N, Anderson IJ, Overbeek R, Whitman WB.

Methanocaldococcus (Methanococcus) jannaschii strain JAL-1 is a hyperthermophilic methanogenic archaeon that was isolated from surface material collected at a “white smoker” chimney at a depth of 2600 m in the East Pacific Rise near the western coast of Mexico. Cells are irregular cocci possessing polar bundles of flagella. The cell envelope is composed of a cytoplasmic membrane and a protein surface layer. Similar isolates have been obtained from hydrothermally active sediments in the Guaymas Basin and the Mid-Atlantic Ridge, and related species have been found at other marine hydrothermal vents. Because these hyperthermophilic species are very different from the mesophilic methanococci, they have been reclassified into a new family, Methanocaldococcaceae, and two new genera, Methanocaldococcus and Methanotorris. The characteristics of the source material for these isolates suggest that they possess adaptations for growth at high temperature and pressure as well as moderate salinity.

Methods Enzymol. 2001;330:40-123. doi:10.1016/S0076-6879(01)30370-1

Bioinformatics classification and functional analysis of PhoH homologs.

Kazakov AE, Vassieva O, Gelfand MS, Osterman A, Overbeek R.

PhoH protein is a putative ATPase belonging to the phosphate regulon in Escherichia coli. EC-PhoH homologs are present in different organisms, but it is not clear whether they are functionally related, and nothing is known about their regulation. To distinguish true functional orthologs of EC-PhoH in different classes of bacteria and to identify their functional role in the bacterial metabolic network, we performed a phylogenetic analysis of these proteins and a comparative study of the position and regulation of the related genes. Three groups of proteins were identified. Proteins of the first group (BS-PhoH orthologs) are present in most bacteria and are proposed to be functionally linked to phospholipid metabolism and RNA modification. Proteins of the second group (BS-YlaK orthologs) are present in most aerobes, and actinobacterial YlaK orthologs are shown to be members of a fatty acid beta-oxidation regulon. EC-PhoH orthologs are classified in a third group, specific for Enterobacteria. The functional role of PhoH homologs in lipid and RNA metabolism and the proposed interrelation of PhoH paralogs in one organism are discussed.

In Silico Biol. 2003;3(1-2):3-15. Epub 2002 Dec 30

From genetic footprinting to antimicrobial drug targets: examples in cofactor biosynthetic pathways.

Gerdes SY, Scholle MD, D'Souza M, Bernal A, Baev MV, Farrell M, Kurnasov OV, Daugherty MD, Mseeh F, Polanuyer BM, Campbell JW, Anantha S, Shatalin KY, Chowdhury SA, Fonstein MY, Osterman AL.

Novel drug targets are required in order to design new defenses against antibiotic-resistant pathogens. Comparative genomics provides new opportunities for finding optimal targets among previously unexplored cellular functions, based on an understanding of related biological processes in bacterial pathogens and their hosts. We describe an integrated approach to identification and prioritization of broad-spectrum drug targets. Our strategy is based on genetic footprinting in Escherichia coli followed by metabolic context analysis of essential gene orthologs in various species. Genes required for viability of E. coli in rich medium were identified on a whole-genome scale using the genetic footprinting technique. Potential target pathways were deduced from these data and compared with a panel of representative bacterial pathogens by using metabolic reconstructions from genomic data. Conserved and indispensable functions revealed by this analysis potentially represent broad-spectrum antibacterial targets. Further target prioritization involves comparison of the corresponding pathways and individual functions between pathogens and the human host. The most promising targets are validated by direct knockouts in model pathogens. The efficacy of this approach is illustrated using examples from metabolism of adenylate cofactors NAD(P), coenzyme A, and flavin adenine dinucleotide. Several drug targets within these pathways, including three distantly related adenylyltransferases (orthologs of the E. coli genes nadD, coaD, and ribF), are discussed in detail.

J Bacteriol. 2002 Aug;184(16):4555-72.

Microarray analysis of gene expression during bacteriophage T4 infection.

Luke K, Radek A, Liu X, Campbell J, Uzan M, Haselkorn R, Kogan Y.

Genomic microarrays were used to examine the complex temporal program of gene expression exhibited by bacteriophage T4 during the course of development. The microarray data confirm the existence of distinct early, middle, and late transcriptional classes during the bacteriophage replicative cycle. This approach allows assignment of previously uncharacterized genes to specific temporal classes. The genomic expression data verify many promoter assignments and predict the existence of previously unidentified promoters.

Virology. 2002 Aug 1;299(2):182-91.

Archaeal shikimate kinase, a new member of the GHMP-kinase family.

Daugherty M, Vonstein V, Overbeek R, Osterman A.

Shikimate kinase (EC 2.7.1.71) is a committed enzyme in the seven-step biosynthesis of chorismate, a major precursor of aromatic amino acids and many other aromatic compounds. Genes for all enzymes of the chorismate pathway except shikimate kinase are found in archaeal genomes by sequence homology to their bacterial counterparts. In this study, a conserved archaeal gene (gi1500322 in Methanococcus jannaschii) was identified as the best candidate for the missing shikimate kinase gene by the analysis of chromosomal clustering of chorismate biosynthetic genes. The encoded hypothetical protein, with no sequence similarity to bacterial and eukaryotic shikimate kinases, is distantly related to homoserine kinases (EC 2.7.1.39) of the GHMP-kinase superfamily. The latter functionality in M. jannaschii is assigned to another gene (gi591748), in agreement with sequence similarity and chromosomal clustering analysis. Both archaeal proteins, overexpressed in Escherichia coli and purified to homogeneity, displayed activity of the predicted type, with steady-state kinetic parameters similar to those of the corresponding bacterial kinases: K(m,shikimate) = 414 +/- 33 microM, K(m,ATP) = 48 +/- 4 microM, and k(cat) = 57 +/- 2 s(-1) for the predicted shikimate kinase and K(m,homoserine) = 188 +/- 37 microM, K(m,ATP) = 101 +/- 7 microM, and k(cat) = 28 +/- 1 s(-1) for the homoserine kinase. No overlapping activity could be detected between shikimate kinase and homoserine kinase, both revealing a >1,000-fold preference for their own specific substrates. The case of archaeal shikimate kinase illustrates the efficacy of techniques based on reconstruction of metabolism from genomic data and analysis of gene clustering on chromosomes in finding missing genes.
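The steady-state parameters quoted in this abstract can be turned into catalytic efficiencies (k(cat)/K(m)) with simple arithmetic. The following is a minimal sketch using only the reported central values for the primary substrates; the dictionary layout and the function name `catalytic_efficiency` are illustrative assumptions, and the reported uncertainties are not propagated.

```python
# Central values as reported in the abstract (Km in micromolar, kcat in 1/s).
# The data layout and names below are illustrative, not taken from the paper.
kinetics = {
    "shikimate kinase": {"km_uM": 414.0, "kcat_per_s": 57.0},
    "homoserine kinase": {"km_uM": 188.0, "kcat_per_s": 28.0},
}


def catalytic_efficiency(kcat_per_s, km_uM):
    """Return kcat/Km in M^-1 s^-1, converting Km from micromolar to molar."""
    return kcat_per_s / (km_uM * 1e-6)


efficiencies = {
    name: catalytic_efficiency(p["kcat_per_s"], p["km_uM"])
    for name, p in kinetics.items()
}

for name, value in efficiencies.items():
    print(f"{name}: kcat/Km = {value:.2e} M^-1 s^-1")
```

Both values come out on the order of 10^5 M^-1 s^-1, which is in line with the abstract's statement that the two archaeal enzymes have broadly similar kinetic behaviour on their own substrates.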

J Bacteriol. 2001 Jan;183(1):292-300. doi:10.1128/JB.183.1.292-300.2001

Protein interaction maps for complete genomes based on gene fusion events.

Enright AJ, Iliopoulos I, Kyrpides NC, Ouzounis CA.

A large-scale effort to measure, detect and analyse protein-protein interactions using experimental methods is under way. These include biochemistry such as co-immunoprecipitation or crosslinking, molecular biology such as the two-hybrid system or phage display, and genetics such as unlinked noncomplementing mutant detection. Using the two-hybrid system, an international effort to analyse the complete yeast genome is in progress. Evidently, all these approaches are tedious, labour intensive and inaccurate. From a computational perspective, the question is how can we predict that two proteins interact from structure or sequence alone. Here we present a method that identifies gene-fusion events in complete genomes, solely based on sequence comparison. Because there must be selective pressure for certain genes to be fused over the course of evolution, we are able to predict functional associations of proteins. We show that 215 genes or proteins in the complete genomes of Escherichia coli, Haemophilus influenzae and Methanococcus jannaschii are involved in 64 unique fusion events. The approach is general, and can be applied even to genes of unknown function.
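The core idea of fusion-based prediction can be sketched in a few lines: two separate proteins in one genome that both match different, non-overlapping regions of a single protein in another genome are candidates for a functional association. The code below is a simplified illustration under stated assumptions, not the authors' implementation: the hit tuples, the gene names, the `max_overlap` threshold and the function `find_fusion_candidates` are all hypothetical, and the published method may apply further filters that the abstract does not describe.

```python
from collections import defaultdict
from itertools import combinations


def find_fusion_candidates(hits, max_overlap=10):
    """Pair query proteins that align to distinct regions of the same subject.

    hits: iterable of (query_id, subject_id, subject_start, subject_end),
          e.g. parsed from a sequence-similarity search (hypothetical input).
    max_overlap: largest overlap (in residues) still treated as "distinct".
    """
    by_subject = defaultdict(list)
    for query, subject, start, end in hits:
        by_subject[subject].append((query, start, end))

    candidates = set()
    for subject, regions in by_subject.items():
        for (q1, s1, e1), (q2, s2, e2) in combinations(regions, 2):
            if q1 == q2:
                continue
            # Length of the shared interval on the subject; <= 0 means disjoint.
            overlap = min(e1, e2) - max(s1, s2) + 1
            if overlap <= max_overlap:
                candidates.add((tuple(sorted((q1, q2))), subject))
    return sorted(candidates)


# Toy data: geneA and geneB cover different halves of "fusedAB",
# while geneC and geneD cover the same region of "other".
example_hits = [
    ("geneA", "fusedAB", 1, 200),
    ("geneB", "fusedAB", 210, 450),
    ("geneC", "other", 1, 300),
    ("geneD", "other", 20, 310),
]
fusion_pairs = find_fusion_candidates(example_hits)
print(fusion_pairs)
```

In the toy data only geneA and geneB are reported, because geneC and geneD align to the same stretch of their subject and therefore look like homologs of each other rather than components of a fused protein.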

Nature 402, 86-90 (4 November 1999) | doi:10.1038/47056

Universal protein families and the functional content of the last universal common ancestor.

Kyrpides N, Overbeek R, Ouzounis C.

The phylogenetic distribution of Methanococcus jannaschii proteins can provide, for the first time, an estimate of the genome content of the last common ancestor of the three domains of life. Relying on annotation and comparison with reference to the species distribution of sequence similarities results in 324 proteins forming the universal family set. This set is very well characterized and relatively small and nonredundant, containing 301 biochemical functions, of which 246 are unique. This universal function set contains mostly genes coding for energy metabolism or information processing. It appears that the Last Universal Common Ancestor was an organism with metabolic networks and genetic machinery similar to those of extant unicellular organisms.

J Mol Evol. 1999 Oct;49(4):413-23.