1,188 Matching Results

Search Results

IMG ER: A System for Microbial Genome Annotation Expert Review and Curation

Description: A rapidly increasing number of microbial genomes are sequenced by organizations worldwide and are eventually included into various public genome data resources. The quality of the annotations depends largely on the original dataset providers, with erroneous or incomplete annotations often carried over into the public resources and difficult to correct. We have developed an Expert Review (ER) version of the Integrated Microbial Genomes (IMG) system, with the goal of supporting systematic and efficient revision of microbial genome annotations. IMG ER provides tools for the review and curation of annotations of both new and publicly available microbial genomes within IMG's rich integrated genome framework. New genome datasets are included into IMG ER prior to their public release either with their native annotations or with annotations generated by IMG ER's annotation pipeline. IMG ER tools allow addressing annotation problems detected with IMG's comparative analysis tools, such as genes missed by gene prediction pipelines or genes without an associated function. Over the past year, IMG ER was used for improving the annotations of about 150 microbial genomes.
Date: May 25, 2009
Creator: Markowitz, Victor M.; Mavromatis, Konstantinos; Ivanova, Natalia N.; Chen, I-Min A.; Chu, Ken & Kyrpides, Nikos C.
Partner: UNT Libraries Government Documents Department

RegPredict: an integrated system for regulon inference in prokaryotes by comparative genomics approach

Description: The RegPredict web server provides comparative genomics tools for the reconstruction and analysis of microbial regulons. The server allows the user to rapidly generate reference sets of regulons and regulatory motif profiles in a group of prokaryotic genomes. The new concept of a cluster of co-regulated orthologous operons allows the user to distribute the analysis of large regulons and to perform the comparative analysis of multiple clusters independently. Two major workflows currently implemented in RegPredict are: (i) regulon reconstruction for a known regulatory motif and (ii) ab initio inference of a novel regulon using several scenarios for the generation of starting gene sets. RegPredict provides a comprehensive collection of manually curated positional weight matrices of regulatory motifs. It is based on genomic sequences and on ortholog and operon predictions from MicrobesOnline. An interactive web interface of RegPredict integrates and presents diverse genomic and functional information about the candidate regulon members from several web resources. RegPredict is freely accessible at http://regpredict.lbl.gov.
Date: May 26, 2010
Creator: Novichkov, Pavel S.; Rodionov, Dmitry A.; Stavrovskaya, Elena D.; Novichkova, Elena S.; Kazakov, Alexey E.; Gelfand, Mikhail S. et al.
Partner: UNT Libraries Government Documents Department

Functional and Phylogenetic Analyses of a Conserved Regulatory Program in the Phloem of Minor Veins

Description: This article discusses functional and phylogenetic analyses of a conserved regulatory program in the phloem of minor veins, which are developmentally and physiologically distinct from the phloem in the rest of the vascular system.
Date: November 2003
Creator: Ayre, Brian G.; Blair, Jaime E. & Turgeon, Robert
Partner: UNT College of Arts and Sciences

Analysis of a Human Transfer RNA Gene Cluster and Characterization of the Transcription Unit and Two Processed Pseudogenes of Chimpanzee Triosephosphate Isomerase

Description: An 18.5-kb human DNA segment was selected from a human λCharon-4A library by hybridization to mammalian valine tRNA(IAC) and found to encompass a cluster of three tRNA genes. Two valine tRNA genes with anticodons of AAC and CAC, encoding the major and minor cytoplasmic valine tRNA isoacceptors, respectively, and a lysine tRNA(CUU) gene were identified by Southern blot hybridization and DNA sequence analysis of a 7.1-kb region of the human DNA insert. At least nine Alu family members were found interspersed throughout the human DNA fragment. The tRNA genes are accurately transcribed by RNA polymerase III in a HeLa cell extract, since the RNase T1 fingerprints of the mature-sized tRNA transcription products are consistent with the DNA sequences of the structural genes. Three members of the chimpanzee triosephosphate isomerase (TPI) gene family, the functional transcription unit and two processed pseudogenes, were characterized by genomic blotting and DNA sequence analysis. The bona fide TPI gene spans 3.5 kb with seven exons and six introns, and is the first complete hominoid TPI gene sequenced. The gene exhibits a very high identity with the human and rhesus TPI genes. In particular, the polypeptides of 248 amino acids encoded by the chimpanzee and human TPI genes are identical, although the two coding regions differ in the third codon wobble positions for five amino acids. An Alu member occurs upstream from one of the processed pseudogenes, whereas an isolated endogenous retroviral long terminal repeat (HERV-K) occurs within the structural region of the other processed pseudogene. The ages of the processed pseudogenes were estimated to be 2.6 and 10.4 million years, implying that one was inserted into the genome before the divergence of the chimpanzee and human lineages, and the other inserted into the chimpanzee genome after the divergence.
Date: August 1990
Creator: Craig, Leonard C. (Leonard Callaway)
Partner: UNT Libraries

Greengenes: Chimera-checked 16S rRNA gene database and workbench compatible with ARB

Description: A 16S rRNA gene database (http://greengenes.lbl.gov) addresses limitations of public repositories by providing chimera-screening, standard alignments and taxonomic classification using multiple published taxonomies. It was revealed that incongruent taxonomic nomenclature exists among curators even at the phylum-level. Putative chimeras were identified in 3% of environmental sequences and 0.2% of records derived from isolates. Environmental sequences were classified into 100 phylum-level lineages within the Archaea and Bacteria.
Date: February 1, 2006
Creator: DeSantis, T.Z.; Hugenholtz, P.; Larsen, N.; Rojas, M.; Brodie, E.L.; Keller, K. et al.
Partner: UNT Libraries Government Documents Department

Laying the Foundation for a Genomic Rosetta Stone: Creating Information Hubs through the Use of Consensus Identifiers

Description: This paper presents a holistic approach that illustrates how the semantic hurdle for integration of biological databases might be overcome when mapping sources that provide information on individual genes and complete genomes to sources that provide information on the biological resources from which these sequences were derived, and vice versa. In particular, we explain how each of the completed and ongoing whole-genome sequencing projects in the Genomes OnLine Database and each of the ribosomal RNA sequences in the SILVA ribosomal RNA database have been persistently cross-referenced with the StrainInfo.net bioportal, serving both a genome-centric and an organism-centric view of the life on our blue planet as one more stepping stone towards the establishment of fully integrated and flexible biological information networks.
Date: May 1, 2007
Creator: Van Brabant, Bart; Kyrpides, Nikos; Glockner, Frank Oliver; Gray, Tanya; Field, Dawn; De Vos, Paul et al.
Partner: UNT Libraries Government Documents Department

Versatile P(acman) BAC Libraries for Transgenesis Studies in Drosophila melanogaster

Description: We constructed Drosophila melanogaster BAC libraries with 21-kb and 83-kb inserts in the P(acman) system. Clones representing 12-fold coverage and encompassing more than 95 percent of annotated genes were mapped onto the reference genome. These clones can be integrated into predetermined attP sites in the genome using PhiC31 integrase to rescue mutations. They can be modified through recombineering, for example to incorporate protein tags and assess expression patterns.
Date: April 21, 2009
Creator: Venken, Koen J.T.; Carlson, Joseph W.; Schulze, Karen L.; Pan, Hongling; He, Yuchun; Spokony, Rebecca et al.
Partner: UNT Libraries Government Documents Department

Accelerated Gene Evolution and Subfunctionalization in the Pseudotetraploid Frog Xenopus Laevis

Description: Ancient whole genome duplications have been implicated in the vertebrate and teleost radiations, and in the emergence of diverse angiosperm lineages, but the evolutionary response to such a perturbation is still poorly understood. The African clawed frog Xenopus laevis experienced a relatively recent tetraploidization approximately 40 million years ago. Analysis of the considerable amount of EST sequence available for this species together with the genome sequence of the related diploid Xenopus tropicalis provides a unique opportunity to study the genomic response to whole genome duplication.
Date: March 1, 2007
Creator: Hellsten, Uffe; Khokha, Mustafa K.; Grammar, Timothy C.; Harland, Richard M.; Richardson, Paul & Rokhsar, Daniel S.
Partner: UNT Libraries Government Documents Department

The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions

Description: The Integrated Microbial Genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov.
Date: August 1, 2007
Creator: Markowitz, Victor M.; Szeto, Ernest; Palaniappan, Krishna; Grechkin, Yuri; Chu, Ken; Chen, I-Min A. et al.
Partner: UNT Libraries Government Documents Department

Functional Assessment of the Medicago truncatula NIP/LATD Protein Demonstrates That It Is a High-Affinity Nitrate Transporter

Description: Article on the functional assessment of the Medicago truncatula NIP/LATD protein demonstrating that it is a high-affinity nitrate transporter.
Date: October 2012
Creator: Bagchi, Rammyani; Salehin, Mohammad; Adeyemo, O. Sarah; Salazar, Carolina; Shulaev, Vladimir; Sherrier, D. Janine et al.
Partner: UNT College of Arts and Sciences

Habitat-Lite: A GSC case study based on free text terms for environmental metadata

Description: There is an urgent need to capture metadata on the rapidly growing number of genomic, metagenomic and related sequences, such as 16S ribosomal genes. This need is a major focus within the Genomic Standards Consortium (GSC), and Habitat is a key metadata descriptor in the proposed 'Minimum Information about a Genome Sequence' (MIGS) specification. The goal of the work described here is to provide a light-weight, easy-to-use (small) set of terms ('Habitat-Lite') that captures high-level information about habitat while preserving a mapping to the recently launched Environment Ontology (EnvO). Our motivation for building Habitat-Lite is to meet the needs of multiple users, such as annotators curating these data, database providers hosting the data, and biologists and bioinformaticians alike who need to search and employ such data in comparative analyses. Here, we report a case study based on semi-automated identification of terms from GenBank and GOLD. We estimate that the terms in the initial version of Habitat-Lite would provide useful labels for over 60% of the kinds of information found in the GenBank isolation-source field, and around 85% of the terms in the GOLD habitat field. We present a revised version of Habitat-Lite and invite the community's feedback on its further development in order to provide a minimum list of terms to capture high-level habitat information and to provide classification bins needed for future studies.
Date: April 1, 2008
Creator: Kyrpides, Nikos; Hirschman, Lynette; Clark, Cheryl; Cohen, K. Bretonnel; Mardis, Scott; Luciano, Joanne et al.
Partner: UNT Libraries Government Documents Department

The Trichoplax Genome and the Nature of Placozoans

Description: Placozoans are arguably the simplest free-living animals, possibly evoking an early stage in metazoan evolution, yet their biology is poorly understood. Here we report the sequencing and analysis of the approximately 98 million base pair nuclear genome of the placozoan Trichoplax adhaerens. Whole genome phylogenetic analysis suggests that placozoans belong to a 'eumetazoan' clade that includes cnidarians and bilaterians, with sponges as the earliest diverging animals. The compact genome exhibits conserved gene content, gene structure, and synteny relative to the human and other complex eumetazoan genomes. Despite the apparent cellular and organismal simplicity of Trichoplax, its genome encodes a rich array of transcription factor and signaling pathway genes that are typically associated with diverse cell types and developmental processes in eumetazoans, motivating further searches for cryptic cellular complexity and/or as yet unobserved life history stages.
Date: August 1, 2008
Creator: Srivastava, Mansi; Begovic, Emina; Chapman, Jarrod; Putnam, Nicholas H.; Hellsten, Uffe; Kawashima, Takeshi et al.
Partner: UNT Libraries Government Documents Department

An Experimental Metagenome Data Management and Analysis System

Description: The application of shotgun sequencing to environmental samples has revealed a new universe of microbial community genomes (metagenomes) involving previously uncultured organisms. Metagenome analysis, which is expected to provide a comprehensive picture of the gene functions and metabolic capacity of a microbial community, needs to be conducted in the context of a comprehensive data management and analysis system. In this paper we present IMG/M, an experimental metagenome data management and analysis system that is based on the Integrated Microbial Genomes (IMG) system. IMG/M provides tools and viewers for analyzing both metagenomes and isolate genomes individually or in a comparative context.
Date: March 1, 2006
Creator: Markowitz, Victor M.; Korzeniewski, Frank; Palaniappan, Krishna; Szeto, Ernest; Ivanova, Natalia N.; Kyrpides, Nikos C. et al.
Partner: UNT Libraries Government Documents Department

Genome sequencing reveals complex secondary metabolome in the marine actinomycete Salinispora tropica

Description: Recent fermentation studies have identified actinomycetes of the marine-dwelling genus Salinispora as prolific natural product producers. To further evaluate their biosynthetic potential, we analyzed all identifiable secondary natural product gene clusters from the recently sequenced 5,184,724 bp S. tropica CNB-440 circular genome. Our analysis shows that biosynthetic potential meets or exceeds that shown by previous Streptomyces genome sequences as well as other natural product-producing actinomycetes. The S. tropica genome features nine polyketide synthase systems of every known formally classified family, non-ribosomal peptide synthetases and several hybrid clusters. While a few clusters appear to encode molecules previously identified in Streptomyces species, the majority of the 15 biosynthetic loci are novel. Specific chemical information about putative and observed natural product molecules is presented and discussed. In addition, our bioinformatic analysis was critical for the structure elucidation of the novel polyene macrolactam salinilactam A. This study demonstrates the potential for genomic analysis to complement and strengthen traditional natural product isolation studies and firmly establishes the genus Salinispora as a rich source of novel drug-like molecules.
Date: May 1, 2007
Creator: Udwary, Daniel W.; Zeigler, Lisa; Asolkar, Ratnakar; Singan, Vasanth; Lapidus, Alla; Fenical, William et al.
Partner: UNT Libraries Government Documents Department

Automated Eukaryotic Gene Structure Annotation Using EVidenceModeler and the Program to Assemble Spliced Alignments

Description: EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.
Date: December 10, 2007
Creator: Haas, B J; Salzberg, S L; Zhu, W; Pertea, M; Allen, J E; Orvis, J et al.
Partner: UNT Libraries Government Documents Department

The complete mitochondrial sequence of the "living fossil" Tricholepidion gertschi: structure, phylogenetic implications, and the description of a novel A/T asymmetrical bias

Description: Traditionally, the 'Apterygota' has been thought to consist of five orders of wingless hexapods (Protura, Collembola, Diplura, Microcoryphia and Zygentoma) believed to be collectively basal to insects (i.e., the Pterygota). However, some studies have questioned this affinity with insects (Dallai, Abele, Spears, Nardi). Further, within these groups are hotly debated issues, including the monophyly of Entognatha (Koch, 1997; Kukalova Peck, 1987), the monophyly of Diplura (Bilinski, 1993; Stys and Bilinski, 1990), the affinity between Collembola and Protura (Dallai, 1994; Kristensen, 1981) and the position of Lepidotrichidae (below). In fact, these relationships constitute one of the most debated issues in hexapod phylogeny. The family Lepidotrichidae was first described by Silvestri (1912: 'Lepidothricinae') from a Baltic Amber fossil (Lepidothrix pilifera Menge). The only living representative of this family is Tricholepidion gertschi Wygodzinsky. Since this species was first described (Wygodzinsky, 1961) its phylogenetic position has been difficult to establish, due to an 'array of unique characters' that are difficult to interpret in a phylogenetic framework. Tricholepidion (and therefore the whole family Lepidotrichidae) has been considered either as belonging to the order Zygentoma (Kristensen, 1997; Wygodzinsky, 1961), or basal to the rest of the Zygentoma plus the Pterygota (Beutel, 2001; Bitsch and Bitsch, 2000; Staniczek, 2000), although the significance of some of the morphological characters on which these analyses are based has been questioned (Dallai et al., 2001; Kristensen, 1997). If the latter hypothesis proved to be true, the family Lepidotrichidae would better deserve the ordinal rank.
Since studies based on morphological characters have failed to give a satisfactory answer, a broad scale molecular study is under way ((Nardi et al., 2001), Frati et al, submitted, il Gomphiocephalus) in order to use mitochondrial genome sequences to study the evolution and differentiation of the most basal hexapod groups, including Tricholepidion. Mitochondrial genomics, that is ...
Date: June 23, 2002
Creator: Nardi, F.; Frati, F.; Carapelli, A.; Dallai, R. & Boore, J.
Partner: UNT Libraries Government Documents Department

Wrinkles in the rare biosphere: Pyrosequencing errors can lead to artificial inflation of diversity estimates

Description: Massively parallel pyrosequencing of the small subunit (16S) ribosomal RNA gene has revealed that the extent of rare microbial populations in several environments, the 'rare biosphere', is orders of magnitude higher than previously thought. One important caveat with this method is that sequencing error could artificially inflate diversity estimates. Although the per-base error of 16S rDNA amplicon pyrosequencing has been shown to be as good as or lower than Sanger sequencing, no direct assessments of pyrosequencing errors on diversity estimates have been reported. Using only Escherichia coli MG1655 as a reference template, we find that 16S rDNA diversity is grossly overestimated unless relatively stringent read quality filtering and low clustering thresholds are applied. In particular, the common practice of removing reads with unresolved bases and anomalous read lengths is insufficient to ensure accurate estimates of microbial diversity. Furthermore, common and reproducible homopolymer length errors can result in relatively abundant spurious phylotypes further confounding data interpretation. We suggest that stringent quality-based trimming of 16S pyrotags and clustering thresholds no greater than 97% identity should be used to avoid overestimates of the rare biosphere.
Date: August 1, 2009
Creator: Kunin, Victor; Engelbrektson, Anna; Ochman, Howard & Hugenholtz, Philip
Partner: UNT Libraries Government Documents Department
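
The abstract above recommends clustering thresholds no greater than 97% identity. The toy sketch below illustrates why a tighter threshold can inflate the number of clusters when reads carry a few errors. It is a simplification and not the authors' pipeline: the reads are assumed to be pre-aligned and of equal length, errors are modeled as substitutions only (not homopolymer length errors), and the names `identity`, `greedy_cluster`, and `mutate` are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length, pre-aligned reads."""
    if len(a) != len(b):
        raise ValueError("reads must be pre-aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_cluster(reads, threshold):
    """Put each read in the first cluster whose seed it matches at >= threshold."""
    seeds, clusters = [], []
    for read in reads:
        for i, seed in enumerate(seeds):
            if identity(read, seed) >= threshold:
                clusters[i].append(read)
                break
        else:
            # No existing seed is similar enough: this read founds a new cluster.
            seeds.append(read)
            clusters.append([read])
    return clusters


def mutate(seq, positions):
    """Introduce a substitution at each given position (toy error model)."""
    out = list(seq)
    for p in positions:
        out[p] = "A" if out[p] != "A" else "C"
    return "".join(out)


# One 100-bp toy template plus three copies carrying one or two substitutions.
template = "ACGT" * 25
reads = [
    template,
    mutate(template, [10]),
    mutate(template, [10, 50]),
    mutate(template, [3, 40]),
]

loose = greedy_cluster(reads, 0.97)   # all reads collapse into one cluster
tight = greedy_cluster(reads, 0.995)  # every erroneous read founds its own cluster
print(len(loose), len(tight))
```

All four reads derive from a single template, so any cluster beyond the first is an artifact of the simulated errors.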

Accurate phylogenetic classification of DNA fragments based on sequence composition

Description: Metagenome studies have retrieved vast amounts of sequence out of a variety of environments, leading to novel discoveries and great insights into the uncultured microbial world. Except for very simple communities, diversity makes sequence assembly and analysis a very challenging problem. To understand the structure and function of microbial communities, a taxonomic characterization of the obtained sequence fragments is highly desirable, yet currently limited mostly to those sequences that contain phylogenetic marker genes. We show that for clades at the rank of domain down to genus, sequence composition allows the very accurate phylogenetic characterization of genomic sequence. We developed a composition-based classifier, PhyloPythia, for de novo phylogenetic sequence characterization and have trained it on a data set of 340 genomes. By extensive evaluation experiments we show that the method is accurate across all taxonomic ranks considered, even for sequences that originate from novel organisms and are as short as 1 kb. Application to two metagenome datasets obtained from samples of phosphorus-removing sludge showed that the method allows the accurate classification at genus level of most sequence fragments from the dominant populations, while at the same time correctly characterizing even larger parts of the samples at higher taxonomic levels.
Date: May 1, 2006
Creator: McHardy, Alice C.; Garcia Martin, Hector; Tsirigos, Aristotelis; Hugenholtz, Philip & Rigoutsos, Isidore
Partner: UNT Libraries Government Documents Department
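
As an illustration of what "sequence composition" can mean in practice, the sketch below turns a DNA fragment into a vector of normalized k-mer frequencies, the kind of feature vector a composition-based classifier could be trained on. This is an assumption made for illustration only: the abstract does not specify PhyloPythia's actual features or parameters, and the function name `kmer_profile` and the default `k=4` are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_profile(seq, k=4):
    """Return normalized frequencies of all 4**k DNA k-mers in a fixed order.

    K-mers containing characters other than A, C, G, T (for example N)
    are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[km] for km in kmers)
    if total == 0:
        # Fragment shorter than k, or no unambiguous k-mers at all.
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


profile = kmer_profile("ACGTTGCAACGTNACGT", k=4)
print(len(profile))  # 256 features for k=4
```

Profiles computed this way have the same length for every fragment, so fragments of different sizes can be compared or passed to any standard classifier.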

Microbial co-habitation and lateral gene transfer: what transposases can tell us

Description: Determining the habitat range for various microbes is not a simple, straightforward matter, as habitats interlace, microbes move between habitats, and microbial communities change over time. In this study, we explore an approach using the history of lateral gene transfer recorded in microbial genomes to begin to answer two key questions: where have you been and who have you been with? All currently sequenced microbial genomes were surveyed to identify pairs of taxa that share a transposase that is likely to have been acquired through lateral gene transfer. A microbial interaction network including almost 800 organisms was then derived from these connections. Although the majority of the connections are between closely related organisms with the same or overlapping habitat assignments, numerous examples were found of cross-habitat and cross-phylum connections. We present a large-scale study of the distributions of transposases across phylogeny and habitat, and find a significant correlation between habitat and transposase connections. We observed cases where phylogenetic boundaries are traversed, especially when organisms share habitats; this suggests that the potential exists for genetic material to move laterally between diverse groups via bridging connections. The results presented here also suggest that the complex dynamics of microbial ecology may be traceable in the microbial genomes.
Date: March 1, 2009
Creator: Hooper, Sean D.; Mavromatis, Konstantinos & Kyrpides, Nikos C.
Partner: UNT Libraries Government Documents Department