SPOCS User Guide

Description: SPOCS implements a graph-based ortholog prediction method to generate a simple tab-delimited table of orthologs and, in addition, HTML files that provide a visualization of the ortholog/paralog relationships onto which gene/protein expression metadata may be overlaid.
Date: April 15, 2013
Creator: Curtis, Darren S.; Phillips, Aaron R. & McCue, Lee Ann
Partner: UNT Libraries Government Documents Department

Tales of diversity: Genomic and morphological characteristics of forty-six Arthrobacter phages

Description: This article describes the isolation and genomic characterization of 46 phages from environmental samples at various geographic locations in the U.S. infecting a single Arthrobacter sp. strain.
Date: July 17, 2017
Creator: Klyczek, Karen; Bonilla, J. Alfred; Jacobs-Sera, Deborah; Adair, Tamarah; Afram, Patricia; Allen, Katherine G. et al.
Partner: UNT College of Arts and Sciences

Towards richer descriptions of our collection of genomes and metagenomes

Description: In this commentary, we advocate building a richer set of descriptions about our invaluable and exponentially growing collection of genomes and metagenomic datasets through the construction of consensus-driven data capture and exchange mechanisms. Standardization activities must proceed within the auspices of open-access and international working bodies, and to tackle the issues surrounding the development of better descriptions of genomic investigations we have formed the Genomic Standards Consortium (GSC). Here, we introduce the 'Minimum Information about a Genome Sequence' specification in the hopes of gaining wider participation in its development and discuss the resources that will be required to support it (standardization of annotations through the use of ontologies and mechanisms of metadata capture, exchange). As part of its wider goals, the GSC also strongly supports improving the 'transparency' of the information contained in existing genomic databases that contain calculated analyses and genomic annotations.
Date: June 1, 2006
Creator: Field, D.; Garrity, G.; Selengut, J.; Sterk, P.; Tatusova, T.; Thomson, N. et al.
Partner: UNT Libraries Government Documents Department

Data Standards for the Genomes to Life Program

Description: Existing GTL Projects already have produced volumes of data and, over the course of the next five years, will produce an estimated hundreds, or possibly thousands, of terabytes of data from hundreds of experiments conducted at dozens of laboratories in National Labs and universities across the nation. These data will be the basis for publications by individual researchers, research groups, and multi-institutional collaborations, and the basis for future DOE decisions on funding further research in bioremediation. The short-term and long-term value of the data to project participants, to the DOE, and to the nation depends, however, on being able to access the data and on how, or whether, the data are archived. The ability to access data is the starting point for data analysis and interpretation, data integration, data mining, and development of data-driven models. Limited or inefficient data access means that less data are analyzed in a cost-effective and timely manner. Data production in the GTL Program will likely outstrip, or may have already outstripped, the ability to analyze the data. Being able to access data depends on two key factors: data standards and implementation of the data standards. For the purpose of this proposal, a data standard is defined as a standard, documented way in which data and information about the data are described. The attributes of the experiment in which the data were collected need to be known and the measurements corresponding to the data collected need to be described. In general terms, a data standard could be a form (electronic or paper) that is completed by a researcher or a document that prescribes how a protocol or experiment should be described in writing. Data standards are critical to data access because they provide a framework for organizing and managing data. Researchers spend significant amounts of time managing data and information about experiments using lab notebooks, computer files, Excel spreadsheets, etc. 
In addition, data output format varies for different equipment and usually needs to be formatted differently for ...
Date: January 31, 2004
Creator: Arkin, Adam; Ambrosiano, John; Babnigg, Gyorgy; Frank, Ed; Geist, Al; Giometti, Carol et al.
Partner: UNT Libraries Government Documents Department

Phylo-VISTA: An interactive visualization tool for multiple DNA sequence alignments

Description: Motivation. The power of multi-sequence comparison for biological discovery is well established and sequence data from a growing list of organisms is becoming available. Thus, a need exists for computational strategies to visually compare multiple aligned sequences to support conservation analysis across various species. To be efficient these visualization algorithms require the ability to universally handle a wide range of evolutionary distances while taking into account phylogeny. Results. We have developed Phylo-VISTA, an interactive tool for analyzing multiple alignments by visualizing the similarity of DNA sequences among multiple species while considering their phylogenic relationships. Features include a broad spectrum of resolution parameters for examining the alignment and the ability to easily compare any subtree of sequences within a complete alignment dataset. Phylo-VISTA uses VISTA concepts that have been successfully applied previously to a wide range of comparative genomics data analysis problems. Availability. Phylo-VISTA is an interactive Java applet available for downloading at http://graphics.cs.ucdavis.edu/~;nyshah/Phylo-VISTA. It is also available on-line at http://www-gsd.lbl.gov/phylovista and is integrated with the global alignment program LAGAN at http://lagan.stanford.edu. Contact: phylovista@lbl.gov
Date: April 25, 2003
Creator: Shah, Nameeta; Couronne, Olivier; Pennacchio, Len A.; Brudno, Michael; Batzoglou, Serafim; Bethel, E. Wes et al.
Partner: UNT Libraries Government Documents Department

Comparing Patterns of Natural Selection Across Species Using Selective Signatures

Description: Comparing gene expression profiles over many different conditions has led to insights that were not obvious from single experiments. In the same way, comparing patterns of natural selection across a set of ecologically distinct species may extend what can be learned from individual genome-wide surveys. Toward this end, we show how variation in protein evolutionary rates, after correcting for genome-wide effects such as mutation rate and demographic factors, can be used to estimate the level and types of natural selection acting on genes across different species. We identify unusually rapidly and slowly evolving genes, relative to empirically derived genome-wide and gene family-specific background rates for 744 core protein families in 30 gamma-proteobacterial species. We describe the pattern of fast or slow evolution across species as the 'selective signature' of a gene. Selective signatures represent a profile of selection across species that is predictive of gene function: pairs of genes with correlated selective signatures are more likely to share the same cellular function, and genes in the same pathway can evolve in concert. For example, glycolysis and phenylalanine metabolism genes evolve rapidly in Idiomarina loihiensis, mirroring an ecological shift in carbon source from sugars to amino acids. In a broader context, our results suggest that the genomic landscape is organized into functional modules even at the level of natural selection, and thus it may be easier than expected to understand the complex evolutionary pressures on a cell.
Date: December 18, 2007
Creator: Alm, Eric J. & Shapiro, B. Jesse
Partner: UNT Libraries Government Documents Department

Multiple Whole Genome Alignments and Novel Biomedical Applications at the VISTA Portal

Description: The VISTA portal for comparative genomics is designed to give biomedical scientists a unified set of tools to lead them from the raw DNA sequences through the alignment and annotation to the visualization of the results. The VISTA portal also hosts alignments of a number of genomes computed by our group, allowing users to study regions of their interest without having to manually download the individual sequences. Here we describe various algorithmic and functional improvements implemented in the VISTA portal over the last two years. The VISTA Portal is accessible at http://genome.lbl.gov/vista.
Date: February 1, 2007
Creator: Brudno, Michael; Poliakov, Alexander; Minovitsky, Simon; Ratnere, Igor & Dubchak, Inna
Partner: UNT Libraries Government Documents Department

Reference set of regulons in Desulfovibrionales inferred by comparative genomics approach

Description: In this study, we carried out a large-scale comparative genomics analysis of regulatory interactions in Desulfovibrio vulgaris and 12 related genomes from the order Desulfovibrionales using our recently developed web server RegPredict (http://regpredict.lbl.gov). An overall reference collection of 26 Desulfovibrionales regulogs can be accessed through the RegPrecise database (http://regpredict.lbl.gov).
Date: November 15, 2010
Creator: Kazakov, A.E.; Rodionov, D.A.; Price, M.N.; Arkin, A.P.; Dubchak, I. & Novichkov, P.S.
Partner: UNT Libraries Government Documents Department

Coordination of Programs on Domestic Animal Genomics: The Federal Framework

Description: This report discusses progress by Federal agencies dealing with domestic animal genomics. The work represents an increase in the understanding of domestic animals such as sheep, cattle, swine, bees, and others. Knowledge in this area is crucial for better understanding animal diseases such as bovine spongiform encephalopathy (BSE, or "mad cow disease").
Date: June 2004
Creator: National Science and Technology Council (U.S.). Interagency Working Group on Domestic Animal Genomics.
Partner: UNT Libraries

Integrated Genome-Based Studies of Shewanella Ecophysiology

Description: This project had as its goals the understanding of the ecophysiology of the genus Shewanella using various genomics approaches. As opposed to other programs involving Shewanella, this one branched out into the various areas in which Shewanella cells are active, and included both basic and applied studies. All of the work was, to some extent, related to the ability of the bacteria to accomplish electron exchange between the cell and solid state electron acceptors and/or electron donors, a process we call Extracellular Electron Transport, or EET. The major accomplishments related to several different areas: Basic Science Studies: 1. Genetics and genomics of nitrate reduction, resulting in elucidation of atypical nitrate reduction systems in Shewanella oneidensis (MR-1) [2]. 2. Influence of bacterial strain and growth conditions on iron reduction, showing that rates of reduction, extents of reduction, and the formation of secondary minerals were different for different strains of Shewanella [3,4,9]. 3. Comparative genomics as a tool for comparing metabolic capacities of different Shewanella strains, and for predicting growth and metabolism [6,10,15]. In these studies, collaboration with ORNL, PNNL, and 4. Basic studies of electron transport in strain MR-1, both to poised electrodes, and via conductive nanowires [12,13]. This included the first accurate measurements of electrical energy generation by a single cell during electrode growth [12], and the demonstration of electrical conductivity along the length of bacterial nanowires [13]. 5. Impact of surface charge and electron flow on cell movement, cell attachment, cell growth, and biofilm formation [7,18]. The demonstration that interaction with solid state electron acceptors resulted in increased motility [7] led to the description of a phenomenon called electrokinesis. 
The importance of this for biofilm formation and for electron flow was hypothesized by Nealson & Finkel [18], and is now under study in several laboratories. Applications: 1. Corrosion: Electron flow ...
Date: October 15, 2013
Creator: Nealson, Kenneth H.
Partner: UNT Libraries Government Documents Department

Process Control Monitoring by Stress Response

Description: Environmental contamination with a variety of pollutants has prompted the development of effective bioremediation strategies. But how can these processes be best monitored and controlled? One avenue under investigation is the development of stress response systems as tools for effective and general process control. Although the microbial stress response has been the subject of intensive laboratory investigation, the environmental reflection of the laboratory response to specific stresses has been little explored. However, it is only within an environmental context, in which microorganisms are constantly exposed to multiple changing environmental stresses, that there will be full understanding of microbial adaptive resiliency. Knowledge of the stress response in the environment will facilitate the control of bioremediation and other processes mediated by complex microbial communities.
Date: April 17, 2006
Creator: Hazen, Terry C. & Stahl, David A.
Partner: UNT Libraries Government Documents Department

Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches

Description: The structural genomics project is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy which is medically and biologically relevant, of good value, and tractable. As an option to consider, we present the Pfam5000 strategy, which involves selecting the 5000 most important families from the Pfam database as sources for targets. We compare the Pfam5000 strategy to several other proposed strategies that would require similar numbers of targets. These include complete solution of several small to moderately sized bacterial proteomes, partial coverage of the human proteome, and random selection of approximately 5000 targets from sequenced genomes. We measure the impact that successful implementation of these strategies would have upon structural interpretation of the proteins in Swiss-Prot, TrEMBL, and 131 complete proteomes (including 10 of eukaryotes) from the Proteome Analysis database at EBI. Solving the structures of proteins from the 5000 largest Pfam families would allow accurate fold assignment for approximately 68 percent of all prokaryotic proteins (covering 59 percent of residues) and 61 percent of eukaryotic proteins (40 percent of residues). More fine-grained coverage which would allow accurate modeling of these proteins would require an order of magnitude more targets. The Pfam5000 strategy may be modified in several ways, for example to focus on larger families, bacterial sequences, or eukaryotic sequences; as long as secondary consideration is given to large families within Pfam, coverage results vary only slightly. 
In contrast, focusing structural genomics on a single tractable genome would have only a limited impact in structural knowledge of other proteomes: a significant fraction (about 30-40 percent of the proteins, and 40-60 percent of the residues) of each proteome is classified in small families, which may have little overlap with other ...
Date: July 14, 2004
Creator: Chandonia, John-Marc & Brenner, Steven E.
Partner: UNT Libraries Government Documents Department

Final Technical Report - The Ecology and Genomics of CO2 Fixation in Oceanic River Plumes

Description: Oceanic river plumes represent some of the most productive environments on Earth. As major conduits for freshwater and nutrients into the coastal ocean, their impact on water column ecosystems extends for up to a thousand km into oligotrophic oceans. Upon entry into the oceans, rivers are tremendous sources of CO2 and dissolved inorganic carbon (DIC). Yet owing to increased light transmissivity from sediment deposition coupled with the influx of nutrients, dramatic CO2 drawdown occurs, and plumes rapidly become sinks for CO2. Using state-of-the-art gene expression technology, we have examined the molecular biodiversity of CO2 fixation in the Mississippi River Plume (MRP; two research cruises) and the Orinoco River Plume (ORP; one cruise). When the MRP extends far into the Gulf because of entrainment with the Loop Current, MRP production (carbon fixation) can account for up to 41% of the surface production in the Gulf of Mexico. Nearer-shore plume stations (“high plume,” salinity < 32 ppt) had tremendous CO2 drawdown that was correlated to heterokont (principally diatom) carbon fixation gene expression. The principal form of nitrogen for this production based upon 15N studies was urea, believed to be from anthropogenic origin (fertilizer) from the MRP watershed. Intermediate plume environments (salinity 34 ppt) were characterized by high levels of Synechococcus carbon fixation that was fueled by regenerated ammonium. Non-plume stations were characterized by high light Prochlorococcus carbon fixation gene expression that was positively correlated with dissolved CO2 concentrations. Although data from the ORP cruise are still being analyzed, some similarities and striking differences were found between the ORP and MRP. 
High levels of heterokont carbon fixation gene expression that correlated with CO2 drawdown were observed in the high plume, yet the magnitude of this phenomenon was far below that of the MRP, most likely due to the lower levels of anthropogenic nutrient input. ...
Date: June 21, 2013
Creator: Paul, John H.
Partner: UNT Libraries Government Documents Department