Description: Final report on MAGGIE. We set ambitious goals to model the functions of individual organisms and their community from molecular to systems scale. These scientific goals are driving the development of sophisticated algorithms to analyze large amounts of experimental measurements made using high throughput technologies to explain and predict how the environment influences biological function at multiple scales and how the microbial systems in turn modify the environment. By experimentally evaluating predictions made using these models we will test the degree to which our quantitative multiscale understanding wilt help to rationally steer individual microbes and their communities towards specific tasks. Towards this end we have made substantial progress towards understanding evolution of gene families, transcriptional structures, detailed structures of keystone molecular assemblies (proteins and complexes), protein interactions, biological networks, microbial interactions, and community structure. Using comparative analysis we have tracked the evolutionary history of gene functions to understand how novel functions evolve. One level up, we have used proteomics data, high-resolution genome tiling microarrays, and 5' RNA sequencing to revise genome annotations, discover new genes including ncRNAs, and map dynamically changing operon structures of five model organisms: For Desulfovibrio vulgaris Hildenborough, Pyrococcus furiosis, Sulfolobus solfataricus, Methanococcus maripaludis and Haiobacterium salinarum NROL We have developed machine learning algorithms to accurately identify protein interactions at a near-zero false positive rate from noisy data generated using tagfess complex purification, TAP purification, and analysis of membrane complexes. Combining other genome-scale datasets produced by ENIGMA (in particular, microarray data) and available from literature we have been able to achieve a true positive rate as high as 65% at almost zero false positives when applied to the manually curated training set. Applying this method to the data representing around a quarter of the fraction space for water soluble proteins in D. vulgaris, we obtained 854 reliable ...
Date: May 26, 2011
Creator: Baliga, Nitin S.
Item Type: Refine your search to only Text
Partner: UNT Libraries Government Documents Department