172 Matching Results

Search Results

Endogenous short RNAs generated by Dicer 2 and RNA-dependent RNA polymerase 1 regulate mRNAs in the basal fungus Mucor circinelloides

Description: Endogenous short RNAs (esRNAs) play diverse roles in eukaryotes and usually are produced from double-stranded RNA (dsRNA) by Dicer. esRNAs are grouped into different classes based on biogenesis and function, but not all classes are present in all three eukaryotic kingdoms. The esRNA register of fungi is poorly described compared to other eukaryotes, and it is not clear what esRNA classes are present in this kingdom and whether they regulate the expression of protein coding genes. However, evidence that some dicer mutant fungi display altered phenotypes suggests that esRNAs play an important role in fungi. Here, we show that the basal fungus Mucor circinelloides produces new classes of esRNAs that map to exons and regulate the expression of many protein coding genes. The largest class of these exonic-siRNAs (ex-siRNAs) is generated by RNA-dependent RNA Polymerase 1 (RdRP1) and dicer-like 2 (DCL2) and targets the mRNAs of protein coding genes from which they were produced. Our results expand the range of esRNAs in eukaryotes and reveal a new role for esRNAs in fungi.
Date: September 1, 2011
Creator: Grigoriev, Igor; Nicolas, Francisco; Moxon, Simon; Haro, Juan de; Calo, Silvia; Torres-Martinez, Santiago et al.
Partner: UNT Libraries Government Documents Department

Next-generation transcriptome assembly

Description: Transcriptomics studies often rely on partial reference transcriptomes that fail to capture the full catalog of transcripts and their variations. Recent advances in sequencing technologies and assembly algorithms have facilitated the reconstruction of the entire transcriptome by deep RNA sequencing (RNA-seq), even without a reference genome. However, transcriptome assembly from billions of RNA-seq reads, which are often very short, poses a significant informatics challenge. This Review summarizes the recent developments in transcriptome assembly approaches (reference-based, de novo and combined strategies), along with some perspectives on transcriptome assembly in the near future.
Date: September 1, 2011
Creator: Martin, Jeffrey A. & Wang, Zhong
Partner: UNT Libraries Government Documents Department

Incorporating Genomics and Bioinformatics across the Life Sciences Curriculum

Description: Undergraduate life sciences education needs an overhaul, as clearly described in the National Research Council of the National Academies publication BIO 2010: Transforming Undergraduate Education for Future Research Biologists. Among BIO 2010's top recommendations is the need to involve students in working with real data and tools that reflect the nature of life sciences research in the 21st century. Education research studies support the importance of utilizing primary literature, designing and implementing experiments, and analyzing results in the context of a bona fide scientific question in cultivating the analytical skills necessary to become a scientist. Incorporating these basic scientific methodologies in undergraduate education leads to increased undergraduate and post-graduate retention in the sciences. Toward this end, many undergraduate teaching organizations offer training and suggestions for faculty to update and improve their teaching approaches to help students learn as scientists, through design and discovery (e.g., Council of Undergraduate Research [www.cur.org] and Project Kaleidoscope [www.pkal.org]). With the advent of genome sequencing and bioinformatics, many scientists now formulate biological questions and interpret research results in the context of genomic information. Just as the use of bioinformatic tools and databases changed the way scientists investigate problems, it must change how scientists teach to create new opportunities for students to gain experiences reflecting the influence of genomics, proteomics, and bioinformatics on modern life sciences research. Educators have responded by incorporating bioinformatics into diverse life science curricula. While these published exercises in, and guidelines for, bioinformatics curricula are helpful and inspirational, faculty new to the area of bioinformatics inevitably need training in the theoretical underpinnings of the algorithms. 
Moreover, effectively integrating bioinformatics into courses or independent research projects requires infrastructure for organizing and assessing student work. Here, we present a new platform for faculty to keep current with the rapidly changing field of bioinformatics, the Integrated ...
Date: August 1, 2011
Creator: Ditty, Jayna L.; Kvaal, Christopher A.; Goodner, Brad; Freyermuth, Sharyn K.; Bailey, Cheryl; Britton, Robert A. et al.
Partner: UNT Libraries Government Documents Department

meraculous: de novo genome assembly with short paired-end reads

Description: We describe a new algorithm, meraculous, for whole genome assembly of deep paired-end short reads, and apply it to the assembly of a dataset of paired 75-bp Illumina reads derived from the 15.4 megabase genome of the haploid yeast Pichia stipitis. More than 95% of the genome is recovered, with no errors; half the assembled sequence is in contigs longer than 101 kilobases and in scaffolds longer than 269 kilobases. Incorporating fosmid ends recovers entire chromosomes. Meraculous relies on an efficient and conservative traversal of the subgraph of the k-mer (de Bruijn) graph of oligonucleotides with unique high quality extensions in the dataset, avoiding an explicit error correction step as used in other short-read assemblers. A novel memory-efficient hashing scheme is introduced. The resulting contigs are ordered and oriented using paired reads separated by ~280 bp or ~3.2 kbp, and many gaps between contigs can be closed using paired-end placements. Practical issues with the dataset are described, and prospects for assembling larger genomes are discussed.
Date: August 1, 2011
Creator: Chapman, Jarrod A.; Ho, Isaac; Sunkara, Sirisha; Luo, Shujun; Schroth, Gary P. & Rokhsar, Daniel S.
Partner: UNT Libraries Government Documents Department
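
The traversal idea in the meraculous description above can be illustrated with a minimal sketch. It is not the meraculous implementation: it ignores base quality scores, reverse complements, the hashing scheme and paired-end scaffolding, and the function names `build_extensions` and `extend_right` are hypothetical. It only shows how a contig grows while each k-mer has exactly one observed extension and stops at any fork.

```python
from collections import defaultdict


def build_extensions(reads, k):
    """Record, for every k-mer, the bases seen right after it (fwd)
    and right before it (bwd) anywhere in the reads."""
    fwd = defaultdict(set)
    bwd = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k):
            left = read[i:i + k]           # k-mer starting at position i
            right = read[i + 1:i + k + 1]  # k-mer starting at position i + 1
            fwd[left].add(read[i + k])     # base that follows `left`
            bwd[right].add(read[i])        # base that precedes `right`
    return fwd, bwd


def extend_right(seed, fwd, bwd):
    """Grow a contig from `seed` while the forward extension is unique
    and the next k-mer has a unique backward extension (no fork, no merge)."""
    contig = seed
    kmer = seed
    seen = {seed}
    while len(fwd.get(kmer, ())) == 1:
        base = next(iter(fwd[kmer]))
        nxt = kmer[1:] + base
        # Stop at merges (ambiguous predecessor) and at cycles.
        if len(bwd.get(nxt, ())) != 1 or nxt in seen:
            break
        contig += base
        seen.add(nxt)
        kmer = nxt
    return contig


if __name__ == "__main__":
    reads = ["ATGGCG", "GGCGTG", "CGTGCA"]  # toy, error-free reads
    fwd, bwd = build_extensions(reads, 3)
    print(extend_right("ATG", fwd, bwd))
```

Stopping at every ambiguous k-mer is what makes such a traversal conservative: a sequencing error or a repeat creates a fork, and the contig ends there rather than guessing which branch is correct.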

Reenacting the birth of an intron

Description: An intron is an extended genomic feature whose function requires multiple constrained positions - donor and acceptor splice sites, a branch point, a polypyrimidine tract and suitable splicing enhancers - that may be distributed over hundreds or thousands of nucleotides. New introns are therefore unlikely to emerge by incremental accumulation of functional sub-elements. Here we demonstrate that a functional intron can be created de novo in a single step by a segmental genomic duplication. This experiment recapitulates in vivo the birth of an intron that arose in the ancestral jawed vertebrate lineage nearly half a billion years ago.
Date: July 1, 2011
Creator: Hellsten, Uffe; Aspden, Julie L.; Rio, Donald C. & Rokhsar, Daniel S.
Partner: UNT Libraries Government Documents Department

Defining the maize transcriptome de novo using deep RNA-Seq

Description: De novo assembly of the transcriptome is crucial for functional genomics studies in bioenergy research, since many of the organisms lack high quality reference genomes. In a previous study we successfully de novo assembled simple eukaryote transcriptomes exclusively from short Illumina RNA-Seq reads [1]. However, extensive alternative splicing, present in most of the higher eukaryotes, poses a significant challenge for current short read assembly processes. Furthermore, the size of next-generation datasets, often large for plant genomes, presents an informatics challenge. To tackle these challenges we present a combined experimental and informatics strategy for de novo assembly in higher eukaryotes. Using maize as a test case, preliminary results suggest our approach can resolve transcript variants and improve gene annotations.
Date: June 2, 2011
Creator: Martin, Jeffrey; Gross, Stephen; Choi, Cindy; Zhang, Tao; Lindquist, Erika; Wei, Chia-Lin et al.
Partner: UNT Libraries Government Documents Department

Engineering the Cyanobacterial Carbon Concentrating Mechanism for Enhanced CO2 Capture and Fixation

Description: In cyanobacteria CO2 fixation is localized in a special proteinaceous organelle, the carboxysome. The CO2 fixation enzymes are encapsulated by a selectively permeable protein shell. By structurally and functionally characterizing subunits of the carboxysome shell and the encapsulated proteins, we hope to understand what regulates the shape, assembly and permeability of the shell, as well as the targeting mechanism and organization of the encapsulated proteins. This knowledge will be used to enhance CO2 fixation in both cyanobacteria and plants through synthetic biology. The same strategy can also serve as a template for the production of modular synthetic bacterial organelles. Our research is conducted using a variety of techniques such as genomic sequencing and analysis, transcriptional regulation, DNA synthesis, synthetic biology, protein crystallization, Small Angle X-ray Scattering (SAXS), protein-protein interaction assays and phenotypic characterization using various types of cellular imaging, e.g. fluorescence microscopy, Transmission Electron Microscopy (TEM), and Soft X-ray Tomography (SXT).
Date: June 2, 2011
Creator: Sandh, Gustaf; Cai, Fei; Shih, Patrick; Kinney, James; Axen, Seth; Salmeen, Annette et al.
Partner: UNT Libraries Government Documents Department

Expansion of the Genomic Encyclopedia of Bacteria and Archaea

Description: To date, the vast majority of bacterial and archaeal genomes sequenced are of rather limited phylogenetic diversity, as they were chosen based on their physiology and/or medical importance. The Genomic Encyclopedia of Bacteria and Archaea (GEBA) project (Wu et al. 2009) is aimed at systematically filling the gaps of the tree of life with phylogenetically diverse reference genomes. However, more than 99 percent of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes of these largely mysterious species. These limitations gave rise to the GEBA uncultured project. Here we propose to use single cell genomics to massively expand the Genomic Encyclopedia of Bacteria and Archaea by targeting 80 single cell representatives of uncultured candidate phyla which have no or very few cultured representatives. Generating these reference genomes of uncultured microbes will dramatically increase the discovery rate of novel protein families and biological functions, shed light on the numerous underrepresented phyla that likely play important roles in the environment, and assist in improving the reconstruction of the evolutionary history of Bacteria and Archaea. Moreover, these data will improve our ability to interpret metagenomics sequence data from diverse environments, which will be of tremendous value for microbial ecology and evolutionary studies to come.
Date: June 2, 2011
Creator: Rinke, Christian; Sczyrba, Alex; Malfatti, Stephanie; Lee, Janey; Cheng, Jan-Fang; Stepanauskas, Ramunas et al.
Partner: UNT Libraries Government Documents Department

Metagenomic gene annotation by a homology-independent approach

Description: Fully understanding the genetic potential of a microbial community requires functional annotation of all the genes it encodes. The recently developed deep metagenome sequencing approach has enabled rapid identification of millions of genes from a complex microbial community without cultivation. Current homology-based gene annotation fails to detect distantly-related or structural homologs. Furthermore, homology searches with millions of genes are very computationally intensive. To overcome these limitations, we developed rhModeller, a homology-independent software pipeline to efficiently annotate genes from metagenomic sequencing projects. Using cellulases and carbonic anhydrases as two independent test cases, we demonstrated that rhModeller is much faster than HMMER but with comparable accuracy, at 94.5 percent and 99.9 percent accuracy, respectively. More importantly, rhModeller has the ability to detect novel proteins that do not share significant homology to any known protein families. As approximately 50 percent of the 2 million genes derived from the cow rumen metagenome failed to be annotated based on sequence homology, we tested whether rhModeller could be used to annotate these genes. Preliminary results suggest that rhModeller is robust in the presence of missense and frameshift mutations, two common errors in metagenomic genes. Applying the pipeline to the cow rumen genes identified 4,990 novel cellulase candidates and 8,196 novel carbonic anhydrase candidates. In summary, we expect rhModeller to dramatically increase the speed and quality of metagenomic gene annotation.
Date: June 2, 2011
Creator: Froula, Jeff; Zhang, Tao; Salmeen, Annette; Hess, Matthias; Kerfeld, Cheryl A.; Wang, Zhong et al.
Partner: UNT Libraries Government Documents Department

ChIP-seq Mapping of Distant-Acting Enhancers and Their In Vivo Activities

Description: The genomic location and function of most distant-acting transcriptional enhancers in the human genome remain unknown. We performed ChIP-seq for various transcriptional coactivator proteins (such as p300) directly from different embryonic mouse tissues, identifying thousands of binding sites. Transgenic mouse experiments show that p300 and other coactivator peaks are highly predictive of both the genomic location and the tissue-specific activity patterns of distant-acting enhancers. Most enhancers are active in only one or very few tissues. The genomic location of tissue-specific p300 peaks correlates with tissue-specific expression of nearby genes. Most binding sites are conserved, but the global degree of conservation varies between tissues.
Date: June 1, 2011
Creator: Visel, Axel & Pennacchio, Len A.
Partner: UNT Libraries Government Documents Department

Defining the maize transcriptome de novo using deep RNA-Seq

Description: De novo assembly of the transcriptome is crucial for functional genomics studies in bioenergy research, since many of the organisms lack high quality reference genomes. In a previous study we successfully de novo assembled simple eukaryote transcriptomes exclusively from short Illumina RNA-Seq reads [1]. However, extensive alternative splicing, present in most of the higher eukaryotes, poses a significant challenge for current short read assembly processes. Furthermore, the size of next-generation datasets, often large for plant genomes, presents an informatics challenge. To tackle these challenges we present a combined experimental and informatics strategy for de novo assembly in higher eukaryotes. Using maize as a test case, preliminary results suggest our approach can resolve transcript variants and improve gene annotations.
Date: June 1, 2011
Creator: Martin, Jeffrey; Gross, Stephen; Choi, Cindy; Zhang, Tao; Lindquist, Erika; Wei, Chia-Lin et al.
Partner: UNT Libraries Government Documents Department

Agave: a biofuel feedstock for arid and semi-arid environments

Description: Efficient production of plant-based, lignocellulosic biofuels relies upon continued improvement of existing biofuel feedstock species, as well as the introduction of new feedstocks capable of growing on marginal lands to avoid conflicts with existing food production and minimize use of water and nitrogen resources. To this end, species within the plant genus Agave have recently been proposed as new biofuel feedstocks. Many Agave species are adapted to hot and arid environments generally unsuitable for food production, yet have biomass productivity rates comparable to other second-generation biofuel feedstocks such as switchgrass and Miscanthus. Agaves achieve remarkable heat tolerance and water use efficiency in part through a Crassulacean Acid Metabolism (CAM) mode of photosynthesis, but the genes and regulatory pathways enabling CAM and thermotolerance in agaves remain poorly understood. We seek to accelerate the development of agave as a new biofuel feedstock through genomic approaches using massively-parallel sequencing technologies. First, we plan to sequence the transcriptome of A. tequilana to provide a database of protein-coding genes to the agave research community. Second, we will compare transcriptome-wide gene expression of agaves under different environmental conditions in order to understand genetic pathways controlling CAM, water use efficiency, and thermotolerance. Finally, we aim to compare the transcriptome of A. tequilana with that of other Agave species to gain further insight into molecular mechanisms underlying traits desirable for biofuel feedstocks. These genomic approaches will provide sequence and gene expression information critical to the breeding and domestication of Agave species suitable for biofuel production.
Date: May 31, 2011
Creator: Gross, Stephen; Martin, Jeffrey; Simpson, June; Wang, Zhong & Visel, Axel
Partner: UNT Libraries Government Documents Department

Hemicellulolytic organisms in the particle-associated microbiota of the hoatzin crop

Description: The hoatzin (Opisthocomus hoazin) is a South American herbivorous bird that has an enlarged crop analogous to the rumen, where foregut microbes degrade the otherwise indigestible plant materials, providing energy to the host. The crop harbors an impressive array of microorganisms with potentially novel cellulolytic enzymes. This study describes the composition of the particle-associated microbiota in the hoatzin crop, combining a survey of 16S rRNA genes in 7 adult birds and metagenome sequencing of two animals. The pyrotag survey demonstrates that Prevotellaceae are the most abundant and ubiquitous taxa, suggesting that the degradation of hemicellulose is an important activity in the crop. Nonetheless, preliminary results from the metagenome of the particle-associated microbiota of two adult birds show that the crop microbiome contains a high number of genes encoding cellulases (such as GH5), which are more abundant than those of the termite gut, as well as genes encoding hemicellulases. These preliminary results show that the carbohydrate-active enzyme genes in the crop metagenome could be a source of biochemical catalysts able to deconstruct plant biomass.
Date: May 31, 2011
Creator: Godoy-Vitorino, Filipa; Malfatti, Stephanie; Garcia-Amado, Maria A.; Dominguez-Bello, Maria Gloria; Hugenholtz, Phillip & Tringe, Susannah
Partner: UNT Libraries Government Documents Department

The RNA-Seq Analysis pipeline on Galaxy

Description: Q: How do I know my RNA-Seq experiments worked well? A: RNA-Seq QC Pipeline. Q: How do I detect transcripts which are over-expressed or under-expressed in my samples? A: Counting and Statistic Analysis. Q: What do I do if I don't have a reference genome? A: Rnnotator de novo Assembly.
Date: May 31, 2011
Creator: Meng, Xiandong; Martin, Jeffrey & Wang, Zhong
Partner: UNT Libraries Government Documents Department

Using synthetic biology to screen for functional diversity of GH1 enzymes

Description: Advances in next-generation sequencing technologies have enabled single genomes as well as complex environmental samples (metagenomes) to be comprehensively sequenced on a routine basis. Bioinformatics analysis of the resulting sequencing data reveals a continually expanding catalogue of predicted proteins (14 million as of April 2011), 75 percent of which are associated with functional annotation (COG, Pfam, Enzyme, KEGG, etc.). These predicted proteins cover the full spectrum of known pathways and functional activities, including many novel biocatalysts that are expected to significantly contribute to the development of clean technologies including biomass degradation, lipid transformation for biodiesel generation, intermediates for polymer production, carbon capture, and bioremediation.
Date: May 31, 2011
Creator: Deutsch, Sam; Datta, Supratim; Hamilton, Matthew; Friedland, Greg; D'Haeseleer, Patrik; Chen, Jan-Fang et al.
Partner: UNT Libraries Government Documents Department

Novel Insights into the Diversity of Catabolic Metabolism from Ten Haloarchaeal Genomes

Description: The extremely halophilic archaea are present worldwide in saline environments and have important biotechnological applications. Ten complete genomes of haloarchaea are now available, providing an opportunity for comparative analysis. We report here the comparative analysis of five newly sequenced haloarchaeal genomes with five previously published ones. Whole genome trees based on protein sequences provide strong support for deep relationships between the ten organisms. Using a soft clustering approach, we identified 887 protein clusters present in all halophiles. Of these core clusters, 112 are not found in any other archaea and therefore constitute the haloarchaeal signature. Four of the halophiles were isolated from water, and four were isolated from soil or sediment. Although there are few habitat-specific clusters, the soil/sediment halophiles tend to have greater capacity for polysaccharide degradation, siderophore synthesis, and cell wall modification. Halorhabdus utahensis and Haloterrigena turkmenica encode over forty glycosyl hydrolases each, and may be capable of breaking down naturally occurring complex carbohydrates. H. utahensis is specialized for growth on carbohydrates and has few amino acid degradation pathways. It uses the non-oxidative pentose phosphate pathway instead of the oxidative pathway, giving it more flexibility in the metabolism of pentoses. These new genomes expand our understanding of haloarchaeal catabolic pathways, providing a basis for further experimental analysis, especially with regard to carbohydrate metabolism. Halophilic glycosyl hydrolases for use in biofuel production are more likely to be found in halophiles isolated from soil or sediment.
Date: May 3, 2011
Creator: Anderson, Iain; Scheuner, Carmen; Goker, Markus; Mavromatis, Kostas; Hooper, Sean D.; Porat, Iris et al.
Partner: UNT Libraries Government Documents Department

The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

Description: In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies, etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect (centromeres, intergenic regions, transposable elements, gene family number) is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.
Date: April 29, 2011
Creator: Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M. et al.
Partner: UNT Libraries Government Documents Department

Fueling the Future with Fungal Genomics

Description: Fungi play important roles across the range of current and future biofuel production processes. From crop/feedstock health to plant biomass saccharification, enzyme production to bioprocesses for producing ethanol, higher alcohols or future hydrocarbon biofuels, fungi are involved. Research and development are underway to understand the underlying biological processes and improve them to make bioenergy production efficient on an industrial scale. Genomics is the foundation of the systems biology approach that is being used to accelerate the research and development efforts across the spectrum of topic areas that impact biofuels production. In this review, we discuss past, current and future advances made possible by genomic analyses of the fungi that impact plant/feedstock health, degradation of lignocellulosic biomass and fermentation of sugars to ethanol, hydrocarbon biofuels and renewable chemicals.
Date: April 29, 2011
Creator: Grigoriev, Igor V.; Cullen, Daniel; Hibbett, David; Goodwin, Stephen B.; Jeffries, Thomas W.; Kubicek, Christian P. et al.
Partner: UNT Libraries Government Documents Department

Integrated genomic and transcriptomic analysis reveals mycoparasitism as the ancestral life style of Trichoderma

Description: Mycoparasitism, a lifestyle where one fungus is parasitic on another fungus, has special relevance when the prey is a plant pathogen, providing a strategy for biological control of pests for plant protection. Probably the most studied biocontrol agents are species of the genus Hypocrea/Trichoderma.
Date: April 29, 2011
Creator: Kubicek, Christian P.; Herrera-Estrella, Alfredo; Seidl, Verena; Le Crom, Stéphane; Martinez, Diego A. et al.
Partner: UNT Libraries Government Documents Department

The compact Selaginella genome identifies changes in gene content associated with the evolution of vascular plants

Description: We report the genome sequence of the nonseed vascular plant, Selaginella moellendorffii, and by comparative genomics identify genes that likely played important roles in the early evolution of vascular plants and their subsequent evolution.
Date: April 28, 2011
Creator: Grigoriev, Igor V.; Banks, Jo Ann; Nishiyama, Tomoaki; Hasebe, Mitsuyasu; Bowman, John L.; Gribskov, Michael et al.
Partner: UNT Libraries Government Documents Department

Comparative genomics of citric-acid producing Aspergillus niger ATCC 1015 versus enzyme-producing CBS 513.88

Description: The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compels additional exploration. We therefore undertook whole genome sequencing of the acidogenic A. niger wild type strain (ATCC 1015), and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was utilized to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 megabase of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis revealed up-regulation of the electron transport chain, specifically the alternative oxidative pathway in ATCC 1015, while CBS 513.88 showed significant up-regulation of genes relevant to glucoamylase A production, such as tRNA-synthases and protein transporters. 
Our results and datasets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi. [Supplemental materials (10 figures, three text documents and 16 tables) have been made available. The whole genome sequence for A. niger ATCC 1015 is available from NCBI under acc. no. ACJE00000000. The updated sequence for A. niger ...
Date: April 28, 2011
Creator: Grigoriev, Igor V.; Baker, Scott E.; Andersen, Mikael R.; Salazar, Margarita P.; Schaap, Peter J.; Vondervoot, Peter J.I. van de et al.
Partner: UNT Libraries Government Documents Department

Obligate Biotrophy Features Unraveled by the Genomic Analysis of the Rust Fungi, Melampsora larici-populina and Puccinia graminis f. sp. tritici

Description: Rust fungi are some of the most devastating pathogens of crop plants. They are obligate biotrophs, which extract nutrients only from living plant tissues and cannot grow apart from their hosts. Their lifestyle has slowed the dissection of molecular mechanisms underlying host invasion and avoidance or suppression of plant innate immunity. We sequenced the 101 mega base pair genome of Melampsora larici-populina, the causal agent of poplar leaf rust, and the 89 mega base pair genome of Puccinia graminis f. sp. tritici, the causal agent of wheat and barley stem rust. We then compared the 16,841 predicted proteins of M. larici-populina to the 18,241 predicted proteins of P. graminis f. sp. tritici. Genomic features related to their obligate biotrophic lifestyle include expanded lineage-specific gene families, a large repertoire of effector-like small secreted proteins (SSPs), impaired nitrogen and sulfur assimilation pathways, and expanded families of amino-acid, oligopeptide and hexose membrane transporters. The dramatic upregulation of transcripts coding for SSPs, secreted hydrolytic enzymes, and transporters in planta suggests that they play a role in host infection and nutrient acquisition. Some of these genomic hallmarks are mirrored in the genomes of other microbial eukaryotes that have independently evolved to infect plants, indicating convergent adaptation to a biotrophic existence inside plant cells.
Date: April 27, 2011
Creator: Duplessis, Sebastien; Cuomo, Christina A.; Lin, Yao-Cheng; Aerts, Andrea; Tisserant, Emilie; Veneault-Fourrey, Claire et al.
Partner: UNT Libraries Government Documents Department

Linking Advanced Visualization and MATLAB for the Analysis of 3D Gene Expression Data

Description: Three-dimensional gene expression PointCloud data generated by the Berkeley Drosophila Transcription Network Project (BDTNP) provides quantitative information about the spatial and temporal expression of genes in early Drosophila embryos at cellular resolution. The BDTNP team visualizes and analyzes PointCloud data using the software application PointCloudXplore (PCX). To maximize the impact of novel, complex data sets, such as PointClouds, the data needs to be accessible to biologists and comprehensible to developers of analysis functions. We address this challenge by linking PCX and Matlab via a dedicated interface, thereby providing biologists seamless access to advanced data analysis functions and giving bioinformatics researchers the opportunity to integrate their analysis directly into the visualization application. To demonstrate the usefulness of this approach, we computationally model parts of the expression pattern of the gene even skipped using a genetic algorithm implemented in Matlab and integrated into PCX via our Matlab interface.
Date: March 30, 2011
Creator: Ruebel, Oliver; Keranen, Soile V.E.; Biggin, Mark; Knowles, David W.; Weber, Gunther H.; Hagen, Hans et al.
Partner: UNT Libraries Government Documents Department

BioPig: Developing Cloud Computing Applications for Next-Generation Sequence Analysis

Description: Next-generation sequencing is producing ever larger data sizes with a growth rate outpacing Moore's Law. The data deluge has made many of the current sequence analysis tools obsolete because they do not scale with data. Here we present BioPig, a collection of cloud computing tools to scale data analysis and management. Pig is a flexible data scripting language that uses Apache's Hadoop data structure and map reduce framework to process very large data files in parallel and combine the results. BioPig extends Pig with sequence analysis capabilities. We will show the performance of BioPig on a variety of bioinformatics tasks, including screening sequence contaminants, Illumina QA/QC, and gene discovery from metagenome data sets using the Rumen metagenome as an example.
Date: March 22, 2011
Creator: Bhatia, Karan & Wang, Zhong
Partner: UNT Libraries Government Documents Department
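
The map reduce pattern mentioned in the BioPig description can be sketched in plain Python. This is only an illustration of the general idea (process chunks independently, then combine the partial results), not BioPig, Pig or Hadoop code; k-mer counting is chosen here as an assumed example task, and the names `map_chunk`, `merge_counts` and `count_kmers` are hypothetical.

```python
from collections import Counter
from functools import reduce


def map_chunk(reads, k):
    """Map step: count the k-mers in one chunk of reads, independently of other chunks."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def merge_counts(a, b):
    """Reduce step: combine two partial k-mer count tables."""
    return a + b


def count_kmers(chunks, k):
    """Run the map step on every chunk, then fold the partial results together.
    In a real cluster the map calls would run in parallel on different nodes."""
    partials = (map_chunk(chunk, k) for chunk in chunks)
    return reduce(merge_counts, partials, Counter())


if __name__ == "__main__":
    chunks = [["ACGT", "CGTA"], ["ACGA"]]  # toy data split into two chunks
    print(count_kmers(chunks, 3))
```

Because each chunk is counted without reference to any other, the work scales out by adding workers, and only the much smaller count tables need to be combined at the end.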