Classifying genes to the correct Gene Ontology Slim term in Saccharomyces cerevisiae using neighbouring genes with classification learning
This article is part of the collection entitled: UNT Scholarly Works and was provided to UNT Digital Library by the UNT College of Engineering.
Amthauer and Tsatsoulis BMC Genomics 2010, 11:340
http://www.biomedcentral.com/1471-2164/11/340
Classifying genes to the correct Gene Ontology
Slim term in Saccharomyces cerevisiae using
neighbouring genes with classification learning
Heather A Amthauer1*, Costas Tsatsoulis2
Abstract
Background: There is increasing evidence that gene location and surrounding genes influence the functionality of
genes in the eukaryotic genome. Knowing the Gene Ontology Slim terms associated with a gene gives us insight
into a gene's functionality by informing us how its gene product behaves in a cellular context using three different
ontologies: molecular function, biological process, and cellular component. In this study, we analyzed whether we could
classify a gene in Saccharomyces cerevisiae to its correct Gene Ontology Slim term using information about its
location in the genome and information from its nearest-neighbouring genes using classification learning.
Results: We performed experiments to establish that the MultiBoostAB algorithm using the J48 classifier could
correctly classify Gene Ontology Slim terms of a gene given information regarding the gene's location and
information from its nearest-neighbouring genes for training. Different neighbourhood sizes were examined to
determine how many nearest neighbours should be included around each gene to provide better classification
rules. Our results show that incorporating neighbour information from just each gene's two nearest neighbours is
enough for the percentage of genes classified to their correct Gene Ontology Slim term to exceed 80% for each
ontology, with high accuracy of the classification rules produced (reflected in F-measures over 0.80).
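The setup described in the Results can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not reproduce MultiBoostAB or J48; it only shows one plausible way to turn neighbouring genes into features and how an F-measure is computed. The gene names, the toy term labels, the helper names `neighbour_features` and `f_measure`, and the choice of taking neighbours by position in an ordered list are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical toy data: (gene name, GO Slim term), ordered by position
# along a single chromosome. Names and labels are illustrative only.
GENES: List[Tuple[str, str]] = [
    ("geneA", "transport"),
    ("geneB", "transport"),
    ("geneC", "translation"),
    ("geneD", "transport"),
]


def neighbour_features(
    genes: List[Tuple[str, str]], index: int, k: int = 1
) -> List[Optional[str]]:
    """Return the GO Slim terms of the k genes on each side of genes[index].

    Assumption: "nearest" means adjacent in the ordered list. Neighbours
    that would fall off either end of the list are recorded as None.
    """
    feats: List[Optional[str]] = []
    for offset in range(-k, k + 1):
        if offset == 0:
            continue  # skip the gene being classified
        j = index + offset
        feats.append(genes[j][1] if 0 <= j < len(genes) else None)
    return feats


def f_measure(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall for a single class."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

With `k=1`, each gene gets two neighbour features (one upstream, one downstream); rows built this way could then be passed to any classifier. An F-measure above 0.80 requires precision and recall to be high together, which is why it is reported alongside the percentage of correctly classified genes.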
Conclusions: We confirmed that in classifying genes to their correct Gene Ontology Slim term, the inclusion of
neighbour information from those genes is beneficial. Knowing the location of a gene and the Gene Ontology
Slim information from neighbouring genes gives us insight into that gene's functionality. This benefit is seen by
just including information from a gene's two-nearest neighbouring genes.
Background
Determining novel gene functionality is critical for
bringing a better understanding of how an organism
functions as a whole. Traditional biological approaches
to determining gene functions mainly focus on testing
specific hypotheses through well-designed mutagenesis
experiments. However, methods of this kind are costly in
both labour and funds. With the proliferation of protein
and nucleic acid sequences catalogued in genome
databases, the investigation of the function of a gene and
its encoded product often begins by comparing its
sequence with those of previously characterized genes.
But the search for homologues does not always
reveal information about function. As noted by Alberts
et al. [1] in the Saccharomyces cerevisiae genome, "30%
of the previously uncharacterized genes could be
assigned a putative function by homology analysis; 10%
had homologues whose function was also unknown; and
another 30% had no homologues in any existing
databases (the remaining 30% of the genes had been
identified before sequencing the yeast genome)." Sequence
similarity alone cannot provide full function specificity
[2]. The predictions that emerge from sequence analysis
are often only a tool to direct further experimental
investigations.
Knowing the Gene Ontology Slim terms associated
with a gene gives us insight into how its gene product
behaves in a cellular context using three different
ontologies: molecular function, biological process, and
cellular component. These terms describe where a gene
product is located or its association with cellular
* Correspondence: haamthauer@frostburg.edu
Department of Computer Science, Frostburg State University, Frostburg,
Maryland, USA
© 2010 Amthauer and Tsatsoulis; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the
Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and
reproduction in any medium, provided the original work is properly cited.
Amthauer, Heather A. & Tsatsoulis, C. (Costas), 1962-. Classifying genes to the correct Gene Ontology Slim term in Saccharomyces cerevisiae using neighbouring genes with classification learning, article, May 28, 2010; [London, United Kingdom]. (https://digital.library.unt.edu/ark:/67531/metadc122144/m1/1/: accessed April 24, 2024), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu; crediting UNT College of Engineering.