Description: DNA sequence analysis involves precise discrimination of two of the sequence's most important components: exons and introns. Exons encode the proteins that are responsible for almost all the functions in a living organism. Introns interrupt the sequence coding for a protein and must be removed from primary RNA transcripts before translation to protein can occur. A pattern recognition technique called Finite Induction (FI) is utilized to study the language of exons and introns. FI is especially suited for analyzing and classifying large amounts of data representing sequences of interest. It requires no biological information and employs no statistical functions. Finite Induction is applied to the exon and intron components of DNA by building a collection of rules based upon what it finds in the sequences it examines. It then attempts to match the known rule patterns with new rules formed as a result of analyzing a new sequence. A high number of matches predict a probable close relationship between the two sequences; a low number of matches signifies a large amount of difference between the two. This research demonstrates FI to be a viable tool for measurement when known patterns are available for the formation of rule sets.
Date: December 1997
Creator: Taylor, Pamela A., 1941-