Markov Model of Segmentation and Clustering: Applications in Deciphering Genomes and Metagenomes

Pandey, Ravi Shanker

Physical Description

xii, 135 pages

Creation Information

Pandey, Ravi Shanker. August 2017.

Context

This dissertation is part of the collection entitled UNT Theses and Dissertations and was provided by the UNT Libraries to the UNT Digital Library, a digital repository hosted by the UNT Libraries.

Author

Pandey, Ravi Shanker

Chair

Azad, Rajeev (Major Professor)

Publisher

University of North Texas
Publisher Info: www.unt.edu

Place of Publication: Denton, Texas

Rights Holder

Pandey, Ravi Shanker

Provided By

UNT Libraries

Degree Information

Name: Doctor of Philosophy
Level: Doctoral
Department: Department of Biological Sciences
College: College of Arts and Sciences
Discipline: Biology
Publication Type: Doctoral Dissertation
Grantor: University of North Texas

Description

The rapid accumulation of genomic data from high-throughput sequencing has necessitated the development of efficient computational methods to decode the biological information underlying these data. DNA composition varies across structurally or functionally different regions of a genome, as well as across regions of distinct evolutionary origins. We adapted an integrative framework that combines a top-down, recursive segmentation algorithm with a bottom-up, agglomerative clustering algorithm to decipher compositionally distinct regions in genomes. The recursive segmentation procedure fragments a genome into compositionally distinct segments within a statistical hypothesis testing framework. An agglomerative clustering procedure then groups compositionally similar segments within the same framework. One of our main objectives was to decipher distinctive evolutionary patterns in sex chromosomes by unraveling the underlying compositional heterogeneity. Application of this approach to the human X chromosome provided novel insights into the stratification of the X chromosome as a consequence of punctuated recombination suppressions between the X and Y, from the distal long arm to the distal short arm. Novel "evolutionary strata" were identified, particularly in the X conserved region (XCR), which is not amenable to X-Y comparative analysis because of the massive loss of Y gametologs following recombination cessation. Our composition-based approach could circumvent the limitations of current methods that depend on X-Y (or Z-W, for the ZW sex determination system) comparisons by deciphering the stratification even if only the sequence of the sex chromosome in the homogametic sex (i.e., the X or Z chromosome) is available.
These studies were extended to plant sex chromosomes, which are known to have a number of evolutionary strata that formed at the initial stage of their evolution, presenting an opportunity to examine the onset of stratum formation on sex chromosomes. Further applications included the detection of horizontally acquired DNAs in the extremophilic eukaryote Galdieria sulphuraria, which encode a variety of potentially adaptive functions, and the taxonomic profiling of metagenomic sequences. Finally, we discussed how the Markovian segmentation and clustering method can be made more sensitive and robust for future applications in the biological and biomedical sciences.
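
The two-step idea described above (top-down recursive segmentation followed by bottom-up agglomerative clustering) can be illustrated with a minimal Python sketch. This is not the dissertation's implementation. It rests on several simplifying assumptions: composition is modeled with single-nucleotide frequencies (a zeroth-order model) rather than a higher-order Markov model, the Jensen-Shannon divergence is used as the measure of compositional difference, and a fixed divergence threshold stands in for the statistical hypothesis test. The function names (entropy, jsd, best_split, segment, cluster) and the parameters threshold and min_len are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (in bits) of a table of symbol counts."""
    return -sum((c / total) * math.log2(c / total)
                for c in counts.values() if c > 0)


def jsd(left, right):
    """Length-weighted Jensen-Shannon divergence between two sequences."""
    n_l, n_r = len(left), len(right)
    n = n_l + n_r
    c_l, c_r = Counter(left), Counter(right)
    return (entropy(c_l + c_r, n)
            - (n_l / n) * entropy(c_l, n_l)
            - (n_r / n) * entropy(c_r, n_r))


def best_split(seq, min_len):
    """Return (position, divergence) of the split with the largest divergence."""
    n = len(seq)
    total = Counter(seq)
    h_all = entropy(total, n)
    left = Counter()
    best_pos, best_d = None, 0.0
    for i in range(1, n):
        left[seq[i - 1]] += 1          # counts for seq[:i]
        if i < min_len or n - i < min_len:
            continue                   # keep both halves at least min_len long
        right = total - left           # counts for seq[i:]
        d = (h_all - (i / n) * entropy(left, i)
             - ((n - i) / n) * entropy(right, n - i))
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d


def segment(seq, start=0, threshold=0.1, min_len=20):
    """Top-down step: split recursively while the best split exceeds threshold.

    The fixed threshold is an assumption standing in for a significance test.
    Returns a list of (start, end) coordinates.
    """
    pos, d = best_split(seq, min_len)
    if pos is None or d < threshold:
        return [(start, start + len(seq))]
    return (segment(seq[:pos], start, threshold, min_len)
            + segment(seq[pos:], start + pos, threshold, min_len))


def cluster(seq, segments, threshold=0.1):
    """Bottom-up step: merge the two most similar clusters until none are similar."""
    clusters = [[s] for s in segments]

    def text(cl):
        return "".join(seq[a:b] for a, b in cl)

    while len(clusters) > 1:
        pairs = [(jsd(text(clusters[i]), text(clusters[j])), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        d, i, j = min(pairs)
        if d >= threshold:
            break                      # remaining clusters are all distinct
        clusters[i] += clusters.pop(j)
    return clusters


# Toy sequence: an AT-rich block, a GC-rich block, then another AT-rich block.
seq = "AT" * 40 + "GC" * 40 + "TA" * 40
segments = segment(seq)
clusters = cluster(seq, segments)
print(segments)
print(clusters)
```

On this toy input the segmentation step recovers the three blocks, and the clustering step groups the two AT-rich blocks together even though they are not adjacent, which is the kind of grouping the combined framework is meant to capture. Real genomic data would need a higher-order model and a proper significance test in place of the fixed threshold.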

Language

English

Item Type

Thesis or Dissertation

Identifier

Unique identifying numbers for this dissertation in the Digital Library or other systems.

Accession or Local Control No: submission_741
Archival Resource Key: ark:/67531/metadc1011827

Collections

This dissertation is part of the following collection of related materials.

UNT Theses and Dissertations

Theses and dissertations represent a wealth of scholarly and artistic content created by masters and doctoral students in the degree-seeking process. Some ETDs in this collection are restricted to use by the UNT community.

Creation Date

August 2017

Added to The UNT Digital Library

Oct. 9, 2017, 11:44 a.m.

Description Last Updated

Feb. 1, 2024, 11:18 a.m.

Pandey, Ravi Shanker. Markov Model of Segmentation and Clustering: Applications in Deciphering Genomes and Metagenomes, dissertation, August 2017; Denton, Texas. (https://digital.library.unt.edu/ark:/67531/metadc1011827/: accessed July 17, 2024), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu.
