Clustering Algorithms for Time Series Gene Expression in Microarray Data

PDF Version Also Available for Download.

Description

Clustering techniques are important for gene expression data analysis. However, efficient computational algorithms for clustering time-series data are still lacking. This work documents two improvements on an existing profile-based greedy algorithm for short time-series data; the first one is implementation of a scaling method on the pre-processing of the raw data to handle some extreme cases; the second improvement is modifying the strategy to generate better clusters. Simulation data and real microarray data were used to evaluate these improvements; this approach could efficiently generate more accurate clusters. A new feature-based algorithm was also developed in which steady state value; overshoot, ... continued below

Creation Information

Zhang, Guilin August 2012.

Context

This thesis is part of the collection entitled: UNT Theses and Dissertations and was provided by UNT Libraries to Digital Library, a digital repository hosted by the UNT Libraries. It has been viewed 404 times . More information about this thesis can be viewed below.

Who

People and organizations associated with either the creation of this thesis or its content.

Author

Chair

Committee Members

Publisher

Rights Holder

For guidance see Citations, Rights, Re-Use.

  • Zhang, Guilin

Provided By

UNT Libraries

The UNT Libraries serve the university and community by providing access to physical and online collections, fostering information literacy, supporting academic research, and much, much more.

Contact Us

What

Descriptive information to help identify this thesis. Follow the links below to find similar items on the Digital Library.

Degree Information

Description

Clustering techniques are important for gene expression data analysis. However, efficient computational algorithms for clustering time-series data are still lacking. This work documents two improvements on an existing profile-based greedy algorithm for short time-series data; the first one is implementation of a scaling method on the pre-processing of the raw data to handle some extreme cases; the second improvement is modifying the strategy to generate better clusters. Simulation data and real microarray data were used to evaluate these improvements; this approach could efficiently generate more accurate clusters. A new feature-based algorithm was also developed in which steady state value; overshoot, rise time, settling time and peak time are generated by the 2nd order control system for the clustering purpose. This feature-based approach is much faster and more accurate than the existing profile-based algorithm for long time-series data.

Language

Collections

This thesis is part of the following collection of related materials.

UNT Theses and Dissertations

Theses and dissertations represent a wealth of scholarly and artistic content created by masters and doctoral students in the degree-seeking process. Some ETDs in this collection are restricted to use by the UNT community.

What responsibilities do I have when using this thesis?

When

Dates and time periods associated with this thesis.

Creation Date

  • August 2012

Added to The UNT Digital Library

  • Aug. 13, 2013, 2:47 p.m.

Description Last Updated

  • Nov. 16, 2016, 5:28 p.m.

Usage Statistics

When was this thesis last used?

Yesterday: 1
Past 30 days: 3
Total Uses: 404

Interact With This Thesis

Here are some suggestions for what to do next.

Start Reading

PDF Version Also Available for Download.

International Image Interoperability Framework

IIF Logo

We support the IIIF Presentation API

Zhang, Guilin. Clustering Algorithms for Time Series Gene Expression in Microarray Data, thesis, August 2012; Denton, Texas. (digital.library.unt.edu/ark:/67531/metadc177269/: accessed October 21, 2018), University of North Texas Libraries, Digital Library, digital.library.unt.edu; .