Semi-supervised and Self-evolving Learning Algorithms with Application to Anomaly Detection in Cloud Computing

Pannu, Husanbir Singh

Creation Information

Pannu, Husanbir Singh December 2012.

Context

This dissertation is part of the collection entitled: UNT Theses and Dissertations and was provided by the UNT Libraries to the UNT Digital Library, a digital repository hosted by the UNT Libraries. It has been viewed 907 times. More information about this dissertation can be viewed below.

Author

Pannu, Husanbir Singh

Chair

Liu, Jianguo (Major Professor)

Publisher

University of North Texas
Publisher Info: www.unt.edu

Place of Publication: Denton, Texas

Rights Holder

Pannu, Husanbir Singh

Provided By

UNT Libraries

The UNT Libraries serve the university and community by providing access to physical and online collections, fostering information literacy, supporting academic research, and much, much more.

Degree Information

Department: Department of Mathematics
Discipline: Mathematics
Level: Doctoral
Name: Doctor of Philosophy
Grantor: University of North Texas
PublicationType: Doctoral Dissertation

Description

Semi-supervised learning (SSL) is the most practical approach for classification among machine learning algorithms. It is similar to the way humans learn and thus has many applications in text/image classification, bioinformatics, artificial intelligence, robotics, etc. Labeled data is hard to obtain in real-life experiments and may require human experts with experimental equipment to mark the labels, which can be slow and expensive. Unlabeled data, by contrast, is readily available in the form of web pages, data logs, images, audio, video files, and DNA/RNA sequences. SSL uses a large amount of unlabeled data and a small amount of labeled data to build better classifying functions, which achieve higher accuracy and need less human effort. Thus it is of great empirical and theoretical interest. We contribute two SSL algorithms, (i) adaptive anomaly detection (AAD) and (ii) hybrid anomaly detection (HAD), which are self-evolving and very efficient at detecting anomalies in large-scale and complex data distributions. Our algorithms are capable of modifying an existing classifier by both retiring old data and adding new data. This characteristic enables the proposed algorithms to handle massive and streaming datasets on which other existing algorithms fail and run out of memory. As an application to semi-supervised anomaly detection and for experimental illustration, we have implemented a prototype of the AAD and HAD systems and conducted experiments in an on-campus cloud computing environment. Experimental results show that the detection accuracy of both algorithms improves as they evolve and can reach 92.1% detection sensitivity and 83.8% detection specificity, which makes them well suited for anomaly detection in large and streaming datasets. We compared our algorithms with two popular SSL methods: (i) subspace regularization and (ii) an ensemble of Bayesian sub-models and decision tree classifiers. Our contributed algorithms are easy to implement and significantly better in terms of space complexity, time complexity, and accuracy than these two methods for semi-supervised anomaly detection.
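
For readers unfamiliar with the two metrics quoted in the abstract, the sketch below shows how detection sensitivity (the fraction of true anomalies that are flagged) and detection specificity (the fraction of normal samples that are left unflagged) are conventionally computed. It is not taken from the dissertation: the function name `sensitivity_specificity`, the label encoding (1 = anomaly, 0 = normal), and the sample labels are all assumptions made for illustration.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding (not from the dissertation): 1 = anomaly, 0 = normal.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # anomalies caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # anomalies missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normals passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one anomaly and one normal sample")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up labels for illustration only; not data from the dissertation.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Under these definitions, the reported 92.1% sensitivity would mean that about 92 of every 100 true anomalies were detected, and the 83.8% specificity that about 84 of every 100 normal samples were correctly left unflagged.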

Language

English

Item Type

Thesis or Dissertation

Identifier

Unique identifying numbers for this dissertation in the Digital Library or other systems.

Archival Resource Key: ark:/67531/metadc177238

Collections

This dissertation is part of the following collection of related materials.

UNT Theses and Dissertations

Theses and dissertations represent a wealth of scholarly and artistic content created by masters and doctoral students in the degree-seeking process. Some ETDs in this collection are restricted to use by the UNT community.

Creation Date

December 2012

Added to The UNT Digital Library

Aug. 13, 2013, 2:47 p.m.

Description Last Updated

Nov. 16, 2016, 1:16 p.m.

Usage Statistics

Yesterday: 0

Past 30 days: 0

Total Uses: 907

Pannu, Husanbir Singh. Semi-supervised and Self-evolving Learning Algorithms with Application to Anomaly Detection in Cloud Computing, dissertation, December 2012; Denton, Texas. (https://digital.library.unt.edu/ark:/67531/metadc177238/: accessed April 25, 2024), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu.
