Date: December 2012
Creator: Pannu, Husanbir Singh
Description: Semi-supervised learning (SSL) is the most practical approach for classification among machine learning algorithms. It is similar to the humans way of learning and thus has great applications in text/image classification, bioinformatics, artificial intelligence, robotics etc. Labeled data is hard to obtain in real life experiments and may need human experts with experimental equipments to mark the labels, which can be slow and expensive. But unlabeled data is easily available in terms of web pages, data logs, images, audio, video les and DNA/RNA sequences. SSL uses large unlabeled and few labeled data to build better classifying functions which acquires higher accuracy and needs lesser human efforts. Thus it is of great empirical and theoretical interest. We contribute two SSL algorithms (i) adaptive anomaly detection (AAD) (ii) hybrid anomaly detection (HAD), which are self evolving and very efficient to detect anomalies in a large scale and complex data distributions. Our algorithms are capable of modifying an existing classier by both retiring old data and adding new data. This characteristic enables the proposed algorithms to handle massive and streaming datasets where other existing algorithms fail and run out of memory. As an application to semi-supervised anomaly detection and for experimental illustration, we ...
Contributing Partner: UNT Libraries