LDRD 99-ERI-010 Final Report: Sapphire: Scalable Pattern Recognition for Large-Scale Scientific Data Mining


Physical Description

21 pages; PDF file, 1.1 MB

Creation Information

Kamath, C. January 30, 2002.

Context

This report is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided by the UNT Libraries Government Documents Department to the UNT Digital Library, a digital repository hosted by the UNT Libraries. It has been viewed 20 times. More information about this report can be viewed below.

Who

People and organizations associated with either the creation of this report or its content.

Author

Kamath, C.

Provided By

UNT Libraries Government Documents Department

Serving as both a federal and a state depository library, the UNT Libraries Government Documents Department maintains millions of items in a variety of formats. The department is a member of the FDLP Content Partnerships Program and an Affiliated Archive of the National Archives.

What

Descriptive information to help identify this report. Follow the links below to find similar items on the Digital Library.

Description

There is a rapidly widening gap between our ability to collect data and our ability to explore, analyze, and understand the data. As a result, useful information is overlooked, and the potential benefits of increased computational and data gathering capabilities are only partially realized. This problem of data overload is becoming a serious impediment to scientific advancement in areas as diverse as counter-proliferation, the Accelerated Strategic Computing Initiative (ASCI), astrophysics, computer security, and climate modeling, where vast amounts of data are collected through observations or simulations. To improve the way in which scientists extract useful information from their data, we are developing a new generation of tools and techniques based on data mining. Data mining is the semi-automated discovery of patterns, associations, anomalies, and statistically significant structures in data. It consists of two steps: in data pre-processing, we extract high-level features from the data, and in pattern recognition, we use the features to identify and characterize patterns in the data. In this project, our focus is on developing scalable algorithms for the pattern recognition task of classification. Our goal is to improve the performance of these algorithms without sacrificing accuracy. We are demonstrating these techniques using an astronomy application, namely the detection of radio-emitting galaxies with a bent-double morphology in the FIRST survey. Our research has been incorporated into software to make it easily accessible to LLNL scientists. The author describes the project's accomplishments in each of these three areas: scalable classification algorithms, the astronomy application, and the software.

Source

  • Other Information: PBD: 30 Jan 2002

Identifier

Unique identifying numbers for this report in the Digital Library or other systems.

  • Archival Resource Key: ark:/67531/metadc1395213

Collections

This report is part of the following collection of related materials.

Office of Scientific & Technical Information Technical Reports

Reports, articles and other documents harvested from the Office of Scientific and Technical Information.

Office of Scientific and Technical Information (OSTI) is the Department of Energy (DOE) office that collects, preserves, and disseminates DOE-sponsored research and development (R&D) results that are the outcomes of R&D projects or other funded activities at DOE labs and facilities nationwide and grantees at universities and other institutions.

When

Dates and time periods associated with this report.

Creation Date

  • January 30, 2002

Added to The UNT Digital Library

  • Jan. 12, 2019, 4:41 p.m.

Description Last Updated

  • Feb. 5, 2019, 8:04 p.m.



Kamath, C. LDRD 99-ERI-010 Final Report: Sapphire: Scalable Pattern Recognition for Large-Scale Scientific Data Mining, report, January 30, 2002; California. (https://digital.library.unt.edu/ark:/67531/metadc1395213/: accessed July 17, 2025), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.
