Query estimation and order-optimized iteration in very large federations

Physical Description

6 p.

Creation Information

Malon, D.M. & HENP Grand Challenge Collaboration. May 4, 1998.

Context

This article is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided by UNT Libraries Government Documents Department to Digital Library, a digital repository hosted by the UNT Libraries. More information about this article can be viewed below.

Who

People and organizations associated with either the creation of this article or its content.

Provided By

UNT Libraries Government Documents Department

Serving as both a federal and a state depository library, the UNT Libraries Government Documents Department maintains millions of items in a variety of formats. The department is a member of the FDLP Content Partnerships Program and an Affiliated Archive of the National Archives.

What

Descriptive information to help identify this article. Follow the links below to find similar items on the Digital Library.

Description

Objectivity federated databases may contain many terabytes of data and span thousands of files. In such an environment, a user can easily pose a query that returns an iterator over millions of objects and requires opening thousands of databases. This presentation describes several technologies developed for such settings: (1) a query estimator, which tells the user how many objects satisfy the query, and how many databases will be touched, before any of those files are opened; (2) an order-optimized iterator, which behaves like an ordinary iterator except that elements are returned in an order optimized for efficient access, presorted by the database (and container) in which they reside; (3) a parallel implementation of the order-optimized iterator, allowing any number of processes in a parallel or distributed system to iterate over disjoint subcollections of items satisfying the query, partitioned by the database or container in which the items reside. These technologies have been developed for scientific experiments that will require handling thousands of terabytes of data annually, but they are intended to be applicable in other massive data settings as well. In such environments, significant amounts of data will reside on tertiary storage, accessible via Objectivity's recently announced HPSS (High Performance Storage System) interface. When deployed in large-scale physics settings later in 1998, the query estimator will further inform the user of the number of tape mounts required to satisfy the query, and provide rough time estimates for data delivery. The order-optimized iterator will be connected to a cache manager that will prefetch from tape to disk the files needed by the query (known from the query estimation step), and will decide which items to deliver to the user next according to the order in which data become available in the disk cache.
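
The three ideas in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use the Objectivity API: the `Ref` record, the function names, and the sample data are all hypothetical, and the sketch assumes that an index can report the database, container, and slot of every matching object without opening the underlying files.

```python
from collections import defaultdict
from typing import Iterator, NamedTuple


class Ref(NamedTuple):
    """Hypothetical object reference: where a matching object resides."""
    database: int
    container: int
    slot: int


def estimate(refs: list[Ref]) -> tuple[int, int]:
    """Return (matching objects, distinct databases touched) from index data alone."""
    return len(refs), len({r.database for r in refs})


def order_optimized(refs: list[Ref]) -> Iterator[Ref]:
    """Yield references presorted by database, then container, then slot."""
    yield from sorted(refs)


def partition(refs: list[Ref], workers: int) -> list[list[Ref]]:
    """Split references into disjoint per-worker lists without splitting a database."""
    by_db = defaultdict(list)
    for r in refs:
        by_db[r.database].append(r)
    parts: list[list[Ref]] = [[] for _ in range(workers)]
    # Greedy balancing (an assumption of this sketch): hand the largest
    # remaining database to the worker that currently holds the fewest items.
    for db in sorted(by_db, key=lambda d: (-len(by_db[d]), d)):
        lightest = min(range(workers), key=lambda i: len(parts[i]))
        parts[lightest].extend(sorted(by_db[db]))
    return parts


# Made-up query result: five matching objects spread over three databases.
hits = [Ref(7, 1, 4), Ref(2, 3, 0), Ref(7, 1, 1), Ref(2, 1, 9), Ref(5, 2, 2)]

n_objects, n_databases = estimate(hits)
ordered = list(order_optimized(hits))
parts = partition(hits, workers=2)
```

Sorting on `(database, container, slot)` means each database is visited in one contiguous run, and partitioning by database keeps any two workers from opening the same file. A cache-aware variant, as the description anticipates, would instead order delivery by which files have already been staged to disk.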

Notes

INIS; OSTI as DE98057832

Source

  • Objectivity worldview '98 conference, Berkeley, CA (United States), 14-15 May 1998

Identifier

Unique identifying numbers for this article in the Digital Library or other systems.

  • Other: DE98057832
  • Report No.: ANL/HEP/CP--98-38
  • Report No.: CONF-980577--
  • Grant Number: W-31109-ENG-38
  • Office of Scientific & Technical Information Report Number: 656716
  • Archival Resource Key: ark:/67531/metadc710566

Collections

This article is part of the following collection of related materials.

Office of Scientific & Technical Information Technical Reports

When

Dates and time periods associated with this article.

Creation Date

  • May 4, 1998

Added to The UNT Digital Library

  • Sept. 12, 2015, 6:31 a.m.

Description Last Updated

  • Dec. 16, 2015, 5:16 p.m.

Citations, Rights, Re-Use

Malon, D.M. & Collaboration, HENP Grand Challenge. Query estimation and order-optimized iteration in very large federations, article, May 4, 1998; Illinois. (digital.library.unt.edu/ark:/67531/metadc710566/: accessed August 21, 2017), University of North Texas Libraries, Digital Library, digital.library.unt.edu; crediting UNT Libraries Government Documents Department.