This report is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided to Digital Library by the UNT Libraries Government Documents Department.
Chunking of Large Multidimensional Arrays
Doron Rotem and Ekow Otoo
LBNL, University of California
1 Cyclotron Road
Berkeley, CA 94720

Sridhar Seshadri
Leonard N. Stern School of Business
New York University
44 W. 4th St., 7-60, New York, 10012-1126
sseshadr@stern.nyu.edu
Very large multidimensional arrays are commonly used in data-intensive scientific
computations as well as in on-line analytical processing applications, referred to as MOLAP.
Such arrays are stored on disk by partitioning the large global array into fixed-size
sub-arrays called chunks or tiles, which form the units of data transfer between disk
and memory. Typical queries retrieve sub-arrays, and answering a query requires accessing
every chunk that overlaps the query result. An important metric of storage efficiency is
the expected number of chunks retrieved over all such queries. The question that immediately
arises is "what shapes of array chunks give the minimum expected number of chunks
over a query workload?" The problem of optimal chunking was first introduced by Sarawagi
and Stonebraker, who gave an approximate solution. In this paper we develop exact
mathematical models of the problem and provide exact solutions using steepest descent and
geometric programming methods. Experimental results, using synthetic and real-life
workloads, show that our solutions are consistently within 2.0% of the true number of chunks
retrieved for any number of dimensions. In contrast, the approximate solution of Sarawagi
and Stonebraker can deviate considerably from the true result as the number of dimensions
increases.
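The metric described above can be illustrated with a small brute-force sketch. This is not the model or the solution method of the report; it only counts, for a hypothetical small array, how many chunks each query overlaps and averages that count over a workload. The function names, the uniform placement of query start positions, and the workload format (pairs of query shape and probability) are all assumptions made for illustration.

```python
from itertools import product
from math import prod


def chunks_touched(start, shape, chunk):
    """Count the chunks overlapped by the query box that begins at `start`
    and extends `shape[i]` cells along each dimension i."""
    # Along one dimension the query covers cells s .. s+q-1, which fall into
    # chunk indices s // c through (s + q - 1) // c inclusive.
    return prod((s + q - 1) // c - s // c + 1
                for s, q, c in zip(start, shape, chunk))


def expected_chunks(array_shape, chunk, workload):
    """Average number of chunks retrieved over a workload.

    `workload` is a list of (query_shape, probability) pairs (an assumed
    format). Each query shape is assumed to be placed uniformly at random
    over every start position where it fits inside the array.
    """
    total = 0.0
    for qshape, p in workload:
        starts = product(*(range(n - q + 1)
                           for n, q in zip(array_shape, qshape)))
        counts = [chunks_touched(s, qshape, chunk) for s in starts]
        total += p * sum(counts) / len(counts)
    return total


if __name__ == "__main__":
    # Hypothetical 16 x 16 array, workload of 8 x 2 queries only.
    workload = [((8, 2), 1.0)]
    # Three chunk shapes holding the same number of cells (16).
    for chunk in [(4, 4), (8, 2), (2, 8)]:
        print(chunk, expected_chunks((16, 16), chunk, workload))
```

In this toy setting the three chunk shapes hold the same number of cells, yet the chunk shape aligned with the query shape gives the lowest average and the transposed shape the highest, which is the effect the chunk-shape optimization targets. Exhaustive enumeration is only practical for very small arrays.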
Categories and Subject Descriptors
H.2 [Information Systems, Database Management]; H.2.2 [Physical Design]; H.2.8 [Database
Applications, Scientific databases, Statistical databases]
Multidimensional Arrays, Algorithms, Array Chunking.
Rotem, Doron; Otoo, Ekow J. & Seshadri, Sridhar. Chunking of Large Multidimensional Arrays, report, February 28, 2007; Berkeley, California. (https://digital.library.unt.edu/ark:/67531/metadc896932/m1/1/: accessed April 25, 2019), University of North Texas Libraries, Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.