Chunking of Large Multidimensional Arrays Page: 8 of 19
This report is part of the collection entitled: Office of Scientific & Technical Information Technical Reports and was provided to Digital Library by the UNT Libraries Government Documents Department.
The following text was automatically extracted from the image on this page using optical character recognition software:
Given a shape A = (A1, A2, ... , Aj), assuming the array M is split into chunks of dimensions
c = (ci, c2, ..., cA), we will denote by E(A, c) the expected number of chunks overlapping the
shape A assuming it can be located randomly anywhere in the array M.
Lemma 4.1. E(A, c) = H 1(Aql + 1)
Proof. It is easy to see that
E(A, c) = V E(A, c)
where AZ = (AZ) is a one dimensional shape and c2 = (ci) is a one dimensional chunk. It is
therefore sufficient to calculate the number of chunks overlapping a shape on a one dimensional
array M. For simplicity we will omit the angular brackets whenever it is clear from the context
whether we are discussing a shape vector or its dimensions.
We will now proceed to show that given a one dimensional shape of size AZ and assuming
a one dimensional array M is partitioned into chunks of size c, the expected number of chunks
overlapping this shape is
E(A2, c) = + 1.
Let us number the points within each chunk from 0 to c - 1 (see Figure 3). We can express AZ
as Ai = mc + r where m (quotient) and r (remainder) are integers with 0 K m and 0 K r < c.
Under the assumption that a shape can fall uniformly anywhere in the array, the left end-
point of a shape can fall on any of the c points within a chunk with equal probability 1/c. This
assumes Ai is relatively small compared to the total size of M and ignores small "edge" effects
due to the constraint that the right endpoint of a range must also fit in the array. Let us denote
by R1 and R2 the sub-intervals within each chunk consisting of the leftmost c - (r - 1) and
rightmost r - 1 points respectively. In the event that the left endpoint of the shape falls in
R1 it will overlap m + 1 chunks and if it falls in region R2 it will overlap m + 2 chunks. For
example, in Figure 3 this is illustrated for the case Ai = 8 and c = 5, i.e., m = 1 and r = 3. The
possible positions within a chunk where the query shape may fall are labeled as r2. We see that
the positions r0,r1 and r2 overlap m + 1 = 2 chunks whereas the positions r3 and r4 overlap
m + 2 = 3 chunks. In this case the expected number of chunks is 3/5 x 2 + 2/5 x 3 = 12/5. In
general, the expected number of chunks that are overlapped by the shape is therefore
E(A, c) = c - (r - 1) (m + 1) + -1(m +2).
By using m = (Ai - r)/c and rearranging, we get the required result of
E(Ac)= 1 1 (4.1)
Table 2: Individual range probabilities under model (IAR)
Dimension Range Appears Range
# value in query# probability
1 2 1,4 1/2
1 3 2 1/4
1 4 3 1/4
2 3 1,3,4 3/4
2 4 2 1/4
Here’s what’s next.
This report can be searched. Note: Results may vary based on the legibility of text within the document.
Tools / Downloads
Get a copy of this page or view the extracted text.
Citing and Sharing
Basic information for referencing this web page. We also provide extended guidance on usage rights, references, copying or embedding.
Reference the current page of this Report.
Rotem, Doron; Otoo, Ekow J. & Seshadri, Sridhar. Chunking of Large Multidimensional Arrays, report, February 28, 2007; Berkeley, California. (https://digital.library.unt.edu/ark:/67531/metadc896932/m1/8/: accessed May 25, 2019), University of North Texas Libraries, Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.