Sparse Table

Sparse TableA sparse table is a data structure that uses pre-processing to respond to static Range Minimum Queries (RMQ).  Let’s, understand this with the help of an example in which, we consider you have many buddies. They’ve all seen parts of the movie, and when combined, they’ve seen the whole movie. Then you won’t need to go to see the movie in cinema, if your buddies are sufficiently communicative. You simply inquire about the scenario, and they slowly explain it to you. Let’s get back to our world of professional computing. A data structure called a sparse table, in the case of immutable data, is frequently used as a substitute for segment tree. Let’s say you have an array Abc and want to run some queries on it. F({Abc}_{L}, {Abc}_{L+1},..., {Abc}_{R}) should be computed for each query over the sub array [L, R]: F({Abc}_{L}, {Abc}_{L+1},..., {Abc}_{R}). You can do each query in O(log(N)) where N is the size of Abc with initial O(N * log(N)) preprocessing with sparse tables.

Conditions For Sparse Table

Conditions of sparse table

A sparse table can be used if and only if the following conditions are met:

  1. Abc is immutable (i.e., queries have no effect on it).
  2. F is an associative function: F (a, b, c) = F (F (a, b), c) = F (a, F (b, c)).

Processing of Sparse Table

Let’s return to the preface’s discussion of buddies and movies. Having such buddies would be extremely beneficial. They’d retell you everything you needed to know, and you’d just have to put it all together to grasp the “whole picture.” Consider the table, where Table[i][j] = F ({Arr}_{i}, {Arr}_{j+1},..., {Arr}_{i+2^j-1}). Every cell in the array can be thought of as a “buddy” who has computed function F over a small piece of the array. The size of each part is a power of two. We’ll get into why it matters later, but for now, let’s look at a more or less specific context. Assume you’re given the task of computing F ({Arr}_{i}, {Arr}_{i+1},..., {Arr}_{i+12}. Then you can go ahead and do the following:

  1. Inquire about F ({Arr}_{i}, {Arr}_{i+1},..., {Arr}_{i+7}) with a buddy at Table [i][3]
  2. Ask a buddy at Table[i + 8][2] about F({Arr}_{i+8}, {Arr}_{i+9},..., {Arr}_{i+11}); call this value {x}_{1}
  3. Ask a buddy at Table[i + 12][0] about F({Arr}_{i+12}); call this value {x}_{2}
  4. Calculate F ({x}_{1}, {x}_{2}, {x}_{3}) to get the answer; call this number {x}_{3}

It’s worth noting that computing F over a sub-array of size 13 will take only four actions.   However, in this case, you knew which friends to ask so that the total number of questions was kept to a minimum. Every non-negative integer number has a binary representation, as you may already be aware (in numeral system base 2). Consider the following scenario:

{13}_{10} = {1101}_{2}
{15}_{10} = {1111}_{2}
{68}_{10} = {1000100}_{2}
{198}_{10} = {11000110}_{2}

The following pseudocode can be used to accomplish this:

For i=k..0:

If N>=2^i:

N-=2^i

Print(`add 2^i’)

The technique is to find the largest i such that 2i ≤N and subtract it from N. It works because we add precisely those i where the ith bit is set, when we represent N as a k+1 bit binary number. Consider the numbers N = 113 and k = 7. Then N is 01110001 when expressed as a k+1=8 bit binary number. From i= 7 to 0 we iterate:

Processing example

You will see that in order for the method to work, we don’t need to write down the binary form of N. It’s only meant to be used as an example. For given L and R, we can compute F({Arr}_{L}, {Arr}_{L+1},..., {Arr}_{R}) using the above logic. We ask a few of our friends the following questions:

Processing algo

  • F(ZERO, x) = F(x, ZERO) = x for every x is a neutral element for calculating function F.
  • R – L + 1<{2}^{k+1} is assumed in the above code (we can choose such k, given bounds on N).
  • The logic behind the condition “L’ + {2}^{i} – 1R” is as follows: we want to know if we should ask a friend at Table[L][i] for assistance. He accounts for {2}^{i} element subarrays: {Arr}_{L}, {Arr}_{L+1},..., {Arr}_{L+2^i-1}. If the sub array is in [L, R], we contact the previously indicated friend.

Notice that on each iteration of the for loop, we have answer = F({Arr}_{L}, {Arr}_{L+1},...,{Arr}_{L'-1}) (with the exception that if L’ = L, answer is ZERO). This ensures that the procedure is correct. It’s also worth noting that we only add {2}^{i} at each step if R – L + 1 (the size of the subarray from L to R) has the i-th bit set in binary format. That is, we are subtracting R – L + 1 from L’, which is initially equal to L. This means that when the loop is finished, L’ = L + (R – L + 1) = R + 1. As a result, the final answer is F({Arr}_{L}, {Arr}_{L+1},..., {Arr}_{R}). Because we choose a minimal k such that {2}^{k+1}>N, this technique for answering queries using Sparse Table works in O(k), which is O(log(N)).

Intuition of Sparse Table

Intution of Sparse Table

A sum of decreasing powers of two can uniquely represent any non-negative number. This is merely a variation of a number’s binary representation. For example, 13 = {(1101)}_{2}=8+4+1. There can only be a maximum of {log}_{2}x summands for a number x. Any interval can be uniquely represented as a union of intervals with lengths that are decreasing powers of two using the same logic. For example, [2,14]=[2,9]ᴜ[10,13]ᴜ[14,14], where the total interval is 13 and the individual intervals are 8, 4, and 1 respectively. In this case, too, the union contains at most {log}_{2}^{(length of interval)} intervals. Sparse Table’s fundamental notion is to precompute all responses for range queries of power two length. After that, a separate range inquiry can be addressed by dividing the range into power of two lengths, looking up the precomputed answers, and combining them to get a complete answer.

Precalculation

The responses to the previously generated queries will be stored in a 2-dimensional array. The answer for the range [i,i+{2}^{j}˗1] of length {2}^{j} will be stored in st[i][j]. The two dimensional array will be MAXN×(K+1), where MAXN is the maximum array length allowed. Because {2}^{log_2MAXN} is the largest power of two ranges that we must provide, K must meet K≥[{log}_{2}MAXN]. K=25 is a suitable number for arrays with a reasonable length ({10}^{7} items).

Precalculation

We can produce the table effectively using dynamic programming because the range [i,i+{2}^{j}˗1] of length {2}^{j}  breaks well into the ranges [i,i+{2}^{j-1} ˗1] and [i+{2}^{j+1},i+{2}^{j}˗1], both of length {2}^{j}˗1.

for (int i = 0; i < N; i++)

st[i][0] = f(array[i]);

for (int j = 1; j <= K; j++)

for (int i = 0; i + (1 << j) <= N; i++)

st[i][j] = f(st[i][j-1], st[i + (1 << (j – 1))][j – 1]);

The type of query will determine the function f. It will compute the sum for range sum queries and the minimum for range minimum queries. The precalculation has time complexity O(NlogN).

Range Sum Queries

We want to find the sum of all values in a range for this type of query. As a result, f(x,y)=x+y is the natural definition of the function f. The data structure can be built using:

long long st[MAXN][K + 1];

for (int i = 0; i < N; i++)

st[i][0] = array[i];

for (int j = 1; j <= K; j++)

for (int i = 0; i + (1 << j) <= N; i++)

st[i][j] = st[i][j-1] + st[i + (1 << (j – 1))][j – 1];

We iterate over all powers of two, starting with the largest, to answer the sum query for the range [L,R]. We process the first portion of range [L,L+{2}^{j}˗1] and proceed with the remaining range [L+{2}^{j},R] as soon as a power of two {2}^{j} is smaller or equal to the length of the range (=R˗L+1).

long long sum = 0;

for (int j = K; j >= 0; j–) {

if [1]1 << j) <= R – L + 1) {

sum += st[L][j];

L += 1 << j;

}

}

A Range Sum Query has time complexity O(K)=O(logMAXN).

Range Minimum Query


Range MQ

This is where the Sparse Table really shines. It makes no difference whether we process a number in the range once or twice when determining the minimum of a range. As a result, rather than separating a range into several ranges, we can divide it into merely two overlapping ranges with power of two length. For example, we can divide the range [1,6] into [1,4] and [3,6]. The range minimum of [1,6] is clearly the same as the minimum of both the range minimums of [1,4] and [3,6]. As a result, we may find the minimum of the range [L,R] by using the formula:

Range MQ2

This necessitates the ability to compute log2(RL+1) quickly. You can do this by computing all logarithms ahead of time:

int log[MAXN+1];

log[1] = 0;

for (int i = 2; i <= MAXN; i++)

log[i] = log[i/2] + 1;

The Sparse Table structure must then be precomputed. This time, we’ll use the formula f(x,y)=min(x,y) to compute f.

int st[MAXN][K + 1];

for (int i = 0; i < N; i++)

st[i][0] = array[i];

for (int j = 1; j <= K; j++)

for (int i = 0; i + (1 << j) <= N; i++)

st[i][j] = min(st[i][j-1], st[i + (1 << (j – 1][j – 1]);

And you may find the minimum of a range [L,R] by using:

RMQ3

Range Minimum Query’s time complexity is O. (1).

 

References[+]

Add Comment