RANK(S, x) computes the rank of x in S, i.e., |{s ∈ S : s ≤ x}|. SELECT(S, k) computes the k-th smallest element in S.
Implementation: Balanced binary search tree with additional information stored at each node v: size(v), the number of internal nodes in the subtree rooted at v. It satisfies
size(v) = size(lc(v)) + size(rc(v)) + 1,
where a leaf has size 0.
Here and later on, lc(v) and rc(v) denote the left and right child of a node v, respectively.
SELECT is implemented using SELECTNODE(v, k), which returns the node with the k-th smallest key in the subtree rooted at v.
SELECTNODE(v, k)
    r ← size(lc(v)) + 1
    if r = k
        then return v
    else if r > k
        then return SELECTNODE(lc(v), k)
    else return SELECTNODE(rc(v), k − r)
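The procedure above can be sketched in Python as follows. This is a minimal illustration, not the slides' own code: the `Node` class, the `size` helper and the `build` helper are hypothetical names, empty subtrees are represented by `None` (size 0), and the tree is built statically from a sorted list instead of being rebalanced on updates.

```python
class Node:
    """BST node augmented with the size of its subtree (assumed layout)."""
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        # size(v) = size(lc(v)) + size(rc(v)) + 1
        self.size = 1 + size(left) + size(right)

def size(v):
    """Subtree size; an empty subtree (None) has size 0."""
    return v.size if v is not None else 0

def select_node(v, k):
    """Return the node with the k-th smallest key (1-based) below v, or None."""
    while v is not None:
        r = size(v.left) + 1      # rank of v within its own subtree
        if r == k:
            return v
        if r > k:
            v = v.left            # the answer lies in the left subtree
        else:
            v = v.right           # skip the r keys in the left subtree and v
            k -= r
    return None                   # k is out of range

def build(keys):
    """Hypothetical helper: balanced tree from a sorted list of keys."""
    if not keys:
        return None
    mid = len(keys) // 2
    return Node(keys[mid], build(keys[:mid]), build(keys[mid + 1:]))
```

The recursion of the pseudocode is written as a loop here; each iteration descends one level, so the running time is proportional to the height of the tree.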
NODERANK(T, v) computes the rank of a node v in an inorder traversal of tree T. In order to compute the rank of key x, search for the node w containing x and call NODERANK(T, w).
NODERANK(T, v)
    r ← size(lc(v)) + 1
    w ← v
    while w ≠ root(T)
        do if w = rc(parent(w))
            then r ← r + size(lc(parent(w))) + 1
        w ← parent(w)
    return r
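A possible Python rendering of the rank computation is given below. It is only a sketch under several assumptions: the `Node` class with a `parent` pointer, the `insert` helper and the `rank` wrapper are hypothetical, keys are assumed to be distinct, and `insert` is a plain unbalanced insertion that merely keeps the `size` fields up to date (a balanced tree would additionally repair `size` during rotations).

```python
class Node:
    """BST node with parent pointer and subtree size (assumed layout)."""
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None
        self.size = 1

def size(v):
    return v.size if v is not None else 0

def insert(root, key):
    """Unbalanced insert that maintains size and parent; returns the root."""
    node = Node(key)
    if root is None:
        return node
    v = root
    while True:
        v.size += 1                       # the new key ends up below v
        child = "left" if key < v.key else "right"
        if getattr(v, child) is None:
            setattr(v, child, node)
            node.parent = v
            return root
        v = getattr(v, child)

def node_rank(v):
    """Position of node v in the inorder traversal of its tree (1-based)."""
    r = size(v.left) + 1
    w = v
    while w.parent is not None:           # w is not the root
        if w is w.parent.right:
            # parent(w) and its whole left subtree precede w
            r += size(w.parent.left) + 1
        w = w.parent
    return r

def rank(root, x):
    """Search for the node containing x, then call node_rank; None if absent."""
    w = root
    while w is not None and w.key != x:
        w = w.left if x < w.key else w.right
    return node_rank(w) if w is not None else None
```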
Theorem: Operations SEARCH, INSERT, DELETE, SELECT, and RANK in an order-statistics tree storing n elements take time O(log n) each.
Interval Overlap
Given a set of closed intervals I = {[x1, x1'], [x2, x2'], ..., [xn, xn']}, check whether a closed query interval [q, q'] overlaps any of the intervals in I.
[Figure: all cases of how the query interval can lie relative to an interval in I]
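For reference, the overlap test itself can be written as a single comparison: two closed intervals overlap exactly when each one starts no later than the other ends. The sketch below assumes intervals are `(lo, hi)` tuples with `lo <= hi`; the function names are hypothetical, and `overlaps_any` is the naive linear scan that the data structure in this section is meant to replace.

```python
def overlaps(a, b):
    """True if the closed intervals a = (lo, hi) and b = (lo, hi) share a point."""
    return a[0] <= b[1] and b[0] <= a[1]

def overlaps_any(intervals, query):
    """Naive O(n) check whether query overlaps any interval in the collection."""
    return any(overlaps(iv, query) for iv in intervals)
```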
Sweep a horizontal line from top to bottom. Maintain the intersection intervals of the rectangles and the sweep line in a data structure appropriate for interval overlap queries. Check for overlap at the upper sides of the rectangles.
Sweep line reaches the upper side of a rectangle whose intersection interval with the sweep line is I: query OVERLAP(I); if no stored interval overlaps I, INSERT(I).
Sweep line reaches the lower side of that rectangle: DELETE(I).
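The sweep can be sketched as follows. This is an illustration under stated assumptions: rectangles are closed and given as hypothetical `(x_lo, x_hi, y_lo, y_hi)` tuples, and the set of active intervals is a plain dictionary scanned linearly. That naive stand-in makes a single query cost O(n); plugging in the balanced tree described in this section is what would bring each operation down to O(log n).

```python
def pairwise_disjoint(rects):
    """Check whether closed axis-parallel rectangles are pairwise disjoint.

    rects: list of (x_lo, x_hi, y_lo, y_hi) tuples (assumed layout).
    """
    UPPER, LOWER = 0, 1
    events = []
    for i, (x_lo, x_hi, y_lo, y_hi) in enumerate(rects):
        events.append((-y_hi, UPPER, i))
        events.append((-y_lo, LOWER, i))
    # Sorting by -y sweeps from top to bottom. At equal y, upper sides come
    # first, so rectangles that merely touch are reported as intersecting.
    events.sort()

    active = {}  # rectangle index -> x-interval currently cut by the sweep line
    for _, kind, i in events:
        x_lo, x_hi = rects[i][0], rects[i][1]
        if kind == UPPER:
            # OVERLAP query (naive linear scan), then INSERT
            if any(lo <= x_hi and x_lo <= hi for lo, hi in active.values()):
                return False
            active[i] = (x_lo, x_hi)
        else:
            del active[i]  # DELETE
    return True
```

Checking only at upper sides suffices: if two rectangles intersect, the upper side of the one that starts lower is reached while the other is still active.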
Implementation: Balanced binary search tree for the intervals ordered according to left endpoints x_i.
Additional information stored at each node v: maxright(v) is the largest right endpoint of the intervals stored in the subtree rooted at v.
FINDOVERLAPPINGINTERVAL(T, [q, q'])
1  v ← root(T)
2  while (v is not a leaf and [q, q'] does not overlap interval(v))
3      do if (lc(v) exists and maxright(lc(v)) ≥ q)
4          then v ← lc(v)
5          else v ← rc(v)
Lemma:
1. If line 5 is executed, then v's left subtree contains no interval that overlaps [q, q'].
2. If line 4 is executed, then v's left subtree contains an interval that overlaps [q, q'], or no interval in v's right subtree overlaps [q, q'].
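The search and the two cases of the lemma can be illustrated with a short Python sketch. The names are hypothetical, intervals are assumed to be `(lo, hi)` tuples, missing children are represented by `None`, and the tree is built once from a list sorted by left endpoint instead of being maintained under insertions and deletions.

```python
class Node:
    """Interval tree node keyed by left endpoint (assumed layout)."""
    def __init__(self, interval, left=None, right=None):
        self.interval = interval
        self.left = left
        self.right = right
        # largest right endpoint stored in the subtree rooted here
        self.maxright = max([interval[1]] +
                            [c.maxright for c in (left, right) if c is not None])

def build(intervals):
    """Hypothetical helper: balanced tree from intervals sorted by left endpoint."""
    if not intervals:
        return None
    mid = len(intervals) // 2
    return Node(intervals[mid], build(intervals[:mid]), build(intervals[mid + 1:]))

def find_overlapping_interval(root, query):
    """Return some stored interval overlapping the closed query, or None."""
    q_lo, q_hi = query
    v = root
    while v is not None:
        lo, hi = v.interval
        if lo <= q_hi and q_lo <= hi:
            return v.interval
        if v.left is not None and v.left.maxright >= q_lo:
            v = v.left     # case 2 of the lemma: going left loses nothing
        else:
            v = v.right    # case 1: the left subtree cannot contain an overlap
    return None
```

Only one root-to-leaf path is followed, so the query time is proportional to the height of the tree.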
Theorem: Let I be a set of n intervals. There is a data structure for I of size O(n), such that the operations INSERT, DELETE, and OVERLAP each take time O(log n).
Theorem: Checking whether n axis-parallel rectangles in the plane are pairwise disjoint can be done in O(n log n) time.
Let v be a node in a search tree with leaf-oriented storage. The canonical subset of v is the set of items stored in the leaves of the subtree rooted at v. In a number of data structures, the canonical subset of each node is stored in a secondary data structure associated with that node.
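As a small illustration of the definition, the sketch below builds a leaf-oriented tree over a non-empty sorted list and attaches to every node its canonical subset. The names are hypothetical, and a sorted Python list stands in for the secondary data structure, whose actual type depends on the application.

```python
class Node:
    """Node of a leaf-oriented search tree (assumed layout)."""
    def __init__(self, canonical, left=None, right=None):
        self.left = left
        self.right = right
        # stand-in for the secondary structure: the canonical subset itself
        self.canonical = canonical

def build(sorted_items):
    """Build the tree over a non-empty sorted list; items live in the leaves."""
    if len(sorted_items) == 1:
        return Node(list(sorted_items))
    mid = len(sorted_items) // 2
    left = build(sorted_items[:mid])
    right = build(sorted_items[mid:])
    # the canonical subset of an internal node is the union of its children's
    return Node(left.canonical + right.canonical, left, right)
```

In this balanced construction every item appears in the canonical subset of each of its O(log n) ancestors, so storing all subsets explicitly takes O(n log n) space in total.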