Presented 5/26/12 by
Alekh Dwivedi(11210
Introduction
The Closest Pair problem consists of finding a pair of points p and q from a set of n points such that p and q are at a minimum distance from each other. The brute force solution to this problem takes O(n^2) comparisons to check the distance between each possible pair of points.
Closest pair
Input: a set of points in the plane. Output: the closest pair of points. Each point in the plane has (x, y) coordinates. We can compute the distances among all of these points and find the smallest among them. The distance between two points is the Euclidean distance:
Distance{(x1, y1), (x2, y2)} = square root of ((x1 - x2)^2 + (y1 - y2)^2)
Naive approach
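Check all pairs of points p and q with Θ(n^2) comparisons. Viewed recursively: solve the problem for n-1 of the points, then compute the n-1 distances from the remaining point to the others. The recurrence T(n) = T(n-1) + (n-1) leads to an O(n^2) solution. The combine step here takes n-1 extra time; if we can combine much faster than that, we are in good shape.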
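The naive approach can be sketched in a few lines of Python. This is a minimal illustration, not code from the presentation: the function name `brute_force_closest_pair` and the representation of points as `(x, y)` tuples are assumptions made for the example.

```python
import math
from itertools import combinations

def brute_force_closest_pair(points):
    """Return (distance, p, q) for the closest pair among `points`.

    `points` is assumed to be a sequence of (x, y) tuples.
    Every pair is examined once, so the running time is quadratic.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    best = (math.inf, None, None)
    for p, q in combinations(points, 2):
        d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
        if d < best[0]:
            best = (d, p, q)
    return best
```

For example, `brute_force_closest_pair([(0, 0), (3, 4), (1, 1), (10, 10)])` reports the pair `(0, 0)` and `(1, 1)`. A quadratic routine like this is also handy as a reference when testing the faster algorithms.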
1-Dimension problem
Introduction of 2-D
The brute force approach to the closest pair problem (i.e. checking every possible pair of points) takes quadratic time. We would now like to introduce a faster divide-and-conquer algorithm for solving the closest pair problem. Given a set of points in the plane S, our approach will be to split the set into two roughly equal halves (S1 and S2) for which we already have the solutions, and then to merge the halves in linear time to yield an O(n log n) algorithm. However, the actual solution is far from obvious. It is possible that the desired pair might have one point in S1 and one in S2; does this not force us once again to check all possible pairs of points? The divide-and-conquer approach presented here generalizes directly from the one-dimensional algorithm we presented in the previous section.
Up to now, we are completely in step with the 1-D case. At this point, however, the extra dimension causes some problems. We wish to determine if some point in, say, P1 is less than d away from another point in P2. However, in the plane, we don't have the luxury that we had on the line when we observed that only one point in each set can be within d of the median. In fact, in two dimensions, all of the points could be in the strip! This is disastrous, because we would have to compare n^2 pairs of points to merge the set, and hence our divide-and-conquer algorithm would save us nothing in terms of efficiency.
[Figure: the point set split into the two halves S1 and S2]
continue
However, since we sorted the points in the strip by their y-coordinates, the process of merging our two subsets is not linear, but in fact takes O(n log n) time.
Hence our full algorithm is not yet O(n log n), but it is still an improvement on the quadratic performance of the brute force approach (as we shall see in the next section). We will then demonstrate how to make this algorithm even more efficient by strengthening our recursive sub-solution.
Divide: split P into two subsets PL and PR according to x-coordinate, with PL <= l <= PR for a vertical line l. O(n)
Conquer: recurse on each half to get δL and δR. 2T(n/2)
Combine:
Select the closer pair of the above: δ = min{δL, δR}. O(1)
Look for smaller pairs with one point in PL and the other in PR.
Create an array Y' of the points within the 2δ-wide vertical strip, sorted by y-coordinate. O(n lg n) if sorted here, or O(n) if Y is already sorted.
Compare each point in Y' with the 7 points that follow it. O(7n)
Sorting the points by x-coordinate simplifies the division. Sorting by y-coordinate simplifies computing the distances between cross points. Sorting inside the recursive function will not work, since it costs O(n lg n) in every call.
Instead, pre-sort all the points; the division into halves will then keep the points sorted.
CLOSEST_PAIR(P, X, Y)
P: set of points, X: sorted by x-coordinate, Y: sorted by y-coordinate
Divide P into PL and PR, X into XL and XR, Y into YL and YR.
δL = CLOSEST_PAIR(PL, XL, YL) // T(n/2)
δR = CLOSEST_PAIR(PR, XR, YR) // T(n/2)
Combine:
Compute the minimum distance between the points pl in PL and pr in PR. // O(n)
Form Y', which is the points of Y within the 2δ-wide vertical strip. For each point p in Y', check the 7 following points for a smaller distance.
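The recursive procedure above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the presenter's code: points are `(x, y)` tuples, the names `closest_pair` and `_closest` are hypothetical, `delta` stands for the smaller of the two recursive results, and duplicate points are handled up front (distance 0) so that the x-sorted and y-sorted lists can be split consistently.

```python
import math

def closest_pair(points):
    """Return the smallest distance between any two (x, y) tuples in `points`."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    if len(set(points)) < len(points):
        return 0.0  # duplicates: the closest pair is a repeated point
    xs = sorted(points)                               # X: sorted by x (ties by y)
    ys = sorted(points, key=lambda p: (p[1], p[0]))   # Y: sorted by y
    return _closest(xs, ys)

def _closest(xs, ys):
    n = len(xs)
    if n <= 3:  # small case: brute force
        return min(math.dist(xs[i], xs[j])
                   for i in range(n) for j in range(i + 1, n))
    mid = n // 2
    pivot = xs[mid]
    xl, xr = xs[:mid], xs[mid:]
    # Split Y in linear time while preserving its y-order.
    yl = [p for p in ys if p < pivot]
    yr = [p for p in ys if p >= pivot]
    delta = min(_closest(xl, yl), _closest(xr, yr))
    # Y': points within delta of the dividing line, still sorted by y.
    strip = [p for p in ys if abs(p[0] - pivot[0]) < delta]
    for i, p in enumerate(strip):
        for q in strip[i + 1:i + 8]:  # only the 7 following points
            delta = min(delta, math.dist(p, q))
    return delta
```

Because both sorted lists are built once and only filtered afterwards, each call does linear work outside its two recursive calls.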
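In summary
Sorting inside every recursive call: T(n) = 2T(n/2) + O(n lg n), which gives O(n lg^2 n).
With pre-sorting: T'(n) = O(n lg n) + T(n), where
T(n) = 2T(n/2) + O(n) = O(n lg n). So the total is O(n lg n) + O(n lg n) = O(n lg n).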
Improvements: Compare 5 points instead of 7? Avoid pre-sorting Y? A different distance definition? Three dimensions?
Introduction
Once again we are given a set S of n points in the plane, but this time we shall attempt to use the plane sweep technique. We sweep a vertical line across the set from left to right, keeping track of the closest pair seen so far. We shall describe an O(n log n) algorithm.
The algorithmic technique we shall use is the plane sweep method. This means that we will be sweeping a vertical line across the set of points, keeping track of certain data, and performing certain actions every time a point is encountered during the plane sweep. As we sweep the line, we will maintain the following data:
[Figure: Plane sweep technique for closest pair. Sweep line shown in red.]
Contd
Every time the sweep line encounters a point p, we will perform the following actions:
Remove the points further than d to the left of p from the ordered set D that is storing the points in the strip.
Determine the point on the left of p that is closest to it.
Contd-
The running time is made up of:
Sorting the set according to x-coordinates (this is necessary for a plane sweep approach to determine the coordinates of the next point), which takes O(n log n) time.
Inserting and removing each point once from the ordered set D (each insertion takes O(log n) time), which takes O(n log n) time in total.
Comparing each point to a constant number of its neighbors (which takes O(n) time in total).
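A compact Python sketch of the sweep is shown below. Several choices here are assumptions made for illustration and are not taken from the presentation: the function name `sweep_closest_pair`, points as `(x, y)` tuples, and, importantly, the use of a plain sorted list managed with the standard `bisect` module to play the role of the ordered set D. Insertions and deletions in a Python list shift elements and so are not O(log n); a balanced search tree would be needed to actually reach the stated O(n log n) bound.

```python
import bisect
import math

def sweep_closest_pair(points):
    """Return (d, (p, q)): the smallest distance and a pair achieving it."""
    pts = sorted(points)          # sweep order: increasing x
    if len(pts) < 2:
        raise ValueError("need at least two points")
    d = math.inf                  # closest distance seen so far
    best = None
    active = []                   # D: strip points stored as (y, x), sorted by y
    left = 0                      # index in pts of the oldest point still in D
    for x, y in pts:
        # Remove points further than d to the left of the sweep line.
        while left < len(pts) and x - pts[left][0] > d:
            px, py = pts[left]
            active.pop(bisect.bisect_left(active, (py, px)))
            left += 1
        # Only points of D with y in [y - d, y + d] can be closer than d.
        lo = bisect.bisect_left(active, (y - d, -math.inf))
        hi = bisect.bisect_right(active, (y + d, math.inf))
        for qy, qx in active[lo:hi]:
            dist = math.dist((x, y), (qx, qy))
            if dist < d:
                d, best = dist, ((qx, qy), (x, y))
        bisect.insort(active, (y, x))  # the current point joins the strip
    return d, best
```

Storing the strip as `(y, x)` tuples lets the y-range of candidates be located by binary search, which mirrors the bounding box check described in the text.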
Proof of Correctness
Let {p1,...,pn} be the set of input points sorted by their x-coordinates. When the sweep line hits p2, then the pair (p1,p2) will be the current closest pair with distance d=dist(p1,p2). Furthermore, we know that if p1 is one of the points that makes up the closest pair for the whole set, then the other point must be p2, since no other points in the set are closer to p1.
Contd
There are two cases for the position of p.
[CASE 1]: p is in the strip. In this case, p would have been within the bounding box checked by the algorithm (since it is less than d away from q as per our assumption), and hence would not have been missed.
[CASE 2]: p is not in the strip. In this case p must be to the left of the strip (since we assumed that it is to the left of q), so it is more than d away from q, which contradicts our assumption.
Contd
Now, if we include the current point pi, the algorithm will check and see if there is a point to the left of the sweep line within d of it and update accordingly. Hence when the sweep completes processing at a given point pi, then d is the distance between the closest pair among the points p1,...,pi. Applying this result to the last point in the list shows that the algorithm is correct.
Divide the input S into S1 and S2 by the median hyperplane H normal to some axis.
Recursively compute d1 and d2 for S1 and S2.
Set d = min(d1, d2).
Let S' be the set of points that are within d of H, projected onto H.
Use the d-sparsity condition to recursively examine all pairs in S'; there are only O(n) pairs.
Application
Hierarchical clustering. Traveling salesman heuristics. Traffic Control. Dynamic minimum spanning trees.