See the PDF, page 4, for the problem description. Find \(\) cluster centers \(\) to minimize the sum of distances from each point to its cluster center.
MSSC is similar to the problem we are trying to solve in the \(\) to \(\) groups of \(\) matching problem. Given \(\) points, we would like to find the \(\) cluster centers that minimize the distance of any point \(\) to its nearest center \(\) (the centroid of that cluster). For fixed \(\) and dimensionality \(\) , it has been shown MSSC is solvable in \(\) , for \(\) input points. For general \(\) and input parameter \(\) (as in our case), the problem is shown to be NP-hard.
MSSC, however, differs from our problem statement, since we require the size of any cluster to be size \(\) such that the cluster center \(\) provides a good representation for \(\) points. The points in each cluster may be divided up to form \(\) groups, where each member comes from a unique cluster, thereby removing bias between the groups. Also, our initial problem statement proposes a greedy method, such that the smallest clusters are found first, whereas MSSC minimizes the sizes of all clusters.
Talk by Faculty candidate on Friday.
Big data computation model
- \(\) = number of vectors in \(\) seen so far
- \(\) = number of sensors (dimensionality)
- only have \(\) memory available on the system
- \(\) = number of cores
\(\) -median queries
- input: \(\)
- key idea: replace many points by a weighted average point \(\)
- \(\) , where \(\) is the approximated core set.
Online algorithm to build core set: read in \(\) points, compute \(\) -means and best-fit lines, then choose some of the furthest outliers, then compress that data. Read next \(\) points, repeat, then compute the core set of those two sets to get a new core set. Then move on to next points until you read all \(\) and have a final core set. Then run off-line heuristic on this core set of size \(\) .
Does a \(\) approximation.
References to consider:
- Feldman, Landberg, STOC'11
- Feldman, Sohler, Monemizadeh, SoCG'07
- Har-Peled, Mazumdar, 04
- Feldman, Wu, Julian, Sung and Rus. GPS Compression
Computational Geometry - An Introduction Through Randomized Algorithms; Mulmuley; 1994; pp 135-
Proposition 4.2.2: The total cost of maintaining a Voronoi diagram in a given update sequence is proportional to the total structural change during that sequence. This ignores a logarithmic factor in the cost of a deletion. It also ignores the cost of locating a conflicting vertex during addition. The expected (total) structural change during a random \(\) -sequence is \(\) .
Dynamic Algorithms in Computational Geometry; Li; date
This is a survey paper of dynamic algorithms available for a variety of geometric problems, including nearest neighbor. For Voronoi diagrams, Li describes Mumuley's algorithm for diagrams of order 1 to \(\) in \(\) . Expected time for \(\) in a random \(\) -sequence is \(\) . For \(\) .
In the 2-dimensional case, the query time is \(\) and update time is \(\) . Similarly for 3-dimensions, query time is \(\) and update time is \(\) .
Mumuley. Computational Geometry: an introduction through randomized algorithms. 1994, Prentice-Hall Inc.
Improved Updating of Euclidean Distance Maps and Voronoi Diagrams; Lau, Sprunk, Burgard; date
Primarily focuses on the Generalized Voronoi Diagram as used in navigation for robotics. Diagrams are represented on grid maps.
Interactive analysis using voronoi diagrams: Algorithms to support dynamic update from a generic triangle-based data structure; Lee, Gahegan; 2002
Describe methods for updating five types of Voronoi diagrams using a data structure that stores the Delaunay triangulation of a given pointset. For Ordinary Vornoi Diagrams (OVD), edge swapping is used to dynamically insert points in \(\) time. To remove points, Ing's 1993 reversion algorithm is used, with worst case \(\) time.
Dynamic Voronoi Diagrams; Gowda, Kirkpatrick, Lee, Naamad; 1983
First note that dynamically updating a Voronoi diagram by adding or removing one point can change as many as \(\) edges in the worst case, therefore no worst-case-sublinear algorithms exist. They present an algorithm that is \(\) for both insert and delete, using a V-tree (Voronoi tree) design.
Each leaf of the tree stores a point in the given pointset \(\) . Internal nodes \(\) store the Voronoi diagram \(\) for the subtree rooted at \(\) . This requires \(\) storage, where \(\) is the height of the tree, but may be reduced to \(\) and preserving linear update time.
Consider the \(\) -minimum spanning circle problem: Given a planar point set \(\) where \(\) , find \(\) circles that every point in \(\) is contained in at least one circle and the largest circle is minimized.