# Category Archives: Algorithms

## Minimum sum-of-squares clustering

See the PDF, page 4, for the problem description. Find $C$ cluster centers $z_j$ to minimize the sum of distances from each point to its cluster center.

MSSC is similar to the problem we are trying to solve in the $kn$ to $k$ groups of $n$ matching problem.  Given $kn$ points, we would like to find the $n$ cluster centers that minimize the distance of any point $p_i$ to its nearest center $z_j$ (the centroid of that cluster).   For fixed $n$ and dimensionality $d$ , it has been shown MSSC is solvable in $O((N)^{nd+1})$ , for $N$ input points.  For general $d$ and input parameter $n$ (as in our case), the problem is shown to be NP-hard.

MSSC, however, differs from our problem statement, since we require the size of any cluster to be size $k$ such that the cluster center $z_i$ provides a good representation for $k$ points.  The points in each cluster may be divided up to form $n$ groups, where each member comes from a unique cluster, thereby removing bias between the groups.  Also, our initial problem statement proposes a greedy method, such that the smallest clusters are found first, whereas MSSC minimizes the sizes of all clusters.

Filed under Algorithms

## Big Data talk

Talk by Faculty candidate on Friday.

Big data computation model

• $n$ = number of vectors in $R^d$ seen so far
• $d$ = number of sensors (dimensionality)
• only have $\log n$ memory available on the system
• $M$ = number of cores

$k$ -median queries

• input: $P \in R^d$
• key idea: replace many points by a weighted average point $c \in C$
• $\sum_{p \in P} dist(p,Q) \approx \sum_{c \in C} w(c) * dist(c,Q)$ , where $C$ is the approximated core set.

Online algorithm to build core set: read in $1/\varepsilon$ points, compute $k$ -means and best-fit lines, then choose some of the furthest outliers, then compress that data. Read next $1/\varepsilon$ points, repeat, then compute the core set of those two sets to get a new core set. Then move on to next points until you read all $n$ and have a final core set. Then run off-line heuristic on this core set of size $k$ .

Does a $1+\varepsilon$ approximation.

References to consider:

1. Feldman, Landberg, STOC'11
2. Feldman, Sohler, Monemizadeh, SoCG'07
• Coreset size: $kd/\varepsilon^2$
3. Har-Peled, Mazumdar, 04
4. Feldman, Wu, Julian, Sung and Rus. GPS Compression

Filed under Big Data, Core Sets, Talks

## Dynamic Voronoi Diagrams

Computational Geometry - An Introduction Through Randomized Algorithms; Mulmuley; 1994; pp 135-

Proposition 4.2.2: The total cost of maintaining a Voronoi diagram in a given update sequence is proportional to the total structural change during that sequence. This ignores a logarithmic factor in the cost of a deletion. It also ignores the cost of locating a conflicting vertex during addition. The expected (total) structural change during a random $(N,\delta)$ -sequence is $O(n)$ .

Filed under Algorithms, Voronoi Diagrams

## Dynamic Algorithms in Computational Geometry

Dynamic Algorithms in Computational Geometry; Li; date

This is a survey paper of dynamic algorithms available for a variety of geometric problems, including nearest neighbor.  For Voronoi diagrams, Li describes Mumuley's algorithm for diagrams of order 1 to $k$ in $R^d$ .  Expected time for $d=2$ in a random $(N,\delta)$ -sequence is $O(k^2 + \log n)$ .  For $d > 2, O(k^{\lceil d/2\rceil + 2} n^{\lceil d/2 \rceil -1})$ .

In the 2-dimensional case, the query time is $O(k \log^2 n)$ and update time is $O(k^2 + \log n)$ .  Similarly for 3-dimensions, query time is $O(k \log^2 n)$ and update time is $O(k^4 n \log^2 n)$ .

Mumuley. Computational Geometry: an introduction through randomized algorithms. 1994, Prentice-Hall Inc.

Filed under Algorithms, Voronoi Diagrams

## Generalized Voronoi Diagram Updating

Improved Updating of Euclidean Distance Maps and Voronoi Diagrams; Lau, Sprunk, Burgard; date

Primarily focuses on the Generalized Voronoi Diagram as used in navigation for robotics.  Diagrams are represented on grid maps.

Filed under Algorithms, Voronoi Diagrams

## Interactive Analysis Using Voronoi Diagrams

Interactive analysis using voronoi diagrams: Algorithms to support dynamic update from a generic triangle-based data structure; Lee, Gahegan; 2002

Describe methods for updating five types of Voronoi diagrams using a data structure that stores the Delaunay triangulation of a given pointset.  For Ordinary Vornoi Diagrams (OVD), edge swapping is used to dynamically insert points in $O(n)$ time.  To remove points, Ing's 1993 reversion algorithm is used, with worst case $O(n)$ time.

Filed under Algorithms, Voronoi Diagrams

## Dynamic Voronoi Diagrams

Dynamic Voronoi Diagrams; Gowda, Kirkpatrick, Lee, Naamad; 1983

First note that dynamically updating a Voronoi diagram by adding or removing one point can change as many as $\Theta(n)$ edges in the worst case, therefore no worst-case-sublinear algorithms exist.  They present an algorithm that is $O(n)$ for both insert and delete, using a V-tree (Voronoi tree) design.

Each leaf of the tree stores a point in the given pointset $S$ .  Internal nodes $x$ store the Voronoi diagram $VOD(sub(x))$ for the subtree rooted at $x$ .  This requires $O(nh)$ storage, where $h$ is the height of the tree, but may be reduced to $O(n \log \log n)$ and preserving linear update time.

Applications

Consider the $k$ -minimum spanning circle problem: Given a planar point set $S=\{(x_j, y_j) | 1 \le j \le n\}$ where $n > k$ , find $k$ circles that every point in $S$ is contained in at least one circle and the largest circle is minimized.