# Monthly Archives: March 2013

## Dynamic Auctions and Wireless Links in Data Centers (Xia)

She wants to tackle the issues of traffic dynamics in large scale networks by injecting flexibility into design.

• design flexible spectrum auction systems to match dynamic traffic, since US auctions off large chunks of the spectrum covering the entire country.
• add flexible wireless links to address traffic hotspots in data centers. (Directional 60GHz links between the different racks that is configurable--but any racks in the way must repeat the signal, since 60GHz is line of sight).
• they use reflections off the ceiling to avoid these problems (3D Beamforming)

Dynamic spectrum auction

• most allocated spectrum sits idle.
• using an auction
• truthful auction as e dominant method to remove gain from cheating. Ex: Vicky auction (second price auction)
• she integrates allocation with pricing: greedy allocation based on bids, winner X pays the bid of its critical neighbor.

How is this different than how radio stations split their frequencies based on location?

Note the use of a Kautz $(d,k)$ graph overlay of nodes in the data centers, which allows them to handle arbitrary network size and incremental growth. They show that 92% of paths can run concurrently.

Filed under Big Data, Data Centers

## Minimum sum-of-squares clustering

See the PDF, page 4, for the problem description. Find $C$ cluster centers $z_j$ to minimize the sum of distances from each point to its cluster center.

MSSC is similar to the problem we are trying to solve in the $kn$ to $k$ groups of $n$ matching problem.  Given $kn$ points, we would like to find the $n$ cluster centers that minimize the distance of any point $p_i$ to its nearest center $z_j$ (the centroid of that cluster).   For fixed $n$ and dimensionality $d$ , it has been shown MSSC is solvable in $O((N)^{nd+1})$ , for $N$ input points.  For general $d$ and input parameter $n$ (as in our case), the problem is shown to be NP-hard.

MSSC, however, differs from our problem statement, since we require the size of any cluster to be size $k$ such that the cluster center $z_i$ provides a good representation for $k$ points.  The points in each cluster may be divided up to form $n$ groups, where each member comes from a unique cluster, thereby removing bias between the groups.  Also, our initial problem statement proposes a greedy method, such that the smallest clusters are found first, whereas MSSC minimizes the sizes of all clusters.

Filed under Algorithms

## Big Data talk

Talk by Faculty candidate on Friday.

Big data computation model

• $n$ = number of vectors in $R^d$ seen so far
• $d$ = number of sensors (dimensionality)
• only have $\log n$ memory available on the system
• $M$ = number of cores

$k$ -median queries

• input: $P \in R^d$
• key idea: replace many points by a weighted average point $c \in C$
• $\sum_{p \in P} dist(p,Q) \approx \sum_{c \in C} w(c) * dist(c,Q)$ , where $C$ is the approximated core set.

Online algorithm to build core set: read in $1/\varepsilon$ points, compute $k$ -means and best-fit lines, then choose some of the furthest outliers, then compress that data. Read next $1/\varepsilon$ points, repeat, then compute the core set of those two sets to get a new core set. Then move on to next points until you read all $n$ and have a final core set. Then run off-line heuristic on this core set of size $k$ .

Does a $1+\varepsilon$ approximation.

References to consider:

1. Feldman, Landberg, STOC'11
• Coreset size: $kd/\varepsilon^2$
3. Har-Peled, Mazumdar, 04
4. Feldman, Wu, Julian, Sung and Rus. GPS Compression

Filed under Big Data, Core Sets, Talks

## Dynamic Voronoi Diagrams

Computational Geometry - An Introduction Through Randomized Algorithms; Mulmuley; 1994; pp 135-

Proposition 4.2.2: The total cost of maintaining a Voronoi diagram in a given update sequence is proportional to the total structural change during that sequence. This ignores a logarithmic factor in the cost of a deletion. It also ignores the cost of locating a conflicting vertex during addition. The expected (total) structural change during a random $(N,\delta)$ -sequence is $O(n)$ .