About the Courses
Hashing Algorithms (Michael Mitzenmacher)
Hashing algorithms and data structures are one of the basic building blocks in
computer science, with both beautiful and challenging mathematics behind the
theory, and many significant and important applications. In this short course,
we cover some highlights from the past several decades of hashing, including
Bloom filters, multiple-choice hash tables, cuckoo hashing, min-wise
independence, and sketching algorithms, as well as some more recent work,
including peeling algorithms, tabular hashing, and invertible Bloom lookup
tables. While emphasizing the theoretical foundations, we will consider both
applications and related algorithmic engineering issues.
Algorithms for Modern Parallel Systems (Sergei Vassilvitskii)
MapReduce, Hadoop, Pregel, Giraph: these modern parallel systems are at the
heart of large-scale data analysis. However, developing efficient
algorithms for them requires juggling an additional set of constraints. The
ultimate goal is an approach that ensures that all of the available
machines are working together to solve the problem, yet keeps costly
communication and synchronization to a minimum.
We will begin by surveying the classical results in parallel algorithms,
and then explore the differences between the different models of
computation. We will then turn to techniques for modern parallel systems,
including simulations, coresets, adaptive sampling, and others. To
conclude, we will touch on lower bounds and explore the limits of parallel
computation. Throughout we will maintain a close connection to
applications, and the real world engineering trade-offs not captured in the
models.
|