Indice

Algorithm Engineering A.A. 2010-2011

Thanks to P. Sanders Teacher: Paolo Ferragina

CFU: 6 (first semester).

Course ID: 283AA.

Language: English.

Degree: All master degrees in Computer Science, and the one in CS&Networking in collaboration with the Scuola Normale Superiore “S. Anna”.

Lectures schedule: Monday, 9-11 (Room C1); Thursday, 14-16 (Room C).

Question time: Monday or Thursday at 11-12.30, Room 295 (Ferragina's office), Dept of Computer Science.

News: about this course will be distributed via a Tweeter-channel with hashtag #ae2010

Official lectures schedule: The schedule and content of the lectures is available with the official registro.

News

Goals

In this course we will study, design and analyze (theoretically and experimentally) advanced algorithms and data structures for the efficient solution of combinatorial problems involving all basic data types, such as integers, strings, (geometric) points, trees and graphs. These algorithmic tools will be designed and analyzed in several models of computation— such as RAM, 2-level memory, cache-oblivious, streaming— in order to take into account the architectural features and the memory hierarchy of modern PCs.

Every lecture will follow a problem-driven approach that starts from a real software-design problem, abstracts it in a combinatorial way (suitable for an algorithmic investigation), and then introduces algorithmic solutions aimed at minimizing the use of some computational resources like time, space, communication, I/O, energy, etc. Some of these solutions will be discussed at an experimental level, in order to introduce proper engineering and tuning tools for algorithmic development.

Background

If you wish to refresh your mind on Algorithms and Data Structures, I suggest you to follow the Video Lectures by Erik Demaine and Charles Leiserson, specifically Lectures 1-5, 7 and 10. There it is missing the part on basic graph problems (representation, DFS, BFS, topological sort) which you may browse in any book, such as Introduction to Algorithms by Cormen-Leiserson-Rivest-Stein, third edition.

Exam

Dates Room
01/02/2011 Text
28/02/2011 Text
09/06/2011 Text
24/06/2011 Text
20/07/2011 Text
01/09/2011 Text
28/09/2011 Text

Books, Notes, etc.

I'll project no slides for teaching, and use just the old-fashioned blackboard.

Since the course covers many different algorithmic themes and will deal with advanced results and issues which are not yet part of books, I'll distribute notes and copies of papers/books which will constitute the reading material of the course.

This is a preliminary list of books from which I've taken some parts (details in the lectures):

List of Lectures

Date Lecture Biblio
23/09/10 Refreshing: Analysis of algorithms, (asymptotic) time/space complexity, Big-Oh notation, Divide-and-Conquer, Binary search, Recurrences, and few other examples. Lectures 1, 2 and 3 of Demaine-Leiserson's course at MIT
24/09/10 Refreshing: Mergesort and Quicksort. Heap: properties and operations. Lectures 4 of Demaine-Leiserson's course at MIT
27/09/10 Refreshing: Hashing: hash functions and their properties. Hashing with chaining. Binary Search Trees: recursive visit, pre/in/post visits. Red-Black Trees. Lectures 7 and 10 of Demaine-Leiserson's course at MIT
30/09/10 Refreshing: Graphs: how to represent it in memory. DFS and BFS visits, properties and applications. part of Lecture 17 of Demaine-Leiserson's course at MIT. Look at the book CLR.
05/10/10 Definition of “Algorithm Engineering”. Models of computation: 2-level memory and cache-oblivious models. An example: summing n numbers.
06/10/10 Effect of caching on the performance of external-memory algorithms. Maximum sub-array Sum. Notes
07/10/10 Random sampling in the disk model and in the streaming model. Notes
14/10/10 Permuting and Sorting in the RAM model and the 2-level memory model: binary mergesort, its analysis and its limitations. Some optimizations: the SnowPlow technique.
18/10/10 Multi-way mergesort. The I/O lower-bound for sorting and permuting. Sorting on D disks: disk striping and its UN-optimality.
25/10/10 Quicksort: pivot selection, average-case analysis. Randomised selection of pivot in linear time. Notes
04/11/10 String sorting: limits of qsort and comparison-based algorithms. MSD-first Radixsort, LSD-first Radixsort, Multi-key Quicksort. Notes
07/11/10 Few exercises.
11/11/10 Prefix search: Array of string pointers, front coding, locality preserving front-coding.
15/11/10 Interpolation search. preliminary notes
22/11/10 Tries, Patricia Trie, String B-tree.
02/12/2010 Text indexing: Suffix tree and array. Text mining with the suffix array. Suffix array's construction. notes and slides
06/12/2010 Hashing: chaining, universal, perfect. notes (no open addressing)
09/12/2010 d-left hashing, cuckoo hashing, minimal ordered perfect hashing. note1, note2, slides
20/12/2010 The Bloom Filter and its spectral variant. slides
10/01/2011 Data compression: notion of entropy, Huffman coding (and its optimality), Canonical Huffman codes. MG (up to page 41)
13/01/2011 Data compression: LZ77, LZ78 and gzip. MG pages 74-81
17/01/2011 Data compression: Burrows-Wheeler Transform, MTF, RLE and bzip. Skip lists: description, properties, proof for heigth and search time. slides and notes