Prof. Roberto Grossi
We will study, design and analyze advanced algorithms and data structures for the efficient solution of combinatorial problems involving all basic data types, such as integers, strings, (geometric) points, trees and graphs. This course deepens and extends the algorithmic notions of students.
The syllabus is structured to highlight the applicative scenarios in which the studied algorithms and data structures can be successfully applied. The level of detail with which each argument will be dealt with can change year-by-year, and will be decided according to requests coming from other courses and/or specific issues arising in, possibly novel, applicative scenarios.
The advanced nature of this course focuses on developing algorithmic design skills. The students are exposed to complex problems that require a significant effort in problem solving. One “brainstorming” lecture per week is devoted to a full immersion in this activity, and it is highly recommend to attend it. The purpose is not that of teaching/learning further N algorithms, but to refine students' skills. The final written exam will be based on the topics discussed during the “brainstorming” lectures.
We like students that make mistakes in a constructive way, since this is the starting point of our brainstorming to drive their reasoning paths, thus learning by mistakes. We are interested in the route that leads to the algorithmic solution, rather than the solution itself. Regarding mistakes and intelligence, Alan Turing wrote in 1947 that “… if a machine is expected to be infallible, it cannot also be intelligent”. This is a motto for this class, with a kind invitation to express your own ideas even if they could be apparently wrong.
Please note that several topics are the outcome of recent advancement in algorithms and data structures, and thus most of the course material consists in research papers or book chapters.
Date | Topics | References and notes |
---|---|---|
Sept. 23 | Introduction to the course. Problem solving: ideas and examples. | - |
Massive data sets are one of the key elements in our so-called big data society. In this scenario the design and analysis of algorithms and data structures adopt the external and cache-oblivious models, which take into account the memory hierarchy that provides performance to modern computers. We study basic problems on sorting and searching in these models, comparing the results with the classical RAM model of computation.
Date | Topics | References and notes |
---|---|---|
Sept. 24 | External memory model (EMM). Case study for sequential access versus random access. Description of the model. CPU complexity and I/O complexity. K-way mergesort. | Introduction, chapt.2 and 3 (set D=1 disks) notes |
Sept. 25 | Cost of scanning, sorting and permuting. Scan-and-sort paradigm. Lower bounds for sorting in the external memory model. | Introduction, chapt.2 and 3 (set D=1 disks), chapt.5 (set D=1 disks, no randomization, till Sect.5.1.1 included)NotesNotes |
Sept. 30 | Problem solving. External memory K-way merge and mergesort. External memory permuting. | list of problems |
Oct. 1 | Scan-and-sort paradigm: map-reduce. Lower and upper bounds for static searching in external-memory. | paper slides slides sect.11.1 |
Oct. 2 | Selection of splitters for the distribution sort. | Notes wikipedia |
Oct. 7 | Problem solving. Implicit static search in external memory. External memory distribution sort. | list of problems |
Oct. 8 | Dynamic searching in external memory: (a,b)-trees and B-trees. Definition, search and insert operations. | sect.1 and 2 |
Oct. 9 | Dynamic searching in external memory: (a,b)-trees and B-trees. Deletion, construction, and I/O analysis. | sect.2 |
Oct. 14 | Problem solving. Number of split/fuse operations for $(a,b)$-trees. 1-D range queries. | list of problems |
Oct. 15 | Suffix sorting and suffix array. Relation with standard sorting. DC-3 algorithm for RAM model. | DC3 algorithm (up to Section3) |
Oct. 16 | Suffix sorting and suffix array. DC-3 algorithm for RAM model (continued). | DC3 algorithm (up to Section3) |
Oct. 21 | Problem solving. Filling the details of the DC3 suffix sorting. Suffix array searching. | list of problems |
Oct. 22 | Cache-oblivious (CO) model. van Emde Boas (vEB) layout of complete binary (search) trees. | Sect.4.1 Notes |
Oct. 23 | Text indexing and searching: suffix trees. | wikipedia |
Oct. 28 | Problem solving. Computing the LCP and suffix array construction. Navigating in the implicit vEB layout. | list of problems |
Randomization is now a common powerful tool to solve large-scale problems. After introducing the concept of randomized algorithms and hashing, we apply them to solve data streaming problems, a field emerged in the last decade. Here data flow as a stream and one-pass algorithms with limited memory can process it. We focus on the count-min sketch paradigm and its applications.
Date | Topics | References and notes |
---|---|---|
Oct. 29 | Randomness and Kolmogorov complexity. Randomized quicksort. | Notes (first 2 pages) par. 7.3 |
Oct. 30 | Las Vegas and Montecarlo algorithms. Analysis of randomised quick sort. Randomized algorithm for min-cut. | par.5.1 par. 7.3 |
Nov. 11 | Problem solving. | list of problems |
Nov. 12 | Data Streaming Model. Motivations and examples. | sect.1 Site Notes |
Nov. 13 | Count-Min Sketches and applications - part 1 | sect.2,3 Site |
Nov. 18 | Problem solving. | list of problems |
Nov. 19 | Count-Min Sketches and applications - part 2 | sect.2,3 Site |
Nov. 20 | Count-Min Sketches and applications - part 3 | sect.2,3 Site |
Nov. 25 | Problem solving. | list of problems |
NP-hard problems are important but difficult to solve and no deterministic polynomial algorithms are currently known for them. Even basic problems, such as finding a simple path of vertices in a graph, are NP-hard. We discuss how to attack these problems (a) by counting and listing their solutions where the cost is proportional to the output (it can be exponential in the worst case but we only pay for what we get) and (b) by approximating their solutions with polynomial-time algorithms when the problem requires to minimize a cost or maximize a benefit. Most of the addressed problems are on graphs, which are popular representations for modern networked data.
Date | Topics | References and notes |
---|---|---|
Nov. 26 | Complexity classes: P, NP, NPC. Polynomial reductions (prof. Pagli) | Turing Machines pp.255-260, 264-276 |
Nov. 27 | Example of reductions: SAT, 3-SAT, NE-3-SAT. | Notes pp.282-283 |
Dec. 2 | Problem solving | list of problems |
Dec. 3 | NP-completeness of MAX-CUT. | Notes |
Dec. 4 | Coloring problems. Reduction from 3-coloring of planar maps to 3-SAT. | Notes |
Dec. 4 | Problem solving | list of problems |
Dec. 9 | Notion of r-approximation algorithm. TSP: NP-hardness and difficulty of approximation. 2-approximation for metric TSP. | Except Sect.35.1 |
Dec. 9 | Problem solving (4 hours) | list of problems |
Dec. 10 | 2-approximation for MAX-CUT: local search, greedy. 2-approximation for minimum vertex cover. | Notes Sect.35.1 |
Dec. 11 | Approximation for bin packing and knapsack. | chapt.2: par. 2.1.1, par. 2.2.2 |
Date | Topics | References and notes |
---|---|---|
Nov. 17 | Review (part I) of topics from Chapt. 1, Chapt. 2, Sect. 3.1, Sect. 4.5. | Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms. MIT Press. |
Nov. 26 | Review (part II) of topics from Chapt. 1, Chapt. 2, Sect. 3.1, Sect. 4.5. | Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms. MIT Press. |
Dec. 3 | Review of topics from Chapt. 6, Sect. 7.1, 7.2, Sect. 8.1, Chapt.9. | Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms. MIT Press. |
Dec. 10 | Review of topics from Chapt. 10, Chapt. 11 (except Sect.11.5), Chapt. 12 (execpt Sect. 12.4), Sect. 15.3 and 15.4. | Cormen, Leiserson, Rivest, Stein. Introduction to Algorithms. MIT Press. |
This is the final list of problems. Some of the screen snapshots shown in the classroom.