Strumenti Utente

Strumenti Sito


magistraleinformatica:ad:ad_19:start

Year 2019-2020

Announcements

This week the teaching activity is suspended, as requested by our Rector. Hence the classes of March 5 and 6 are canceled. Next class is Tue, March 10, regular time, but lectures will be given in streaming. More details follow.

You student, what can you do next for getting a lecture?

  1. Join the class on Google Classroom (use Android/iOS or connect to the Algorithm Design link).
  2. Click on the link for streaming on Google Meet for attending the classes. Please note that we keep our schedule for time, the only difference is that you have connect to the link instead of physically coming to the room.

Schedule

  • Class hours: Tue 14:00‑16:00 (Fib N1), Thu 11:00‑13:00 (Fib L1), Fri 11:00‑13:00 (Fib N1)
  • Office hours: Fri 13:00-16:00 or by appointment.

Overview

The advanced nature of this course focuses on developing algorithmic design skills, exposing the students to complex problems that cannot be directly handled by standard libraries (being aware that several basic algorithms and data structures are already covered by the libraries of modern programming languages), thus requiring a significant effort in problem solving. These problems involve all basic data types, such as integers, strings, (geometric) points, trees and graphs as a starting point. The syllabus is structured to highlight the applicative situations in which the corresponding algorithms can be successfully employed, making references to software applications and libraries. The level of detail in each argument can change year-by-year, and will be decided according to requests coming from other courses in the curriculum and/or specific issues arising in, possibly novel, applicative scenarios.

Exams

Written exam:

  1. choose one of the topics discussed in class
  2. write a very short to-do list and ask the instructor for approval
  3. if the instructor suggests some mods, modify the to-do list according to the instructor's comments and repeat step 2
  4. if the chosen topic and the to-do list are approved, expand the to-do list into a more detailed to-do list and repeat step 3
  5. make a written report in English and submit it to the instructor (recall to add at least 20% new content, when compared to what seen in class)
  6. meet the instructor to read together the report and get some comments on it
  7. make the necessary mods
  8. use Jupyter Lab to prepare your report (so as to mix Markdown, LaTeX and Python code). More info inside Google Classroom.

Credits XKCD https://xkcd.com/1987/

Suggested reading: some useful tips for scientific writing in English (first two sections) by J.S. Vitter.

Example of interaction: student and instructor discussing the report's content and structure.

Oral exam: topics discussed in class, please read the references in the notes.

Topics

Caveat: Several topics are the outcomes of recent advancements in the field, and thus the course material mostly consists in research papers or book chapters.

Randomization, hashing and data streaming

Randomization is a powerful tool to solve large-scale problems. After introducing the concept of randomized algorithms and hashing, we consider some applications, such as data streaming algorithms, a field emerged in the last decade. Here data flow as a stream and one-pass algorithms with limited memory can process it. We focus on the count-min sketch paradigm and its applications. [Note: to refresh the basic notions on counting and probability, please refer to Appendix C in Cormen-Leiserson-Rivest-Stein's book “Introduction to Algorithms”, 3rd ed., MIT Press. Concentration bounds are explained in these class notes.]

Date Topics References and notes
18.02.2020 Playing with probability. Random indicator variables: secretary problem and random permuting (suggested reading: birthday paradox). Randomized quick sort. [CLRS 5.1-5.3 (optional 5.4.1), par. 7.3] code
20.02.2020 Virus scan and stream analysis with Karp-Rabin fingerprints: randomized checking and pattern matching. Montecarlo and Las Vegas algorithms. [RM par.7.4-7.6] code
21.02.2020 Universal hashing. Markov's inequality. Perfect hashing. [CLRS 11.2, 11.3.3, CLRS 11.5 ] code
25.02.2020 Proxy caches and dictionaries with errors: Bloom filters. Survey: except par.2.5-2.6 (optional: par.2.2)
27.02.2020 Worst-case constant-time lookup: Cuckoo hashing. Notes Notes code
28.02.2020 Space-efficient implementation of Bloom filters using cuckoo hashing and succinct rank data structure. Notes (first part)
03.03.2020 Space-efficient storage of sets with approximate memberships: upper and lower bounds. Notes (second part)
10.03.2020 Distributed server and load balancing through hashing. blog Sect.7 and 8.1
12.03.2020 Distributed server and load balancing through hashing (continued). blog Sect.7 and 8.1
13.03.2020 Multiplicative universal hashing. Sect. 2.3
17.03.2020 Data streaming and sketching algorithms: approximate counters (part 1). Sect. 3-5
19.03.2020 Case study on hashing: rsync and file synchronization using hash functions (seminar by F.Geraci). classroom drive
20.03.2020 Sketching algorithms: approximate counters (part 2). Sect. 3-5
24.03.2020 Sketching algorithms: approximate counters (part 3). Sect. 3-5
26.03.2020 Flajolet-Martin sketches for counting distinct elements. notes
27.03.2020 Count-Min sketches for frequent elements. sects.1-3, 4.1 Site Notes code
31.03.2020 Integer counters and range queries with Count-Min Sketches: implementation and analysis. sects.3-4
02.04.2020 Data stream statistics - part 1 (seminar by F.Geraci) classroom drive
03.04.2020 Document resemblance with MinHash, k-sketches and the Jaccard similarity index. Azuma-Hoeffding bound. Triangle counting. paper paper Azuma-Hoeffding code
07.04.2020Unifying view of sketches: min-k, bottom-k, threshold-t. Jaccard example. classroom drive
09.04.2020Distance distribution in networks: approximation with random sampling and sketches classroom drive
16.04.2020 Data stream statistics - part 2 (seminar by F.Geraci). classroom drive
17.04.2020 Fine-grained algorithms. SETH conjecture and conditional lower bounds. Guaranteed heuristics. Case study: diameter in undirected unweighted graphs. notes sect. 2.3, 2.4, 3, 4
21.04.2020 Approximation in fine-grained algorithms and limitations. Case study: diameter in undirected unweighted graphs. notes
23.04.2020 Networked data and randomized min-cut algorithm for graphs. par.1.1
24.04.2020 NP-hard problems: download file manager and the knapsack problem. Reduction from Partition to Knapsack (restriction). Dynamic programming algorithms for Knapsack: Case 1: integer weights, complexity O(nW). Case 2: integer values, complexity O(n2vmax). Examples. PDF code
28.04.2019 NP-hard problems: heuristics based on dynamic programming; approximation algorithms. Case study: knapsack problem. chapt.2: par. 2.1.1 code
30.04.2019 Clique-based social network analysis (seminar by F.Geraci) classroom drive
05.05.2020 NP-hard problems: counting version (#P) based on dynamic programming, uniform random sampling of the feasible solutions. Case study: #knapsack problem. notes code
07.05.2020 NP-hard problems: fully polynomial-time randomized approximation schemes (FPRASs). Case study: #knapsack problem. notes code
12.05.2020 General inapproximability results. Case study: travel salesman problem (TSP). 2-approximation algorithms for metric TSP, Local search. Greedy. Case study: max cut for graphs. Non-existence of PTAS. [CLRS 35.2] Notes
14.05.2020 Randomized approximation and derandomization: universal hash functions; conditional expectations. Case study: max-cut for graphs. sect. 3-4 sect. 1.1
15.05.2020 Fixed-parameter tractable (FPT) algorithms. Kernelization. Bounded search tree. Case study: min-vertex cover in graphs. sect. 2.2.1, 3.1
19.05.2020 Randomized FPT algorithms: color coding and randomized separation. Case study: longest path in graphs and subgraph isomorphism. sect. 5.2, 5.3
Activity in class
  • The screen snapshots shown during the classes are available in the Google Classroom shared drive.
Official documents for the course
Spot yourself in the classroom

magistraleinformatica/ad/ad_19/start.txt · Ultima modifica: 07/07/2020 alle 07:59 (4 anni fa) da Roberto Grossi

Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki