Old Editions of the course

Teaching rooms:
Room L1, Polo Fibonacci, first floor.
Room C1, Polo Fibonacci, first floor.
Room C40, ISTI-CNR, Door 19/20, first floor.

Lectures

22/09. Introduction to the course, logistics. Large-scale problems: status and perspectives.
23/09. Data storage architectures: read-only (search engines), read-write (datastores). Grid computing: elements, definition, architecture, protocols and hourglass model. (slides)
29/09. Grid computing: security, resource management, information management, data management. Architectural models, real-world Grids, Globus Toolkit. (slides)
30/09. Cloud computing: definitions, properties, characteristics. Elasticity, dynamic provisioning and autonomic control. (slides)
03/10. Cloud computing: user benefits, provider benefits, economies of scale. Service models: IaaS, PaaS, SaaS. Deployment models. (slides)
07/10. Cloud computing: comparison with Grids. Cloud programming: fault tolerance, service-oriented architectures, decoupling, autoscaling. (slides)
14/10. Cloud computing: design patterns for Cloud applications. Fault tolerance, decoupling, elasticity implementation, data storage, security. Amazon Web Services overview. (slides)
17/10. Map Reduce: design principles, functional programming, basic programming model, exercises. (slides)
21/10. Map Reduce: Hadoop installation, configuration.
24/10. Map Reduce: exercises.
28/10. Map Reduce: combiners, partitioners, scheduling, fault tolerance. HDFS and I/O APIs. Exercises. (slides)
11/11. Map Reduce: exercises.
14/11. Map Reduce: algorithmic patterns. State management, matrices, database operations, graphs. (slides)
18/11 a. Map Reduce: exercises.
18/11 b. Map Reduce: exercises.
21/11. Data models, representation and storage: relational, document, graph models. OLTP vs OLAP. LSM Trees and B-Trees. (slides)
25/11. Data encoding: data flows, backward and forward compatibility, Thrift, Protocol Buffers and Avro encodings. (slides)
28/11. Data management: scalability, performance and fault tolerance of distributed systems. General replicated architecture. Consistency models: strict, linearizable and sequential consistency. (slides)
02/12. Data replication: passive replication, replication log, active partitioning, quorum systems, write conflicts. (slides)
05/12. Teaching suspended (prot. 57140 of 18/11/2016, Università di Pisa).
09/12. Data replication: client-centric consistency models, FLP and CAP theorems. (slides)
12/12. Time: physical clocks and logical clocks. Happens-before relation and systems of logical clocks. Scalar and vector logical clock systems. (slides)
16/12. Data partitioning: consistent hashing and virtual nodes. (slides)
16/12. Projects discussion.

Bibliography

Grids

Study Notes

  1. I. Foster, C. Kesselman, “The Grid 2: Blueprint for a New Computing Infrastructure”, Morgan Kaufmann Publishers Inc., 2003. Chapters 4 and 21.
  2. K. Hwang, G. C. Fox, J. Dongarra, “Distributed and Cloud Computing”, Morgan Kaufmann Publishers Inc., 2012. Chapter 7.

Reading Assignments

  1. IBM Redbooks, Introduction to Grid Computing, 2003. http://www.redbooks.ibm.com/redbooks/pdfs/sg246778.pdf

Clouds

Study Notes

Reading Assignments

Map Reduce

Study Notes

Reading Assignments

Virtualization