Old Editions of the course
Teaching rooms:
Room L1, Polo Fibonacci, first floor.
Room C1, Polo Fibonacci, first floor.
Room C40, ISTI-CNR, Door 19/20, first floor.
Lectures
22/09. Introduction to the course, Logistics. Large-scale problems: status and perspectives.
23/09. Data storage architectures: read-only (search engines), read-write (datastores). Grid computing: elements, definition, architecture, protocols and hourglass model. (slides)
29/09. Grid computing: security, resource management, information management, data management. Architectural models, real-world Grids, Globus Toolkit. (slides)
30/09. Cloud computing: definitions, properties, characteristics. Elasticity, dynamic provisioning and autonomic control. (slides)
03/10. Cloud computing: user benefits, provider benefits, economies of scale. Service models: IaaS, PaaS, SaaS. Deployment models. (slides)
07/10. Cloud computing: comparison with Grids. Cloud programming: fault tolerance, service-oriented architectures, decoupling, autoscaling. (slides)
14/10. Cloud computing: design patterns for Cloud applications. Fault tolerance, decoupling, elasticity implementation, data storage, security. Amazon Web Services overview. (slides)
17/10. Map Reduce: design principles, functional programming, basic programming model, exercises. (slides)
21/10. Map Reduce: Hadoop installation, configuration.
24/10. Map Reduce: exercises.
28/10. Map Reduce: combiners, partitioners, scheduling, fault tolerance. HDFS and I/O APIs. Exercises. (slides)
11/11. Map Reduce: exercises.
14/11. Map Reduce: algorithmic patterns. State management, matrices, database operations, graphs. (slides)
18/11 a. Map Reduce: exercises.
18/11 b. Map Reduce: exercises.
21/11. Data models, representation and storage: relational, document, graph models. OLTP vs OLAP. LSM Trees and B-Trees. (slides)
25/11. Data encoding: data flows, backward and forward compatibility, Thrift, Protocol Buffers and Avro encodings. (slides)
28/11. Data management: scalability, performance and fault tolerance of distributed systems. General replicated architecture. Consistency models: strict, linearizable and sequential consistency. (slides)
02/12. Data replication: passive replication, replication log, active partitioning, quorum systems, write conflicts. (slides)
05/12. Teaching suspended (prot. 57140 of 18/11/2016, Università di Pisa).
09/12. Data replication: client-centric consistency models, FLP and CAP theorems. (slides)
12/12. Time: physical clocks and logical clocks. Happens-before relation and systems of logical clocks. Scalar and vector logical clock systems. (slides)
16/12. Data partitioning: consistent hashing and virtual nodes. (slides)
16/12. Projects discussion.
Bibliography
Grids
Study Notes
- I. Foster, C. Kesselman, “The Grid 2: Blueprint for a New Computing Infrastructure”, Morgan Kaufmann Publishers Inc., 2003. Chapters 4 and 21.
- K. Hwang, G. C. Fox, J. Dongarra, “Distributed and Cloud Computing”, Morgan Kaufmann Publishers Inc., 2012. Chapter 7.
Reading Assignments
- IBM Redbooks, Introduction to Grid Computing, 2003. http://www.redbooks.ibm.com/redbooks/pdfs/sg246778.pdf
Clouds
Study Notes
Reading Assignments
Map Reduce
Study Notes
Reading Assignments
Virtualization