Complementi di piattaforme abilitanti distribuite
Teacher: Nicola Tonellotto
Office hours: please contact the teacher
Day | Time | Room |
---|---|---|
Tuesday | 11:30-13:30 | Room 10B |
Wednesday | 16:00-18:00 | Room 10B |
Friday | 9:30-11:30 | Room 10B |
Teaching rooms:
Room 10B, S.Anna/CNIT building in CNR Research Area, ground floor.
For this first year, the course is co-organized with "Strumenti di programmazione per sistemi paralleli e distribuiti", taught by Dr. Massimo Coppola.
Syllabus
24/02: Grid Computing (I) Slides Student Notes
- Large-scale problems in research and production environments
- How to approach these problems
- Preliminary definitions: resources, protocols, services, APIs and SDKs
- A simple example: web services and their protocols
25/02: Grid Computing (II) Slides Student Notes
- Virtual Organizations
- The Grid vision and its requirements
- The Grid architecture: fabric, connectivity, resource, collective and application layers
- Using the Grid: scenarios and examples
- Open Grid Service Architecture and its capabilities
- The eight fallacies of Grid computing
09/03: Grid Computing (III) Slides Student Notes
- The Globus Project
- Public Key Infrastructure (concepts)
- Grid Security Infrastructure
- Certificates
- Single Sign On and Delegation
10/03: Grid Computing (IV) Slides Student Notes
- Grid Information Services
- Lightweight Directory Access Protocol (concepts)
- Monitoring and Discovery Service
- IP, GRIS and GIIS
- Grid Information Models: MDS-2 and GLUE schemata
19/03: Grid Computing (V) Slides Student Notes
- HPC Resource Management
- Grid Resource Management
- Gatekeeper and Job Manager
- Data Management
- GASS and GridFTP
- Replica Catalog and Replica Management Services
23/03: MapReduce: the programming model Slides Student Notes
- Problem characterization
- Map and Fold in LISP
- Programming model: mappers and reducers
- Programming model: partitioners and combiners
- Example and data flow
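The roles of mappers, combiners, partitioners and reducers listed above can be illustrated with a small in-process word count. This is only a sketch of the data flow, not Hadoop code: the function names (`mapper`, `combiner`, `partitioner`, `reducer`, `run_job`) and the sample documents are hypothetical, and the whole "cluster" is simulated inside one Python process.

```python
from collections import defaultdict

def mapper(doc_id, text):
    """Emit (word, 1) for every word in one document."""
    for word in text.lower().split():
        yield word, 1

def combiner(word, counts):
    """Pre-aggregate the counts produced by a single mapper."""
    yield word, sum(counts)

def partitioner(word, num_reducers):
    """Choose the reducer for a key (simple deterministic hash, an assumption)."""
    return sum(word.encode("utf-8")) % num_reducers

def reducer(word, counts):
    """Sum all partial counts received for one word."""
    yield word, sum(counts)

def group_by_key(pairs):
    """Group values by key, as the shuffle phase would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def run_job(documents, num_reducers=2):
    partitions = [[] for _ in range(num_reducers)]
    # Map phase: one mapper per document, followed by its local combiner.
    for doc_id, text in documents.items():
        for word, counts in group_by_key(mapper(doc_id, text)).items():
            for key, value in combiner(word, counts):
                partitions[partitioner(key, num_reducers)].append((key, value))
    # Reduce phase: each partition is grouped by key and reduced on its own.
    result = {}
    for part in partitions:
        for word, counts in sorted(group_by_key(part).items()):
            for key, value in reducer(word, counts):
                result[key] = value
    return result

docs = {"d1": "the grid the cloud", "d2": "the grid"}
counts = run_job(docs)
```

In this sketch the combiner and the reducer do the same summation, which works because addition is associative and commutative; that is not true of every reduce function.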
24/03: Distributed File Systems: GFS and HDFS Slides Student Notes
- Problem characterization
- Blocks, Name nodes and Data Nodes
- Master/server architecture
- Master Server (namenode), Chunk Servers (datanode) protocols and responsibilities
- Anatomy of a read
- Anatomy of a write
- Benchmarks
- Hadoop installation and setup
- Single mode and pseudodistributed mode configuration
- Grep application
- Word Count application (old Hadoop APIs)
- Word Count application (new Hadoop APIs)
- API usage
- Using a large number of files
14/04: Lab Problem Solution (1/4)
- Computing tf-idf with MapReduce
- Word frequency in document
04/05: Lab Solution (2/4)
- Computing tf-idf with MapReduce
- Word count in document
05/05: Lab Solution (3/4) Solution (4/4)
- Computing tf-idf with MapReduce
- Word frequency in collection
- Calculate TF-IDF
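The four lab steps above can be sketched as a chain of small functions, each standing in for one MapReduce job. This is a single-process illustration, not the lab solution: the function names are hypothetical, and the weighting tf-idf = (n / N) * log(D / m) is one common definition assumed here, where n is the count of a word in a document, N the number of words in that document, m the number of documents containing the word, and D the number of documents.

```python
import math
from collections import defaultdict

def word_frequency_in_doc(documents):
    """Step 1: (word, doc) -> n, occurrences of word in doc."""
    n = defaultdict(int)
    for doc, text in documents.items():
        for word in text.lower().split():
            n[(word, doc)] += 1
    return dict(n)

def word_count_in_doc(n):
    """Step 2: (word, doc) -> (n, N), with N the total words in doc."""
    totals = defaultdict(int)
    for (word, doc), count in n.items():
        totals[doc] += count
    return {(w, d): (c, totals[d]) for (w, d), c in n.items()}

def word_frequency_in_collection(n_N):
    """Step 3: (word, doc) -> (n, N, m), with m the documents containing word."""
    m = defaultdict(int)
    for word, _doc in n_N:
        m[word] += 1
    return {(w, d): (c, N, m[w]) for (w, d), (c, N) in n_N.items()}

def calculate_tf_idf(n_N_m, num_docs):
    """Step 4: (word, doc) -> (n / N) * log(D / m)  (assumed weighting)."""
    return {key: (n / N) * math.log(num_docs / m)
            for key, (n, N, m) in n_N_m.items()}

docs = {"d1": "grid grid cloud", "d2": "cloud"}
step1 = word_frequency_in_doc(docs)
step2 = word_count_in_doc(step1)
step3 = word_frequency_in_collection(step2)
scores = calculate_tf_idf(step3, num_docs=len(docs))
```

A word that appears in every document, such as "cloud" here, gets a score of zero under this weighting, while "grid" scores higher because it is frequent in one document and absent from the other.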
07/05: Autonomic Computing Slides Student Notes
- Self management
- Self properties
- Feedback control of computing systems
11/05: Scheduling Slides Student Notes
- Single processor scheduling: SJF, FCFS, RR, MLQ
- Real time scheduling: RM, EDF
- Cluster Scheduling: FCFS, Backfilling
12/05: Scheduling Slides Student Notes
- Grid Resource Management
- Bag-of-tasks heuristics: Min-Min, Max-Min, Sufferage
- Workflow heuristics: List, Multilevel, Clustering scheduling. HEFT
- Economic Scheduling
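As an illustration of the bag-of-tasks heuristics named above, the following sketch implements Min-Min and Max-Min over a matrix of expected execution times. The function name `schedule`, the `pick_largest` flag, the tie-breaking rule and the sample matrix are assumptions made for this example; Sufferage is not covered.

```python
def schedule(etc, pick_largest=False):
    """Min-Min (default) or Max-Min scheduling of independent tasks.

    etc[t][m] is the expected time to compute task t on machine m.
    Returns (assignment, makespan), where assignment maps task -> machine.
    """
    num_machines = len(etc[0])
    ready = [0.0] * num_machines          # time at which each machine is free
    unscheduled = set(range(len(etc)))
    assignment = {}
    chooser = max if pick_largest else min
    while unscheduled:
        # For each task, find its minimum completion time and the machine giving it.
        best = {}
        for t in unscheduled:
            m = min(range(num_machines), key=lambda j: ready[j] + etc[t][j])
            best[t] = (ready[m] + etc[t][m], m)
        # Min-Min takes the task with the smallest such time, Max-Min the largest.
        # Ties are broken by task index (an arbitrary choice for determinism).
        task = chooser(best, key=lambda t: (best[t][0], t))
        finish, machine = best[task]
        assignment[task] = machine
        ready[machine] = finish
        unscheduled.remove(task)
    return assignment, max(ready)

# Three short tasks and one long task on two identical machines.
etc = [[2, 2], [2, 2], [2, 2], [6, 6]]
_, min_min_makespan = schedule(etc)
_, max_min_makespan = schedule(etc, pick_largest=True)
```

With this input Min-Min places the long task last and ends with a makespan of 8, while Max-Min places it first and reaches 6, which shows why Max-Min can do better when a few long tasks are mixed with many short ones.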