Instructors:
Notice: a list of the papers to read is available at this link: http://bit.ly/bda_papers. Send an email to Luca Pappalardo by Thursday, October 26th, listing your three or four preferred papers. We will then assign you one of the papers, taking your preferences into account.
ABOUT THE EXAM: The official registration of exam grades for students who have finalized their project will take place on January 18th, 2019, in room L1. A few questions about the project report may be asked before the grade is registered.
Instructions for project proposal (October 26th):
Instructions for paper presentation (November 16th and 23rd):
Instructions for project progress report and presentation (November 26th):
Instructions for final report and presentation (December 10th and 14th):
In our digital society, every human activity is mediated by information technologies and therefore leaves digital traces behind. These massive traces are stored in public or private repositories: phone call records, movement trajectories, soccer logs and social media records are all examples of “Big Data”, a novel and powerful “social microscope” for understanding the complexity of our societies. The analysis of big data sources is a complex task that requires knowledge of several technological and methodological tools. This course has three objectives:
In this module, analytical methods and processes are presented through exemplary case studies in challenging domains, organized according to the following topics:
This module will provide students with the technologies to collect, manipulate and process big data. In particular, the following tools will be presented:
During the course, teams of students will be guided in the development of a big data analytics project. The projects will be based on real-world datasets covering several thematic areas. Discussions and presentations in class will take place at different stages of the project.
17/09 (Mod. 1) Introduction to the course, The Big Data scenario mod1.introduction_bigdatalandscape_newquestions_.pdf
21/09 (Mod. 1) Big Data Analytics: new questions to be solved + Presentation of datasets
24/09 (Mod. 2) Python for Data Science: The Jupyter Notebook: developing open-source and reproducible data science
28/09 (Mod. 1) Soccer data landscape and players’ injury prediction
01/10 (Mod. 2) Scikit-learn: programming tools for data mining and analysis.
05/10 (Mod. 1) Analysis and evolution of sports performance
08/10 (Mod. 1) The mobility data landscape
12/10 (Mod. 1) Suspended
15/10 (Mod. 1) Mobility data mining methods (Patterns&Models)
19/10 (Mod. 1) Understanding Human Mobility with GPS - Case Studies
22/10 (Mod. 1) Urban Dynamics with mobile phone data
26/10 (Mod. 3) Data Understanding and Project Formulation
05/11 (Mod. 2) GeoPandas: analyse geo-spatial data with Python
09/11 (Mod. 1) Predicting well-being from human mobility patterns
12/11 (Mod. 2) MongoDB: fast querying and aggregation in NoSQL databases
16/11 (Mod. 3) Paper presentations by students
19/11 (Mod. 1) Nowcasting influenza with retail market data
23/11 (Mod. 3) Paper presentations by students
26/11 (Mod. 3) Mid Term Project Results
30/11 No lessons
03/12 (Mod. 1) The social media data landscape and social media mining methods
07/12 (Mod. 1) Sentiment analysis and Opinion Mining (Andrea Esuli)
10/12 (Mod. 3) Discussion on Ethical issues in Big Data Analytics and Final Project results
14/12 (Mod. 3) Final Project results
18/01 EXAM: 09:00 @ room L1
08/02 EXAM: 09:00 @ room L1
The two mid-terms count for 40% of the final grade; the remaining 60% comes from the evaluation of the Project and the Discussion (prepare some slides to present your project). Students whose mid-term results are not sufficient may take a final test on the technologies.
The following table describes the expected content of a project:
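As a rough illustration of the 40%/60% weighting, the final grade could be computed as in the sketch below. The function name `final_grade`, the equal weighting of the two mid-terms, and the use of one common scale for all scores are assumptions made for this example; the course page only states the 40%/60% split.

```python
def final_grade(midterm_1: float, midterm_2: float, project: float) -> float:
    """Combine scores with 40% for the mid-terms and 60% for the project.

    Assumptions (not stated on the course page): the two mid-terms
    count equally, and all scores are on the same scale.
    """
    midterm_average = (midterm_1 + midterm_2) / 2  # assumed equal weights
    return 0.4 * midterm_average + 0.6 * project


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(final_grade(midterm_1=26, midterm_2=28, project=30))
```

With these weights, the project and its discussion have more influence on the result than both mid-terms together.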