bigdataanalytics:bda:start
Differences
These are the differences between the selected revision and the current version of the page.
bigdataanalytics:bda:start [11/11/2021 at 16:37 (3 years ago)] – [Calendar] Luca Pappalardo | bigdataanalytics:bda:start [04/11/2022 at 12:21 (2 years ago)] (current version) – Salvatore Ruggieri | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | < | + | ====== Big Data Analytics A.A. 2022/23 ====== |
- | ga(' | + | This year, the course 599AA Big Data Analytics |
- | </ | + | ====== Previous Big Data Analytics |
- | gtag(' | + | [[bigdataanalytics: |
- | ====== Big Data Analytics A.A. 2021/22 ====== | + | |
- | + | ||
- | All lectures will also be provided remotely, through the Teams team named "599AA 21/22 - BIG DATA ANALYTICS [WDS-LM]" | + |
- | + | ||
- | Instructors: | + | |
- | * **Luca Pappalardo** | + | |
- | * **Fosca Giannotti** | + | |
- | * KDD Laboratory, ISTI-CNR, Università di Pisa, and Scuola Normale Superiore, Pisa | + | |
- | * [[http:// | + | |
- | * [[[email protected]]] | + | |
- | * [[[email protected]]] | + | |
- | + | ||
- | Tutor: | + | |
- | * **Giuliano Cornacchia** | + | |
- | * [[[email protected]]] | + | |
- | + | ||
- | Timetable | + | |
- | * Wednesday 09:00 - 10:45 Aula Fib M1 | + | |
- | * Friday 09:00 - 10:45 Aula Fib C1 | + | |
- | + | ||
- | **__Dataset assignment__**: | + | |
- | + | ||
- | + | ||
- | **Instructions for MidTerm 1**: The first mid-term presentation (Data Understanding and Project Proposal) will be on **October 20th** (half of the teams) and **October 22nd** (rest of the teams). | + | |
- | * **presentation**: | + | |
- | * **code**: provide the link to the Jovian notebook containing the code you used for all computations and plots. __Document your notebooks adequately using Markdown.__ The notebook must run without errors on Google Colab, so include cells with instructions for installing any additional libraries and a description of the format the datasets must have for the code to run correctly. | + |
- | * upload the material by **__Tuesday, | + | |
- | + | ||
- | **Instructions for MidTerm 2**: The second mid-term presentation (model(s) implementation and evaluation) will be on **November 17th** (half of the teams) and **November 19th** (rest of the teams). | + | |
- | * **presentation**: | + | |
- | * **code**: provide the link to the Jovian notebook containing the code you used for all computations and plots. __Document your notebooks adequately using Markdown.__ The notebook must run without errors on Google Colab, so include cells with instructions for installing any additional libraries and a description of the format the datasets must have for the code to run correctly. Also upload the link to the notebook from the previous mid-term, with the suggested modifications. | + |
- | * upload the material by **__November 16th__**. | + | |
- | + | ||
- | + | ||
- | **Instructions for MidTerm 3**: The third mid-term presentation (model interpretation and explanation) will be on December 15th and December 17th. | + |
- | + | ||
- | **Paper presentation**: | + | |
- | * Papers assignment: https:// | + | |
- | * Link to papers: https:// | + | |
- | * Each student will present a paper on Big Data Analytics in a talk of **at most** 7 minutes. During the presentation (with slides), you should highlight the following aspects: the dataset used, the feature engineering and/or selection (if any), the problem addressed, the models/ | + |
- | ====== Learning goals ====== | + | |
- | + | ||
- | In our digital society, every human activity is mediated by information technologies, | + | |
- | This course has three objectives: | + | |
- | + | ||
- | * introducing the emerging field of big data analytics and social mining; | + |
- | * introducing the technological landscape of big data, such as programming tools to analyze big data, query NoSQL databases, and perform predictive modeling; | + |
- | * guiding students in the development of an open-source and reproducible big data analytics project, based on the analysis of real-world datasets. | + |
- | + | ||
- | ====== Module 1: Big Data Analytics and Social Mining ====== | + | |
- | In this module, analytical methods and processes are presented through exemplary case studies in challenging domains, organized according to the following topics: | + |
- | + | ||
- | * The Big Data Scenario and the new questions to be answered | + | |
- | * Sports Analytics: | + | |
- | - Soccer data landscape and injury prediction | + | |
- | - Analysis and evolution of sports performance | + | |
- | * Mobility Analytics | + | |
- | - Mobility data landscape and mobility data mining methods | + | |
- | - Understanding Human Mobility with vehicular sensors (GPS) | + | |
- | - Mobility Analytics: Novel Demography with mobile-phone data | + | |
- | * Social Media Mining | + | |
- | - The social media data landscape: Facebook, LinkedIn, Twitter, Last.fm | + |
- | - Sentiment analysis: an example from human migration studies | + |
- | - Discussion on ethical issues of Big Data Analytics | + | |
- | * Well-being& | + | |
- | - Nowcasting influenza with retail market data | + | |
- | - Predicting well-being from human mobility patterns | + | |
- | * Paper presentations by students | + | |
- | + | ||
- | + | ||
- | ====== Module 2: Big Data Analytics Technologies ====== | + | |
- | This module provides students with the technologies to collect, manipulate, and process big data. In particular, the following tools will be presented: | + |
- | + | ||
- | * Python for Data Science | + | |
- | * The Jupyter Notebook: developing open-source and reproducible data science | + | |
- | * MongoDB: fast querying and aggregation in NoSQL databases | + | |
- | * GeoPandas: analyze geo-spatial data with Python | + | |
- | * Scikit-learn: | + | |
- | * Keras: deep learning in Python | + | |
- | + | ||
- | + | ||
- | ====== Module 3: Laboratory for Interactive Project Development | + | |
- | During the course, teams of students will be guided in the development of a big data analytics project. The projects will be based on real-world datasets covering several thematic areas. Teams will discuss and present their work in class at different stages of project execution. | + |
- | + | ||
- | * 1st Mid Term: Data Understanding and Project Formulation | + | |
- | * 2nd Mid Term: Model(s) construction and evaluation | + | |
- | * 3rd Mid Term: Model interpretation/ | + | |
- | * Exam: Final Project results | + | |
- | + | ||
- | ====== Calendar ====== | + | |
- | + | ||
- | 15/09 (Mod. 1) Introduction to the course, The Big Data scenario {{ :bigdataanalytics: | + | |
- | + | ||
- | 17/09 (Mod. 2) Python for Data Science and the Jupyter Notebook: developing open-source and reproducible data science | + | |
- | * How to install Jupyter notebook: https:// | + | |
- | * Python notebooks: https:// | + | |
- | * datasets: {{ : | + | |
- | + | ||
- | 22/09 (Mod. 2) Data Exploration and Understanding practice in Python | + | |
- | * Python notebooks: https:// | + | |
- | * datasets: {{ : | + | |
- | + | ||
- | 24/09 (Mod. 3) Presentation of datasets for the project {{ : | + | |
- | + | ||
- | 29/09 (Mod. 2) Scikit-learn: | + | |
- | + | ||
- | 01/10 (Mod. 2) Scikit-learn: | + | |
- | + | ||
- | 06/10 (Mod. 2) GeoPandas and scikit-mobility: | + |
- | * datasets: https:// | + | |
- | * code: https:// | + | |
- | + | ||
- | 08/10 (Mod. 2) GeoPandas and scikit-mobility: | + |
- | * https:// | + | |
- | + | ||
- | 13/10 (Mod. 1) Case study 1: Injury prediction and how to deal with unbalanced datasets and perform feature selection: {{ : | + | |
- | * Prevedere è meglio che curare: AI al servizio dello sport (Predicting is better than curing: AI in the service of sport) https:// | + |
- | + | ||
- | + | ||
- | 15/10 (Mod. 2) Feature selection in Python | + | |
- | * notebook: https:// | + | |
- | * dataset1: https:// | + | |
- | * dataset2: https:// | + | |
- | + | ||
- | 20/10 (Mod. 3) MidTerm1 | + | |
- | * BigData-Islanders | + | |
- | * WeMine | + | |
- | * cpu_in_flames | + | |
- | + | ||
- | 22/10 (Mod. 3) MidTerm1 | + | |
- | * How I Met Your Big Data | + | |
- | * SLM | + | |
- | * The Missing Values | + | |
- | + | ||
- | 27/10 (Mod. 3) Comments and discussion on Mid Term 1 {{ : | + |
- | + | ||
- | 29/10 (Mod. 1) Case Study 2: How to use Data Science to nowcast well-being {{ : | + | |
- | + | ||
- | 03/11 (Mod. 1) Case Study 3: Performance evaluation in sports | + | |
- | * {{ : | + | |
- | * {{ : | + | |
- | + | ||
- | 05/11 NO LESSON | + | |
- | + | ||
- | 10/11 (Mod. 2) Interpretations and Explanations 1: https:// | + | |
- | __**ONLINE ONLY LESSON**__ | + | |
- | + | ||
- | 12/11 (Mod. 2) Interpretations and Explanations 2: https:// | + | |
- | __**ONLINE ONLY LESSON**__ | + | |
- | + | ||
- | 17/11 (Mod. 3) Mid Term 2 | + |
- | * How I Met Your Big Data | + | |
- | * WeMine | + | |
- | * The Missing Values | + | |
- | + | ||
- | 19/11 (Mod. 3) Mid Term 2 | + |
- | * BigData-Islanders | + | |
- | * SLM | + | |
- | * cpu_in_flames | + | |
- | + | ||
- | 01/12 (Mod. 3) Paper presentations | + | |
- | + | ||
- | 03/12 (Mod. 3) Paper presentations | + | |
- | + | ||
- | 10/12 (Mod. 3) Paper presentations | + | |
- | + | ||
- | 15/12 (Mod. 3) Mid Term 3 | + | |
- | + | ||
- | 17/12 (Mod. 3) Mid Term 3 | + | |
- | ===== Exam (Appelli) ===== | + | |
- | TDA | + | |
- | + | ||
- | ====== Previous Big Data Analytics websites ====== | + | |
[[bigdataanalytics: | [[bigdataanalytics: |
bigdataanalytics/bda/start.1636648679.txt.gz · Last modified: 11/11/2021 at 16:37 (3 years ago) by Luca Pappalardo