Instructors - Docenti:
Teaching assistant - Assistente:
Instructors:
… a new kind of professional has emerged, the data scientist, who combines the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data. Hal Varian, Google’s chief economist, predicts that the job of statistician will become the “sexiest” around. Data, he explains, are widely available; what is scarce is the ability to extract wisdom from them.
Data, data everywhere. The Economist, Special Report on Big Data, Feb. 2010.
The wide availability of data from relational databases, the web, and other sources motivates the study of data analysis techniques that allow a better understanding and easier use of the results in decision-making processes. The course aims to provide an introduction to the basic concepts of the knowledge discovery process, the main data mining techniques, and their algorithms. Particular emphasis is placed on methodological aspects, presented through paradigmatic classes of applications such as market basket analysis, market segmentation, and fraud detection. Finally, the course introduces the privacy and ethical issues involved in applying inference techniques to data, of which the analyst must be aware. The course consists of the following parts:
Classes - Lezioni
Day | Time | Room |
---|---|---|
Monday | 16:00 - 18:00 | Room C |
Friday | 14:00 - 16:00 | Room A1 |
Office hours - Ricevimento:
Classes - Lezioni
Day of week | Hour | Room |
---|---|---|
Monday | 9:00 - 11:00 | Room N1 |
Thursday | 9:00 - 11:00 | Room A1 |
Office hours - Ricevimento:
No. | Day | Room | Topic | Learning material | Instructor |
---|---|---|---|---|---|
1. | 25.09.2014 14:00-16:00 | B | Intro: data mining & knowledge discovery process | Textbook, Chapt. 1 dm_intro-2011.pdf | Pedreschi |
2. | 26.09.2014 16:00 | CNR | BRIGHT event at the CNR in Pisa - Big Data Tales | | Pedreschi |
3. | 02.10.2014 14:00-16:00 | B | Intro: data mining & knowledge discovery process | Textbook, Chapt. 1 dm_intro-2011.pdf | Pedreschi |
4. | 03.10.2014 14:00-16:00 | A1 | Intro: data mining & knowledge discovery process | Textbook, Chapt. 1 dm_intro-2011.pdf | Pedreschi |
5. | 09.10.2014 14:00-16:00 | B | Data: types and basic measures | Textbook, Chapt. 2 chap2_data_new.pdf | Pedreschi |
6. | 10.10.2014 14:00-16:00 | A1 | Data: types and basic measures | Textbook, Chapt. 2 chap2_data_new.pdf | Pedreschi |
7. | 13.10.2014 14:00-16:00 | B | Data: types and basic measures | Textbook, Chapt. 2 chap2_data_new.pdf | Pedreschi |
8. | 17.10.2014 14:00-16:00 | A1 | Canceled | | Pedreschi |
9. | 20.10.2014 14:00-16:00 | B | Exploratory data analysis and data understanding. | Textbook, Chapt. 3 chap3_data_exploration.pdf | Pedreschi |
10. | 24.10.2014 14:00-16:00 | A1 | Clustering analysis. Centroid-based methods | Textbook, Chapt. 8 dm2014_clustering_intro.pdf dm2014_clustering_kmeans.pdf | Pedreschi |
11. | 27.10.2014 14:00-16:00 | B | Clustering analysis. Hierarchical methods | Textbook, Chapt. 8 dm2014_clustering_hierarchical.pdf | Pedreschi |
12. | 31.10.2014 14:00-16:00 | A1 | Tutorial on KNIME | Slides: knime_slides_dm.pdf Workflows: data-manipulation_iris.zip data-manipulation_adult.zip clustering_iris.zip | Pedreschi |
13. | 10.11.2014 14:00-16:00 | B | Clustering analysis. Density-based methods | Textbook, Chapt. 8 dm2014_clustering_dbscan.pdf | Pedreschi |
14. | 14.11.2014 14:00-16:00 | A1 | Classification and predictive methods | Textbook, Chapt. 4 chap4_basic_classification.pdf | Pedreschi |
15. | 17.11.2014 14:00-16:00 | B | Classification. Decision trees | Textbook, Chapt. 4 chap4_basic_classification.pdf | Pedreschi |
16. | 21.11.2014 14:00-16:00 | A1 | Classification. Decision trees | Textbook, Chapt. 4 chap4_basic_classification.pdf | Pedreschi |
17. | 24.11.2014 14:00-16:00 | B | Classification. Validation and Weka & KNIME Lab | Workflows: decisiontreeiris.zip decisiontreeadult.zip decisiontreeadultoverfitting.zip | Milli |
18. | 28.11.2014 14:00-16:00 | A1 | Classification. Rule-based and Bayesian methods | Textbook, Chapt. 4 chap4_basic_classification.pdf | Pedreschi |
19. | 01.12.2014 14:00-16:00 | B | Frequent Pattern Mining. | Textbook, Chapt. 6 2-3tdm-restructured_assoc_2013.pdf | Pedreschi |
20. | 05.12.2014 14:00-16:00 | A1 | Association Rule Mining | Textbook, Chapt. 6 2-3tdm-restructured_assoc_2013.pdf | Pedreschi |
21. | 12.12.2014 14:00-16:00 | A1 | Canceled due to strike | | Pedreschi |
22. | 15.12.2014 14:00-16:00 | B | Association Rule Mining and KNIME | Workflow: FP and AR | Monreale |
No. | Day | Room | Topic | Learning material | Instructor |
---|---|---|---|---|---|
1. | 23.02.2015 09:00-11:00 | N1 | Introduction + Sequential patterns / 1 | Sequential Patterns - Slides | Nanni |
2. | 26.02.2015 09:00-11:00 | A1 | Sequential patterns / 2 | Link to Tool for seq. patterns | Nanni |
3. | 02.03.2015 09:00-11:00 | N1 | Graph mining | Slides | Nanni |
 | 05.03.2015 09:00-11:00 | A1 | - | | |
4. | 09.03.2015 09:00-11:00 | N1 | Advanced Classification Methods / 1 | Slides | Monreale |
5. | 12.03.2015 09:00-11:00 | A1 | Advanced Classification Methods / 2 | | Monreale |
6. | 16.03.2015 09:00-11:00 | N1 | Advanced Classification Methods / 3 | Exercises on Classification | Monreale |
7. | 19.03.2015 09:00-11:00 | A1 | Time series / 1 | Slides | Nanni |
8. | 23.03.2015 09:00-11:00 | N1 | Time series / 2 | Example of DTW in R | Nanni |
9. | 26.03.2015 09:00-11:00 | A1 | Exercises | Exercises from past exams | Nanni |
10. | 30.03.2015 09:00-11:00 | N1 | Exercises | | Monreale |
11. | 02.04.2015 09:00-11:00 | A1 | Exercises | | Monreale |
 | 03-07.04.2015 | | Easter holidays | | |
 | 13.04.2015 09:00-11:00 | C1 | Midterm test | | |
12. | 16.04.2015 09:00-11:00 | A1 | Case study: CRM - Customer Segmentation + CRISP-DM | AMRP & Stulong CRISP-DM | Nanni |
13. | 23.04.2015 09:00-11:00 | A1 | Case study: CRM - Churn Analysis | Intro CRM Churn ST-Churn | Nanni |
14. | 27.04.2015 09:00-11:00 | N1 | Case study: CRM - Promotions and Sophistication | Promotions Sophistication | Nanni |
15. | 30.04.2015 09:00-11:00 | A1 | Spatiotemporal analysis / 1 | ST Analysis REF: Survey paper | Nanni |
16. | 04.05.2015 09:00-11:00 | N1 | Spatiotemporal analysis / 2 | | Nanni |
17. | 07.05.2015 09:00-11:00 | A1 | Case study: Spatiotemporal analysis / 1 + Projects presentation | Case study 1 Projects | Nanni |
18. | 11.05.2015 09:00-11:00 | N1 | Case study: Spatiotemporal analysis / 2 | Case study 2 | Nanni |
19. | 14.05.2015 09:00-11:00 | A1 | Spatiotemporal analysis / 3 | ST Classification | Nanni |
20. | 18.05.2015 09:00-11:00 | N1 | Outlier detection | Slides from SDM2010 tutorial | Nanni |
21. | 21.05.2015 09:00-11:00 | A1 | Ethical Issues in Data Analytics | Slides | Monreale |
22. | 25.05.2015 09:00-11:00 | N1 | Ethical Issues in Data Analytics / Fraud Detection Case Study | | Monreale |
The exam consists of a written test and an oral test:
The exam is composed of three parts:
Guidelines for the homework are here.
Session | Date | Hour | Place | Notes | Marks |
---|---|---|---|---|---|
Mid-term 2015 | Monday 13.04.2015 | 9.00 | Room C1 | | |
Session | Date | Time | Room | Notes | Results |
---|---|---|---|---|---|
1. | Monday 19 January 2015 | 9.00 | C | Results of written exam | |
1. | Wednesday 21 January 2015 | 9.00 | Pedreschi's office | Oral exam. Send an email to register for the oral exam | |
1. | Thursday 29 January 2015 | 14.00 | Pedreschi's office | Oral exam. Send an email to register for the oral exam | |
2. | Monday 16 February 2015 | 9.00 | C | Results of written exam | |
2. | Monday 23 February 2015 | 11.00 | Pedreschi's office | Oral exam. Send an email to register for the oral exam | |
2. | Monday 2 March 2015 | 11.00 | Pedreschi's office | Oral exam. Send an email to register for the oral exam | |
3. | Friday 05 June 2015 | 14.00 | C | Results of written exam |
Session | Date | Time | Room | Notes | Results |
---|---|---|---|---|---|
1. | Monday 19 January 2015 | 9.00 | C | ||
2. | Monday 16 February 2015 | 9.00 | C | ||
3. | Friday 05.06.2015 | 14.00 | C | ||
4. | Friday 26.06.2015 | 14.00 | C | ||
5. | Friday 17.07.2015 | 9.00 | C | ||
6. | Wednesday 09.09.2015 | 9.00 | C |
Date | Time | Room | Notes | Results |
---|---|---|---|---|
7 November 2014 | 9:00-11:00 | C1 |