Instructors:
Teaching Assistant
Classes
Day of Week | Hour | Room |
---|---|---|
Monday | 11:00 - 13:00 | Aula C / MS Teams |
Thursday | 11:00 - 13:00 | Aula A1 / MS Teams |
Office hours - Ricevimento:
No. | Day | Room | Topic | Learning material | Recording | Instructor |
---|---|---|---|---|---|---|
1. | 16.09.2021 11:00-12:45 | Aula Fib A1 | Introduction. | Introducing DM1 Project-work guidelines (updated 22.11.2021) | Lecture 1 | Pedreschi |
2. | 20.09.2021 11:00-12:45 | Aula Fib C | Course overview | Overview of contents | Lecture 2 | Pedreschi |
3. | 23.09.2021 11:00-12:45 | Aula Fib A1 | Data Understanding | Slides | Lecture 3 | Pedreschi |
4. | 27.09.2021 11:00-12:45 | Aula Fib C | Data Preparation | Slides | Lecture 4 | Pedreschi |
5. | 30.09.2021 11:00-12:45 | Aula Fib A1 | Lab: Data Understanding & Preparation – Python | Python Introduction Dataset: Iris Hands-On Python (Iris) | Lecture 5 | Citraro |
6. | 04.10.2021 11:00-12:45 | Aula Fib C | Lab: Data Understanding & Preparation – Python (cont.) & KNIME | Dataset: Titanic, Hands-On Python (Titanic), Titanic DU+DP (complete), KNIME: Intro, KNIME DU+DP | Lecture 6 | Citraro |
7. | 07.10.2021 11:00-12:45 | Aula Fib A1 | Clustering: Intro & K-means | Clustering intro and k-means [revised version] | Lecture 7 | Nanni |
8. | 14.10.2021 11:00-12:45 | Aula Fib A1 | Clustering: k-means | | Lecture 8 | Nanni |
9. | 21.10.2021 11:00-12:45 | Aula Fib A1 | Clustering: Hierarchical methods | Clustering: Hierarchical Methods | Lecture 9 | Nanni |
10. | 25.10.2021 11:00-12:45 | Aula Fib C | Clustering: density-based methods & exercises | Clustering: Density-based methods | Lecture 10 | Nanni |
11. | 28.10.2021 11:00-12:45 | Aula Fib A1 | Lab: Clustering | Python Hands-On Clust. (Iris) Python Titanic Knime | Lecture 11 | Citraro |
12. | 04.11.2021 11:00-12:45 | Aula Fib A1 | Classification: intro and decision trees | Classification and decision trees (updated 11.11.2021) | Lecture 12 | Nanni |
13. | 08.11.2021 11:00-12:45 | Aula Fib C | Classification: decision trees/2 | | Lecture 13 | Nanni |
14. | 11.11.2021 11:00-12:45 | Aula Fib A1 | Classification: decision trees/3 | | Lecture 14 | Nanni |
15. | 15.11.2021 11:00-12:45 | Aula Fib C | Classification: decision trees/4 | | Lecture 15 | Nanni |
16. | 18.11.2021 11:00-12:45 | Aula Fib A1 | Classification: decision trees exercises | Exercise | Lecture 16 | Nanni |
17. | 22.11.2021 11:00-12:45 | Aula Fib C | Lab: Classification | knime_classification, Hands_on_Python_Titanic, Python_Iris | Online: TBD Lecture 17 (offline) | Citraro |
18. | 25.11.2021 11:00-12:45 | Aula A1 | Pattern Mining - 1 | Slides | Lecture 18 | Pedreschi |
19. | 29.11.2021 11:00-12:45 | Aula C | Pattern Mining - 2 | | Lecture 19 | Pedreschi |
20. | 02.12.2021 11:00-12:45 | Aula A1 | Lab: Pattern Mining | Apriori Exercise Hands_on_Python_Titanic KNIME | Lecture 20 | Citraro |
No. | Day | Room / Teams | Topic | Learning material | Instructor | Recordings |
---|---|---|---|---|---|---|
01. | 14.02.2022 11:00–13:00 | C | Introduction, CRISP, Evaluation, KNN | Intro, CRISP, Eval, KNN, Notebook_KNN_Eval | Guidotti | link |
02. | 17.02.2022 11:00–13:00 | A1 | Imbalanced Learning, Evaluation | ImbLearn Eval, ImbLearn | Guidotti | link |
03. | 21.02.2022 11:00–13:00 | C | Dimensionality Reduction | DimRed, Notebook_DimRed | Guidotti | link |
04. | 24.02.2022 11:00–13:00 | A1 | Outlier Detection (part 1) | Outlier Detection, Notebook_OutlierDetection | Guidotti | link |
05. | 28.02.2022 11:00–13:00 | C | Outlier Detection (part 2) | Outlier Detection, Notebook_OutlierDetection | Guidotti | link |
06. | 03.03.2022 11:00–13:00 | A1 | Outlier Detection (part 3) | Outlier Detection, Notebook_OutlierDetection | Guidotti | link |
07. | 07.03.2022 11:00–13:00 | C | Naive Bayes Classifier, Linear Regression | NBC, Notebook_NBC, LinReg | Guidotti | link |
08. | 10.03.2022 11:00–13:00 | A1 | Linear Regression, Gradient Descent, Maximum Likelihood Estimation, Odds | LinReg, GradDes, MLE, Odds | Guidotti | link |
09. | 14.03.2022 11:00–13:00 | C | Logistic Regression, Support Vector Machines | LogReg, SVM, Notebook_LR, Notebook_SVM | Guidotti | link |
10. | 17.03.2022 11:00–13:00 | A1 | Linear and Logistic Perceptron | Perceptron | Guidotti | link1, link2 |
11. | 21.03.2022 11:00–13:00 | C | Neural Networks | NeuralNetwork, Notebook_NN, Notebook_NN_impl | Guidotti | link |
12. | 24.03.2022 11:00–13:00 | A1 | Ensemble Classifiers, Bagging, Random Forest | EnsembleClassifiers, Notebook_ENS | Guidotti | link |
13. | 28.03.2022 11:00–13:00 | C | Boosting, Gradient Boost | GBM | Guidotti | link |
14. | 31.03.2022 11:00–13:00 | A1 | XGBoost, LightGBM | GBM, Notebook_GBM | Guidotti | link |
15. | 04.04.2022 11:00–13:00 | C | Time Series Introduction, Distance Functions | TS_Intro_Distances, Notebook_TS_Sim, Notebook_TS_DTW_Impl, Notebook_TS_DTW_Constr_Impl | Guidotti | link |
16. | 07.04.2022 11:00–13:00 | A1 | Time Series Approximations, Clustering | TS_Approx_Clustering, Notebook_TS_ApproxClus | Guidotti | link |
17. | 11.04.2022 11:00–13:00 | C | Time Series Motifs, Discord, Matrix Profile | TS_MatrixProfile, TS_MatrixProfile | Guidotti | link |
18. | 14.04.2022 11:00–13:00 | A1 | Time Series Classification | TS_Classification, Notebook_TSC, Notebook_TSC_SoA | Guidotti | link |
19. | 21.04.2022 11:00–13:00 | A1 | Sequential Pattern Mining | SPM | Guidotti | link |
20. | 28.04.2022 11:00–13:00 | A1 | Sequential Pattern Mining | SPM, Notebook_SPM | Guidotti | link |
21. | 02.05.2022 11:00–13:00 | C | Advanced Clustering Approaches | Advanced_Clustering, Notebook_AC | Guidotti | link |
22. | 05.05.2022 11:00–13:00 | A1 | Transactional Clustering | Transactional Clustering, Notebook_TC | Guidotti | link |
23. | 09.05.2022 11:00–13:00 | C | Explainable Artificial Intelligence | Explainability, Notebook_XAI | Guidotti | link |
24. | 12.05.2022 11:00–13:00 | A1 | Explainable Artificial Intelligence | Explainability, Notebook_XAI | Guidotti | link |
The exam is composed of two parts:
Exam Rules
Exam Booking Periods
Exam Booking Agenda
For online exams, the camera must remain on and you must be able to share your screen. Online exams may also require the use of the Miro platform (https://miro.com/app/dashboard/).
Project Guidelines
N.B. When “solving the classification task”, remember (i) to test, when needed, different criteria for estimating the parameters of the algorithms, and (ii) to evaluate the classifiers (e.g., Accuracy, F1, Lift Chart) in order to compare the results obtained with an imbalanced-learning technique against those obtained using the “original” dataset.
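The comparison asked for in the note can be sketched in a few lines of plain Python. This is only an illustrative sketch, not part of the official project guidelines: the labels and predictions are made-up toy values, and `accuracy`, `f1_score` and `random_oversample` are hypothetical helper names written for this example (in practice a library such as scikit-learn or imbalanced-learn would normally be used). It shows why accuracy alone can hide the effect of an imbalanced-learning technique, and what a simple random oversampling of the training set looks like.

```python
import random
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """F1 of the positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest one (training data only)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


# Toy, made-up test labels: 8 negatives and 2 positives.
y_true = [0] * 8 + [1] * 2

# Hypothetical predictions on the same untouched test set:
# a model trained on the original data that always predicts the majority
# class, and a model trained on oversampled data that finds both positives
# at the cost of two false alarms.
pred_original = [0] * 10
pred_resampled = [0] * 6 + [1] * 4

for name, pred in [("original", pred_original), ("oversampled", pred_resampled)]:
    print(f"{name}: accuracy={accuracy(y_true, pred):.2f}, "
          f"F1={f1_score(y_true, pred):.2f}")

# Oversampling itself, applied to a toy one-feature training set.
X_bal, y_bal = random_oversample(list(range(10)), y_true)
print(Counter(y_bal))
```

In this toy case both sets of predictions reach the same accuracy while their F1 scores differ, which is why the note asks for more than one metric. The resampling is applied to the training data only; the test set keeps its original class distribution so that the two results remain comparable.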
Session | Date | Time | Room | Notes | Marks |
---|---|---|---|---|---|
1. | 11.01.2022 | 14:00 - 18:00 | MS Teams | Please use the system for registration: https://esami.unipi.it/ | |
3. | 07.06.2022 | | | Please use the system for registration: https://esami.unipi.it/ | |
4. | 28.06.2022 | | | Please use the system for registration: https://esami.unipi.it/ | |
5. | 19.07.2022 | | | Please use the system for registration: https://esami.unipi.it/ | |
6. | 05.09.2022 | | | Please use the system for registration: https://esami.unipi.it/ | |
… a new kind of professional has emerged, the data scientist, who combines the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data. Hal Varian, Google’s chief economist, predicts that the job of statistician will become the “sexiest” around. Data, he explains, are widely available; what is scarce is the ability to extract wisdom from them.
Data, data everywhere. The Economist, Special Report on Big Data, Feb. 2010.