Instructors:
Teaching Assistant
Instructors:
Teaching Assistant
Classes
Day of Week | Hour | Room |
---|---|---|
Monday | 11:00 - 13:00 | C1 |
Wednesday | 11:00 - 13:00 | C1 |
Office hours:
Classes
Day of Week | Hour | Room |
---|---|---|
Monday | 09:00 - 11:00 | C |
Wednesday | 11:00 - 13:00 | C |
Office hours:
Other software for Data Mining
# | Day | Time | Room | Topic | Material | Lecturer |
---|---|---|---|---|---|---|
01. | 18.09.2023 | 11-13 | C1 | Overview, Introduction | Intro | Pedreschi |
| | 20.09.2023 | 11-13 | | No Lecture | | |
02. | 25.09.2023 | 11-13 | C1 | Lab. Introduction to Python | Python Basic | Guidotti |
03. | 27.09.2023 | 11-13 | C1 | Lab. Data Understanding | Data Understanding | Guidotti |
04. | 02.10.2023 | 11-13 | C1 | Data Understanding | Data Understanding | Guidotti |
05. | 04.10.2023 | 11-13 | C1 | Data Understanding & Preparation | Data Understanding, Data Preparation | Pedreschi |
06. | 09.10.2023 | 11-13 | C1 | Data Preparation & Data Similarity | Data Preparation, Data Similarity | Pedreschi |
07. | 11.10.2023 | 11-13 | C1 | Data Similarity & Lab. Data Understanding | Data Similarity, Data Understanding | Pedreschi |
08. | 16.10.2023 | 11-13 | C1 | Introduction to Clustering, K-Means | Intro_Clustering, K-Means | Pedreschi |
09. | 18.10.2023 | 11-13 | C1 | Clustering Validation, Hierarchical Clustering | Intro_Clustering, Hierarchical | Pedreschi |
10. | 23.10.2023 | 11-13 | C1 | Density-based Clustering | Density-based Clustering | Pedreschi |
11. | 25.10.2023 | 11-13 | C1 | Lab. Clustering | Clustering | Guidotti |
12. | 30.10.2023 | 11-13 | C1 | Ex. Clustering | ExClustering | Guidotti |
| | 01.11.2023 | 11-13 | | No Lecture | | |
13. | 06.11.2023 | 11-13 | C1 | Intro Classification, kNN (video) | Intro_Classification, kNN | Guidotti |
14. | 08.11.2023 | 11-13 | C1 | Naive Bayes, Exercises | Naive Bayes | Guidotti |
15. | 13.11.2023 | 11-13 | C1 | Model Evaluation | Model Evaluation | Guidotti |
16. | 15.11.2023 | 11-13 | C1 | Model Evaluation Exercises & Lab | Classification | Guidotti |
| | 20.11.2023 | 11-13 | | No Lecture | | |
17. | 22.11.2023 | 11-13 | C1 | Decision Tree Classifier | Decision Tree | Pedreschi |
18. | 27.11.2023 | 11-13 | C1 | Decision Tree Classifier | Decision Tree | Pedreschi |
19. | 29.11.2023 | 11-13 | C1 | Exercises and Lab. Decision Tree Classifier | Decision Tree | Guidotti |
20. | 04.12.2023 | 11-13 | C1 | Decision Tree Classifier, Exercises and Lab | Decision Tree | Pedreschi |
21. | 06.12.2023 | 11-13 | C1 | Intro Regression & Lab. Regression | Regression, Regression | Guidotti |
22. | 11.12.2023 | 11-13 | C1 | Intro Pattern Mining and Apriori | Pattern Mining | Pedreschi |
23. | 13.12.2023 | 16-18 | C1 | Apriori & Lab. Pattern Mining | Pattern Mining, Pattern Mining | Pedreschi |
24. | 18.12.2023 | 11-13 | C | FP-Growth and Exercises | Pattern Mining | Guidotti |
# | Day | Time | Room | Topic | Material | Lecturer |
---|---|---|---|---|---|---|
01. | 19.02.2024 | 14-16 | C | Overview, Rule-based Models | Introduction, Guidelines, Rule-based Models | Guidotti |
| | 21.02.2024 | | | No Lecture | | |
| | 26.02.2024 | | | No Lecture | | |
02. | 28.02.2024 | 11-13 | C | Sequential Pattern Mining | Sequential Pattern Mining, GSP | Guidotti |
03. | 04.03.2024 | 9-11 | C | Sequential Pattern Mining | Sequential Pattern Mining, GSP | Guidotti |
04. | 06.03.2024 | 11-13 | C | Transactional Clustering | Transactional Clustering | Guidotti |
05. | 11.03.2024 | 9-11 | C | Time Series Similarity | Time Series Similarity, TS_Load, TS_Similarity | Guidotti |
06. | 13.03.2024 | 11-13 | C | Time Series Approximation | Time Series Clustering, TS_Approx_Clustering | Guidotti |
07. | 18.03.2024 | 9-11 | C | Time Series Clustering & Motifs | Time Series Motifs, TS_Motifs | Guidotti |
08. | 20.03.2024 | 11-13 | C | Time Series Classification | Time Series Classification, TS_Classification | Guidotti |
09. | 25.03.2024 | 9-11 | C | Imbalanced Learning | Imbalanced Learning, ImbLearn | Guidotti |
10. | 27.03.2024 | 11-13 | C | Dimensionality Reduction | Dimensionality Reduction, DimRed | Guidotti |
11. | 03.04.2024 | 11-13 | C | Outlier Detection | Outlier Detection | Guidotti |
12. | 08.04.2024 | 9-11 | C | Outlier Detection | Outlier Detection, OutlierDetection | Guidotti |
13. | 10.04.2024 | 11-13 | C | Outlier Detection | Outlier Detection, OutlierDetection | Guidotti |
14. | 15.04.2024 | 14-16 | C | Gradient Descent, MLE | GD, MLE | Guidotti |
15. | 17.04.2024 | 11-13 | C | Odds, LogOdds, Logistic Regression | Odds, LogReg, LogReg | Guidotti |
16. | 22.04.2024 | 9-11 | C | Support Vector Machine | SVM, SVM | Guidotti |
17. | 24.04.2024 | 11-13 | C | Perceptron, Neural Networks | Perceptron | Guidotti |
18. | 29.04.2024 | 9-11 | C | Deep Neural Networks | Deep Neural Networks, NN | Guidotti |
19. | 06.05.2024 | 9-11 | C | CNN, RNN, DL-TS, Ensemble Intro | DNN, TSC-DNN, Ensemble | Guidotti |
20. | 08.05.2024 | 11-13 | C | Ensemble, Boosting, Adaboost | Ensemble, LabEnsemble | Guidotti |
21. | 13.05.2024 | 9-11 | C | Ensemble-TS, Gradient Boosting | Gradient Boosting Machines, LabEnsemble | Guidotti |
22. | 15.05.2024 | 11-13 | C | Extreme Gradient Boosting | Gradient Boosting Machines, LabEnsemble | Guidotti |
23. | 20.05.2024 | 9-11 | C1 | eXplainable Artificial Intelligence | XAI, LabXAI | Guidotti |
24. | 22.05.2024 | 11-13 | C1 | eXplainable Artificial Intelligence | XAI, LabXAI | Guidotti |
How and Where: The exam is oral only and takes place in the teacher's office or in a previously designated classroom. It will be held online, on the 420AA Data Mining course channel, only at the student's request and in accordance with current legislation.
When: The start dates of the three exam sessions are, or will be, published on the online platform https://esami.unipi.it/. Within each session, we will set dates and slots so that the oral exams are spread out. These dates and slots will be published on the course page by the end of May. Each student must also register on https://esami.unipi.it/. The oral exam can be taken only after the project has been delivered, and the project must be delivered one week before the date on which you want to take the exam. Oral exams taken together by the members of a project group are preferred, so that the project can be discussed with the whole group at once. However, taking the oral exam together with the other group members is not mandatory. If the oral exam is not passed, it cannot be retaken for 20 days. If the project is not considered sufficient, it must be redone on a new dataset or on a substantially updated version of the current one.
What: The oral test will evaluate the practical understanding of the algorithms. The exam will evaluate three aspects.
questionable steps or choices.
Final Mark: for the 12-credit exam, the final mark is the average of the DM1 and DM2 marks.
When registering for the oral exam, write "DM1" in the notes if you do not intend to take DM2 (DM2 is assumed by default). After booking for DM1, contact Prof. Pedreschi to agree on the exam date, putting Prof. Guidotti and Andrea Fedele in cc. There will be no agenda for DM1.
Do not forget to fill in the course evaluation!
The exam is composed of two parts:
DM1 Project Guidelines: see Project Guidelines.
The exam is composed of two parts:
DM2 Project Guidelines: see Project Guidelines.
… a new kind of professional has emerged, the data scientist, who combines the skills of software programmer, statistician and storyteller/artist to extract the nuggets of gold hidden under mountains of data. Hal Varian, Google’s chief economist, predicts that the job of statistician will become the “sexiest” around. Data, he explains, are widely available; what is scarce is the ability to extract wisdom from them.
Data, data everywhere. The Economist, Special Report on Big Data, Feb. 2010.