
Data Analytics for Digital Health (DAD)

Instructors:

News

Learning Goals

Hours and Rooms

Classes

Day of Week | Hour | Room
Monday | 09:00 - 11:00 | Room FIB PS4
Wednesday | 14:00 - 16:00 | Room C
Friday | 11:00 - 13:00 | Room FIB PS4

Office hours (by appointment via email):
Anna Monreale: Thu 09:00-11:00, online via Teams or in her office.
Francesca Naretto: Mon 11:00-13:00, online via Teams or in her office.

A Teams channel will be used ONLY to post news, Q&A, and other course-related material. Lectures will be held in person only and will NOT be live-streamed.

Learning Material

Textbook

Slides

Software

Class Calendar (2024/2025)

First Semester

Day | Topic | Learning material | References | Video Lectures | Teacher
1. 16.09 | Overview. Introduction to KDD + Data Types | Overview Introduction to DADH Data Understanding | Chap. 1 of the Kumar book | | Monreale
2. 18.09 | Data Understanding for tabular data | Slides of DU of the previous lecture | Chap. 2 of the Kumar book and the additional resource of the Kumar book: Data Exploration chapter (Chap. 3 in the first edition) | | Monreale
3. 20.09 | Data Preparation for tabular data | 3-data_preparation_dad.pdf | Chap. 2 of the Kumar book and the additional resource of the Kumar book: Data Exploration chapter (Chap. 3 in the first edition) | | Monreale
4. 23.09 | Data Understanding and Preparation for images | 4-data-understanding_images.pdf | Digital Image Processing, 3rd edition, Rafael Gonzalez, Richard Woods | | Naretto
5. 25.09 | Data Understanding and Preparation for images and Time Series | 5-data-understanding_ts.pdf | | | Naretto
6. 27.09 | Data Understanding and Preparation for Time Series + Python Lab | Intro to Python | | | Naretto
7. 30.09 | Data Understanding and Preparation for Tabular Data: Python Lab | data_und.zip | | | Naretto
8. 02.10 | Data Understanding and Preparation for Images and Time Series: Python Lab | data_und.zip | | | Naretto
9. 04.10 | Data Management and Data Warehousing | 6-dw.pdf | | | Monreale
9. 07.10 | Data Management and Data Warehousing | 6-dw.pdf | | | Monreale
9. 09.10 | Data Reporting - Project presentation | 6-dw.pdf | | | Monreale
10. 14.10 | Clustering: intro and k-means | 6-basic_cluster_analysis-intro.pdf, 6-basic_cluster_analysis-kmeans.pdf | Chapter 7, Introduction to Data Mining, 2nd Edition by Tan, Steinbach, Karpatne, Kumar | | Naretto
11. 16.10 | Clustering: k-means and hierarchical | 7.basic_cluster_analysis-hierarchical.pdf | Chapter 7, Introduction to Data Mining, 2nd Edition by Tan, Steinbach, Karpatne, Kumar | | Naretto
12. 21.10 | Clustering: k-means variants and density-based approaches | 11-basic_cluster_analysis-kmeans-variants.pdf, 10-basic_cluster_analysis-dbscan.pdf | Chapter 7, Introduction to Data Mining, 2nd Edition by Tan, Steinbach, Karpatne, Kumar | | Monreale
13. 23.10 | Clustering: Validity | 12-basic_cluster_analysis-validity.pdf | Chapter 7, Introduction to Data Mining, 2nd Edition by Tan, Steinbach, Karpatne, Kumar | | Monreale
14. 25.10 | Clustering and similarity for Images | 3.2-clustering_images.pdf | | | Naretto
15. 28.10 | Clustering and similarity for Time Series | 8_time_series_similarity_2024.pdf | Time Series Analysis and Its Applications, 4th edition, Robert H. Shumway and David S. Stoffer | | Naretto
16. 30.10 | Python Lab: Clustering | clustering_diabetes.zip, images_similarity.zip, timeseries_similarity_clustering.zip, clustering_tabular_tips.zip | | | Naretto
17. 04.11 | Python Lab: Clustering + Frequent Pattern Mining | 17_association_analysis.pdf | | | Naretto, Monreale
18. 06.11 | Frequent Pattern Mining | Same slides as previous lecture | | | Monreale
19. 08.11 | Sequential Pattern Mining | 18_sequential_patterns_2024.pdf | | | Monreale
20. 11.11 | Python Lab: FPM + SPM | ar_spm.zip | | | Naretto
21. 13.11 | Classification for tabular data | | | | Naretto
22. 15.11 | Classification for tabular data | 10-knn.pdf | | | Naretto
23. 18.11 | Classification for tabular data | 10-lg.pdf | | | Naretto
24. 20.11 | Project | | | | Monreale, Naretto
25. 22.11 | Classification for tabular data | 10-rule-based-classifiers.pdf, 11_2021-naive_bayes.pdf | | | Naretto
26. 25.11 | Classification for tabular data | 13_ensemble_2023.pdf | | | Naretto

Exams

Project

The project consists of data analyses carried out with data mining tools. It must be performed by a team of 2 students, using Python. The guidelines require addressing specific tasks. Results must be reported in a single paper of at most 25 pages, including figures. Students must deliver both the paper (single column) and well-commented Python notebooks.

* The first part of the project consists of the assignments described here: Project Description: DU and Clustering

  1. Deadline: the first part must be delivered by December 2nd, 2024. Delivery will be through a Teams assignment.

Students who do not deliver the above project by Dec 31, 2024 must email the teachers to request a new project. The assigned project will require about 20 days of work and, after delivery, will be discussed during the oral exam.

Oral Exam

How to book the exam colloquium?

At https://esami.unipi.it/ you can find the exam dates: one in January and one in February. Each student must register for one of the two dates. These are not the dates of the colloquium or of the project delivery; we will use the list of registered students to organize the exam dates. After the registration deadline, we will share a calendar for the oral exam.