Data Mining 2018
- Anna Monreale
Università di Pisa, Knowledge Discovery and Data Mining Lab
annam [at] di [dot] unipi [dot] it
News
Goals
Data mining and knowledge discovery techniques emerged as an alternative approach, aimed at revealing patterns, rules and models hidden in the data, and at supporting analysts in developing descriptive and predictive models for a range of business problems. This short course focuses on the main application scenarios of data mining for challenging problems in the broad Customer Relationship Management (CRM) domain.
Syllabus
- Clustering models. Discussion of real cases.
- Patterns and association rule mining for market basket analysis.
- Prediction models. Discussion of real cases.
Textbooks
- Slides (see Calendar).
- Berthold, M.R., Borgelt, C., Höppner, F., Klawonn, F. Guide to Intelligent Data Analysis. Springer Verlag, 1st Edition, 2010. ISBN 978-1-84882-259-7
- Pang-Ning Tan, Michael Steinbach, Vipin Kumar. Introduction to Data Mining. Addison Wesley, ISBN 0-321-32136-7, 2006
Reading about the "data analyst" job
Calendar
No. | Date | Topic | Learning material |
---|---|---|---|
01. | 18.09.2018 | Introduction to data mining and big data analytics. Data Understanding & Preparation | 1-introduction-sa.pdf 2-dataunderstanding-sa.pdf 3-data_preparation-sa.pdf |
02. | 19.09.2018 | KNIME: Data Understanding & Preparation. Clustering | 4-clusteringintroduction-sa.pdf 5-kmeans-sa.pdf 6-dbscan-sa.pdf 01_titanic_data_understanding |
03. | 20.09.2018 | KNIME: Clustering. Classification. | knime_clustering 7-classification-sa.pdf |
04. | 21.09.2018 | KNIME: Classification. Case Studies | knime_classification calcio_infortuni.pdf musicpref.pdf mensa.pdf |
Datasets
0. Iris. (for details see https://archive.ics.uci.edu/ml/datasets/iris)
1. Human Resources. (for details see https://www.kaggle.com/ludobenistant/hr-analytics)
2. Telco Churn. (for details see http://didawiki.di.unipi.it/doku.php/dm/mains.santanna.dm4crm.2016)
3. Adult. (for details see https://archive.ics.uci.edu/ml/datasets/Adult)
4. Titanic. (for details see https://www.kaggle.com/c/titanic)