
This is an old version of the document!


Guidelines for the homework on clustering

  • Data Understanding: a preliminary step to identify properties of the data that can guide the clustering analysis (8 points)
    • Analysis of the variable distributions and suitable transformations of the variables
    • Elimination of redundant variables through correlation analysis
  • Clustering analysis by K-means (15 points)
    • Identification of the best value of k
    • Characterization of the obtained clusters, both by analyzing the k centroids and by comparing the distribution of the variables within each cluster with their distribution in the whole dataset
  • Analysis by density-based clustering (7 points)
    • Study of the clustering parameters
    • Characterization and interpretation of the obtained clusters
  • Analysis by hierarchical clustering (optional, 3 points)
    • Analysis to be performed on a sample of the data for scalability reasons
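
As a rough illustration of the first three items above, the following sketch runs on synthetic data, not on the homework dataset. Everything in it is an assumption made for the example: the function names (`drop_correlated`, `kmeans`, `k_distances`), the 0.9 correlation threshold, the z-score normalization, the use of the SSE curve to choose k, and the k-distance heuristic for the density-based parameters. It uses only NumPy; a library such as scikit-learn would normally replace the hand-written K-means.

```python
import numpy as np


def drop_correlated(X, threshold=0.9):
    """Indices of columns to keep: a column is dropped when its absolute
    Pearson correlation with an already kept column exceeds `threshold`."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, i] <= threshold for i in keep):
            keep.append(j)
    return keep


def _assign(X, centroids):
    """Label each row of X with the index of its nearest centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_init=10, n_iter=100):
    """Plain Lloyd's algorithm with `n_init` random restarts.
    Returns (labels, centroids, sse) of the restart with the lowest SSE."""
    best = None
    for seed in range(n_init):
        rng = np.random.default_rng(seed)
        centroids = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            labels = _assign(X, centroids)
            # An empty cluster keeps its previous centroid.
            updated = np.array([
                X[labels == c].mean(axis=0) if np.any(labels == c) else centroids[c]
                for c in range(k)
            ])
            if np.allclose(updated, centroids):
                break
            centroids = updated
        labels = _assign(X, centroids)
        sse = float(((X - centroids[labels]) ** 2).sum())
        if best is None or sse < best[2]:
            best = (labels, centroids, sse)
    return best


def k_distances(X, k=4):
    """Sorted distance of every point to its k-th nearest neighbour."""
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    dists.sort(axis=1)  # column 0 is each point's zero distance to itself
    return np.sort(dists[:, k])


# Synthetic stand-in for the real data: three 2-D blobs plus a third
# column that is a near copy of column 0 (a redundant variable).
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
blobs = np.vstack([c + 0.5 * rng.standard_normal((50, 2)) for c in centers])
redundant = 2.0 * blobs[:, [0]] + 0.01 * rng.standard_normal((150, 1))
data = np.hstack([blobs, redundant])

# Data understanding: remove redundant variables, then z-score the rest.
kept = drop_correlated(data)
X = data[:, kept]
X = (X - X.mean(axis=0)) / X.std(axis=0)

# K-means: SSE for several values of k, then the centroids for one choice.
sse_by_k = {k: kmeans(X, k)[2] for k in range(1, 7)}
labels, centroids, _ = kmeans(X, 3)

# Density-based clustering: k-distance curve used to pick a radius.
kdist = k_distances(X, k=4)
```

Plotting `sse_by_k` against k and looking for the point where the curve flattens is one common way to choose k. Because `X` is z-scored, each centroid coordinate can be read as the cluster's deviation from the overall mean of that variable, in standard deviations, which gives a first comparison between a cluster and the whole dataset. The knee of the sorted `kdist` curve is often used as a starting value for the radius of a density-based method such as DBSCAN, with the minimum number of points set close to the same k.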
dm/start/clustering.1355839756.txt.gz · Last modified: 18/12/2012 at 14:09 (12 years ago) (external edit)
