Foundations of Machine Learning
Clustering Basics
Clustering, Lesson 8

Clustering Basics
  Definition and Motivation
  Data Preprocessing and Similarity Computation
  Objective of Clustering
  Clustering Evaluation

Definition and Motivation

Clustering means finding groups of objects such that the objects in a group are similar (or related) to one another and different from (or unrelated to) the objects in other groups.

Clustering can serve as:
  A stand-alone tool: explore the data distribution
  A preprocessing step for other algorithms

Application areas: pattern recognition, spatial data analysis, image processing, market research, WWW, …
  Cluster documents
  Cluster web log data to discover groups of similar access patterns
  Cluster co-expressed genes
  Marketing: help marketers discover distinct groups in their customer bases, then use this knowledge to develop targeted marketing programs
  Climate: understand the Earth's climate by finding atmospheric and oceanic patterns

Two Important Aspects
  Properties of the input data: define the similarity or dissimilarity between points
  Requirement of clustering: define the objective and methodology

Data Preprocessing and Similarity Computation

Data: a collection of data objects and their attributes
An attribute is a p...