veda.ng
Back to Glossary

Cluster Analysis

Cluster analysis groups a collection of items so that members of the same group are more similar to each other than to members of other groups. It is unsupervised, meaning it discovers structure hidden in the data without pre-labeled examples. Common algorithms include k-means, hierarchical clustering, DBSCAN, and Gaussian mixture models. Each defines "closeness" differently. In marketing, clustering segments customers by purchase behavior for targeted promotions. In genetics, it groups gene expression profiles to identify disease subtypes. In finance, it detects assets that move together for portfolio diversification. Recommendation engines use clusters of user preferences to suggest products. Cybersecurity tools cluster network logs to spot anomalous activity. Urban planners cluster traffic incidents to prioritize safety improvements. The ability to automatically discover meaningful groupings without supervision is essential for any organization dealing with large, unstructured datasets.