What is a cluster?
Cluster analysis is an unsupervised Machine Learning pipeline that partitions a dataset into groups of similar instances. Instances in the same group, called a cluster, are more similar to each other than to instances in other groups. Because it is unsupervised, this form of analysis does not require any previously labeled data. The pipeline is commonly used for market and customer segmentation, portfolio management, and deriving new features from the data while revealing its underlying structure.
AI Studio clusters can be built using two different unsupervised learning algorithms:
- K-means: You need to specify the number of clusters (k) in advance.
- G-means: This algorithm takes a hierarchical approach to detecting the number of clusters, so you do not need to specify k. It learns the clusters autonomously by iteratively taking each existing cluster and checking whether that cluster’s neighborhood appears Gaussian in its distribution.
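To make the K-means idea concrete, here is a minimal sketch in plain Python. It is not AI Studio's implementation: the function names `init_centroids` and `kmeans`, the farthest-point initialization, and the sample `points` are all assumptions chosen for illustration. It shows the two steps that repeat until the groups stop changing: assign each instance to its nearest centroid, then move each centroid to the mean of its members.

```python
import math


def init_centroids(points, k):
    """Illustrative initialization (an assumption, not AI Studio's method):
    start from the first point, then repeatedly add the point that is
    farthest from its nearest already-chosen centroid."""
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids)
        )
        centroids.append(farthest)
    return centroids


def kmeans(points, k, max_iterations=100):
    """Group `points` (tuples of numbers) into k clusters."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each instance joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # No instance changed cluster, so the result is stable.
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical sample data: two visibly separate groups of 2-D instances.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (8.0, 8.0), (8.2, 7.9), (7.9, 8.3)]
labels, centroids = kmeans(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Here k is supplied by the caller, which is the requirement noted for K-means above. G-means removes that requirement by deciding for itself when an existing cluster should be split further.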
Here is a video that shows how to separate your data into groups of similar instances using AI Studio.