How does AI Studio calculate the distance between instances to create clusters?

The AI Studio clustering algorithm groups data points by their proximity to one another. How that proximity is computed depends on the field type.

  • For numeric fields, proximity is measured with the Euclidean distance, and the clustering minimizes the total distance from each data point to its assigned centroid.
  • For categorical fields, AI Studio uses a special binary distance (0 or 1) function where:

if valA == valB then
    distance = 0
else
    distance = 1 or user-defined scale value
endif

AI Studio also uses the most common category among a cluster's member instances as the centroid value for that field, and then computes the Euclidean distance as normal.

  • For text and items fields, AI Studio takes a different approach and bases the distance on cosine similarity. The terms the algorithm picks for a centroid are those that minimize the average cosine distance between the centroid and the points in its neighborhood.
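
The three distance rules above can be illustrated with a minimal Python sketch. This is not AI Studio's actual implementation: every function name here is hypothetical, text fields are modeled as simple bags of terms, and `mixed_distance` reflects only one plausible reading of how the categorical 0-or-scale value enters the Euclidean computation.

```python
import math
from collections import Counter


def numeric_distance(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def categorical_distance(val_a, val_b, scale=1.0):
    """Binary distance: 0 when the categories match, otherwise `scale`."""
    return 0.0 if val_a == val_b else scale


def categorical_centroid(values):
    """Most common category among the member instances."""
    return Counter(values).most_common(1)[0][0]


def cosine_distance(terms_a, terms_b):
    """1 minus the cosine similarity of two bags of terms."""
    counts_a, counts_b = Counter(terms_a), Counter(terms_b)
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: an empty bag is maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def mixed_distance(nums_a, nums_b, cats_a, cats_b, scale=1.0):
    """Assumed combination: each categorical mismatch is treated as one
    more coordinate difference inside the Euclidean sum."""
    squared = sum((x - y) ** 2 for x, y in zip(nums_a, nums_b))
    squared += sum(categorical_distance(p, q, scale) ** 2
                   for p, q in zip(cats_a, cats_b))
    return math.sqrt(squared)
```

For example, `numeric_distance([0, 0], [3, 4])` gives 5.0, and two instances that differ in one categorical field are pushed apart by the scale value, so a larger scale makes that field weigh more heavily in the clustering.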