Type your query or search by category
Why do I need to scale certain fields?
When clustering, the algorithm is going to compute a multidimensional Euclidean distance, so, the scale of numerical features is very important. For instance, think of a dataset with house prices and number of rooms among other fields. Since the values of house prices are larger than the number of rooms, you will end up having the house prices dominating the distance measurement in that cluster. Therefore, you need to scale those fields.