Whether to initialize using the probabilistic, farthest-first-like method of the k-means++ algorithm rather than the standard random selection of initial cluster centers.
distanceFunction
String, default: "Euclidean"
Distance function to use. Options are: Euclidean and Manhattan.
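To illustrate how the initialization and distance options interact, here is a minimal pure-Python sketch of k-means++-style seeding with a selectable distance function. It is not the Weka implementation behind this clusterer. The names `kmeanspp_init`, `DISTANCES`, and the `seed` parameter are hypothetical and exist only for this example.

```python
import random


def euclidean(a, b):
    """Straight-line (L2) distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    """City-block (L1) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical lookup keyed by the option names listed above.
DISTANCES = {"Euclidean": euclidean, "Manhattan": manhattan}


def kmeanspp_init(points, k, distance_function="Euclidean", seed=0):
    """Pick k initial centers with k-means++-style seeding (illustrative only)."""
    dist = DISTANCES[distance_function]
    rng = random.Random(seed)

    # The first center is chosen uniformly at random.
    centers = [rng.choice(points)]

    while len(centers) < k:
        # Weight each point by its squared distance to the nearest chosen
        # center, so far-away points are more likely to be picked next.
        weights = [min(dist(p, c) for c in centers) ** 2 for p in points]
        if sum(weights) == 0:
            # Every point coincides with an existing center; fall back to
            # a uniform choice.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


points = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)]
centers = kmeanspp_init(points, k=3, distance_function="Manhattan", seed=1)
```

Because already-chosen points get a weight of zero, the sketch never selects the same point twice when the input points are distinct. Plain random initialization, by contrast, would draw every center uniformly and could start with several centers in the same dense region.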
Last updated 2024-09-19 UTC.