AI-generated Key Takeaways
- X-Means extends the K-Means clustering algorithm by efficiently estimating the number of clusters within a specified range.
- The algorithm iteratively evaluates candidate cluster splits using the Bayesian Information Criterion (BIC) and keeps the splits that the criterion favors.
- Users can set the minimum and maximum number of clusters, the iteration limits, the distance function, and the randomization seed.
- In Earth Engine, X-Means is exposed as ee.Clusterer.wekaXMeans, which returns a Clusterer.
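
The BIC-based split test mentioned above can be illustrated with a small sketch. This is not the Weka or Earth Engine implementation: it assumes 1-D data, hard assignments, and a single variance shared by all clusters, and the names `bic_score` and `should_split` are hypothetical helpers introduced only for illustration.

```python
import math


def bic_score(clusters):
    """Return a BIC-style score (higher is better) for non-empty 1-D clusters.

    Assumption: hard assignments and one variance shared by all clusters.
    """
    k = len(clusters)
    n = sum(len(c) for c in clusters)
    if n <= k:
        raise ValueError("need more points than clusters")
    means = [sum(c) / len(c) for c in clusters]
    sse = sum((x - m) ** 2 for c, m in zip(clusters, means) for x in c)
    variance = max(sse / (n - k), 1e-12)  # floor avoids log(0) on identical points

    # Gaussian log-likelihood of every point under its own cluster.
    log_likelihood = -sse / (2.0 * variance)
    for c in clusters:
        n_c = len(c)
        log_likelihood += n_c * math.log(n_c / n)  # mixing-weight term
        log_likelihood -= 0.5 * n_c * math.log(2.0 * math.pi * variance)

    free_params = (k - 1) + k + 1  # mixing weights, 1-D means, shared variance
    return log_likelihood - 0.5 * free_params * math.log(n)


def should_split(parent, left, right):
    """Keep the two children only if they score higher than the parent alone."""
    return bic_score([left, right]) > bic_score([parent])
```

In this sketch the penalty term grows with the number of free parameters, so a split is accepted only when the gain in likelihood outweighs the cost of the extra centroid.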
Reference: Dan Pelleg, Andrew W. Moore: X-means: Extending K-means with Efficient Estimation of the Number of Clusters. In: Seventeenth International Conference on Machine Learning, 727-734, 2000.
Usage | Returns |
---|---|
ee.Clusterer.wekaXMeans(minClusters, maxClusters, maxIterations, maxKMeans, maxForChildren, useKD, cutoffFactor, distanceFunction, seed) | Clusterer |
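
A call with this signature might look like the following sketch for the Earth Engine Python client. The defaults are copied from the argument table on this page. The helpers `xmeans_args` and `cluster_image`, and the parameters `region`, `scale`, and `num_pixels`, are hypothetical names introduced for illustration. The sketch assumes the `ee` package is installed and `ee.Initialize()` has already been called before `cluster_image` is used.

```python
# Defaults copied from the argument table on this page.
XMEANS_DEFAULTS = {
    "minClusters": 2,
    "maxClusters": 8,
    "maxIterations": 3,
    "maxKMeans": 1000,
    "maxForChildren": 1000,
    "useKD": False,
    "cutoffFactor": 0,
    "distanceFunction": "Euclidean",
    "seed": 10,
}
DISTANCE_FUNCTIONS = ("Chebyshev", "Euclidean", "Manhattan")


def xmeans_args(**overrides):
    """Hypothetical helper: merge overrides into the documented defaults."""
    unknown = set(overrides) - set(XMEANS_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown argument(s): {sorted(unknown)}")
    args = {**XMEANS_DEFAULTS, **overrides}
    if args["minClusters"] > args["maxClusters"]:
        raise ValueError("minClusters must not exceed maxClusters")
    if args["distanceFunction"] not in DISTANCE_FUNCTIONS:
        raise ValueError(f"distanceFunction must be one of {DISTANCE_FUNCTIONS}")
    return args


def cluster_image(image, region, scale, num_pixels=5000, **overrides):
    """Train X-Means on sampled pixels, then label every pixel of `image`.

    Assumes `image` is an ee.Image and `region` is an ee.Geometry.
    """
    import ee  # imported lazily so xmeans_args works without Earth Engine

    training = image.sample(region=region, scale=scale, numPixels=num_pixels)
    clusterer = ee.Clusterer.wekaXMeans(**xmeans_args(**overrides)).train(training)
    return image.cluster(clusterer)
```

Validating the arguments locally, as `xmeans_args` does here, is a design choice of this sketch. It surfaces obvious mistakes before a request is sent, but it does not replace the server-side checks.
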
Argument | Type | Details |
---|---|---|
minClusters | Integer, default: 2 | Minimum number of clusters. |
maxClusters | Integer, default: 8 | Maximum number of clusters. |
maxIterations | Integer, default: 3 | Maximum number of overall iterations. |
maxKMeans | Integer, default: 1000 | The maximum number of iterations to perform in KMeans. |
maxForChildren | Integer, default: 1000 | The maximum number of KMeans iterations performed on the child centers. |
useKD | Boolean, default: false | Whether to use a KDTree. |
cutoffFactor | Float, default: 0 | Takes the given percentage of the split centroids if none of the children win. |
distanceFunction | String, default: "Euclidean" | Distance function to use. Options are: Chebyshev, Euclidean, and Manhattan. |
seed | Integer, default: 10 | The randomization seed. |
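
The three distanceFunction options differ in how they combine per-attribute differences. The sketch below shows the plain mathematical definitions only. The function names and the `DISTANCES` lookup are hypothetical, and the underlying Weka implementation may additionally normalize attributes, which this sketch does not attempt to reproduce.

```python
import math


def euclidean(a, b):
    """Straight-line distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block distance: sum of the absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Largest absolute difference along any single attribute."""
    return max(abs(x - y) for x, y in zip(a, b))


# Keys match the option names listed for distanceFunction.
DISTANCES = {
    "Chebyshev": chebyshev,
    "Euclidean": euclidean,
    "Manhattan": manhattan,
}
```

For the same pair of points, the Chebyshev distance is never larger than the Euclidean distance, which in turn is never larger than the Manhattan distance, so the choice affects which centroid a borderline point is assigned to.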