Summary
This offering consists of a Team Studio Mod (custom operator for Spotfire® Data Science - Team Studio) that calculates metrics to help the user decide on the optimal number of clusters (K value) for the K-Means method on the given input dataset.
Overview
The K-Means Cluster Evaluator operator calculates two metrics for each K value in a range specified in the input: the Dunn Index and the Within Set Sum of Squared Errors (WSSE). It returns a dataset containing the calculated metrics for all K values in the requested range.
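To make the two metrics concrete, here is a minimal plain-Python sketch of the same idea: run K-Means for each K in a range and record the WSSE and the Dunn Index. It is an illustration only, not the operator's Spark ML implementation. All function names, the farthest-point initialization, and the sample data are assumptions made for this example. The Dunn Index is computed here in its classic form (smallest distance between points in different clusters divided by the largest distance between points in the same cluster); the operator may use a different variant.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Deterministic farthest-point initialization (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Pick the point whose nearest chosen centroid is farthest away.
        centroids.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, max_iters=100):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, labels


def wsse(points, centroids, labels):
    """Within Set Sum of Squared Errors: squared distance of each point to its centroid."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


def dunn_index(points, labels):
    """Classic Dunn Index: min between-cluster distance / max within-cluster distance."""
    min_between, max_within = math.inf, 0.0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            if labels[i] == labels[j]:
                max_within = max(max_within, d)
            else:
                min_between = min(min_between, d)
    return min_between / max_within if max_within > 0 else math.inf


def evaluate_k_range(points, k_min, k_max):
    """Return one row of metrics per K, mirroring the operator's output dataset."""
    rows = []
    for k in range(k_min, k_max + 1):
        centroids, labels = kmeans(points, k)
        rows.append({"k": k, "wsse": wsse(points, centroids, labels),
                     "dunn": dunn_index(points, labels)})
    return rows


# Hypothetical sample data: three well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0),
          (20.0, 0.0), (20.0, 1.0), (21.0, 0.0)]

rows = evaluate_k_range(points, 2, 4)
for row in rows:
    print(row)
```

On this sample data the WSSE keeps decreasing as K grows, while the Dunn Index is highest at K = 3, which matches the three groups in the data. In general, a common way to read the two metrics together is to look for the "elbow" where the WSSE stops dropping sharply and for the K with the largest Dunn Index.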
The operator is designed to run on a Hadoop cluster using Spark ML. More information about how to evaluate the optimal K for K-Means can be found in the linked resource.
To start using this Mod (custom operator) in the Team Studio environment, follow this Knowledge Base article for installation guidelines.
Release 1.0.0
Published: September 2020
Initial release includes:
- Custom operator jar file
- Documentation for the operator
- License file