Dynamically get the suggested clusters in the data for unsupervised learning.

druogury/clustering-gap-statistic

 
 

Python implementation of the Gap Statistic

Forked from original repository


Purpose

Dynamically identify the suggested number of clusters in a data-set using the gap statistic.
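To make "suggested number of clusters" concrete, here is a minimal sketch of the selection rule from the original gap statistic paper (Tibshirani, Walther and Hastie): pick the smallest k such that Gap(k) >= Gap(k+1) - s(k+1). The function name `suggest_k` and its arguments are hypothetical and are not part of this package's API. Whether this package applies this rule or simply takes the k with the largest gap is not stated here.

```python
def suggest_k(gaps, s_k, k_values=None):
    """Suggest a cluster count from gap values and their standard errors.

    Hypothetical helper, not this package's API.
    gaps[i] and s_k[i] correspond to k_values[i] (default: k = 1, 2, ...).
    """
    if k_values is None:
        k_values = list(range(1, len(gaps) + 1))

    # Smallest k whose gap is within one standard error of the next gap.
    for i in range(len(gaps) - 1):
        if gaps[i] >= gaps[i + 1] - s_k[i + 1]:
            return k_values[i]

    # No k satisfies the rule: fall back to the k with the largest gap.
    best = max(range(len(gaps)), key=gaps.__getitem__)
    return k_values[best]


gaps = [0.2, 0.5, 0.9, 0.85, 0.8]
errors = [0.05] * 5
print(suggest_k(gaps, errors))
```

The one-standard-error margin favours the smaller k when two neighbouring gap values are statistically indistinguishable.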


Improvements

  • Corrected the reference dispersion formula: it now uses the mean of the logs instead of the log of the mean
  • Computes the standard deviation of the gap statistic
  • Adds scikit-learn KMeans and SphericalKMeans as clustering algorithms
  • SciPy's kmeans2 appears to be unstable, so it is no longer the default algorithm
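The first two items above can be illustrated with a short sketch. It assumes the usual definitions from the gap statistic paper: Gap(k) is the mean of log(W*_kb) over B reference data sets minus log(W_k), and the error term is s_k = sd_k * sqrt(1 + 1/B). The function names are hypothetical and this is not the package's actual code. The dispersions are taken as given, so no clustering library is needed.

```python
import math


def gap_with_error(observed_dispersion, reference_dispersions):
    """Return (gap, s_k) for one value of k.

    Hypothetical helper: observed_dispersion is W_k on the real data,
    reference_dispersions are the W*_kb values from B reference data sets.
    """
    b = len(reference_dispersions)
    log_refs = [math.log(w) for w in reference_dispersions]

    # Mean of the logs (the corrected form), not the log of the mean.
    mean_log_ref = sum(log_refs) / b
    gap = mean_log_ref - math.log(observed_dispersion)

    # Population standard deviation of the log reference dispersions,
    # inflated by sqrt(1 + 1/B) to account for simulation error.
    sd = math.sqrt(sum((x - mean_log_ref) ** 2 for x in log_refs) / b)
    s_k = sd * math.sqrt(1.0 + 1.0 / b)
    return gap, s_k


def gap_log_of_mean(observed_dispersion, reference_dispersions):
    """The uncorrected variant (log of the mean), shown only for contrast."""
    mean_ref = sum(reference_dispersions) / len(reference_dispersions)
    return math.log(mean_ref) - math.log(observed_dispersion)


refs = [math.e, math.e ** 3]
gap, s_k = gap_with_error(1.0, refs)
print(gap, s_k, gap_log_of_mean(1.0, refs))
```

By Jensen's inequality the log of the mean is never smaller than the mean of the logs, so the uncorrected variant can only overstate the gap.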

Full example available in a notebook HERE


Install:

Bleeding edge:

pip install git+https://github.com/druogury/clustering-gap-statistic.git

PyPI:

pip install --upgrade gap-stat

Uninstall:

pip uninstall gap-stat
