This is a simple tutorial on implementing K-Means in Python, using a K-Means++ initialization strategy, in under 150 lines. It also discusses choosing an optimal k value using the Dunn Index, with some accompanying plots. The tutorial walks through the methodology of K-Means and its implementation, and compares naive initialization with more informed approaches.
Refer to k-means-tutorial.ipynb for the tutorial and code/k-means.py for the finished, uncommented code.
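
For a quick feel of the two ideas mentioned above, here is a minimal sketch of K-Means++ seeding and of one common form of the Dunn Index (smallest distance between points in different clusters, divided by the largest cluster diameter). This is not the code from code/k-means.py: the function names `kmeans_pp_init`, `dunn_index`, and `_pairwise` are hypothetical, and the tutorial may use a different variant of the index.

```python
import numpy as np


def _pairwise(a, b):
    """Euclidean distances between every row of a and every row of b."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def kmeans_pp_init(X, k, rng=None):
    """Pick k starting centers from the rows of X, K-Means++ style.

    The first center is drawn uniformly at random. Each later center is
    drawn with probability proportional to the squared distance from a
    point to its nearest already-chosen center.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    centers = [X[rng.integers(n)]]
    for _ in range(1, k):
        # Squared distance from each point to its closest chosen center.
        d2 = (_pairwise(X, np.array(centers)) ** 2).min(axis=1)
        total = d2.sum()
        if total == 0:
            # Every point coincides with a center: fall back to uniform.
            probs = np.full(n, 1.0 / n)
        else:
            probs = d2 / total
        centers.append(X[rng.choice(n, p=probs)])
    return np.array(centers)


def dunn_index(X, labels):
    """Dunn Index: min inter-cluster distance / max intra-cluster diameter.

    Higher is better (compact, well-separated clusters).
    """
    labels = np.asarray(labels)
    clusters = [X[labels == c] for c in np.unique(labels)]
    if len(clusters) < 2:
        raise ValueError("the Dunn Index needs at least two clusters")
    # Largest distance between two points that share a cluster.
    max_diameter = max(_pairwise(c, c).max() for c in clusters)
    # Smallest distance between two points in different clusters.
    min_separation = min(
        _pairwise(a, b).min()
        for i, a in enumerate(clusters)
        for b in clusters[i + 1:]
    )
    if max_diameter == 0:
        return float("inf")  # every cluster is a single repeated point
    return min_separation / max_diameter


if __name__ == "__main__":
    X = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
    print(kmeans_pp_init(X, k=2, rng=0))
    print(dunn_index(X, np.array([0, 0, 1, 1])))
```

To choose k with a sketch like this, one would run K-Means for several candidate values of k, compute `dunn_index` on each resulting labeling, and prefer the k with the highest score. Note that the pairwise distance computation is quadratic in the number of points, so it is only practical for small datasets.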