This dissertation presents an analytical comparison of unsupervised machine learning algorithms applied to the problem of customer segmentation. Customers are grouped into categories according to their spending habits by analysing the transaction details of an online e-commerce orders dataset and identifying clusters of similar customers. We apply data analysis techniques and statistical measures to extract meaningful features and build a data model. For each algorithm applied, clustering performance measures are used to assess the validity and shape of the resulting clusters. We first build a rudimentary customer segmentation model, RFM (Recency, Frequency, Monetary), compare it with common clustering algorithms, and then use the output as the baseline for classification with a basic multilayer perceptron model.
KMeans Clustering, DBSCAN, OPTICS, BIRCH
Davies-Bouldin score, Calinski-Harabasz index, Silhouette coefficient
Data Mining, Machine Learning, Multilayer Perceptron, KMeans Clustering
Data Analytics, Machine Learning, Artificial Neural Network
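
To make the RFM baseline mentioned in the abstract concrete, the following is a minimal sketch of how Recency, Frequency and Monetary features might be derived from raw transactions. The toy records, the `rfm_table` helper and the snapshot date are illustrative assumptions, not the dataset or the code used in this dissertation.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy transactions: (customer_id, order_date, amount).
# These values are invented for illustration only.
transactions = [
    ("C1", date(2021, 1, 5), 120.0),
    ("C1", date(2021, 3, 1), 80.0),
    ("C2", date(2021, 2, 10), 40.0),
    ("C3", date(2021, 3, 20), 300.0),
    ("C3", date(2021, 3, 25), 150.0),
    ("C3", date(2021, 3, 28), 50.0),
]


def rfm_table(rows, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    recency_days: days between the customer's last order and `snapshot`
    frequency:    number of orders placed by the customer
    monetary:     total amount spent by the customer
    """
    last_order = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in rows:
        # Keep the most recent order date seen so far for this customer.
        last_order[customer] = max(last_order.get(customer, day), day)
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        c: ((snapshot - last_order[c]).days, frequency[c], monetary[c])
        for c in last_order
    }


# The snapshot date is an assumed reference point for measuring recency.
rfm = rfm_table(transactions, snapshot=date(2021, 4, 1))
for customer, (recency, freq, money) in sorted(rfm.items()):
    print(customer, recency, freq, money)
```

A per-customer table of this kind is the sort of feature set that could then be scaled and passed to the clustering algorithms listed above, with the resulting partitions compared using the listed validity measures.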