Clustering and Classifying Customer Transaction Behavior with Machine Learning
This post walks through an end-to-end machine learning project on financial transaction data. The workflow covers data cleaning and preprocessing (handling missing values, label encoding, IQR-based outlier removal, and scaling with StandardScaler), unsupervised clustering with K-Means (k = 2, selected by a silhouette score of 0.57 and visualized with PCA), and supervised classification with Decision Tree and Random Forest models tuned via GridSearchCV. The two clusters are interpreted as 'Stable, Well-Established Customers' and 'Young, Active Customers'. All classifiers reached 100% accuracy, which is expected because the target labels were derived from clustering on the same features the classifiers were then trained on. The post notes that this result reflects the consistency of the pipeline rather than real-world complexity.
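The reason for the perfect accuracy is easier to see in a small reproduction. The following is a minimal sketch of the same kind of pipeline, not the post's actual code: the two-column dataset (age and balance), the injected outliers, the parameter grids, and all variable names are hypothetical stand-ins chosen for illustration. It applies the IQR rule, scales the features, picks k by silhouette score, and then trains classifiers on the resulting cluster labels.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, silhouette_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in data (NOT the post's dataset): columns are
# customer age and account balance, drawn from two synthetic groups.
rng = np.random.default_rng(0)
older = rng.normal(loc=[55.0, 8000.0], scale=[5.0, 500.0], size=(200, 2))
younger = rng.normal(loc=[25.0, 1500.0], scale=[5.0, 500.0], size=(200, 2))
extreme = np.tile([40.0, 100000.0], (5, 1))  # injected outliers
X_raw = np.vstack([older, younger, extreme])

# IQR rule: keep rows where every feature lies within
# [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR].
q1, q3 = np.percentile(X_raw, [25, 75], axis=0)
iqr = q3 - q1
inside = (X_raw >= q1 - 1.5 * iqr) & (X_raw <= q3 + 1.5 * iqr)
X_clean = X_raw[inside.all(axis=1)]

# Standardize so both features contribute comparably to distances.
X_scaled = StandardScaler().fit_transform(X_clean)

# Choose the number of clusters by the highest silhouette score.
scores = {}
for k in range(2, 6):
    labels_k = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)
    scores[k] = silhouette_score(X_scaled, labels_k)
best_k = max(scores, key=scores.get)
labels = KMeans(n_clusters=best_k, n_init=10, random_state=0).fit_predict(X_scaled)

# Train classifiers to predict the cluster labels from the same features.
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, labels, test_size=0.2, stratify=labels, random_state=0
)
candidates = {
    "decision_tree": (DecisionTreeClassifier(random_state=0), {"max_depth": [2, 4]}),
    "random_forest": (RandomForestClassifier(random_state=0), {"n_estimators": [25, 50]}),
}
accuracies = {}
for name, (model, grid) in candidates.items():
    search = GridSearchCV(model, grid, cv=3).fit(X_train, y_train)
    accuracies[name] = accuracy_score(y_test, search.predict(X_test))

print("rows kept after IQR filter:", X_clean.shape[0])
print("selected k:", best_k)
print("test accuracy per model:", accuracies)
```

Because the labels are a deterministic function of the scaled features, the classifiers only have to rediscover the K-Means decision boundary, so near-perfect test accuracy says little about how well the model would predict an independently defined target.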