Data Science – Cutting Edge '25

Retail Strategy Optimization via ML Customer Segmentation

In an evolving retail landscape, businesses must adapt to changing consumer behavior to remain competitive. Our project, Optimizing Retail Strategy through Machine Learning-Based Customer Segmentation, aims to help John Keells, a leading supermarket chain in Sri Lanka, improve its marketing effectiveness through data-driven insights. With operations across 20 locations and a wide product range that includes dry goods, fresh produce, and luxury items, John Keells struggles to address diverse customer needs with traditional, generalized marketing strategies.
To overcome this, we developed a machine learning-based solution that segments customers according to their purchasing behavior using historical sales data. The process involved comprehensive data preprocessing, including cleaning, feature engineering, and normalization, followed by model development. We evaluated multiple classification algorithms, including Random Forest, XGBoost, LightGBM, CatBoost, and Neural Networks.
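
The feature engineering and normalization steps mentioned above can be illustrated with a minimal sketch. It assumes recency, frequency, and monetary value (RFM) as the engineered features and min-max scaling as the normalization; the abstract does not state which features or scaler the project actually used, and the transaction records and the names rfm_features and min_max_normalize are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction records: (customer_id, purchase_date, amount).
transactions = [
    ("C1", date(2024, 1, 5), 1200.0),
    ("C1", date(2024, 3, 20), 800.0),
    ("C2", date(2024, 2, 11), 150.0),
    ("C3", date(2024, 3, 28), 4000.0),
    ("C3", date(2024, 3, 30), 1000.0),
    ("C3", date(2024, 1, 15), 1000.0),
]


def rfm_features(rows, as_of):
    """Aggregate transactions into [recency in days, frequency, monetary value] per customer."""
    last_seen, counts, totals = {}, defaultdict(int), defaultdict(float)
    for customer, day, amount in rows:
        last_seen[customer] = max(last_seen.get(customer, day), day)
        counts[customer] += 1
        totals[customer] += amount
    return {
        c: [(as_of - last_seen[c]).days, counts[c], totals[c]]
        for c in last_seen
    }


def min_max_normalize(features):
    """Rescale every feature column to [0, 1]; a constant column maps to 0.0."""
    columns = list(zip(*features.values()))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return {
        key: [
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ]
        for key, row in features.items()
    }


raw = rfm_features(transactions, as_of=date(2024, 3, 31))
scaled = min_max_normalize(raw)
```

Scaling matters here because the monetary column would otherwise dominate any distance-based or gradient-based model trained on these features.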
The resulting customer segments enable John Keells to run highly targeted and personalized marketing campaigns, improve customer retention, and optimize inventory planning. The solution also provides a foundation for future CRM and marketing automation efforts. This project showcases how machine learning can transform traditional retail strategies into intelligent, customer-focused approaches that drive business growth.

Tea-Intel: Data-Driven Insights for Ceylon Tea

Tea-Intel is a data-driven decision support solution designed for Sri Lanka’s tea industry. It integrates two complementary clustering models (multivariate time series clustering for tea grades and feature-based clustering for factories) to uncover hidden behavioral patterns in auction data. The clustering results are visualized through an interactive dashboard, enabling industry stakeholders to benchmark performance, monitor volatility, and make strategic, evidence-based decisions. Tea-Intel aims to modernize auction analysis, improve transparency, and support long-term sustainability in the tea sector.
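
The feature-based clustering of factories can be sketched roughly as follows. This is an illustration under stated assumptions, not Tea-Intel's actual pipeline: the two features (mean price and coefficient of variation), the use of k-means, the price values, and all function names are hypothetical, since the abstract names neither the features nor the algorithm.

```python
from statistics import mean, pstdev

# Hypothetical weekly auction prices per factory (illustrative values only).
prices = {
    "Factory A": [610, 620, 615, 625],
    "Factory B": [600, 630, 605, 628],
    "Factory C": [900, 700, 1100, 800],
    "Factory D": [950, 720, 1050, 780],
}


def summarize(series):
    """Reduce a price series to two features: mean price and coefficient of variation."""
    avg = mean(series)
    return [avg, pstdev(series) / avg]


def standardize(rows):
    """Z-score each feature column so that no single feature dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=20):
    """Plain Lloyd's algorithm, seeded with the first k points for reproducibility."""
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = [mean(col) for col in zip(*members)]
    return labels


names = list(prices)
features = standardize([summarize(prices[n]) for n in names])
labels = dict(zip(names, k_means(features, k=2)))
```

With this toy data, the two stable, lower-priced factories fall into one cluster and the two volatile, higher-priced factories into the other, which is the kind of grouping a benchmarking dashboard could surface.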

Comparative Need State Analysis for Ride-Hailing Platforms: A Case Study on PickMe and Uber in Colombo, Sri Lanka

This project investigates the evolving ride-hailing landscape in Sri Lanka by conducting a comparative need state analysis of Uber and PickMe customers in Colombo. The research addresses a critical gap in understanding localized customer needs in ride-hailing, which is essential for designing data-driven engagement strategies. Using an unsupervised clustering approach, the study segments 384 ride-hailing users based on categorical behavioural and demographic attributes collected through a bilingual survey.

The K-Modes clustering algorithm was employed to identify four distinct customer personas: (1) Flexible Weekly Dual-App Users, (2) PickMe-Focused, Low-Frequency Cash Users, (3) Occasional Price-Sensitive Uber Users, and (4) Loyal Female Dominant Daily Riders with Subscriptions. The model’s performance was validated using several techniques, confirming four as the optimal number of clusters.
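
For readers unfamiliar with K-Modes, a minimal sketch of the technique follows. It replaces the means and Euclidean distance of k-means with column-wise modes and a count of mismatched attributes, which makes it suitable for categorical survey data. The survey columns, the responses, the initialization scheme, and the function names are hypothetical; the study's own implementation and settings are not described in the abstract, and the toy example uses two clusters rather than the study's four.

```python
from collections import Counter

# Hypothetical survey responses: (preferred app, ride frequency, payment method).
responses = [
    ("PickMe", "monthly", "cash"),
    ("Uber", "monthly", "card"),
    ("PickMe", "monthly", "cash"),
    ("Uber", "weekly", "card"),
    ("PickMe", "weekly", "cash"),
    ("Uber", "monthly", "card"),
]


def mismatches(a, b):
    """Simple matching dissimilarity: number of attributes on which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def k_modes(records, k, iterations=10):
    """Minimal K-Modes: assign each record to its nearest mode, then update modes column-wise."""
    modes = list(dict.fromkeys(records))[:k]  # first k distinct records as initial modes
    labels = []
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: mismatches(r, modes[j])) for r in records]
        new_modes = []
        for j in range(k):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if not members:  # keep the old mode if a cluster empties out
                new_modes.append(modes[j])
                continue
            new_modes.append(
                tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
            )
        if new_modes == modes:  # converged: modes no longer change
            break
        modes = new_modes
    cost = sum(mismatches(r, modes[lab]) for r, lab in zip(records, labels))
    return labels, modes, cost


labels, modes, cost = k_modes(responses, k=2)
```

The returned cost is the total number of mismatches between records and their cluster modes. Comparing this cost across several values of k is one common way to choose the number of clusters, although the abstract does not specify which validation techniques the study applied.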

These insights were visualized using an interactive Microsoft Power BI dashboard, allowing stakeholders to filter, compare, and explore behavioural trends across user groups. Key findings revealed that price sensitivity, app-switching behaviour, subscription adoption, and ride purpose vary widely across clusters. Actionable business recommendations were formulated for both PickMe and Uber to tailor engagement, pricing, and service quality improvements accordingly.

The study contributes methodologically by demonstrating the application of K-Modes in an industry setting and practically by providing local ride-hailing providers with insights for a segmentation-driven strategy. The project also sets a precedent for future research in mobility data analytics within developing urban contexts.

Keywords: clustering, customer segmentation, K-Modes, ride-hailing, user behaviour