CHALLENGE
The telecommunications industry is constantly evolving, and its highly competitive landscape puts a premium on personalized services and targeted campaigns to meet customer demands. Our client, a prominent telecommunications & digital services provider in the Philippines, faced multiple challenges in targeting customers with personalized next best offers.
The client's campaign design was largely manual and gut-driven, with limited experimentation, limited data-driven optimization, and little personalization of offers. This resulted in low campaign efficiency and net revenue impact, especially when compared against the control group. Furthermore, the client relied mainly on on-prem infrastructure, which made it difficult to scale operations and run experiments with limited resources.
Hence, there was a need to build a comprehensive ML-powered framework that could maximize take-up rate and net revenue. Furthermore, to overcome scalability issues, speed constraints, and resource limitations, and ultimately unlock the full potential of their customer value management (CVM) strategy, the machine learning workflows and deployment needed to be automated, spanning both on-prem and cloud environments.
SOLUTION
The OneByZero team worked with the customer in two areas:
- Building a comprehensive ML-powered Framework for deciding the Next-Best-Offer (NBO) for each subscriber on a combination of on-prem and AWS infrastructure
- Scaling the ML Operations in an elastic manner on AWS cloud, including real-time prediction of the best offer
Implementation
Development & Deployment of an ensemble NBO model
We developed an ensemble model that combined “customer value” scores and “business value” scores for all available relevant offers. This combined score was used to decide the next best offer (NBO) to show to a telecommunications subscriber.
The model was designed to optimize the two main requirements of NBO: a) give the customer an offer that they would be very interested in and are highly likely to take, and b) give the customer an offer that would increase their ARPU (average revenue per user). In combination, the model is designed to increase overall business revenue.
Multiple models were developed to compute the “customer value” and “business value” for each new offer. For the “customer value”, different “propensity” models based on the XGBoost and PyTorch TabNet algorithms were developed with different sets of features and hyperparameters. OneByZero also built “persuadability” models to determine whether a customer could be persuaded to buy a certain product when a freebie or bonus offer is associated with it. These models used a variety of historical purchase data, campaign data, product details data, and channel interaction data to create features. The customer value model predicted the probability that a given customer would take an offer, while the business value model predicted the increase in ARPU for the customer if they did take up the offer.
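As a concrete illustration of how the two scores can drive the offer decision, here is a minimal Python sketch. The function names, weights, and offer data are invented for illustration and are not the client's actual code:

```python
# Hypothetical sketch of the ensemble scoring step: combine a propensity
# ("customer value") score with a predicted ARPU-uplift ("business value")
# score, then pick the offer with the highest blended score.

def blended_score(customer_value: float, business_value: float,
                  w_customer: float = 0.6, w_business: float = 0.4) -> float:
    """Linear weighted combination of the two model outputs."""
    return w_customer * customer_value + w_business * business_value

def next_best_offer(offers: list) -> dict:
    """Each offer dict carries the two model scores for one subscriber."""
    return max(offers, key=lambda o: blended_score(o["customer_value"],
                                                   o["business_value"]))

offers = [
    {"offer_id": "DATA_5GB", "customer_value": 0.82, "business_value": 0.10},
    {"offer_id": "COMBO_99", "customer_value": 0.65, "business_value": 0.40},
]
print(next_best_offer(offers)["offer_id"])  # → COMBO_99 (0.55 vs 0.532)
```

Even though DATA_5GB has the higher take-up propensity here, the blended score favors COMBO_99 because of its larger predicted ARPU uplift, which is exactly the trade-off the ensemble is meant to arbitrate.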
Development & Deployment of a framework to do A/B testing on multiple model configurations on SageMaker
In order to evaluate which models and configurations performed best, an A/B testing framework was deployed that would simultaneously test different configurations on different portions of the population. The best models and configurations, in terms of the key business metrics (take-up rate and net revenue impact), were then prioritized over time. The key parameters for the model configurations are:
- Model type (e.g. propensity models based on XGBoost or TabNet, persuadability models, collaborative filtering models, etc.)
- Weighting given to the customer and business value scores
Also, these models could be applied to different segments of the population (e.g. new customers, long-tenure customers, high-value customers, etc.)
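One simple way such a framework can split the population across test arms is a deterministic, hash-based assignment, so each subscriber consistently sees the same configuration. The sketch below is illustrative only; the configuration names and weights are invented:

```python
import hashlib

# Illustrative sketch (not the production framework): deterministically
# assign each subscriber to one of several model configurations so that
# different configurations are tested on disjoint slices of the population.

CONFIGS = [
    {"name": "xgb_propensity_v1",    "w_customer": 0.7, "w_business": 0.3},
    {"name": "tabnet_propensity_v2", "w_customer": 0.5, "w_business": 0.5},
    {"name": "persuadability_v1",    "w_customer": 0.6, "w_business": 0.4},
]

def assign_config(subscriber_id: str) -> dict:
    """Hash the subscriber id so the assignment is stable across runs."""
    bucket = int(hashlib.sha256(subscriber_id.encode()).hexdigest(), 16) % len(CONFIGS)
    return CONFIGS[bucket]

# The same subscriber always lands in the same test arm:
assert assign_config("SUB-001") == assign_config("SUB-001")
```

Stable assignment matters for A/B testing: if a subscriber drifted between arms mid-campaign, take-up rate and revenue impact could not be attributed cleanly to one configuration.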
In January 2023, we began with 10 model configuration instances, gradually increasing the number of instances for A/B testing to 60 by June 2023. Following this, we continued refining instances for the next six months based on their performance on business metrics. By January 2024, we had the five best-performing instances in production.
Performance Optimization & Monitoring
Various steps were taken to optimize performance, both in terms of latency and throughput. For latency, the Python code was heavily optimized to minimize data copies, use vectorization, and handle data and data types efficiently. For throughput, auto-scaling policies were set up to deal with spiky loads. Load testing was conducted to evaluate the models' performance under high concurrent request volumes. A monitoring dashboard on CloudWatch was developed for comprehensive oversight.
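For reference, auto-scaling for a SageMaker endpoint variant is typically configured through AWS Application Auto Scaling. The sketch below follows that pattern; the endpoint name, capacities, and target value are placeholders, not the client's actual settings:

```python
import boto3

# Sketch: register a SageMaker endpoint variant with Application Auto Scaling
# and attach a target-tracking policy on invocations per instance.
# "nbo-endpoint", the capacities, and the target value are placeholders.
autoscaling = boto3.client("application-autoscaling")

resource_id = "endpoint/nbo-endpoint/variant/AllTraffic"

autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=8,
)

autoscaling.put_scaling_policy(
    PolicyName="nbo-invocations-target-tracking",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        # Scale out when average invocations per instance exceed this target.
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        "ScaleOutCooldown": 60,   # react quickly to spiky load
        "ScaleInCooldown": 300,   # scale in conservatively
    },
)
```

A short scale-out cooldown with a longer scale-in cooldown is a common choice for bursty traffic: capacity is added quickly when load spikes but released slowly to avoid thrashing.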
Response Generation
A key part of the system was generating spiels (promotional messages) to be sent to customers in a flexible manner, so as to allow experimentation on which tone and other message properties had the greatest impact.
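One lightweight way to keep spiels flexible is to store the templates as data keyed by tone, so message properties can be varied without code changes. This is an illustrative sketch; the template names and wording are invented:

```python
# Illustrative sketch of flexible spiel generation: templates live in data,
# so experimenting with tone only means editing or adding entries here.

SPIEL_TEMPLATES = {
    "friendly": "Hi {name}! We picked something just for you: {offer}. Grab it today!",
    "urgent":   "{name}, your exclusive {offer} expires at midnight. Don't miss out!",
}

def render_spiel(tone: str, name: str, offer: str) -> str:
    """Fill the chosen template with the subscriber's name and offer."""
    return SPIEL_TEMPLATES[tone].format(name=name, offer=offer)

print(render_spiel("friendly", "Ana", "5GB data bundle"))
```

Because each tone is just another key, the tone itself can become one more parameter in the A/B tests described earlier.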
L1 Support
OneByZero also provided L1 support for the end-to-end system after go-live.
AWS services in the solution
The solution used various AWS tools & services as shown below:
- Amazon SageMaker
- AWS Lambda
- Amazon DynamoDB
- Amazon API Gateway
- Amazon S3
The following diagram shows the architecture of the solution on AWS:
The process begins with AWS Glue, which orchestrates crawler jobs to extract data and store it in Amazon S3, forming the foundational data lake. This data is subsequently utilized by the Amazon SageMaker Feature Store, a centralized hub that maintains curated data for ML development.
Within the S3 buckets, the data undergoes preprocessing tailored to the needs of XGBoost and TabNet models. This step is crucial for optimizing the performance of the machine learning algorithms and is part of the MLOps pipelines that ensure operational efficiency. Following preprocessing, the data is channeled into the training phase for distinct models, each one specific to a particular brand. Post-training, these models are registered in the SageMaker Model Registry. This registration is a governance mechanism that tracks versions and maintains the lineage of models.
Once registered, the models' artifacts are deployed to create inference endpoints. These endpoints serve as scalable and secure channels for real-time predictions.
Concurrently, for inference purposes, a Linear Model runs on AWS Lambda. This serverless component is responsible for invoking the aforementioned endpoints to retrieve individual inferences. It performs a linear weighted computation that integrates two distinct scores: Customer Value and Business Value. The latter is retrieved from a DynamoDB table, ensuring low-latency access to this pivotal business metric. For augmented decision-making, additional information such as promotional data can be incorporated to refine the inference outputs.
At the forefront of this architecture lies Amazon API Gateway, which acts as the interface for external inference requests and channels them to the AWS Lambda function.
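A hypothetical sketch of that Lambda function is shown below. The endpoint name, table name, payload shape, and weight are all assumptions for illustration; boto3 is imported inside the handler so the pure scoring logic can be read (and exercised) on its own:

```python
import json

def blend(customer_value: float, business_value: float, w: float = 0.6) -> float:
    """Linear weighted combination of the Customer Value and Business Value scores."""
    return w * customer_value + (1 - w) * business_value

def handler(event, context):
    # boto3 is available in the Lambda runtime; imported here so the
    # scoring logic above stays independent of AWS access.
    import boto3

    smr = boto3.client("sagemaker-runtime")
    table = boto3.resource("dynamodb").Table("business-value")  # assumed table name

    # Customer Value: real-time prediction from a SageMaker endpoint.
    resp = smr.invoke_endpoint(
        EndpointName="nbo-xgb-endpoint",          # assumed endpoint name
        ContentType="application/json",
        Body=json.dumps(event["features"]),
    )
    customer_value = float(json.loads(resp["Body"].read())["score"])

    # Business Value: low-latency lookup from DynamoDB.
    item = table.get_item(Key={"subscriber_id": event["subscriber_id"]})["Item"]
    business_value = float(item["business_value"])

    return {
        "subscriber_id": event["subscriber_id"],
        "score": blend(customer_value, business_value),
    }
```

Keeping the weighted blend in a small pure function also makes the weighting itself easy to vary per model configuration, as described in the A/B testing section.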
The following image shows a diagram of the UI for end-users, on the mobile app as well as on the USSD channel.
RESULTS AND BENEFITS
The entire machine learning pipeline, for both training and inference, was developed and deployed on AWS. The ensemble NBO model was deployed for real-time inference using a combination of AWS Lambda and auto-scaling SageMaker endpoints for the XGBoost and TabNet models. The inference process was optimized to meet stringent latency and throughput requirements.
Extensive testing was conducted to ensure correct integration with other on-prem systems, including an on-prem data platform and an on-prem marketing automation platform. Load testing was also performed to make sure the system was able to handle the expected bursty traffic patterns. Training was provided to the customer IT and support teams.
Results
After discussions with the client's business teams, we identified the following business metrics to evaluate the performance of our ML models in real-time campaigns: Campaign Efficiency, Per Subscriber Revenue, and Incremental ARPU.
- Campaign efficiency: the net take-up rate divided by the target group take-up rate, where the net take-up rate is the difference between the target group take-up rate and the control group take-up rate.
- Per subscriber revenue: the product of the net take-up rate and the overall top-up amount during the campaign period.
- Incremental ARPU: the change in average revenue per user (ARPU) from the previous month to the current month.
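The metric definitions above can be expressed directly as code. This sketch uses invented example figures purely for illustration:

```python
# Worked sketch of the three business metrics as defined above.

def campaign_efficiency(target_rate: float, control_rate: float) -> float:
    """Net take-up rate (target minus control) over the target group take-up rate."""
    net_rate = target_rate - control_rate
    return net_rate / target_rate

def per_subscriber_revenue(target_rate: float, control_rate: float,
                           topup_amount: float) -> float:
    """Net take-up rate multiplied by the overall top-up amount."""
    return (target_rate - control_rate) * topup_amount

def incremental_arpu(prev_month_arpu: float, curr_month_arpu: float) -> float:
    """Change in ARPU from the previous month to the current month."""
    return curr_month_arpu - prev_month_arpu

# Example: 12% take-up in the target group vs 7% in the control group.
print(round(campaign_efficiency(0.12, 0.07), 3))  # → 0.417
```

In this invented example the 5-percentage-point net take-up over a 12% target take-up yields roughly 42% campaign efficiency, which shows how the metric rewards lift over the control group rather than raw take-up.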
The ML-driven campaigns went live in January 2023. The following is a summary of the campaigns' performance on these business metrics.
- In January 2023, campaign efficiency was observed to be around 7%. However, after multiple iterations of optimization and retraining models, we were able to achieve campaign efficiency of approximately 55% in December 2023.
- Compared to the previous rule-based campaigns, this represented a significant, roughly 2X improvement in campaign efficiency.
- Significant improvement in Incremental ARPU has been observed, with an average 2X improvement compared to the rule-based campaign throughout the year.
- The per subscriber revenue also improved 1.7X compared to rule-based campaigns.
- The campaigns helped the CVM team exceed its overall revenue targets.
- The subscriber base covered by the ML campaigns also grew from 1M to 6M as the performance of the ML campaigns improved and operations were set up to scale the ML deployment.
The implementation of the MLOps pipeline on the AWS cloud significantly reduced the time to develop, deploy, and evaluate new versions of the models. This was especially helpful in deploying newer use-cases such as real-time inference for inbound campaigns. Furthermore, the cloud environment provided more resources that could be used elastically to experiment with and build different kinds of models, as well as run bursty inference workloads with the help of auto-scaling. This addressed the scalability challenges the team had faced on its existing on-premises servers.
Other Performance Metrics
| KPI | Results |
| --- | --- |
| Latency | Met the SLA of 200 ms inference latency on real-time API requests. Improved the latency of the TabNet model by 27% and XGBoost by 72% compared to the on-prem versions of these models. |
| Subscriber base | The system was shown to be capable of handling real-time inference for a subscriber base of over 30 million subscribers, while the on-prem servers maxed out at 10 million. |
| Throughput | The system on AWS was able to handle a peak load of 600 requests/sec, with the help of elastic auto-scaling supported by SageMaker endpoints. |
WE ARE ONEBYZERO
Headquartered in Singapore with a local presence across ASEAN nations, we are a modern data & AI consulting firm. We focus on transforming enterprises with cutting-edge solutions to generate value from data. We specialize in serving the Telecommunications, Banking & Financial Services, Retail and Ecommerce industries.
We focus exclusively on AI/ML, Data & Martech
AI & ML, including Generative AI
We help organizations define their AI/ML strategy, develop and operationalize AI/ML & generative AI models, and implement MLOps to streamline operations. We have experience delivering a variety of work in classical data science and cutting-edge GenAI for telcos.
Modern Data Platforms
Our team of experts builds robust data pipelines and designs data warehouses and lakes with a strong focus on data quality and lineage. We build Customer 360 views and data marts, and support reporting and dashboards, helping organizations uncover insights in their data using a modern data stack.
Martech & Personalization
We enable customers to modernize their digital platforms to deliver omnichannel personalization use-cases. We have deep experience with marketing use-cases & personalized content generation, automated campaign grid optimization, and next-best offers & actions.