
Optimizing Recommendations for a Mobile Trading Platform

Author: Anand Ranganathan
Date: Jun 10, 2024
Category: ML, Recommendations

CHALLENGE

Kelly (https://www.kellyapp.ph/) is an opinion trading platform that allows users to trade binary contracts based on specific beliefs and sentiments. Participants forecast real-world outcomes by buying and selling contracts that indicate agreement or disagreement with a stated opinion. As people make their choices, the platform shows which opinions are most popular or trending.
Like most consumer-focused online businesses, Kelly's main challenges are attracting and retaining users, increasing user activity, and driving overall revenue. A key part of Kelly's platform is recommending topics and categories in which users may be interested in buying or selling contracts. A good recommendation engine that drives clicks, engagement, and trades is therefore immensely important for Kelly.
A number of variables need to be considered when deciding the optimal recommendations (a simple scoring sketch follows the list):
  1. Explicit interests indicated by the user
  2. The user's past trades across different categories
  3. Trading volume for a specific topic / event
  4. Popularity of an event, based on engagement and social-media activity
  5. Recency of event creation
  6. Events that are closing soon
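The sketch below shows one way such factors could be combined into a single relevance score per event. The factor names, weights, and normalization are illustrative assumptions, not Kelly's production logic; in the deployed system the weights are learned from user feedback, as described under Solution.

```python
from dataclasses import dataclass

@dataclass
class EventFeatures:
    """Per-(user, event) signals, each pre-normalized to [0, 1]."""
    interest_match: float       # overlap with the user's explicit interests
    past_trade_affinity: float  # how often the user traded this category
    trade_volume: float         # relative trading volume of the event
    popularity: float           # engagement and social-media activity
    recency: float              # 1.0 for newly created events, decaying over time
    closing_soon: float         # 1.0 when the event is about to close

# Illustrative weights; the deployed system learns these from clicks and trades.
WEIGHTS = {
    "interest_match": 0.30,
    "past_trade_affinity": 0.25,
    "trade_volume": 0.15,
    "popularity": 0.15,
    "recency": 0.10,
    "closing_soon": 0.05,
}

def score_event(features: EventFeatures) -> float:
    """Weighted sum of the normalized factors; higher means more relevant."""
    return sum(weight * getattr(features, name) for name, weight in WEIGHTS.items())
```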
Another key consideration is performance: recommendations must be generated in real time for large numbers of users, and they need to be available for each user as soon as they log in to the app.
The screenshots below show the app with a couple of recommended events that a particular user may be interested in trading. One is an event about travel time in Manila, and the other is about the price of Bitcoin.
[App screenshots: recommended events for a user]

SOLUTION

Onebyzero built an ML-driven recommendation engine for Kelly and deployed it on AWS.
While AWS has a managed service, Amazon Personalize, for generating personalized recommendations, we decided to build the recommendation algorithms using open-source libraries. This gave us more control over the algorithms' behavior and made it possible to consider additional factors such as event popularity, recency, and time to close.
The final solution that was deployed has several components:
  1. Multiple models that generate recommendations for individual topics, including collaborative filtering, item-based similarity metrics, and user-based similarity metrics
  2. A learning-to-rank algorithm that updates the weights of the different factors based on user feedback, measured through a combination of clicks and trades
  3. An ensemble model that combines the scores for topics from the different models along with other features. The ensemble model also considers CLTV (Customer Lifetime Value) and picks recommendations so that lifetime revenue and profit are maximized (see the sketch after this list).
  4. A/B testing to decide on the best approach
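As a rough illustration of the ensemble step, the sketch below blends per-model scores for a topic and tilts the result toward expected lifetime value. The weighting scheme and the CLTV term are illustrative assumptions, not the deployed formula.

```python
def ensemble_score(model_scores: dict[str, float],
                   model_weights: dict[str, float],
                   expected_cltv_uplift: float,
                   cltv_weight: float = 0.2) -> float:
    """Blend per-model scores for a topic, then tilt the blended score
    toward the expected lifetime-value uplift of recommending it."""
    blended = sum(model_weights[name] * score for name, score in model_scores.items())
    return (1 - cltv_weight) * blended + cltv_weight * expected_cltv_uplift

# Example: collaborative filtering rates this topic higher than item similarity does.
# ensemble_score({"cf": 0.8, "item_sim": 0.5}, {"cf": 0.6, "item_sim": 0.4},
#                expected_cltv_uplift=0.7)
```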

Implementation:

Development of multiple recommendation algorithms

We implemented multiple recommendation algorithms, including item-item similarity, user-item collaborative filtering, and learning-to-rank algorithms, using Amazon SageMaker notebook instances. Each model produces a personalized ranking of the bets eligible for each user. We ran backtesting to evaluate the performance of the individual models.
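For intuition, here is a minimal item-item similarity sketch over a user-item interaction matrix. The interaction encoding and the NumPy implementation are illustrative, not the code used in the SageMaker notebooks.

```python
import numpy as np

def item_item_scores(interactions: np.ndarray, user_id: int) -> np.ndarray:
    """interactions: (n_users, n_items) implicit-feedback matrix
    (e.g. 1.0 = traded, 0.5 = clicked, 0.0 = no interaction).
    Returns one relevance score per item for the given user."""
    # Cosine similarity between item vectors (columns of the matrix).
    norms = np.linalg.norm(interactions, axis=0, keepdims=True) + 1e-9
    normalized = interactions / norms
    item_similarity = normalized.T @ normalized        # (n_items, n_items)
    # Score items by their similarity to what the user already interacted with.
    user_vector = interactions[user_id]                # (n_items,)
    scores = item_similarity @ user_vector
    scores[user_vector > 0] = -np.inf                  # do not re-recommend
    return scores

# Usage: top 5 candidate events for user 0
# interactions = (np.random.rand(100, 50) > 0.9).astype(float)
# top5 = np.argsort(item_item_scores(interactions, 0))[::-1][:5]
```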

Development of the ensemble model and deployment using SageMaker

We created an ensemble model by assigning different weights to the individual models so as to maximize customer lifetime value and user activity (click-through rate, bet rate). To reduce latency, we also created content embeddings using BERT and stored them so that they can be reused efficiently for computing content similarity and as features for the learning-to-rank models. The recommendation system was deployed using SageMaker real-time inference, and we used autoscaling to handle peak traffic.
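A minimal sketch of the embedding step, assuming a Hugging Face BERT checkpoint and mean pooling; the model name and pooling choice are assumptions. The key point is that embeddings are computed once offline and stored, so they can be reused at serving time instead of running BERT on every request.

```python
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption: the exact checkpoint is not stated
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)

def embed(texts: list[str]) -> np.ndarray:
    """Mean-pooled BERT embeddings for event titles / descriptions."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state      # (batch, tokens, hidden)
    mask = batch["attention_mask"].unsqueeze(-1)       # (batch, tokens, 1)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
    return pooled.numpy()

# Precompute embeddings for all open events, store them (e.g. in S3 or DynamoDB),
# and compare them with cosine similarity at serving time.
```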

Deployment of A/B testing framework

We also created a continuous A/B testing framework on Amazon SageMaker to compare the performance of the individual models, the ensemble model, and new challenger models. A/B testing ensures that models are tested in a live environment on a small percentage of users before we scale them up.
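One common way to run such a test on SageMaker is to host the champion and challenger models as production variants behind a single endpoint and split live traffic between them. The sketch below shows this pattern; the model, endpoint-config, and instance names are placeholders.

```python
import boto3

sagemaker = boto3.client("sagemaker")

# Route ~90% of traffic to the current champion and ~10% to a challenger.
sagemaker.create_endpoint_config(
    EndpointConfigName="recsys-ab-test-config",   # placeholder names throughout
    ProductionVariants=[
        {
            "VariantName": "champion",
            "ModelName": "recsys-ensemble-v3",
            "InstanceType": "ml.m5.xlarge",
            "InitialInstanceCount": 2,
            "InitialVariantWeight": 0.9,
        },
        {
            "VariantName": "challenger",
            "ModelName": "recsys-ensemble-v4",
            "InstanceType": "ml.m5.xlarge",
            "InitialInstanceCount": 1,
            "InitialVariantWeight": 0.1,
        },
    ],
)
# Traffic can later be shifted with update_endpoint_weights_and_capacities
# once the challenger proves itself on live users.
```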

Monitoring

A monitoring dashboard was developed for comprehensive oversight. We used SageMaker Model Monitor to track the model's performance in production and automatically trigger retraining when drift in performance is detected. We used Amazon CloudWatch to track the model's latency and key performance metrics such as click-through rate and bet-through rate, and to raise automated alarms when needed.
[Monitoring dashboard]
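As an example of the alerting side, the sketch below sets a CloudWatch alarm on the endpoint's ModelLatency metric against the 200 ms SLA; the endpoint, variant, and SNS topic names are placeholders.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Alarm if average ModelLatency stays above the 200 ms SLA for three
# consecutive one-minute periods. ModelLatency is reported in microseconds.
cloudwatch.put_metric_alarm(
    AlarmName="recsys-endpoint-latency-breach",
    Namespace="AWS/SageMaker",
    MetricName="ModelLatency",
    Dimensions=[
        {"Name": "EndpointName", "Value": "recsys-realtime"},   # placeholder
        {"Name": "VariantName", "Value": "champion"},
    ],
    Statistic="Average",
    Period=60,
    EvaluationPeriods=3,
    Threshold=200_000,  # 200 ms expressed in microseconds
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:ap-southeast-1:111122223333:recsys-alerts"],  # placeholder
)
```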

AWS services in the solution

The machine learning team made use of the following AWS tools and services:
  1. Amazon SageMaker
  2. AWS Lambda
  3. Amazon DynamoDB
  4. Amazon API Gateway
  5. Amazon S3
  6. Amazon Athena
  7. AWS Glue
  8. Amazon CloudWatch

RESULTS AND BENEFITS

Deliverables

The entire machine learning pipeline, for both training and inference, was developed and deployed on AWS. The ensemble recommendation model was deployed for real-time inference using a combination of AWS Lambda and auto-scaling SageMaker endpoints. The inference process was optimized to meet stringent latency and throughput requirements.
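A simplified sketch of the serving path, assuming API Gateway invokes a Lambda function that in turn calls the SageMaker endpoint; the endpoint name and payload shape are placeholders.

```python
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

def lambda_handler(event, context):
    """Illustrative Lambda fronting the recommendation endpoint: API Gateway
    passes a user_id, and the SageMaker endpoint returns ranked events."""
    user_id = json.loads(event["body"])["user_id"]
    response = runtime.invoke_endpoint(
        EndpointName="recsys-realtime",                  # placeholder name
        ContentType="application/json",
        Body=json.dumps({"user_id": user_id, "top_k": 10}),
    )
    recommendations = json.loads(response["Body"].read())
    return {"statusCode": 200, "body": json.dumps(recommendations)}
```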
Load testing was also performed to make sure the system could handle the expected bursty traffic patterns. Training was provided to the customer's IT and support teams.

Key Outcomes

The system went live on a base of 150k subscribers and powered 100k to 120k real-time recommendations per day. The implementation of the MLOps pipeline on the AWS cloud significantly reduced the time to develop, deploy, and evaluate new versions of the models, enabling quicker experimentation. Furthermore, the cloud environment provided resources that could be used elastically to experiment with and build different kinds of models, and to run bursty inference workloads with the help of auto-scaling. This addressed the scalability challenges the team had faced on its existing on-premise servers.

Metrics

Latency: Met the SLA of 200 ms inference latency on real-time API requests.
Deployment of new models: Reduced the turnaround time for deploying a new model by 80%, with support for A/B testing.
Subscriber base: The system was shown to be capable of handling a subscriber base of over 150K subscribers.
Throughput: The system handled a peak load of 600 requests/sec with the help of the elastic auto-scaling supported by SageMaker endpoints. This testing was done on synthetic data and workloads to ensure the system can eventually scale to 5M+ users.
Campaign performance: The ML-generated recommendations improved CTR by 80% and the number of options traded per session by 50%.

WE ARE ONEBYZERO

Headquartered in Singapore with a local presence in ASEAN nations, we are a modern data & AI consulting firm. We focus on transforming enterprises with cutting-edge solutions that generate value from data. We specialize in serving the telecommunications, banking & financial services, retail, and e-commerce industries.

We focus exclusively on AI/ML, Data & Martech

AI & ML, including Generative AI

We help organizations define their AI/ML strategy, develop and operationalize AI/ML and generative AI models, and implement MLOps to streamline operations. We have experience with a variety of work in classical data science and cutting-edge GenAI for telcos.

Modern Data Platforms

Our team of experts builds robust data pipelines and designs data warehouses and lakes with a strong focus on data quality and lineage. We build Customer 360 views and data marts, and support reporting and dashboards, helping organizations uncover insights in their data using a modern data stack.

Martech & Personalization

We enable customers to modernize their digital platforms to deliver omni-channel personalization use cases. We have deep experience with marketing use cases, personalized content generation, automated campaign grid optimization, and next-best offers and actions.
