Saved February 14, 2026
Do you care about this?
ShareChat engineers hit scalability limits with their ML feature store, which initially could not handle the required load. After a series of architectural optimizations and a shift in focus, they rebuilt the system to serve 1 billion features per second without increasing database capacity.
If you do, here's more
ShareChat struggled to scale its machine learning (ML) feature store, originally designed for user features, to meet the demands of its short-form video app, Moj. The original system crashed under a load of 1 million features per second, three orders of magnitude short of the 1 billion target. The failure showed that the architecture had to serve both user and post features efficiently, especially as the number of posts to rank per request grew from hundreds to thousands. The system relied on pre-aggregated data tiles for performance, but with the initial database setup this design suffered severe latency problems and scaled poorly.
To address these shortcomings, ShareChat's engineering team undertook a major overhaul. They changed the database schema to consolidate feature rows, which cut the number of rows that had to be read from 2 billion to 200 million per second. As part of this work, they updated the tiling configuration to include additional time segments, which significantly reduced the average number of tiles needed per request. They also switched ScyllaDB's compaction strategy to leveled compaction, which improved read I/O and effectively doubled the database's capacity without additional infrastructure costs.
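To see why adding time segments reduces the number of tiles read, here is a minimal sketch that covers a query window with aligned, pre-aggregated tiles, always picking the largest tile that fits. The tile sizes, the window, and the function name cover_window are illustrative assumptions, not ShareChat's actual configuration or code.

```python
from __future__ import annotations


def cover_window(start: int, end: int, tile_sizes: list[int]) -> list[tuple[int, int]]:
    """Cover the half-open window [start, end), in minutes, with aligned tiles.

    A tile of size s may only start at a multiple of s. At each position the
    largest tile that is aligned and fits inside the window is chosen.
    Returns a list of (tile_start, tile_size) pairs.
    """
    sizes = sorted(tile_sizes, reverse=True)
    tiles = []
    t = start
    while t < end:
        for size in sizes:
            if t % size == 0 and t + size <= end:
                tiles.append((t, size))
                t += size
                break
        else:
            # Only reachable if no 1-minute tile is configured.
            raise ValueError("window cannot be covered; include a tile size of 1")
    return tiles


# Hypothetical 3-hour window that is not aligned to any large tile boundary.
WINDOW_START, WINDOW_END = 7, 187

# Assumed configurations: a coarse set of tile sizes, and the same set
# with extra intermediate segments (5 minutes and 3 hours) added.
coarse = cover_window(WINDOW_START, WINDOW_END, [1, 30, 1440])
fine = cover_window(WINDOW_START, WINDOW_END, [1, 5, 30, 180, 1440])

print(len(coarse), len(fine))  # prints: 35 15
```

In this toy case the intermediate sizes absorb most of the 1-minute tiles at the edges of the window, so the same aggregate is assembled from fewer tiles, and each tile avoided is one less row to fetch.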
Improving cache locality was the other key lever for reducing load on the database. The team implemented consistent hashing using NGINX Ingress in their Kubernetes setup, so that requests for the same entity reach the same replica and its local cache. This raised cache hit rates and further reduced the demand on ScyllaDB. These adjustments not only met ShareChat's performance goals but also showed the value of continuous optimization in high-demand environments such as social media platforms.
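The routing idea can be sketched with a small hash ring. This is a generic illustration of consistent hashing, not NGINX Ingress's implementation; the class HashRing, the pod names, and the virtual-node count are all hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (point_on_ring, node_name)
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(key: str) -> int:
        # Any stable hash works; SHA-256 is used here for determinism.
        return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

    def add(self, node: str) -> None:
        # Each node owns several points so load spreads more evenly.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def lookup(self, key: str) -> str:
        # Walk clockwise to the first point at or after the key's hash.
        idx = bisect.bisect_left(self._ring, (self._hash(key), ""))
        if idx == len(self._ring):
            idx = 0  # wrap around the ring
        return self._ring[idx][1]


ring = HashRing(["pod-a", "pod-b", "pod-c"])
keys = [f"user-{i}" for i in range(1000)]

before = {k: ring.lookup(k) for k in keys}
ring.add("pod-d")  # scale out by one replica
after = {k: ring.lookup(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys changed replica")
```

Because a given key always maps to the same replica, that replica's cache stays warm for it. When a replica is added, only the keys that now belong to the new replica move, so the other caches keep most of their contents instead of being reshuffled.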