The article describes how a batch machine learning inference system was converted into a real-time system to keep up with explosive user growth, cutting latency by 5.8x while maintaining over 99.9% reliability. Key optimizations included migrating to Redis for faster data access, compiling models to native C binaries, and adopting gRPC for data transmission. Together, these changes let the system serve millions of predictions quickly and capture significant revenue that would otherwise have been lost.
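To make the data-access step concrete, here is a minimal sketch of a request-time lookup of precomputed features from a key-value store. It is not the article's implementation: `FeatureStore` is a hypothetical in-memory stand-in for a store such as Redis, and the `features:<user_id>` key layout, the `predict` function, and the fallback to default features are all assumptions made for illustration.

```python
from typing import Callable, Dict, Optional, Sequence


class FeatureStore:
    """Hypothetical in-memory stand-in for a key-value store such as Redis."""

    def __init__(self) -> None:
        self._data: Dict[str, Sequence[float]] = {}

    def set(self, key: str, value: Sequence[float]) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[Sequence[float]]:
        # Returns None on a miss, like a typical key-value lookup.
        return self._data.get(key)


def predict(
    user_id: str,
    store: FeatureStore,
    model: Callable[[Sequence[float]], float],
    default_features: Sequence[float],
) -> float:
    """Look up precomputed features by key and score them at request time."""
    features = store.get(f"features:{user_id}")  # assumed key layout
    if features is None:
        # Assumption: fall back to default features rather than fail the request.
        features = default_features
    return model(features)
```

A real deployment would replace the dictionary with network calls to the store and would need to serialize the feature vectors, both of which this sketch leaves out.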
Flipkart's Promise team optimized delivery date calculation for the Search and Browse (S&B) page, bringing latency down to 100 ms for 100 items while scaling to 10 times the current queries per second (QPS). The solution caches source and vendor capacities and decouples their storage, which improves both the accuracy and the efficiency of real-time delivery dates. These improvements deliver a better user experience without compromising performance during periods of high demand.
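The caching idea can be sketched as follows: source and vendor capacities sit in two separate caches with a time-to-live, so a page of 100 items triggers only a handful of backend loads. This is an illustration under stated assumptions, not Flipkart's design: the `TTLCache` class, the per-day capacity lists, and the rule that an item ships on the first day when both its source and its vendor have capacity are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class TTLCache:
    """Hypothetical read-through cache whose entries expire after ttl_seconds."""

    def __init__(
        self,
        loader: Callable[[str], List[int]],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, List[int]]] = {}
        self.loads = 0  # number of times the backing loader was called

    def get(self, key: str) -> List[int]:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh hit: no backend call
        value = self._loader(key)
        self.loads += 1
        self._entries[key] = (now, value)
        return value


def earliest_day(source_cap: List[int], vendor_cap: List[int]) -> Optional[int]:
    """First day offset on which both source and vendor have spare capacity."""
    for day, (s, v) in enumerate(zip(source_cap, vendor_cap)):
        if s > 0 and v > 0:
            return day
    return None  # no common capacity within the horizon


def delivery_days(
    items: List[Tuple[str, str]],
    source_cache: TTLCache,
    vendor_cache: TTLCache,
) -> List[Optional[int]]:
    """Compute a day offset per (source_id, vendor_id) pair using both caches."""
    return [
        earliest_day(source_cache.get(src), vendor_cache.get(ven))
        for src, ven in items
    ]
```

Keeping the two caches separate means a capacity change on the vendor side can expire or refresh independently of the source side. The trade-off of any TTL is staleness: a shorter TTL gives fresher dates at the cost of more backend loads.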