Links
Jeff Dean lists approximate latencies for common computing operations, including cache references, memory accesses, and network round trips. The figures give developers order-of-magnitude benchmarks for reasoning about performance, and knowing them helps when deciding where optimization effort will pay off.
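As a rough illustration of how such a list can be used, the sketch below stores the commonly cited approximate figures (from around 2012; real hardware varies, so treat them as orders of magnitude only) and compares them. The helper names `format_ns` and `ratio` are made up for this example.

```python
# Commonly cited approximate latencies, in nanoseconds (circa 2012).
# These are order-of-magnitude guides, not measurements of any specific machine.
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "Branch mispredict": 5,
    "L2 cache reference": 7,
    "Mutex lock/unlock": 25,
    "Main memory reference": 100,
    "Compress 1 KB with Zippy": 3_000,
    "Send 1 KB over 1 Gbps network": 10_000,
    "Read 4 KB randomly from SSD": 150_000,
    "Read 1 MB sequentially from memory": 250_000,
    "Round trip within same datacenter": 500_000,
    "Read 1 MB sequentially from SSD": 1_000_000,
    "Disk seek": 10_000_000,
    "Read 1 MB sequentially from disk": 20_000_000,
    "Send packet CA -> Netherlands -> CA": 150_000_000,
}


def format_ns(ns):
    """Render a nanosecond value in the largest convenient unit."""
    if ns >= 1_000_000:
        return f"{ns / 1_000_000:g} ms"
    if ns >= 1_000:
        return f"{ns / 1_000:g} us"
    return f"{ns:g} ns"


def ratio(slow, fast):
    """How many times slower one operation is than another."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]


if __name__ == "__main__":
    for name, ns in LATENCY_NS.items():
        print(f"{name:<40}{format_ns(ns):>10}")
    print("Main memory vs L1 cache:",
          ratio("Main memory reference", "L1 cache reference"))
```

Comparing ratios rather than absolute values tends to be the more durable use of the list, since the relative gaps between cache, memory, disk, and network change more slowly than the raw numbers.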
The article discusses the transformation of a batch machine learning inference system into a real-time system to handle explosive user growth, achieving a 5.8x reduction in latency and maintaining over 99.9% reliability. Key optimizations included migrating to Redis for faster data access, compiling models to native C binaries, and implementing gRPC for improved data transmission. These changes enabled the system to serve millions of predictions quickly while capturing significant revenue that would have otherwise been lost.
Flipkart's Promise team optimized delivery date calculation for the Search and Browse (S&B) page, reducing latency to 100ms for 100 items while scaling to 10 times the current queries per second (QPS). The solution caches source and vendor capacities and decouples their storage, which keeps real-time delivery dates accurate while making the lookup cheaper. These improvements preserve the user experience without compromising performance during high demand.
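The summary does not describe Flipkart's implementation, so the following is only a generic sketch of the idea of caching two kinds of capacity in separate stores. Everything here is an assumption made for illustration: the `TTLCache` class, the `remaining_capacity` helper, the loader callables standing in for the backing stores, and the time-to-live values.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds.

    `loader` is a hypothetical callable that fetches a value from the
    backing store on a miss. `clock` is injectable to ease testing.
    """

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit: no trip to the backing store
        value = self._loader(key)  # miss or expired: reload and re-cache
        self._entries[key] = (now + self._ttl, value)
        return value


def remaining_capacity(items, source_cache, vendor_cache):
    """For each (source_id, vendor_id) pair, return the tighter capacity.

    Source and vendor capacities live in separate caches, so each can
    have its own loader and expiry policy.
    """
    return [min(source_cache.get(s), vendor_cache.get(v)) for s, v in items]


if __name__ == "__main__":
    # Stand-in loaders; a real system would query its capacity stores.
    source_cache = TTLCache(lambda source_id: 40, ttl_seconds=5)
    vendor_cache = TTLCache(lambda vendor_id: 25, ttl_seconds=30)
    page_items = [("warehouse-1", "courier-a"), ("warehouse-2", "courier-a")]
    print(remaining_capacity(page_items, source_cache, vendor_cache))
```

Keeping the two caches separate means a change in one kind of capacity only invalidates its own entries, which is one plausible reading of "decoupling their storage".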