5 links
tagged with all of: performance + latency
Links
The article discusses the transformation of a batch machine learning inference system into a real-time system to handle explosive user growth, achieving a 5.8x reduction in latency and maintaining over 99.9% reliability. Key optimizations included migrating to Redis for faster data access, compiling models to native C binaries, and implementing gRPC for improved data transmission. These changes enabled the system to serve millions of predictions quickly while capturing significant revenue that would have otherwise been lost.
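One concrete piece of such a migration is moving feature lookups out of a batch store and into Redis so predictions can be served on request. The sketch below is a minimal, generic illustration of that pattern rather than the article's own code; the key layout ("features:<user_id>"), the field handling, and the placeholder predict() function are assumptions made for the example.

```python
# Minimal sketch of a Redis-backed feature lookup for real-time inference.
# Key layout, field names, and predict() are hypothetical, not the article's code.
import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

def fetch_features(user_id: str) -> dict:
    # Precomputed features stored as a Redis hash for fast per-request reads,
    # replacing the slower batch-oriented store.
    raw = r.hgetall(f"features:{user_id}")
    return {name: float(value) for name, value in raw.items()}

def predict(features: dict) -> float:
    # Placeholder for the model call; a production system would invoke the
    # compiled model (e.g. via a native extension or a gRPC service).
    return sum(features.values()) / max(len(features), 1)

if __name__ == "__main__":
    print(predict(fetch_features("user-42")))
```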
Companies looking to optimize infrastructure costs and service reliability should consider forming a performance engineering team. These teams can achieve significant cost savings and latency reductions, ultimately enhancing scalability and engineering efficiency. The article outlines the benefits and ROI of hiring performance engineers, emphasizing their role in both immediate optimizations and long-term strategic improvements.
Tail latency, or high-percentile latency, significantly impacts user experience in modern architectures with multiple service calls. As the number of parallel calls increases, the likelihood of encountering high-latency responses rises, making it crucial to monitor and understand latency statistics beyond just the mean. Effective monitoring should include awareness of high percentiles and consider customer use cases to capture the full picture of service performance.
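The effect compounds quickly: if each downstream call independently exceeds its p99 threshold 1% of the time, a request that fans out to n calls in parallel is slow whenever any one of them is slow, with probability 1 - 0.99^n. The short calculation below illustrates this; the 1% rate and the fan-out sizes are assumptions for illustration, not figures from the article.

```python
# Probability that a fan-out request hits at least one "slow" (above-p99)
# response, assuming each of n parallel calls is independently slow 1% of
# the time. Both the 1% rate and the fan-out sizes are illustrative.
p_slow = 0.01

for n in (1, 10, 50, 100):
    p_request_slow = 1 - (1 - p_slow) ** n
    print(f"fan-out {n:>3}: P(at least one slow call) = {p_request_slow:.1%}")
```

At a fan-out of 100, roughly 63% of requests touch at least one above-p99 response, which is why the mean alone badly understates what users experience.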
A benchmark comparison of several PostgreSQL versions reports transaction counts, latency, and transactions per second (TPS). The data shows PostgreSQL 18 achieving the highest transaction count and TPS, while version 17 shows the lowest performance on these measures. Overall, the newer versions generally perform better in terms of latency and transaction throughput.
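Latency and TPS in a closed-loop benchmark (such as pgbench) are two views of the same measurement: throughput is roughly the number of concurrent clients divided by the average transaction latency. The sketch below shows that relationship with made-up numbers; the client count and latency values are illustrative, not the article's data.

```python
# Little's law for a closed-loop benchmark: TPS ~= clients / average latency.
# The client count and latencies below are illustrative assumptions.
clients = 16

for avg_latency_ms in (2.0, 1.5, 1.2):
    tps = clients / (avg_latency_ms / 1000.0)
    print(f"avg latency {avg_latency_ms:.1f} ms -> ~{tps:,.0f} TPS")
```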
The article discusses the importance of caching in web applications, highlighting how it improves performance and reduces latency by storing frequently accessed data closer to the user. It also surveys caching strategies and technologies, with guidance on implementing caching effectively to improve user experience and system efficiency.
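One of the strategies such articles commonly cover is cache-aside: check the cache first, and on a miss fetch from the backing store and populate the cache for subsequent reads. The sketch below is a generic in-process version with a time-to-live; load_from_database() and the 60-second TTL are assumptions for the example, not details from the article.

```python
# Generic cache-aside pattern with a TTL, as a minimal in-process illustration.
# load_from_database() is a hypothetical stand-in for the real data source;
# the 60-second TTL is an arbitrary choice.
import time

_cache: dict[str, tuple[float, object]] = {}
TTL_SECONDS = 60.0

def load_from_database(key: str) -> object:
    # Placeholder for the slow, authoritative backing store.
    time.sleep(0.05)
    return f"value-for-{key}"

def get(key: str) -> object:
    entry = _cache.get(key)
    if entry is not None:
        stored_at, value = entry
        if time.monotonic() - stored_at < TTL_SECONDS:
            return value  # cache hit: served without touching the database
    value = load_from_database(key)  # cache miss or expired entry: reload
    _cache[key] = (time.monotonic(), value)
    return value

if __name__ == "__main__":
    get("profile:123")  # first call misses and hits the "database"
    get("profile:123")  # second call is served from the cache
```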