5 links tagged with all of: infrastructure + optimization
Links
This article details how Uber Eats developed its semantic search system to improve order discovery and conversion rates. It covers the architecture, model training, and challenges faced while scaling the platform to handle diverse queries effectively.
Zoomer is Meta's platform for automated debugging and optimization of AI workloads, enhancing performance across training and inference processes. It delivers insights that reduce training times and improve query performance, addressing inefficiencies in GPU utilization. The tool generates thousands of performance reports daily for various AI applications.
Patreon faced challenges scaling its infrastructure for live events, which required cross-team collaboration to quantify capacity and optimize performance. By carefully analyzing and prioritizing app requests, the team reduced load and improved user experience while maintaining system reliability. A key lesson was that achieving scalability required optimizing both the client and the server.
Charlotte Qi discusses the challenges of serving large language models (LLMs) at Meta, focusing on the complexities of LLM inference and the need for efficient hardware and software solutions. She outlines the critical steps to optimize LLM serving, including fitting models to hardware, managing latency, and leveraging techniques like continuous batching and disaggregation to enhance performance.
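Continuous batching, one of the techniques mentioned above, lets new requests join a running batch as soon as a slot frees up, instead of waiting for the whole batch to finish. A minimal sketch of the idea, using a toy model where each request needs a fixed number of decode steps (the `Request` class and `serve` function are illustrative, not from any real serving framework):

```python
from dataclasses import dataclass
from collections import deque

@dataclass
class Request:
    id: int
    tokens_left: int  # decode steps this request still needs

def serve(requests, max_batch_size):
    """Run decode steps; new requests join as soon as a slot frees up."""
    waiting = deque(requests)
    active: list[Request] = []
    completions = []  # (request id, step at which it finished)
    step = 0
    while waiting or active:
        # Continuous batching: refill the batch on every step,
        # not only when the whole batch has drained.
        while waiting and len(active) < max_batch_size:
            active.append(waiting.popleft())
        step += 1
        for r in active:
            r.tokens_left -= 1  # one decode step for every active request
        for r in active:
            if r.tokens_left == 0:
                completions.append((r.id, step))
        active = [r for r in active if r.tokens_left > 0]
    return completions

reqs = [Request(0, 3), Request(1, 1), Request(2, 2), Request(3, 2)]
print(serve(reqs, max_batch_size=2))
# → [(1, 1), (0, 3), (2, 3), (3, 5)]
```

With static batching, request 2 would wait the full 3 steps for the first batch to drain; here it slips into request 1's slot after a single step, which is the latency win the talk attributes to continuous batching.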
Pinterest has enhanced its machine learning (ML) infrastructure by extending the capabilities of Ray beyond just training and inference. By addressing challenges such as slow data pipelines and inefficient compute usage, Pinterest implemented a Ray-native ML infrastructure that improves feature development, sampling, and labeling, leading to faster, more scalable ML iteration.