7 links tagged with all of: machine-learning + scalability
Links
ShareChat engineers faced scalability issues with their ML feature store, initially unable to handle the required load. After a series of architectural optimizations and a shift in focus, they successfully rebuilt the system to support 1 billion features per second without increasing database capacity.
This article examines the challenges of building large-scale generative recommendation systems, particularly managing user data and improving training efficiency. It highlights techniques such as multi-modal item towers and sampled softmax that boost performance while mitigating cold-start and latency issues.
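As a side note on the sampled-softmax technique mentioned above: the idea is to approximate the full-softmax cross-entropy over a huge item catalog by scoring the positive item against only a small sample of negatives, with a log-probability correction for the sampling distribution. A minimal NumPy sketch, not the article's implementation (function and parameter names are illustrative):

```python
import numpy as np

def sampled_softmax_loss(user_emb, item_embs, pos_idx, sample_probs, num_sampled, rng):
    """Approximate full-softmax cross-entropy by scoring the positive item
    against `num_sampled` sampled negatives instead of the whole catalog."""
    n_items = item_embs.shape[0]
    # Draw negatives from the given sampling distribution (e.g. popularity-based).
    # For simplicity this sketch does not exclude the positive from the draw.
    neg_idx = rng.choice(n_items, size=num_sampled, replace=False, p=sample_probs)
    cand_idx = np.concatenate(([pos_idx], neg_idx))
    logits = item_embs[cand_idx] @ user_emb
    # Subtract log q(i) so the sampled softmax approximates the full softmax
    # (the standard sampled-softmax correction).
    logits = logits - np.log(sample_probs[cand_idx])
    # Numerically stable softmax over the candidate set; positive sits in slot 0.
    logits = logits - logits.max()
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])

# Toy usage: 10-item catalog, 4-dim embeddings, uniform sampling distribution.
rng = np.random.default_rng(0)
item_embs = rng.standard_normal((10, 4))
user_emb = rng.standard_normal(4)
loss = sampled_softmax_loss(user_emb, item_embs, pos_idx=3,
                            sample_probs=np.full(10, 0.1),
                            num_sampled=4, rng=rng)
```

With only `num_sampled + 1` dot products per training example instead of one per catalog item, this is what makes softmax-style training tractable at recommendation-system scale.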
LinkedIn has developed OpenConnect, a next-generation AI pipeline ecosystem that significantly enhances the efficiency and reliability of processing large volumes of data for AI applications. By addressing challenges from its previous ProML system, OpenConnect reduces launch times, improves iteration speed, and supports robust experimentation, thereby facilitating the deployment of AI features for over 1.2 billion members.
M1 introduces a hybrid linear RNN reasoning model based on the Mamba architecture, designed for scalable test-time computation in solving complex mathematical problems. By leveraging distillation from existing models and reinforcement learning, M1 achieves significant speed and accuracy improvements over traditional transformer models, matching the performance of state-of-the-art distilled reasoning models while utilizing memory-efficient inference techniques.
Inferless is a serverless GPU platform designed for effortless machine learning model deployment, allowing users to scale from zero to hundreds of GPUs quickly and efficiently. With features like automatic redeployment, zero infrastructure management, and enterprise-level security, it enables companies to cut costs and improve performance without the hassles of managing traditional GPU clusters. The platform is scheduled to sunset on October 31, 2025.
The webpage provides an overview of Baseten's Model APIs, which facilitate the deployment and management of machine learning models. It emphasizes ease of integration, scalability, and the ability to create robust APIs for various applications. Users can leverage these APIs to streamline their machine learning workflows and enhance application performance.
The article discusses Instagram's efforts to scale its recommendation system to handle 1,000 models, detailing the challenges faced and the strategies implemented to enhance user experience through personalized content. Key aspects include improvements in algorithm efficiency and data processing techniques.