3 links tagged with all of: data-processing + infrastructure
Links
This article details how Spotify developed its data platform to manage 1.4 trillion data points daily from user interactions. It covers the evolution from improvised systems to a structured platform that supports data collection, processing, and management for various business needs.
Pinterest has enhanced its machine learning (ML) infrastructure by extending its use of Ray beyond training and inference. To address slow data pipelines and inefficient compute usage, Pinterest implemented a Ray-native ML infrastructure that improves feature development, sampling, and labeling, leading to faster, more scalable ML iteration.
Pinterest is enhancing its ad retrieval systems by transitioning from online to offline Approximate Nearest Neighbors (ANN) algorithms to improve efficiency, reduce infrastructure costs, and maintain high performance as its ad inventory expands. The article outlines the architecture, advantages, and use cases of offline ANN, particularly in similar-item ads and visual embeddings, and discusses the future potential of this approach within Pinterest's ad ecosystem.