3 links
tagged with all of: data-processing + pinterest
Links
Pinterest has developed a Feature Backfill solution that accelerates machine learning feature iteration and overcomes the challenges of traditional forward logging. The approach significantly reduces iteration time and cost, letting engineers integrate new features more efficiently while addressing issues such as data integrity and resource management. The article details the evolution of their backfill processes, including a two-stage method that improves parallel execution and reduces computational expense.
Pinterest is transitioning from its aging Hadoop-based platform to a Kubernetes-based data processing solution named Moka, designed to address scalability and performance needs. The first part of this series discusses the rationale behind this shift, the architecture of the new platform, and initial design considerations, while outlining the benefits of using Kubernetes for data processing at massive scale.
Pinterest is enhancing its ad retrieval systems by transitioning from online to offline Approximate Nearest Neighbor (ANN) search to improve efficiency, reduce infrastructure costs, and maintain high performance as its ad inventory expands. The article outlines the architecture, advantages, and use cases of offline ANN, particularly for similar-item ads and visual embeddings, and discusses the future potential of this approach within Pinterest's ad ecosystem.