Saved February 14, 2026
This article details how Yelp developed its AI assistant to provide direct answers from business pages, overcoming challenges in data retrieval and generation. It explains the architectural decisions made to ensure accuracy and speed, including strategies for handling data freshness and content separation.
Yelp built its AI assistant, Yelp Assistant, to give quick, accurate answers drawn from its database of user reviews and business information. Traditional platforms often scatter answers across several sections, which makes it hard for users to find what they need. The assistant uses Retrieval-Augmented Generation (RAG), which separates retrieving relevant snippets from generating the response. An indexing pipeline prepares the data for retrieval, so the assistant can cite specific sources when it answers a query.
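To make the split between retrieval and generation concrete, here is a minimal sketch in Python. It is not Yelp's implementation: the `Snippet` type, the `source_id` format, the stopword list, and the token-overlap scoring are all assumptions chosen for illustration, and a production system would likely use a search engine or vector index instead. The point is only that indexing and retrieval run before generation, and that each retrieved snippet carries an identifier the answer can cite.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical stopword list; a real system would use a proper analyzer.
STOPWORDS = {"is", "the", "a", "and", "to", "on", "in", "but", "no"}


@dataclass(frozen=True)
class Snippet:
    source_id: str  # hypothetical identifier, e.g. "review:1"
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase, split on whitespace, strip simple punctuation, drop stopwords."""
    words = {word.strip(".,!?").lower() for word in text.split()}
    return words - STOPWORDS - {""}


def build_index(snippets: list[Snippet]) -> dict[str, set[str]]:
    """Indexing step: map each token to the ids of the snippets containing it."""
    index: dict[str, set[str]] = {}
    for snippet in snippets:
        for token in tokenize(snippet.text):
            index.setdefault(token, set()).add(snippet.source_id)
    return index


def retrieve(
    query: str,
    snippets: list[Snippet],
    index: dict[str, set[str]],
    k: int = 2,
) -> list[Snippet]:
    """Retrieval step: rank snippets by how many query tokens they share."""
    scores: dict[str, int] = {}
    for token in tokenize(query):
        for source_id in index.get(token, set()):
            scores[source_id] = scores.get(source_id, 0) + 1
    by_id = {s.source_id: s for s in snippets}
    # Highest score first; ties broken by id so the order is deterministic.
    ranked = sorted(scores, key=lambda sid: (-scores[sid], sid))
    return [by_id[sid] for sid in ranked[:k]]


def build_prompt(query: str, retrieved: list[Snippet]) -> str:
    """Generation input: only retrieved snippets, each tagged with a citable id."""
    context = "\n".join(f"[{s.source_id}] {s.text}" for s in retrieved)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{context}\n"
        f"Question: {query}"
    )


SNIPPETS = [
    Snippet("review:1", "The patio is dog friendly and heated."),
    Snippet("review:2", "Parking is hard to find on weekends."),
    Snippet("review:3", "Great espresso, but no patio seating in winter."),
]

INDEX = build_index(SNIPPETS)
TOP = retrieve("Is the patio dog friendly?", SNIPPETS, INDEX, k=2)
print(build_prompt("Is the patio dog friendly?", TOP))
```

The prompt returned by `build_prompt` would then be passed to whatever language model produces the final answer; that call is left out because the summary does not say which model or API is used.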
A robust data strategy is critical to the assistant's effectiveness, and Yelp's architecture went through four main shifts to improve data handling. First, it adopted a hybrid freshness model: fast-moving data such as reviews is updated in real time, while slower-moving information such as menus is refreshed weekly. This keeps answers timely. Second, Yelp separated the storage of unstructured content, such as reviews, from the storage of structured facts, such as operating hours. Keeping the two apart lets each be handled more efficiently and reduces the risk of inaccurate responses.
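The first two shifts can be sketched as follows. This is an illustration under stated assumptions, not Yelp's code: the `REFRESH_INTERVAL` table, the `IndexedDoc` type, the in-memory stores, and the `lookup` helper are all hypothetical. The weekly interval mirrors the menu example above, and `None` stands for content that is updated as events arrive rather than by a batch job.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical refresh policy per content type.
# None means "streamed in real time", so the batch job never touches it.
REFRESH_INTERVAL: dict[str, timedelta | None] = {
    "review": None,
    "menu": timedelta(days=7),
}


@dataclass
class IndexedDoc:
    content_type: str
    indexed_at: datetime


def needs_batch_refresh(doc: IndexedDoc, now: datetime) -> bool:
    """Return True when a batch-refreshed document is older than its interval."""
    interval = REFRESH_INTERVAL[doc.content_type]
    if interval is None:
        return False  # real-time content is kept fresh by the streaming path
    return now - doc.indexed_at >= interval


# Two separate stores (hypothetical data): structured facts are read by exact
# key, while unstructured content is searched as free text.
STRUCTURED_FACTS: dict[tuple[str, str], str] = {
    ("biz-1", "hours"): "Mon-Fri 8am-6pm",
}
UNSTRUCTURED_REVIEWS: dict[str, list[str]] = {
    "biz-1": [
        "The patio is lovely in summer.",
        "Service was quick even at lunch.",
    ],
}


def lookup(business_id: str, topic: str) -> tuple[str, object]:
    """Prefer an exact structured fact; otherwise fall back to review text."""
    fact = STRUCTURED_FACTS.get((business_id, topic))
    if fact is not None:
        return ("structured", fact)
    matches = [
        text
        for text in UNSTRUCTURED_REVIEWS.get(business_id, [])
        if topic.lower() in text.lower()
    ]
    return ("unstructured", matches)


now = datetime(2026, 2, 14, tzinfo=timezone.utc)
stale_menu = IndexedDoc("menu", now - timedelta(days=8))
print(needs_batch_refresh(stale_menu, now))
print(lookup("biz-1", "hours"))
print(lookup("biz-1", "patio"))
```

Reading operating hours from a keyed store rather than inferring them from review text is one way the separation could reduce inaccurate answers: a fact like opening hours is either present and exact, or absent.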
The third shift was hybrid photo retrieval. By combining caption text with image embeddings, the system retrieves relevant images regardless of how users phrase their questions. Fourth, Yelp established a Content Fetching API to streamline retrieval from its various data sources and reduce latency. This API lets the assistant deliver responses in under 100 milliseconds, improving the overall user experience without complicating the assistant's logic.
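One way to picture hybrid photo retrieval is a score that blends a caption match with an embedding similarity. The sketch below is an assumption-heavy illustration, not Yelp's method: the `Photo` type, the two-dimensional toy embeddings, the word-overlap caption score, and the weighted sum with `text_weight` are all hypothetical. In practice the query embedding would come from a model that maps text and images into a shared space.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Photo:
    photo_id: str
    caption: str
    embedding: tuple[float, ...]  # toy stand-in for a real image embedding


def caption_score(query: str, caption: str) -> float:
    """Fraction of query words that also appear in the caption."""
    query_words = set(query.lower().split())
    caption_words = set(caption.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & caption_words) / len(query_words)


def cosine(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_photos(
    query: str,
    query_embedding: tuple[float, ...],
    photos: list[Photo],
    text_weight: float = 0.5,
) -> list[Photo]:
    """Sort photos by a weighted blend of caption match and embedding similarity."""

    def score(photo: Photo) -> float:
        text_part = caption_score(query, photo.caption)
        image_part = cosine(query_embedding, photo.embedding)
        return text_weight * text_part + (1.0 - text_weight) * image_part

    return sorted(photos, key=score, reverse=True)


PHOTOS = [
    Photo("p1", "outdoor patio with heaters", (1.0, 0.0)),
    Photo("p2", "", (0.9, 0.1)),  # no caption, but visually similar to p1
    Photo("p3", "latte art", (0.0, 1.0)),
]

# "terrace seating" matches no caption word, so only the embeddings decide.
ranked = rank_photos("terrace seating", (1.0, 0.0), PHOTOS)
print([photo.photo_id for photo in ranked])
```

In this toy data the query shares no words with any caption, yet the uncaptioned photo `p2` still ranks above the unrelated `p3` because its embedding is close to the query's. That is the behavior the paragraph describes: phrasing that misses the caption can still find the right image.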