In-context Ranking (ICR) leverages the contextual understanding of large language models (LLMs) for information retrieval by placing the task description, candidate documents, and query together in the model's input. This paper introduces BlockRank, a method that makes attention in LLMs more efficient by enforcing inter-document block sparsity and optimizing query-document relevance, yielding significant gains in performance and scalability on long-context retrieval tasks. Experiments show that BlockRank matches or surpasses state-of-the-art methods while being considerably more efficient at inference.
Tags: in-context-ranking, information-retrieval, generative-models, attention-mechanism, scalable-solutions
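To make the "inter-document block sparsity" idea concrete, here is a minimal sketch of how such an attention mask could be constructed for an ICR prompt. This is an illustration of the general pattern, not the paper's implementation: the prompt layout ([instruction | doc_1 | ... | doc_N | query]), the function name `block_sparse_mask`, and the exact masking rules (documents attend to the instruction and to themselves; the query attends to everything) are assumptions made for this example.

```python
# Illustrative sketch (assumed pattern, not BlockRank's actual code):
# build an attention mask with inter-document block sparsity for an
# in-context ranking prompt laid out as [instruction | doc_1 | ... | doc_N | query].
import numpy as np

def block_sparse_mask(instr_len: int, doc_lens: list[int], query_len: int) -> np.ndarray:
    total = instr_len + sum(doc_lens) + query_len
    mask = np.zeros((total, total), dtype=bool)  # True = attention allowed

    # Instruction tokens attend among themselves.
    mask[:instr_len, :instr_len] = True

    # Each document block attends to the shared instruction and to itself,
    # but not to other documents (this is the inter-document sparsity).
    start = instr_len
    for d in doc_lens:
        end = start + d
        mask[start:end, :instr_len] = True
        mask[start:end, start:end] = True
        start = end

    # Query tokens attend to the full context (instruction, all docs, query).
    mask[start:, :] = True

    # Intersect with a causal mask so the pattern is valid for decoder-only LLMs.
    causal = np.tril(np.ones((total, total), dtype=bool))
    return mask & causal

# Example: 4-token instruction, three documents of lengths 5, 3, and 6, 2-token query.
m = block_sparse_mask(4, [5, 3, 6], 2)
print(m.shape)  # (20, 20)
```

Because each document block only attends to itself and the fixed-length instruction, the attention cost grows roughly linearly in the number of candidate documents rather than quadratically in total context length, which is what makes this kind of sparsity attractive for long-context retrieval.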