6 links
tagged with all of: machine-learning + dataset
Links
DeepMath-103K is a newly released dataset designed to enhance mathematical reasoning in language models, featuring a broad range of challenging and diverse math problems. It includes rigorous decontamination processes to ensure fair evaluation, with detailed problem structures that support various research applications. The accompanying models and code are open-sourced to facilitate further exploration and development in the field.
IMAGGarment-1 is a garment generation framework that allows for high-fidelity synthesis with precise control over silhouette, color, and logo placement, addressing the limitations of existing methods by enabling multi-conditional inputs. It utilizes a two-stage training approach, incorporating both a global appearance model and a local enhancement model, and is supported by the GarmentBench dataset, which comprises over 180K garment samples with various design conditions. Extensive experiments indicate that this framework significantly outperforms current baselines in terms of structural stability and visual fidelity.
StreamBridge is a framework that converts offline Video Large Language Models (Video-LLMs) into proactive streaming assistants, addressing two gaps: multi-turn understanding and proactive response. It uses a memory buffer and a lightweight activation model for continuous engagement, and the authors also build the Stream-IT dataset for streaming video comprehension. Experiments show that StreamBridge outperforms existing models on video understanding tasks.
MS MARCO Web Search is a comprehensive dataset designed for information retrieval research, featuring millions of real clicked query-document labels and a vast corpus from ClueWeb22. It supports various tasks in machine learning and retrieval systems, offering a benchmark for evaluating retrieval methods and performance across large datasets. Researchers can utilize this dataset to investigate the effectiveness of their techniques on both small and large data scales.
VistaDPO is a new framework for optimizing video understanding in Large Video Models (LVMs) by aligning text-video preferences at three hierarchical levels: instance, temporal, and perceptive. To address video-language misalignment and hallucinations, the authors introduce VistaDPO-7k, a dataset of 7.2K annotated QA pairs, and report significant performance improvements on various benchmarks.
REGEN introduces a new benchmark dataset aimed at enhancing the capabilities of large language models (LLMs) in generating personalized recommendations through natural language interactions. By augmenting the Amazon Product Reviews dataset with user critiques and contextual narratives, REGEN allows for more nuanced conversational recommendations that adapt to user feedback. The study demonstrates how models like LUMEN can effectively integrate recommendation and narrative generation, paving the way for more intuitive user experiences.