This paper explores how large language models commit to decisions during reasoning. It demonstrates that these models often encode a choice internally before generating the corresponding text, and that this encoded choice shapes the subsequent chain of thought. The research shows that intervening on the initial decision can significantly change the reasoning outcome.
This article explores how advanced AI models can generate detailed image descriptions and reasoning without actual image input, a phenomenon called mirage reasoning. It highlights vulnerabilities in these models, particularly in medical contexts, and introduces B-Clean, a method for better evaluating multimodal AI systems by minimizing non-visual inference.
The article discusses the importance of data activation in enhancing the performance of large language models (LLMs), particularly in the healthcare sector. It highlights recent advancements in transforming structured medical data into usable formats for LLMs, emphasizing the need for effective reasoning methods to fully leverage the potential of healthcare data.
Deep Think with Confidence (DeepConf) is introduced as a method to improve reasoning efficiency and performance in large language models by using internal confidence signals to filter out low-quality reasoning traces. It requires no additional training or tuning and can be easily integrated into existing systems. Evaluations show significant accuracy improvements and a reduction in generated tokens on various reasoning tasks.
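The filtering idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: the trace format, the mean-log-probability confidence metric, and the `keep_fraction` threshold are all illustrative assumptions standing in for DeepConf's actual internal confidence signals.

```python
# Hedged sketch of confidence-filtered voting in the spirit of DeepConf.
# Assumed interface: each trace carries its final answer and per-token
# log-probabilities; the confidence metric and threshold are illustrative.
from collections import Counter

def trace_confidence(token_logprobs):
    """Mean token log-probability as a simple internal confidence signal."""
    return sum(token_logprobs) / len(token_logprobs)

def filtered_vote(traces, keep_fraction=0.5):
    """Keep the most-confident fraction of traces, then majority-vote answers."""
    ranked = sorted(traces, key=lambda t: trace_confidence(t["logprobs"]),
                    reverse=True)
    kept = ranked[: max(1, int(len(ranked) * keep_fraction))]
    return Counter(t["answer"] for t in kept).most_common(1)[0][0]

# Toy example: the low-confidence trace answering "17" is filtered out
# before voting, so the confident traces' answer "42" wins.
traces = [
    {"answer": "42", "logprobs": [-0.1, -0.2, -0.1]},
    {"answer": "42", "logprobs": [-0.2, -0.1, -0.3]},
    {"answer": "17", "logprobs": [-2.5, -3.0, -2.8]},
]
print(filtered_vote(traces))  # → 42
```

Because filtering happens before voting, low-quality traces never contribute to the answer, and an early-stopping variant could avoid generating their remaining tokens at all, which is where the reported token savings come from.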
The article reviews significant trends and developments in the LLM space throughout 2025, highlighting breakthroughs in reasoning, the rise of coding agents, and the increasing use of LLMs in command-line interfaces. It notes the evolution of tools and models, including the impact of asynchronous coding agents and the normalization of YOLO mode for improved efficiency.