Links
Continued scaling of large language models (LLMs) may not be running into diminishing returns, as previously thought: even small improvements in accuracy can translate into large gains in long-horizon task execution. The study finds that LLMs struggle with longer tasks not because of reasoning limitations but because of execution errors that compound over time, which highlights the importance of model size and strategic thinking in improving performance.
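
To illustrate why compounding errors make small accuracy gains matter, here is a minimal sketch. It assumes a deliberately simplified model that is not taken from the study: every step succeeds independently with the same probability, and a task succeeds only if every step does. The function names `task_success_probability` and `horizon_length` are hypothetical, chosen for this illustration.

```python
import math


def task_success_probability(step_accuracy: float, num_steps: int) -> float:
    """Probability of finishing all steps without a single error.

    Assumption (ours): steps fail independently with constant accuracy,
    so per-step success probabilities multiply.
    """
    return step_accuracy ** num_steps


def horizon_length(step_accuracy: float, target_success: float = 0.5) -> float:
    """Number of steps at which the task success rate falls to target_success.

    Solves step_accuracy ** H = target_success for H.
    """
    if not 0.0 < step_accuracy < 1.0:
        raise ValueError("step_accuracy must be strictly between 0 and 1")
    if not 0.0 < target_success < 1.0:
        raise ValueError("target_success must be strictly between 0 and 1")
    return math.log(target_success) / math.log(step_accuracy)


# Compare a few per-step accuracies under the simplified model.
for accuracy in (0.99, 0.995, 0.999):
    steps = horizon_length(accuracy)
    print(f"step accuracy {accuracy:.3f}: about {steps:.0f} steps at 50% success")
```

Under these assumptions, raising per-step accuracy from 0.99 to 0.999 lengthens the task that can be completed half the time by roughly a factor of ten, even though accuracy itself moved by less than one percentage point. Real models need not satisfy the independence assumption, so the numbers are illustrative only.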