1 min read | Saved February 14, 2026
Do you care about this?
Quinn Slack introduces a new metric, "Off-the-Rails Cost," and uses it to compare the Sonnet, Gemini, and Opus models. He highlights that 17.8% of spend by Gemini users goes to "wasted threads," significantly worse than the other two models. The analysis is meant to improve Amp and may lead to automatic detection of wasted threads.
If you do, here's more
Quinn Slack highlights a new metric, “Off-the-Rails Cost,” which measures how much spend goes to inefficient AI-generated threads. It defines a “wasted thread” as one where the model produces excessive irrelevant output or repeats itself to the point that the thread should be abandoned. In the analysis, 17.8% of the costs incurred by Gemini users on the Amp platform were tied to wasted threads, more than double Sonnet’s share and nearly eight times Opus’s, suggesting that Gemini is markedly less cost-effective and efficient than its counterparts.
The findings are intended to help the Amp platform monitor and remediate wasted threads. There is also a potential plan to build automatic detection of these inefficiencies into the product, which could lead to credits for users affected by wasted threads, and to provide privacy-preserving feedback to the developers of these models so they can refine their systems and improve performance. This insight is particularly valuable because it pinpoints specific areas for improvement, thereby enhancing the user experience.
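The article does not show how Amp computes the metric, but the definition above reduces to simple arithmetic: the share of total spend attributed to threads flagged as wasted. A minimal sketch in Python, where the `Thread` type and the `wasted` flag are hypothetical stand-ins for however Amp actually labels threads:

```python
from dataclasses import dataclass

@dataclass
class Thread:
    cost_usd: float
    wasted: bool  # hypothetical flag: thread was abandoned due to
                  # excessive irrelevant output or repetition

def off_the_rails_share(threads: list[Thread]) -> float:
    """Fraction of total spend that went to wasted threads."""
    total = sum(t.cost_usd for t in threads)
    if total == 0:
        return 0.0
    wasted = sum(t.cost_usd for t in threads if t.wasted)
    return wasted / total
```

For example, one wasted $1 thread among $5 of total spend yields a share of 0.2; the 17.8% figure for Gemini corresponds to a share of 0.178 computed this way across all of a model's threads.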