2 links tagged with all of: machine-learning + speculative-decoding
Links
The SpecForge team, in partnership with industry leaders, has launched SpecBundle (Phase 1), a collection of production-ready EAGLE-3 model checkpoints for speculative decoding in large language models. The release addresses the shortage of accessible tooling and high-quality draft models. Alongside it, SpecForge v0.2 introduces major usability upgrades and multi-backend support for improved performance.
DFlash introduces a lightweight block diffusion model that drafts tokens for speculative decoding, enabling faster and more accurate parallel drafting. It pairs the speed of diffusion models with the verification strength of autoregressive models and reports significant performance improvements over existing methods such as EAGLE-3. The approach shows how to combine the benefits of both model types without sacrificing quality.
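The draft-then-verify loop that both links build on can be illustrated with a minimal sketch. This is not the DFlash or EAGLE-3 implementation: `toy_target`, `toy_draft`, `verify_block`, and `speculative_generate` are hypothetical names, the "models" are toy functions over integer tokens, and only greedy verification is shown. A drafter proposes a block of tokens at once, the target model checks them, and the longest agreeing prefix is kept along with one token from the target.

```python
from typing import Callable, List

Token = int
TargetFn = Callable[[List[Token]], Token]            # context -> next token (greedy)
DraftFn = Callable[[List[Token], int], List[Token]]  # context, block size -> drafted block


def verify_block(prefix: List[Token], draft: List[Token], target_next: TargetFn) -> List[Token]:
    """Greedy verification: keep drafted tokens until the first disagreement.

    A real system would score the whole block in one batched forward pass of
    the target model; the sequential loop here is only for readability.
    """
    accepted: List[Token] = []
    context = list(prefix)
    for tok in draft:
        expected = target_next(context)
        if tok != expected:
            accepted.append(expected)  # the target's token replaces the first mismatch
            return accepted
        accepted.append(tok)
        context.append(tok)
    accepted.append(target_next(context))  # whole block accepted: one extra target token
    return accepted


def speculative_generate(prompt: List[Token], draft_block: DraftFn, target_next: TargetFn,
                         block_size: int, max_new_tokens: int) -> List[Token]:
    """Alternate block drafting and verification until enough tokens are produced."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        draft = draft_block(out, block_size)
        out.extend(verify_block(out, draft, target_next))
    return out[:len(prompt) + max_new_tokens]


def toy_target(context: List[Token]) -> Token:
    """Hypothetical target model: counts upward modulo 10."""
    return (context[-1] + 1) % 10


def toy_draft(context: List[Token], block_size: int) -> List[Token]:
    """Hypothetical drafter: agrees with the target except that it never emits 3."""
    block: List[Token] = []
    last = context[-1]
    for _ in range(block_size):
        nxt = (last + 1) % 10
        if nxt == 3:
            nxt = 0  # deliberate drafting error
        block.append(nxt)
        last = nxt
    return block


if __name__ == "__main__":
    print(speculative_generate([0], toy_draft, toy_target, block_size=4, max_new_tokens=12))
```

Because every kept token is either confirmed or supplied by the target, the output under greedy verification matches what the target alone would generate; the drafter only changes how many target steps are needed.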