OpenAI's GPT-OSS models bring several efficiency upgrades to the transformers library, including MXFP4 quantization and specialized kernels that speed up model loading and execution. These updates enable faster inference and fine-tuning while keeping the standard transformers APIs and compatibility with the library's major models, and community-contributed kernels are integrated so that usage and performance tuning require little extra setup.
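As an illustration, below is a minimal sketch of loading one of these checkpoints through the regular transformers API; the model id "openai/gpt-oss-20b" and the behavior around quantized weights are assumptions for illustration, not details stated above.

```python
# Minimal sketch: loading a GPT-OSS checkpoint with the standard transformers API.
# Assumptions (not confirmed by the summary above): the checkpoint name
# "openai/gpt-oss-20b", and that MXFP4 weights and optimized kernels are used
# when available, with a higher-precision fallback otherwise.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openai/gpt-oss-20b"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # keep the dtype stored in the checkpoint
    device_map="auto",    # place layers on available GPUs automatically
)

inputs = tokenizer(
    "Explain MXFP4 quantization in one sentence.", return_tensors="pt"
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In this sketch, quantization handling and kernel selection happen at load time, so the generation code is the same as for any other transformers causal language model.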
SGLang has integrated Hugging Face transformers as a backend, bringing its high-throughput, low-latency serving to models while preserving the flexibility of the transformers library. The integration covers models that SGLang does not natively support, with automatic fallback to the transformers implementation and optimizations such as RadixAttention, which streamlines deployment and day-to-day usage.
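The following is a minimal sketch of explicitly selecting the transformers backend from SGLang's offline engine; the `model_impl="transformers"` argument name and the example checkpoint are assumptions about the API, not details given above.

```python
# Minimal sketch: serving a model through SGLang while forcing the transformers backend.
# Assumptions (not confirmed by the summary above): the `model_impl` argument name and
# the example checkpoint. By default, SGLang is expected to fall back to transformers
# automatically when it has no native implementation for a model.
import sglang as sgl

if __name__ == "__main__":  # SGLang spawns worker processes, so guard the entry point
    engine = sgl.Engine(
        model_path="meta-llama/Llama-3.2-1B-Instruct",  # hypothetical example model
        model_impl="transformers",  # assumed switch to force the transformers backend
    )
    prompts = ["What is the capital of France?"]
    sampling_params = {"temperature": 0.0, "max_new_tokens": 32}
    outputs = engine.generate(prompts, sampling_params)
    for prompt, out in zip(prompts, outputs):
        print(prompt, "->", out["text"])
    engine.shutdown()
```

Under these assumptions, the same engine object and generate call work whether SGLang runs its native implementation or the transformers fallback, which is what makes the integration transparent to calling code.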