OpenAI gpt-oss LLMs use MXFP4: smaller, faster, cheaper

5 min read | Saved October 29, 2025

openai 🤖 mxfp4 🤖 inference 🤖 quantization 🤖 ai-models 🤖

Do you care about this?

OpenAI's gpt-oss models use MXFP4, a 4-bit micro-scaling block floating-point data type that makes models smaller and faster and can cut inference costs by up to 75%. The format lets large language models (LLMs) run on less hardware, which could change how AI models are deployed across platforms. By shipping gpt-oss in MXFP4, OpenAI signals its confidence in the format and may effectively set a new standard for model quantization in the industry.
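
To make the "micro-scaling block floating-point" idea concrete, here is a minimal Python sketch. It is not OpenAI's implementation. It assumes the layout described in the OCP Microscaling Formats specification as I understand it: blocks of 32 elements, each stored as a 4-bit E2M1 value, sharing one 8-bit power-of-two scale. The function names (`quantize_block`, `dequantize_block`, `bits_per_weight`, `size_reduction_vs`) are hypothetical, and ties are rounded to the smaller magnitude for simplicity.

```python
import math

# Assumed MXFP4 layout: magnitudes representable by a 4-bit E2M1 element
# (1 sign bit, 2 exponent bits, 1 mantissa bit).
E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 32   # elements that share one scale
ELEM_BITS = 4     # bits per element
SCALE_BITS = 8    # bits for the shared power-of-two scale
E2M1_EMAX = 2     # exponent of the largest E2M1 value (6 = 1.5 * 2**2)


def quantize_block(block):
    """Return (shared_exponent, quantized element values) for one block."""
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return 0, [0.0] * len(block)
    # frexp gives max_abs = m * 2**e with 0.5 <= m < 1, so floor(log2) = e - 1.
    shared_exp = (math.frexp(max_abs)[1] - 1) - E2M1_EMAX
    scale = 2.0 ** shared_exp
    elems = []
    for x in block:
        # Scale down, clamp to the largest representable magnitude,
        # then snap to the nearest E2M1 value (ties go to the smaller one).
        mag = min(abs(x) / scale, E2M1_VALUES[-1])
        nearest = min(E2M1_VALUES, key=lambda v: abs(v - mag))
        elems.append(math.copysign(nearest, x))
    return shared_exp, elems


def dequantize_block(shared_exp, elems):
    """Multiply every element by the block's shared power-of-two scale."""
    scale = 2.0 ** shared_exp
    return [e * scale for e in elems]


def bits_per_weight():
    """Average storage per weight, including the amortized shared scale."""
    return ELEM_BITS + SCALE_BITS / BLOCK_SIZE


def size_reduction_vs(baseline_bits=16):
    """Fraction of storage saved relative to a baseline bit width."""
    return 1.0 - bits_per_weight() / baseline_bits
```

Under these assumptions each weight costs 4.25 bits on average, so `size_reduction_vs(16)` comes out a little over 73% against a 16-bit format, which is in the same range as the "up to 75%" figure above.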
