OpenAI has adopted MXFP4, a micro-scaling block floating-point data type that can cut inference costs by up to 75% by making models smaller and faster to run. The 4-bit format lets large language models (LLMs) run on far less hardware, potentially changing how AI models are deployed across platforms. By shipping models in MXFP4, OpenAI has effectively set a new standard for model quantization in the industry.
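To make the efficiency claim concrete, here is a minimal sketch of the block-quantization idea behind MXFP4, based on the OCP Microscaling spec: elements are grouped into small blocks that share one power-of-two scale, and each element is stored as a 4-bit E2M1 float (representable magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6). This is an illustrative reference implementation, not OpenAI's production kernel; the function names and the scale-selection heuristic are assumptions.

```python
import math

# Magnitudes representable in FP4 E2M1 (1 sign, 2 exponent, 1 mantissa bits).
FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
# Full signed grid of 4-bit values.
FP4_GRID = sorted({s * v for v in FP4_MAGNITUDES for s in (1.0, -1.0)})


def quantize_block(block):
    """Quantize one block of floats MXFP4-style: pick a shared power-of-two
    scale, then snap each scaled element to the nearest FP4 grid point.
    Returns (scale_exponent, dequantized_values). Illustrative sketch only."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 0, [0.0] * len(block)
    # Choose a power-of-two scale so the largest element lands near FP4's
    # maximum magnitude (6.0); floor(log2(6.0)) == 2. (Heuristic assumption.)
    exp = math.floor(math.log2(amax)) - 2
    scale = 2.0 ** exp
    deq = []
    for x in block:
        # Round-to-nearest onto the FP4 grid, then rescale back.
        q = min(FP4_GRID, key=lambda g: abs(x / scale - g))
        deq.append(q * scale)
    return exp, deq
```

The storage win is apparent from the sketch: each element needs only 4 bits plus one shared 8-bit scale per block, versus 16 or 32 bits per element in half or single precision.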
Tags: openai, mxfp4, inference, quantization, ai-models