Perplexity evaluates OpenAI's newly released open-weight models, gpt-oss-20b and gpt-oss-120b, focusing on how they run on NVIDIA H200 GPUs. The article discusses the infrastructure decisions, kernel modifications, and performance optimizations Perplexity made to integrate these models efficiently into its inference engine, ROSE.
gpt-oss
openai
inference-engine
performance
nvidia