KServe v0.15 has been released, enhancing capabilities for serving generative AI models, including support for large language models (LLMs) and advanced caching mechanisms. Key features include integration with Envoy AI Gateway, multi-node inference, and autoscaling with KEDA, aimed at improving performance and scalability for AI workloads. The update also introduces a dedicated documentation section for generative AI and various performance optimizations.
+ kserve
generative-ai ✓
kubernetes ✓
llm ✓
autoscaling ✓