Click any tag below to further narrow down your results
Links
Kthena is a new system tailored for Kubernetes that optimizes the routing, orchestration, and scheduling of Large Language Model (LLM) inference. It addresses key challenges like resource utilization and latency, offering features such as intelligent routing and production-grade orchestration. This sub-project of Volcano enhances support for AI lifecycle management.