Links
CAST AI's report reveals that organizations waste significant cloud resources, using only 13% of provisioned CPUs and 20% of memory in Kubernetes clusters. The study highlights overprovisioning and low utilization of spot instances as key factors. It calls for AI-driven solutions to optimize resource management amid rising cloud costs.
The article discusses recent advancements in Kubernetes GPU management, focusing on dynamic resource allocation (DRA) and a new workload abstraction. DRA allows for more flexible GPU requests, while the workload abstraction aims to improve scheduling for complex AI deployments.
Google introduced Agent Sandbox, a new feature for Kubernetes that enhances security and performance for AI agents. It allows rapid provisioning of isolated environments for executing agent tasks, optimizing resource use while maintaining strong operational guardrails. GKE users can also leverage Pod Snapshots for faster start-up times.
This article offers a checklist to help platform engineers and SREs secure cloud and container workloads. It emphasizes the need for updated strategies in light of expanding attack surfaces and the integration of AI. The checklist covers asset inventory, vulnerability assessment, and compliance monitoring.
This article discusses Google's latest advancements in Google Kubernetes Engine (GKE) as it marks its 10th anniversary. Key updates include the introduction of Agent Sandbox for AI workloads, enhancements to autoscaling, and new compute classes to improve efficiency and performance across various workloads.
This article explains how to multiplex MCP servers to give AI agents access to specialized tools for specific tasks. It highlights the need for agents to interact with multiple servers simultaneously to enhance their capabilities, particularly in enterprise environments. The post also includes deployment instructions for two example servers: one for math functions and another for retrieving the current date.
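As a rough sketch of what deploying one such server involves (not the article's actual manifests — the image name, port, and labels below are placeholders), a math MCP server could be exposed inside the cluster with a Deployment and Service:

```yaml
# Hypothetical manifest for the example math MCP server; image and names are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mcp-math
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mcp-math
  template:
    metadata:
      labels:
        app: mcp-math
    spec:
      containers:
        - name: server
          image: example.com/mcp-math:latest   # placeholder image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: mcp-math
spec:
  selector:
    app: mcp-math
  ports:
    - port: 8080
      targetPort: 8080
```

A second server (e.g. the date server) would get an analogous pair, and the multiplexing layer would route agent tool calls to the appropriate Service.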
The CNCF Technical Oversight Committee has approved KServe as an incubating project, recognizing its role as a scalable AI inference platform on Kubernetes. Originally developed under Kubeflow, KServe supports generative and predictive AI workloads and has seen broad adoption across various industries.
Industry experts predict significant changes in Kubernetes networking by 2026, focusing on the integration of VMs and containers, improved user experiences with KubeVirt, and the emergence of specialized roles like the Kubernetworker. The increasing demand for AI workloads will drive innovations in network management and microsegmentation strategies.
This article discusses the security challenges of deploying AI and machine learning workloads on Oracle Kubernetes Engine and Oracle Cloud Infrastructure. It highlights the shared responsibility model for security and outlines strategies for protecting against evolving threats, including runtime detection and posture management.
SkyPilot is a platform that allows AI teams to run and manage workloads across various infrastructures like Kubernetes and cloud services. It offers an easy interface for job management, resource provisioning, and cost optimization, supporting multiple hardware configurations without code changes.
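To illustrate the interface, SkyPilot tasks are declared in YAML and launched with `sky launch`; the accelerator type and scripts below are illustrative, not taken from the article:

```yaml
# task.yaml — illustrative SkyPilot task; accelerator and scripts are placeholders
resources:
  accelerators: A100:1   # request one A100 on whichever backing infra is available
setup: |
  pip install -r requirements.txt
run: |
  python train.py
```

Running `sky launch task.yaml` asks SkyPilot to provision matching resources across the configured clouds or Kubernetes clusters without changes to the training code itself.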
HolmesGPT is an open-source AI tool designed to streamline troubleshooting in Kubernetes environments. It aggregates logs, metrics, and traces, helping on-call engineers diagnose issues faster by providing clear, actionable insights. The tool is extensible and community-driven, promoting collaboration in observability practices.
Google Cloud successfully tested a 130,000-node Kubernetes cluster, doubling the previous limit. The article details the architectural innovations that enable this scale and the implications for AI workloads, including advanced job scheduling and optimized storage solutions.
The OpenCost project outlined its achievements in 2025 and plans for 2026, including 11 releases that improved usability and multi-cloud cost tracking. Key advancements include an AI-ready MCP server for real-time cost analysis and ongoing community mentorship efforts. Future goals focus on tracking machine-learning workloads and enhancing supply chain security.
Amazon EKS has announced support for ultra-scale clusters with up to 100,000 nodes, enabling significant advancements in artificial intelligence and machine learning workloads. The enhancements include architectural improvements and optimizations in the etcd data store, API servers, and overall cluster management, allowing for better performance, scalability, and reliability for AI/ML applications.
Google Kubernetes Engine (GKE) celebrates its 10th anniversary with the launch of an ebook detailing its evolution and impact on businesses. Highlighting customer success stories, including Signify and Niantic, the article emphasizes GKE's role in facilitating scalable cloud-native AI solutions while allowing teams to focus on innovation rather than infrastructure management.
Ark is a Kubernetes-based runtime environment designed for hosting AI agents, allowing teams to efficiently build agentic applications. It is currently in technical preview, encouraging community feedback to refine its features and functionality. Users need to set up a Kubernetes cluster and install necessary tools to get started with Ark.
This guide shows how Google Kubernetes Engine (GKE) can accelerate AI innovation by managing containers effectively, improving performance while reducing operational complexity. It emphasizes optimizing costs and scalability, helping technology leaders overcome common AI deployment challenges and achieve significant returns on investment.
Docker Desktop 4.43 introduces significant updates aimed at enhancing the development and management of AI models and MCP tools, including improved model management features, expanded OpenAI API support, and enhanced integration with GitHub and VS Code. The release also extends the MCP Catalog, letting users submit their own servers and use secure OAuth authentication, and upgrades the performance of Docker's AI agent, Gordon, which now supports multi-threaded conversations. Additionally, the Compose Bridge feature facilitates easy conversion of local configurations to Kubernetes setups.
Mastercard leverages Kubernetes to power its AI Workbench, enhancing secure innovation in its services. By utilizing Kubernetes' scalability and flexibility, Mastercard aims to accelerate the development of AI and machine learning applications, ensuring robust security measures are in place throughout the process. The integration of this technology demonstrates Mastercard's commitment to harnessing advanced solutions for improved customer experiences.
Rafay offers an infrastructure orchestration layer tailored for enterprise AI workloads and Kubernetes management, aiming to alleviate the complexities and costs of traditional infrastructure. The platform enhances GPU and CPU management, providing a secure and efficient environment for innovation in AI development. Analyst insights from a dedicated eBook highlight the advantages of GPU Clouds for accelerating AI application deployment.
kubectl-ai provides an intelligent interface that simplifies Kubernetes management by translating user intent into specific commands. It supports various AI models and offers multiple installation methods, including via krew, Docker, and direct downloads, allowing users to interact with Kubernetes more efficiently through natural language queries. The tool also allows for customization and configuration to enhance user experience and functionality.
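An interaction might look like the following session; the exact prompt phrasing and output shape here are illustrative rather than taken from the tool's documentation:

```
$ kubectl-ai "how many pods are running in the default namespace?"
```

The tool translates the natural-language query into the corresponding kubectl operations against the current cluster context.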
Google Kubernetes Engine (GKE) is enhancing its capabilities to support AI workloads, with new features like Cluster Director for managing large clusters, GKE Inference Quickstart for simplifying AI model deployment, and GKE Autopilot for optimizing resource usage. These advancements aim to empower platform teams to efficiently scale and manage AI applications without needing to overhaul their existing Kubernetes investments.
Running AI workloads on Kubernetes presents unique networking and security challenges that require careful attention to protect sensitive data and maintain operational integrity. By implementing well-known security best practices, like securing API endpoints, controlling traffic with network policies, and enhancing observability, developers can mitigate risks and establish a robust security posture for their AI projects.
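A minimal sketch of the network-policy point, assuming an inference workload labeled `app: inference` that should accept traffic only from a gateway labeled `app: gateway` (all names here are hypothetical):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: inference-ingress
spec:
  podSelector:
    matchLabels:
      app: inference   # hypothetical label for the AI inference pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: gateway   # only the gateway may reach the model endpoint
      ports:
        - protocol: TCP
          port: 8080
```

Because the policy selects the inference pods, all other ingress to them is denied by default once it is applied.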
Amazon Web Services has launched AI on EKS, an open source initiative aimed at simplifying the deployment and scaling of AI/ML workloads on Amazon Elastic Kubernetes Service. This project provides deployment-ready blueprints, Terraform templates, and best practices to optimize infrastructure for large language models and other AI tasks, while separating it from the previously established Data on EKS initiative to enhance focus and maintainability.