5 links tagged with all of: infrastructure + observability
Links
New Relic built Weather Station, an internal system that runs over 100,000 connectivity checks per hour across its multi-cloud infrastructure. By continuously validating network paths, it lets engineers detect and diagnose network issues faster.
The article discusses how the roles of infrastructure and observability teams are merging as companies increasingly build observability into their offerings. It highlights key acquisitions and the growing importance of AI in incident response, and advocates an open-standards approach to managing data, built on OpenTelemetry and Apache Iceberg.
Modern infrastructure complexity calls for advanced observability tooling built on cost-effective storage, standardized data collection with OpenTelemetry, and machine learning and AI for better insight and efficiency. The article describes a shift driven by the need for high-fidelity data, seamless signal correlation, and intelligent alert management as systems scale. It argues that successful observability will depend on these innovations to keep operations effective in increasingly intricate environments.
Character.AI has replaced its fragmented logging system with a centralized one, significantly improving query speeds and giving developers real-time visibility. The new system captures logs selectively and adds features such as live tailing and keyword search. The company also aims to unify metrics to enhance observability and support future growth.
k0rdent v1.0.0 has been released, with enhanced features for managing distributed infrastructure at scale using Kubernetes. The release focuses on unified observability, cost optimization, and improved operational capabilities through the k0rdent Cluster Manager and the Observability & FinOps components, and it offers production-grade stability and advanced service management. Highlights include automated IP management, multi-cluster support, and integration with popular observability tools for resource tracking and financial accountability.