Links
AI Observer is a self-hosted observability backend that monitors local AI coding assistants like Claude Code and Codex CLI. It tracks metrics such as token usage, API latency, and error rates through a real-time dashboard, keeping all data local without third-party services. Users can import historical session data and export telemetry in various formats.
This article discusses how an organization streamlined its observability across multiple cloud platforms using OpenTelemetry. By consolidating various tools into a single framework, they improved visibility, reduced resolution times, and minimized vendor lock-in. The approach emphasizes the importance of standardized instrumentation for better monitoring and analysis.
This article introduces "OpenTelemetry For Dummies," a guide that clarifies observability in modern applications. It covers how to set up OpenTelemetry, interpret key telemetry signals, and implement best practices for effective monitoring.
The article discusses the evolution of OpenTelemetry and the challenges organizations face as they move past the initial excitement phase. It outlines specific issues such as managing telemetry costs, collecting high-quality data, and the need for improved tools and practices in observability. The author shares her wishlist for enhancements in OpenTelemetry by 2026.
The article discusses the current state of OpenTelemetry, noting its growing adoption alongside significant hurdles, especially in Rust support and Prometheus integration. It covers implementation complexity, issues with semantic conventions, and the difficulty of adopting OpenTelemetry alongside existing Prometheus setups.
This article compares the OpenTelemetry Collector and agent, outlining their roles in telemetry data collection. The Collector centralizes data management, while the agent focuses on local data capture with minimal overhead. Choosing between them depends on your system's needs for scalability and performance.
This article outlines Grafana Labs' key achievements in 2025, including the launch of Grafana 12 and the introduction of the AI-powered Grafana Assistant. It also discusses significant milestones in open source projects and the expansion of Grafana's community efforts, particularly in Japan.
Grafana Alloy, the OpenTelemetry Collector distribution launched a year ago, has seen significant adoption and development, now supporting over 525,000 active instances. The article highlights Alloy's unique capabilities, including native pipelines for both OpenTelemetry and Prometheus, live debugging features, and Fleet Management for centralized control in Grafana Cloud. Future enhancements are focused on aligning with OpenTelemetry standards and improving user experience for debugging and configuration.
Effective cross-agent communication in agentic AI applications, particularly those built on Amazon Bedrock, relies on standardized telemetry and observability practices. By implementing OpenTelemetry solutions and monitoring mechanisms, organizations can enhance AI agent performance, ensure compliance, and streamline debugging processes. Best practices for observability, including secure communication and continuous feedback, are essential for optimizing the functionality of AI agents at scale.
The article discusses how to monitor agentic AI applications using Amazon CloudWatch, highlighting the importance of observability for ensuring reliability and performance. It details the setup of a sample Weather Forecaster application built with Strands Agents SDK, which utilizes CloudWatch to collect telemetry data, including metrics, traces, and logs, for comprehensive analysis. Additionally, it provides a step-by-step guide for deploying the application and analyzing the generated telemetry data in the CloudWatch console.
OpenTelemetry is an open-source observability framework designed to provide a standardized way to collect, process, and export telemetry data such as traces, metrics, and logs. It aims to help developers and organizations gain insights into their systems' performance and behavior, facilitating better monitoring and troubleshooting. By integrating with various backend systems, OpenTelemetry enhances observability across diverse environments and applications.
New Relic has announced the general availability of Fleet Control and Agent Control, designed to streamline the management of observability agents across various IT environments. This unified control plane aims to reduce operational overhead and security risks by automating the observability lifecycle, enabling consistent configurations and efficient deployments from a single interface. The platform also supports both Kubernetes clusters and host-based environments, enhancing its capabilities for enterprise-scale observability management.
HyperDX is a powerful tool integrated with ClickStack that enables engineers to efficiently search and visualize logs, metrics, and traces on any ClickHouse cluster. It supports full-text search, alert setup, and real-time logging, while also offering compatibility with OpenTelemetry for various programming languages. The platform aims to simplify observability and improve the debugging process for production issues.
The article critiques the current state of observability in tech, highlighting confusion around metrics, logs, and traces, largely attributed to OpenTelemetry's complex presentation. It advocates for the use of "Wide Events," as exemplified by Meta's Scuba system, which simplifies data collection and analysis, enabling deeper insights into system performance without the need for extensive terminology.