TL;DR: OpenTelemetry defines a vendor-neutral way to instrument applications and export metrics, traces, and logs. For AI services, one API can capture request latency, token counts, model calls, cache hits, and calls to downstream dependencies.
What it is
OpenTelemetry is a CNCF project that standardizes instrumentation. It gives you SDKs, auto-instrumentation, the OpenTelemetry Collector, and semantic conventions so telemetry data looks the same across languages and backends.
Why it exists
AI platforms often split observability across multiple tools. OTel reduces that fragmentation by giving your services one instrumentation layer that can export to Prometheus, Jaeger, Grafana-stack backends, or any vendor backend.
How it works
Applications create spans and metrics through OTel SDKs. A Collector receives the data, processes or batches it, then exports it to backends. That makes it easy to add traces to model inference chains or measure token usage without vendor lock-in.
Key features
- Traces, metrics, logs under one API.
- Collector pipeline for transform and export.
- Semantic conventions for consistent naming.
- Language SDKs for most production stacks.
Quick start
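```python
from opentelemetry import trace

# Without a configured SDK, the API hands back a no-op tracer, so this runs as-is.
tracer = trace.get_tracer(__name__)

with tracer.start_as_current_span("llm.request"):
    ...  # the traced work goes here
```

When to use, when to skip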
Use it whenever you need portable instrumentation for AI services. Skip it if you're only collecting one metric and don't need traces or logs.
vs / alongside
| Tool | Role | Note |
|---|---|---|
| OpenTelemetry | Instrumentation standard | Vendor-neutral |
| Prometheus | Metrics store | Scrapes and stores metrics |
| Grafana | Visualization | Reads traces/metrics |
| Langfuse | LLM observability | Higher-level AI telemetry |
References
- OpenTelemetry — project home.
- OTel docs — SDKs and collector.
- open-telemetry — source org.
Verified against OpenTelemetry docs, May 2026.