Introduction
If you operate microservices in production, you already know the pain. A user reports a slow checkout. You open three different dashboards — Grafana for metrics, Jaeger for traces, and grep for logs. By the time you correlate the request ID across all three, the incident has been open for 45 minutes.
OpenTelemetry (OTel) solves this by unifying all three signals under one standard. It is now the CNCF's second-most active project after Kubernetes, and every major observability vendor — Datadog, Honeycomb, Grafana Labs, New Relic — has adopted its protocol. In 2026, if you are not instrumenting with OpenTelemetry, you are building technical debt every time you ship code.
This tutorial walks you through a complete OpenTelemetry setup: instrumentation with the OTel SDK, collector configuration, and exporting traces and metrics to Jaeger and Prometheus. Everything is hands-on with real YAML and code snippets you can run today.
By the end, you will have:
- A Python service auto-instrumented with traces and metrics
- An OpenTelemetry Collector processing and exporting telemetry
- Traces visible in Jaeger and metrics scraped by Prometheus
- A working mental model of OTel's pipeline architecture
What Is OpenTelemetry, Actually?
OpenTelemetry is not a backend. It is not a database, a dashboard, or an alerting engine. It is a telemetry pipeline standard — a specification, a set of SDKs, and a collector binary that together generate, process, and export traces, metrics, and logs.
The project emerged from the 2019 merger of OpenTracing and OpenCensus. OpenTracing was a CNCF project and OpenCensus originated at Google; the two had overlapping goals. Rather than compete, they merged into a single standard. Today, OTel is at version 1.34+ and is considered stable for traces and metrics.
Three things make OpenTelemetry different from what came before:
- Vendor-neutral instrumentation. You instrument once with the OTel SDK. Changing backends (from Jaeger to Honeycomb, or from Prometheus to Datadog) means changing an exporter config, not rewriting code.
- The Collector. A standalone binary that receives, processes, and exports telemetry. You can run it as a sidecar, a daemonset, or a central gateway. It handles batching, filtering, sampling, and routing, all config-driven.
- Context propagation. The traceparent header (W3C Trace Context standard) passes trace context across HTTP, gRPC, and message queues. Every hop in your distributed system links back to a single root span without custom headers.
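To make the traceparent header concrete, here is a minimal sketch of building and parsing a version-00 value using only the standard library. The function names are illustrative, not part of any OTel SDK; in practice the SDK's propagator does this for you.

```python
import re
import secrets
from typing import Optional

# W3C Trace Context, version 00:
#   00-<32 hex chars trace ID>-<16 hex chars parent span ID>-<2 hex chars flags>
_TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def build_traceparent(trace_id: str, span_id: str, sampled: bool = True) -> str:
    """Format a version-00 traceparent header value."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"


def parse_traceparent(value: str) -> Optional[dict]:
    """Return the header's fields, or None if the value is malformed."""
    match = _TRACEPARENT_RE.match(value.strip().lower())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    # All-zero IDs are invalid according to the specification
    if set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 0x01),
    }


def new_span_id() -> str:
    """Generate a random 8-byte span ID as 16 hex characters."""
    return secrets.token_hex(8)
```

A downstream service keeps the trace ID, generates its own span ID, and records the incoming parent ID as its parent. That is all it takes for two processes to land in the same trace.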
The telemetry pipeline looks like this:
Application Code --> OTel SDK --> OTel Collector --> Backend (Jaeger/Prometheus/...)
(API calls) (auto/manual) (process/route) (store/query)
The SDK generates spans and metrics inside your application process. The Collector — a separate binary — receives them via OTLP (OpenTelemetry Protocol) over gRPC or HTTP, then applies processors and exports to one or more backends.
This separation matters. Your application never talks directly to Jaeger or Prometheus. It only talks to the Collector. The Collector absorbs backend changes without touching application code.
OpenTelemetry Architecture: The Pipeline Model
Every observability signal in OTel follows the same pipeline: Instrumentation → Processing → Export.
The Three Components
1. Instrumentation Libraries (SDK)
The SDK lives inside your application process. It creates spans, records metrics, and captures log events. OTel provides SDKs for Python, Go, Java, JavaScript, .NET, Rust, and more. You can use auto-instrumentation (zero code changes: the agent injects hooks at runtime) or manual instrumentation (you open and close spans explicitly in your code, for example with tracer.start_as_current_span() in Python).
Auto-instrumentation covers most common libraries by default: HTTP frameworks (Flask, Express, Spring), database drivers (psycopg2, pgx, JDBC), and gRPC clients. For custom business logic, you add manual spans.
2. The OpenTelemetry Collector
The Collector is the backbone of any production OTel deployment. It is a single Go binary (otelcol-contrib) that runs three types of components in a pipeline:
- Receivers: Accept telemetry data (OTLP gRPC, OTLP HTTP, Jaeger, Zipkin, Prometheus scrape)
- Processors: Transform data in-flight (batch, filter, tail sampling, attributes mutation, redaction)
- Exporters: Send data to backends (Jaeger, Prometheus, Datadog, Honeycomb, Kafka, stdout)
The Collector decouples your application from backends. If you switch from Jaeger to Tempo, or add a second exporter for Honeycomb, you change one YAML file — not every microservice.
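The receiver, processor, exporter flow can be pictured as a simple function chain. The following is a toy model in plain Python, not the Collector's actual implementation: spans are plain dicts, and the processor names and the "GET /healthz" span name are made up for illustration.

```python
from typing import Callable, Dict, List

Span = Dict[str, object]  # simplified: a span is just a dict of fields
Processor = Callable[[List[Span]], List[Span]]
Exporter = Callable[[List[Span]], None]


def run_pipeline(batch: List[Span],
                 processors: List[Processor],
                 exporters: List[Exporter]) -> List[Span]:
    """Apply processors in order, then fan the result out to every exporter."""
    for process in processors:
        batch = process(batch)
    for export in exporters:
        export(list(batch))  # each exporter receives its own copy of the list
    return batch


def drop_health_checks(batch: List[Span]) -> List[Span]:
    """Filter processor: discard noisy health-check spans."""
    return [span for span in batch if span.get("name") != "GET /healthz"]


def add_environment(batch: List[Span]) -> List[Span]:
    """Attribute processor: stamp every span with an environment attribute."""
    return [{**span, "environment": "production"} for span in batch]
```

Processor order matters in this model just as it does in the Collector's pipeline definition: filtering first means later processors and every exporter handle less data.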
3. Exporters and Backends
Exporters are protocol-specific components that push data to observability backends. Common exporters include:
| Exporter | Protocol | Typical Backend |
|---|---|---|
| otlp | gRPC/HTTP | Any OTLP-compatible backend (Jaeger, Tempo, Grafana Agent) |
| prometheus | HTTP scrape | Prometheus server |
| jaeger | Thrift/gRPC | Jaeger backend (legacy; removed from recent Collector releases, use otlp instead) |
| logging | stdout | Debugging during development (superseded by the debug exporter in newer releases) |
| kafka | Kafka | Long-term buffering, multi-datacenter pipelines |
The OTLP Protocol
All communication between the SDK and the Collector uses OTLP (OpenTelemetry Protocol). OTLP is a Protobuf-based protocol that runs over gRPC (port 4317) or HTTP/1.1 (port 4318). In 2026, OTLP over HTTP has matured enough that many teams prefer it over gRPC for simpler firewall traversal and load balancer compatibility.
A typical OTLP trace payload is a binary-encoded Protobuf message containing resource attributes (service name, host, namespace), span data (trace ID, span ID, parent span ID, start/end timestamps, attributes, events), and instrumentation scope.
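The nesting described above is easier to see in OTLP's JSON encoding, which mirrors the Protobuf structure field for field. This sketch only builds the payload; it does not send it. The scope name "manual-example" and the helper names are assumptions for illustration. If you wanted to try it against a Collector, the OTLP/HTTP trace path is /v1/traces on port 4318 with Content-Type application/json.

```python
import json
import time


def attr(key: str, value: object) -> dict:
    """Encode one attribute as an OTLP key/value pair (string values only here)."""
    return {"key": key, "value": {"stringValue": str(value)}}


def build_trace_payload(service_name: str, span_name: str, trace_id: str,
                        span_id: str, parent_span_id: str = "",
                        duration_ms: int = 150) -> dict:
    """Build a one-span OTLP/JSON export request as a plain dict."""
    end_ns = time.time_ns()
    start_ns = end_ns - duration_ms * 1_000_000
    span = {
        "traceId": trace_id,            # 32 hex characters
        "spanId": span_id,              # 16 hex characters
        "parentSpanId": parent_span_id, # empty for a root span
        "name": span_name,
        "kind": 2,                      # SPAN_KIND_SERVER
        "startTimeUnixNano": str(start_ns),
        "endTimeUnixNano": str(end_ns),
        "attributes": [attr("http.method", "POST")],
    }
    return {
        "resourceSpans": [{
            # Resource attributes describe the emitting process
            "resource": {"attributes": [attr("service.name", service_name)]},
            "scopeSpans": [{
                # Instrumentation scope names the library that made the span
                "scope": {"name": "manual-example"},
                "spans": [span],
            }],
        }]
    }


payload = build_trace_payload(
    "checkout-service", "POST /checkout",
    trace_id="6e8f4c7a1b2d3e4f5a6b7c8d9e0f1a2b", span_id="3a4b5c6d7e8f9a0b",
)
body = json.dumps(payload)
```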
Setting Up the OpenTelemetry Collector
Let's start with the Collector — it is the first piece you deploy because your applications need somewhere to send telemetry.
Step 1: Install the Collector
The recommended distribution is otelcol-contrib, which includes receivers and exporters for every major observability tool:
# Linux (AMD64)
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/download/v0.110.0/otelcol-contrib_0.110.0_linux_amd64.tar.gz
tar -xzf otelcol-contrib_0.110.0_linux_amd64.tar.gz
sudo mv otelcol-contrib /usr/local/bin/
# Verify
otelcol-contrib --version
For Docker-based development:
docker run -d --name otel-collector \
  -p 4317:4317 -p 4318:4318 -p 8888:8888 -p 8889:8889 \
  -v $(pwd)/otel-config.yaml:/etc/otelcol-contrib/config.yaml \
  otel/opentelemetry-collector-contrib:0.110.0
Step 2: Write the Collector Configuration
Create otel-config.yaml. This is the heart of your observability pipeline:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  batch:
    timeout: 5s
    send_batch_size: 512
  memory_limiter:
    check_interval: 1s
    limit_mib: 512
  attributes:
    actions:
      - key: environment
        value: production
        action: upsert
exporters:
  # The dedicated jaeger exporter is no longer shipped in otelcol-contrib.
  # Jaeger ingests OTLP natively, so a named otlp exporter is used instead.
  otlp/jaeger:
    endpoint: jaeger-collector:4317
    tls:
      insecure: true
  prometheus:
    endpoint: 0.0.0.0:8889
    namespace: otel
  logging:
    loglevel: debug
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch, attributes]
      exporters: [otlp/jaeger, logging]
    metrics:
      receivers: [otlp]
      processors: [memory_limiter, batch]
      exporters: [prometheus, logging]
This configuration does several things:
- Receivers listen on ports 4317 (gRPC) and 4318 (HTTP) for OTLP data from applications
- Processors batch spans for efficiency, limit memory usage to 512 MiB, and add an environment=production attribute to every span
- Exporters forward traces to Jaeger, expose metrics on port 8889 for Prometheus scraping, and log debug output to stdout
- Pipelines wire everything together — traces and metrics take different paths through the same Collector
The batch processor is critical for production. Without it, the Collector forwards each incoming payload to the backend as it arrives, however small, which multiplies export requests and network overhead. Batching amortizes that cost across roughly 512 spans per request.
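The two batch settings interact in a simple way: a batch is flushed when it reaches send_batch_size or when timeout has passed since its first span, whichever comes first. Here is a simplified model of that rule in plain Python. It is a sketch under that assumption, not the Collector's code, and the clock is passed in explicitly so the behavior is easy to follow.

```python
class Batcher:
    """Toy size-or-timeout batcher, mirroring the batch processor's two triggers."""

    def __init__(self, send_batch_size: int = 512, timeout_s: float = 5.0):
        self.send_batch_size = send_batch_size
        self.timeout_s = timeout_s
        self._pending = []      # spans waiting to be flushed
        self._oldest = None     # arrival time of the first pending span
        self.flushed = []       # list of batches that were "exported"

    def add(self, span, now: float) -> None:
        if not self._pending:
            self._oldest = now
        self._pending.append(span)
        # Size trigger: flush as soon as the batch is full
        if len(self._pending) >= self.send_batch_size:
            self._flush()

    def tick(self, now: float) -> None:
        # Time trigger: flush a partial batch once it has waited long enough
        if self._pending and now - self._oldest >= self.timeout_s:
            self._flush()

    def _flush(self) -> None:
        self.flushed.append(self._pending)
        self._pending = []
        self._oldest = None
```

Under heavy traffic the size trigger dominates and exports stay large; under light traffic the timeout bounds how long a span can sit in the Collector before it is visible in the backend.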
Step 3: Run the Collector
otelcol-contrib --config=otel-config.yaml
You should see log output confirming that all receivers, processors, and exporters are active. The Collector is now ready to receive telemetry from your applications.
Instrumenting Your First Application
Now that the Collector is running, let's instrument a Python web service. We will use Flask for the HTTP layer and the OpenTelemetry Python SDK for auto-instrumentation, then add manual spans for custom business logic.
Step 1: Install Dependencies
pip install flask opentelemetry-api opentelemetry-sdk \
opentelemetry-instrumentation-flask \
opentelemetry-instrumentation-requests \
opentelemetry-exporter-otlp-proto-grpc
The key packages:
- opentelemetry-api and opentelemetry-sdk: the core OTel SDK
- opentelemetry-instrumentation-flask: auto-instrumentation for Flask (creates spans for each HTTP request automatically)
- opentelemetry-instrumentation-requests: auto-instrumentation for the requests library (spans for outbound HTTP calls)
- opentelemetry-exporter-otlp-proto-grpc: the OTLP exporter that sends data to our Collector
Step 2: Write the Application
# app.py
from flask import Flask, request, jsonify
import requests
import time
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.resources import Resource, SERVICE_NAME
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.instrumentation.flask import FlaskInstrumentor
from opentelemetry.instrumentation.requests import RequestsInstrumentor

# --- OTel Setup ---
resource = Resource(attributes={
    SERVICE_NAME: "checkout-service",
    "deployment.environment": "staging"
})
provider = TracerProvider(resource=resource)
otlp_exporter = OTLPSpanExporter(
    endpoint="http://localhost:4317",
    insecure=True
)
provider.add_span_processor(BatchSpanProcessor(otlp_exporter))
trace.set_tracer_provider(provider)

# --- Application ---
app = Flask(__name__)
# Auto-instrument Flask and outgoing HTTP requests
FlaskInstrumentor().instrument_app(app)
RequestsInstrumentor().instrument()
# Get a tracer for manual instrumentation
tracer = trace.get_tracer(__name__)


@app.route("/checkout", methods=["POST"])
def checkout():
    """Process a checkout; the root span is created automatically by FlaskInstrumentor."""
    data = request.get_json()
    # Manual span for the payment processing step
    with tracer.start_as_current_span("process_payment") as span:
        span.set_attribute("payment.amount", data.get("amount", 0))
        span.set_attribute("payment.method", data.get("method", "unknown"))
        # Simulate payment work
        time.sleep(0.15)
        payment_result = process_payment(data.get("amount", 0))
        span.set_attribute("payment.status", payment_result["status"])
        span.set_status(trace.Status(trace.StatusCode.OK))
    # Manual span for inventory update
    with tracer.start_as_current_span("update_inventory") as span:
        span.set_attribute("inventory.items", len(data.get("items", [])))
        time.sleep(0.08)
        # Outbound HTTP call, automatically traced by RequestsInstrumentor
        try:
            resp = requests.post(
                "http://inventory-service:5001/update",
                json={"items": data.get("items", [])},
                timeout=2,
            )
            span.set_attribute("inventory.response_code", resp.status_code)
        except requests.RequestException as exc:
            # No inventory service running locally? Record the failure on the
            # span instead of turning the whole checkout into a 500.
            span.record_exception(exc)
            span.set_status(trace.Status(trace.StatusCode.ERROR, str(exc)))
    return jsonify({"status": "ok", "order_id": "ord-2026-abc"})


def process_payment(amount: float) -> dict:
    """Simulated payment gateway call."""
    return {"status": "authorized", "transaction_id": "txn-42", "amount": amount}


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
What This Code Does
Every /checkout request now generates a trace with multiple spans:
- Root span: created automatically by FlaskInstrumentor for the HTTP request
- process_payment: manual span wrapping payment logic, with custom attributes (amount, method, status)
- update_inventory: manual span wrapping inventory logic
- HTTP POST to inventory-service: nested span created by RequestsInstrumentor, linked to the parent update_inventory span
Context propagation is automatic. When the /checkout handler calls requests.post(...), the OTel SDK injects the traceparent header into the outbound HTTP request. If the inventory service is also instrumented with OTel, it extracts that header and continues the same trace — creating a single distributed trace across both services.
Step 3: Run and Verify
# Terminal 1: Start the Collector (if not already running)
otelcol-contrib --config=otel-config.yaml
# Terminal 2: Start the Flask app
python app.py
# Terminal 3: Generate a trace
curl -X POST http://localhost:5000/checkout \
-H "Content-Type: application/json" \
-d '{"amount": 49.99, "method": "card", "items": [{"id": 1}, {"id": 2}]}'
Check the Collector's debug log output. You should see spans being received, processed, and exported. The logging exporter will print span summaries to stdout — useful for debugging before you wire up Jaeger.
Look for lines like:
Span #0
Trace ID : 6e8f4c7a1b2d3e4f5a6b7c8d9e0f1a2b
Parent ID :
ID : 3a4b5c6d7e8f9a0b
Name : POST /checkout
Kind : Server
...
Span #1
Trace ID : 6e8f4c7a1b2d3e4f5a6b7c8d9e0f1a2b
Parent ID : 3a4b5c6d7e8f9a0b
ID : 1b2c3d4e5f6a7b8c
Name : process_payment
...
The shared Trace ID across both spans confirms that context propagation is working — both spans belong to the same distributed trace.
Exporting Traces to Jaeger
The logging exporter is useful for debugging, but you need a real trace backend. Let's set up Jaeger and configure the Collector to forward traces.
Step 1: Run Jaeger All-in-One
For development, Jaeger's all-in-one image bundles the collector, query UI, and in-memory storage:
docker run -d --name jaeger \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 16686:16686 \
  -p 14317:4317 \
  jaegertracing/all-in-one:1.62
- Port 16686: Jaeger UI (open http://localhost:16686)
- Port 4317 inside the container: OTLP gRPC receiver (Jaeger can accept OTLP directly as of 1.35+). It is published on host port 14317 here, an arbitrary free port, because our Collector already occupies 4317 on the host.
Applications could send OTLP straight to Jaeger, but routing through our Collector is the production pattern. Update the Collector config to point at Jaeger. This assumes the Collector runs as a binary on the same host; from inside a Docker container, localhost would not reach Jaeger.
# otel-config.yaml (exporter section update)
exporters:
  otlp/jaeger:
    endpoint: localhost:14317
    tls:
      insecure: true
  # ... keep the other exporters
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [memory_limiter, batch, attributes]
      exporters: [otlp/jaeger, logging]
Step 2: Generate Traces and Inspect
Send a few checkout requests:
for i in $(seq 1 5); do
curl -s -X POST http://localhost:5000/checkout \
-H "Content-Type: application/json" \
-d '{"amount": 49.99, "method": "card", "items": [{"id": 1}]}' > /dev/null
done
Open Jaeger UI at http://localhost:16686:
- Select checkout-service from the Service dropdown
- Click Find Traces
- You should see 5 traces, each containing multiple spans
Click any trace to view the waterfall diagram. You will see the parent POST /checkout span and its children — process_payment, update_inventory, and potentially the outbound HTTP call to inventory-service. Expand a span to see attributes like payment.amount, payment.method, and payment.status.
Debugging Tip: Missing Spans
If you see the root span but not the child spans, check:
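# Verify the Collector is receiving spans
curl -s http://localhost:8888/metrics | grep otelcol_receiver_accepted_spans
# Check Collector logs for export errors (Docker setup shown; if you run the
# binary, read the output of the process that is already running instead of
# starting a second one, which would fail to bind the same ports)
docker logs otel-collector 2>&1 | grep -i error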
Common causes:
- Batch processor delay: Spans are batched for up to 5 seconds before export. Wait at least 5 seconds after sending a request.
- OTLP endpoint mismatch: The SDK sends to localhost:4317 but the Collector listens on a different host. Use 0.0.0.0:4317 in the Collector config for local dev.
- TLS mismatch: If the Collector expects TLS but the SDK sends plaintext (or vice versa), the connection fails silently. Match insecure: true settings on both sides.
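The same check can be scripted. The sketch below reads the Collector's own metrics endpoint and sums a counter across its label sets, using only the standard library. The function names are illustrative, and the parser is deliberately minimal: it handles the plain "name{labels} value" lines of the Prometheus text format and nothing more. It matches on a name prefix because some Collector versions append a suffix such as _total.

```python
import urllib.request


def fetch_metrics(url: str = "http://localhost:8888/metrics",
                  timeout: float = 3.0) -> str:
    """Download the Collector's self-metrics in Prometheus text format."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def sum_metric(exposition_text: str, metric_name: str) -> float:
    """Sum every sample whose metric name starts with metric_name."""
    total = 0.0
    for raw in exposition_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks, HELP and TYPE lines
        if "{" in line:
            name = line[: line.index("{")]
            value_part = line[line.rindex("}") + 1:]
        else:
            name, _, value_part = line.partition(" ")
        if not name.startswith(metric_name):
            continue
        total += float(value_part.split()[0])
    return total


# Illustrative sample of what the endpoint returns (label values are made up)
SAMPLE = """\
# TYPE otelcol_receiver_accepted_spans counter
otelcol_receiver_accepted_spans{receiver="otlp",transport="grpc"} 12
otelcol_receiver_accepted_spans{receiver="otlp",transport="http"} 3
otelcol_exporter_sent_spans{exporter="logging"} 15
"""
```

Against a live Collector you would call sum_metric(fetch_metrics(), "otelcol_receiver_accepted_spans") before and after sending a request. If the number does not move, the spans never reached the Collector and the problem is on the SDK side.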
Exporting Metrics to Prometheus
Traces tell you what happened. Metrics tell you how often and how fast. OTel's metrics pipeline works the same way, but the Prometheus exporter is an HTTP server that Prometheus scrapes — it does not push.
Step 1: Configure Prometheus Scrape
Add a scrape target to your prometheus.yml:
scrape_configs:
  - job_name: "otel-collector"
    scrape_interval: 15s
    static_configs:
      - targets: ["localhost:8889"]
The Collector's Prometheus exporter already listens on port 8889 (from our earlier config). No additional setup is needed.
Step 2: Auto-Instrument Metrics
The Flask instrumentation also captures HTTP server metrics automatically:
# Add to app.py after the trace setup
from opentelemetry import metrics
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter
metric_reader = PeriodicExportingMetricReader(
    OTLPMetricExporter(endpoint="http://localhost:4317", insecure=True),
    export_interval_millis=15000
)
meter_provider = MeterProvider(
    resource=resource,
    metric_readers=[metric_reader]
)
metrics.set_meter_provider(meter_provider)
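With a MeterProvider registered before FlaskInstrumentor().instrument_app(app) runs, the Flask instrumentation records HTTP server metrics automatically, most notably a request duration histogram. Request counts and error rates are derived from that histogram's count series and its status code labels.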
Step 3: Verify Metrics in Prometheus
# Check that Prometheus is scraping the Collector
curl http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | select(.labels.job=="otel-collector")'
# Query a metric (the otel_ prefix comes from namespace: otel in the Collector's prometheus exporter)
curl "http://localhost:9090/api/v1/query?query=otel_http_server_duration_milliseconds_bucket"
The metrics pipeline is now live: your application generates metrics, the SDK ships them to the Collector, and Prometheus scrapes the Collector's Prometheus exporter endpoint. Grafana can query Prometheus to build dashboards.
Adding a Custom Metric
Beyond auto-instrumentation, add a business-level counter:
from opentelemetry import metrics
meter = metrics.get_meter(__name__)
order_counter = meter.create_counter(
    "checkout.orders",
    description="Number of completed checkouts",
    unit="1"
)

@app.route("/checkout", methods=["POST"])
def checkout():
    # ... existing code ...
    order_counter.add(1, {"method": data.get("method", "unknown")})
    return jsonify({"status": "ok"})
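Now you have an otel_checkout_orders_total metric in Prometheus, labeled by payment method. The otel_ prefix comes from the namespace setting on the Collector's prometheus exporter, and _total is appended because the instrument is a counter. Query it to track business throughput, not just infrastructure health.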
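Metric names change on the way from OTel to Prometheus, which is a frequent source of empty query results. The sketch below is a simplified approximation of that translation, written to show where the namespace prefix, the unit suffix, and the counter suffix come from. It is not the exporter's real code, and the real rules cover more units and edge cases.

```python
import re

# A small subset of unit abbreviations and their spelled-out suffixes
_UNIT_SUFFIXES = {"ms": "milliseconds", "s": "seconds", "By": "bytes"}


def to_prometheus_name(name: str, unit: str = "", kind: str = "gauge",
                       namespace: str = "") -> str:
    """Approximate the Prometheus series name for an OTel instrument."""
    # Dots and other unsupported characters become underscores
    base = re.sub(r"[^a-zA-Z0-9_:]", "_", name)
    parts = [namespace] if namespace else []
    parts.append(base)
    # Known units are spelled out and appended once
    suffix = _UNIT_SUFFIXES.get(unit)
    if suffix and not base.endswith(suffix):
        parts.append(suffix)
    # Monotonic counters get the conventional _total suffix
    if kind == "counter" and not base.endswith("_total"):
        parts.append("total")
    return "_".join(parts)
```

For a histogram, Prometheus then exposes the usual _bucket, _sum, and _count series on top of the translated name. When in doubt, open the exporter endpoint on port 8889 in a browser and read the names directly.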
Deploying OpenTelemetry on Kubernetes
Running the Collector as a standalone binary works for development. In production on Kubernetes, the Collector typically runs as a per-pod sidecar, a per-node DaemonSet, or a central gateway. The manifests below cover the first two. Each pattern has tradeoffs in scalability, latency, and operational complexity.
Pattern 1: Sidecar (Per-Pod Collector)
A Collector container runs alongside your application container in the same pod. The application sends telemetry to localhost:4317.
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checkout-service
  template:
    metadata:
      labels:
        app: checkout-service
    spec:
      containers:
        - name: app
          image: checkout-service:latest
          ports:
            - containerPort: 5000
          env:
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: "http://localhost:4317"
        - name: otel-collector
          image: otel/opentelemetry-collector-contrib:0.110.0
          args: ["--config=/etc/otelcol/config.yaml"]
          volumeMounts:
            - name: otel-config
              mountPath: /etc/otelcol
      volumes:
        - name: otel-config
          configMap:
            name: otel-collector-sidecar-config
Pros: Simple, no network hops, pod-level isolation. Cons: One Collector per pod wastes resources. 100 pods = 100 Collectors. Not suitable for large clusters unless you run low-resource Collector replicas.
Pattern 2: DaemonSet (Per-Node Collector)
One Collector runs on every node as a DaemonSet. All pods on that node send telemetry to the node-local Collector through the node's IP address and the Collector's host port.
# otel-daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: otel-collector
  namespace: observability
spec:
  selector:
    matchLabels:
      app: otel-collector
  template:
    metadata:
      labels:
        app: otel-collector
    spec:
      hostNetwork: true
      containers:
        - name: otel-collector
          image: otel/opentelemetry-collector-contrib:0.110.0
          args: ["--config=/etc/otelcol/config.yaml"]
          ports:
            - containerPort: 4317
              hostPort: 4317
            - containerPort: 4318
              hostPort: 4318
          volumeMounts:
            - name: otel-config
              mountPath: /etc/otelcol
          resources:
            limits:
              memory: 512Mi
              cpu: 500m
      volumes:
        - name: otel-config
          configMap:
            name: otel-collector-daemonset-config
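With a DaemonSet, each application pod has to find the Collector on its own node. One common approach is to expose the node's IP to the pod as an environment variable through the Kubernetes Downward API (the status.hostIP field) and build the OTLP endpoint from it at startup. The sketch below assumes that variable is called NODE_IP; the name is a convention chosen for this example, not something Kubernetes or OTel defines.

```python
import os
from typing import Mapping, Optional


def node_collector_endpoint(environ: Optional[Mapping[str, str]] = None,
                            port: int = 4317) -> str:
    """Build the OTLP endpoint of the node-local Collector.

    Assumes NODE_IP (hypothetical name) is injected from status.hostIP.
    Falls back to localhost, which suits local development or a sidecar.
    """
    env = os.environ if environ is None else environ
    node_ip = env.get("NODE_IP")
    if not node_ip:
        return f"http://localhost:{port}"
    # IPv6 literals must be bracketed inside a URL
    host = f"[{node_ip}]" if ":" in node_ip else node_ip
    return f"http://{host}:{port}"
```

The result can be passed to the exporter's endpoint argument in place of the hard-coded localhost address used earlier in this tutorial.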
Advanced Sampling Strategies
Tail sampling is one of OpenTelemetry's most powerful features — and one of the easiest to misconfigure. Understanding the decision flow will save you from exploding your telemetry bill or dropping critical traces.
Head-Based Sampling (Probabilistic)
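Head sampling happens at span creation time. The decision to keep or drop a trace is made immediately, with no buffering and no delay. It is not on by default: unless you configure a sampler, the SDK records every trace. In the SDK you enable it with a ratio-based sampler, for example by setting OTEL_TRACES_SAMPLER=parentbased_traceidratio and OTEL_TRACES_SAMPLER_ARG=0.1. The Collector offers the same probabilistic decision as a processor, which helps when you cannot change the applications:
```yaml
# Collector config for probabilistic sampling
processors:
  probabilistic_sampler:
    sampling_percentage: 10
```
This keeps roughly 10% of traces and drops the other 90% without buffering. The decision is derived from the trace ID, so every span of a trace gets the same verdict. When sampling happens in the SDK, the Collector never sees the dropped spans and those bytes never leave the application process. When it happens in the Collector processor, the spans still cross the network once and are dropped on arrival. Use this when:
- You are cost-sensitive: Every exported span costs storage and network. At 1,000 requests per second, keeping 100% of spans can saturate your observability budget.
- You want a representative subset, not specific traces: Probabilistic sampling keeps whole traces at random, so the sample reflects overall traffic. The tradeoff is that the one slow request you want to debug was probably dropped before anyone knew it was interesting.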
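Probabilistic samplers are usually deterministic per trace: the verdict is computed from the trace ID, so every service that sees the same trace reaches the same decision without coordination. The sketch below shows one common way to do that, comparing the low 64 bits of the trace ID against a threshold. It is an illustration of the idea under that assumption, not a copy of any particular SDK or Collector implementation.

```python
import random

_LOW64_RANGE = 1 << 64  # number of distinct values in the low 64 bits


def should_sample(trace_id: int, ratio: float) -> bool:
    """Keep a trace if the low 64 bits of its ID fall below ratio * 2**64."""
    bound = round(ratio * _LOW64_RANGE)
    return (trace_id & (_LOW64_RANGE - 1)) < bound


# Simulate 20,000 random 128-bit trace IDs with a fixed seed
rng = random.Random(42)
trace_ids = [rng.getrandbits(128) for _ in range(20_000)]
kept_fraction = sum(should_sample(t, 0.10) for t in trace_ids) / len(trace_ids)
```

Because the decision depends only on the trace ID, a trace is either kept everywhere or dropped everywhere, which is what keeps sampled traces complete.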
Tail-Based Sampling
Tail sampling makes the decision after the fact. The Collector buffers the spans of each trace for decision_wait (30 seconds in the config below), then evaluates the assembled trace against a list of policies:
processors:
  tail_sampling:
    decision_wait: 30s
    policies:
      - name: errors-and-slow
        type: and
        and:
          and_sub_policy:
            - name: status_code
              type: status_code
              status_code: {status_codes: [ERROR]}
            - name: latency-over-2s
              type: latency
              latency: {threshold_ms: 2000}
      - name: probabilistic
        type: probabilistic
        probabilistic: {sampling_percentage: 25}
A trace is kept if any top-level policy matches. This configuration therefore keeps every trace that both contains an error status code and took longer than 2 seconds (the and policy requires all of its sub-policies to match), and it samples 25% of the remaining traces. Everything else is dropped. To keep every error and every slow trace independently, list status_code and latency as two separate top-level policies instead of nesting them under and. Either way, tail sampling lets you keep the full picture of the requests you care about without storing every fast, healthy one.
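The decision flow is easier to reason about as code. In this simplified model a trace is a list of span dicts, each policy is a function returning True or False, a composite "and" policy requires all of its sub-policies, and the trace is kept if any top-level policy says yes. The span fields (start_ms, end_ms, status) are invented for the example, and the real processor has more policy types and states.

```python
from typing import Callable, Dict, List

Trace = List[Dict[str, object]]
Policy = Callable[[Trace], bool]


def has_error(trace: Trace) -> bool:
    """status_code policy: any span in the trace ended with ERROR."""
    return any(span.get("status") == "ERROR" for span in trace)


def slower_than(threshold_ms: int) -> Policy:
    """latency policy: earliest start to latest end exceeds the threshold."""
    def check(trace: Trace) -> bool:
        start = min(span["start_ms"] for span in trace)
        end = max(span["end_ms"] for span in trace)
        return end - start > threshold_ms
    return check


def all_of(*policies: Policy) -> Policy:
    """and policy: every sub-policy must match."""
    return lambda trace: all(policy(trace) for policy in policies)


def probabilistic(percentage: float, rng) -> Policy:
    """probabilistic policy: keep a random share of traces."""
    return lambda trace: rng.random() * 100 < percentage


def keep_trace(trace: Trace, policies: List[Policy]) -> bool:
    """A trace is kept if any top-level policy matches."""
    return any(policy(trace) for policy in policies)
```

Writing your intended rules this way before translating them to YAML is a cheap check that "errors and slow" really means what you want it to mean.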
When to Use Each
| Strategy | When to Use |
|---|---|
| Head (probabilistic) | You have a strict sampling budget. You cannot store more than X spans per second. Use for high-throughput, cost-sensitive, always-on observability. |
| Tail (policy-based) | You need every trace from a specific slow request. Use when debugging errors, analyzing latency, or auditing compliance. |
Common Pitfalls and Troubleshooting
1. The Collector Is Dropping Spans Silently
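This is the most common OTel production issue. Spans leave the application, but fewer arrive in the backend than you expect, and nothing obviously fails. The usual root cause is a queue overflowing somewhere along the path: the SDK's span queue fills faster than it is exported, or the Collector cannot hand batches to the backend as fast as they arrive.
Fix: Start by raising send_batch_size and timeout in the batch processor so the Collector makes fewer, larger export requests:
processors:
  batch:
    timeout: 10s
    send_batch_size: 2048
Why this helps: our earlier config flushes a batch every 512 spans. At 2,000 spans per second that is about four export requests per second; with send_batch_size: 2048 it is roughly one. Fewer round-trips leave less room for the exporter to fall behind. Two caveats apply. First, very large batches can exceed the receiving side's gRPC message limit (commonly 4 MiB by default), so cap them with send_batch_max_size if your spans carry large attributes. Second, Collector-side batching does not help if spans are lost before they arrive. In that case look at the SDK's own batching settings, and at the otelcol_exporter_send_failed_spans and otelcol_receiver_refused_spans metrics on port 8888.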
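One way spans vanish without an error is a bounded queue overflowing. The Python SDK's BatchSpanProcessor, for instance, holds finished spans in a queue with a maximum size and drops new spans when it is full. The toy model below shows that behavior; the class is illustrative and is not the SDK's implementation, though the defaults of 2048 and 512 mirror the SDK's max_queue_size and max_export_batch_size.

```python
from collections import deque


class BoundedSpanQueue:
    """Toy model of a fixed-size export queue that drops on overflow."""

    def __init__(self, max_queue_size: int = 2048):
        self.max_queue_size = max_queue_size
        self._queue = deque()
        self.dropped = 0  # spans rejected because the queue was full

    def offer(self, span) -> bool:
        """Enqueue a span, or count it as dropped when there is no room."""
        if len(self._queue) >= self.max_queue_size:
            self.dropped += 1
            return False
        self._queue.append(span)
        return True

    def drain(self, max_batch: int = 512) -> list:
        """Remove and return up to max_batch spans for export."""
        batch = []
        while self._queue and len(batch) < max_batch:
            batch.append(self._queue.popleft())
        return batch
```

If spans are produced faster than drain is called, dropped climbs while every individual export still succeeds, which is exactly why this failure mode looks silent from the outside.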
2. The traceparent Header Is Missing
Your service A calls service B over HTTP. Service B is also instrumented with OTel. But the trace breaks: service B starts a new trace instead of continuing service A's, so the two services' spans never appear in the same waterfall.
Diagnosis: First check whether service B honors an incoming header by sending one by hand:
curl -H "traceparent: 00-..." http://service-b:5000/endpoint
If service B's spans show up under the trace ID you sent, service B extracts context correctly and the problem is on the sending side: service A's SDK is not injecting traceparent into the outbound request. Fix: verify that the instrumentation library for service A's HTTP client is installed and activated before the first request is made. If service B ignores the header, check its server-side instrumentation and its propagator configuration instead.
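# Explicitly set the global propagators (W3C Trace Context + Baggage)
from opentelemetry import trace
from opentelemetry.propagate import set_global_textmap
from opentelemetry.propagators.composite import CompositePropagator
from opentelemetry.trace.propagation.tracecontext import TraceContextTextMapPropagator
from opentelemetry.baggage.propagation import W3CBaggagePropagator
from opentelemetry.sdk.trace import TracerProvider
# Set the global propagator before any instrumented request is made
set_global_textmap(
    CompositePropagator([
        TraceContextTextMapPropagator(),
        W3CBaggagePropagator(),
    ])
)
# Then create the TracerProvider
provider = TracerProvider()
trace.set_tracer_provider(provider)
The TraceContextTextMapPropagator writes the W3C traceparent header into every outbound request made through an instrumented client. These two propagators are already the SDK default, so setting them explicitly matters mainly when something else, such as the OTEL_PROPAGATORS environment variable, has overridden them.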
3. High Cardinality Attributes Crash the Backend
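Span attributes like user.id, request.id, and session.id are unbounded: each can take a practically unlimited number of distinct values. Backends index attribute values so you can search by them, and every new value adds to that index. Push enough unique values and ingestion slows, storage balloons, and some backends start rejecting writes.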
Remediation: Drop high-cardinality attributes at the SDK level:
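# Wrap the exporter so only allow-listed attributes are exported
from opentelemetry.sdk.trace import ReadableSpan
from opentelemetry.sdk.trace.export import SpanExporter

ALLOWED_KEYS = {"http.method", "http.url", "http.status_code"}

class AttributeLimitingExporter(SpanExporter):
    def __init__(self, delegate):
        self._delegate = delegate
    def export(self, spans):
        # Finished spans are read-only, so export filtered copies instead of mutating them
        return self._delegate.export([self._filtered(s) for s in spans])
    def shutdown(self):
        self._delegate.shutdown()
    @staticmethod
    def _filtered(span):
        kept = {k: v for k, v in (span.attributes or {}).items() if k in ALLOWED_KEYS}
        return ReadableSpan(span.name, context=span.context, parent=span.parent, resource=span.resource,
            attributes=kept, events=span.events, links=span.links, kind=span.kind, status=span.status,
            start_time=span.start_time, end_time=span.end_time, instrumentation_scope=span.instrumentation_scope)
This exporter wrapper limits attributes to http.method, http.url, and http.status_code, dropping user.id, session tokens, and every other high-cardinality field. Use it by passing AttributeLimitingExporter(otlp_exporter) to the BatchSpanProcessor in place of the bare exporter. Mutating span.attributes inside a SpanProcessor's on_end does not work, because finished spans expose their attributes read-only. The backend stays stable.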
4. Memory Usage Grows Unbounded
The Collector's memory consumption grows with the volume of telemetry it is buffering. Under sustained load, 512 MiB becomes 1 GiB, then 2 GiB. The OOM killer strikes.
Fix: Configure the memory_limiter processor aggressively:
processors:
  memory_limiter:
    limit_mib: 256
    spike_limit_mib: 64
    check_interval: 1s
The limit_mib sets the hard limit at 256 MiB. The spike_limit_mib is headroom subtracted from that limit, not added to it, and it must be smaller than limit_mib. Here the Collector starts refusing new data at the soft limit of 256 - 64 = 192 MiB and forces garbage collection once usage passes 256 MiB. Keep limit_mib comfortably below the container memory limit so the limiter reacts before the OOM killer does.
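The relationship between the two settings is simple arithmetic: the soft limit is limit_mib minus spike_limit_mib, and spike_limit_mib must be smaller than limit_mib. The sketch below models that calculation and the resulting behavior. The function names and the returned labels are invented for illustration, and the 20% default for the spike limit reflects the processor's documented default as I understand it.

```python
from typing import Optional


def memory_limits(limit_mib: int, spike_limit_mib: Optional[int] = None) -> dict:
    """Compute the hard and soft limits the memory_limiter works with."""
    if spike_limit_mib is None:
        spike_limit_mib = limit_mib // 5  # assumed default: 20% of the hard limit
    if spike_limit_mib >= limit_mib:
        raise ValueError("spike_limit_mib must be smaller than limit_mib")
    return {
        "hard_limit_mib": limit_mib,
        "soft_limit_mib": limit_mib - spike_limit_mib,
    }


def limiter_action(usage_mib: int, limits: dict) -> str:
    """Describe what the limiter does at a given memory usage."""
    if usage_mib > limits["hard_limit_mib"]:
        return "force_gc_and_refuse"
    if usage_mib > limits["soft_limit_mib"]:
        return "refuse_new_data"
    return "accept"
```

For a container limited to 512 MiB, something like limit_mib 400 with a spike limit of 100 leaves room for the Go runtime and short bursts while still refusing data well before the kernel intervenes.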
Security: Redacting Sensitive Data
OpenTelemetry traces can leak secrets. A span attribute like credit_card_number or user.email travels from your SDK through the Collector to Jaeger, and on into your observability vendor's cloud. Every hop that logs or stores the span keeps a copy of the attribute.
Prevention: Filter sensitive attributes at the Collector level before they leave your network:
processors:
  attributes:
    actions:
      - key: user.email
        action: delete
      - key: user.phone
        action: delete
      - pattern: ^credit_card.*
        action: delete
      - key: password
        action: delete
This configuration strips user.email, user.phone, password, and any attribute whose key matches the regular expression ^credit_card.* from every span before it reaches the exporter. Note that key is an exact match; wildcards only work through pattern. The sensitive data never leaves your boundary. For masking values rather than deleting keys, the contrib distribution also includes a redaction processor.
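The same delete rules can be applied in the application as a last line of defense, before attributes are ever attached to a span. Here is a minimal sketch in plain Python that mirrors the exact-key and pattern matching described above. The names EXACT_KEYS, KEY_PATTERNS, and scrub are illustrative only.

```python
import re
from typing import Dict

# Keys removed by exact match
EXACT_KEYS = {"user.email", "user.phone", "password"}
# Keys removed when they match a regular expression
KEY_PATTERNS = [re.compile(r"^credit_card.*")]


def scrub(attributes: Dict[str, object]) -> Dict[str, object]:
    """Return a copy of attributes without sensitive keys."""
    return {
        key: value
        for key, value in attributes.items()
        if key not in EXACT_KEYS
        and not any(pattern.match(key) for pattern in KEY_PATTERNS)
    }
```

Calling scrub on a dict before passing its items to span.set_attribute keeps secrets out of the process's telemetry entirely, while the Collector rule remains as a safety net for services you do not control.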
For full defense in depth, review the OTel Security documentation.
Further Reading
If you have made it this far, you now have a working OpenTelemetry pipeline — instrumentation, a Collector, and at least one observability backend. Here is where to go next:
- Kubernetes Security Best Practices 2026 — Hardening your cluster before instrumenting your workloads. Security is not optional when observability is production.
- Error Budgets: Stop Wasting Your SRE Team's Time — Budget for reliability, not just velocity. Your error budget is a policy decision, not a suggestion.
- OpenTelemetry Tracing: Instrument Your First Application (forthcoming) — Distributed tracing with manual context propagation. A complete guide to instrumenting every service.
Conclusion
OpenTelemetry is not a tool; it is a standard. Adopting it means instrumenting once with the SDK, processing through the Collector, and exporting to any backend without rewriting code. You have now walked through a complete setup: instrumentation with Python and Flask, Collector configuration in YAML, Jaeger for trace visualization, Prometheus for metrics, Kubernetes deployment as a sidecar or DaemonSet, and operational practices from sampling to redaction.
The most important things to remember:
- Instrument once, export anywhere. The OTel SDK decouples your application from every backend. Changing exporters in the Collector config is not a code change.
- The Collector is your control plane. Receivers, processors, and exporters form a pipeline. Data flows one way — from your code through the SDK to the Collector, then to Jaeger and Prometheus. You control the flow.
- Tail sampling saves budget. Not every span is worth storing. Decide what to keep at the head (probabilistic) or at the tail (policy-based). The Collector makes the decision.
The observability landscape in 2026 is converging on OpenTelemetry. Every major vendor now speaks OTLP. The standard is the protocol — adopt it before it becomes a migration project.