
Tracing GitHub Actions Workflows with OpenTelemetry


Turn opaque CI runs into distributed traces you can actually debug. Here's how to instrument GitHub Actions with OpenTelemetry and find the real bottlenecks.
Your GitHub Actions workflow just took 47 minutes. The logs say everything passed. But last week it was 28 minutes, and nobody changed the build configuration. GitHub's workflow UI will show you each step's duration in a flat list, but it won't tell you why things are slow, which jobs are blocking others, or how job queue time compares to actual execution time.
OpenTelemetry fixes this. By exporting your workflow runs as distributed traces, you get flamegraph-style views of every job and step, complete with timing data, failure annotations, and commit metadata. You can query traces across weeks of CI history, compare durations between branches, and pinpoint the step that's actually eating your minutes.
This guide covers the practical setup: how trace exporters work, how to configure the OTel Collector's GitHub receiver, which backends to send data to, and what the resulting traces actually look like when you open them.
GitHub's built-in workflow view gives you a list of steps with green checkmarks and durations. That's useful when a step fails. It's much less useful when you're trying to answer questions like: "Why did the same workflow take 15 minutes longer on this commit?" or "Which of our 12 matrix jobs is consistently the slowest?"
OTel tracing models each workflow run as a trace, each job as a child span, and each step as a span within that job. This hierarchy gives you something GitHub's flat log view doesn't: a visual representation of parallelism, sequencing, and time gaps between jobs. When you look at a trace in a backend like Honeycomb or Grafana Tempo, you immediately see the waterfall, which jobs overlapped, which ran sequentially, and where the dead time lives.
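Under the hood, an exporter just walks the run payload and emits one span per node, parented run → job → step. Here is a stdlib sketch of that mapping; the `Span` type is a stand-in for real OTel spans, and the payload shape is abbreviated from the GitHub API:

```python
from dataclasses import dataclass, field

@dataclass
class Span:
    """Simplified stand-in for an OTel span: a name, timestamps, children."""
    name: str
    start: str
    end: str
    children: list = field(default_factory=list)

def run_to_trace(run: dict) -> Span:
    """Map a workflow run payload to a root span with job and step children."""
    root = Span(run["name"], run["run_started_at"], run["updated_at"])
    for job in run["jobs"]:
        job_span = Span(job["name"], job["started_at"], job["completed_at"])
        job_span.children = [
            Span(s["name"], s["started_at"], s["completed_at"]) for s in job["steps"]
        ]
        root.children.append(job_span)
    return root

# Abbreviated example payload
run = {
    "name": "CI", "run_started_at": "12:00:00", "updated_at": "12:10:00",
    "jobs": [{
        "name": "test", "started_at": "12:00:30", "completed_at": "12:09:00",
        "steps": [{"name": "checkout", "started_at": "12:00:30", "completed_at": "12:00:40"}],
    }],
}
trace = run_to_trace(run)
print(trace.name, [j.name for j in trace.children])  # CI ['test']
```

The timing gaps the waterfall view reveals fall directly out of this structure: any child span that starts later than its predecessor ends is dead time.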
Each span carries attributes from the GitHub API: commit SHA, branch name, author, runner labels, job conclusion, and PR metadata. So you can filter traces by author, query for all failed runs on a specific branch, or compare step durations before and after a config change.
There are two main ways to get OpenTelemetry data out of GitHub Actions. They serve different needs, and you might end up using both.
The simplest approach is adding a step at the end of your workflow (or a separate workflow_run-triggered workflow) that reads the completed run's data from the GitHub API and exports it as OTLP spans. Several open-source actions do this; the examples below use inception-health/otel-export-trace-action.
Here's a minimal example using the workflow_run trigger pattern:
```yaml
name: Export OTel Traces

on:
  workflow_run:
    workflows: [CI]
    types: [completed]

jobs:
  otel-export-trace:
    runs-on: ubuntu-latest
    steps:
      - name: Export Workflow Trace
        uses: inception-health/otel-export-trace-action@v1
        with:
          otlpEndpoint: grpc://otel-collector.example.com:4317
          otlpHeaders: ${{ secrets.OTLP_HEADERS }}
          githubToken: ${{ secrets.GITHUB_TOKEN }}
          runId: ${{ github.event.workflow_run.id }}
```

This fires after your CI workflow finishes, pulls the run data from GitHub's API, and pushes the trace to your OTLP endpoint. The `otlpHeaders` input takes comma-separated key-value pairs for authentication, for example `x-honeycomb-team=YOUR_API_KEY` for Honeycomb.
If you'd rather not create a separate workflow file, you can run it inline as a final job in your existing workflow:
```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test

  otel-export:
    if: always()
    needs: [build]
    runs-on: ubuntu-latest
    steps:
      - uses: inception-health/otel-export-trace-action@v1
        with:
          otlpEndpoint: grpc://otel-collector.example.com:4317
          otlpHeaders: ${{ secrets.OTLP_HEADERS }}
          githubToken: ${{ secrets.GITHUB_TOKEN }}
```

The `if: always()` is important. Without it, the export job won't run when upstream jobs fail, which is exactly when you most want the trace data.
The action-based approach works well for getting started, but it has a limitation: it only captures data after a workflow finishes, and it runs as a GitHub Actions job itself, consuming runner minutes.
The OpenTelemetry Collector's github receiver (part of the otelcol-contrib distribution) takes a different approach. It runs outside GitHub entirely, sitting on your infrastructure and receiving data two ways: it scrapes repository metrics from the GitHub API on an interval, and it accepts webhooks, where you configure GitHub to send workflow_run and workflow_job events to the collector and it converts them into OTel trace spans automatically.

Here's the collector configuration:
```yaml
extensions:
  bearertokenauth/github:
    token: ${GH_PAT}

receivers:
  github:
    initial_delay: 1s
    collection_interval: 60s
    webhook:
      endpoint: 0.0.0.0:19418
      path: /events
      secret: ${GITHUB_WEBHOOK_SECRET}
    scrapers:
      scraper:
        github_org: your-org
        metrics:
          vcs.contributor.count:
            enabled: true
        auth:
          authenticator: bearertokenauth/github

exporters:
  otlp:
    endpoint: tempo.example.com:4317
    tls:
      insecure: false

service:
  extensions: [bearertokenauth/github]
  pipelines:
    traces:
      receivers: [github]
      exporters: [otlp]
    metrics:
      receivers: [github]
      exporters: [otlp]
```

The advantage here is that data collection is completely decoupled from your workflows. You don't add any steps to your CI configuration. The collector sits on a small VM or container, receives webhook events, and forwards everything to your observability backend.
The tradeoff: you need somewhere to run the collector and a publicly reachable endpoint for webhooks (or use something like ngrok for testing). For most platform teams, that's trivial. For smaller setups, the action-based approach is lower friction.
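Registering the webhook itself is a one-time call to GitHub's create-org-webhook endpoint (POST /orgs/{org}/hooks). A stdlib sketch of the payload it expects, with the collector URL and secret as placeholders:

```python
import json

def webhook_payload(collector_url: str, secret: str) -> dict:
    """Build the body for GitHub's create-org-webhook API (POST /orgs/{org}/hooks)."""
    return {
        "name": "web",  # GitHub requires the literal name "web" for webhooks
        "active": True,
        # The events the github receiver turns into trace spans
        "events": ["workflow_run", "workflow_job"],
        "config": {
            "url": collector_url,     # receiver's webhook endpoint + path
            "content_type": "json",
            "secret": secret,         # must match GITHUB_WEBHOOK_SECRET in the collector
        },
    }

payload = webhook_payload("https://collector.example.com:19418/events", "s3cret")
print(json.dumps(payload, indent=2))
```

Send this with any HTTP client (or `gh api`) using a token that has admin rights on the org.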
The trace exporters above give you job-level and step-level spans automatically. But sometimes a single step contains multiple expensive operations, like a build script that runs linting, compilation, and bundling in sequence. From the trace's perspective, that's one opaque span.
To get finer-grained visibility, you can instrument your scripts directly. If your build scripts are in Node.js, Python, or Go, you can use the OTel SDK for that language to create child spans within a step. For shell scripts, the otel-cli tool lets you wrap commands in spans from bash:
```bash
# Install otel-cli
curl -L https://github.com/equinix-labs/otel-cli/releases/latest/download/otel-cli-linux-amd64 -o /usr/local/bin/otel-cli
chmod +x /usr/local/bin/otel-cli

# Wrap individual commands in spans
otel-cli exec --name "lint" -- npm run lint
otel-cli exec --name "compile" -- npm run build
otel-cli exec --name "bundle" -- npm run bundle
```

Each `otel-cli exec` call creates a span with the command's duration and exit code. You point it at your OTLP endpoint via environment variables (`OTEL_EXPORTER_OTLP_ENDPOINT`), and the spans show up alongside your workflow-level trace data.
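In a workflow, that wiring might look like the following step; the endpoint, secret name, and service name are placeholders:

```yaml
- name: Build with spans
  env:
    OTEL_EXPORTER_OTLP_ENDPOINT: grpc://otel-collector.example.com:4317
    OTEL_EXPORTER_OTLP_HEADERS: ${{ secrets.OTLP_HEADERS }}
    OTEL_SERVICE_NAME: ci-build
  run: |
    otel-cli exec --name "lint" -- npm run lint
    otel-cli exec --name "compile" -- npm run build
    otel-cli exec --name "bundle" -- npm run bundle
```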
This kind of instrumentation is worth the effort for complex build steps. If your "Build" step takes 8 minutes, knowing that 6 of those minutes are TypeScript compilation and 2 are bundling changes what you'd optimize.
Because these tools all speak OTLP, you can send CI traces to whatever backend your team already uses for application observability. The setup differs mainly in endpoint URLs and authentication headers.
Honeycomb is probably the most common choice for CI tracing. Their free tier is generous, and the query interface is built for exploring trace data. Set your endpoint to grpc://api.honeycomb.io:443 and pass your API key as x-honeycomb-team=YOUR_KEY in the headers.
Grafana Tempo is the natural choice if you're already running Grafana for metrics. Tempo accepts OTLP natively, and you can build Grafana dashboards that correlate CI trace durations with deployment metrics. If you're using Grafana Cloud, the OTLP endpoint is something like tempo-us-central1.grafana.net:443.
Datadog supports OTLP ingestion through the Datadog Agent or directly to their OTLP intake endpoint. If your production APM is already in Datadog, having CI traces in the same tool means you can build traces that span from "PR merged" to "deployed and healthy in production."
Self-hosted Jaeger or SigNoz work too if you want to keep everything in-house. SigNoz in particular has a pre-built CI/CD dashboard that displays DORA metrics, pipeline health, and repository activity derived from the OTel Collector's GitHub receiver data.
Once traces are flowing into your backend, you'll see your workflow runs as waterfall diagrams (sometimes called flamegraphs, though they're technically Gantt charts). The root span is the workflow run itself. Child spans are jobs. Grandchild spans are individual steps.
What makes this view powerful is what it reveals about hidden time. Consider a workflow with three jobs: lint, test, and deploy. They have needs: dependencies, so they run sequentially. In a trace view, you'd see them stacked end-to-end. But you'd also see gaps between them. Those gaps are job queue time, the time GitHub spent provisioning a runner for the next job. On busy repositories or with larger runners, queue time can easily add 30-60 seconds per job transition.
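You can measure that queue time yourself even without a trace backend: GitHub's list-jobs API reports both `created_at` and `started_at` per job, and the gap between them is time spent waiting for a runner. A small stdlib sketch (the job payloads are abbreviated examples):

```python
from datetime import datetime

def queue_seconds(job: dict) -> float:
    """Seconds a job waited between creation and starting on a runner."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # timestamp format used by the GitHub REST API
    created = datetime.strptime(job["created_at"], fmt)
    started = datetime.strptime(job["started_at"], fmt)
    return (started - created).total_seconds()

# Abbreviated job objects as returned by GET .../actions/runs/{id}/jobs
jobs = [
    {"name": "lint", "created_at": "2024-05-01T12:00:00Z", "started_at": "2024-05-01T12:00:45Z"},
    {"name": "test", "created_at": "2024-05-01T12:03:00Z", "started_at": "2024-05-01T12:03:30Z"},
]
for job in jobs:
    print(job["name"], queue_seconds(job))  # lint 45.0, then test 30.0
```

In a trace view you get the same numbers for free, as the visible gaps between job spans.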
Dash0 published a good example of this in practice. While setting up OTel tracing for their own CI, they discovered that a "Test Helm Charts" step was taking 2 minutes 19 seconds. The trace showed that most of that time was actually in the Checkout step, which was fetching the entire repository history including all branches and tags. The fix was trivial (add fetch-depth: 1), but without the trace data, nobody would have looked at checkout as the bottleneck.
Some useful queries once you have trace data: filter for `github.conclusion = failure` and group by step name to see which step fails most often, or group by branch to compare the same workflow's duration across branches.

The real payoff comes when your CI traces and your production traces share the same backend. If your deploy step records the commit SHA as a span attribute, and your production services tag their traces with the running version, you can build queries that connect the two worlds.
"Show me all CI runs for commit abc123" followed by "show me error rates in production after that commit was deployed" is a powerful debugging flow. Some teams take it further by propagating a trace context from the CI deploy job into the deployment itself, creating a single trace that spans from "tests passed" through "Kubernetes rollout complete" to "first healthy request served."
This doesn't require anything exotic. The CI trace already includes github.head_sha as an attribute. If your deployment tooling (ArgoCD, Flux, a custom script) adds the same SHA as a resource attribute on the deployment trace, you can join across them in your backend.
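A deploy script can do this with the standard `OTEL_RESOURCE_ATTRIBUTES` environment variable. In this sketch, `service.version` is standard OTel semantic convention, while `ci.commit.sha` is a made-up key for illustration; pick whatever attribute your backend joins on:

```python
import os

def resource_attributes(sha: str) -> str:
    """Comma-separated key=value pairs, the format OTEL_RESOURCE_ATTRIBUTES expects."""
    return f"service.version={sha},ci.commit.sha={sha}"

# Stamp the deployment's telemetry with the same SHA the CI trace carries
os.environ["OTEL_RESOURCE_ATTRIBUTES"] = resource_attributes("abc123")
print(os.environ["OTEL_RESOURCE_ATTRIBUTES"])
```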
GitHub does provide some native workflow metrics. The Actions tab shows run history, durations, and success/failure status. The REST API exposes workflow run and job timing data. For smaller teams with a handful of workflows, that might be enough.
OTel tracing starts adding real value when you're managing many workflows, running large matrix builds, or trying to answer questions about queue time and cross-job parallelism that a flat step list can't.
For a team with 5 repositories and straightforward workflows, the built-in GitHub metrics are fine. For platform teams managing CI across 50+ repositories with complex matrix builds, OTel tracing goes from nice-to-have to essential.
If you want to try this today, here's the quickest path: create a free account with an OTLP-speaking backend such as Honeycomb, add the workflow_run-triggered export workflow from earlier with your endpoint and API key, and open the trace from your next CI run.
You'll probably notice something surprising in the first trace you look at. Most teams do. Maybe it's a checkout step pulling too much history, a cache restore that's not actually saving time, or a test suite where 80% of the duration is one slow integration test. That first surprise is the whole point.
