Add OpenTelemetry OTLP exporter with full SDK support #218

Merged
kevinburkesegment merged 1 commit into main from add-opentelemetry-otlp-support
Jun 15, 2026

Conversation

@etiennep etiennep commented Feb 6, 2026

Contributor

Summary

Adds production-ready OpenTelemetry Protocol (OTLP) exporter using the official OpenTelemetry SDK with comprehensive support for both gRPC and HTTP/Protobuf transports.

Features

Dual Transport Support: gRPC and HTTP/Protobuf protocols
Environment Variables: Full OTEL_* environment variable support
Resource Detection: AWS (EC2, ECS, EKS, Lambda), GCP, Azure, K8s, host, process
All Metric Types: Counter, Gauge, Histogram with proper semantics
Tag Conversion: Automatic stats tags → OpenTelemetry attributes
Production Ready: Thread-safe, tested, documented

Usage

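// Simple: configure entirely from OTEL_* environment variables
handler, err := otlp.NewSDKHandlerFromEnv(ctx)
if err != nil {
    log.Fatal(err)
}
stats.Register(handler)

// Alternative: explicit configuration (use instead of the block above)
handler, err := otlp.NewSDKHandler(ctx, otlp.SDKConfig{
    Protocol: otlp.ProtocolGRPC,
    Endpoint: "localhost:4317",
})
if err != nil {
    log.Fatal(err)
}
stats.Register(handler)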

Implementation Highlights

  • Gauge Semantics: Uses UpDownCounter with delta calculation to maintain absolute value semantics (workaround until stable OTel SDK adds Gauge)
  • Context Handling: Background context for metric recording prevents context cancellation issues
  • Performance: Lock-free reads for instrument lookup, efficient caching
  • Resource Detection: Automatic cloud provider metadata detection
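
To make the gauge workaround above concrete, here is a minimal sketch of emulating an absolute-value gauge on top of an additive instrument by recording deltas. It does not use the OpenTelemetry SDK; `upDownCounter` and `gaugeEmulator` are hypothetical stand-ins, not names from this PR. The merged commit message further down states that gauges ended up using the native Float64Gauge instrument instead.

```go
package main

import (
	"fmt"
	"sync"
)

// upDownCounter stands in for an additive instrument: it only accepts deltas.
type upDownCounter struct {
	total float64
}

func (c *upDownCounter) Add(delta float64) { c.total += delta }

// gaugeEmulator (hypothetical) turns absolute gauge values into deltas.
type gaugeEmulator struct {
	mu       sync.Mutex
	last     map[string]float64 // last absolute value seen per series key
	counters map[string]*upDownCounter
}

func newGaugeEmulator() *gaugeEmulator {
	return &gaugeEmulator{
		last:     map[string]float64{},
		counters: map[string]*upDownCounter{},
	}
}

// Set records an absolute value by adding (value - previous) to the counter.
func (g *gaugeEmulator) Set(key string, value float64) {
	g.mu.Lock()
	defer g.mu.Unlock()
	c, ok := g.counters[key]
	if !ok {
		c = &upDownCounter{}
		g.counters[key] = c
	}
	c.Add(value - g.last[key])
	g.last[key] = value
}

// Value returns the running total, which equals the last value passed to Set.
func (g *gaugeEmulator) Value(key string) float64 {
	g.mu.Lock()
	defer g.mu.Unlock()
	if c, ok := g.counters[key]; ok {
		return c.total
	}
	return 0
}

func main() {
	g := newGaugeEmulator()
	g.Set("queue.depth", 10) // delta +10
	g.Set("queue.depth", 4)  // delta -6
	fmt.Println(g.Value("queue.depth")) // prints 4
}
```

The cost of this approach is the extra per-series state (the last value) and a lock around the read-modify-write, which is presumably part of why a native gauge instrument is preferable when available.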

Documentation

Testing

  • ✅ Unit tests for all metric types
  • ✅ Gauge behavior verification
  • ✅ HTTP and gRPC protocol tests
  • ✅ Value type conversion tests
  • ✅ Performance benchmarks

Changes

  • Added otlp/sdk_handler.go - Main OpenTelemetry SDK integration
  • Added otlp/sdk_handler_test.go - Comprehensive tests
  • Added otlp/example_test.go - Usage examples
  • Added otlp/README.md - Complete documentation
  • Added otlp/IMPLEMENTATION_NOTES.md - Design decisions
  • Updated README.md - Added OpenTelemetry backend overview
  • Updated HISTORY.md - Added v5.9.0 release notes
  • Updated version/version.go - Bumped to 5.9.0
  • Updated otlp/go.mod - Added OpenTelemetry SDK dependencies

Backward Compatibility

Fully backward compatible - This is a new feature addition that doesn't change existing APIs. The legacy otlp.Handler remains available for existing users.

🤖 Generated with Claude Code

@etiennep etiennep force-pushed the add-opentelemetry-otlp-support branch from bad79e6 to e6d05d8 Compare February 6, 2026 10:26
sccoache

sccoache previously approved these changes Feb 6, 2026
@etiennep etiennep force-pushed the add-opentelemetry-otlp-support branch 3 times, most recently from a82c5e2 to 82249fe Compare February 6, 2026 17:08
Comment thread otlp/IMPLEMENTATION_NOTES.md Outdated

### 3. Instrument Caching

**Implementation**: Thread-safe two-level locking pattern

Contributor

I can't tell whether this is internal to the stats library or external, i.e. whether users need to know about it.
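
For context on what a "two-level locking pattern" usually means, here is a minimal sketch: a shared read lock for cache hits, then an exclusive lock with a re-check for misses. This is an assumption about the general technique, not the code in otlp/sdk_handler.go; `instrumentCache` and `instrument` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// instrument is a placeholder for an SDK counter, gauge, or histogram.
type instrument struct{ name string }

// instrumentCache (hypothetical) creates each instrument at most once.
type instrumentCache struct {
	mu      sync.RWMutex
	byName  map[string]*instrument
	created int // number of instruments actually constructed
}

func newInstrumentCache() *instrumentCache {
	return &instrumentCache{byName: map[string]*instrument{}}
}

func (c *instrumentCache) get(name string) *instrument {
	// Level 1: shared read lock for the common case of a cache hit.
	c.mu.RLock()
	inst, ok := c.byName[name]
	c.mu.RUnlock()
	if ok {
		return inst
	}
	// Level 2: exclusive lock, re-checking because another goroutine may
	// have created the instrument between the two lock acquisitions.
	c.mu.Lock()
	defer c.mu.Unlock()
	if inst, ok := c.byName[name]; ok {
		return inst
	}
	inst = &instrument{name: name}
	c.byName[name] = inst
	c.created++
	return inst
}

func main() {
	c := newInstrumentCache()
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.get("http.requests")
		}()
	}
	wg.Wait()
	fmt.Println(c.created) // prints 1
}
```

Nothing here is visible to callers, which supports treating it as an internal detail.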

Comment thread otlp/IMPLEMENTATION_NOTES.md Outdated

### Default: Cumulative Temporality

**Decision**: Use cumulative temporality for all metric instruments (Prometheus-compatible)

Contributor

I think we should rewrite this to be aimed more at people who might be curious - it's weird to talk about a "decision" without a discussion of the tradeoffs that led to that

Contributor

Or maybe it just needs to be presented in reverse order

Comment thread otlp/README.md
- ✅ Resource detection
- ✅ Production-ready

2. **Handler** (Legacy): Custom OTLP implementation

Contributor

Do we have internal use of this?

If so I'd like to expand on why people shouldn't use this.

Contributor Author

Asked Claude to search for usage and it didn't find any. Adding a clear deprecation notice.

Comment thread otlp/sdk_handler.go Outdated
// For gRPC: "localhost:4317"
// For HTTP: "http://localhost:4318"
// If empty, uses OTEL_EXPORTER_OTLP_ENDPOINT environment variable
Endpoint string

Contributor

Note: people frequently get tripped up between "Endpoint" and "EndpointURL". We should probably note the difference here and say explicitly that this is "Endpoint".

Contributor Author

oooh good point.
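
To illustrate the distinction raised in this thread: in the OpenTelemetry Go OTLP exporters, `WithEndpoint` takes a bare `host:port`, while `WithEndpointURL` takes a full URL including the scheme. The sketch below uses only the standard library to tell the two styles apart; `endpointKind` is a hypothetical helper, not part of this PR or of the OpenTelemetry SDK.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// endpointKind (hypothetical helper) reports which style a string follows:
// "url" for http(s)://host[:port][/path], "hostport" for a bare host:port,
// and "invalid" for anything else.
func endpointKind(s string) string {
	if strings.Contains(s, "://") {
		u, err := url.Parse(s)
		if err == nil && u.Host != "" && (u.Scheme == "http" || u.Scheme == "https") {
			return "url"
		}
		return "invalid"
	}
	// A bare endpoint has no scheme and no path component.
	if s != "" && !strings.ContainsAny(s, "/ ") {
		return "hostport"
	}
	return "invalid"
}

func main() {
	fmt.Println(endpointKind("localhost:4317"))        // prints hostport
	fmt.Println(endpointKind("http://localhost:4318")) // prints url
}
```

A check like this could reject a URL passed where a bare endpoint is expected (or the reverse) with a clear error, rather than failing later at connection time.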

Comment thread otlp/go.mod
-go 1.23.0
+go 1.24.0

require (

Contributor

Should this really have its own separate go.mod? I guess so the other callers don't pull in all of the otel dependencies?

Comment thread otlp/handler.go Outdated
// EndpointURL: "http://localhost:4318",
// })
//
// Status: Alpha. This Handler is still in heavy development phase. Do not use

Contributor

Let's remove this comment since this is now deprecated

Comment thread README.md
defer stats.Flush()

// Or use environment variables (simplest)
// export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318

Contributor

This example switches from gRPC to HTTP in addition to switching from env vars to in memory. That's fine, but let's be explicit that this makes two changes (env var AND protocol) instead of just one.

Comment thread README.md

### Prometheus

The [github.com/segmentio/stats/v5/prometheus](https://godoc.org/github.com/segmentio/stats/v5/prometheus) package exposes an HTTP handler that serves metrics in Prometheus format.

Contributor

This is dumb, but for people who have never used Prometheus before, the pull model vs. push can be counterintuitive. Could we just add a sentence explaining that's how this works? "Note that with Prometheus, the metric server will poll your client for changes - metrics are not pushed from a client to the server"
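
A minimal sketch of the pull model described in this comment: the application only exposes an HTTP endpoint, and the metrics server initiates each request. It uses the standard library rather than the stats/prometheus package; `app` and `scrape` are hypothetical names, and the in-process recorder stands in for a real Prometheus server.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// app holds application state and serves it on request; it never dials out.
type app struct {
	requests atomic.Int64
}

// ServeHTTP writes the current counter value in Prometheus text format.
func (a *app) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "requests_total %d\n", a.requests.Load())
}

// scrape simulates the Prometheus server polling the application.
func scrape(h http.Handler) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
	return rec.Body.String()
}

func main() {
	a := &app{}
	a.requests.Add(3)
	fmt.Print(scrape(a)) // prints requests_total 3
}
```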

Comment thread otlp/sdk_handler.go Outdated
// If zero or not set, uses the SDK default (60 seconds)
ExportInterval time.Duration

// ExportTimeout specifies the timeout for exports

Contributor

I'd be more specific - "the maximum amount of time to wait for a request to the server to complete"

As written it could be confused with what ExportInterval does
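
To spell out the difference between the two settings, here is a sketch in which an interval controls how often an export starts and a timeout bounds how long a single export may take. This is an illustration under assumed semantics, not the SDK's periodic reader; all function and constant names are hypothetical.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

const (
	demoInterval = 10 * time.Millisecond // how often an export starts
	demoTimeout  = 5 * time.Millisecond  // how long one export may take
)

// neverResponds simulates a server that never answers.
func neverResponds(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// respondsImmediately simulates a healthy server.
func respondsImmediately(context.Context) error { return nil }

// exportOnce bounds a single export attempt by timeout and reports
// whether that attempt hit the deadline.
func exportOnce(timeout time.Duration, send func(context.Context) error) bool {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return errors.Is(send(ctx), context.DeadlineExceeded)
}

// runExports starts n export attempts, one per interval tick, and returns
// how many of them timed out.
func runExports(n int, interval, timeout time.Duration, send func(context.Context) error) int {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	timeouts := 0
	for i := 0; i < n; i++ {
		<-ticker.C
		if exportOnce(timeout, send) {
			timeouts++
		}
	}
	return timeouts
}

func main() {
	fmt.Println(runExports(2, demoInterval, demoTimeout, neverResponds)) // prints 2
}
```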

Comment thread otlp/sdk_handler.go

// HTTPOptions are additional options for HTTP protocol
// Only used when Protocol is ProtocolHTTPProtobuf
HTTPOptions []otlpmetrichttp.Option

Contributor

My first question with this and GRPCOptions is "what options exist?" Can we link to the docs?

Comment thread otlp/sdk_handler.go Outdated
// Set defaults for histogram configuration
if config.ExponentialHistogram {
    if config.ExponentialHistogramMaxSize == 0 {
        config.ExponentialHistogramMaxSize = 160

Contributor

let's pull into DefaultHistogramMaxSize const please

Comment thread otlp/sdk_handler.go Outdated
        config.ExponentialHistogramMaxSize = 160
    }
    if config.ExponentialHistogramMaxScale == 0 {
        config.ExponentialHistogramMaxScale = 20

Contributor

same
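
One way the two suggestions above might look once applied, as a sketch. `DefaultHistogramMaxSize` is the name proposed in the review; `DefaultHistogramMaxScale` is a hypothetical counterpart, and `histogramConfig` is a trimmed stand-in for SDKConfig whose field types are assumed.

```go
package main

import "fmt"

const (
	// DefaultHistogramMaxSize is the name suggested in the review.
	DefaultHistogramMaxSize = 160
	// DefaultHistogramMaxScale is a hypothetical name for the second default.
	DefaultHistogramMaxScale = 20
)

// histogramConfig is a trimmed stand-in for the relevant SDKConfig fields;
// the field types here are an assumption.
type histogramConfig struct {
	ExponentialHistogram         bool
	ExponentialHistogramMaxSize  int
	ExponentialHistogramMaxScale int
}

// applyHistogramDefaults fills in zero-valued settings when exponential
// histograms are enabled and leaves explicit values untouched.
func applyHistogramDefaults(c histogramConfig) histogramConfig {
	if !c.ExponentialHistogram {
		return c
	}
	if c.ExponentialHistogramMaxSize == 0 {
		c.ExponentialHistogramMaxSize = DefaultHistogramMaxSize
	}
	if c.ExponentialHistogramMaxScale == 0 {
		c.ExponentialHistogramMaxScale = DefaultHistogramMaxScale
	}
	return c
}

func main() {
	c := applyHistogramDefaults(histogramConfig{ExponentialHistogram: true})
	fmt.Println(c.ExponentialHistogramMaxSize, c.ExponentialHistogramMaxScale) // prints 160 20
}
```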

Comment thread otlp/sdk_handler.go Outdated
res := config.Resource
if res == nil {
    var err error
    res, err = resource.New(ctx,

Contributor

let's put a timeout on this - we shouldn't hang forever because we couldn't get a resource
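
A sketch of one way to bound detection time, assuming it is acceptable to fall back to a default when detection is slow or fails. `detectFunc` stands in for a call such as `resource.New`; the goroutine plus `select` means even a detector that ignores its context cannot block the caller. All names are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// detectFunc is a stand-in for a resource detection call.
type detectFunc func(ctx context.Context) (string, error)

// hangingDetector simulates a metadata endpoint that never answers.
func hangingDetector(ctx context.Context) (string, error) {
	<-ctx.Done()
	return "", ctx.Err()
}

// fastDetector simulates successful detection.
func fastDetector(context.Context) (string, error) {
	return "host.name=example", nil
}

// detectWithTimeout runs detect under a deadline and returns fallback if
// detection fails or does not finish in time.
func detectWithTimeout(parent context.Context, timeout time.Duration, detect detectFunc, fallback string) string {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()

	type result struct {
		res string
		err error
	}
	ch := make(chan result, 1) // buffered so the goroutine never blocks on send
	go func() {
		res, err := detect(ctx)
		ch <- result{res, err}
	}()

	select {
	case r := <-ch:
		if r.err != nil {
			return fallback
		}
		return r.res
	case <-ctx.Done():
		return fallback
	}
}

// demoResults runs both detectors with a short deadline.
func demoResults() (fast, slow string) {
	ctx := context.Background()
	fast = detectWithTimeout(ctx, 20*time.Millisecond, fastDetector, "unknown")
	slow = detectWithTimeout(ctx, 20*time.Millisecond, hangingDetector, "unknown")
	return fast, slow
}

func main() {
	fmt.Println(demoResults())
}
```

Whether to fall back silently or return an error to the caller is a design choice this sketch does not settle.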

Comment thread otlp/sdk_handler.go Outdated
}

default:
    return nil, fmt.Errorf("unsupported protocol: %s", protocol)

Contributor

Suggested change
- return nil, fmt.Errorf("unsupported protocol: %s", protocol)
+ return nil, fmt.Errorf("unsupported protocol: %q", protocol)

"%q" will make clear 'you passed the empty string' vs. 'we forgot to include the variable in the error message'

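The difference between the two verbs is easiest to see with an empty value. A small runnable comparison, with hypothetical helper names:

```go
package main

import "fmt"

// unsupportedS formats the protocol with %s, as in the original line.
func unsupportedS(protocol string) error {
	return fmt.Errorf("unsupported protocol: %s", protocol)
}

// unsupportedQ formats the protocol with %q, as in the suggested change.
func unsupportedQ(protocol string) error {
	return fmt.Errorf("unsupported protocol: %q", protocol)
}

func main() {
	fmt.Println(unsupportedS(""))    // prints: unsupported protocol: 
	fmt.Println(unsupportedQ(""))    // prints: unsupported protocol: ""
	fmt.Println(unsupportedQ("htp")) // prints: unsupported protocol: "htp"
}
```

With `%q`, leading or trailing whitespace in the value also becomes visible.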
Comment thread otlp/sdk_handler.go Outdated
case stats.Counter:
    counter, err := meter.Int64Counter(name)
    if err != nil {
        log.Printf("stats/otlp: failed to create counter %s: %v", name, err)

Contributor

slog.Info() ?

Contributor

same for all in this file
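
A sketch of what the structured-logging version might look like with the standard `log/slog` package. The review floats `slog.Info()`; this sketch logs at Error level because the message reports a failure, which is a judgment call. `logCreateFailure` and `renderExample` are hypothetical helpers.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"log/slog"
)

// logCreateFailure emits one structured record instead of a formatted
// string, so the instrument kind, name, and error become separate fields.
func logCreateFailure(logger *slog.Logger, kind, name string, err error) {
	logger.Error("stats/otlp: failed to create instrument",
		"kind", kind,
		"name", name,
		"err", err,
	)
}

// renderExample logs one failure to an in-memory buffer and returns the text.
func renderExample() string {
	var buf bytes.Buffer
	logger := slog.New(slog.NewTextHandler(&buf, nil))
	logCreateFailure(logger, "counter", "requests.count", errors.New("boom"))
	return buf.String()
}

func main() {
	fmt.Print(renderExample())
}
```

Passing the logger in, rather than calling the package-level functions, lets tests capture output and lets applications route it through their own handler.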

@kevinburkesegment kevinburkesegment force-pushed the add-opentelemetry-otlp-support branch 7 times, most recently from 136c900 to 3441b0b Compare June 15, 2026 16:17
Implement a production-ready OpenTelemetry Protocol (OTLP) exporter
using the official OpenTelemetry SDK, supporting both gRPC and HTTP
transports, and deprecate the legacy alpha handler in preparation for
v6.

Features:
- gRPC and HTTP/Protobuf protocol support
- Counter, Gauge, and Histogram metric types
- Optional exponential histogram aggregation
- Configurable temporality (cumulative default, Prometheus-compatible)
- Tag to attribute conversion
- Thread-safe instrument caching

Implementation:
- Gauges use the native Float64Gauge instrument for instantaneous
  value recording
- Background context for recording to avoid cancellation issues
- Lock-free reads for instrument lookup in hot path
- A single Meter is created once and reused across recordings

Environment variables:
- The transport protocol is resolved from OTEL_EXPORTER_OTLP_PROTOCOL,
  with OTEL_EXPORTER_OTLP_METRICS_PROTOCOL taking precedence; an
  unrecognized value is rejected. An explicit SDKConfig.Protocol always
  wins. We resolve only this variable ourselves because the
  otlpmetricgrpc/otlpmetrichttp exporters do not read the protocol
  selector.
- The exporters read the remaining OTEL_EXPORTER_OTLP_* variables
  (endpoint, headers, timeout, compression, ...) themselves; programmatic
  overrides are available via WithEndpointURL, GRPCOptions, and
  HTTPOptions.
- Resource attributes come from OTEL_RESOURCE_ATTRIBUTES/OTEL_SERVICE_NAME
  plus host and process detection. Cloud and Kubernetes detection is
  opt-in via the contrib/detectors/* packages.
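
The precedence described above (explicit config, then the metrics-specific variable, then the general variable) can be sketched as follows. This is an illustration, not the code in this commit: `resolveProtocol` and `mapEnv` are hypothetical, and the `fallback` parameter is an assumption, since the text above does not say what happens when nothing is set.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

const (
	protoGRPC = "grpc"
	protoHTTP = "http/protobuf"
)

// resolveProtocol picks the first non-empty candidate in precedence order
// and rejects it if it is not a recognized protocol.
func resolveProtocol(explicit string, getenv func(string) string, fallback string) (string, error) {
	candidates := []string{
		explicit, // an explicit setting always wins
		getenv("OTEL_EXPORTER_OTLP_METRICS_PROTOCOL"),
		getenv("OTEL_EXPORTER_OTLP_PROTOCOL"),
		fallback, // assumption: caller-supplied default
	}
	for _, p := range candidates {
		if p == "" {
			continue
		}
		switch p {
		case protoGRPC, protoHTTP:
			return p, nil
		default:
			return "", fmt.Errorf("unsupported protocol: %q", p)
		}
	}
	return "", errors.New("no protocol configured")
}

// mapEnv adapts a map to the getenv signature, which keeps tests hermetic.
func mapEnv(m map[string]string) func(string) string {
	return func(key string) string { return m[key] }
}

func main() {
	p, err := resolveProtocol("", os.Getenv, protoGRPC)
	fmt.Println(p, err)
}
```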

Config API:
- SDKConfig.EndpointURL takes a full URL with scheme (http:// or
  https://); WithEndpointURL is used to avoid a known gRPC bug with the
  http:// scheme
- ExportInterval and ExportTimeout fall back to SDK defaults (60s and
  30s) when unset

Deprecations:
- Deprecate otlp.Handler (Alpha since 2022, minimal usage)
- Deprecate otlp.HTTPClient
- Deprecate otlp.NewHTTPClient()
All will be removed in v6.0.0. Migration path provided in deprecation
notices with code examples.

Testing:
- Unit tests and benchmarks for instrument handling and value conversion
- Integration tests that export to an in-process gRPC OTLP collector and
  assert on the metrics received over the wire, covering protocol
  resolution precedence and the invalid-protocol error path

Documentation:
- Complete README with configuration examples
- Cloud resource detector usage guides
- Implementation notes explaining design decisions and temporality
- Example code for common use cases
- HISTORY.md release notes for v5.9.0

Performance:
- Preallocate tag/attribute slices to their exact length and assign by
  index instead of appending, in the OTLP handlers and in the core
  stats.M and tagFuncMap.namedTagFuncs helpers
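
A minimal sketch of the preallocation pattern named above: size the destination slice to its exact length once and assign by index, rather than growing it with `append`. The `tag` and `attribute` types are local stand-ins, not the stats or OpenTelemetry types.

```go
package main

import "fmt"

// tag and attribute are local stand-ins for a stats tag and an
// OpenTelemetry attribute.
type tag struct{ Name, Value string }
type attribute struct{ Key, Value string }

// tagsToAttributes converts tags with a single allocation of exact size.
func tagsToAttributes(tags []tag) []attribute {
	attrs := make([]attribute, len(tags)) // exact length, never regrown
	for i, t := range tags {
		attrs[i] = attribute{Key: t.Name, Value: t.Value}
	}
	return attrs
}

func main() {
	attrs := tagsToAttributes([]tag{{"env", "prod"}, {"region", "us-west-2"}})
	fmt.Println(len(attrs), cap(attrs)) // prints 2 2
}
```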

Bumps version to 5.9.0. All tests pass.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Kevin Burke <kburke@twilio.com>
@kevinburkesegment kevinburkesegment force-pushed the add-opentelemetry-otlp-support branch from 3441b0b to 7e7070d Compare June 15, 2026 22:56
@kevinburkesegment kevinburkesegment merged commit 7e7070d into main Jun 15, 2026
7 of 9 checks passed
@kevinburkesegment kevinburkesegment deleted the add-opentelemetry-otlp-support branch June 15, 2026 22:56

5 participants