OpenTelemetry export

@nwire/telemetry-otel is an opt-in bridge that translates Nwire's canonical telemetry stream into OTLP spans + events. Pair it with any OTel backend — Datadog, Honeycomb, Tempo, Jaeger, Vector + GreptimeDB.

Install

bash
pnpm add @nwire/telemetry-otel @opentelemetry/api \
         @opentelemetry/sdk-trace-node \
         @opentelemetry/sdk-trace-base \
         @opentelemetry/exporter-trace-otlp-http \
         @opentelemetry/resources \
         @opentelemetry/semantic-conventions

Note: @opentelemetry/api is a peer dep of @nwire/telemetry-otel — Nwire never imports it directly. You bring the version your stack needs.

Wire it up

ts
// otel.ts: boot-time OTel setup
// Written against the OpenTelemetry JS SDK 1.x API (Resource class, addSpanProcessor).
import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node"
import { BatchSpanProcessor } from "@opentelemetry/sdk-trace-base"
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http"
import { Resource } from "@opentelemetry/resources"
import { SemanticResourceAttributes } from "@opentelemetry/semantic-conventions"
import { trace } from "@opentelemetry/api"

export function setupOtel(serviceName: string) {
  const provider = new NodeTracerProvider({
    resource: new Resource({
      [SemanticResourceAttributes.SERVICE_NAME]: serviceName,
      [SemanticResourceAttributes.SERVICE_VERSION]: process.env.GIT_SHA ?? "dev",
    }),
  })
  provider.addSpanProcessor(
    new BatchSpanProcessor(
      new OTLPTraceExporter({ url: process.env.OTLP_URL ?? "http://localhost:4318/v1/traces" }),
    ),
  )
  provider.register()
  // Return the provider as well, so the shutdown handler can flush it.
  return { provider, tracer: trace.getTracer(serviceName) }
}

ts
// main.ts: boot the app + attach the exporter
import { attachOtelExporter } from "@nwire/telemetry-otel"
import { setupOtel } from "./otel"

const { provider, tracer } = setupOtel("amit")
const app = learnflowApp.create({ ... })
const detach = attachOtelExporter(app.runtime, { tracer })

await app.start()
// graceful shutdown: detach first, then stop the app, then flush the SDK
process.on("SIGTERM", async () => {
  detach()
  await app.stop()
  await provider.shutdown() // flushes batched spans before exit
})

That's it. Every Nwire telemetry record now flows out as OTLP. Studio keeps consuming the in-process stream natively — they coexist.

What gets exported

| Telemetry kind | OTel mapping |
| --- | --- |
| action.dispatched | open span nwire.action {name} |
| action.completed | close OK with duration_ms |
| action.failed | action.failed span event |
| dlq.recorded | close ERROR with attempts + error |
| event.published | span event on parent action span (by causationId) |
| actor.transitioned | span event |
| projection.folded | span event |
| reaction.fired | ad-hoc child span nwire.reaction {sourceEvent} |
| reaction.failed | span event with error |
| query.executed | ad-hoc span nwire.query {name} |
| timer.scheduled / timer.fired | ad-hoc span |
| external.call.started | open span nwire.external {call} |
| external.call.completed / .failed | close OK/ERROR |
| inbound.webhook.received | ad-hoc span |
| outbox.flushed | ad-hoc span |
| inbox.dedup.hit | ad-hoc span |
| queue.job.* | ad-hoc span per kind |
| cron.fired | ad-hoc span with late_by_ms |
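
The open/close rows for action spans can be pictured as a small state machine. The sketch below is a simplified illustration only: the `TelemetryRecord` and `SketchSpan` shapes, the `SpanLifecycle` class, and keying open spans by `messageId` are all assumptions made for the example, not the real @nwire/telemetry-otel internals.

```typescript
// Hypothetical record and span shapes, for illustration only.
type TelemetryRecord = {
  kind: "action.dispatched" | "action.completed" | "action.failed" | "dlq.recorded"
  messageId: string
  name: string
}

type SketchSpan = {
  name: string
  status: "UNSET" | "OK" | "ERROR"
  events: string[]
}

class SpanLifecycle {
  // Spans that have been opened but not yet closed, keyed by message ID (assumed key).
  private open = new Map<string, SketchSpan>()
  readonly finished: SketchSpan[] = []

  handle(record: TelemetryRecord): void {
    if (record.kind === "action.dispatched") {
      // Open a span named "nwire.action {name}".
      this.open.set(record.messageId, {
        name: `nwire.action ${record.name}`,
        status: "UNSET",
        events: [],
      })
      return
    }
    const span = this.open.get(record.messageId)
    if (!span) return // no matching open span, nothing to update
    if (record.kind === "action.failed") {
      span.events.push("action.failed") // recorded as a span event; span stays open
      return
    }
    // action.completed closes the span OK; dlq.recorded closes it ERROR.
    span.status = record.kind === "action.completed" ? "OK" : "ERROR"
    this.open.delete(record.messageId)
    this.finished.push(span)
  }
}
```

In this model a failed attempt leaves the span open, so a later retry that succeeds still closes it OK, while a dead-lettered message closes it ERROR with the failure events attached.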

Span attributes

Spans carry the following attributes. Some appear only on specific span types or when the value is set, as noted:

nwire.app                   = "amit"
nwire.action                = "submissions.submit-answer"   (action spans)
nwire.message_id            = "..."
nwire.correlation_id        = "..."
nwire.causation_id          = "..."
nwire.tenant                = "school-tlv"     (when set)
nwire.user_id               = "avi"            (when set)
nwire.action.duration_ms    = 124              (action.completed)
nwire.action.emitted_events = "answer-submitted,…"
nwire.external.target       = "stripe//v1/payment_intents"
nwire.external.idempotency_key = "charge-order-123-42"

Persona / journeyStep / SLO metadata travels through the envelope too — see Studio-aware metadata.

Datadog / Honeycomb / Tempo

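Set OTLP_URL to your backend's OTLP/HTTP traces endpoint. The Datadog Agent, with OTLP ingestion enabled, listens on localhost:4318/v1/traces. Honeycomb authenticates with the x-honeycomb-team header. Tempo accepts OTLP over gRPC on 4317 and over HTTP on 4318; the exporter-trace-otlp-http package used above needs the HTTP port.

bash
# Datadog Agent on the host
OTLP_URL=http://localhost:4318/v1/traces pnpm dev

# Honeycomb direct (the OTLP exporter reads extra headers from OTEL_EXPORTER_OTLP_HEADERS)
OTLP_URL=https://api.honeycomb.io/v1/traces \
OTEL_EXPORTER_OTLP_HEADERS="x-honeycomb-team=xxx" \
  pnpm dev

# Tempo / Jaeger via OTLP over HTTP
OTLP_URL=http://tempo:4318/v1/traces pnpm dev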

Vector + GreptimeDB (cloud ingestion)

The recommended self-hosted observability stack: Vector routes OTLP → GreptimeDB stores + queries.

toml
# vector.toml
[sources.otlp]
type = "opentelemetry"
grpc.address = "0.0.0.0:4317"
http.address = "0.0.0.0:4318"

[sinks.greptime]
type = "greptimedb_logs"  # also: greptimedb_metrics, greptimedb_traces
inputs = ["otlp"]
endpoint = "http://greptime:4001"
table = "nwire_traces"

yaml
# docker-compose.yml
services:
  greptime:
    image: greptime/greptimedb:latest
    command: standalone start --http-addr 0.0.0.0:4000 --rpc-addr 0.0.0.0:4001
    ports: ["4000:4000", "4001:4001"]
  vector:
    image: timberio/vector:0.43.0-alpine
    volumes: ["./vector.toml:/etc/vector/vector.toml:ro"]
    ports: ["4317:4317", "4318:4318"]

bash
OTLP_URL=http://localhost:4318/v1/traces pnpm dev

Then query in Greptime via SQL or PromQL. The Studio-aware metadata you declared on defineAction / defineEvent / defineActor becomes queryable dimensions automatically:

sql
-- p95 latency by persona
SELECT
  attributes['nwire.persona'] AS persona,
  approx_percentile_cont(attributes['nwire.action.duration_ms'], 0.95) AS p95
FROM nwire_traces
WHERE attributes['kind'] = 'span.kind.action.completed'
GROUP BY persona;

-- success rate by journey step
SELECT
  attributes['nwire.journey_step'] AS journey_step,
  SUM(CASE WHEN status_code = 'OK' THEN 1 ELSE 0 END) * 1.0 / COUNT(*) AS success_rate
FROM nwire_traces
GROUP BY journey_step;

-- stuck submissions
SELECT actor_key, MAX(end_time) - MIN(start_time) AS stuck_for
FROM nwire_traces
WHERE attributes['nwire.actor'] = 'submission'
  AND attributes['nwire.actor.to'] = 'under-review'
GROUP BY actor_key
HAVING MAX(end_time) - MIN(start_time) > INTERVAL '48 hours';

Filtering — keep volume sane in prod

By default attachOtelExporter forwards every record. In high-volume production you'll want to keep just the spans you actually look at:

ts
attachOtelExporter(app.runtime, {
  tracer,
  kinds: [
    "action.dispatched",
    "action.completed",
    "action.failed",
    "external.call.started",
    "external.call.completed",
    "external.call.failed",
    "dlq.recorded",
  ],
})

The other kinds still flow through Studio's local view (the canonical stream is untouched); only the OTLP export is filtered.
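
One way to keep the full stream in development and the reduced set in production is to build the options per environment. This is a minimal sketch: `exporterFilterFor` and `PROD_KINDS` are hypothetical names introduced here, and only the `kinds` option itself comes from the example above.

```typescript
// Hypothetical helper: returns the `kinds` fragment of the exporter options.
const PROD_KINDS: readonly string[] = [
  "action.dispatched",
  "action.completed",
  "action.failed",
  "external.call.started",
  "external.call.completed",
  "external.call.failed",
  "dlq.recorded",
]

function exporterFilterFor(env: string | undefined): { kinds?: string[] } {
  // Outside production the key is omitted entirely, which keeps the
  // documented default of forwarding every record.
  return env === "production" ? { kinds: [...PROD_KINDS] } : {}
}

// Intended use (not executed here):
//   attachOtelExporter(app.runtime, { tracer, ...exporterFilterFor(process.env.NODE_ENV) })
```

Omitting the key rather than passing an empty list avoids relying on how the exporter would treat an empty `kinds` array, which the text above does not specify.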

Sampling

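For tracing-volume control, configure the OTel SDK's sampler at the provider level; Nwire doesn't intercept. ParentBasedSampler wrapping TraceIdRatioBasedSampler is the common pattern: head sampling at 10% keeps every sampled trace complete (child spans follow the root's decision) while cutting volume by roughly 90%.

ts
import { NodeTracerProvider } from "@opentelemetry/sdk-trace-node"
import { ParentBasedSampler, TraceIdRatioBasedSampler } from "@opentelemetry/sdk-trace-base"

const provider = new NodeTracerProvider({
  // Sample 10% of root spans; child spans inherit their parent's decision.
  sampler: new ParentBasedSampler({ root: new TraceIdRatioBasedSampler(0.1) }),
  resource: ...,
})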
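
To see why ratio-based head sampling keeps traces whole, it helps to look at how such a decision can be derived from the trace ID alone. The function below is a simplified illustration of the idea, not the SDK's actual source; `shouldSample` is a name made up for this sketch.

```typescript
// Simplified model of trace-ID ratio sampling (not the SDK implementation).
// traceId is the usual 32-character lowercase hex string.
function shouldSample(traceId: string, ratio: number): boolean {
  if (ratio >= 1) return true
  if (ratio <= 0) return false
  // Treat the low 32 bits of the trace ID as a uniform number in [0, 2^32).
  const low32 = parseInt(traceId.slice(-8), 16)
  // Keep the trace when that number falls below ratio * 2^32.
  return low32 < Math.floor(ratio * 0x100000000)
}
```

Because the decision is a pure function of the trace ID, every process that sees the same trace ID reaches the same verdict, and a parent-based wrapper then makes child spans follow the root's verdict instead of deciding again.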

Always sample errors at 100%

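Head sampling decides before the outcome is known, and a trace dropped in the SDK never reaches the collector, so it cannot be recovered there. To retain every trace containing action.failed / reaction.failed / dlq.recorded, export unsampled (or at a high ratio) from the service and make the keep/drop decision with a tail sampler at the collector, for example the OpenTelemetry Collector's tail_sampling processor. If you route through Vector, its sample transform can exempt events matching a VRL condition from sampling.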
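
The retention rule itself is simple once a whole trace is available. The sketch below shows the decision a tail sampler would make; `FinishedSpan`, `keepTrace`, and the `baselineKeep` callback are hypothetical names for this illustration, and a real collector expresses the same rule in its own configuration rather than in application code.

```typescript
// Hypothetical shape of a completed span as seen by a tail sampler.
type FinishedSpan = {
  traceId: string
  status: "UNSET" | "OK" | "ERROR"
  events: string[]
}

// Span events that should force retention of the whole trace.
const ALWAYS_KEEP_EVENTS = new Set(["action.failed", "reaction.failed", "dlq.recorded"])

// Decide for a complete trace: failures are always kept, everything else
// falls through to the baseline policy (for example a 10% ratio).
function keepTrace(spans: FinishedSpan[], baselineKeep: (traceId: string) => boolean): boolean {
  const hasFailure = spans.some(
    (span) => span.status === "ERROR" || span.events.some((e) => ALWAYS_KEEP_EVENTS.has(e)),
  )
  if (hasFailure) return true
  const first = spans[0]
  return first !== undefined && baselineKeep(first.traceId)
}
```

The decision needs every span of the trace, which is why it has to run at a point that sees the full trace after it completes rather than inside the service at span start.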

eventsAsSpans — flatten for backends that prefer it

By default event.published is attached as a span event on the parent action span. Some backends (early Honeycomb, basic Jaeger) only render spans, not span events. Flip the mode:

ts
attachOtelExporter(app.runtime, { tracer, eventsAsSpans: true })

Each event becomes its own span (nwire.event submissions.answer-submitted), attached to the same trace through the parent action's traceId.
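
The difference between the two modes can be shown with plain data. This is an illustrative model only: `ParentSpan`, `ChildSpan`, and `exportEvent` are invented for the sketch, and linking the child through `parentSpanId` is an assumption (the text above only states that the trace ID is shared).

```typescript
// Hypothetical minimal shapes for the parent action span and the flattened event span.
type ParentSpan = { traceId: string; spanId: string; events: string[] }
type ChildSpan = { name: string; traceId: string; parentSpanId: string }

function exportEvent(
  parent: ParentSpan,
  eventName: string,
  eventsAsSpans: boolean,
): ChildSpan | undefined {
  if (!eventsAsSpans) {
    // Default mode: the event is recorded on the parent action span.
    parent.events.push(eventName)
    return undefined
  }
  // Flattened mode: the event becomes its own span in the same trace.
  return {
    name: `nwire.event ${eventName}`,
    traceId: parent.traceId,
    parentSpanId: parent.spanId, // assumed parent link
  }
}
```

Flattening raises the span count per trace, so it is worth checking how it interacts with the kinds filter and with per-span pricing on your backend.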

detach() and graceful shutdown

ts
const detach = attachOtelExporter(app.runtime, { tracer })

// later:
detach()  // unsubscribes; closes still-open spans with 'nwire.unsubscribed' event

Always detach() before provider.shutdown() in your shutdown handler so the SDK has clean spans to flush.

Common pitfalls

Trace context across cross-service hops

Nwire's correlation chain travels via envelope.correlationId / causationId, which are Nwire-native, not OTel-native. If a reaction calls ctx.request() on an action in another service over the bus, the OTel trace forks: the receiving service starts a new trace with no link to the caller's. Either bridge the trace context manually in your bus adapter (see the Multi-service guide), or accept that cross-service hops break the single-trace illusion and rely on correlationId for cross-service search.
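
Bridging usually means carrying a W3C `traceparent` value alongside the message and restoring it on the receiving side. The sketch below covers only the encoding and decoding of that header; where it lives on the envelope (for example a `headers.traceparent` field) is an assumption, and the helper names are hypothetical.

```typescript
// Minimal W3C Trace Context "traceparent" encode/decode (version 00 only).
type TraceContext = { traceId: string; spanId: string; sampled: boolean }

function toTraceparent(ctx: TraceContext): string {
  // Format: version-traceId-parentId-flags, all lowercase hex.
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`
}

function fromTraceparent(header: string): TraceContext | undefined {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header)
  if (!match) return undefined
  const [, traceId, spanId, flags] = match
  if (traceId === undefined || spanId === undefined || flags === undefined) return undefined
  // All-zero IDs are invalid in the W3C Trace Context format.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined
  // Bit 0 of the flags byte is the "sampled" flag.
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 }
}
```

In a real adapter you would more likely let @opentelemetry/api do this with `propagation.inject` on the sending side and `propagation.extract` on the receiving side, using the envelope's header bag as the carrier; the hand-rolled version above only makes the wire format visible.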

OTel SDK is heavy

The SDK adds roughly 20 MB to the bundle and about 50 ms to cold start. That is worth it in prod; leave it disabled in dev unless you are debugging trace shape.

MIT licensed.