Testing

Nwire ships a real test substrate, @nwire/test-kit, so you never have to wire one yourself. It offers three layers of increasing realism.

TL;DR

import { harness } from "@nwire/test-kit"

const t = await harness({ app: submissionsApp })
await t.dispatch(submitAnswer, { studentId: "avi", answer: "alef" })
await t.idle()

expect(t.telemetry.count("event.published")).toBe(1)
expect(t.telemetry.chain("submissions.answer-submitted")).toHaveLength(2)
await t.stop()

These few lines boot the whole app, dispatch a real action through the runtime, wait for every reaction to settle, query the canonical telemetry stream, and shut down. Nothing in the runtime is mocked and there is no HTTP layer, so the test runs at in-process speed.

What the harness gives you

const t = await harness({ app, providers? })

t.app                              // the booted App — escape hatch
t.dispatch(action, input, env?)    // returns event(s) the handler emitted
t.query(queryDef, input, tenant?)  // runs a query against the projection
t.idle(graceMs?, timeoutMs?)       // resolves when pipeline is quiet
t.telemetry                        // TelemetryProbe — see below
t.stop()                           // graceful shutdown

harness() calls app.create() with optional providers overrides (actorStore, projectionStore, bus, logger, deadLetterSink). With no overrides you get the framework's in-memory defaults — fast and isolated.

Three layers of realism

1. Unit/integration — in-memory, no I/O

The default. Use this for 90% of your tests. Pure logic checks, fast, deterministic.

describe("submissions.submit-answer", () => {
  it("emits answer-submitted and fires the auto-grade reaction", async () => {
    const t = await harness({ app: submissionsApp })
    await t.dispatch(submitAnswer, { studentId: "avi", answer: "alef" })
    await t.idle()

    expect(t.telemetry.count("event.published", e => e.event.eventName === "submissions.answer-submitted")).toBe(1)
    expect(t.telemetry.count("reaction.fired")).toBeGreaterThanOrEqual(1)
    await t.stop()
  })
})

2. Real-deps — docker compose preset

Use this layer when you need to test against actual Mongo, NATS, Redis, or Postgres. dockerCompose({ preset }) spins up a stack, returns its ports, and tears the stack down when the tests end.

import { harness, dockerCompose } from "@nwire/test-kit"
import { MongoActorStore } from "@nwire/store-mongo"
import { MongoClient } from "mongodb"

describe("submissions vs real Mongo", () => {
  let stack: Awaited<ReturnType<typeof dockerCompose>>
  let mongo: MongoClient

  beforeAll(async () => {
    stack = await dockerCompose({ preset: "mongo" })
    mongo = await MongoClient.connect(`mongodb://root:example@localhost:${stack.ports.mongo}`)
  })
  afterAll(async () => {
    await mongo.close()
    await stack.stop()
  })

  it("persists actor state across harness restarts", async () => {
    const t = await harness({
      app: submissionsApp,
      providers: { actorStore: new MongoActorStore(mongo.db("nwire-test")) },
    })
    await t.dispatch(submitAnswer, { studentId: "avi", answer: "alef" })
    await t.idle()
    await t.stop()

    const t2 = await harness({
      app: submissionsApp,
      providers: { actorStore: new MongoActorStore(mongo.db("nwire-test")) },
    })
    const view = await t2.app.runtime.getActorStore().load("submission", "avi", "")
    expect(view?.state).toBe("submitted")
    await t2.stop()
  })
})

Available presets: mongo, nats, mongo+nats, redis. Or pass your own yaml: string for a custom stack.

Skip in CI when Docker isn't available

Gate real-deps tests behind describe.skipIf(!process.env.HAS_DOCKER) or similar. Keep them in a separate *.integration.test.ts glob that CI runs on a nightly job.

3. BDD — Gherkin scenarios

For full end-to-end story coverage. Each .feature file is one user scenario; @nwire/test-kit/bdd wires steps to the harness.

# avi-submits.feature
Feature: Avi submits an answer
  As a 9-year-old student
  I want to submit my answer to a Hebrew Letters exercise
  So that I see immediate feedback

  Scenario: high-confidence answer is auto-graded
    Given Avi is enrolled in "Hebrew Letters"
    When Avi submits the answer "alef" to exercise "ex-1"
    Then within 200ms the system records submissions.answer-submitted
    And within 1000ms the system records submissions.auto-graded
    And Avi sees a "Correct!" toast

// avi-submits.test.ts
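import { feature } from "@nwire/test-kit/bdd"
import { harness } from "@nwire/test-kit"
import { submissionsApp } from "../app"
// enrollStudent and submitAnswer are action definitions; import them from wherever your app declares them.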

feature("./avi-submits.feature", (ctx) => {
  ctx.background(async () => { ctx.harness = await harness({ app: submissionsApp }) })
  ctx.afterScenario(async () => ctx.harness?.stop())

  ctx.given("Avi is enrolled in {course}", async (course) => {
    await ctx.harness.dispatch(enrollStudent, { studentId: "avi", course })
  })
  ctx.when("Avi submits the answer {answer} to exercise {id}", async (answer, id) => {
    await ctx.harness.dispatch(submitAnswer, { studentId: "avi", exerciseId: id, answer })
  })
  ctx.then("within {ms}ms the system records {event}", async (ms, event) => {
    // ms is parsed from the step text and passed as idle()'s graceMs (the quiet window).
    await ctx.harness.idle(Number.parseInt(ms, 10))
    expect(ctx.harness.telemetry.count("event.published", e => e.event.eventName === event)).toBeGreaterThan(0)
  })
})

Requires @amiceli/vitest-cucumber as a peer dep: pnpm add -D @amiceli/vitest-cucumber.

TelemetryProbe — querying what happened

t.telemetry exposes the captured canonical telemetry stream. Five query methods cover most assertions (the SLO pattern below also uses ofKind to list the records of one kind):

// 1. count records of a kind, optionally matching a predicate
t.telemetry.count("event.published")
t.telemetry.count("event.published", e => e.event.eventName === "submissions.answer-submitted")
t.telemetry.count("action.failed")

// 2. walk a causal chain forward (by event name OR correlationId)
const chain = t.telemetry.chain("submissions.answer-submitted")
// → [action.dispatched, event.published, actor.transitioned, reaction.fired, …]

// 3. observed event→event pairs from causation chains
const pairs = t.telemetry.observedPairs()
// → Map<"submissions.answer-submitted→submissions.auto-graded", 1>

// 4. drift = observed pairs not in your declared set
const declared = new Set(["a→b", "b→c"])
t.telemetry.drift(declared)
// → [{ pair: "a→d", count: 1 }]   // a→d fired but wasn't declared

// 5. anything that failed
t.telemetry.errors()
// → [{ kind: "action.failed", ... }, { kind: "reaction.failed", ... }]
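
To make the relationship between observedPairs() and drift() concrete, here is a minimal sketch of how such pairs could be derived from causation links. It is not the library's implementation: the PublishedEvent shape, the causedBy field, and both function names are assumptions made for illustration.

```typescript
// Hypothetical record shape: `id` and `causedBy` are illustrative field names,
// not the real telemetry schema.
interface PublishedEvent {
  id: string
  eventName: string
  causedBy?: string // id of the event whose reaction produced this one
}

// Count each parent→child event-name pair found via causation links.
function observedPairsSketch(events: PublishedEvent[]): Map<string, number> {
  const byId = new Map<string, PublishedEvent>()
  for (const e of events) byId.set(e.id, e)

  const pairs = new Map<string, number>()
  for (const e of events) {
    const parent = e.causedBy === undefined ? undefined : byId.get(e.causedBy)
    if (parent === undefined) continue // root event: nothing caused it
    const key = `${parent.eventName}→${e.eventName}`
    pairs.set(key, (pairs.get(key) ?? 0) + 1)
  }
  return pairs
}

// Drift = observed pairs that are missing from the declared set.
function driftSketch(
  observed: Map<string, number>,
  declared: Set<string>,
): { pair: string; count: number }[] {
  const undeclared: { pair: string; count: number }[] = []
  observed.forEach((count, pair) => {
    if (!declared.has(pair)) undeclared.push({ pair, count })
  })
  return undeclared
}

const sampleEvents: PublishedEvent[] = [
  { id: "1", eventName: "a" },
  { id: "2", eventName: "b", causedBy: "1" },
  { id: "3", eventName: "d", causedBy: "1" },
]
const sampleDrift = driftSketch(observedPairsSketch(sampleEvents), new Set(["a→b", "b→c"]))
// sampleDrift → [{ pair: "a→d", count: 1 }]
```

Declared pairs that never fire (here "b→c") are not reported by this sketch; it only flags pairs that were observed without being declared, which matches the description of drift above.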

idle() — waiting for the pipeline to settle

The single most useful primitive. Dispatching an action kicks off a cascade — reactions fire, projections fold, timers schedule. t.idle() resolves once the cascade quiets:

await t.dispatch(submitAnswer, { ... })
await t.idle()
// guaranteed: no action in flight, no telemetry record for ≥25ms

Tunables: idle(graceMs = 25, timeoutMs = 5000). Bump graceMs if your reactions chain into long async work (e.g., real Mongo calls).
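
The interplay of graceMs and timeoutMs can be pictured with a small quiescence check. This is a sketch under assumptions, not how @nwire/test-kit implements idle(): the PipelineActivity shape, the polling interval, and the choice to reject on timeout are all hypothetical.

```typescript
// Hypothetical snapshot of pipeline state; field names are illustrative.
interface PipelineActivity {
  inFlight: number // actions currently executing
  lastRecordAt: number // timestamp (ms) of the most recent telemetry record
}

// Quiet = nothing in flight AND no telemetry record for at least graceMs.
function isQuiet(activity: PipelineActivity, now: number, graceMs: number): boolean {
  return activity.inFlight === 0 && now - activity.lastRecordAt >= graceMs
}

// Poll until the pipeline is quiet; give up once timeoutMs has elapsed.
// Rejecting on timeout is an assumption made for this sketch.
async function waitForIdleSketch(
  read: () => PipelineActivity,
  graceMs = 25,
  timeoutMs = 5000,
): Promise<void> {
  const startedAt = Date.now()
  while (!isQuiet(read(), Date.now(), graceMs)) {
    if (Date.now() - startedAt >= timeoutMs) {
      throw new Error(`pipeline did not go idle within ${timeoutMs}ms`)
    }
    await new Promise<void>((resolve) => setTimeout(resolve, 5))
  }
}
```

In this model a larger graceMs makes the check more tolerant of gaps between records (for example while a slow store call is pending), at the cost of every idle wait taking at least that long.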

Don't await raw dispatch() and assume reactions ran

runtime.dispatch() returns as soon as the handler returns. Reactions that call ctx.request() to trigger other actions finish later: they share the same correlation but run in separate microtask windows. Always await t.idle() before asserting on reaction effects.

Fixtures — zod-driven factories

import { factory, sequence } from "@nwire/test-kit"

const submissionFactory = factory(SubmissionInput, {
  studentId: sequence((n) => `student-${n}`),
  exerciseId: sequence((n) => `ex-${n}`),
  answer: "alef",
})

const s = submissionFactory.build()
// { studentId: "student-1", exerciseId: "ex-1", answer: "alef" }

const batch = submissionFactory.buildList(10)

Factories validate against the zod schema on build — wrong input is caught at fixture-creation time, not at first dispatch.
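
The build-time validation described above can be sketched without the library. The version below is illustrative only: factorySketch and sequenceSketch are hypothetical stand-ins for factory and sequence, and a hand-written validator takes the place of a zod schema so the block stays self-contained.

```typescript
// Stand-in for a zod schema's parse(): returns the typed value or throws.
type Validate<T> = (candidate: unknown) => T

interface Sequence {
  readonly isSequence: true
  next(): unknown
}

// Each sequence keeps its own counter, starting at 1.
function sequenceSketch(make: (n: number) => unknown): Sequence {
  let n = 0
  return { isSequence: true, next: () => make(++n) }
}

function isSequence(value: unknown): value is Sequence {
  return typeof value === "object" && value !== null && (value as Sequence).isSequence === true
}

function factorySketch<T>(validate: Validate<T>, defaults: Record<string, unknown>) {
  const build = (overrides: Record<string, unknown> = {}): T => {
    const candidate: Record<string, unknown> = {}
    for (const key of Object.keys(defaults)) {
      const value = defaults[key]
      candidate[key] = isSequence(value) ? value.next() : value
    }
    // Validate on build so a bad fixture fails here, not at first dispatch.
    return validate(Object.assign(candidate, overrides))
  }
  const buildList = (count: number): T[] => {
    const items: T[] = []
    for (let i = 0; i < count; i++) items.push(build())
    return items
  }
  return { build, buildList }
}

interface SubmissionInputSketch {
  studentId: string
  exerciseId: string
  answer: string
}

// Hand-written validator used in place of a real schema.
const validateSubmission: Validate<SubmissionInputSketch> = (candidate) => {
  const c = candidate as Record<string, unknown>
  for (const field of ["studentId", "exerciseId", "answer"]) {
    if (typeof c[field] !== "string") throw new Error(`${field} must be a string`)
  }
  return c as unknown as SubmissionInputSketch
}

const submissionFactorySketch = factorySketch(validateSubmission, {
  studentId: sequenceSketch((n) => `student-${n}`),
  exerciseId: sequenceSketch((n) => `ex-${n}`),
  answer: "alef",
})
const firstSubmission = submissionFactorySketch.build()
// firstSubmission → { studentId: "student-1", exerciseId: "ex-1", answer: "alef" }
```

Calling submissionFactorySketch.build({ answer: 42 }) throws inside the factory, which is the point of validating at fixture-creation time.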

Pattern: test the auto-grade chain

it("Avi → submit → auto-grade → mastery update — full chain", async () => {
  const t = await harness({ app: submissionsApp })

  // Arrange: Avi already enrolled
  await t.dispatch(enrollStudent, { studentId: "avi", course: "hebrew-letters" })

  // Act
  await t.dispatch(submitAnswer, { studentId: "avi", exerciseId: "ex-1", answer: "alef" })
  await t.idle()  // wait for auto-grade reaction + mastery contribution

  // Assert the WHOLE chain fired
  const chain = t.telemetry.chain("submissions.answer-submitted")
  const eventNames = chain
    .filter(r => r.kind === "event.published")
    .map(r => (r as any).event.eventName)

  // The enrollment event from the arrange step belongs to a separate dispatch,
  // so a chain walked forward from answer-submitted does not include it.
  expect(eventNames).toEqual([
    "submissions.answer-submitted",          // from act
    "submissions.auto-graded",               // from auto-grade reaction
    "mastery.contribution-recorded",         // from cross-domain reaction
  ])

  // Assert observable side effects
  const mastery = await t.query(masterySnapshot, { studentId: "avi" })
  expect(mastery.totalContributions).toBe(1)

  // Assert no drift / no errors
  expect(t.telemetry.drift(declaredChainsForSubmissions).length).toBe(0)
  expect(t.telemetry.errors()).toHaveLength(0)

  await t.stop()
})

Pattern: assert SLO compliance

it("submit-answer is fast enough", async () => {
  const t = await harness({ app: submissionsApp })

  await Promise.all(
    Array.from({ length: 50 }, (_, i) =>
      t.dispatch(submitAnswer, { studentId: `s-${i}`, answer: "alef" }),
    ),
  )
  await t.idle()

  const completed = t.telemetry.ofKind("action.completed")
    .filter(r => r.action === "submissions.submit-answer")

  const durations = completed.map(r => r.durationMs).sort((a, b) => a - b)
  const p95 = durations[Math.floor(durations.length * 0.95)]

  expect(p95).toBeLessThan(200)  // matches the declared SLO
  await t.stop()
})
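
The percentile arithmetic in the test above is easy to get subtly wrong, so it can help to isolate it in a helper. The sketch below uses the nearest-rank definition; percentileNearestRank is a hypothetical helper name, not part of @nwire/test-kit.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of all
// samples are less than or equal to it. `p` must be in the range (0, 100].
function percentileNearestRank(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("cannot take a percentile of zero samples")
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]")
  const sorted = samples.slice().sort((a, b) => a - b) // copy, then sort ascending
  const rank = Math.ceil((p * sorted.length) / 100) // 1-based rank
  return sorted[rank - 1]
}

const fiftyDurations = Array.from({ length: 50 }, (_, i) => i + 1) // 1..50 ms
const p95Sketch = percentileNearestRank(fiftyDurations, 95)
// p95Sketch → 48
```

For 50 samples this selects the same element as the index arithmetic in the test. The practical difference is the explicit empty-sample check, which fails loudly when no action.completed records were captured instead of yielding undefined.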
