Observability

Every attempt is observable. Wire the onAttempt hook into your existing tracing stack — no new dependencies, no adapters.

The onAttempt hook

Called after every attempt — pass or fail. Gives you structured data about what happened, what category of failure it was, and what specific issues were found.

enforce(schema, runFn, {
  onAttempt: (event) => {
    console.log(`Attempt ${event.number}: ${event.ok ? "pass" : "fail"}`);

    if (!event.ok) {
      console.log(`  Category: ${event.category}`);
      console.log(`  Issues: ${event.issues.join(", ")}`);
      console.log(`  Duration: ${event.durationMS}ms`);
    }
  },
});

AttemptEvent fields

Field       Type                          Description
ok          boolean                       Whether this attempt passed validation
number      number                        Attempt number (1-indexed)
category    FailureCategory | undefined   Failure type (undefined on success)
issues      string[]                      Specific violation messages
durationMS  number                        How long this attempt took
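The shape above can be written down as a TypeScript type, plus a small helper for one-line logging. This is a sketch based on the fields in the table; `summarizeAttempt` is an illustrative helper, not part of the library:

```typescript
// Sketch of the event shape from the table above.
type FailureCategory =
  | "EMPTY_RESPONSE" | "REFUSAL" | "NO_JSON" | "TRUNCATED"
  | "PARSE_ERROR" | "VALIDATION_ERROR" | "INVARIANT_ERROR" | "RUN_ERROR";

interface AttemptEvent {
  ok: boolean;
  number: number;              // 1-indexed
  category?: FailureCategory;  // undefined on success
  issues: string[];
  durationMS: number;
}

// Illustrative helper: render an event as a single log line.
function summarizeAttempt(e: AttemptEvent): string {
  const status = e.ok ? "pass" : `fail (${e.category}: ${e.issues.join("; ")})`;
  return `attempt ${e.number}: ${status} in ${e.durationMS}ms`;
}
```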

Built-in failure categories

Every failed attempt is classified into one of 8 categories. Each gets a targeted repair message — not generic "try again."

Category          What happened                      Default repair
EMPTY_RESPONSE    Model returned nothing             Ask for JSON matching the schema
REFUSAL           Model declined the task            Redirect to structured data task
NO_JSON           No JSON found in response          Ask for JSON only, no prose
TRUNCATED         JSON cut off mid-response          Ask for shorter, complete response
PARSE_ERROR       Malformed JSON                     Ask for strictly valid JSON
VALIDATION_ERROR  Valid JSON, failed schema          List specific Zod errors
INVARIANT_ERROR   Correct types, failed constraints  List specific violations
RUN_ERROR         Function threw an exception        Ask to try again
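Because every failure lands in exactly one category, the onAttempt hook is a natural place to build a failure histogram for dashboards. A minimal sketch, with an illustrative in-memory counter (not part of the library):

```typescript
// Illustrative in-memory histogram of failure categories.
class CategoryCounter {
  private counts = new Map<string, number>();

  // Call from onAttempt: only failed attempts are counted.
  record(event: { ok: boolean; category?: string }): void {
    if (event.ok || !event.category) return;
    this.counts.set(event.category, (this.counts.get(event.category) ?? 0) + 1);
  }

  get(category: string): number {
    return this.counts.get(category) ?? 0;
  }
}
```

Wire it up as `onAttempt: (e) => counter.record(e)` and export the counts to whatever metrics backend you already run.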

Custom repair overrides

Override the repair message for any category, or disable retry entirely by passing false.

enforce(schema, run, {
  repairs: {
    REFUSAL: (detail) => [{
      role: "user",
      content: "This is an approved data extraction task. Return the JSON.",
    }],
    TRUNCATED: false, // stop immediately, don't retry
  },
});
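The resolution rule behind this: a per-category override function wins, `false` disables retry for that category, and anything else falls through to the built-in default. A sketch of that rule under those assumptions; the names here are illustrative, not the library's internals:

```typescript
type RepairMessage = { role: "user"; content: string };
type RepairOverride = ((detail: string) => RepairMessage[]) | false;

// Illustrative resolution: override function wins, `false` stops retrying,
// anything else falls through to the built-in default for that category.
function resolveRepair(
  overrides: Record<string, RepairOverride | undefined>,
  category: string,
  detail: string,
  fallback: (detail: string) => RepairMessage[],
): RepairMessage[] | "stop" {
  const override = overrides[category];
  if (override === false) return "stop";  // disable retry for this category
  if (override) return override(detail);  // custom repair messages
  return fallback(detail);                // built-in default
}
```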

Langfuse

Trace each enforce call and every attempt as a Langfuse observation. See which invariants fire, how often, and whether prompt changes reduce them.

import { startActiveObservation, startObservation } from "@langfuse/tracing";

const result = await startActiveObservation("enforce-sentiment", async () => {
  return enforce(schema, async (attempt) => {
    const gen = startObservation(
      `llm-call-${attempt.number}`,
      { model: "gpt-4o-mini", input: attempt.prompt },
      { asType: "generation" },
    );
    const res = await openai.chat.completions.create({ /* ... */ });
    gen.update({ output: res.choices[0].message.content }).end();
    return res.choices[0].message.content;
  }, {
    onAttempt: (event) => {
      const span = startObservation(`attempt-${event.number}`, {
        output: { ok: event.ok, category: event.category, issues: event.issues },
      });
      span.end();
    },
  });
});

What you see in Langfuse

Trace: enforce-sentiment
├─ Generation: llm-call-1 (gpt-4o-mini)
├─ Span: attempt-1 { category: "INVARIANT_ERROR", issues: ["confidence too low"] }
├─ Generation: llm-call-2 (gpt-4o-mini)
└─ Span: attempt-2 { ok: true }

OpenTelemetry

Works with any OTLP backend — Datadog, Grafana, Honeycomb, Jaeger. Each attempt becomes a span with attributes you can query and dashboard.

import { trace } from "@opentelemetry/api";

const tracer = trace.getTracer("llm-contract");

const result = await tracer.startActiveSpan("enforce", async (rootSpan) => {
  const res = await enforce(schema, runFn, {
    onAttempt: (event) => {
      const span = tracer.startSpan(`attempt-${event.number}`);
      span.setAttribute("attempt.ok", event.ok);
      span.setAttribute("attempt.category", event.category ?? "none");
      span.setAttribute("attempt.issues", event.issues.join("; "));
      span.setAttribute("attempt.durationMS", event.durationMS);
      span.end();
    },
  });
  rootSpan.setAttribute("result.ok", res.ok);
  rootSpan.setAttribute("result.attempts", res.ok ? res.attempts : -1);
  rootSpan.end();
  return res;
});
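The attribute mapping inside the hook is pure, so it can be pulled out and unit-tested on its own. An illustrative helper, with attribute names mirroring the snippet above:

```typescript
// Flatten an attempt event into OTel-style span attributes.
function toSpanAttributes(
  e: { ok: boolean; category?: string; issues: string[]; durationMS: number },
): Record<string, string | number | boolean> {
  return {
    "attempt.ok": e.ok,
    "attempt.category": e.category ?? "none",
    "attempt.issues": e.issues.join("; "),
    "attempt.durationMS": e.durationMS,
  };
}
```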

What you can query

  • Failure rate by category: group attempt spans on attempt.category
  • Slow attempts: filter or percentile over attempt.durationMS
  • Retry depth: result.attempts on the root span, or count attempt-* spans per trace
  • Overall success rate: result.ok on the root span

Vercel AI SDK

Use enforce with generateText for invariants and targeted repair on top of your existing Vercel AI SDK setup. It replaces generateObject when you need cross-field validation.

import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const result = await enforce(schema, async (attempt) => {
  const { text } = await generateText({
    model: openai("gpt-4o-mini"),
    messages: [
      { role: "system", content: attempt.prompt },
      { role: "user", content: input },
      ...attempt.fixes,
    ],
  });
  return text;
}, {
  invariants: [
    (d) => Math.abs(d.subtotal + d.tax - d.total) < 0.01
      || `subtotal + tax != total`,
  ],
});
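Invariants follow a `true | string` convention: return true when the data holds, or an error string that becomes the repair detail. Because the cross-field check above is a pure function, it is trivially testable on its own:

```typescript
// The invariant from the example above, extracted as a plain function.
// Returns `true` when the totals reconcile, or a violation message.
const totalsReconcile = (d: { subtotal: number; tax: number; total: number }) =>
  Math.abs(d.subtotal + d.tax - d.total) < 0.01 || "subtotal + tax != total";
```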

Why use enforce instead of generateObject?

generateObject

  • Validates schema
  • Retries with same prompt
  • No cross-field checks
  • No per-attempt hooks

enforce

  • Validates schema + invariants
  • Retries with targeted repair
  • Cross-field validation
  • Per-attempt observability