Observability

Every attempt is observable. Wire the onAttempt hook into your existing tracing stack — no new dependencies, no adapters.

The onAttempt hook

Called after every attempt — pass or fail. Gives you structured data about what happened, what category of failure it was, and what specific issues were found.

enforce(schema, runFn, {
  onAttempt: (event) => {
    console.log(`Attempt ${event.number}: ${event.ok ? "pass" : "fail"}`);

    if (!event.ok) {
      console.log(`  Category: ${event.category}`);
      console.log(`  Issues: ${event.issues.join(", ")}`);
      console.log(`  Duration: ${event.durationMS}ms`);
    }
  },
});

AttemptEvent fields

Field       Type                          Description
ok          boolean                       Whether this attempt passed validation
number      number                        Attempt number (1-indexed)
category    FailureCategory | undefined   Failure type (undefined on success)
issues      string[]                      Specific violation messages
durationMS  number                        How long this attempt took
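The shape above can be written down as a TypeScript type, plus a small helper for one-line logging. This is a sketch based on the fields in the table; `summarizeAttempt` is an illustrative helper, not part of the library:

```typescript
// Sketch of the event shape from the table above.
type FailureCategory =
  | "EMPTY_RESPONSE" | "REFUSAL" | "NO_JSON" | "TRUNCATED"
  | "PARSE_ERROR" | "VALIDATION_ERROR" | "INVARIANT_ERROR" | "RUN_ERROR";

interface AttemptEvent {
  ok: boolean;
  number: number;              // 1-indexed
  category?: FailureCategory;  // undefined on success
  issues: string[];
  durationMS: number;
}

// Illustrative helper: render an event as a single log line.
function summarizeAttempt(e: AttemptEvent): string {
  const status = e.ok ? "pass" : `fail (${e.category}: ${e.issues.join("; ")})`;
  return `attempt ${e.number}: ${status} in ${e.durationMS}ms`;
}
```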

Built-in failure categories

Every failed attempt is classified into one of 8 categories. Each gets a targeted repair message — not generic "try again."

Category          What happened                      Default repair
EMPTY_RESPONSE    Model returned nothing             Ask for JSON matching the schema
REFUSAL           Model declined the task            Redirect to structured data task
NO_JSON           No JSON found in response          Ask for JSON only, no prose
TRUNCATED         JSON cut off mid-response          Ask for shorter, complete response
PARSE_ERROR       Malformed JSON                     Ask for strictly valid JSON
VALIDATION_ERROR  Valid JSON, failed schema          List specific Zod errors
INVARIANT_ERROR   Correct types, failed constraints  List specific violations
RUN_ERROR         Function threw an exception        Ask to try again
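Because every failure lands in exactly one category, the onAttempt hook is a natural place to build a failure histogram for dashboards. A minimal sketch, with an illustrative in-memory counter (not part of the library):

```typescript
// Illustrative in-memory histogram of failure categories.
class CategoryCounter {
  private counts = new Map<string, number>();

  // Call from onAttempt: only failed attempts are counted.
  record(event: { ok: boolean; category?: string }): void {
    if (event.ok || !event.category) return;
    this.counts.set(event.category, (this.counts.get(event.category) ?? 0) + 1);
  }

  get(category: string): number {
    return this.counts.get(category) ?? 0;
  }
}
```

Wire it up as `onAttempt: (e) => counter.record(e)` and export the counts to whatever metrics backend you already run.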

Custom repair overrides

Override the repair message for any category, or disable retry entirely by passing false.

enforce(schema, run, {
  repairs: {
    REFUSAL: (detail) => [{
      role: "user",
      content: "This is an approved data extraction task. Return the JSON.",
    }],
    TRUNCATED: false, // stop immediately, don't retry
  },
});
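The resolution rule behind this: a per-category override function wins, `false` disables retry for that category, and anything else falls through to the built-in default. A sketch of that rule under those assumptions; the names here are illustrative, not the library's internals:

```typescript
type RepairMessage = { role: "user"; content: string };
type RepairOverride = ((detail: string) => RepairMessage[]) | false;

// Illustrative resolution: override function wins, `false` stops retrying,
// anything else falls through to the built-in default for that category.
function resolveRepair(
  overrides: Record<string, RepairOverride | undefined>,
  category: string,
  detail: string,
  fallback: (detail: string) => RepairMessage[],
): RepairMessage[] | "stop" {
  const override = overrides[category];
  if (override === false) return "stop";  // disable retry for this category
  if (override) return override(detail);  // custom repair messages
  return fallback(detail);                // built-in default
}
```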

Langfuse

Trace each enforce call and every attempt as a Langfuse observation. See which invariants fire, how often, and whether prompt changes reduce them.

import { startActiveObservation, startObservation } from "@langfuse/tracing";

const result = await startActiveObservation("enforce-sentiment", async () => {
  return enforce(schema, async (attempt) => {
    const gen = startObservation(
      `llm-call-${attempt.number}`,
      { model: "gpt-4o-mini", input: attempt.prompt },
      { asType: "generation" },
    );
    const res = await openai.chat.completions.create({ /* ... */ });
    gen.update({ output: res.choices[0].message.content }).end();
    return res.choices[0].message.content;
  }, {
    onAttempt: (event) => {
      const span = startObservation(`attempt-${event.number}`, {
        output: { ok: event.ok, category: event.category, issues: event.issues },
      });
      span.end();
    },
  });
});

What you see in Langfuse

Trace: enforce-sentiment
├─ Generation: llm-call-1 (gpt-4o-mini)
├─ Span: attempt-1 { category: "INVARIANT_ERROR", issues: ["confidence too low"] }
├─ Generation: llm-call-2 (gpt-4o-mini)
└─ Span: attempt-2 { ok: true }

OpenTelemetry

Works with any OTLP backend — Datadog, Grafana, Honeycomb, Jaeger. Each attempt becomes a span with attributes you can query and dashboard.

import { trace } from "@opentelemetry/api";

const tracer = trace.getTracer("llm-contract");

const result = await tracer.startActiveSpan("enforce", async (rootSpan) => {
  const res = await enforce(schema, runFn, {
    onAttempt: (event) => {
      const span = tracer.startSpan(`attempt-${event.number}`);
      span.setAttribute("attempt.ok", event.ok);
      span.setAttribute("attempt.category", event.category ?? "none");
      span.setAttribute("attempt.issues", event.issues.join("; "));
      span.setAttribute("attempt.durationMS", event.durationMS);
      span.end();
    },
  });
  rootSpan.setAttribute("result.ok", res.ok);
  rootSpan.setAttribute("result.attempts", res.ok ? res.attempts : -1);
  rootSpan.end();
  return res;
});
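The attribute mapping inside the hook is pure, so it can be pulled out and unit-tested on its own. An illustrative helper, with attribute names mirroring the snippet above:

```typescript
// Flatten an attempt event into OTel-style span attributes.
function toSpanAttributes(
  e: { ok: boolean; category?: string; issues: string[]; durationMS: number },
): Record<string, string | number | boolean> {
  return {
    "attempt.ok": e.ok,
    "attempt.category": e.category ?? "none",
    "attempt.issues": e.issues.join("; "),
    "attempt.durationMS": e.durationMS,
  };
}
```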

What you can query

  • Failure rate by category: group attempt spans on attempt.category
  • Slow attempts: filter or percentile over attempt.durationMS
  • Retry depth: result.attempts on the root span, or count attempt-* spans per trace
  • Overall success rate: result.ok on the root span

Vercel AI SDK

Use enforce with generateText for invariants and targeted repair on top of your existing Vercel AI SDK setup. It replaces generateObject when you need cross-field validation.

import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const result = await enforce(schema, async (attempt) => {
  const { text } = await generateText({
    model: openai("gpt-4o-mini"),
    messages: [
      { role: "system", content: attempt.prompt },
      { role: "user", content: input },
      ...attempt.fixes,
    ],
  });
  return text;
}, {
  invariants: [
    (d) => Math.abs(d.subtotal + d.tax - d.total) < 0.01
      || `subtotal + tax != total`,
  ],
});
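Invariants follow a `true | string` convention: return true when the data holds, or an error string that becomes the repair detail. Because the cross-field check above is a pure function, it is trivially testable on its own:

```typescript
// The invariant from the example above, extracted as a plain function.
// Returns `true` when the totals reconcile, or a violation message.
const totalsReconcile = (d: { subtotal: number; tax: number; total: number }) =>
  Math.abs(d.subtotal + d.tax - d.total) < 0.01 || "subtotal + tax != total";
```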

Why use enforce instead of generateObject?

generateObject

  • Validates schema
  • Retries with same prompt
  • No cross-field checks
  • No per-attempt hooks

enforce

  • Validates schema + invariants
  • Retries with targeted repair
  • Cross-field validation
  • Per-attempt observability