28 June 2026
Getting an LLM to return JSON you can actually trust

The first time you wire an LLM into real code, you hit the same wall everyone does. You ask for JSON, and most of the time you get JSON - and then, occasionally, you get this:
Sure! Here's the data you asked for:
```json
{ "name": "Ada", "role": "engineer" }
```
Let me know if you'd like anything else!
JSON.parse throws, your request 500s, and a user sees a broken page because the model felt chatty. If you're putting an LLM in production, "usually valid" isn't good enough. Here's the stack of techniques I use to make structured output boringly reliable.
1. Never parse hope - validate
The core mistake is treating the model's output as trusted data. It's untrusted input, exactly like a form field from a stranger. So the first rule: define a schema and validate against it.
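import { z } from "zod";

// The shape every model reply must match before the rest of the code sees it.
const Person = z.object({
  name: z.string().min(1), // non-empty string
  role: z.enum(["engineer", "designer", "founder"]), // one of a closed set
  years: z.number().int().nonnegative(), // whole number, 0 or more
});
// Derive the static type from the schema so the two can't drift apart.
type Person = z.infer<typeof Person>;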
Now the model's reply has to earn its way into your types. Anything malformed gets caught at the boundary instead of detonating three functions deep.
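A schema checks a parsed value, so the raw string still has to get through JSON.parse without throwing. The retry code in section 4 calls a safeJson helper that this post never shows, so here is one way it could look. The body is my assumption, not a canonical implementation: it tries the reply as-is, then the contents of a markdown fence, then the widest brace-delimited span, and returns null if nothing parses.

```typescript
// Hypothetical helper: returns the parsed value, or null when no JSON can be recovered.
// It never throws, so the caller can hand the result straight to schema validation.
function safeJson(raw: string): unknown {
  // Build the markdown fence marker at runtime to keep this snippet easy to embed.
  const fence = "`".repeat(3);
  const fencedBlock = new RegExp(fence + "(?:json)?\\s*([\\s\\S]*?)" + fence, "i");

  const candidates: string[] = [raw.trim()];

  // Candidate 2: whatever sits inside the first fenced block, if there is one.
  const inner = raw.match(fencedBlock)?.[1];
  if (inner !== undefined) candidates.push(inner.trim());

  // Candidate 3: everything from the first "{" to the last "}".
  const firstBrace = raw.indexOf("{");
  const lastBrace = raw.lastIndexOf("}");
  if (firstBrace !== -1 && lastBrace > firstBrace) {
    candidates.push(raw.slice(firstBrace, lastBrace + 1));
  }

  for (const candidate of candidates) {
    try {
      return JSON.parse(candidate);
    } catch {
      // Not valid JSON; try the next candidate.
    }
  }
  return null;
}
```

Returning null instead of throwing is a deliberate choice in this sketch: null fails the Person schema like any other bad value, so a chatty reply and a wrong-shaped reply end up on the same error path.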
2. Use the platform's structured mode
Before you engineer around the problem, check what the API already gives you. Most providers now offer some form of guaranteed structure:
| Feature | What it does |
|---|---|
| JSON mode | Forces output to be syntactically valid JSON |
| Structured outputs | Constrains output to your schema, not just "JSON" |
| Tool / function call | Model returns arguments matching a typed signature |
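Structured outputs and tool calling are the big ones - when the provider enforces your schema strictly, decoding itself is constrained, so the model literally can't emit a field you didn't define. Reach for these first; they remove whole categories of failure for free. Keep validating anyway: a reply cut off at the token limit can still arrive incomplete, and not every tool-calling mode enforces the schema strictly.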
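Schema-constrained modes generally take the schema as JSON Schema rather than as a zod object. As a rough sketch, this is a hand-written JSON Schema equivalent of the Person schema from section 1. How you attach it to a request differs per provider and isn't shown, and providers often accept only a subset of JSON Schema keywords, so treat minLength and minimum here as assumptions to check against your provider's docs.

```typescript
// Hand-written JSON Schema mirroring the Person zod schema (an illustration,
// not generated output). Keyword support varies by provider.
const personJsonSchema = {
  type: "object",
  properties: {
    name: { type: "string", minLength: 1 },
    role: { type: "string", enum: ["engineer", "designer", "founder"] },
    years: { type: "integer", minimum: 0 },
  },
  // List every field as required and forbid extras so the shape is closed.
  required: ["name", "role", "years"],
  additionalProperties: false,
} as const;

// The schema is plain data, so it serializes straight into a request body.
const serializedSchema: string = JSON.stringify(personJsonSchema);
```

Keeping the zod schema as the validator on your side still makes sense: the JSON Schema constrains what the model can emit, and the zod check confirms what actually arrived.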
3. Turn the temperature down
Sampling temperature controls randomness. For creative writing you want it high. For structured extraction you want it low - often 0 to 0.2. You're not asking the model to be imaginative; you're asking it to fill in a form correctly. Low temperature makes the shape more stable and repeatable, though even 0 doesn't guarantee identical output on every call.
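To see why low temperature stabilizes output, it helps to look at the textbook definition: the model's scores (logits) are divided by the temperature before being turned into probabilities. The sketch below uses made-up logits for three candidate tokens, and how a given provider implements sampling internally is an assumption on my part.

```typescript
// Textbook softmax with temperature: divide logits by T, then normalize.
function softmaxWithTemperature(logits: number[], temperature: number): number[] {
  if (temperature <= 0) throw new RangeError("temperature must be > 0");
  const scaled = logits.map((logit) => logit / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

const logits = [2.0, 1.0, 0.5]; // made-up scores for three candidate tokens
const warm = softmaxWithTemperature(logits, 1.0);
const cold = softmaxWithTemperature(logits, 0.2);

// Probability of the top-scoring token at each temperature.
const top = (probs: number[]): string => Math.max(...probs).toFixed(2);
console.log(top(warm), top(cold)); // prints "0.63 0.99"
```

At temperature 1.0 the favorite token is picked about 63% of the time; at 0.2 it is picked about 99% of the time, which is why the output shape stops wandering. This function requires a temperature above zero, since dividing by zero is undefined.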
4. Retry - and feed the error back
Even with all of the above, you'll get the occasional miss. Don't just retry blindly; tell the model what broke. The validation error is a perfect correction signal.
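// callModel, buildPrompt and safeJson are helpers not shown in this snippet.
// safeJson is assumed to return a value (never throw) when the reply isn't JSON.
async function extract(input: string, tries = 2): Promise<Person> {
  let lastError = ""; // empty on the first attempt, so the prompt carries no complaint
  for (let i = 0; i < tries; i++) {
    const raw = await callModel(buildPrompt(input, lastError));
    // safeParse returns a result object instead of throwing on invalid data
    const parsed = Person.safeParse(safeJson(raw));
    if (parsed.success) return parsed.data;
    // Serialize the validation issues so the next prompt can quote them
    lastError = JSON.stringify(parsed.error.issues);
  }
  throw new Error("Model could not produce valid output");
}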
On the second pass the prompt now includes "your last answer failed validation with these issues: …, fix them." Models are surprisingly good at correcting themselves when you hand them the exact complaint.
Treat the LLM like an unreliable junior who's brilliant but careless. You don't trust the first draft - you give precise feedback and check the work.
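The extract function above leans on a buildPrompt helper that isn't shown. Here's a minimal sketch of what it might look like; the prompt wording is my own assumption, and the only part taken from the post is the idea of quoting the previous validation issues back to the model.

```typescript
// Hypothetical prompt builder: the first attempt gets the plain task,
// later attempts also get the validation issues from the previous reply.
function buildPrompt(input: string, lastError: string): string {
  const lines: string[] = [
    "Extract the person described in the text below.",
    'Reply with a single JSON object with the keys "name", "role" and "years", and nothing else.',
    "",
    "Text:",
    input,
  ];

  // Only mention a failure when there was one.
  if (lastError !== "") {
    lines.push(
      "",
      "Your last answer failed validation with these issues:",
      lastError,
      "Fix them and reply with the corrected JSON only.",
    );
  }
  return lines.join("\n");
}
```

Passing the serialized issues through verbatim keeps the feedback exact: the model sees the same field paths and messages the validator produced, not a paraphrase.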
5. Ground it, don't let it invent
A model asked to produce a field it doesn't know will often make one up rather than leave it blank - that's hallucination. Two cheap defenses:
- Allow "unknown." Add z.null() or an explicit "unknown" enum value and tell the model to use it when the input doesn't say. Giving it a legal way to say "I don't know" beats forcing a guess.
- Keep it extractive. When you can, frame the task as "pull these fields from this text" rather than "tell me about X." Extraction from provided context hallucinates far less than open-ended generation.
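Both defenses fit in a few lines. The sketch below is dependency-free so it runs on its own, with a hand-rolled check standing in for the schema; in zod the equivalent change would be an extra "unknown" member in the role enum and .nullable() on years. The names LoosePerson, isLoosePerson and extractionPrompt are hypothetical, and the prompt wording is an assumption.

```typescript
// "unknown" is a legal answer for role; null is a legal answer for years.
const ROLES = ["engineer", "designer", "founder", "unknown"] as const;
type Role = (typeof ROLES)[number];

interface LoosePerson {
  name: string;
  role: Role;
  years: number | null;
}

// Hand-rolled stand-in for schema validation of the relaxed shape.
function isLoosePerson(value: unknown): value is LoosePerson {
  if (typeof value !== "object" || value === null) return false;
  const { name, role, years } = value as Record<string, unknown>;
  const nameOk = typeof name === "string" && name.length > 0;
  const roleOk = typeof role === "string" && (ROLES as readonly string[]).includes(role);
  const yearsOk =
    years === null || (typeof years === "number" && Number.isInteger(years) && years >= 0);
  return nameOk && roleOk && yearsOk;
}

// Extractive framing: fields come from the supplied text, and the prompt
// names the escape hatches so the model doesn't have to guess.
function extractionPrompt(text: string): string {
  return [
    "Pull these fields from the text below: name, role, years.",
    `role must be one of: ${ROLES.join(", ")}.`,
    'If the text does not state the role, use "unknown". If it does not state the years, use null.',
    "Use only information that appears in the text.",
    "",
    "Text:",
    text,
  ].join("\n");
}
```

The check and the prompt share the ROLES constant, so the list of allowed values can't drift between what you ask for and what you accept.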
The mental model
Reliable LLM output isn't one trick, it's a posture: constrain what you can, validate everything, and build a correction loop for the rest.
constrain (schema/tools) → low temperature → validate → retry with the error → fall back gracefully
None of this is exotic. It's the same defensive engineering you'd apply to any flaky external dependency - which, for now, is exactly what a language model is. Treat it that way and the "magic" becomes just another well-behaved part of your system.
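The last step in that pipeline, falling back gracefully, is the one the earlier code skips: extract simply throws once its tries run out. One possible way to handle it, sketched under my own assumptions (the Outcome type and withFallback name are hypothetical), is to turn the thrown error into a value the caller can branch on.

```typescript
// Either the real result, or a caller-supplied fallback plus the reason it was used.
type Outcome<T> =
  | { ok: true; value: T }
  | { ok: false; value: T; reason: string };

// Runs an async attempt and converts a thrown error into a fallback value.
async function withFallback<T>(attempt: () => Promise<T>, fallback: T): Promise<Outcome<T>> {
  try {
    return { ok: true, value: await attempt() };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { ok: false, value: fallback, reason };
  }
}
```

With the extract function from section 4 this could be called as withFallback<Person | null>(() => extract(text), null), letting the page render an "information unavailable" state and log the reason instead of returning a 500.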
Building something with LLMs and want to compare approaches? Say hi.