The AGENTS.md specification: every section, in priority order

There's no official AGENTS.md spec — this is the working one. Every section the file needs, the order that actually matters, the ~5KB budget that silently truncates it, and what to offload to TOOLS.md instead. Grounded in the template VibeKit regenerates for every hosted app.

There are a lot of "what is AGENTS.md" posts now, including one of my own. They answer why the file exists. This one is different: it's the spec — the actual anatomy of the file, section by section, in the order that matters, learned from regenerating it for every app we host.

There is no official AGENTS.md standard. What follows is the working spec we converged on after shipping the file into production on every VibeKit app — the version that survived contact with real users, real agents, and one painful truncation bug. If you're writing an AGENTS.md for Claude Code, Cursor, or your own agent loop, this is the shape I'd start from.

The constraint that shapes everything: it's a budget, not a document

Before any section, understand the one fact that governs all of them: AGENTS.md is not a document you write, it's a budget you spend.

The agent reads this file on every turn, prepended to its working context. That context is finite, and most agent runtimes cap how much of the file they'll actually inject. Ours truncates AGENTS.md at roughly 5,000 characters when it builds the agent's bootstrap context. Anything past that point is silently dropped: not errored, not warned. Dropped.

We learned this the hard way. A non-English user's first deploy burned ~12 seconds on path-confusion errors because the "use relative paths, not absolute" guidance lived past character 5,000. The agent never saw it. The rule was in the file and might as well not have been.

So the real spec rule, the one that makes all the others make sense, is: order by consequence. With a cap of roughly 5,000 characters, the first 4,500 are the only ones you can count on. Everything in the file competes for that space. If a section can't justify its characters against "what breaks if the agent doesn't know this," it goes in a different file (more on that below).
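To make the budget concrete, here is a minimal sketch of a check you could run before shipping the file. It assumes a cap of 5,000 measured as JavaScript string length (UTF-16 code units); your runtime may count bytes or tokens instead, so treat `INJECTION_CAP` and the `checkBudget` helper as hypothetical names, not part of any runtime's API.

```typescript
// Hypothetical cap. The figure comes from one runtime; check your own.
const INJECTION_CAP = 5000;

interface BudgetReport {
  length: number;            // total length in UTF-16 code units
  overBudget: boolean;       // true if anything would be cut
  droppedHeadings: string[]; // headings that begin at or past the cap
}

// Reports which markdown headings start at or after the cap,
// meaning their whole section would never reach the agent.
// A section that straddles the cap is only partially cut and is not listed.
function checkBudget(markdown: string, cap: number = INJECTION_CAP): BudgetReport {
  const droppedHeadings: string[] = [];
  let offset = 0;
  for (const line of markdown.split("\n")) {
    if (/^#{1,6}\s/.test(line) && offset >= cap) {
      droppedHeadings.push(line.trim());
    }
    offset += line.length + 1; // +1 for the newline removed by split
  }
  return {
    length: markdown.length,
    overBudget: markdown.length > cap,
    droppedHeadings,
  };
}
```

Run against a file where the path guidance sits in the last section, a check like this would have listed that heading in `droppedHeadings` before any user hit the problem.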

Every section ordering decision in this spec follows from that single constraint.

The spec, section by section

Here's the order we ship, highest-consequence first. The full template is renderAgentsMd in agent-templates.ts — this is the annotated skeleton.

1. Identity & environment (first — the agent can't act without it)

# my-dashboard — Agent

App: **my-dashboard** at https://my-dashboard.vibekit.bot
Repo: user/my-dashboard | Port: 4001 | Container: vk-my-dashboard

Three lines, top of file. The agent needs to know what it's working on before anything else: the product name, the live URL, the repo, the port it must bind. This is cheap (under 200 chars) and load-bearing — an agent that doesn't know its own port writes a server that's unreachable after deploy.

Keep it factual. No prose. The product name matters more than you'd think: agents default to calling the project by its repo name, which drifts after renames. State the canonical name once, here.

2. NEVER rules (the negative constraints that break the product)

This is the highest-priority behavioral section, and it goes second — right after the agent knows what it's touching. Counterintuitively, the things you forbid matter more than the things you request.

## NEVER (highest priority — these break the product)

- NEVER mention "localhost" or "npm start" as a user instruction. The
  user is on a phone — no terminal, no laptop. Test URL is always the
  live one.
- NEVER claim you "deployed" the app. You write code; the user taps Deploy.
- NEVER tell the user to run shell commands or curl. They can't.

Why negatives first: a good behavior the agent misses is a worse turn. A forbidden behavior the agent commits is a broken product. In our case, the single most damaging failure was agents telling phone-only users to "open http://localhost:8000" — instructions the user physically cannot follow. One sentence of prohibition prevents an entire class of dead-end conversations.

State why each NEVER exists, in half a clause ("the user is on a phone"). Models follow a rule more reliably when they understand it than when it's just handed to them.
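If you want to measure how often these rules are broken, one option is a small check over agent replies in tests or logs. The sketch below is an assumption about how that could look, not something the VibeKit runtime is described as doing; the rule ids and regex patterns are illustrative and will miss plenty of phrasings.

```typescript
interface NeverRule {
  id: string;
  pattern: RegExp;
  why: string; // the half-clause reason, kept next to the rule
}

// Illustrative patterns derived from the three NEVER rules quoted above.
const NEVER_RULES: NeverRule[] = [
  {
    id: "localhost",
    pattern: /\blocalhost\b|\bnpm start\b/i,
    why: "the user is on a phone, no terminal",
  },
  {
    id: "claimed-deploy",
    pattern: /\bI(?:'ve| have)? deployed\b/i,
    why: "the user taps Deploy, the agent only writes code",
  },
  {
    id: "shell-command",
    pattern: /\bcurl\b|\bin your terminal\b/i,
    why: "the user cannot run commands",
  },
];

// Returns every rule whose pattern appears in a user-facing reply.
function findViolations(reply: string): NeverRule[] {
  return NEVER_RULES.filter((rule) => rule.pattern.test(reply));
}
```

Keeping the `why` beside each pattern mirrors the file itself: when a violation shows up in a log, the reason it matters is right there.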

3. Setup (the every-session bootstrap)

## Setup
source .vibekit-env   # loads API URL, key, subdomain, app id
For real work also read STATUS.md, MEMORY.md. Skip for greetings.

The deterministic first move. This is where AGENTS.md hands off to the other two files in the persistent-memory pattern: STATUS.md (what's happening now) and MEMORY.md (durable decisions). AGENTS.md is the contract; those are the state. Keeping them separate is what lets AGENTS.md stay inside its budget — it doesn't carry project history, it points at the file that does.

Note the explicit "skip for greetings." Without it, agents dutifully source and cat three files to answer "hi," adding latency to a turn that needed none.

4. Behavioral rules (modes, defaults, tone)

The largest section, and the first one you trim when you're over budget. Group it:

Mode-switching — trivial messages get text-only replies; build requests get tools. Give an explicit tool-call ceiling ("≤3 per turn, exceed only for explicit build/debug").
Defaults that prevent crash loops — for us: "Express + vanilla HTML/CSS/JS; React/Vite needs a build step and breaks unless asked." Encode the stack decision so the agent doesn't reinvent it (and reintroduce the same break) every session.
Tone — concise, outcome-only, no "Let me try…" reasoning dumps in user-facing text.

This section is where teams over-spend. Every "always be helpful" platitude is bytes stolen from a NEVER rule that actually prevents a failure. Cut anything the base model already does well.

5. Concrete good/bad examples (models pattern-match better than they parse)

### Bad (NEVER say)
- "Open http://localhost:8000 in your browser"
- "I've deployed your app"

### Good (say instead)
- "Changes saved. Tap the ↑ Deploy arrow to review and publish."

This is the highest-leverage-per-byte section in the whole file. A paragraph explaining your tone gets paraphrased; two ❌/✓ pairs get imitated. When a rule keeps getting violated, don't rewrite the rule — add an example of the failure and its fix. The model closes the gap from a demonstration far faster than from a description.

6. Overflow: what does not go in AGENTS.md

Everything that's reference rather than contract moves to an on-demand tier. We keep the full API documentation, the debug runbook, and the skill registry in a separate TOOLS.md the agent reads only when it needs them:

## More docs
- Full API reference: cat TOOLS.md
- Logs: /api/v1/hosting/app/$SUBDOMAIN/logs?lines=50

This is the release valve for the budget problem. AGENTS.md answers "how do I behave, every turn." TOOLS.md answers "how do I do this specific thing, when it comes up." Conflating them is how files blow past 5,000 characters and start silently dropping their own rules. If a piece of knowledge is only relevant in 1-in-20 turns, it's a TOOLS.md entry, not an AGENTS.md line.
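The 1-in-20 rule of thumb can be written down as a partition. This is a sketch under a big assumption: that you have some estimate of how often each piece of knowledge matters. The `relevance` field and the `partitionByTier` helper are hypothetical; in practice the estimate is a judgment call or a rough count from logs.

```typescript
interface KnowledgeEntry {
  text: string;
  // Estimated fraction of turns where this entry matters (hypothetical metric).
  relevance: number;
}

// At or below 1-in-20 turns, the entry belongs in the on-demand tier.
const ON_DEMAND_THRESHOLD = 1 / 20;

function partitionByTier(entries: KnowledgeEntry[]): {
  agents: KnowledgeEntry[]; // stays in AGENTS.md, read every turn
  tools: KnowledgeEntry[];  // moves to TOOLS.md, read on demand
} {
  const agents: KnowledgeEntry[] = [];
  const tools: KnowledgeEntry[] = [];
  for (const entry of entries) {
    (entry.relevance <= ON_DEMAND_THRESHOLD ? tools : agents).push(entry);
  }
  return { agents, tools };
}
```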

7. Safety & recovery (last — important, but rarely the bottleneck)

## Safety
- Before rm -rf / DROP TABLE / git reset --hard: ask first.
- Recovery: git log --oneline -10 → git checkout <hash> -- <file>.

Genuinely important, deliberately last. Destructive-op guardrails fire rarely, so they can live at the bottom of the budget. Being last, this section is the first to go if the file overflows the cap, and if that happens you have bigger problems. In practice it's small enough that it fits whenever the sections above stay within budget.
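If you'd rather not rely on the file alone for this, the same three operations can be matched in whatever layer executes the agent's commands. The sketch below assumes such a layer exists and that commands arrive as plain strings; `requiresConfirmation` is a hypothetical helper, and the patterns cover only the three operations named above.

```typescript
// Patterns for the three destructive operations named in the Safety section.
const DESTRUCTIVE_PATTERNS: RegExp[] = [
  // rm with both -r and -f in one flag group, in either order (e.g. -rf, -fr)
  /\brm\s+-[a-z]*r[a-z]*f|\brm\s+-[a-z]*f[a-z]*r/i,
  /\bdrop\s+table\b/i,
  /\bgit\s+reset\s+--hard\b/i,
];

// True if the command matches any pattern and should be confirmed first.
function requiresConfirmation(command: string): boolean {
  return DESTRUCTIVE_PATTERNS.some((pattern) => pattern.test(command));
}
```

This is a backstop, not a replacement for the rule: a pattern list will always be narrower than what a model can infer from "ask first."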

Format conventions

The spec is as much about how you write each line as what sections exist:

Markdown, flat headings. ## for sections, ### for sub-rules. Agents navigate by heading; deep nesting buries rules.
Imperative voice. "Bind to 0.0.0.0," not "the app should probably bind to 0.0.0.0." Hedged instructions get hedged compliance.
Specifics over principles. "≤3 tool calls per turn" beats "be efficient." A number is a rule; an adjective is a suggestion.
No meta-prose. No "this section covers…" preamble. Every sentence is either a fact the agent needs or a rule it follows. Preamble is pure budget waste.
State the why in a half-clause, not a paragraph. "use relative paths — the sandbox rejects /mnt/efs/..." earns more compliance than the rule alone, at almost no byte cost.
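Several of these conventions are mechanical enough to lint. Here is a minimal sketch that flags headings deeper than ###, hedged phrasing, and meta-prose. The phrase lists are my own guesses at common offenders, not an exhaustive or official set, and `lintAgentsMd` is a hypothetical name.

```typescript
interface LintFinding {
  line: number; // 1-based line number
  rule: "deep-heading" | "hedged" | "meta-prose";
  text: string;
}

// Illustrative phrase lists; extend them for your own files.
const HEDGES = /\b(?:should probably|might want to|try to|if possible|ideally)\b/i;
const META = /\bthis (?:section|file|document) (?:covers|describes|explains)\b/i;

function lintAgentsMd(markdown: string): LintFinding[] {
  const findings: LintFinding[] = [];
  markdown.split("\n").forEach((text, index) => {
    const line = index + 1;
    // #### or deeper breaks the flat-headings convention
    if (/^#{4,}\s/.test(text)) findings.push({ line, rule: "deep-heading", text });
    if (HEDGES.test(text)) findings.push({ line, rule: "hedged", text });
    if (META.test(text)) findings.push({ line, rule: "meta-prose", text });
  });
  return findings;
}
```

"Specifics over principles" and the half-clause why are harder to check automatically, so those still need a human read.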

Anti-patterns (what quietly wastes the budget)

After auditing a lot of these files — ours and others' — the recurring failure modes:

The system prompt in disguise. "You are a helpful, harmless assistant…" The base model already is that. AGENTS.md is for this app's deltas from default behavior, not a re-statement of the model's training.
Stale project history. "We migrated from Postgres to SQLite in March." That's MEMORY.md's job. AGENTS.md is the contract, not the changelog — mixing them means the contract gets truncated to make room for trivia.
Aspirational rules nobody enforces. If the runtime can't or won't act on a line, it's decoration. "Always write tests" in a file the agent reads but no CI checks is a wish, not a spec.
Walls of prose. Three paragraphs explaining the deploy flow lose to one ❌/✓ pair. If you're writing sentences, you're probably over budget and under-compliance.
Ignoring the limit entirely. The most common one. People write a beautiful 9KB AGENTS.md and never learn that their agent only reads the first half. Check your runtime's injection cap. Then write to it.

How it's actually read

One more thing the spec implies but doesn't say outright: this file is regenerated, not hand-edited. Ours is rewritten before every agent interaction by renderAgentsMd, stamped with the current app's name, port, and domain — which is why the template ends with _This file is overwritten before every interaction. Do not edit manually._ Per-app facts are interpolated; the structure is fixed. If your AGENTS.md is static and hand-maintained, that's fine for a single repo — but the moment you're running agents across many apps, templating it is what keeps every one of them inside the budget and in sync.
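For a sense of what templating looks like, here is a stripped-down sketch. It is not the real renderAgentsMd, whose source isn't shown here; the `AppFacts` shape, the `renderAgentsMdSketch` name, and the decision to throw when over budget are all assumptions for illustration.

```typescript
interface AppFacts {
  name: string;
  liveUrl: string;
  repo: string;
  port: number;
}

const CHAR_BUDGET = 4500;
const FOOTER =
  "_This file is overwritten before every interaction. Do not edit manually._";

// Fixed structure, interpolated per-app facts, budget enforced at render time.
function renderAgentsMdSketch(app: AppFacts): string {
  const rendered = [
    `# ${app.name} \u2014 Agent`, // \u2014 is the dash used in the heading format
    "",
    `App: **${app.name}** at ${app.liveUrl}`,
    `Repo: ${app.repo} | Port: ${app.port}`,
    "",
    "## NEVER (highest priority)",
    '- NEVER claim you "deployed" the app. You write code; the user taps Deploy.',
    "",
    "## Setup",
    "For real work also read STATUS.md, MEMORY.md. Skip for greetings.",
    "",
    FOOTER,
  ].join("\n");

  // Fail loudly here rather than let the runtime truncate silently later.
  if (rendered.length > CHAR_BUDGET) {
    throw new Error(
      `AGENTS.md is ${rendered.length} chars, over the ${CHAR_BUDGET} budget`
    );
  }
  return rendered;
}
```

Throwing at render time is one design choice among several; trimming the lowest-priority section would also work, as long as the overflow is never silent.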

A reference skeleton

Strip it to the load-bearing minimum and you get this — copy it, fill it in, keep it under ~4,500 characters:

# <app> — Agent
App: <app> at <live-url> | Repo: <repo> | Port: <port>

## NEVER (these break the product)
- NEVER <the failure mode that hurts most>
- NEVER <the second one>

## Setup
<the one-line bootstrap>
For real work also read STATUS.md, MEMORY.md. Skip for greetings.

## Rules
- <mode-switching rule with a number>
- <stack default that prevents the recurring crash>
- <tone: concise, outcome-only>

### Bad → Good
- ❌ <a real failure> → ✓ <its fix>

## More docs
- Reference: cat TOOLS.md

## Safety
- Before destructive ops: ask first.

That's the whole spec. Identity, then what-not-to-do, then how-to-start, then how-to-behave, then examples, then a pointer to everything else, then guardrails — ordered so that if the file gets cut off, what survives is what matters most.

AGENTS.md isn't hard to write. It's hard to write short — and short is the entire game, because the agent only ever reads the part that fits. Spend the budget on what breaks if it's missing, and offload the rest. If you want the longer argument for why a file like this beats a longer context window, that's in the companion post.