Best practices

Get the most out of FlareCode — write tasks agents can finish, pick the right mode and model, and review fast.

FlareCode does its best work when you hand it a clear outcome, give it a map of the repo, and review what comes back like any other Pull Request. These are the habits that separate a one-shot PR from a stalled task.

Write a task the agent can finish

The single biggest lever is the task itself. Describe the outcome and the bar for "done" — the agent plans the steps.

Add rate limiting to POST /api/login: max 5 requests/min per IP,
return 429 with a JSON error. Add a passing test. Open a PR.

State the done condition. "Add a passing test", "make pnpm typecheck clean", "the endpoint returns 429" — a checkable bar lets the agent evaluate and correct itself.
One outcome per task. Two unrelated changes are two tasks — easier to review, easier to roll back, cheaper when one fails.
Point at the code. Naming a file or directory ("in apps/api/src/routes") is optional but speeds things up.
Name your constraints. "No new dependencies", "keep the existing test style", "don't touch the schema."
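
To make "a checkable bar" concrete, here is a minimal sketch of what the done condition from the example task could look like once it is written down as a test. The SlidingWindowLimiter class, its method names, and the in-memory sliding window are illustrative assumptions, not FlareCode output or a prescribed implementation.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Hypothetical in-memory limiter: at most `limit` requests per `window` seconds per IP."""

    def __init__(self, limit=5, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so tests do not have to sleep
        self.hits = defaultdict(deque)

    def check(self, ip):
        """Return the status the endpoint would send: 200 if allowed, 429 if limited."""
        now = self.clock()
        recent = self.hits[ip]
        # Drop timestamps that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.limit:
            return 429
        recent.append(now)
        return 200


def test_sixth_request_in_a_minute_is_rejected():
    fake_now = [0.0]
    limiter = SlidingWindowLimiter(clock=lambda: fake_now[0])
    assert [limiter.check("10.0.0.1") for _ in range(5)] == [200] * 5
    assert limiter.check("10.0.0.1") == 429  # sixth request inside the window
    assert limiter.check("10.0.0.2") == 200  # other IPs are unaffected
    fake_now[0] = 60.0
    assert limiter.check("10.0.0.1") == 200  # the window has rolled over
```

A test like this gives the agent something it can run, fail, and fix against, which is what a vague request such as "make login safer" lacks.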

Pick the mode: goals or interactive

Same agent, same sandbox, same PR gate — the difference is how much you steer.

	Goals	Interactive
You	hand off, review later	steer in real time
Best for	concrete deliverables	exploration, tight loops
Output	a Pull Request	a Pull Request

Reach for goals when the task has a clear finish line and you want to walk away. Reach for interactive when you're exploring an unfamiliar codebase or the work needs back-and-forth. They mix: start a goal, jump in to nudge it, then hand the rest back. See Interactive vs goals.

Give the agent a map

The agent reads your repo, not your mind. Two files do most of the work:

AGENTS.md at the repo root — conventions, commands, and rules the executor reads natively (how to run tests, what not to touch, the stack). Keep it current; it's the cheapest way to raise output quality.
SKILLS.md — repeatable procedures the agent can invoke.

If your repo already has these, the agent follows them. If it doesn't, a short AGENTS.md ("use pnpm, run pnpm test, never edit generated/") pays for itself on the first task.
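
If you want to bootstrap that file across several repos, a small script is enough. The sketch below writes a starter AGENTS.md built from the one-line example above and never overwrites an existing file. The section headings inside the starter text and the ensure_agents_md helper are assumptions for illustration; use whatever layout fits your repo.

```python
from pathlib import Path

# Starter content built from the one-line example above; adjust to your repo.
STARTER = """\
# AGENTS.md

## Commands
- Use pnpm for installs and scripts.
- Run tests with: pnpm test

## Rules
- Never edit files under generated/.
"""


def ensure_agents_md(repo_root):
    """Create AGENTS.md at the repo root if missing. Return True if a file was written."""
    path = Path(repo_root) / "AGENTS.md"
    if path.exists():
        return False  # never overwrite conventions someone already wrote
    path.write_text(STARTER, encoding="utf-8")
    return True
```

Calling ensure_agents_md(".") from a checkout creates the file on the first run and leaves it alone afterwards.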

Choose a model for the job

Model choice is per agent and can change later.

Bundled (default) — routine work, cost-predictable, no keys to manage.
A frontier model via BYOK — hard refactors, gnarly debugging, anything where you want the strongest reasoning and are happy to pay provider rates.

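Because the choice is per agent, you can run a cheap bundled model on the agent that handles chores and a frontier BYOK (bring-your-own-key) model on the agent working on a tricky service. Details in Choosing a model.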

Run a fleet, not a queue

FlareCode is built for many repos at once. Don't serialize work you can parallelize:

One agent per concern — a bug fix and a docs change run side by side, not back to back.
Spread across repos — each agent has its own sandbox and review gate.
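
As a rough illustration of submitting side by side instead of back to back, the sketch below fans a list of (repo, task) pairs out over a thread pool. The submit_task function is a hypothetical stand-in for however you actually submit a task, and the repo names are made up; nothing here is a FlareCode API.

```python
from concurrent.futures import ThreadPoolExecutor


def submit_task(repo, task):
    """Hypothetical stand-in for your real submission call; replace the body."""
    return f"{repo}: submitted '{task}'"


def submit_all(jobs, max_workers=4):
    """Submit every (repo, task) pair concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda job: submit_task(*job), jobs))


# One concern per task, spread across repos (names are illustrative).
jobs = [
    ("acme/api", "Fix the failing login test. Open a PR."),
    ("acme/docs", "Update the install guide for pnpm. Open a PR."),
]
results = submit_all(jobs)
```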

See Multi-repo.

Keep costs predictable

Every task runs under a maxCostUsd cap — if it's exceeded, the task stops and posts a failure log rather than running away. Set the cap to match the task's size, and watch month-to-date spend in Usage & billing. Failed tasks aren't billed. More in Costs and limits.
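
The cap behaviour described above can be pictured with a toy model: add up the cost of each step and stop as soon as the running total exceeds the cap. This is only a sketch for reasoning about cap sizes. The run_with_cap function, the per-step costs, and the result fields are assumptions, not how FlareCode meters tasks internally.

```python
def run_with_cap(step_costs_usd, max_cost_usd):
    """Toy model: accumulate per-step costs and stop once the cap is exceeded."""
    spent = 0.0
    for step, cost in enumerate(step_costs_usd, start=1):
        spent += cost
        if spent > max_cost_usd:
            # The task stops here and reports why, instead of running away.
            return {
                "status": "failed",
                "steps_run": step,
                "log": f"maxCostUsd {max_cost_usd} exceeded after step {step}",
            }
    return {"status": "completed", "steps_run": len(step_costs_usd), "log": ""}


small_task = run_with_cap([0.25, 0.5], max_cost_usd=1.0)
oversized_task = run_with_cap([0.25, 0.5, 0.5, 0.5], max_cost_usd=1.0)
```

In this model the first call completes, while the second stops at its third step. A cap that is too tight for the task's size turns a task that would have finished into a failure, which is why the cap should track the expected amount of work.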

Review like a PR

Every change ships as a branch named flarecode/task-<id> and opens a Pull Request. The agent never pushes to your default branch — you merge, on your terms.

Read the diff and the agent's reasoning before merging.
Let CI run; the PR shows status.
Treat it like a teammate's PR — request changes by replying with a follow-up task.

Auto-merge on green CI is available on every plan, but it is opt-in and off by default. See Submit a task and the Security model.
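
Because agent branches follow the flarecode/task-<id> naming, your own tooling can recognise them, for example to route their PRs to a reviewer. The sketch below is one way to do that check. It assumes <id> is a non-empty string with no slash, which this page does not specify, and agent_task_id is a hypothetical helper name.

```python
import re

# Assumption: <id> is any non-empty run of characters without a slash.
AGENT_BRANCH = re.compile(r"flarecode/task-([^/]+)")


def agent_task_id(branch):
    """Return the task id for an agent branch, or None for any other branch."""
    match = AGENT_BRANCH.fullmatch(branch)
    return match.group(1) if match else None


branches = ["main", "flarecode/task-123", "feature/login", "flarecode/task-abc9"]
agent_branches = [b for b in branches if agent_task_id(b) is not None]
```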

Iterate when it stalls

If a goal gets stuck, you don't have to start over. Jump into interactive to course-correct ("use the existing helper, not a new one"), then hand the rest back as a goal. For common failure causes, see Troubleshooting.
