Your code runs inside an agent we host, so the box around it has to be genuinely hard to get out of. We ran a pentest against it and closed the findings.
What changed
- Model credentials never enter the sandbox. Every model call — on the free tier and with your own BYOK keys — is brokered server-side. No provider key is ever present on the machine your code runs on, and the per-task cost cap is enforced at that same boundary.
- Git credentials never hit disk. Clones, fetches, and pushes authenticate per-command; nothing is written into the repository's git config, so a filesystem read turns up no reusable token.
- The agent runs unprivileged. The agent process and everything it spawns run as an ordinary, confined user scoped to its
/workspace— a compromised run can't read system files or reach another workspace. The privilege drop is fail-safe by design. - We red-team ourselves on every change. An automated suite spins up a throwaway agent, actively tries to exfiltrate secrets and break out of the sandbox, scans everything the session returned for leaks, and fails the build if anything escapes.
Why
"Hand your repo to a hosted agent" is a real ask. We'd rather attack our own sandbox on every commit than find out the hard way.