Agent Scopes in Practice

Last week I wrote about agent scopes — borrowing the OAuth concept of permission boundaries and applying it to AI coding agents. The idea was mostly conceptual: define what an agent is allowed to touch, surface hidden complexity early, give it a framework for flagging when it hits a wall.

Since then I've built it out into a working tool: agent-scope-skill. This post covers what changed when I moved from idea to implementation.

From inline notation to YAML files

The original post used a compact inline notation to express what was granted and denied:

Granted: `+edit:endpoint`. Denied: `-add:endpoint`, `-add:api-client`

The goal was never to include full scope definitions in the prompt, only the grant/deny identifiers. Even that becomes unwieldy quickly: a given project could have tens, if not hundreds, of scopes, and you don't want to spell out a denial for every one of them.

The real implementation separates two concerns. Scope definitions live in agent-scope.yaml files co-located with source code; the agent reads these to understand what each scope permits or forbids, and which paths it covers. Scope identifiers appear in the prompt, but only the granted ones. Everything not explicitly granted is implicitly denied. The skill includes tools to programmatically convert the granted scopes into a comprehensive scoping report detailing what is allowed and what is not.

A scope definition looks like this:

scopes:
  edit:api-endpoint:
    allow: |
      You may modify the logic of existing API endpoints.
      This includes changing validation rules, response shapes, status codes,
      and business logic within existing controllers.
      You must not change the endpoint path or HTTP method.
    deny: |
      Modifying existing API endpoint logic is not in scope for this task.
      Leave all existing controllers and route handlers unchanged.
    paths:
      - src/api/routes/**
      - src/api/controllers/**
    labels:
      - api

Both allow and deny are written in natural language — because ultimately these become part of what the agent reads. The paths field gives the tooling something concrete to work with when auditing file changes.
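To make the role of paths concrete, here is a minimal sketch of a path-based audit. It is not the skill's actual code. The dict stands in for a parsed agent-scope.yaml, the add:migration paths are made up for illustration, and fnmatchcase only approximates whatever glob semantics the real tool uses.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory form of parsed scope definitions (only `paths` shown).
SCOPES = {
    "edit:api-endpoint": {
        "paths": ["src/api/routes/**", "src/api/controllers/**"],
    },
    # The paths for this scope are an assumption, not taken from a real project.
    "add:migration": {"paths": ["db/migrations/**"]},
}


def scopes_covering(path, scopes):
    """Return the sorted names of scopes whose path globs match `path`."""
    return sorted(
        name
        for name, definition in scopes.items()
        if any(fnmatchcase(path, glob) for glob in definition.get("paths", []))
    )


def audit(changed_files, granted, scopes):
    """Map each changed file to the ungranted scopes that cover it."""
    violations = {}
    for path in changed_files:
        denied = [s for s in scopes_covering(path, scopes) if s not in granted]
        if denied:
            violations[path] = denied
    return violations
```

Files that no scope covers are ignored here. Scopes that share the same paths (an add and an edit variant of one concern, say) would need a tie-break rule that this sketch does not attempt.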

The CLI

The skill ships a Python CLI that the agent can run via uvx with no installation:

uvx --from agent-scope/assets agent-scope <command>

There are five commands:

determine-scope [SCOPE...] — reports every scope as ALLOWED or DENIED
describe-scope SCOPE_NAME — shows the full definition of one scope
describe-all-scopes — shows everything, useful before requesting scopes
audit-scope FILE... — checks changed files against denied scopes (exits 1 on violations)
validate — lints all agent-scope.yaml files

The one that matters most is determine-scope. An agent runs it at the start of a task after being handed its granted scopes. The output is a structured report — which scopes are ALLOWED, which are DENIED, what each one permits or forbids, and which paths it covers. The agent reads this and works within it.
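As a rough illustration of what such a report involves, here is a sketch that marks every known scope ALLOWED or DENIED and prints the matching allow or deny text. The function name, the report layout, and the add:api-endpoint definition are all assumptions for the example; the real command's output format may differ.

```python
# Hypothetical parsed scope definitions; texts are abbreviated or invented.
SCOPES = {
    "edit:api-endpoint": {
        "allow": "You may modify the logic of existing API endpoints.",
        "deny": "Modifying existing API endpoint logic is not in scope for this task.",
        "paths": ["src/api/routes/**", "src/api/controllers/**"],
    },
    "add:api-endpoint": {
        "allow": "You may add new API endpoints.",
        "deny": "Adding new API endpoints is not in scope for this task.",
        "paths": ["src/api/routes/**", "src/api/controllers/**"],
    },
}


def determine_scope(granted, scopes):
    """Build a plain-text report marking every scope ALLOWED or DENIED."""
    unknown = sorted(set(granted) - set(scopes))
    if unknown:
        raise ValueError(f"unknown scopes: {', '.join(unknown)}")
    lines = []
    for name in sorted(scopes):
        definition = scopes[name]
        allowed = name in granted
        lines.append(f"{name}: {'ALLOWED' if allowed else 'DENIED'}")
        # Show the allow text for granted scopes and the deny text otherwise.
        lines.append(f"  {definition['allow' if allowed else 'deny']}")
        lines.append(f"  paths: {', '.join(definition['paths'])}")
    return "\n".join(lines)


print(determine_scope(["edit:api-endpoint"], SCOPES))
```

Rejecting unknown scope names up front is a design choice in this sketch: a typo in the grant list fails loudly instead of silently denying everything.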

The whole thing relies on a snippet in AGENTS.md that tells any agent opening the repo that the project is covered by agent scopes and that it should load the skill. The initialisation step adds this automatically when onboarding a new project.

What the naming convention resolved

The original post described a taxonomy loosely: add:endpoint, edit:endpoint, remove:endpoint. The implementation makes this a hard convention — every logical concern must define all three variants. Omitting one is caught by validate.

This matters because partial definitions create ambiguity. If add:component and edit:component are defined but remove:component isn't, what does the agent do when it needs to delete one? It guesses — and it may decide that edit:component covers removal because editing is vaguely adjacent. Requiring all three forces you to be deliberate about every operation type.
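The check itself is simple. A sketch of the idea, assuming scope names always take the form verb:concern (the function name is mine, not the tool's):

```python
from collections import defaultdict

REQUIRED_VERBS = ("add", "edit", "remove")


def missing_variants(scope_names):
    """Return {concern: [missing verbs]} for concerns lacking a variant."""
    verbs_by_concern = defaultdict(set)
    for name in scope_names:
        # "edit:api-endpoint" -> verb "edit", concern "api-endpoint"
        verb, _, concern = name.partition(":")
        verbs_by_concern[concern].add(verb)
    return {
        concern: [v for v in REQUIRED_VERBS if v not in verbs]
        for concern, verbs in sorted(verbs_by_concern.items())
        if not set(REQUIRED_VERBS) <= verbs
    }
```

Given add:component and edit:component only, this reports that component is missing remove, which is exactly the ambiguous case described above.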

The planning phase

One gap the original post left open: what does an agent do before it has scopes? How does it know what to request?

The describe-all-scopes command addresses this. An agent without granted scopes can run it against the project root, read every defined scope in full, and then propose the specific ones it needs for the task. Instead of "I'll need permission to touch the API layer", it can say "I'll need edit:api-endpoint and add:migration."

This matters most in fully autonomous workflows. A human hands an agent a task description and a codebase. The agent runs describe-all-scopes, drafts a plan, and returns with both the plan and a list of scopes it expects to need. The human reviews and approves — or adjusts the scope list. From that point, the agent goes away and implements it entirely without further human involvement. The scopes are what make that handoff trustworthy. You know before implementation starts exactly what the agent is and isn't allowed to touch.

Implicit denial, per scope

This is where the YAML definitions do real work. The original post flagged the implicit-versus-explicit denial question as genuinely unresolved:

"agents will likely handle explicit denial better in practice — a clear 'you cannot do X' in the prompt will likely result in better adherence"

The implementation lands somewhere in between. Each scope definition has an assume field, which defaults to deny when omitted. So ungranted scopes are implicitly denied at the scope level, but the deny text in each definition makes the denial explicit in the output the agent reads. You get the cleanliness of implicit-deny-by-default for the input with the clarity of explicit denial text in the context.

One scope — edit:tests in the example project — has assume: allow, which means it's granted unless explicitly overridden. Tests are almost always fair game; this avoids having to include edit:tests in every task's grant list.
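The resolution rule can be sketched in a few lines. This is my reading of the behaviour, not the tool's code: the denied parameter is a hypothetical stand-in for however an assume: allow scope gets explicitly overridden.

```python
def is_allowed(name, definition, granted, denied=()):
    """Resolve one scope: explicit deny or grant wins, else fall back to `assume`."""
    if name in denied:  # hypothetical explicit override
        return False
    if name in granted:
        return True
    # `assume` defaults to "deny" when the definition omits it.
    return definition.get("assume", "deny") == "allow"


# edit:tests opts in to allow-by-default; the other scope relies on the default.
SCOPES = {
    "edit:tests": {"assume": "allow"},
    "edit:api-endpoint": {},
}
for scope_name, scope_definition in SCOPES.items():
    status = "ALLOWED" if is_allowed(scope_name, scope_definition, granted=set()) else "DENIED"
    print(scope_name, status)
```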

How it's held up in practice

I've been running this across a few real projects. The thing that's stood out most is the difference the planning step makes. When I did the upfront planning — letting the agent survey the scopes and propose what it needed — it never came back mid-task to ask for additional scope. Not once. When I skipped that step and just handed it scopes directly, it did come back. Several times. That's the pre-task planning conversation doing its job: surfacing what's actually needed before a line of code is written, rather than discovering it mid-implementation. The agent works with more confidence too. It knows what's in bounds, it knows what isn't, and it doesn't hedge.

Local, interactive use is a different story — and honestly, it was never really the target. When you're sitting alongside the agent, watching it work and steering it in real time, you're already the scope. You're there to redirect it if it veers off course. Adding the formal overhead of defining and passing scopes just introduces friction that you'd otherwise handle naturally. This skill earns its keep when the agent is operating without you.

The most amusing thing I discovered: you need an add:scope scope. I wanted to add a new scope definition to a project mid-task, but I hadn't granted myself that permission. A genuine catch-22, and a sign the system is working as intended.

Trying it out

The repo is at github.com/almcc/agent-scope-skill. To add it to an existing project, install the skill and ask the agent to initialise the project for agent scopes. It will pick the closest starter template for your project type (frontend, backend API, CLI, library), adjust the paths: globs to match your directory structure, and wire up the AGENTS.md constraint so every subsequent agent is aware of it from the start.

What's next

Some things I'd like to do next.

A pre-commit hook that checks files being committed against the paths defined in the active scopes. If any changed file falls outside the permitted paths, the commit fails with a clear explanation of which scope was violated. For this to work, the tooling would need to cache the current grant list somewhere that the hook can read it. But the payoff is significant: an agent operating without a human gets an immediate feedback loop at the exact moment it tries to commit. It can't silently drift outside its bounds and only discover the problem at review.
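None of this exists yet, so the following is only a sketch of how such a hook might look. The cache location (.agent-scope/granted.json), its JSON-list format, and the function names are all hypothetical. The only real external piece is git diff --cached --name-only, which lists staged files.

```python
import json
import subprocess
import sys
from fnmatch import fnmatchcase
from pathlib import Path

GRANT_CACHE = Path(".agent-scope/granted.json")  # hypothetical cache location


def files_outside_scope(staged, granted, scopes):
    """Return staged files not covered by the paths of any granted scope."""
    permitted = [
        glob
        for name in granted
        for glob in scopes.get(name, {}).get("paths", [])
    ]
    return [
        path for path in staged
        if not any(fnmatchcase(path, glob) for glob in permitted)
    ]


def main(scopes):
    """Hook entry point: return 1 (fail the commit) if any file is out of scope."""
    granted = json.loads(GRANT_CACHE.read_text())  # assumed to be a JSON list
    staged = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()
    outside = files_outside_scope(staged, granted, scopes)
    for path in outside:
        print(f"{path}: outside granted scopes {granted}", file=sys.stderr)
    return 1 if outside else 0
```

A real hook would load scopes from the agent-scope.yaml files and call sys.exit(main(scopes)); that wiring is left out here because the loading API is not something I want to guess at.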

Adding the granted scopes to the commit as a commit trailer (Scope: edit:api-endpoint, add:migration), so every commit carries a machine-readable record of the permission set it was made under. That flows naturally into everything downstream: CI, review tooling, audit logs.
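Writing and reading such a trailer is straightforward. A sketch, assuming a single comma-separated Scope: line in the final paragraph of the commit message (the helper names are illustrative, and git's own interpret-trailers command could handle the parsing instead):

```python
def format_scope_trailer(granted):
    """Render the granted scopes as a single commit trailer line."""
    return "Scope: " + ", ".join(granted)


def parse_scope_trailer(message):
    """Return the scopes listed in Scope: trailers of a commit message."""
    # Trailers live in the last paragraph of the message.
    last_paragraph = message.strip().split("\n\n")[-1]
    scopes = []
    for line in last_paragraph.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "scope":
            scopes.extend(s.strip() for s in value.split(",") if s.strip())
    return scopes


message = (
    "Tighten validation on user endpoint\n\n"
    "Reject empty names.\n\n"
    + format_scope_trailer(["edit:api-endpoint", "add:migration"])
)
print(parse_scope_trailer(message))
```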

Scope adherence checking as part of MR review. When an automated agent is reviewing a merge request, the scopes give it something concrete to work with beyond just reading the diff — it can verify that the changes actually stay within the granted scopes, flag anything that drifted, and reason about whether the implementation matches what was originally approved. The scope becomes part of the review contract. I always prefer the reviewer to be a different LLM than the one that wrote the code. Two LLMs rarely make the same misinterpretation — so where the author skipped over something, the reviewer is less likely to skip over it too.

Scope labels on MRs, whether or not the change was agent-originated. Tagging an MR with the scopes it touches gives reviewers a signal before they open the diff. An MR labelled edit:api-endpoint gets routed differently to one labelled add:migration, edit:schema. This matters more than it might seem. The ability to create code with AI is already outpacing the number of people available to review it — that gap is only going to widen. Prioritising the review queue is going to become one of the harder problems in AI-assisted development, and lightweight signals like scope labels are the kind of thing that help.