This week I locked an architectural principle into my AI system’s decisions log.
It was a tool-selection hierarchy I’d been saying out loud for months but never put in writing.
CLI first. Then MCP. Then direct API calls. Quality always overrides class, so the lower-precedence option wins when it produces materially better output, but the default ordering is fixed. Document the override reason when you break the precedence. That’s the rule.
The rule existed in my head because of a pattern I kept hitting. CLI tools tend to be older, better-documented, and built by people who care about Unix composition. MCP servers tend to be newer, glossier, and built fast around a single integration target. Direct API code tends to be the heaviest lift and the easiest to abandon. When three tools can do the same job, the CLI has usually been load-bearing for someone else’s work for years. That earns it the first slot.
So I wrote the principle down. Then I did the thing you’re supposed to do with a freshly-locked principle. I ran it as an audit lens against the current tool stack.
i.The expected output
The expected output was a list of MCP or API usage that should swap to CLI equivalents. That was the whole point. Apply the rule, find the violators, file the swaps.
What I actually got was zero swap candidates.
I’ll walk that, because the zero is more interesting than a swap list would have been.
Every MCP in the stack survived the audit. Google MCP wraps Drive, Docs, Sheets, Calendar, and Gmail through OAuth. There are CLIs in that neighborhood (gcloud, gsutil, gmail-cli forks), but none of them produce the same compositional surface for an agent acting on shared workspace state. The MCP wins on quality. Document the override. Move on.
Zapier MCP fans out to eight thousand apps. There is no CLI equivalent at that scale. Quality override, documented, move on.
Context-mode runs as both a plugin and an MCP because the workflow demands tight loop with the harness. CLI exists, but the plugin form is what’s actually used. Override, documented, move on.
By the third entry the audit had stopped feeling productive. The lens wasn’t finding anything. I almost set it down.
— On audit lenses The principle is theexcuse to look. Thelooking is what findsthe real concern.
ii.What the lens caught
Then the lens caught the thing it wasn’t built to catch.
While walking the MCP list, I noticed a row I hadn’t been thinking about. The Anthropic-managed connectors that ship inside claude.ai itself. The Google Drive and Gmail connectors that just exist in the side panel, no install step, no OAuth flow on my end, no managed credential.
I’d been using them for months without thinking about them as a tool surface.
And the thing about Anthropic-managed connectors is they authenticate against whoever’s logged into claude.ai. Which means they were running against my personal identity, the same identity my entire AI system is supposed to be isolated from.

The OAuth isolation architecture I’d locked back in May said the agent gets its own identity, with its own scopes and its own revocation surface. Single revoke point. Clean separation. The whole point was that if something goes sideways with Bishop’s tool authorities, I yank one credential and the blast radius is contained. My personal Drive, my personal Gmail, my personal Calendar all stay clean because the agent doesn’t have keys to them.
Except the Anthropic-managed connectors had been quietly using my personal keys this whole time. They don’t go through the local Google MCP fork. They don’t carry Bishop’s identity. They authenticate against my logged-in claude.ai session, which is me, which is the exact identity the architecture says the agent doesn’t get.
The isolation wasn’t broken. The isolation was being bypassed by a tool surface the architecture had never accounted for.
iii.The asymmetry
This is what I want to name about the move.
The lens I built was for swaps. The lens that ran was for swaps. The lens found something that wasn’t a swap. And the thing it found was more important than any swap would have been, because a missed swap is a quality cost and a bypassed identity boundary is a security cost.
That asymmetry is the lesson.
When you apply a principle as an audit lens, the principle scans for what it knows how to scan for. The surface concern is in the principle’s name. CLI versus MCP. Precedence violations. Class assignments.
But the act of walking the tool surface, principle in hand, makes you look at things you’d normally walk past. You stop in front of every entry. You ask whether the entry conforms. The principle is the excuse to look. The looking is what finds the real concern.
The Anthropic connectors weren’t a precedence violation. They were higher up the tool stack than CLI versus MCP versus API. They were a who is this even running as question that the precedence rule had no language for. The rule didn’t catch the issue. The rule got me to the row where the issue was sitting, and then my eyes did the rest.
iv.The lens is the walk
I think this is what good principles do when you use them as lenses. They’re not perfect filters. They’re slow walks. The principle gives you a reason to stop at every row in a table you’d otherwise scroll past. Some of those rows are precedence violations. Some are something else entirely.
If I’d run a security audit explicitly, I would have looked for credential leaks and OAuth scope drift and exposed keys. I might have missed the connector row, because the connectors don’t read like a security surface. They read like a UI affordance.
If I’d run a quality audit, I would have looked for better tool fits. I would have missed the connector row, because the connectors look like a quality win (they’re frictionless, fast, baked in).
The tool-selection hierarchy audit caught it because the audit forced me to write down what surface every tool was running on. Once you write that down, the identity column is sitting right next to the precedence column. The principle didn’t see the identity column. I did. But I wouldn’t have been looking at it without the principle making me build the table.
The keep, from this week, is that the tool hierarchy survives as an architectural discipline even though its first audit pass produced no swaps. The discipline isn’t change the stack. The discipline is walk the stack with a reason to stop.
And the portable lesson, the one I think generalizes past my specific system, is that the value of an architectural principle isn’t measured by how often it produces the action its name implies.
A CLI-first rule that finds zero swaps but catches an identity-isolation bypass is doing more work than a CLI-first rule that finds three swaps and nothing else. The principle is the lens. The lens is the walk. The walk is what finds the thing.
The thing isn’t always in the lens’s name.
Sometimes the lens catches what was sitting next to what you thought you were looking for, and that’s the real return.
Drafted with Bishop, my AI partner.
Words picked, edited, and approved by me.