This matters for big engineering teams who want to put _some_ kind of guardrails...

bisonbear · 2026-03-27T19:35:05 1774640105

I'm also thinking about how we can put guardrails on Claude, but more around context changes. For example, if you change AGENTS.md, that affects every dev in the repo. How do we make sure the change is actually beneficial? Thinking further, how do we check that it works with every tool/model used by devs in the repo? Does the change stay stable over time?

nunez · 2026-03-28T06:17:19 1774678639

Given the scope that AGENTS has, I would use PRs to test those changes and discuss them like any other large-impact area of the codebase (like configs).

If you wanted to be more “corporate” about it, then assuming that devs are using some enterprise wrapper around Claude or whatever, I would bake an instruction into the system prompt that ensures that AGENTS is only read from the main branch to force this convention.

This is harder to guarantee since these tools are non-deterministic.
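One way around the non-determinism is to not rely on the prompt at all and have the wrapper itself load the file from the main branch. This is a different technique from the system-prompt instruction above, and only a rough sketch: the function names are hypothetical, and it assumes the wrapper runs inside a git checkout where the branch is called `main`.

```python
import subprocess


def agents_show_command(repo_dir, branch="main", path="AGENTS.md"):
    """Build the git command that prints a file as it exists on a branch.

    Hypothetical helper: `git show <branch>:<path>` reads the committed
    version, ignoring whatever is checked out or edited locally.
    """
    return ["git", "-C", repo_dir, "show", f"{branch}:{path}"]


def read_agents_from_main(repo_dir, branch="main", path="AGENTS.md"):
    """Return the contents of AGENTS.md as committed on `branch`.

    Raises subprocess.CalledProcessError if the branch or file is missing.
    """
    result = subprocess.run(
        agents_show_command(repo_dir, branch, path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

The wrapper would then pass the returned text to the model instead of the working-tree copy, so an unmerged local edit never reaches the agent.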

bisonbear · 2026-03-29T17:55:24 1774806924

PRs for AGENTS.md are necessary, but not sufficient, exactly because of non-determinism. You can LGTM the AGENTS.md change, but it's so hard to know what downstream behavioral effects it has. I feel like the only way to really know is by building a benchmark on your repo and actually A/B testing the AGENTS.md change. I'm building something in the space - happy to share if that sounds interesting to you.
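To make the A/B idea concrete, here is a minimal sketch of the comparison step, not tied to any particular tool. It assumes you have already run the same benchmark tasks several times under the old and the new AGENTS.md and recorded pass/fail for each run. The names `Comparison` and `compare_pass_rates` are made up for illustration, and the two-proportion z-test is just one reasonable choice for deciding whether a difference is more than noise.

```python
import math
from dataclasses import dataclass


@dataclass
class Comparison:
    rate_a: float   # pass rate with the old AGENTS.md
    rate_b: float   # pass rate with the new AGENTS.md
    diff: float     # rate_b - rate_a
    p_value: float  # two-sided p-value from a two-proportion z-test


def compare_pass_rates(passes_a, passes_b):
    """Compare pass/fail outcomes (lists of bools) from two variants."""
    n_a, n_b = len(passes_a), len(passes_b)
    if n_a == 0 or n_b == 0:
        raise ValueError("each variant needs at least one run")

    rate_a = sum(passes_a) / n_a
    rate_b = sum(passes_b) / n_b

    # Pooled pass rate under the null hypothesis of "no difference".
    pooled = (sum(passes_a) + sum(passes_b)) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))

    if se == 0:
        # Every run passed, or every run failed: nothing to distinguish.
        p_value = 1.0
    else:
        z = (rate_b - rate_a) / se
        # Two-sided tail probability of a standard normal.
        p_value = math.erfc(abs(z) / math.sqrt(2))

    return Comparison(rate_a, rate_b, rate_b - rate_a, p_value)
```

Because the agent is non-deterministic, a handful of runs per variant will rarely be enough to separate a real effect from noise; rerunning the same comparison later, or per tool/model, is how you would check that the change stays stable.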

dominotw · 2026-03-27T20:02:46 1774641766

NO EXCEPTIONS!!!!!!!!!!!!!!!!!!!!!!!!

Cute that you think Claude gives a rat's ass about this.

nunez · 2026-03-28T06:18:27 1774678707

Claude won’t do me wrong; that’s what the exclamation marks are for!