How I write software with LLMs

akhrail1996 · 2026-03-16T06:18:17 1773641897

Genuine question: what's the evidence that the architect → developer → reviewer pipeline actually produces better results than just... talking to one strong model in one session?

The author uses different models for each role, which I get. But I run production agents on Opus daily and in my experience, if you give it good context and clear direction in a single conversation, the output is already solid. The ceremony of splitting into "architect" and "developer" feels like it gives you a sense of control and legibility, but I'm not convinced it catches errors that a single model wouldn't catch on its own with a good prompt.

arialdomartini · 2026-03-16T07:56:51 1773647811

This is anecdotal but just a couple days ago, with some colleagues, we conducted a little experiment to gather that evidence.

We used a hierarchy of agents to analyze a requirement, letting agents with different personas (architect, business analyst, security expert, developer, infra etc) discuss a request and distill a solution. They all had access to the source code of the project to work on.

Then we provided the very same input, including the personas' definition, straight to Claude Code, and we compared the result.

They council of agents got to a very good result, consuming about 12$, mostly using Opus 4.6.

To our surprise, going straight with a single prompt in Claude Code got to a similar good result, faster and consuming 0.3$ and mostly using Haiku.

This surely deserves more investigation, but our assumption / hypothesis so far is that coordination and communication between agents has a remarkable cost.

Should this be the case, I personally would not be surprised:

- the reason why we humans do job separation is because we have an inherent limited capacity. We cannot reach the point to be experts in all the needed fields : we just can't acquire the needed knowledge to be good architects, good business analysts, good security experts. Apparently, that's not a problem for a LLM. So, probably, job separation is not a needed pattern as it is for humans.

- Job separation has an inherent high cost and just does not scale. Notably, most of the problems in human organizations are about coordination, and the larger the organization the higher the cost for processes, to the point processed turn in bureaucracy. In IT companies, many problems are at the interface between groups, because the low-bandwidth communication and inherent ambiguity of language. I'm not surprised that a single LLM can communicate with itself way better and cheaper that a council of agents, which inevitably faces the same communication challenges of a society of people.

titanomachy · 2026-03-16T09:47:46 1773654466

If it could be done with 30 cents of Haiku calls, maybe it wasn't a complicated enough project to provide good signal?

jimbokun · 2026-03-22T04:34:22 1774154062

In that case a $12 program is probably too big to meaningfully review. Probably better to have smaller chunks you can review instead of generating one really large program in one shot.

arialdomartini · 2026-03-16T10:20:15 1773656415

Fair point. I could try with a harder problem. This still does not explain why Claude Code felt the need to use Opus, and why Opus felt the need to burn 12$ or such an easy task. I mean, it's 40 times the cost.

titanomachy · 2026-03-16T10:40:34 1773657634

I'm a bit confused actually, you said you used Claude Code for both examples? Was that a typo, or was it (1) Claude Code instructed to use a hierarchy of agents and (2) Claude Code allowed to do whatever it wants?

throwawayffffas · 2026-03-17T10:47:10 1773744430

I think the benefit may be task separation and cleaning the context between tasks. Asking a single session to do all three has a couple of downsides.

1. The context for each task gets longer, which we know degrades performance.

2. In that longer context, implicit decisions are made in the thinking steps, the model is probably more likely to go through with bad decisions that were made 20 steps back.

The way Stavros does it, is Architect -> Dev -> Review. By splitting the task in three sessions, we get a fresh and shorter context for each task. At minimum skipping the thinking messages and intermediary tool output, should increase the chances of a better result.

Using different agent personas and models at least introduces variability at the token generation, whether it's good or bad, I do not know. As far as I know in general it's supposed to help.

Having the sessions communicate I think is a mistake, because you lose all of the benefits of cleaning up the context, and given the chattiness of LLMs you are probably going to fill up the context with multiple thinking rounds over the same message, one from the session that outputs the message and one from the session reading the message, you are probably going to have competing tool uses, each session using it's own tool calls to read the same content, it will probably be a huge mess.

The way I do it is I have a large session that I interact with and task with planning and agent spawning. I don't have dedicated personas or agents. The benefits the way I see them are I have a single session with an extensive context about what we are doing and then a dedicated task handler with a much more focused context.

What I have seen with my setup is, impressively good performance at the beginning that degrades as feedback and tweaks around work pile up.

TimTheTinker · 2026-03-19T15:24:18 1773933858

Framing LLM use for dev tasks as "narrative" is powerful.

If you want specific, empirical, targeted advice or work from an LLM, you have to frame the conversation correctly. "You are a tenured Computer Science professor agent being consulted on a data structure problem" goes a very long way.

Similarly, context window length and prior progress exerts significant pressure on how an LLM frames its work. At some point (often around 200k-400k tokens in), they seem to reach a "we're in the conclusion of this narrative" point and will sometimes do crazy stuff to reach whatever real or perceived goal there is.

mikkupikku · 2026-03-16T18:02:44 1773684164

Probably the same reason it takes a team of developers and managers 6 months to write what one or two developers can do on their own in one week. The overhead caused by constant meetings and negotiations is massive.

andrekandre · 2026-03-21T02:59:51 1774061991

  > The overhead caused by constant meetings and negotiations is massive.

this is my life ngl. i really wish these ai companies would work in automating away all this bullshit instead of just code code code

just the other day i was asked to prepare slides for a presentation about something everyone already knows (among many other useless side-work)... i feel like with "ai" in general we are applying bandages where my real problem is the big machine that gives me paper cuts all day...

jimbokun · 2026-03-22T04:32:17 1774153937

Even with humans I’ve found full ownership of a project from architecture to implementation to deployment and operation, produces the best results.

Less context switching and communication overhead. Focus on well thought out and documented APIs to divide work across developers and support communication and collaboration .

Miraste · 2026-03-16T18:35:43 1773686143

LLMs also don't have the primary advantage humans get from job separation, diverse perspectives. A council of Opuses are all exploring the exact same weights with the exact same hardware, unlike multiple humans with unique brains and memories. Even with different ones, Codex 5.3 is far more similar to Opus than any two humans are to each other. Telling an Opus agent to focus on security puts it in a different part of the weights, but it's the same graph-- it's not really more of an expert than a general Opus agent with a rule to maintain secure practices.

visarga · 2026-03-16T19:08:34 1773688114

You can differentiate by context, one sees the work session, the other sees just the code. Same model, but different perspectives. Or by model, there are at least 7 decent models between the top 3 providers.

Miraste · 2026-03-16T20:49:48 1773694188

I know, but none of those is nearly as much of a difference as another human looking at code. The top models have such overlapping training data they sometimes identify as each other.

visarga · 2026-03-16T19:06:33 1773687993

An ensemble can spot more bugs / fixes than a single model. I run claude, codex and gemini in parallel for reviews.

2026-03-16T08:38:20 1773650300

[dead]

theshrike79 · 2026-03-17T08:42:50 1773736970

Agentic pipelines and systems fall into the same issues as humans who work together, mostly communication.

It's not like they can dump their full context to the "manager" agent, they need to condense stuff, which will result in misinterpreted information or missing information on decisions down the line.

IMO this was more relevant when agents had limited context windows

AtreusS · 2026-03-17T12:00:41 1773748841

This I believe is true. Have been working on an Agentic architecture and whenever there was a new requirement the simple workflow was to create an specific agent for that. Earlier the context windows were small and this was the default solution. Overtime our total agents have become vast that it is a headache to maintain and debug.

BloondAndDoom · 2026-03-16T22:02:51 1773698571

Absolutely works with frontier models. What do you think about smaller models in these pipelines? That’s literally what I’m working on, with qwen3.5-27b and im splitting the task to 4 steps and not sure if that’s the way to go. Do you have any experience to share?

moduspol · 2026-03-16T15:44:13 1773675853

To me, such techniques feel like temporary cudgels that may or may not even help that will be obsolete in 1-6 months.

This is similar to telling Claude Code to write its steps into a separate markdown file, or use separate agents to independently perform many tasks, or some of the other things that were commonly posted about 3-6+ months ago. Now Claude Code does that on its own if necessary, so it's probably a net negative to instruct it separately.

Some prompting techniques seem ageless (e.g. giving it a way to validate its output), but a lot of these feel like temporary scaffolding that I don't see a lot of value in building a workflow around.

TheMuenster · 2026-03-16T18:08:01 1773684481

Totally agree - the fundamental concept here of automatically improving context control when writing code is absolutely something that will be baked into agents in 6 months. The reason it hasn't yet is mainly because the improvements it makes seem to be very marginal.

You can contrast this to something like reasoning, which offered very large, very clear improvements in fundamental performance, and as a result was tackled very aggressively by all the labs. Or (like you mentioned) todo lists, which gave relatively small gains but were implemented relatively quickly. Automatic context control is just going to take more time to get it right, and the gains will be quite small.

visarga · 2026-03-16T19:12:28 1773688348

Workflow matters too, how you organize your docs, work tasks, reviews. If you do it all by hand you spend a lot of time manually enforcing a process that can be automated.

I think task files with checkable gates are a very interesting animal - they carry intent, plan, work and reviews, at the end of work can become docs. Can be executed, but also passed as value, and reflect on themselves - so they sport homoiconicity and reflexion.

kybernetikos · 2026-03-16T07:33:42 1773646422

There's a lot of cargo culting, but it's inevitable in a situation like this where the truth is model dependent and changing the whole time and people have created companies on the premise they can teach you how to use ai well.

_heimdall · 2026-03-16T14:30:51 1773671451

Its also inevitable given that we still don't even really know how these models work or what they do at inference time.

We know input/output pairs, when using a reasoning model we can see a separate stream of text that is supposedly insight into what the model is "thinking" during inference, and when using multiple agents we see what text they send to each other. That's it.

andrekandre · 2026-03-21T03:03:33 1774062213

  > separate stream of text that is supposedly insight into what the model is "thinking" during inference

taking a look at those streams is almost disturbing and hilarious at the same time... like looking into the mind of a paranoiac.

never_inline · 2026-03-16T09:39:37 1773653977

I think this is just anthropomorphism. Sub agents make sense as a context saving mechanism.

Aider did an "architect-editor" split where architect is just a "programmer" who doesn't bother about formatting the changes as diff, then a weak model converts them into diffs and they got better results with it. This is nothing like human teams though.

TheMuenster · 2026-03-16T17:55:43 1773683743

Absolutely agree with this. The main reason for this improving performance is simply that the context is being better controlled, not that this approach is actually going going to yield better results fundamentally.

Some people have turned context control into hallucinated anthropomorphic frameworks (Gas Town being perhaps the best example). If that's how they prefer to mentally model context control, that's fine. But it's not the anthropomorphism that's helping here.

jaredklewis · 2026-03-16T06:59:29 1773644369

> what's the evidence

What’s the evidence for anything software engineers use? Tests, type checkers, syntax highlighting, IDEs, code review, pair programming, and so on.

In my experience, evidence for the efficacy of software engineering practices falls into two categories:

- the intuitions of developers, based in their experiences.

- scientific studies, which are unconvincing. Some are unconvincing because they attempt to measure the productivity of working software engineers, which is difficult; you have to rely on qualitative measures like manager evaluations or quantitative but meaningless measures like LOC or tickets closed. Others are unconvincing because they instead measure the practice against some well defined task (like a coding puzzle) that is totally unlike actual software engineering.

Evidence for this LLM pattern is the same. Some developers have an intuition it works better.

codemog · 2026-03-16T07:16:24 1773645384

My friend, there’s tons of evidence of all that stuff you talked about in hundreds of papers on arxiv. But you dismiss it entirely in your second bullet point, so I’m not entirely sure what you expect.

jaredklewis · 2026-03-16T13:50:42 1773669042

I’ve read dozens of them and find them unconvincing for the reasons outlined. If you want a more specific critique, link a paper.

I personally like and use tests, formal verification, and so on. But the evidence for these methods are weak.

edit: To be clear, I am not ragging on the researchers. I think it's just kind of an inherently messy field with pretty much endless variables to control for and not a lot of good quantifiable metrics to rely on.

thesz · 2026-03-16T07:16:55 1773645415

You can measure customer facing defects.

Also, lines of code is not completely meaningless metric. What one should measure is lines of code that is not verified by compiler. E.g., in C++ you cannot have unbalanced brackets or use incorrectly typed value, but you still may have off-by-one error.

Given all that, you can measure customer facing defect density and compare different tools, whether they are programming languages, IDEs or LLM-supported workflow.

jaredklewis · 2026-03-16T15:50:11 1773676211

It's not the worst metric ever, but is the study in question observational or controlling the problem?

My issue with the observational studies is that basically everything is uncontrolled. Maybe the IDEs are causing less defects. Or maybe some problems are just harder and more defect prone than others. Maybe some teams are better managed or get clearer specifications and so on. Maybe some organizations are better at recording defects and can't be fairly compared with organizations that just report less. The studies don't ever reach the scale where you become confident these things wash out.

If they control the problem, most of those issues are eliminated (though not all, for example the experience and education of the participants still needs to be controlled), but now you are left wondering how well the findings transfer from the toy projects in the experiment into real life.

But finally, it's still not a perfect metric because not all defects are equal, right? What if some tool/process helped you reduce a large number of mostly cosmetic defects, but increases the occurrence of catastrophic defects?

re: LoC, there's some signal here, but it's such a noisy channel, I've never read a study that I thought put it to good use. Happy to have my mind changed if you have a link to one.

codeflo · 2026-03-16T07:56:33 1773647793

> Also, lines of code is not completely meaningless metric.

Comparing lines of code can be meaningful, mostly if you can keep a lot of other things constant, like coding style, developer experience, domain, tech stack. There are many style differences between LLM and human generated code, so that I expect 1000 lines of LLM code do a lot less than 1000 lines of human code, even in the exact same codebase.

jacquesm · 2026-03-16T07:27:17 1773646037

The proper metric is the defect escape rate.

exidex · 2026-03-16T07:35:03 1773646503

Now you have to count defects

jacquesm · 2026-03-16T07:36:40 1773646600

You have to do that anyway, and in fact you probably were already doing that. If you do not track this then you are leaving a lot on the table.

exidex · 2026-03-16T10:49:10 1773658150

I was more thinking in terms of creating a benchmark which would optimized during training. For regular projects, I agree, you have to count that anyway

slopinthebag · 2026-03-16T07:49:41 1773647381

Most developer intuitions are wrong.

See: OOP

vbezhenar · 2026-03-16T08:50:07 1773651007

Intuition is subjective. It's hard to convert subjective experience to objective facts.

tomgp · 2026-03-16T09:23:35 1773653015

That's what science is though * our intuition/ hunch/ guess is X * now let's design an experiment which can falsify X

lbreakjai · 2026-03-16T08:58:21 1773651501

The different models is a big one. In my workflow, I've got opus doing the deep thinking, and kimi doing the implementation. It helps manage costs.

Sample size of one, but I found it helps guard against the model drifting off. My different agents have different permissions. The worker can not edit the plan. The QA or planner can't modify the code. This is something I sometimes catch codex doing, modifying unrelated stuff while working.

sigbottle · 2026-03-16T09:28:09 1773653289

I recently had a horrible misalignment issue with a 1 agent loop. I've never done RL research, but this kind of shit was the exact kind of thing I heard about in RL papers - shimming out what should be network tests by echoing "completed" with the 'verification' being grepping for "completed", and then actually going and marking that off as "done" in the plan doc...

Admittedly I was using gsdv2; I've never had this issue with codex and claude. Sure, some RL hacking such as silent defaults or overly defensive code for no reason. Nothing that seemed basically actively malicious such as the above though. Still, gsdv2 is a 1-agent scaffolding pipeline.

I think the issue is that these 1-agent pipelines are "YOU MUST PLAN IMPLEMENT VERIFY EVERYTHING YOURSELF!" and extremely aggressive language like that. I think that kind of language coerces the agent to do actively malicious hacks, especially if the pipeline itself doesn't see "I am blocked, shifting tasks" as a valid outcome.

1-agent pipelines are like a horrible horrible DFS. I still somewhat function when I'm in DFS mode, but that's because I have longer memory than a goldfish.

jumploops · 2026-03-16T07:16:40 1773645400

After "fully vibecoding" (i.e. I don't read the code) a few projects, the important aspect of this isn't so much the different agents, but the development process.

Ironically, it resembles waterfall much more so than agile, in that you spec everything (tech stack, packages, open questions, etc.) up front and then pass that spec to an implementation stage. From here you either iterate, or create a PR.

Even with agile, it's similar, in that you have some high-level customer need, pass that to the dev team, and then pass their output to QA.

What's the evidence? Admittedly anecdotal, as I'm not sure of any benchmarks that test this thoroughly, but in my experience this flow helps avoid the pitfall of slop that occurs when you let the agent run wild until it's "done."

"Done" is often subjective, and you can absolutely reach a done state just with vanilla codex/claude code.

Note: I don't use a hierarchy of agents, but my process follows a similar design/plan -> implement -> debug iteration flow.

totomz · 2026-03-16T07:17:31 1773645451

I think the splitting make sense to give more specific prompts and isolated context to different agents. The "architect" does not need to have the code style guide in its context, that actually could be misleading and contains information that drives it away from the architecture

ako · 2026-03-16T07:59:42 1773647982

Wouldn’t skills already solve this? A harness can start a new agent with a specific skill if it thinks that makes sense.

est · 2026-03-16T07:19:34 1773645574

> the architect → developer → reviewer pipeline actually produces better results than just... talking to one strong model in one session?

There's a 63 pages paper with mathematical proof if you really into this.

https://arxiv.org/html/2601.03220v1

My takeaway: AI learns from real-world texts, and real-world corpus are used to have a role split of architect/developer/reviewer

codeflo · 2026-03-16T07:46:43 1773647203

>> the architect → developer → reviewer pipeline actually produces better results than just... talking to one strong model in one session?

> There's a 63 page paper with mathematical proof if you really into this.

> https://arxiv.org/html/2601.03220v1

I'm confused. The linked paper is not primarily a mathematics paper, and to the extent that it is, proves nothing remotely like the question that was asked.

est · 2026-03-16T09:30:25 1773653425

> proves nothing remotely like the question that was asked

I am not an expert, but by my understanding, the paper prooves that a computationally bounded "observer" may fail to extract all the structure present in the model in one computation. aka you can't always one-shot perfect code.

However, arrange many pipelines of roles "observers" may gradually get you there

cbg0 · 2026-03-16T07:59:31 1773647971

Perhaps this paper might be more relevant with regards to multi-agent pipelines https://arxiv.org/html/2404.04834v4

anhner · 2026-03-16T07:50:21 1773647421

Can you explain how this paper is relevant to the comment you replied to?

troupo · 2026-03-16T06:42:24 1773643344

> produces better results than just... talking to one strong model in one session?

I think the author admits that it doesn't, doesn't realise it and just goes on:

--- start quote ---

On projects where I have no understanding of the underlying technology (e.g. mobile apps), the code still quickly becomes a mess of bad choices. However, on projects where I know the technologies used well (e.g. backend apps, though not necessarily in Python), this hasn’t happened yet

--- end quote ---

BloondAndDoom · 2026-03-16T22:00:13 1773698413

I think many people don’t understand that you can just say check again to the model you use and it will just find bugs, repeat it until there are no bugs. Sounds stupid but it works. It’s intuitive to assume another will help more and better review but yeah I don’t know.

Also in my experience codex always writes better code and more sensible than Claude (but slower)

zingar · 2026-03-16T10:45:46 1773657946

Nitpick: I don’t think architect is a good name for this role. It’s more of a technical project kickoff function: these are the things we anticipate we need to do, these are the risks etc.

I do find it different from the thinking that one does when writing code so I’m not surprised to find it useful to separate the step into different context, with different tools.

Is it useful to tell something “you are an architect?” I doubt it but I don’t have proof apart from getting reasonable results without it.

With human teams I expect every developer to learn how to do this, for their own good and to prevent bottlenecks on one person. I usually find this to be a signal of good outcomes and so I question the wisdom of biasing the LLM towards training data that originates in spaces where “architect” is a job title.

stavros · 2026-03-16T10:52:30 1773658350

It's not about splitting for quality, it's about cost optimisation (Sonnet implements, which is cheaper). The quality comes with the reviewers.

Notice that I didn't split out any roles that use the same model, as I don't think it makes sense to use new roles just to use roles.

indigodaddy · 2026-03-16T14:02:37 1773669757

Is Sonnet good enough for, hey I have this bug (it's easy to describe and identify), now go fix it?

rayrayf · 2026-03-22T20:09:12 1774210152

And also the sales>product>engineering loop. If one can - with all the context - accomplish what the traditional loop is meant to (explore validity, feasibility, and strategic or revenue impact) how do all our jobs change. Know this isn’t new but the organizational dynamics are fun to dive in to.

avnsv · 2026-03-17T07:51:19 1773733879

Using different models for the architect and developer roles isn't necessarily better because each model's solution lives in a different vector space. So when the architect (model A) produces a solution, the developer (model B) will implement the solution based on a different "mental map". So you may end up with a gap.

I normally use the same model (e.g. Codex 5.3) for planning, implementation, and testing, and then have another model (e.g. Opus 4.6) review the result to identify any issues and edge cases the developer didn't foresee and the tester didn't spot. Then I take the output and pass it back to the developer model to have it fix the issues.

dep_b · 2026-03-16T10:23:13 1773656593

“…if you give it good context…” that’s what the architect session is for basically. You throw around ideas and store the direction you want to go.

Then you execute it with a clean context.

Clean context is needed for maximum performance while not remembering implementation dead ends you already discarded

Havoc · 2026-03-16T08:38:21 1773650301

Yeah always seemed pretty sus to me to.

At the same time I can see a more linear approach doing similar. Like when I ask for an implementation plan that is functional not all that different from an architect agent even if not wrapped in such a persona

_heimdall · 2026-03-16T14:39:33 1773671973

Once model providers started releasing "reasoning" models, and later roles and multi-agent systems, it seemed pretty clear to me they are just automating the process of prompt engineering.

They track everything we all do in a chat, then learn the patterns that work and build them in. Rinse and repeat.

Tarq0n · 2026-03-16T08:12:49 1773648769

In machine learning, ensembles of weaker models can outperform a single strong model because they have different distributions of errors. Machine learning models tend to have more pronounced bias in their design than LLMs though.

So to me it makes sense to have models with different architecture/data/post training refine each other's answers. I have no idea whether adding the personas would be expected to make a difference though.

hakanderyal · 2026-03-16T07:52:56 1773647576

One added benefit is it allows you to throw more tokens to the problem. It’s the most impactful benefit even.

Context & how LLMs work requires this.

From my experience no frontier model produces bug free & error free code with the first pass, no matter how much planning you do beforehand.

With 3 tiers, you spend your token & context budget in full in 3 phases. Plan, implement, review.

If the feature is complex, multiple round of reviews, from scratch.

It works.

palmotea · 2026-03-16T07:14:41 1773645281

> Genuine question: what's the evidence that the architect → developer → reviewer pipeline actually produces better results than just... talking to one strong model in one session?

Using multiple agents in different roles seems like it'd guard against one model/agent going off the rails with a hallucination or something.

jwilliams · 2026-03-16T10:00:40 1773655240

If you know what you need, my experience is that a well-formed single-prompt that fits the context gives the best results (and fastest).

If you’re exploring an idea or iterating, the roles can help break it down and understand your own requirements. Personally I do that “away” from the code though.

awesome_dude · 2026-03-16T07:51:13 1773647473

I have been using different models for the same role - asking (say) Gemini, then, if I don't like the answer asking Claude, then telling each LLM what the other one said to see where it all ends up

Well I was until the session limit for a week kicked in.

fleetfox · 2026-03-16T09:04:12 1773651852

Even for reducing the context size it's probably worth it. If you have to go back back and forth on both problem and implementation even with these new "large" contexts if find quality degrading pretty fast.

luxcem · 2026-03-16T09:13:29 1773652409

The agent "personalities" and LLM workflow really looks like cargo-cult behavior. It looks like it should be better but we don't really have data backing this.

xnx · 2026-03-16T21:30:05 1773696605

We are at the horseless carriage stage where people are recreating the old system without considering if it is necessary.

hansonkd · 2026-03-16T14:39:34 1773671974

Any of these abstractions are just temporary.

imiric · 2026-03-16T06:55:40 1773644140

Evidence? My friend, most of the practices in this field are promoted and adopted based on hand-waving, feelings, and anecdata from influencers.

Maybe you should write and share your own article to counter this one.

z3t4 · 2026-03-16T07:10:04 1773645004

Also if something is fun, we prefer to to it that way instead of the boring way. Then it depends on how many mines you step on, after a while you try to avoid the mines. That's when your productivity goes down radically. If we see something shiny we'll happily run over the minefield again though.

christofosho · 2026-03-16T02:35:28 1773628528

I like reading these types of breakdowns. Really gives you ideas and insight into how others are approaching development with agents. I'm surprised the author hasn't broken down the developer agent persona into smaller subagents. There is a lot of context used when your agent needs to write in a larger breadth of code areas (i.e. database queries, tests, business logic, infrastructure, the general code skeleton). I've also read[1] that having a researcher and then a planner helps with context management in the pre-dev stage as well. I like his use of multiple reviewers, and am similarly surprised that they aren't refined into specialized roles.

I'll admit to being a "one prompt to rule them all" developer, and will not let a chat go longer than the first input I give. If mistakes are made, I fix the system prompt or the input prompt and try again. And I make sure the work is broken down as much as possible. That means taking the time to do some discovery before I hit send.

Is anyone else using many smaller specific agents? What types of patterns are you employing? TIA

1. https://github.com/humanlayer/advanced-context-engineering-f...

marcus_holmes · 2026-03-16T03:14:34 1773630874

that reference you give is pretty dated now, based on a talk from August which is the Beforetimes of the newer models that have given such a step change in productivity.

The key change I've found is really around orchestration - as TFA says, you don't run the prompt yourself. The orchestrator runs the whole thing. It gets you to talk to the architect/planner, then the output of that plan is sent to another agent, automatically. In his case he's using an architect, a developer, and some reviewers. I've been using a Superpowers-based [0] orchestration system, which runs a brainstorm, then a design plan, then an implementation plan, then some devs, then some reviewers, and loops back to the implementation plan to check progress and correctness.

It's actually fun. I've been coding for 40+ years now, and I'm enjoying this :)

[0] https://github.com/obra/superpowers

indigodaddy · 2026-03-16T03:53:04 1773633184

Can you bolt superpowers onto an existing project so that it uses the approach going forward (I'm using Opencode), or would that get too messy?

eclipxe · 2026-03-16T06:32:15 1773642735

Yes. But gsd is even better - especially gsd2

marcus_holmes · 2026-03-17T00:11:30 1773706290

looks interesting, thanks for the tip :)

marcus_holmes · 2026-03-17T00:10:33 1773706233

These are just skills, so you can add the skills to your setup and start using them whenever you like.

indigodaddy · 2026-03-17T14:46:03 1773758763

Thanks

stavros · 2026-03-16T11:20:08 1773660008

I don't think that splitting into subagents that use the same model will really help. I need to clarify this in the post, but the split is 1) so I can use Sonnet to code and save on some tokens and 2) so I can get other models to review, to get a different perspective.

It seems to me that splitting into subagents that use the same model is kind of like asking a person to wear three different hats and do three different parts of the job instead of just asking them to do it all with one hat. You're likely to get similar results.

chriswarbo · 2026-03-16T11:35:52 1773660952

I'm considering using subagents, as a way to manage context and delegate "simple" tasks to cheaper models (if you want to see tokens burn, watch Opus try fixing a misplaced ')' in a Lisp file!).

I see what you mean w.r.t. different hats; but is it useful to have different tools available? For example, a "planner" having Web access and read-only file access, versus a "developer" having write access to files but no Web access?

stavros · 2026-03-16T11:50:12 1773661812

Yes, if you want to separate capabilities, definitely.

lbreakjai · 2026-03-16T08:49:21 1773650961

It's interesting to see some patterns starting to emerge. Over time, I ended up with a similar workflow. Instead of using plan files within the repository, I'm using notion as the memory and source of truth.

My "thinker" agent will ask questions, explore, and refine. It will write a feature page in notion, and split the implementation into tasks in a kanban board, for an "executor" to pick up, implement, and pass to a QA agent, which will either flag it or move it to human review.

I really love it. All of our other documentation lives in notion, so I can easily reference and link business requirements. I also find it much easier to make sense of the steps by checking the tickets on the board rather than in a file.

Reviewing is simpler too. I can pick the ticket in the human review column, read the requirements again, check the QA comments, and then look at the code. Had a lot of fun playing with it yesterday, and I shared it here:

https://github.com/marcosloic/notion-agent-hive

Cthulhu_ · 2026-03-16T09:01:04 1773651664

No criticism or anything, but it really does feel / sound like you (and others who embraced LLMs and agentic coding) aspire to be more of a product manager than a coder. Thing is, a "real" PM comes with a lot more requirements and there's less demand for them - more requirements in that you need to be a people person and willing to spend at least half your time in meetings, and less demand because one PM will organize the work for half a dozen developers (minimum).

Some people say LLM assisted coding will cost a lot of developers' jobs, but posts like this imply it'll cost (solve?) a lot of management / overhead too.

Mind you I've always thought project managers are kinda wasteful, as a software developer I'd love for Someone Else to just curate a list of tasks and their requirements / acceptance criteria. But unfortunately that's not the reality and it's often up to the developers themselves to create the tasks and fill them in, then execute them. Which of course begs the question, why do we still have a PM?

(the above is anecdotal and not a universal experience I'm sure. I hope.)

lbreakjai · 2026-03-16T09:58:28 1773655108

I worked with some excellent PMs in the past, it's an entirely different skillset. This wasn't really meant to replace what they do. I really wanted something with which to work at feature-level. That is, after all the hard work of figuring out _what_ to build has been done.

> as a software developer I'd love for Someone Else to just curate a list of tasks and their requirements / acceptance criteria

That's interesting. In every team I worked in, I always fought really hard against anyone but developers being able to write tickets on the board.

fooster · 2026-03-16T12:14:52 1773663292

“one PM will organize the work for half a dozen developer”

That isn’t the job of a PM.

adampunk · 2026-03-16T18:07:53 1773684473

This seems more about how you view PMs than anything else.

highfrequency · 2026-03-16T15:39:19 1773675559

> I’ll tell the LLM my main goal (which will be a very specific feature or bugfix e.g. “I want to add retries with exponential backoff to Stavrobot so that it can retry if the LLM provider is down”), and talk to it until I’m sure it understands what I want. This step takes the most time, sometimes even up to half an hour of back-and-forth until we finalize all the goals, limitations, and tradeoffs of the approach, and agree on what the end architecture should look like.

This sounds sensible, but also makes me wonder how much time is actually being saved if implementing a "very specific feature or bugfix" still takes an hour of back and forth with an LLM.

Can't help but think that this is still just an awkward intermediate phase of development with adolescent LLMs where we need to think about implementation choices at all.

stavros · 2026-03-16T16:48:36 1773679716

Small features or bugfixes generally take a minute or two of conversation.

miguelgrinberg · 2026-03-16T10:00:26 1773655226

> One thing I’ve noticed is that different people get wildly different results with LLMs, so I suspect there’s some element of how you’re talking to them that affects the results.

It's always easier to blame the prompt and convince yourself that you have some sort of talent in how you talk to LLMs that other's don't.

In my experience the differences are mostly in how the code produced by the LLM is reviewed. Developers who have experience reviewing code are more likely to find problems immediately and complain they aren't getting great results without a lot of hand holding. And those who rarely or never reviewed code from other developers are invariably going to miss stuff and rate the output they get higher.

zackify · 2026-03-16T12:21:33 1773663693

This definitely is the case. I was talking to someone complaining about how llms don't work good.

They said it couldn't fix an issue it made.

I asked if they gave it any way to validate what it did.

They did not, some people really are saying "fix this" instead of saying "x fn is doing y when someone makes a request to it. Please attempt to fix x and validate it by accessing the endpoint after and writing tests"

Its shocking some people don't give it any real instruction or way to check itself.

In addition I get great results doing voice to text with very specific workflows. Asking it to add a new feature where I describe what functions I want changed then review as I go vs wait for the end.

mbesto · 2026-03-16T13:29:21 1773667761

> Its shocking some people don't give it any real instruction or way to check itself.

It's not shocking. The tech world is telling them that "Claude will write all of their app easily" with zero instructions/guidelines so of course they're going to send prompts like that.

tracker1 · 2026-03-16T16:05:26 1773677126

I think the implications of limited to no instructions are a little to way off depending on what you're doing... CRUD APIs, sure... especially if you have a well defined DB schema and API surface/approach. Anything that might get complex, less so.

Two areas I've really appreciated LLMs so far... one is being able to make web components that do one thing well in encapsulation.. I can bring it into my project and just use it... AI can scaffold a test/demo app that exercises the component with ease and testing becomes pretty straight forward.

The other for me has been in bridging rust to wasm and even FFI interfaces so I can use underlying systems from Deno/Bun/Node with relative ease... it's been pretty nice all around to say the least.

That said, this all takes work... lots of design work up front for how things should function... weather it's a ui component or an API backend library. From there, you have to add in testing, and some iteration to discover and ensure there aren't behavioral bugs in place. Actually reviewing code and especially the written test logic. LLMs tend to over-test in ways that are excessive or redundant a lot of the time. Especially when a longer test function effectively also tests underlying functionalities that each had their own tests... cut them out.

There's nothing "free" and it's not all that "easy" either, assuming you actually care about the final product. It's definitely work, but it's more about the outcome and creation than the grunt work. As a developer, you'll be expected to think a lot more, plan and oversee what's getting done as opposed to being able to just bang out your own simple boilerplate for weeks at a time.

andrekandre · 2026-03-21T03:15:37 1774062937

  > this all takes work... lots of design work up front for how things should function... weather it's a ui component or an API backend library. From there, you have to add in testing, and some iteration to discover and ensure there aren't behavioral bugs in place.

thats the reality, but the marketing to the top-level people (and mass media) is like the other poster stated, and that filters to devs as well causing this big gap in expectations going in

mikkupikku · 2026-03-16T13:36:57 1773668217

It's surprising they don't learn better after their first hour or two of use. Or maybe they do know better but don't like the thing so they deliberately give it rope to hang itself with, then blame overzealous marketting.

petcat · 2026-03-16T12:27:30 1773664050

If you tell a human junior developer just "fix this" then they will spend a week on a wild-goose chase with nothing to show for it.

At least the LLM will only take 5 minutes to tell you they don't know what to do.

ruszki · 2026-03-16T12:47:59 1773665279

Do they? I’ve never got a response that something was impossible, or stupid. LLMs are happy to verify that a noop does nothing, if they don’t know how to fix something. They rather make something useless than really tackle a problem, if they can make tests green that way, or they can claim that something “works”.

And’ve I never asked Claude Code something which is really impossible, or even really difficult.

mikkupikku · 2026-03-16T13:43:58 1773668638

Claude code will happily tell me my ideas are stupid, but I think that's because I nest my ideas in between other alternative ideas and ask for an evaluation of all of them. This effectively combats the sycophantic tendencies.

Still, sometimes claude will tell me off even when I don't give it alternatives. Last night I told it to use luasocket from an mpv userscript to connect to a zeromq Unix socket (and also implement zmq in pure lua) connected to an ffmpeg zmq filter to change filter parameters on the fly. Claude code all but called me stupid and told me to just reload the filter graph through normal mpv means when I make a change. Which was a good call, but I told it to do the thing anyway and it ended up working well, so what does it really know... Anyway, I like that it pushes back, but agrees to commit when I insist.

seunosewa · 2026-03-16T15:57:58 1773676678

After such hard-won wins, ask the AI to save what it learned during the session to a MD file.

IanCal · 2026-03-16T13:23:52 1773667432

I've definitely had pushback on what to do or approaches, yes. I've had this more recently because I've been pushing more on a side of "I want to know if this would end up being fast enough / allow something that it'd be worth doing". I've had to argue harder for something recently, and I'm genuinely not sure if it is possible or not. While it's not flat out refused to do it, it's explained to me why it won't work, and taken some pushing to implement parts of it. My gut feeling is that the blockers it is describing are real but we can sidestep them by taking a wilder swing at the change, but I'm not sure I'm right.

speakingmoistly · 2026-03-16T13:01:00 1773666060

To be fair, that happening feels more like poor management and mentorship than "juniors are scatterbrained".

Over time, you build up the right reflexes that avoid a one-week goose chase with them. Heck, since we're working with people, you don't just say " fix this", you earmark time to make sure everyone is aligned on what needs done and what the plan is.

dkersten · 2026-03-16T15:59:34 1773676774

> At least the LLM will only take 5 minutes to tell you they don't know what to do.

In my experience, the LLM will happily try the wrong thing over and over for hours. It rarely will say it doesn’t know.

Ancapistani · 2026-03-16T16:25:30 1773678330

Don’t ask it to make changes off the bat, then - ask it to make a plan. Then inspect the plan, change it if necessary, and go from there.

dkersten · 2026-03-17T16:24:31 1773764671

I do. I tend to follow a strict Research, Plan, Implement workflow. It does greatly help, but it doesn’t eliminate all problems.

icedchai · 2026-03-16T22:28:33 1773700113

An LLM might take 5 minutes, or 20 minutes, and still do the wrong thing. Rarely have I seen an LLM not "know what to do." A coworker told it to fix some unit tests, it churned away for a while, then changed a bunch of assert status == 200 to 500. Good news, tests pass now!

sobjornstad · 2026-03-16T12:49:17 1773665357

There are subtler versions of this too. I've been working on a TUI app for a couple of weeks, and having great success getting it to interactively test by sending tmux commands, but every once in a while it would just deliver code that didn't work. I finally realized it was because the capture tools I gave it didn't capture the cursor location, so it would, understandably, get confused about where it was and what was selected.

I promptly went and fixed this before doing any more work, because I know if I was put in that situation I would refuse to do any more work until I could actually use the app properly. In general, if you wouldn't be able to solve a problem with the tools you give an LLM, it will probably do a bad job too.

raw_anon_1111 · 2026-03-16T15:29:01 1773674941

I made that mistake when I first started using Claude Code/Codex. Now I give it access to my isolated DEV AWS account with appropriately scoped permissions on the IAM level with temporary credentials and tell it how to validate the code and have in my markdown file to always use $x to test any changes to $y.

It’s gotten a lot better.

tracker1 · 2026-03-16T15:57:59 1773676679

Yeah, the more time I spend in planning and working through design/api documentation for how I want something to work, the better it does... Similar for testing against your specifications, not the code... once you have a defined API surface and functional/unit tests for what you're trying to do, it's all the harder for AI to actually mess things up. Even more interesting is IMO how well the agents work with Rust vs other languages the more well defined your specifications are.

Ancapistani · 2026-03-16T16:11:23 1773677483

> some people really are saying "fix this" instead of saying "x fn is doing y when someone makes a request to it. Please attempt to fix x and validate it by accessing the endpoint after and writing tests"

This works about 85% of the time IME, in Claude Code. My normal workflow on most bugs is to just say “fix this” and paste the logs. The key is that I do it in plan mode, then thoroughly inspect and refine the plan before allowing it to proceed.

rirze · 2026-03-16T15:51:47 1773676307

Untested Hypothesis: LLM instruction is usually an intelligence+communication-based skill. I find in my non-authoritative experience that users who give short form instructions are generally ill prepared for technical motivation (whether they're motivating LLMs or humans).

jzig · 2026-03-16T13:31:36 1773667896

lol that is still “how you’re talking to them that affects the results” just more specific

sshine · 2026-03-16T16:25:23 1773678323

Feeding the LLM a "copy as cURL" for its feedback loop instead of letting it manage the dev server was an unlock for me.

raw_anon_1111 · 2026-03-16T11:43:38 1773661418

I have 30 years of experience delivering code and 10 years of leading architecture. My argument is the only thing that matters is does the entire implementation - code + architecture (your database, networking, your runtime that determines scaling, etc) meet the functional and none functional requirements. Functional = does it meet the business requirements and UX and non functional = scalability, security, performance, concurrency, etc.

I only carefully review the parts of the implementation that I know “work on my machine but will break once I put in a real world scenario”. Even before AI I wasn’t one of the people who got into geek wars worrying about which GOF pattern you should have used.

All except for concurrency where it’s hard to have automated tests, I care more about the unit or honestly integration tests and testing for scalability than the code. Your login isn’t slow because you chose to use a for loop instead of a while loop. I will have my agents run the appropriate tests after code changes

I didn’t look at a line of code for my vibe coded admin UI authenticated with AWS cognito that at most will be used by less than a dozen people and whoever maintains it will probably also use a coding agent. I did review the functionality and UX.

Code before AI was always the grind between my architectural vision and implementation

awakeasleep · 2026-03-16T12:11:47 1773663107

Explain how fragility of implementation, like spaghetti code, high coupling low cohesion fit into your world view?

petcat · 2026-03-16T12:20:36 1773663636

As human developers, I think we're struggling with "letting go" of the code. The code we write (or agents write) is really just an intermediate representation (IR) of the solution.

For instance, GCC will inline functions, unroll loops, and myriad other optimizations that we don't care about (and actually want!). But when we review the ASM that GCC generates we are not concerned with the "spaghetti" and the "high coupling" and "low cohesion". We care that it works, and is correct for what it is supposed to do.

Source code in a higher-level language is not really different anymore. Agents write the code, maybe we guide them on patterns and correct them when they are obviously wrong, but the code is just the work-item artifact that comes out of extensive specification, discussion, proposal review, and more review of the reviews.

A well-guided, iterative process and problem/solution description should be able to generate an equivalent implementation whether a human is writing the code or an agent.

sarchertech · 2026-03-16T12:31:26 1773664286

A compiler uses rigorous modeling and testing to ensure that generated code is semantically equivalent. It can do this because it is translating from one formal language to another.

Translating a natural prompt on the other hand requires the LLM to make thousands of small decisions that will be different each time you regenerate the artifact. Even ignoring non-determinism, prompt instability means that any small change to the spec will result in a vastly different program.

A natural language spec and test suite cannot be complete enough to encode all of these differences without being at least as complex as the code.

Therefore each time you regenerate large sections of code without review, you will see scores of observable behavior differences that will surface to the user as churn, jank, and broken workflows.

Your tests will not encode every user workflow, not even close. Ask yourself if you have ever worked on a non trivial piece of software where you could randomly regenerate 10% of the implementation while keeping to the spec without seeing a flurry of bug reports.

This may change if LLMs improve such that they are able to reason about code changes to the degree a human can. As of today they cannot do this and require tests and human code review to prevent them from spinning out. But I suspect at that point they’ll be doing our job, as well as the CEOs and we’ll have bigger problems.

LogicFailsMe · 2026-03-16T13:22:59 1773667379

I don't see a world where a motivated soul can build a business from a laptop and a token service as a problem. I see it as opportunity.

I feel similarly about Hollywood and the creation of media. We're not there in either case yet, but we will be. That's pretty clear. and when I look at the feudal society that is the entertainment industry here, I don't understand why so many of the serfs are trying to perpetuate it in its current state. And I really don't get why engineers think this technology is going to turn them into serfs unless they let that happen to them themselves. If you can build things, AI coding agents will let you build faster and more for the same amount of effort.

I am assuming given the rate of advance of AI coding systems in the past year that there is plenty of improvement to come before this plateaus. I'm sure that will include AI generated systems to do security reviews that will be at human or better level. I've already seen Claude find 20 plus-year-old bugs in my own code. They weren't particularly mission critical but they were there the whole time. I've also seen it do amazingly sophisticated reverse engineering of assembly code only to fall over flat on its face for the simplest tasks.

sarchertech · 2026-03-16T13:26:39 1773667599

That depends on how fast that change happens. If 45% of jobs evaporate in a a 5 year period, a complete societal collapse is the likely outcome.

LogicFailsMe · 2026-03-16T14:08:16 1773670096

Sounds like influencer nonsense to me. Touch grass. If the people are fed and housed, there's no collapse. And if the billionaire class lets them starve, they will finally go through some things just like the aristocracy in France once did. And I think even Peter Thiel is smarter than that. You can feed yourself for <$1000 a year on beans and rice. Not saying you'd enjoy it, but you won't starve. So for ~$40B annually, the billionaires buy themselves revolution insurance. Fantastic value.

OTOH if what you're really talking about is the long-term collapse in our ludicrous carbon footprint when we finally run out of fossil fuels and we didn't invest in renewables or nuclear to replace them, well, I'm with you there.

sarchertech · 2026-03-16T14:50:17 1773672617

>Sounds like influencer nonsense to me. Touch grass.

I don't even know what this means.

The worst unemployment during the Weimar Republic was 25-30%. Unemployment in the Great Depression peaked at 25%.

So yeah if we get to 45% unemployment and those are the highest paying jobs on average then yeah it's gonna be bad. Then you add in second order effects where none of those people have the money to pay the other 55% who are still employed.

We might get to a UBI relatively quickly and peacefully. But I'm not betting on it.

>finally go through some things just like the aristocracy in France once did.

Yeah that's probably the most likely scenario, but that quickly devolved into a death and imprisonment for far more than the aristocrats and eventually ended with Napoleon trying to take over Europe and millions of deaths overall.

The world didn't literally end, but it was 40 years of war, famine, disease, and death, and not a lot of time to think about starting businesses with your laptop.

LogicFailsMe · 2026-03-16T15:23:15 1773674595

And the dark ages lasted a millennium. Sounds like quite an improvement on that. And if America didn't want a society hellbent on living the worst possible timeline, why did it re-elect President Voldemaga and give him the football? And then, even when he breaks nearly every political promise, his support remains better than his predecessor? Anyway, I think the richest ~1135 Americans won't let you starve, but they'll be happy to watch you die young of things that had stopped killing people for quite some time whilst they skim all the cream. And that seems to be what the plurality wants or they'd vote differently.

The good news is that America is ~5% of the world. And the more we keep punching ourselves in the face, the better the chance someone else pulls ahead. But still, we have nukes, so we're still the town bully for the immediate future.

sarchertech · 2026-03-16T16:04:57 1773677097

What are you even arguing about? I have absolutely no idea where you are going with this.

LogicFailsMe · 2026-03-16T16:17:59 1773677879

Yeah I figured that. You think society is going to collapse because of AI. I don't. But I do think that stupid narrative is prevalent in the media right now and the C-suite happily proclaiming they're going to lay people off and replace them with AI got the ball rolling in the first place. Now it has momentum of its own with lunatics like Eliezer Yudkowsky once again getting taken seriously.

Fortunately, the other 95% of humanity is far less doomer about their prospects. So if America wants to be the new neanderthals, they'll be happy to be the new cro magnons.

sarchertech · 2026-03-16T16:37:17 1773679037

I don't think society is going to collapse because of AI because I don't think the current architectures have any chance of becoming AGI. I think that if AGI is even something we're capable of it's very far off.

I think that if CEOs can replace us soon, it's because AGI got here much sooner than I predicted. And if that happens we have 2 options Mad Max and Star Trek and Mad Max is the more likely of the 2.

LogicFailsMe · 2026-03-16T17:01:47 1773680507

What's with all the catastrophic thinking then? Mad Max? Collapse of Society because 45% unemployment? I really hate people on principle but I have more faith in them looking out for their own self interest than you do apparently. Mad Max specifically requires a ridiculous amount of intact infrastructure for all the gasoline (you know gasoline goes bad in 3-6 months? Yeah didn't think so), manufacturing for all the parts for all those crazy custom build road warrior wagons, and ranches of livestock for all the leather for all the cool outfits (and with all that cow, no one needs to starve but oh the infrastructure needed to keep the cows fed).

If doom porn is your thing, try watching Threads or The Day After, especially Threads. That said, I don't think Star Trek is possible, maybe The Expanse but more likely we run out of cheap energy before we get off world.

As for the AGI, it all depends on your definition. We're already at Amazon IC1/IC2 coding performance with these agents (I speak from experience previously managing them). If we get to IC3, one person will be able to build a $1B company and run it or sell it. If you're a purist like me and insist we stick to douchebag racist Nick Bostrom's superintelligence definition of AGI, then we agree. But I expect 24/7 IC3 level engineering as a service for $200/month to be more than enough and I think that's a year or two away. And you can either prepare for that or scream how the sky is falling, your choice.

sarchertech · 2026-03-16T19:41:13 1773690073

>Mad Max specifically requires a ridiculous amount of intact infrastructure for all the gasoline (you know gasoline goes bad in 3-6 months? Yeah didn't think so)

Is this a joke or do you have a learning disability?

>But I expect 24/7 IC3 level engineering as a service for $200/month to be more than enough and I think that's a year or two away. And you can either prepare for that or scream how the sky is falling, your choice.

Or I could do neither and write you off as a gasbag who doesn't know what he's talking about like all the other ex-amazon management I've had the pleasure to work with over the years.

LogicFailsMe · 2026-03-16T20:19:23 1773692363

I guess you have a really short context buffer with all this frequently forgetting things you've said yourself.

But that aside, how's all that self-righteousness working out for you?

sarchertech · 2026-03-17T00:38:40 1773707920

I bet you have ex-Amazon prominently in your LinkedIn profile.

LogicFailsMe · 2026-03-17T01:51:07 1773712267

Don't have a LinkedIn profile, don't need one. But I'm guessing you're listed under LinkedIn Lunatics.

sarchertech · 2026-03-17T11:46:18 1773747978

I read back through a few of your posts and you’re either schizophrenic, or a very elaborate troll.

I know a few older people who started posting like this when they hit their 50s. I’ve only got a few years left. Hopefully I can avoid it, but maybe it’s inevitable.

LogicFailsMe · 2026-03-17T15:36:21 1773761781

Ageism: now that's a warrior's flex, amIRight?

People like myself in their 50s to 60s who had the experience of banging the metal on imperfect buggy hardware late into the night to mine gems before Python made the entire software engineering community pivot to a core competency of syntax pedanting plus stringing library calls together are having a real party with AI agents effectively doing the same thing they did 30 years ago. I personally never stopped coding even through my one awful experience as an engineering manager.

But you do you, and hear me now, dismiss me later. There won't be 45% unemployment because the minute AI starts replacing current engineering skills for real is the minute the people it targets wake up and start learning how to work with AI coding agents that will be dramatically better than today. People resist change until there are no other options, just look at fossil fuels. The free market will work that one out too eventually.

And no amount of some nontechnical guy vibe scienceing his way to a working mRNA vaccine for his cancer-ridden dog or an engineer unlocking mods to Disney Infinity just from the binary and Claude Code or an entire web browser ported to rust will ever convince you these things are not the enemy. And that's going to put you through some things down the road. So of course, since this will never happen I'm an elaborate troll or a nutcase just like the people who pulled all those things off, never mind all the evidence mounting that these things can be amazing in the right hands. That's CRAZYTALK! Stochastic Parrot! Glorified Autocomplete! Mad Max! Mad Max! DLSS 5!

andrekandre · 2026-03-21T03:36:01 1774064161

  > There won't be 45% unemployment because the minute AI starts replacing current engineering skills for real is the minute the people it targets wake up and start learning how to work with AI coding agents that will be dramatically better than today.

i have a slightly different take: the theory of bullshit jobs* says most likely we will have an increase in the amount of work expected which will just increase the amount of management and specialists (think scrum master and friends for ai) and busywork needed for all this new technology, so most definitely "jobs" will not be going away imo

* https://en.wikipedia.org/wiki/Bullshit_Jobs

jplusequalt · 2026-03-16T14:37:23 1773671843

>You can feed yourself for <$1000 a year on beans and rice. Not saying you'd enjoy it, but you won't starve. So for ~$40B annually, the billionaires buy themselves revolution insurance. Fantastic value.

You are the epitome of the tech bro.

LogicFailsMe · 2026-03-16T15:14:35 1773674075

Sure, sure. Understanding how these sociopaths think clearly makes me a tech bro rather than someone who incorporates worst-case scenarios into my planning. Suggesting they would maintain minimum viable society to save their own asses means I'm in favor of it, right? This is why I work remotely.

bhaak · 2026-03-16T14:29:42 1773671382

Peter Thiel might be smarter than that but I’m not sure about the other ones.

Look how Musk treated the Twitter devs or Bezos any of his workers or Trump anybody.

LogicFailsMe · 2026-03-16T15:17:45 1773674265

They're all quite intelligent. And they're world class experts in saving their own bacon. Doesn't mean they have any ethics though nor any emotional intelligence after decades of being surrounded by toadies and bootlickers.

bhaak · 2026-03-17T07:14:02 1773731642

Smart is not equal to intelligent.

You can be very intelligent but have a blind eye on some trivial things.

I’m certain that some of them think they are untouchable (or even just are well prepared). We will only see if that’s really true if shit hits the fan.

LogicFailsMe · 2026-03-17T15:45:33 1773762333

We all know they have bunkers and we roughly know where they are. I got suspended on reddit for threatening harm to others for saying that a couple weeks back. But I don't think we need to raid the bunkers in your TEOTWAWKI scenario, their bodyguards will do all the heavy-lifting once they realize the power balance has shifted. But I also don't expect a SHTF scenario, just a slow creeping enshitification of living standards instead of actually implementing a UBI.

And then the survivors who band together to rebuild community instead of chasing some idiotic Mad Max scenario will ultimately prevail. And yes, they are blind to that other option because they wouldn't end up on top.

jplusequalt · 2026-03-16T14:34:18 1773671658

>If you can build things, AI coding agents will let you build faster and more for the same amount of effort.

But you aren't building, your LLM is. Also, you are only thinking about ways as you, a supposed builder, will benefit from this technology. Have you considered how all previous waves of new technologies have introduced downstream effects that have muddied our societies? LLMs are not unique in this regard, and we should be critical on those who are trying to force them into every device we own.

raw_anon_1111 · 2026-03-16T14:37:49 1773671869

Would you say the general contractor for your home isn’t a builder because he didn’t install the toilets?

jplusequalt · 2026-03-16T16:52:36 1773679956

I think this argument would be make more sense if you were talking about an architect, or the customer.

A contractor is still very much putting the house together.

raw_anon_1111 · 2026-03-16T16:59:30 1773680370

The general contractor is not doing the actual building as much as he is coordinating all of the specialist, making sure things run smoothly and scheduling things based on dependencies and coordinating with the customer. I’ve had two houses built from the ground up

LogicFailsMe · 2026-03-16T17:10:13 1773681013

3 myself and I have yet to meet a "vibe" contractor.

raw_anon_1111 · 2026-03-16T17:14:37 1773681277

And he is also not inspecting every screw, wire, etc. He delegates

LogicFailsMe · 2026-03-16T17:30:09 1773682209

Oh you're preaching to the choir. I think we are entering a punctuated equilibrium here w/r to the future of SW engineering. And the people who have the free time to go on to podcasts and insist AI coding agents can't do anything useful rather than learning their abilities and their limitations and especially how to wield them are going to go through some things. If you really want to trigger these sorts, ask them why they delegate code generation to compilers and interpreters without understanding each and every ISA at the instruction level. To that end, I am devoid of compassion after having gone through similar nonsense w/r to GPUs 20 years ago. Times change, people don't.

raw_anon_1111 · 2026-03-16T17:36:49 1773682609

I haven’t stayed relevant and able to find jobs quickly for 30 years by being the old man shouting at the clouds.

I started my career in 1996 programming in C and Fortran on mainframes and got my first only and hopefully last job at BigTech at 46 7 jobs later.

I’m no longer there. Every project I’ve had in the last two years has had classic ML and then LLMs integrated into the implementation. I have very much jumped on the coding agent bandwagon.

LogicFailsMe · 2026-03-16T18:00:57 1773684057

Started mine around the same time and yes, keeping up keeps one employed. What's disheartening however is how little keeping up the key decision makers and stakeholders at FAANNG do and it explains idiocy like already trying to fire engineers and replace them with AI. Hilarity ensued of course because hilarity always ensues for people like that, but hilarity and shenanigans appears to be inexhaustible resources.

raw_anon_1111 · 2026-03-17T04:52:06 1773723126

I very much would rather get a daily anal probe with a cactus than ever work at BigTech again even knowing the trade off that I now at 51 make the same as 25 year old L5 I mentored when they were an intern and their first year back as an L4 before I left.

LogicFailsMe · 2026-03-17T15:40:46 1773762046

If you have FIRE money, getting off the hamster wheel of despair that is tech industry culture is the winning move. Well-played.

raw_anon_1111 · 2026-03-17T16:20:09 1773764409

Not quite FIRE money. I still need to work for awhile - I just don’t need to chase money. I make “enough” to live comfortably, travel like I want (not first class.), save enough for retirement (max out 401K + catchup contributions + max out HSA + max out Roth).

We did choose to downsize and move to state tax free Florida.

If I have to retire before I’m 65, exit plan is to move to Costa Rica (where we are right now for 6 weeks)

LogicFailsMe · 2026-03-16T15:16:40 1773674200

I think that's precisely his thinking and don't let him know about all those fancy expensive unitasker tools they have that you probably don't that let them do it far more cost effectively and better than the typical homeowner. Won't you think of the jerbs(tm)? And to Captain dystopia, life expectencies were increasing monotonically until COVID. Wonder what changed?

k3nx · 2026-03-16T16:36:46 1773679006

I've struggled a bit with this myself. I'm having a paradigm shift. I used to say "but I like writing code". But like the article says, that's not really true. I like building things, the code was just a way to do that. If you want to get pedantic, I wasn't building things before AI either, the compiler/linker was doing that for me. I see this is just another level of abstraction. I still get to decide how things work, what "layers" I want to introduce. I still get to say, no, I don't like that. So instead of being the "grunt", I'm the designer/architect. I'm still building what I want. Boilerplate code was never something I enjoyed before anyway. I'm loving (like actually giggling) having the AI tie all the bits for me and getting up and running with things working. It reminds me of my Delphi days: File->New Project, and you're ready to go. I think I was burnt out. AI is helping me find joy again. I also disable AI in all my apps as well, so I'm still on the fence about several things too.

andrekandre · 2026-03-21T03:43:06 1774064586

  > I'm having a paradigm shift. I used to say "but I like writing code". But like the article says, that's not really true. I like building things, the code was just a way to do that.

i get this; for me i find coding is fun as video games so i don't personally want to turn all code to ai, but what i DO WANT is for it to automate away drudgery of repeating actions and changes (or when i get stuck be a rubber duck for me)... i want to focus my creativity on the interesting parts myself and learn and grow to a better programmer... it may sound crazy but programming is relaxing for me lol

druide67 · 2026-03-16T17:56:13 1773683773

This resonates. I spent years thinking I enjoyed coding, but what I actually enjoy is designing elegant solutions built on solid architecture. Inventing, innovating, building progressively on strong foundations. The real pleasure is the finished product (is it ever really finished though?) — seeing it's useful and makes people's lives easier, while knowing it's well-built technically. The user doesn't see that part, but we know.

With AI, by always planning first, pushing it to explore alternative technical approaches, making it explain its choices — the creative construction process gets easier. You stay the conductor. Refactoring, new features, testing — all facilitated. Add regular AI-driven audits to catch defects, and of course the expert eye that nothing replaces.

One thing that worries me though: how will junior devs build that expert eye if AI handles the grunt work? Learning through struggle is how most of us developed intuition. That's a real problem for the next generation.

petcat · 2026-03-16T12:56:37 1773665797

> A compiler uses rigorous modeling and testing to ensure that generated code is semantically equivalent.

Here are the reported miscompilation bugs in GCC so far in 2026. The ones labeled "wrong-code".

https://gcc.gnu.org/bugzilla/buglist.cgi?chfield=%5BBug%20cr...

I count 121 of them.

sarchertech · 2026-03-16T13:01:50 1773666110

If you can’t understand the difference between a bug that will rarely cause a compiler encountering an edge case to generate a wrong instruction and an LLM that will generate 2 completely different programs with zero overlap because you added a single word to your prompt, then I don’t know what to tell you.

petcat · 2026-03-16T13:17:17 1773667037

The point is that expert humans (the GCC developers) writing code (C++) that generates code (ASM) does not appear to be as deterministic as you seem to think it is.

sarchertech · 2026-03-16T13:23:23 1773667403

I’m very aware of that, but I’m also aware that it’s rare enough that the compiler doesn’t emit semantically equivalent code that most people can ignore it. That’s not the case with LLMs.

I’m also not particularly concerned with non-determinism but with chaos. Determinism in LLMs is likely solvable, prompt instability is not.

jplusequalt · 2026-03-16T14:39:45 1773671985

Classic HN-ism. To focus on the semantics of a statement while ignoring the greater point in order to argue why someone is wrong.

anthonyrstevens · 2026-03-16T15:30:53 1773675053

I think it's a perfectly fine point. The OP said (my interpretation) that LLMs are messy, non-deterministic, and can produce bad code. The same is true of many humans, even those whose "job" is to produce clean, predictable, good code. The OP would like the argument to be narrowly about LLMs, but the bigger point even is "who generates the final code, and why and how much do we trust them?"

sarchertech · 2026-03-16T19:52:08 1773690728

As of right now agents have almost no ability to reason about the impact of code changes on existing functionality.

A human can produce a 100k LOC program with absolute no external guardrails at all. An agent Can't do that. To produce a 100k LOC program they require external feedback forcing them from spiraling off into building something completely different.

This may change. Agents may get better.

petcat · 2026-03-16T14:52:23 1773672743

I argued the greater point? Software code-generation is not deterministic, whether it's done by expert humans or by LLMs.

sarchertech · 2026-03-16T14:56:15 1773672975

It has nothing to do with determinism. It's the difference between nearly perfectly but not quite perfectly translating between rigorously specified formal languages and translating an ambiguous natural language specification into a formal one.

The first is a purely mechanical process, the second is not and requires thousands of decisions that can go either way.

raw_anon_1111 · 2026-03-16T15:02:57 1773673377

And that’s no different than human developers

sarchertech · 2026-03-16T16:01:30 1773676890

The difference is that a human is that a human can reason about their code changes to a much higher degree than an AI can. If you don't think this is true and you think we're working with AGI, why would you bother architecting anything all or building in any guard rails. Why not just feed the AI the text of the contract your working from and let it rip.

raw_anon_1111 · 2026-03-16T17:01:10 1773680470

You give way too much credit to the average mid level ticket taker. And again, why do I care how the code does it as long as it meets the functional and none functional requirements?

sarchertech · 2026-03-16T19:43:27 1773690207

Because in a real application with real users all of the functional and non-functional requirements aren't documented anywhere but in the code.

raw_anon_1111 · 2026-03-17T17:16:13 1773767773

If only a coding agent had access to your code…

sarchertech · 2026-03-17T23:10:12 1773789012

You realize that coding agents aren’t AGI right? They aren’t capable of reasoning about a code changes impact on anything other than their immediate goal to anywhere near the level even a terrible human programmer is. That why we have the agentic workflow in the first place. They absolutely require guardrails.

Claude will absolutely change anything that’s not bolted to the floor. If you’ve used it on legacy software with users or if you reviewed the output you’d see this.

jcranmer · 2026-03-16T14:24:55 1773671095

Compilers are some of the largest, most complex pieces of software out there. It should be no surprise that they come with bugs as all other large, complex pieces of software do.

Kye · 2026-03-16T14:48:38 1773672518

This seems to apply easily to LLMs as language coprocessors that can output code. How long was it before people trusted compilers?

sarchertech · 2026-03-16T14:52:52 1773672772

If you don't understand the difference between something that rigorously translates one formal language to another one and something that will spit out a completely different piece of software with 0 lines of overlap based on a one word prompt change, I don't know what to tell you.

anthonyrstevens · 2026-03-16T15:32:04 1773675124

"rigorously" is doing a lot of heavy lifting here.

sarchertech · 2026-03-16T15:58:40 1773676720

Let's substitute rigorously with "in an extremely thorough, careful, and methodical way."

raw_anon_1111 · 2026-03-16T12:46:14 1773665174

As if when you delegate tasks to humans they are deterministic. I would hope that your test cases cover the requirements. If not, your implementation is just as brittle when other developers come online or even when you come back to a project after six months.

sarchertech · 2026-03-16T12:58:36 1773665916

1. Agents aren’t humans. A human can write a working 100k LOC application with zero tests (not saying they should but they could and have). An agent cannot do this.

Agents require tests to keep them from spinning out and your tests do not cover all of the behaviors you care about.

2. If you doubt that your tests don’t cover all your requirements, 99.9% of every production bug you’ve ever had completely passed your test suite.

raw_anon_1111 · 2026-03-16T13:36:11 1773668171

I have never known a human that could or did write 100K lines of bug free working code without running parts of it first and testing.

So humans also don’t write bug free code or tests that cover all use cases - how is that an argument that humans are better?

sarchertech · 2026-03-16T15:53:53 1773676433

Not that humans can't write 100k line programs bug free or without running parts of it.

An AI cannot write a 100k line program on its own without external guard rails otherwise it spins out. This has nothing to do with whether the agent is allowed to run the code itself. This is well documented. Look at what was required to allow Claude to write a "C compiler".

This has nothing to do with whether it's bug free. It literally can't produce a working 100k LOC program without external guardrails.

raw_anon_1111 · 2026-03-16T17:02:45 1773680565

Absolutely no one is arguing that you shouldn’t have a combination of manual and automated tests around either AI or human generated code or that you shouldn’t have a thoughtful design

sarchertech · 2026-03-16T19:47:27 1773690447

In a non-trivial app you can't test your way through all of the e2e workflows and thoughtful design isn't what I'm talking about.

How many bugs have you seen that passed your automated and manual testing? Probably 99.9% of them.

Now imagine that you take those same test suites and you unleash an agent on the code that has far worse reasoning capabilities than a human and you tell them they can change anything in the code as long as the tests pass.

raw_anon_1111 · 2026-03-16T19:53:52 1773690832

So if bugs pass through testing which they have forever, wouldn’t that imply that humans are just as fallible as AI - and slower?

I never suggested letting agents code for a day on end. I use AI to code well defined tasks and treat it like a mid level ticket taker

sarchertech · 2026-03-17T12:11:39 1773749499

If you have an employee who codes 2x faster than everyone else but produces 10x the bugs, would your suggestion to be to let him rip and stop reviewing his code output?

> I never suggested letting agents code for a day on end. I use AI to code well defined tasks and treat it like a mid level ticket taker

It doesn’t matter how long you’re letting it run. If you aren’t reviewing the output, you have no way of knowing when it changes untested behavior.

I regularly find Claude doing insane things that I never would have thought to test against, that would have made it into prod if I hadn’t renewed the code.

raw_anon_1111 · 2026-03-17T13:08:33 1773752913

> It doesn’t matter how long you’re letting it run. If you aren’t reviewing the output, you have no way of knowing when it changes untested behavior.

You’re focused on the output , I’m focused on the behavior. Thats the difference. Just like when I delegate a task to either another developer or another company like the random Salesforce integration or even a third party API I need to integrate with.

sarchertech · 2026-03-17T15:11:11 1773760271

Unfortunately you are not equipped to observe and test all or even most of the behavior of a non-trivial system.

And if you attempt to treat every module in your system like it’s untrusted 3rd party code you’ll run into severe complexity and size limits. No one codes large systems like that because it’s not possible. There are always escape hatches and entanglements.

raw_anon_1111 · 2026-03-17T15:35:46 1773761746

Actual a little company you might have heard of called Amazon does…

Jeff Bezos mandated it in 2002.

https://konghq.com/blog/enterprise/api-mandate

AWS S3 by itself is made up of 200+ micro services

sarchertech · 2026-03-17T23:13:57 1773789237

Except that they don’t. The API mandate gets violated all the time. And no one at Amazon actually treats every other team as a 3rd party.

raw_anon_1111 · 2026-03-17T23:51:01 1773791461

Have you worked at Amazon?

I am not saying they treat every other team as a third party. I am saying they treat the code itself as a black box with well defined interfaces. They aren’t reaching in to another services data store to retrieve information.

sarchertech · 2026-03-18T00:44:32 1773794672

How do you know someone is ex-Amazon ? Don’t worry they’ll tell you.

I haven’t mentioned it 5 times already so you can be pretty sure I haven’t. I know too many people that have worked there to ever make that mistake.

But more importantly, have you worked there in the last decade?

raw_anon_1111 · 2026-03-18T04:03:43 1773806623

Then you have know idea how Amazon works. I was there from 2020-2023.

And before that I worked at 4 product companies when new to the company managers /directors/CTOs needed someone to bring in best practices and build and teach teams.

I honestly wonder what type of “big ball of mud” implementations (https://blog.codinghorror.com/the-big-ball-of-mud-and-other-...) you have had to deal with that weren’t properly componentized and covered by tests.

throwaw12 · 2026-03-16T12:49:56 1773665396

Valid points. But crucial part of not "letting go" of the code is because we are responsible for that code at the moment.

If, in the future, LLM providers will take ownership of our on-calls for the code they have produced, I would write "AUTO-REVIEW-ACCEPTER" bot to accept everything and deploy it to production.

If, company requires me to own something, then I should be aware about what's that thing and understand ins and outs in detail and be able to quickly adjust when things go wrong

raw_anon_1111 · 2026-03-16T13:40:05 1773668405

In the past ten years as a team lead/architect/person who was responsible for outsourced implementations (ie Salesforce/Workday integrations, etc), I’ve been responsible for a lot of code I didn’t write. What sense would it have made for me to review the code of the web front end of the web developer for best practices when I haven’t written a web app since 2002?

throwaw12 · 2026-03-16T15:14:19 1773674059

as a team lead, if you are not aware of what's happening in the team, what kind of team lead is this?

on the other hand, you may have been an engineering manager, who is responsible for the team, but a lot of times they do not participate in on-call rotations (only as last escalation)

IanCal · 2026-03-17T10:01:27 1773741687

> what kind of team lead is this?

One that trusts the team?

Knowing what's happening in the team and personally reviewing parts of the code for best practices are very different things. Are the other team members happy? Does development seem to go smoothly, quickly and without constantly breaking? Does the team struggle to upgrade or refactor things? At some level you have to start trusting that the people working know what they're doing, and help guide from a higher level so they understand how to make the right tradeoffs for the business.

raw_anon_1111 · 2026-03-16T15:19:51 1773674391

As a team lead, I know the architecture, the functional and non functional requirements, I know the website is suppose to do $x but I definitely didn’t guide how since I haven’t done web development in a quarter century, I know the best practices for architecture and data engineering (to a point).

That doesn’t mean I did a code review for all of the developers. I will ask them how they solved for a problem that I know can be tricky or did they take into account for something.

jmalicki · 2026-03-16T15:12:20 1773673940

I've actually found that well-written well-documented non-spaghetti code is even more important now that we have LLMs.

Why? Because LLMs can get easily confused, so they need well written code they can understand if the LLM is going to maintain the codebase it writes.

The cleaner I keep my codebase, and the better (not necessarily more) abstracted it is, the easier it is for the LLM to understand the code within its limited context window. Good abstractions help the right level of understanding fit within the context window, etc.

I would argue that use of LLMs change what good code is, since "good" now means you have to meaningfully fit good ideas in chunks of 125k tokens.

raw_anon_1111 · 2026-03-16T15:15:25 1773674125

I somewhat agree. But that’s more about modularity. It helps when I can just have Claude code focus on one folder with its own Claude file where it describes the invariants - the inputs and outputs.

sarchertech · 2026-03-17T15:19:40 1773760780

If you don’t read the code how the heck do you know anything about modularity? How do you know that Module A doesn’t import module B, run the function but then ignore it and implement the code itself? How do you even know it doesn’t import module C?

Claude code regularly does all of these things. Claude code really really likes to reimplement the behavior in tests instead of actually exercising the code you told it to btw. Which means you 100% have to verify the test code at the very least.

raw_anon_1111 · 2026-03-17T16:31:18 1773765078

Well I know because my code is in separately deployed Lambdas that are either zip files uploaded to Lambda or Docker containers run on Lambda that only interact via APi Gateway, a lambda invoke, SNS -> SQS to Lambda, etc and my IAM roles are narrowly defined to only allow Lambda A to interact with just the Lambdas I tell it to.

And if Claude tried to use an AWS service in its code that I didn’t want it to use, it would have to also modify the IAM IAC.

In some cases the components are in completely separate repositories.

It’s the same type of hard separation I did when there were multiple teams at the company where I was the architect. It was mostly Docker/Fargate back then.

Having separately defined services with well defined interfaces does an amazing job at helping developers ramp up faster and it reduces the blast radius of changes. It’s the same with coding agents. Heck back then, even when micro services shared the same database I enforced a rule that each service had to use a database role that only had access to the tables it was responsible for.

I have been saying repeatedly I focus on the tests and architecture and I mentioned in another reply that I focus on public interface stability with well defined interaction points between what I build and the larger org - again just like I did at product companies.

There is also a reason the seven companies I went into before consulting (including GE when it was still a F10 company) I was almost always coming into new initiatives where I could build/lead the entire system from scratch or could separate out the implementation from the larger system with well defined inputs and outputs. It wasn’t always micro services. It might have been separate packages/namespaces with well defined interfaces.

Yeah my first job out of college was building data entry systems in C from scratch for a major client that was the basis of a new department for the company.

And it’s what Amazon internally does (not Lambda micro services) and has since Jeff Bezos’s “API Mandate” in 2002.

sarchertech · 2026-03-18T00:23:18 1773793398

This sounds like an absolute hellscape of an app architecture but you do you. It also doesn’t stop anything but the Module A imports C without you knowing about it. It doesn’t stop module A from just copy pasting the code from C and saying it’s using B.

>almost always coming into new initiatives

That says a lot about why you are so confident in this stuff.

raw_anon_1111 · 2026-03-18T06:49:32 1773816572

Yes microservice based architecture is something no modern company does…

Including the one that you were so confident doesn’t do it even though you never worked there…

Yet I don’t suffer from spooky action at a distance and a fear of changes because my testing infrastructure is weak…

Either I know what I’m doing or I’ve bullshitted my way into multiple companies into hiring me to lead architecture and/or teams from 60 person startups to the US’s second largest employer.

Did I mention that one of those companies was the company that acquired the startup I worked for before going to BigTech reached out to me to be the architect overseeing all of their acquisitions and try to integrate them based on the work I did? I didn’t accept the offer. I’ve done the “work for a PE owned company that was a getting bigger by a acquiring other companies and lead the integration thing before”

So they must have been impressed with the long term maintenance of the system to ask me back almost four years after I left

sarchertech · 2026-03-18T17:34:04 1773855244

If the only evidence you have that your software is maintainable is that a company once asked you to come back, and you have no actual experience maintaining large applications with millions of users, you essentially have data to base any of your claims on.

You may have 30 years experience architecting new applications, but when it comes to maintaining large applications, you’re a neophyte.

If you don’t have first hand experience with what long term maintenance looks like for your creations, you don’t have any reason to be telling anyone how to write maintainable software.

If I were you I’d be suffering from imposter syndrome big time. What if you’re just a really good salesman and bullshitter? If I were you I’d want to stick around at a few places to see first hand how my designs hold up.