I see it hallucinate quite often in development, but mostly in getting small details wrong that are automatically corrected by lint processes. Large-scale hallucination seems better guarded against, though I suspect that's because its latitude is constrained by context and by harnesses like lint and type systems, as well as by fine-tuned tool flows in coding models that control for divergence. But I would classify mistakes like getting variable names, package names, or signatures wrong as hallucinations.
Interestingly, I've learned more about the languages and systems and tools I use in the last few years working with agentic coding than I did in 35 years of artisanal programming. I am still vastly superior at making decisions about systems and techniques and approaches than the agentic tools, but they are like a really, really well-read intern who knows a great deal of detail about errata but has very little experience. They enthusiastically make mistakes but take feedback - at least up front - even if they often forget, because they don't totally understand and haven't internalized it.
The claim you should know everything about everything you work on is an intensely naive one. If you've worked on a team of more than one, there's a lot of stuff you don't totally grok. If you work in an old code base, almost every bit of it is unfamiliar. If you work in a massive monorepo built over decades, you're lucky if you even understand the parts everyone considers you an expert in.
I often get the impression that folks making these claims are either very junior themselves, work basically alone, or have worked on the same project for 20 years. No one who works in a team or a larger org can claim they know everything in their code base. No one doing agentic programming can either. But I can at least ask the agent a question and it will be able to answer it. And after reading other people's code for most of my adult life, I can absolutely read the LLM's. The fact that a machine wrote crappy code vs a human bothers me not in the least, and at least the machine will take my feedback and act on it.
You have 35 years of experience and have already built up the learning capability and general framework to acquire new knowledge. You know how to use agentic coding as a tool to supplement your work. The juniors who start today don't have that; they over-rely on agentic coding and don't know what they don't know.
Exactly this. We need to be more precise than blanket statements like "agentic coding is a trap" and start figuring out what a "tasteful" application of agentic coding looks like. ChatGPT is destroying liberal arts curriculums because students can choose not to do any of the thinking themselves and still produce mediocre work that passes the bar. I think the same problem is showing itself with agentic coding, just with more directly measurable consequences (because the pile of software ends up failing in a more spectacular way than the pile of bad writing).
On liberal arts, it's simply a matter of what the students want to get out of the class vs. what the teacher wants the students to do: there's a huge disconnect in goals and expectations, so there's no way for the teacher to actually win. The fact that there's such a disconnect should give the departments pause.
This doesn't happen at all with agentic coding: what the programmer wants and what the boss wants are pretty well aligned. There are corner cases where someone isn't allowed to use LLMs but does so anyway, but in most cases the organization agrees.
> what the students want to get out of the class vs. what the teacher wants the students to do: there's a huge disconnect in goals and expectations, so there's no way for the teacher to actually win. The fact that there's such a disconnect should give the departments pause.
Unless the teacher's role is to scaffold and support the students in acquiring what they want, gaining trust and lowering the disconnect.
Honestly I'm not really thinking about the boss-programmer relationship, but rather the programmer-agent relationship. At best, you get what fnordpiglet is talking about, where it's a symbiotic relationship. On the other side of the coin, you get a parasitic relationship like the OP is talking about: the agent delivers results, you take credit, you fail to develop (or maintain) long-term skills, you become a non-value-adding middleman, you get replaced.
I think it's most easily summarized as: "It's still important to know things, and what was important to know before hasn't really changed." If anything, agentic coding highlights and accentuates the need for good systems and software design know-how.
IMO, by the time today's juniors would have 5-10 years of expected experience, the entire field will be something different altogether. Language choice distribution will collapse (if not change altogether), and whole new modalities of monitoring and progressive delivery guardrails will come into play, essentially creating a 24/7 incremental rollout of pure agentic code. Correctness will be determined by a mix of language features, self-monitoring by models in production, and automated testing against production snapshots in pre-production. Deep debugging will be the province of a select group of engineers; there will be a pathway to those roles for juniors, but those roles will be coveted and difficult to break into (and will probably require education and maybe even informal accreditation).
Just as "use code for contracts" failed for crypto currencies, "use AI output as prod" will fail for AI. Both is based on "just don't make catastrophic mistakes anymore".
You also wrongly assume that requirements can always be easily expressed in natural language.
Another point: software engineering always starts where tooling capabilities stop. You don't get a competitive advantage by building without engineers what everybody else can build without engineers.
> Just as "use code for contracts" failed for crypto currencies, "use AI output as prod" will fail for AI. Both is based on "just don't make catastrophic mistakes anymore".
What I think will happen is that AI will write code and do the best it can to mitigate mistakes prior to rollout. But once rollout time comes, rollout will be incremental, and it will self-monitor by defining success conditions at rollout time. The nature of the code will limit "catastrophe" to a small group at worst, but most likely the initial rollout will just run new versions of the code in a simulated context (language design could benefit from this) and analyze potential outcomes without affecting current functionality.
But when the code goes live, it will be scoped in slowly and progressively (think feature/experiment flags), and if it fails in the initial cohort, it will redirect. If the success conditions hold, it will increase the rollout cohort.
This is normal software engineering practice today, but it's labor- and process-intensive when driven by humans. In a world where humans are less involved, this process is scalable.
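To make the shape of this concrete, here's a minimal sketch in Python. Everything in it - the cohort sizes, the wait, the error-rate source - is a hypothetical placeholder, not any real platform's API:

    import random
    import time

    COHORTS = [0.01, 0.05, 0.25, 1.0]  # fraction of traffic on the new version

    def observed_error_rate(cohort: float) -> float:
        """Stand-in for querying real telemetry (otel, Datadog, APM...)."""
        return random.uniform(0.0, 0.02)  # fake signal, just for the sketch

    def set_traffic_fraction(fraction: float) -> None:
        """Stand-in for flipping a feature/experiment flag."""
        print(f"routing {fraction:.0%} of traffic to the new version")

    def rollout(success_threshold: float = 0.01) -> bool:
        for cohort in COHORTS:
            set_traffic_fraction(cohort)
            time.sleep(1)  # in reality: long enough for the cohort to generate signal
            if observed_error_rate(cohort) > success_threshold:
                set_traffic_fraction(0.0)  # redirect: roll everything back
                return False               # kick the change back for refinement
        return True                        # success conditions held at every cohort

    print("rolled out" if rollout() else "rolled back")

The interesting part isn't the loop itself; it's that the threshold and the cohort schedule could be generated at rollout time from the change's stated goals.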
This assumes failures can be detected and fixed more easily than generating the corresponding change. I am not convinced that's the case.
Counterpoints to my own arguments:
1. We don't know yet in detail what AI is good at.
2. AI doesn't need to be perfect, just "good enough", whatever that means for a specific project. More failures while saving hundreds of thousands of dollars each year might be acceptable, for example.
> 2. AI doesn't need to be perfect, just "good enough", whatever that means for a specific project. More failures while saving hundreds of thousands of dollars each year might be acceptable, for example.
This, I think, is the unexplored aspect of what's happening right now. Guardrails around "good enough" systems are where the future value lies. In the future, code will never be as good as when the artisans were writing it, but if you have an automated process to validate/verify mediocre code (and kick it back to the AI for refinement when it fails) before it's fully productionized, then you have a pathway to scaling agentic coding.
If you are working with AI to define the purpose and goal of the change -- which is to say, planning how the changes to the code should result in some sort of feature/bugfix/whatever -- then the planning phase should ask you to define clear success conditions for the code that it writes. These could be otel/Datadog metrics, some kind of funnel metric, a cessation of errors in your APM, whatever. In any case, the outcome of the change is what I mean by validate/verify. Mediocre code can solve issues, and we can tolerate mediocre code in that sense. The guardrails kick back failing "mediocre" code; they accept working mediocre code.
And this could easily apply to every change we made by hand before AI; it was just tedious to layer these things into the code when we were just fixing bugs and whatnot. In a world where AI writes all the code, adding this kind of stuff as table stakes for a changeset is zero cost, effort-wise.
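A rough sketch of what declaring success conditions as data might look like. The metric names and thresholds here are invented for illustration; a real version would read from your otel/Datadog/APM backend:

    # Success conditions declared up front, checked by a guardrail after
    # deploy. Metric names and thresholds are made up for this sketch.
    SUCCESS_CONDITIONS = {
        "checkout_error_rate": ("max", 0.005),      # errors must stay below 0.5%
        "signup_funnel_completion": ("min", 0.30),  # funnel must not regress
    }

    def violations(metrics: dict) -> list:
        """Return violated conditions; empty means 'working mediocre code'."""
        out = []
        for name, (kind, bound) in SUCCESS_CONDITIONS.items():
            value = metrics[name]
            if (kind == "max" and value > bound) or (kind == "min" and value < bound):
                out.append(f"{name}={value} violates {kind} bound {bound}")
        return out

    # The guardrail: accept working mediocre code, kick failing code
    # back to the agent for another round of refinement.
    observed = {"checkout_error_rate": 0.004, "signup_funnel_completion": 0.35}
    failed = violations(observed)
    print("accept" if not failed else f"kick back to agent: {failed}")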
> rollout will be incremental, and it will self-monitor by defining success conditions at rollout time.
This sounds a lot like allowing an LLM to define tests as well as implementation, and allowing the LLM to update the tests to make the code pass. Recently people have come to understand (again?) that testing and evaluation works better outside of the sandbox.
Sorry, I wasn't very clear about that part. I think success conditions are described by stakeholders, whoever that is, and the implementation of monitoring them is probably created by the LLM. For engineering-level stakeholders that's going to be metrics, performance, etc. For more business-side stakeholders it'll be a mix of data metrics and product feature metrics: click-through rates, stuff like that.
> Another point: software engineering always starts where tooling capabilities stop. You don't get a competitive advantage by building without engineers what everybody else can build without engineers.
I'd note here that the long arc of software engineering has been commodifying the discipline into tooling. Ask any Unix greybeard how shitty modern abstractions are and they'll give you all you can stomach, and yet the wheel turns despite their treasured insights.
Up until about four months ago, the sentiment from so many programmers online (here, Reddit, etc.) was that LLMs were absolutely useless at coding. And yet many of us were puzzled, because we were having success with them, at coding.
People were burying their heads.
Today, there are not many of those people left. Some, but not a lot. Because you can only deny reality for so long.
I don't know what the coding world is going to look like in 5-10 years, but everything has changed radically in the space of a year from maybe 10% of people using agents to code to probably 95% of people now. In about a YEAR.
I don't know, but my assumption is these things will get better to a point where they will be automating close to 100% of coding, and deploying, and verifying, etc. The old job we had will be completely changed well before 10 years. I still think us "engineers" will have a role to play, but I genuinely don't know what it will look like.
> I don't know what the coding world is going to look like in 5-10 years, but everything has changed radically in the space of a year from maybe 10% of people using agents to code to probably 95% of people now. In about a YEAR.
Last I saw, about a week ago, the stats were about 35%. There may be some confusion around this:
1. The absolute number could have remained the same while the sheer volume of vibe-coders who never coded before raised the percentage. For example, if 100 out of a population of 1000 people use AI, then the percentage is 10%. If, over the next year, 9k new vibe-coders appeared but none of the existing 1000 people changed their workflow, you would see 9100 out of 10000 people using AI - that's now a 91% rate of people using AI to code even though none of the people from last year changed the way they work. (A quick check of this arithmetic follows point 2 below.)
2. Last I checked, pre-AI, there were about 12m working developers in the world (SO survey, extrapolated). As of February this year, CC, by itself, had 60k subscribers. Even if we err on the side of optimism and assume every single subscriber is running the agent, that's still not 95% of developers.
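To put point 1 in code, the dilution arithmetic as a quick Python check (same made-up numbers as the example above):

    # 1000 original devs, 100 using AI; then 9000 new vibe-coders arrive.
    before = 100 / 1000                    # 10% adoption
    after = (100 + 9000) / (1000 + 9000)   # 91% adoption
    print(f"{before:.0%} -> {after:.0%}, with zero existing devs changing workflow")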
> I still think us "engineers" will have a role to play, but I genuinely don't know what it will look like.
??? We already know what it looks like - "Business Analyst" has been a role since forever (at least since 1995, when I entered the workforce). If you wanted a role where you wrote no code but merely drew up specs for the programmers to code, you could have had it as a BA.
It's just that few of us wanted to do that as it paid half what an engineer made. Now with the supply of BAs potentially doubling, it will pay a quarter of what an engineer used to make.
> I don't believe those figures. Perhaps we could discuss more if you linked the reports.
No report is going to help because there are no actual figures for token providers at the moment. We'll have to wait for them to IPO before we'll know for sure.
And yet you are citing some concrete report you aren't sharing. The problem with your original comment is that you scoped it to subscribers (which I assume is how this unshared report was framed). API billing for enterprises will far surpass that number in individual users. Claude Code is available at my org to every person employed here - that's just shy of 2k people - and I can say with confidence we are not 3% of Anthropic's customer base alone.
Someone probably made this same argument against certain frameworks over the years, and juniors still figured it out. We need to stop trying to babysit learning for hypothetical situations.
the bar to "start" is lower and the bar to actually competency is higher now, juniors who want to actually learn instead of just pressing enter over and over again will do so regardless of whatever you do to "help" them.
It's not really a hypothetical. I work with one junior who's submitted an incorrect bugfix 3 times and counting; he seems genuinely incapable of processing the idea that there's a correctness issue he has to resolve, rather than a prompt engineering issue that will allow Claude to figure it out if only he asks in the right way.
That's not the tooling's fault, I feel. I've used LLMs to help explore and debug issues, point me to the right documentation to investigate, etc. I WISH I had something like this 30 years ago.
Juniors will figure out how to make things work just as well as we did. They may end up with a different set of skills, but competitive advantage is still a thing, and so competition will mean they end up with the best skills suited for the environment.
If a junior builds something with agents that turns into a mess they can’t debug, that will teach them something. If they care about getting better, they will learn to understand why that happened and how to avoid it next time.
It’s not all that different than writing code directly and having it turn into a mess they can’t debug—something we all did when we were learning to program.
It is in many ways far easier to write robust, modular, and secure software with agents than by hand, because it’s now so easy to refactor and write extensive tests. There is nothing magical about coding by hand that makes it the only way to learn the principles of software design. You can learn through working with agents too.
> that will teach them something. If they care about getting better,
This pre-supposes the idea that the business is _willing_ to let that happen, which is increasingly unlikely. The current, widespread attitude amongst stakeholders is “who cares, get the model to fix it and move on”.
At least, when we wrote code by hand, needing to fix things by hand was a forcing function: one that now, from the business perspective, no longer exists.
This is what I have been thinking. Businesses will always try to do more with less, because their only true goal is figuring out how to make more money. They will sacrifice giving those juniors time to learn from their mistakes for the sake of making more widgets (code). From the wider generational view, they will rob today's juniors of the chance to learn that keeps the talent pipeline full, so they can profit today - the future (and the developers who will arrive there) be damned. The economic game is flawed because it only ever comes down to a single output being optimized for: money. One solution? I think software people might consider forming unions. I know that's antithetical to the lone-coder ethos, but if what this comment reflects is true, the industry needs a check and balance to prevent it from destroying its foundation from the inside.
Why would a business train anyone when they can lean on the govt to provide unbankruptable loans for the student to go to university and learn it themselves?
That’s also true without AI. Engineers want more time to polish and businesses want to ship the 80/20 solution that’s good enough to sell. There's always going to be a tension there regardless of tools.
Don't you see the problem? Now engineers literally do not have any leverage. Did the model make it work? Yes? Then ship it, what are we waiting around for?
That sounds pretty much the same as it’s always been? It used to be: “Does the happy path work? Then ship it! There’s no time to make it robust or clean up tech debt.”
Now there actually is time to make things robust if you learn how to do it.
> Now there actually is time to make things robust if you learn how to do it.
What makes you think you are going to be given time to polish it? You would be pushed to another project. You have more responsibilities with none of the growth.
You’re assuming that building something robustly is significantly more time consuming than the “quick and dirty” version. But that’s not really true anymore. You might need to spend another hour or two thinking through the task up front, but the implementation takes roughly the same amount of time either way.
One cannot build something robust just by thinking about it _a priori_, and while this was somewhat at the periphery of the author's argument, it is important.
You can’t get every detail right up front, but you can build a robust foundation from the beginning.
The argument seems to be that AI is causing managers to demand faster results, and so everything has to be a one-shotted mess of slop that just barely works. My point is that it doesn’t take much longer to build something solid instead. Implementation time and quality/robustness are not tightly coupled in the way they used to be.
Coding by hand is not merely typing symbols into an editor, which is what LLMs are now replacing; it's thinking, abstracting, deciding how to apply your knowledge and experience, searching for information.
And of course, in the current workplace, where there's often a push from managers to use LLMs as much as possible and to take on as much work as possible, in this churn a junior will not get to learn anything besides prompting and simple tooling.
> thinking, abstracting, deciding how to apply your knowledge and experience, searching for information
None of this requires coding by hand. I can do those things better and faster with agents helping me. That includes unfamiliar areas where I am effectively a junior.
Line by line is no longer what I need to think about. I think about types/schemas, architectural division, contracts between services and components, how to test thoroughly, scaling properties, security properties, and these kinds of things.
> The juniors who start today don't have that; they over-rely on agentic coding and don't know what they don't know
Y'all need to stop worrying about the kids.
They're smarter than us and will run circles around us.
They're going to look at us like dinosaurs and they're going to solve problems of scale and scope 10x or more than what we ever did.
Hate to "old man yells at cloud" this, but so many people are falling into the trap because of personal biases.
While the fear that "smartphones might make kids less computer literate" has been borne out, that's because PCs are not as necessary as they once were. The kids who turn into engineers are fine and every bit as capable.
You've got to be kidding me. I'd have never managed to become an engineering professional if smartphones were around when I was a horny teenager. There's simply no way.
The proof is in the fact that the savvy Atherton dwellers work hard to keep their kids away from the crack they themselves have foisted upon the world, or at least to delay or forestall the encounter.
> Also, no one gets smarter by outsourcing thinking.
Thinking is happening at a higher level. Humans are adept at abstraction, and they are always capable of looking under the hood when needed.
We've never had so much societal capability as now. And that's only going to accelerate. Smart people will use these tools effectively. Don't be so bearish on human ingenuity.
Think of these tools as bullshit / busy work removers. You can focus on what matters and get more done than ever before. Deeper work, more connective work. It also opens fields of research up in an interdisciplinary fashion. People might explore outside of their limited domain now that they have help.
For example, preliterate people had absolutely insane memories. In comparison, my memory sucks. Having to use notes, look things up, etc. sucks. Literacy is a tradeoff, but at least it can be argued to be worth it.
Then there are smartphones. This is not the same. The tradeoffs compared to pre-smartphone life cannot be argued to be worth it, imo, and I was 20 years old when they were introduced. They make society and lives worse. It's not just about not being able to use a PC, but about your attention and social skills suffering.
Then there is AI, which is even worse than smartphones. The tradeoffs are so unthinkably bad I can't really even describe them.
Due to my limited English for most of my adult life, I struggled to code. I used visual coding tools, etc. But of course, I can't make a living on a drag-and-drop harness.
In comes GPT-3.5, which accelerated my learning. Now I'm running my incorporated company and just launched one software-hardware hybrid product. The second one is a micro-SaaS in closed beta.
The point is: when people treat "juniors" as fixed-shape blobs of matter, they focus on the juniors who were going to make mistakes in any case, AI or not. That misses the key point of agentic usage.
So now you can code? If I sat you in front of a computer with no internet and no GPU but your choice of IDE, would you actually be able to produce a product?
Okay, after re-reading the thread, it looks like this question was not asked in bad faith. While commenting, I was reading from bottom to top and so formed an opinion about you based on your bottom-thread comments (which still stands), but other readers deserve an answer, even if the questioner doesn't.
For anyone wondering about my proficiency: I can code without the internet or AI help. But it takes an enormous amount of time and involves many mistakes.
>> In comes GPT-3.5, which accelerated my learning. Now I'm running my incorporated company and just launched one software-hardware hybrid product. The second one is a micro-SaaS in closed beta.
> accelerated what learning? learning to code? learning to engineer? learning to manage? learning to market?
I'm pretty certain that you think you're talking to an owner of a business but you're actually talking to an AI-techbro whose "software-hardware hybrid product" and "incorporated company" has exactly zero revenue after it was prompted into existence in the hope that it will make some money before other people realise they could prompt the same thing for less.
No, look. You claimed things that I am skeptical about.
Vibing a product into existence without needing any development knowledge or experience just means you now have a "product" that can't really be sold for money.
I didn't speak English in my early teenage years, and that didn't stop me from reading books about programming in my native language. I remember spending hours in bookshops, excited to pick up the next book to devour and try out.
I don't believe in victimhood and so didn't want to go here, but since we are comparing notes:
The English alphabet came into my education at the age of 10. I got my first computer at the age of 21. I began speaking broken English around the age of 23. Proper internet at the age of 25 or so.
Not to mention, my native language doesn't have programming books, even today.
Of course, I'm an avid reader and science nerd. Curiosity and tinkering never stopped.
Out of curiosity, what is your first/native language?
In my country, English is hardly anyone's first language, but it's mandatory in schools, so I've never had the experience of having to find knowledge that's gate-kept behind a translation wall.
My native language is Gujarati. I did my schooling and college in Gujarati too.
Absolutely, I understand what you're saying.
One of the things people miss in most of these discussions is that they think "if you were really serious, you would have figured it out." I agree with that in most instances, but language and skill acquisition is a complex process, as everyone knows.
With English being the de facto reservoir of programming knowledge and applications, it takes a substantial amount of time and effort to cross the threshold of understanding and transference.
In any case, I'm an eternal optimist and I believe in action. It was a great experience listening to people's opinions here, and I was kind of shocked to find that some of them are so siloed in their chambers - but that's interesting nonetheless.
This post does not make the claim that "you should know everything about everything you work on" - it's making the claim that writing code and being able to read code effectively are intrinsically linked.
Author here. Thank you. I absolutely was not stating that, and I don't think it's possible or necessary to "know everything", but there's certainly a movement in the industry (and not a small one) that is advocating abandoning even looking at code. It started with Karpathy's "vibe coding" tweet and has extended to the point where entire teams are performing "LGTM"-level code reviews while simultaneously reporting that they are finding it difficult to remember how to read, never mind write, code properly. "Thinking in code" is a form of thinking distinctly different from staying at the higher engineering levels of planning and architecture, and yet they are completely intertwined and interdependent. I plan better when I am engaging with code, and I code better when I know how to plan properly.
I've been thinking about it as a sort of debt. My team sees AI as nothing but a positive. Work gets done faster; what's not to like. I think the piper will need to be paid at some point, when we look back and realize how a) completely dependent we've become and b) unable to reason about our own code base we are.
I really hope I'm wrong but I can see it already happening.
I wonder if it's not so much the coding that people don't want to write, but it's more about the weight of all the orchestration, data engineering and research that has to be done (or, understood in the first place) to get anything off the ground these days. It feels off the charts complicated, and of course is now shifting rapidly.
Agreed. I don't know anything about turning sand into transistors, or assembly, but I do well. So I don't know my full stack either.
What is important is not being afraid to learn the rest of your system and keeping an index.
Most importantly, it's about being able to spin up on anything quickly. That's how you have wide reach. Digging in when you have to, gliding high when you can. The appropriate level for the problem at hand.
When I was in college, eons ago, they taught CS folks all of engineering. "When do I need to know chem-e or analog control systems?" we asked. "You won't. You just need to be able to spin up on it enough to code it and then forget it. We're providing you a strong base."
> The claim you should know everything about everything you work on is an intensely naive one.
I disagree with this take. Personally, I pride myself on learning the code bases I work on in detail, sometimes better than the leads for those code bases. I'm not saying that everyone should do so, but it's achievable and not naive at all.
Knowing it better than the leads isn't that hard - they spend most of their life in meetings and teaching people how to think. Knowing the code base in detail is important - but I'm certain that unless you wrote it all, there are parts you don't know. I'm sure what you do is build enough scaffold understanding, and depth in the core parts, that you can visit any part and understand it. But I'm also certain there are parts whose details, on pure recall, you are unaware of. Someone else wrote it, you haven't had to read it yet, and thus it's a black box. Either that, or your code base is quite small relative to the team size, or the team is very unproductive. The supposition that one person is fully aware of any growing code base built by a team or organization - or a monorepo being built by 10,000 developers over 15 years - is prideful. A lot of it works because it works, and you accept that unless you need to inspect a part because it's not working. Whether a machine wrote it or an intern did 10 years ago, it's a black box until it has to not be.
Even if you did write it all, unless you are regularly in all of it (which sounds like a horrible job to me), or it is rather small, in my experience you will be at some point trying to git blame some section of code you don’t understand only to find the finger pointed at your ghost.
> The claim you should know everything about everything you work on is an intensely naive one
This is a slight tangent from that, but I place a lot of value on the ability to offload some/most of the mental model to AI. I need to know less about everything (involved in this one task) when working on it, because a lot of the peripheral information can be handled by the AI. I find that _incredibly_ useful.
I have also seen the learning acceleration, there's a significantly increased set of techniques and technologies I have learned how to apply.
From a person perspective though, I'm apprehensive about the effect AI will have on the human "very well read intern." People who know a lot very deeply about specific areas are fascinating to talk to, but now almost everyone is able to at least emulate deep knowledge about an area through the use of AI. The productivity is there, but the human connection is missing.
I agree. I think this is personally very useful, but I think a great deal of what made computing an amazing industry to work in is going to or has already died. I suspect the general field as it existed will entirely cease to exist before the current set of well read interns have had much of a career chance. It is sad, but we have finally succeeded in programming ourselves out of jobs.
> The claim you should know everything about everything you work on is an intensely naive one.
Author here. Where did you find me stating that? As other users said, that's not at all found in my writing. The rest of your post goes on a tangent about this notion, but it seems like it's more of a personal pontification than a critique of anything I wrote.
> The claim you should know everything about everything you work on is an intensely naive one.
It is true that you normally do not need to know everything, or even most of it.
Despite this, it is necessary to be able to quickly discover and understand anything about the project or system on which you work.
I have seen plenty of software teams that became stuck at some point because they could not solve some trivial problem that required zooming into a part of the project where extra skills were needed to understand what they saw - understanding a lower-level language, or assembly language, or some less usual algorithms or networking protocols, and so on.
Or they were stuck not because they lacked the skills to interpret what they saw, but because they used something that was a black box, like a proprietary library or a proprietary operating system, and it was impossible to determine what it really did, as opposed to what it was expected to do, without being able to dive into its internals.
So I believe that the environment should always enable you to know everything about everything you work on, even if this should be only very seldom necessary.
> But I can at least ask the agent a question and it will be able to answer it
A problem here is that, in some sense, the agent that wrote the code is not the same agent that is answering questions about it. If the original agent didn't leave its reasoning behind, you are probably out of luck.
There are tools like git-ai [0] that capture LLM sessions, associate each file edit with a specific agent action, and let agents query a given piece of code to read the conversation around it (what the user prompted, what the reasoning of the LLM that created the code was, etc.). They could change the balance, but they are not widely used.
"Hey! Just popping in to say that agentic coding is actually pretty great and is making me better in all the ways; but also want to say at same time that it's actually not all that different from anything else, so we can chalk up any critique to it to individual naivety and bias."
I agree with this. While refactoring the outdated code at my company, I realized that "you can't know everything." After the refactor, I can ask the LLM questions and get answers, but unfortunately, when integrating a new feature, it keeps treating it as a "new layer."
I think it's important to at least have a mental model of code you directly commit to the codebase, and that doesn't happen if it was written by an agent.
I've built a configuration transpiler for Claude Code and Codex and found I can switch pretty quickly between the two and run both at once. At the moment Codex performs better; before that, CC did. There is no vendor lock-in - that's an old canard in technology that LLMs themselves, in fact, make irrelevant. Once you've got an implementation that uses X, converting it to Y is almost trivial with an LLM, because the spec is canonical in the reference.
It's buried in my dotfiles and not easily extracted. But the idea isn't a hard one to implement, except that the coding agents are woefully unaware of themselves. Codex is easier because it's open source; with Claude you kind of have to futz with it for a while. Once you have the intermediate form working and outputting config for the two, I'm sure you can coerce it into any other agent that comes along with similar constructs (marketplaces, etc.). There's some nuance for some MCPs, particularly those that download binaries like Rust MCPs, but I found that very complex and probably better avoided unless you really need it.
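For a flavor of the intermediate-form idea, here's a toy sketch. The two output shapes below are simplified stand-ins I made up to show the pattern, NOT the real Claude Code or Codex schemas - consult each tool's docs for those:

    import json

    # One neutral config, transpiled to two per-agent shapes.
    # Both output formats here are simplified stand-ins.
    NEUTRAL = {
        "mcp_servers": {
            "search": {"command": "my-search-mcp", "args": ["--port", "7777"]},
        },
    }

    def to_claude_style(cfg):
        # Hypothetical JSON-shaped output for a Claude-Code-like tool.
        return json.dumps({"mcpServers": cfg["mcp_servers"]}, indent=2)

    def to_codex_style(cfg):
        # Hypothetical TOML-shaped output for a Codex-like tool.
        lines = []
        for name, server in cfg["mcp_servers"].items():
            lines.append(f"[mcp_servers.{name}]")
            lines.append(f'command = "{server["command"]}"')
            lines.append(f"args = {json.dumps(server['args'])}")
        return "\n".join(lines)

    print(to_claude_style(NEUTRAL))
    print(to_codex_style(NEUTRAL))

The point of the neutral form is that adding a third agent later is one new emitter function, not a rewrite.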
Yeah, the language is bizarre and clearly artificial. It was almost certainly made by an openclaw agent. It took me a long time to understand their point amongst the points.
To be fair, taking an average SWE at $160k/y, and spending $1k/m, and offloading mechanical ticket work from their working set sounds like a bargain to me. They could spend the time on design and planning, working on new things, and figuring out how to save costs through optimizations. In fact, for every soul-sucking mechanical task you offload, the better off you are overall.
It's not like AI is the first time this has happened. CI/CD and extensive preflight, integration, and canary testing are also ways of saving engineer time and improving throughput at the cost of latency and compute resources. This is just moving up the semantic stack.
Obviously, as engineers, we say "awesome, more features and products!" but management says "awesome, fewer engineers!" Either way, pasting the ticket in and letting a machine do the work for a fraction of the cost was the right choice. There's no John Henry award.
> pasting the ticket in and letting a machine do the work for a fraction of the cost was the right choice
If it were producing equivalent outcomes, sure. So far I haven't personally seen strong evidence of that. LLMs do write code pretty competently at this point, but actually solving the correct problem, without introducing unintended consequences, is a different matter entirely.
This. LLMs are terrible at planning/architecture and maintaining clarity of vision across a project. There are lots of tools that mitigate these issues but they're going to keep coming up regardless because of the fundamental nature of LLMs.
If you're not doing the design of the solutions for problems as an engineer or at least making the decisions and owning the maintenance of that architecture/design, what even is your job at that point?
I've found LLMs very good at two things: 1. recommending paths forward, and 2. following established architecture. Your job is to be the shepherd and treat the LLM and the code as sheep.
> and offloading mechanical ticket work from their working set sounds like a bargain to me
Unfortunately, the people who offload the work of understanding and interacting with tickets just end up offloading the consequences onto everyone else, who has to do extra work to make sure their LLM understands the task, review the work to make sure they built the right thing, and on and on.
The same thing happens when people start sending AI bots to attend meetings: The person freed up their own time, but now everyone else has to work hard to make sure their AI bot gets the right message to them and follow up to make sure what was supposed to happen in the meeting gets to them.
If someone sends a bot to a meeting, warn them the first time. Fire them the second, for exactly the reason that you said in your last paragraph: They're pushing their work onto other people.
I think if Codex can fill in some functional gaps - which shouldn't be that huge - like having defined agents in plugins the way Claude Code does, it's actually the preferable product. It's faster in every way and seems to manage context a lot better: compaction isn't an end-of-the-world event to be avoided at all costs. With the addition of defined thinking, the fact that it actually seems to follow tool-calling instructions, its handling of permissions, and other features, it's frankly a better tool overall. 5.5 seems to be a reasonable model.
Anthropic seems to have really killed their advantage, squandering the immense goodwill they built up, by blundering over and over again with the developer community these last few months.
Tonight, for instance, after the incident had recovered, I restarted my work. On my Max account, my usage allowance was completely exhausted by 4 minutes of Sonnet subagent work. This was long after prime time, and the workload was a fraction of what I normally do.
These days I run Codex concurrently and have gotten my marketplaces and plugins and MCPs adapted to it - other than the agents, which I do lean heavily on - and generally find it a capable replacement. Anthropic needs to take notice and get their house in order.
Microsoft specializes in taking successful products and pumping them full of malware, spyware, bloatware, and adware once they have a critical mass of users. It is often preceded by quality dropping significantly due to under investment and McKinsey being brought in to find a way to prop up declining revenues - of course the answer is never to invest in making it a superior product again, but monetization strategies.
Two 9’s? You have to work pretty hard to do that badly. That’s like bragging you graduated with a C average from Harvard after your father endowed a chair to get you in.
Given that GitHub has become a utility service globally, this should frankly be worrisome to everyone, not just the developer community actively using it. It's intertwined into many things now beyond simply source code hosting and PRs. And I am surprised GitHub leadership is ok with the state of things. Having worked at a lot of 5-6 9's shops, this would have been all-hands-on-deck, all-roadmaps-paused, figure-it-out-or-perish sorts of stuff.
We don't have to let it be a utility service. It's not like the power and water to your house where laying new pipes is a monumental and stupid effort. $3 per month can get you a VPS to run your git hosting on - if you even need git hosting, and aren't just using GitHub because it's there.
Gemini helped them build it but didn't/couldn't attribute it to its corpus. I think we will see a surge of "rediscovery": unattributed surfacing, via training data, of prior work that wasn't widely recognized at the time.
Gemini is perfectly capable of searching the web. Pretty good at it really. As are most agents. If such a surge happens, it’s purely because of laziness.