Hacker News

A compiler uses rigorous modeling and testing to ensure that generated code is semantically equivalent. It can do this because it is translating from one formal language to another.

Translating a natural-language prompt, on the other hand, requires the LLM to make thousands of small decisions that will be different each time you regenerate the artifact. Even ignoring non-determinism, prompt instability means that any small change to the spec will result in a vastly different program.
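The instability claim can be illustrated with a toy model (not a real LLM, and `toy_generate` is a made-up name): seed a PRNG from a hash of the prompt, so generation is repeatable for a fixed prompt, but a one-word edit reseeds everything and yields an unrelated token stream.

```python
import hashlib
import random

def toy_generate(prompt: str, length: int = 20) -> list:
    # Toy stand-in for an LLM sampler (illustrative only): the prompt's
    # hash seeds the PRNG, so identical prompts give identical output,
    # but any edit to the prompt reseeds the generator entirely.
    seed = int.from_bytes(hashlib.sha256(prompt.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    vocab = ["push", "pop", "map", "fold", "sort", "merge", "scan", "zip"]
    return [rng.choice(vocab) for _ in range(length)]

a = toy_generate("sort the records by date ascending")
b = toy_generate("sort the records by date descending")  # one-word change

# Repeatable for a fixed prompt, but the one-word edit produces a
# mostly unrelated sequence rather than a small local diff.
print(sum(x == y for x, y in zip(a, b)), "of", len(a), "tokens coincide")
```

A compiler given a one-token source change produces a correspondingly local change in its output; this toy shows the opposite regime, where a small input edit decorrelates the entire output.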

A natural language spec and test suite cannot be complete enough to encode all of these differences without being at least as complex as the code.

Therefore each time you regenerate large sections of code without review, you will see scores of observable behavior differences that will surface to the user as churn, jank, and broken workflows.

Your tests will not encode every user workflow, not even close. Ask yourself if you have ever worked on a non-trivial piece of software where you could randomly regenerate 10% of the implementation while keeping to the spec without seeing a flurry of bug reports.

This may change if LLMs improve such that they are able to reason about code changes to the degree a human can. As of today they cannot do this and require tests and human code review to prevent them from spinning out. But I suspect at that point they’ll be doing our job, as well as the CEOs’, and we’ll have bigger problems.


I don't see a world where a motivated soul can build a business from a laptop and a token service as a problem. I see it as opportunity.

I feel similarly about Hollywood and the creation of media. We're not there in either case yet, but we will be. That's pretty clear. And when I look at the feudal society that is the entertainment industry here, I don't understand why so many of the serfs are trying to perpetuate it in its current state. And I really don't get why engineers think this technology is going to turn them into serfs unless they let that happen to themselves. If you can build things, AI coding agents will let you build faster and more for the same amount of effort.

I am assuming given the rate of advance of AI coding systems in the past year that there is plenty of improvement to come before this plateaus. I'm sure that will include AI generated systems to do security reviews that will be at human or better level. I've already seen Claude find 20 plus-year-old bugs in my own code. They weren't particularly mission critical but they were there the whole time. I've also seen it do amazingly sophisticated reverse engineering of assembly code only to fall over flat on its face for the simplest tasks.


That depends on how fast that change happens. If 45% of jobs evaporate in a 5-year period, a complete societal collapse is the likely outcome.

Sounds like influencer nonsense to me. Touch grass. If the people are fed and housed, there's no collapse. And if the billionaire class lets them starve, they will finally go through some things just like the aristocracy in France once did. And I think even Peter Thiel is smarter than that. You can feed yourself for <$1000 a year on beans and rice. Not saying you'd enjoy it, but you won't starve. So for ~$40B annually, the billionaires buy themselves revolution insurance. Fantastic value.

OTOH if what you're really talking about is the long-term collapse in our ludicrous carbon footprint when we finally run out of fossil fuels and we didn't invest in renewables or nuclear to replace them, well, I'm with you there.


>Sounds like influencer nonsense to me. Touch grass.

I don't even know what this means.

The worst unemployment during the Weimar Republic was 25-30%. Unemployment in the Great Depression peaked at 25%.

So yeah, if we get to 45% unemployment and the jobs lost are the highest-paying ones on average, it's gonna be bad. Then you add in second-order effects, where none of those people have the money to pay the other 55% who are still employed.

We might get to a UBI relatively quickly and peacefully. But I'm not betting on it.

>finally go through some things just like the aristocracy in France once did.

Yeah, that's probably the most likely scenario, but that quickly devolved into death and imprisonment for far more than the aristocrats, and eventually ended with Napoleon trying to take over Europe and millions of deaths overall.

The world didn't literally end, but it was 40 years of war, famine, disease, and death, and not a lot of time to think about starting businesses with your laptop.


And the Dark Ages lasted a millennium. Sounds like quite an improvement on that. And if America didn't want a society hellbent on living the worst possible timeline, why did it re-elect President Voldemaga and give him the football? And then, even as he breaks nearly every political promise, his support remains better than his predecessor's? Anyway, I think the richest ~1135 Americans won't let you starve, but they'll be happy to watch you die young of things that had stopped killing people for quite some time whilst they skim all the cream. And that seems to be what the plurality wants, or they'd vote differently.

The good news is that America is ~5% of the world. And the more we keep punching ourselves in the face, the better the chance someone else pulls ahead. But still, we have nukes, so we're still the town bully for the immediate future.


What are you even arguing about? I have absolutely no idea where you are going with this.

Yeah I figured that. You think society is going to collapse because of AI. I don't. But I do think that stupid narrative is prevalent in the media right now and the C-suite happily proclaiming they're going to lay people off and replace them with AI got the ball rolling in the first place. Now it has momentum of its own with lunatics like Eliezer Yudkowsky once again getting taken seriously.

Fortunately, the other 95% of humanity is far less doomer about their prospects. So if America wants to be the new Neanderthals, they'll be happy to be the new Cro-Magnons.


I don't think society is going to collapse because of AI, because I don't think the current architectures have any chance of becoming AGI. I think that if AGI is even something we're capable of, it's very far off.

I think that if CEOs can replace us soon, it's because AGI got here much sooner than I predicted. And if that happens we have two options, Mad Max and Star Trek, and Mad Max is the more likely of the two.


What's with all the catastrophic thinking then? Mad Max? Collapse of society because of 45% unemployment? I really hate people on principle, but I have more faith in them looking out for their own self-interest than you do, apparently. Mad Max specifically requires a ridiculous amount of intact infrastructure for all the gasoline (you know gasoline goes bad in 3-6 months? Yeah didn't think so), manufacturing for all the parts for all those crazy custom-built road-warrior wagons, and ranches of livestock for all the leather for all the cool outfits (and with all that cow, no one needs to starve, but oh, the infrastructure needed to keep the cows fed).

If doom porn is your thing, try watching Threads or The Day After, especially Threads. That said, I don't think Star Trek is possible, maybe The Expanse but more likely we run out of cheap energy before we get off world.

As for the AGI, it all depends on your definition. We're already at Amazon IC1/IC2 coding performance with these agents (I speak from experience previously managing them). If we get to IC3, one person will be able to build a $1B company and run it or sell it. If you're a purist like me and insist we stick to douchebag racist Nick Bostrom's superintelligence definition of AGI, then we agree. But I expect 24/7 IC3 level engineering as a service for $200/month to be more than enough and I think that's a year or two away. And you can either prepare for that or scream how the sky is falling, your choice.


>Mad Max specifically requires a ridiculous amount of intact infrastructure for all the gasoline (you know gasoline goes bad in 3-6 months? Yeah didn't think so)

Is this a joke or do you have a learning disability?

>But I expect 24/7 IC3 level engineering as a service for $200/month to be more than enough and I think that's a year or two away. And you can either prepare for that or scream how the sky is falling, your choice.

Or I could do neither and write you off as a gasbag who doesn't know what he's talking about like all the other ex-amazon management I've had the pleasure to work with over the years.


I guess you have a really short context buffer with all this frequently forgetting things you've said yourself.

But that aside, how's all that self-righteousness working out for you?


I bet you have ex-Amazon prominently in your LinkedIn profile.

Don't have a LinkedIn profile, don't need one. But I'm guessing you're listed under LinkedIn Lunatics.

I read back through a few of your posts and you’re either schizophrenic, or a very elaborate troll.

I know a few older people who started posting like this when they hit their 50s. I’ve only got a few years left. Hopefully I can avoid it, but maybe it’s inevitable.


Ageism: now that's a warrior's flex, amIRight?

People like myself, in their 50s to 60s, who had the experience of banging the metal on imperfect, buggy hardware late into the night to mine gems before Python made the entire software engineering community pivot to a core competency of syntax pedantry plus stringing library calls together, are having a real party with AI agents effectively doing the same thing they did 30 years ago. I personally never stopped coding, even through my one awful experience as an engineering manager.

But you do you, and hear me now, dismiss me later. There won't be 45% unemployment because the minute AI starts replacing current engineering skills for real is the minute the people it targets wake up and start learning how to work with AI coding agents that will be dramatically better than today. People resist change until there are no other options, just look at fossil fuels. The free market will work that one out too eventually.

And no amount of some nontechnical guy vibe scienceing his way to a working mRNA vaccine for his cancer-ridden dog or an engineer unlocking mods to Disney Infinity just from the binary and Claude Code or an entire web browser ported to rust will ever convince you these things are not the enemy. And that's going to put you through some things down the road. So of course, since this will never happen I'm an elaborate troll or a nutcase just like the people who pulled all those things off, never mind all the evidence mounting that these things can be amazing in the right hands. That's CRAZYTALK! Stochastic Parrot! Glorified Autocomplete! Mad Max! Mad Max! DLSS 5!


>You can feed yourself for <$1000 a year on beans and rice. Not saying you'd enjoy it, but you won't starve. So for ~$40B annually, the billionaires buy themselves revolution insurance. Fantastic value.

You are the epitome of the tech bro.


Sure, sure. Understanding how these sociopaths think clearly makes me a tech bro rather than someone who incorporates worst-case scenarios into my planning. Suggesting they would maintain minimum viable society to save their own asses means I'm in favor of it, right? This is why I work remotely.

Peter Thiel might be smarter than that but I’m not sure about the other ones.

Look at how Musk treated the Twitter devs, or Bezos any of his workers, or Trump anybody.


They're all quite intelligent. And they're world class experts in saving their own bacon. Doesn't mean they have any ethics though nor any emotional intelligence after decades of being surrounded by toadies and bootlickers.

Smart is not equal to intelligent.

You can be very intelligent but have a blind spot on some trivial things.

I’m certain that some of them think they are untouchable (or even just are well prepared). We will only see if that’s really true if shit hits the fan.


We all know they have bunkers and we roughly know where they are. I got suspended on reddit for threatening harm to others for saying that a couple weeks back. But I don't think we need to raid the bunkers in your TEOTWAWKI scenario; their bodyguards will do all the heavy lifting once they realize the power balance has shifted. But I also don't expect a SHTF scenario, just a slow, creeping enshittification of living standards instead of actually implementing a UBI.

And then the survivors who band together to rebuild community instead of chasing some idiotic Mad Max scenario will ultimately prevail. And yes, they are blind to that other option because they wouldn't end up on top.


>If you can build things, AI coding agents will let you build faster and more for the same amount of effort.

But you aren't building, your LLM is. Also, you are only thinking about the ways that you, a supposed builder, will benefit from this technology. Have you considered how all previous waves of new technologies have introduced downstream effects that have muddied our societies? LLMs are not unique in this regard, and we should be critical of those who are trying to force them into every device we own.


Would you say the general contractor for your home isn’t a builder because he didn’t install the toilets?

I think this argument would make more sense if you were talking about an architect, or the customer.

A contractor is still very much putting the house together.


The general contractor is not doing the actual building as much as he is coordinating all of the specialists, making sure things run smoothly, scheduling things based on dependencies, and coordinating with the customer. I’ve had two houses built from the ground up.

Three myself, and I have yet to meet a "vibe" contractor.

And he is also not inspecting every screw, wire, etc. He delegates.

Oh you're preaching to the choir. I think we are entering a punctuated equilibrium here w/r to the future of SW engineering. And the people who have the free time to go on to podcasts and insist AI coding agents can't do anything useful rather than learning their abilities and their limitations and especially how to wield them are going to go through some things. If you really want to trigger these sorts, ask them why they delegate code generation to compilers and interpreters without understanding each and every ISA at the instruction level. To that end, I am devoid of compassion after having gone through similar nonsense w/r to GPUs 20 years ago. Times change, people don't.

I haven’t stayed relevant and able to find jobs quickly for 30 years by being the old man shouting at the clouds.

I started my career in 1996 programming in C and Fortran on mainframes, and got my first, only, and hopefully last job at BigTech at 46, seven jobs later.

I’m no longer there. Every project I’ve had in the last two years has had classic ML and then LLMs integrated into the implementation. I have very much jumped on the coding agent bandwagon.


Started mine around the same time and yes, keeping up keeps one employed. What's disheartening, however, is how little keeping up the key decision makers and stakeholders at FAANG do, and it explains idiocy like already trying to fire engineers and replace them with AI. Hilarity ensued of course, because hilarity always ensues for people like that, but hilarity and shenanigans appear to be inexhaustible resources.

I would very much rather get a daily anal probe with a cactus than ever work at BigTech again, even knowing the trade-off: that I, now at 51, make the same as a 25-year-old L5 I mentored when they were an intern and during their first year back as an L4 before I left.

If you have FIRE money, getting off the hamster wheel of despair that is tech industry culture is the winning move. Well-played.

Not quite FIRE money. I still need to work for a while - I just don’t need to chase money. I make “enough” to live comfortably, travel like I want (not first class), and save enough for retirement (max out 401K + catch-up contributions + max out HSA + max out Roth).

We did choose to downsize and move to state-tax-free Florida.

If I have to retire before I’m 65, the exit plan is to move to Costa Rica (where we are right now for 6 weeks).


I think that's precisely his thinking, and don't let him know about all those fancy, expensive unitasker tools they have that you probably don't, which let them do it far more cost-effectively and better than the typical homeowner. Won't you think of the jerbs(tm)? And to Captain Dystopia: life expectancies were increasing monotonically until COVID. Wonder what changed?

I've struggled a bit with this myself. I'm having a paradigm shift. I used to say "but I like writing code". But like the article says, that's not really true. I like building things, the code was just a way to do that. If you want to get pedantic, I wasn't building things before AI either, the compiler/linker was doing that for me. I see this is just another level of abstraction. I still get to decide how things work, what "layers" I want to introduce. I still get to say, no, I don't like that. So instead of being the "grunt", I'm the designer/architect. I'm still building what I want. Boilerplate code was never something I enjoyed before anyway. I'm loving (like actually giggling) having the AI tie all the bits for me and getting up and running with things working. It reminds me of my Delphi days: File->New Project, and you're ready to go. I think I was burnt out. AI is helping me find joy again. I also disable AI in all my apps as well, so I'm still on the fence about several things too.

This resonates. I spent years thinking I enjoyed coding, but what I actually enjoy is designing elegant solutions built on solid architecture. Inventing, innovating, building progressively on strong foundations. The real pleasure is the finished product (is it ever really finished though?) — seeing it's useful and makes people's lives easier, while knowing it's well-built technically. The user doesn't see that part, but we know.

With AI, by always planning first, pushing it to explore alternative technical approaches, making it explain its choices — the creative construction process gets easier. You stay the conductor. Refactoring, new features, testing — all facilitated. Add regular AI-driven audits to catch defects, and of course the expert eye that nothing replaces.

One thing that worries me though: how will junior devs build that expert eye if AI handles the grunt work? Learning through struggle is how most of us developed intuition. That's a real problem for the next generation.


> A compiler uses rigorous modeling and testing to ensure that generated code is semantically equivalent.

Here are the reported miscompilation bugs in GCC so far in 2026. The ones labeled "wrong-code".

https://gcc.gnu.org/bugzilla/buglist.cgi?chfield=%5BBug%20cr...

I count 121 of them.


If you can’t understand the difference between a bug that will rarely cause a compiler encountering an edge case to generate a wrong instruction and an LLM that will generate 2 completely different programs with zero overlap because you added a single word to your prompt, then I don’t know what to tell you.

The point is that expert humans (the GCC developers) writing code (C++) that generates code (ASM) does not appear to be as deterministic as you seem to think it is.

I’m very aware of that, but I’m also aware that cases where the compiler doesn’t emit semantically equivalent code are rare enough that most people can ignore them. That’s not the case with LLMs.

I’m also not particularly concerned with non-determinism but with chaos. Determinism in LLMs is likely solvable, prompt instability is not.


Classic HN-ism. To focus on the semantics of a statement while ignoring the greater point in order to argue why someone is wrong.

I think it's a perfectly fine point. The OP said (my interpretation) that LLMs are messy, non-deterministic, and can produce bad code. The same is true of many humans, even those whose "job" is to produce clean, predictable, good code. The OP would like the argument to be narrowly about LLMs, but the bigger point even is "who generates the final code, and why and how much do we trust them?"

As of right now agents have almost no ability to reason about the impact of code changes on existing functionality.

A human can produce a 100k LOC program with absolutely no external guardrails at all. An agent can't do that. To produce a 100k LOC program, agents require external feedback to keep them from spiraling off into building something completely different.

This may change. Agents may get better.


I argued the greater point? Software code-generation is not deterministic, whether it's done by expert humans or by LLMs.

It has nothing to do with determinism. It's the difference between nearly perfectly but not quite perfectly translating between rigorously specified formal languages and translating an ambiguous natural language specification into a formal one.

The first is a purely mechanical process, the second is not and requires thousands of decisions that can go either way.


And that’s no different than human developers

The difference is that a human can reason about their code changes to a much higher degree than an AI can. If you don't think this is true and you think we're working with AGI, why would you bother architecting anything at all or building in any guardrails? Why not just feed the AI the text of the contract you're working from and let it rip?

You give way too much credit to the average mid-level ticket taker. And again, why do I care how the code does it as long as it meets the functional and non-functional requirements?

Because in a real application with real users all of the functional and non-functional requirements aren't documented anywhere but in the code.

If only a coding agent had access to your code…

You realize that coding agents aren’t AGI, right? They aren’t capable of reasoning about a code change’s impact on anything other than their immediate goal to anywhere near the level even a terrible human programmer is. That’s why we have the agentic workflow in the first place. They absolutely require guardrails.

Claude will absolutely change anything that’s not bolted to the floor. If you’ve used it on legacy software with users or if you reviewed the output you’d see this.


Compilers are some of the largest, most complex pieces of software out there. It should be no surprise that they come with bugs as all other large, complex pieces of software do.

This seems to apply easily to LLMs as language coprocessors that can output code. How long was it before people trusted compilers?

If you don't understand the difference between something that rigorously translates one formal language to another and something that will spit out a completely different piece of software with 0 lines of overlap based on a one-word prompt change, I don't know what to tell you.

"rigorously" is doing a lot of heavy lifting here.

Let's substitute rigorously with "in an extremely thorough, careful, and methodical way."

As if when you delegate tasks to humans they are deterministic. I would hope that your test cases cover the requirements. If not, your implementation is just as brittle when other developers come online or even when you come back to a project after six months.

1. Agents aren’t humans. A human can write a working 100k LOC application with zero tests (not saying they should but they could and have). An agent cannot do this.

Agents require tests to keep them from spinning out and your tests do not cover all of the behaviors you care about.

2. If you doubt that your tests fail to cover all your requirements, consider that 99.9% of the production bugs you’ve ever had completely passed your test suite.


I have never known a human that could or did write 100K lines of bug free working code without running parts of it first and testing.

So humans also don’t write bug free code or tests that cover all use cases - how is that an argument that humans are better?


Not that humans can't write 100k line programs bug free or without running parts of it.

An AI cannot write a 100k line program on its own without external guard rails otherwise it spins out. This has nothing to do with whether the agent is allowed to run the code itself. This is well documented. Look at what was required to allow Claude to write a "C compiler".

This has nothing to do with whether it's bug free. It literally can't produce a working 100k LOC program without external guardrails.


Absolutely no one is arguing that you shouldn’t have a combination of manual and automated tests around either AI- or human-generated code, or that you shouldn’t have a thoughtful design.

In a non-trivial app you can't test your way through all of the e2e workflows and thoughtful design isn't what I'm talking about.

How many bugs have you seen that passed your automated and manual testing? Probably 99.9% of them.

Now imagine that you take those same test suites and you unleash an agent on the code that has far worse reasoning capabilities than a human and you tell them they can change anything in the code as long as the tests pass.


So if bugs pass through testing which they have forever, wouldn’t that imply that humans are just as fallible as AI - and slower?

I never suggested letting agents code for a day on end. I use AI to code well defined tasks and treat it like a mid level ticket taker


If you have an employee who codes 2x faster than everyone else but produces 10x the bugs, would your suggestion to be to let him rip and stop reviewing his code output?

> I never suggested letting agents code for a day on end. I use AI to code well defined tasks and treat it like a mid level ticket taker

It doesn’t matter how long you’re letting it run. If you aren’t reviewing the output, you have no way of knowing when it changes untested behavior.

I regularly find Claude doing insane things that I never would have thought to test against, that would have made it into prod if I hadn’t reviewed the code.


> It doesn’t matter how long you’re letting it run. If you aren’t reviewing the output, you have no way of knowing when it changes untested behavior.

You’re focused on the output, I’m focused on the behavior. That’s the difference. Just like when I delegate a task to another developer, or to another company like the random Salesforce integration, or even a third-party API I need to integrate with.


Unfortunately you are not equipped to observe and test all or even most of the behavior of a non-trivial system.

And if you attempt to treat every module in your system like it’s untrusted 3rd party code you’ll run into severe complexity and size limits. No one codes large systems like that because it’s not possible. There are always escape hatches and entanglements.


Actually, a little company you might have heard of called Amazon does…

Jeff Bezos mandated it in 2002.

https://konghq.com/blog/enterprise/api-mandate

AWS S3 by itself is made up of 200+ micro services


Except that they don’t. The API mandate gets violated all the time. And no one at Amazon actually treats every other team as a 3rd party.

Have you worked at Amazon?

I am not saying they treat every other team as a third party. I am saying they treat the code itself as a black box with well-defined interfaces. They aren’t reaching into another service’s data store to retrieve information.


How do you know someone is ex-Amazon ? Don’t worry they’ll tell you.

I haven’t mentioned it 5 times already so you can be pretty sure I haven’t. I know too many people that have worked there to ever make that mistake.

But more importantly, have you worked there in the last decade?


Then you have no idea how Amazon works. I was there from 2020-2023.

And before that I worked at 4 product companies when new to the company managers /directors/CTOs needed someone to bring in best practices and build and teach teams.

I honestly wonder what type of “big ball of mud” implementations (https://blog.codinghorror.com/the-big-ball-of-mud-and-other-...) you have had to deal with that weren’t properly componentized and covered by tests.



