>> Our Mars robots are awesome, but they take years to accomplish what astronauts could do in days.
What? The unmanned space program has been beyond the edges of our solar system. Meanwhile humans have been day tourists in space. I don't know how you can come to the conclusion that "humans > robots" when humans have never even been close to the surface of Mars.
>> Even just a tiny temporarily occupied Mars science outpost would be a tremendous boost to our understanding of the planet
How many robots could we land with the equivalent resources, or telescope satellites, or autonomous probes?
I spent about a week doing an "experiment" greenfield app. I saw 4 types of issues:
0. It runs way too fast and far ahead. You need to slow it down, force planning only, and have it explicitly present a multi-step plan (i.e. numbered) and say "we'll do #1 first, then do the rest in future steps".
take-away:
This is likely solved with experience and changing how I work - or maybe caring less? The problem is the model can produce much faster than you can consume, but it runs down dead ends that destroy YOUR context. I think if you were running a bunch of autonomous agents this would be less noticeable, but impact 1-3 negatively and get very expensive.
1. lots of "just plain wrong" details. You catch this developing or testing because it doesn't work, or you know from experience it's wrong just by looking at it. Or you've already corrected it and need to point out the previous context.
take-away:
If you were vibe coding you'd solve all these eventually. Addressing this with "MORE AI" would probably help (i.e. AI to play/validate, etc.).
2. Serious runtime issues that are not necessarily bugs. Examples: it made a lot of client-side API endpoints public that didn't even need to exist, or at least needed to be scoped to the current auth. It missed basic filtering and SQL clauses that constrained data. It hardcoded important data (but not necessarily secrets) like ports, etc. It made assumptions that worked fine in development but could be big issues in public.
take-away:
AI starts to build traps here. Vibe coders are in big trouble because everything works but that's not really the end goal. Problems could range from 3am downtime call-outs to getting your infrastructure owned or data breaches. More serious: experienced devs who go all-in on autonomous coding might be three months from their last manual code review and be in the same position as a vibe coder. You'd need a week or more to onboard and figure out what was going on, and fix it, which is probably too late.
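The missing-auth-scoping / missing-WHERE-clause trap in #2 is easy to show concretely. A minimal sketch, with an entirely hypothetical schema and function names (not from the actual project):

```python
import sqlite3

def setup():
    # Toy in-memory table standing in for real multi-tenant data.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE invoices (id INTEGER, user_id INTEGER, total REAL)")
    db.executemany("INSERT INTO invoices VALUES (?, ?, ?)",
                   [(1, 1, 9.99), (2, 2, 150.0), (3, 1, 42.0)])
    return db

# The kind of thing the agent generated: "works" in development,
# but returns every user's rows to whoever calls it.
def list_invoices_unscoped(db):
    return db.execute("SELECT id, total FROM invoices").fetchall()

# The fix: constrain every query to the current auth context.
def list_invoices(db, current_user_id):
    return db.execute(
        "SELECT id, total FROM invoices WHERE user_id = ?",
        (current_user_id,),
    ).fetchall()
```

Both versions pass a casual click-through test, which is exactly why this class of problem survives vibe coding.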
3. It made (at least) one huge architectural mistake (this is a pretty simple project so I'm not sure there's space for more). I saw it coming but kept going in the spirit of my experiment.
take-away:
TBD. I'm going to try to use AI to refactor this, but it is non-trivial. It could take as long as the initial app did to fix. If you followed the current pro-AI narrative you'd only notice it when your app started to intermittently fail - or when you got your cloud provider's bill.
I'm a product manager, and a lot of the things I see people do wrong come from not having any product management experience. It takes quite a bit of work to develop a really good theory of what should be in your functional spec. Edge cases come up all the time in real software engineering, and often handling all those cases is spread across multiple engineers. A good product manager has a view of all of it, expects many of those issues from the agent, and plans for coaching it through them.
I'm an engineer and I totally agree. Engineers + LLMs exacerbate the timeless problem of not understanding the reality behind the problem. Validating solutions against reality is hard and LLMs just hallucinate their way around unknowns.
I think that's an incredibly reductionist and sarcastic take. I'm also in Product, but was an engineer for over a decade prior. I find that having strong structured functional specifications and a good holistic understanding of the solution you're trying to build goes a long way with AI tooling. Just like any software project, eliminating false starts and getting a clear set of requirements up front can minimize engineering time required to complete something, as long as things don't change in the middle. When your cycle time is an afternoon instead of two quarters, that type of up front investment pays off much better.
I still think AI tooling is lacking, but you can get significantly better results by structuring your plans appropriately.
I understand that you are serious. I am also serious here.
Have you built anything purely with LLM which is novel and is used by people who expect that their data is managed securely, and the application is well maintained so they can trust it?
I have been writing specifications, rfcs, adrs, conducting architecture reviews, code reviews and what not for quite a bit of time now. Also I’ve driven cross organisational product initiatives etc. I’m experimenting with openspec with my team now on a brownfield project and have some good results.
Having said all that, I believe that if you treat the English-language spec and your PM oversight as the sole QA pillars over a stochastic transformer model, you are making a mistake.
The issue is that validation needs human presence, and that is the limiting factor - common knowledge, but it's part of the "physics". Also, maintenance gets really tricky if the codebase has warts in it - which it will. I get much easier-to-understand architecture out of an LLM-driven code generation process if I follow it and course-correct / update the spec process based on learnings.
Example: yesterday I introduced a batch job and realized during the implementation phase that some refactoring was needed so the error boundary from the main backend could be reused in the batch application. This was unplanned and definitely not a functional requirement - it could be documented as non-functional. There was a gap between the agent's knowledge and mine even though the error handling pattern is well documented in the repository. Of course this can be documented better next time if we update the openspec writing process, but having these gaps is inevitable unless formal and semi-formal definitions are introduced - and even then there needs to be someone with "fresh eyes" in the loop.
I think it's just sarcasm coming from the stereotypical HN attitude that Product Managers only get in the way of the real work of engineering. Ignore it; they're basically proving your point.
it's past the end stage; we are already in business. It's just something I am not an expert in, have used in the past (by having real ops engineers build it for me), and now I have something that gives us insight into our production stack, alerts, etc., that isn't janky and covers my goals. So... yeah, that is valuable and improves my business.
receptionist as a service has been a thing for like... forever. You are never going to solve the problem of accurately estimating and quoting with AI or an answering service, so pay for someone to answer the phone and take down the details; have a mechanic or trained service rep review and estimate. Cheap code that doesn't solve the problem is not cheap.
Yes, of course. The bot can request information and the customer can provide it if they feel like it, and then someone qualified can call them back when they have their hands free.
But there's no bot, per se, needed at all. An answering machine from 1993 can do this same information-gathering job. :)
So update the device from 1993's new-fangled digital answering machine to 2009's Google Voice, and have it do the transcription from voicemail to text.
Someone will still have to call Bill back about his Honda (which is actually the Kia he bought for his daughter -- Bill is not a very technical guy these days[1] and he confuses such concepts regularly) in order to get any trading of money for services done.
It doesn't take an LLM to get there, and Bill would probably prefer to avoid being frustrated by the bot's insistent nature.
Look, you're pushing at an open door.
I think LLMs applied like this are just a layer of complexity that is mostly replacing lower-level programming solutions that could do the same thing.
The transcription + callback loop is honestly underrated. Most of the value here is just capturing intent accurately ("Honda" vs "Kia" aside) so the mechanic can prioritize callbacks. A dumb voicemail-to-text pipeline handles that fine. The LLM layer adds complexity without solving the actual bottleneck, which is someone qualified picking up the phone.
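For what it's worth, the "dumb" pipeline really is tiny. A minimal sketch, where `transcribe` stands in for whatever speech-to-text service you'd plug in (Google Voice was doing that step back in 2009):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Callback:
    caller: str
    number: str
    transcript: str

class CallbackQueue:
    """Voicemail -> text -> a list for a human. No LLM, no triage."""

    def __init__(self, transcribe: Callable[[bytes], str]):
        self.transcribe = transcribe
        self.pending: list[Callback] = []

    def on_voicemail(self, caller: str, number: str, audio: bytes) -> None:
        # Capture intent accurately and attach caller metadata;
        # prioritization stays with the qualified human reading the list.
        self.pending.append(Callback(caller, number, self.transcribe(audio)))
```

Everything past this point - deciding who gets called back first - is the part that actually needs a mechanic.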
But I'm not sure that a bot can be trusted to make good decisions about priority, either. So even if it makes good decisions based on context (which it can increasingly-often do, but does not always do), it lacks the context that is necessary to form the basis of good decisions.
Suppose a message comes into the box with this form: "This is Wendy, can you call me? My car is making that noise again."
The bot might deprioritize that call because it lacks actionable contextual information. "My job as a bot is to get more jobs into the shop. This call does not have enough data to do that, so I'll shove it to the bottom of the list of callbacks, behind more-actionable jobs."
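To make that failure mode concrete, here is a toy scorer - entirely hypothetical, standing in for the bot's triage logic - that ranks transcripts by "actionable" keywords and buries exactly the call that matters:

```python
# Hypothetical keyword-based triage: more "actionable" words = higher priority.
ACTIONABLE = {"brake", "oil", "quote", "appointment", "estimate", "tire"}

def bot_priority(transcript: str) -> int:
    words = {w.strip(".,?!").lower() for w in transcript.split()}
    return len(words & ACTIONABLE)

calls = [
    "Need a quote for brake pads and an oil change appointment.",
    "This is Wendy, can you call me? My car is making that noise again.",
]

# Wendy's message contains no keywords, so it scores zero and lands
# at the bottom of the callback list - the context that makes it
# urgent exists only in the mechanic's head, not in the transcript.
ranked = sorted(calls, key=bot_priority, reverse=True)
```

Any scoring scheme that only sees the transcript has the same blind spot; the keywords here are just the simplest way to show it.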
But the mechanic? The mechanic knows Wendy's Ford very well, and he also knows Wendy. She's been a good customer for over a decade. The mechanic also knows the noise, and that Wendy has 3 little kids and that she's vacationing 900 miles away on a road trip with those kids in that Ford. The context is all there inside the mechanic's brain to combine and mean that this might be the highest-priority call he gets all week.
Wendy may not have actively relayed any urgency in her message, but the urgency is real and she needs to be called back right away. She needs answers about what to do (keep driving and look into it when she gets back? pull over immediately and get a tow to a decent local shop? maybe she even needs help finding such a shop?) pretty much immediately. Not because it means more business today, but because it means more business for years.
The mechanic can spot this from a list of transcripts in an instant and give her a ring back Right Now. The bot is NFG at this.
The addition of the bot only adds noise to the process, and that noise only works to Wendy's detriment. When the bot adds detrimental noise to Wendy's situation, it also adds detriment to the shop's longevity.
The presence of the bot -- even as a prioritizing sorting mechanism -- asymptotically shifts the state from an excellent shop that knows their customers very well to a bot-driven customer-averse hellscape.
(And no, the answer isn't to make the bot into an all-knowing oracle that actively gets fed all context. The documentation burden would be more expensive, time-wise (and thus money-wise) than hiring a competent human receptionist who answers the phone, handles the front door traffic, and absorbs context from their surroundings. A person who chatted with Wendy last Thursday right before she left for her trip is always going to be superior to a bot.)
ah yes, those fat cat ranchers might have to get off their golden thrones and do some hard work for a change. You should maybe look into the business as both a rancher and the food supply chain. A big benefit is that ranchers are far better partners and stewards of the land than developers and other industries (like oil and gas).
If you think ranching hasn't changed in 2,000 years you know nothing about it. First, what we see in Canada and the US is most similar to Spanish open grazing of ~200 years ago, not some sort of neolithic practice from several thousand years ago. Then the obvious game changer was barbed wire, and now intensive industrialization such as feed lots, genetic selection and artificial insemination, GPS tracking, and data-based herd management. Public grazing is such a minor part of the picture now. The technology you call for is IMO the worst development: factory meat and massive consolidation.
doesn't look like much; they seem to use Electron for almost everything in this space. If they had faith in MAUI, something (VS Code, Teams, Outlook, ... calculator?) would use it.
and even worse, in Edge!