
If there's one common thread across LLM criticisms, it's that they're not perfect.

These critics don't seem to have learned the lesson that the perfect is the enemy of the good.

I use ChatGPT all the time for academic research. Does it fabricate references? Absolutely, maybe about a third of the time. But has it pointed me to important research papers I might never have found otherwise? Absolutely.

The rate of inaccuracies and falsehoods doesn't matter. What matters is whether it saves you time and increases your productivity. Verifying the accuracy of its statements is easy, while finding the knowledge it spits out in the first place is hard. The net balance is a huge positive.

People are bullish on LLMs because they can save you days' worth of work, day after day. My research productivity has gone way up with ChatGPT -- asking it to explain ideas, related concepts, relevant papers, and so forth. It's amazing.



> Verifying the accuracy of its statements is easy.

For a single statement, sometimes, but not always. For all of the many statements, no. Having the human attention and discipline to mindfully verify every single one without fail? Impossible.

Every software product/process that assumes the user has superhuman vigilance is doomed to fail badly.

> Automation centaurs are great: they relieve humans of drudgework and let them focus on the creative and satisfying parts of their jobs. That's how AI-assisted coding is pitched [...]

> But a hallucinating AI is a terrible co-pilot. It's just good enough to get the job done much of the time, but it also sneakily inserts booby-traps that are statistically guaranteed to look as plausible as the good code (that's what a next-word-guessing program does: guesses the statistically most likely word).

> This turns AI-"assisted" coders into reverse centaurs. The AI can churn out code at superhuman speed, and you, the human in the loop, must maintain perfect vigilance and attention as you review that code, spotting the cleverly disguised hooks for malicious code that the AI can't be prevented from inserting into its code. As qntm writes, "code review [is] difficult relative to writing new code":

-- https://pluralistic.net/2025/03/18/asbestos-in-the-walls/
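
To make the quoted "next-word-guessing" point concrete, here's a toy sketch in Python -- the scores are made up for illustration, not any real model's output -- of how a sampler picks the statistically most likely token. A plausible-looking wrong token wins whenever the model happens to rank it highly:

    import math
    import random

    # Hypothetical scores a model might assign to candidate next tokens --
    # illustrative numbers only, not real model output.
    next_token_scores = {"the": 8.2, "a": 7.9, "teh": 3.1}

    def sample_next_token(scores, temperature=1.0):
        # Softmax the scores into probabilities, then draw one token at
        # random, weighted by probability: the statistically most likely
        # guess usually wins, but plausible runners-up get picked too.
        exps = {tok: math.exp(s / temperature) for tok, s in scores.items()}
        total = sum(exps.values())
        weights = [e / total for e in exps.values()]
        return random.choices(list(exps), weights=weights)[0]

    print(sample_next_token(next_token_scores))  # usually "the", sometimes "a"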


> Having the human attention and discipline to mindfully verify every single one without fail? Impossible.

I mean, how do you live life?

The people you talk to in your life say factually wrong things all the time.

How do you deal with it?

With common sense, a decent bullshit detector, and a healthy level of skepticism.

LLMs aren't calculators. You're not supposed to rely on them to give perfect answers. That would be crazy.

And I don't need to verify "every single statement". I just need to verify whichever part I need to use for something else. I can run the code it produces to see if it works. I can look up the reference to see if it exists. I can Google the particular fact to see if it's real. It's really very little effort. And the verification is orders of magnitude easier and faster than coming up with the information in the first place, which is what makes LLMs so incredibly helpful.


> I just need to verify whichever part I need to use for something else. I can run the code it produces to see if it works. I can look up the reference to see if it exists. I can Google the particular fact to see if it's real. It's really very little effort. And the verification is orders of magnitude easier and faster than coming up with the information in the first place, which is what makes LLMs so incredibly helpful.

Well put.

Especially this:

> I can run the code it produces to see if it works.

You can get it to generate tests (and easy ways for you to verify correctness).
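
For instance -- a minimal sketch, where slugify stands in for whatever function the LLM actually produced, and the asserts are the kind of quick spot checks you might write yourself (or have it generate):

    import re

    def slugify(title):
        # Pretend this function came from the LLM.
        slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
        return slug.strip("-")

    # Cheap spot checks -- far faster to write than the function was to
    # research, and they catch plausible-looking bugs before you trust it.
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  already-slugged  ") == "already-slugged"
    assert slugify("") == ""
    print("all spot checks passed")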


It's really funny how most anecdotes and comments about the utility and value of interacting with LLMs could just as well be anecdotes and comments about human beings themselves. The majority of people haven't realized yet that consciousness is assumed by our society, and that we, in fact, don't know what it is or whether we have it -- let alone whether we can ascribe it to another entity.


> Does it fabricate references? Absolutely, maybe about a third of the time

And you don't have concerns about that? What kind of damage is that doing to our society, long term, if we have a system that _everyone_ uses and it's just accepted that a third of the time it is just making shit up?


No, I don't. Because I know it does, and it's incredibly easy to type something into Google Scholar and see whether a reference exists.
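
You can even script that check. Here's a rough sketch against the Crossref API (the endpoint and the query.bibliographic parameter are real; treat the exact response shape as an assumption and adapt as needed):

    import requests  # third-party: pip install requests

    def lookup_reference(citation, rows=3):
        # Ask Crossref for the closest bibliographic matches to the citation.
        resp = requests.get(
            "https://api.crossref.org/works",
            params={"query.bibliographic": citation, "rows": rows},
            timeout=10,
        )
        resp.raise_for_status()
        items = resp.json()["message"]["items"]
        return [(item.get("DOI"), item.get("title", ["?"])[0]) for item in items]

    # A real citation should show up near the top of the hits; a fabricated
    # one usually returns only vaguely related papers.
    for doi, title in lookup_reference("Attention Is All You Need, Vaswani 2017"):
        print(doi, "-", title)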

Like, I can ask a friend and they'll mistakenly make up a reference. "Yeah, didn't so-and-so write a paper on that? Oh they didn't? Oh never mind, I must have been thinking of something else." Does that mean I should never ask my friend about anything ever again?

Nobody should be using these as sources of infallible truth. That's a bonkers attitude. We should be using them as insanely knowledgeable tutors who are sometimes wrong. Ask and then verify.

The net benefit is huge.


No, that doesn't mean you should never ask your friend things again if they make that mistake. But if 30% of all their references are made up, then you might start to question everything your friend says. And looking up references for every claim you read is not a productive use of time.


If my friend has a million times more knowledge than the average human being, then I'm willing to put up with a 30% error rate on references.

And I'm talking about references when doing deep academic research. Looking them up is absolutely a productive use of time -- I'm asking for the references so I can read them. I'm not asking for them for fun.

Remember, it's hundreds of times easier to verify information than it is to find it in the first place. That's the basic principle of what makes LLMs so incredibly valuable.


But how can you be sure the info is correct if it made up the reference? Where did it pull the info from? What good is a friend who's just bullshitting their way through every conversation, hoping you won't notice?

A third of the time is an insane number. If 30% of the code I wrote contained nonexistent headers, I would have been fired long ago.


A person who's bullshitting their way through doesn't get 70% accuracy. For yes/no questions they'll get 50%. For open-ended questions they'll be lucky to get 1%.

You're really underestimating the difficulty of getting 70% accuracy for general open-ended questions.

And while you might think you're better than 70%, I'm pretty sure that if you didn't run your code through compilers, linters, and tests at least a couple of times, it wouldn't get anywhere near 70% correct.


Because he reads the reference document…


"you might start to question everything your friend says"

That's exactly what the OP is saying. Verify everything.


Maybe I'm getting old, but sometimes it feels like everybody is young now, has only lived in a world where they can look up anything at a moment's notice, and now thinks those answers are infallible.

Having lived a decent chunk of my life pre-internet, or at least before fast, widely available internet, I look back at those days and realize just how often people were wrong about things. Old wives' tales, made-up statistics, imagined scenarios -- people really do confabulate a lot of information.


> And you don't have concerns about that? What kind of damage is that doing to our society, long term, if we have a system that _everyone_ uses and it's just accepted that a third of the time it is just making shit up?

The main problem with our society is that two thirds of what _everyone_ says is made-up shit / motivated reasoning. The random errors LLMs make are relatively benign, because there is no motivation behind them. They are just noise. Look past them.


I think a third of the facts I state are false as stated, and I don't think I'm worse than the 30th percentile of humans at truthfulness.


You are not a trusted authority relied on by millions and expected to make decisions for them, and you can choose not to say things you aren't sure you actually know.


You might be surprised to hear that people talk to other people and trust their judgements.


So, I've sometimes wondered about this.

Could it end up being a net benefit? Will the realistic-sounding but incorrect facts generated by A.I. make people engage with arguments more critically, and be less likely to believe random statements they're given?

Now, I don't know that this will happen, or even think it likely, but I find it an interesting thought experiment.


That's hilarious; I had no idea it was that bad. And for every conscientious researcher who actually runs down all the references to separate the 2/3 good from the 1/3 bad, how many will just paste them in, adding to the already sky-high pile of garbage out there?


This. 100% this.

LLMs will spit out responses with zero backing and 100% conviction. People see citations and assume they're correct. We're conditioned for it, thanks to... everything ever in history. Rarely do I need to check a Wikipedia entry's source.

So why do people not understand this: it is absolutely going to pour jet fuel on misinformation in the world. And we as a society are allowed to hold a higher bar for what we'll accept being shoved down our throats by corporate overlords who want their VC payout.


> People see citations and assume it's correct.

The solution is to set expectations, not to throw away one of the most valuable tools ever created.

If you read a supermarket tabloid, do you think the stories about aliens are true? No, because you've been taught that tabloids are sensationalist. When you listen to campaign ads, do you think they're true? When you ask a buddy about geography halfway across the world, do you assume every answer they give is right?

It's just about having realistic expectations. And people tend to learn those fast.

> Rarely do I need to check a wikipedia entry's source.

I suggest you start. Wikipedia is full of citations that don't back up the text of the article, and that's when there are citations to begin with. I can't count the number of times I've wanted to verify something on Wikipedia, and there either wasn't a citation, or there was one related to the topic that said nothing about the specific assertion being made.


people lie more



