
If there's one common thread across LLM criticisms, it's that they're not perfect.

These critics don't seem to have learned the lesson that the perfect is the enemy of the good.

I use ChatGPT all the time for academic research. Does it fabricate references? Absolutely, maybe about a third of the time. But has it pointed me to important research papers I might never have found otherwise? Absolutely.

The rate of inaccuracies and falsehoods doesn't matter. What matters is whether it saves you time and increases your productivity. Verifying the accuracy of its statements is easy, while finding the knowledge it spits out in the first place is hard. The net balance is a huge positive.

People are bullish on LLMs because they can save you days' worth of work, day after day. My research productivity has gone way up with ChatGPT -- asking it to explain ideas, related concepts, relevant papers, and so forth. It's amazing.



> Verifying the accuracy of its statements is easy.

For a single statement, sometimes, but not always. For all of the many statements, no. Having the human attention and discipline to mindfully verify every single one without fail? Impossible.

Every software product/process that assumes the user has superhuman vigilance is doomed to fail badly.

> Automation centaurs are great: they relieve humans of drudgework and let them focus on the creative and satisfying parts of their jobs. That's how AI-assisted coding is pitched [...]

> But a hallucinating AI is a terrible co-pilot. It's just good enough to get the job done much of the time, but it also sneakily inserts booby-traps that are statistically guaranteed to look as plausible as the good code (that's what a next-word-guessing program does: guesses the statistically most likely word).

> This turns AI-"assisted" coders into reverse centaurs. The AI can churn out code at superhuman speed, and you, the human in the loop, must maintain perfect vigilance and attention as you review that code, spotting the cleverly disguised hooks for malicious code that the AI can't be prevented from inserting into its code. As qntm writes, "code review [is] difficult relative to writing new code":

-- https://pluralistic.net/2025/03/18/asbestos-in-the-walls/
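
To make the quoted "next-word-guessing" point concrete, here's a toy sketch in Python -- the scores are made up for illustration, not any real model's output -- of how a sampler picks the statistically most likely token. A plausible-looking wrong token wins whenever the model happens to rank it highly:

    import math
    import random

    # Hypothetical scores a model might assign to candidate next tokens --
    # illustrative numbers only, not real model output.
    next_token_scores = {"the": 8.2, "a": 7.9, "teh": 3.1}

    def sample_next_token(scores, temperature=1.0):
        # Softmax the scores into probabilities, then draw one token at
        # random, weighted by probability: the statistically most likely
        # guess usually wins, but plausible runners-up get picked too.
        exps = {tok: math.exp(s / temperature) for tok, s in scores.items()}
        total = sum(exps.values())
        weights = [e / total for e in exps.values()]
        return random.choices(list(exps), weights=weights)[0]

    print(sample_next_token(next_token_scores))  # usually "the", sometimes "a"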


> Having the human attention and discipline to mindfully verify every single one without fail? Impossible.

I mean, how do you live life?

The people you talk to in your life say factually wrong things all the time.

How do you deal with it?

With common sense, a decent bullshit detector, and a healthy level of skepticism.

LLMs aren't calculators. You're not supposed to rely on them to give perfect answers. That would be crazy.

And I don't need to verify "every single statement". I just need to verify whichever part I need to use for something else. I can run the code it produces to see if it works. I can look up the reference to see if it exists. I can Google the particular fact to see if it's real. It's really very little effort. And the verification is orders of magnitude easier and faster than coming up with the information in the first place, which is what makes LLMs so incredibly helpful.


> I just need to verify whichever part I need to use for something else. I can run the code it produces to see if it works. I can look up the reference to see if it exists. I can Google the particular fact to see if it's real. It's really very little effort. And the verification is orders of magnitude easier and faster than coming up with the information in the first place, which is what makes LLMs so incredibly helpful.

Well put.

Especially this:

> I can run the code it produces to see if it works.

You can get it to generate tests (and easy ways for you to verify correctness).
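
For instance -- a minimal sketch, where slugify stands in for whatever function the LLM actually produced, and the asserts are the kind of quick spot checks you might write yourself (or have it generate):

    import re

    def slugify(title):
        # Pretend this function came from the LLM.
        slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
        return slug.strip("-")

    # Cheap spot checks -- far faster to write than the function was to
    # research, and they catch plausible-looking bugs before you trust it.
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  already-slugged  ") == "already-slugged"
    assert slugify("") == ""
    print("all spot checks passed")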


It's really funny how most anecdotes and comments about the utility and value of interacting with LLMs could just as well be anecdotes and comments about human beings themselves. The majority of people haven't realized yet that consciousness is assumed by our society, and that we, in fact, don't know what it is or whether we have it -- let alone whether we can ascribe it to another entity.


> Does it fabricate references? Absolutely, maybe about a third of the time

And you don't have concerns about that? What kind of damage is that doing to our society, long term, if we have a system that _everyone_ uses and it's just accepted that a third of the time it is just making shit up?


No, I don't. Because I know it does, and it's incredibly easy to type something into Google Scholar and see whether a reference exists.
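
You can even script that check. Here's a rough sketch against the Crossref API (the endpoint and the query.bibliographic parameter are real; treat the exact response shape as an assumption and adapt as needed):

    import requests  # third-party: pip install requests

    def lookup_reference(citation, rows=3):
        # Ask Crossref for the closest bibliographic matches to the citation.
        resp = requests.get(
            "https://api.crossref.org/works",
            params={"query.bibliographic": citation, "rows": rows},
            timeout=10,
        )
        resp.raise_for_status()
        items = resp.json()["message"]["items"]
        return [(item.get("DOI"), item.get("title", ["?"])[0]) for item in items]

    # A real citation should show up near the top of the hits; a fabricated
    # one usually returns only vaguely related papers.
    for doi, title in lookup_reference("Attention Is All You Need, Vaswani 2017"):
        print(doi, "-", title)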

Like, I can ask a friend and they'll mistakenly make up a reference. "Yeah, didn't so-and-so write a paper on that? Oh they didn't? Oh never mind, I must have been thinking of something else." Does that mean I should never ask my friend about anything ever again?

Nobody should be using these as sources of infallible truth. That's a bonkers attitude. We should be using them as insanely knowledgeable tutors who are sometimes wrong. Ask and then verify.

The net benefit is huge.


No, that doesn't mean you should never ask your friend things again if they make that mistake. But if 30% of all their references are made up, then you might start to question everything your friend says. And looking up references for every claim you read is not a productive use of time.


If my friend has a million times more knowledge than the average human being, then I'm willing to put up with a 30% error rate on references.

And I'm talking about references when doing deep academic research. Looking them up is absolutely a productive use of time -- I'm asking for the references so I can read them. I'm not asking for them for fun.

Remember, it's hundreds of times easier to verify information than it is to find it in the first place. That's the basic principle of what makes LLMs so incredibly valuable.


But how can you be sure the info is correct if it made up the reference? Where did it pull the info from? What good is a friend who's just bullshitting their way through every conversation, hoping you won't notice?

A third of the time is an insane number. If 30% of the code I wrote contained nonexistent headers, I would have been fired long ago.


A person who's bullshitting their way through doesn't get 70% accuracy. For yes/no questions they'll get 50%. For open-ended questions they'll be lucky to get 1%.

You're really underestimating the difficulty of getting 70% accuracy for general open-ended questions.

And while you might think you're better than 70%, I'm pretty sure that if you didn't run your code through compilers, linters, and tests at least a couple of times, it wouldn't get anywhere near 70% correct.


Because he reads the reference document…


"you might start to question everything your friend says"

That's exactly what the OP is saying. Verify everything.


Maybe I'm getting old, but sometimes it feels like everybody is young now, has only lived in a world where they can look up anything at a moment's notice, and now thinks those answers are infallible.

Having lived a decent chunk of my life pre-internet, or at least before fast, widely available internet, I look back at those days and realize just how often people were wrong about things. Old wives' tales, made-up statistics, imagined scenarios -- people really do confabulate a lot of information.


> And you don't have concerns about that? What kind of damage is that doing to our society, long term, if we have a system that _everyone_ uses and it's just accepted that a third of the time it is just making shit up?

The main problem with our society is that two thirds of what _everyone_ says is made-up shit / motivated reasoning. The random errors LLMs make are relatively benign, because there is no motivation behind them. They are just noise. Look past them.


I think a third of the facts I state are false as stated, and I don't think I'm worse than the 30th percentile of humans at truthfulness.


You are not a trusted authority relied on by millions and expected to make decisions for them, and you can choose not to say things you aren't sure you actually know.


You might be surprised to hear that people talk to other people and trust their judgements.


So, I've sometimes wondered about this.

Could it end up being a net benefit? Will the realistic-sounding but incorrect facts generated by A.I. make people engage with arguments more critically, and be less likely to believe random statements they're given?

Now, I don't know that this will happen, or even think it likely, but I find it an interesting thought experiment.


That's hilarious; I had no idea it was that bad. And for every conscientious researcher who actually runs down all the references to separate the 2/3 good from the 1/3 bad, how many will just paste them in, adding to the already sky-high pile of garbage out there?


This. 100% this.

LLMs will spit out responses with zero backing and 100% conviction. People see citations and assume they're correct. We're conditioned for it, thanks to... everything ever in history. Rarely do I need to check a Wikipedia entry's source.

So why do people not understand this: it is absolutely going to pour jet fuel on misinformation in the world. And we as a society are allowed to hold a higher bar for what we'll accept being shoved down our throats by corporate overlords who want their VC payout.


> People see citations and assume it's correct.

The solution is to set expectations, not to throw away one of the most valuable tools ever created.

If you read a supermarket tabloid, do you think the stories about aliens are true? No, because you've been taught that tabloids are sensationalist. When you listen to campaign ads, do you think they're true? When you ask a buddy about geography halfway across the world, do you assume every answer they give is right?

It's just about having realistic expectations. And people tend to learn those fast.

> Rarely do I need to check a wikipedia entry's source.

I suggest you start. Wikipedia is full of citations that don't back up the text of the article, and that's when there are citations to begin with. I can't count the number of times I've wanted to verify something on Wikipedia, and there either wasn't a citation, or there was one related to the topic that said nothing about the specific assertion being made.


people lie more



