More

zulban · 2026-03-28T17:24:15 1774718655

Generally, published papers don't give a damn about reproducibility. I've seen it identified as a crisis by many. Publishers, reviewers, and researchers mostly don't care about that level of basic rigor. There's no professional repercussions or embarrassment.

Agreed - if I was a reviewer for LLM papers it would be an instant rejection not listing the versions and prompts used.

epistasis · 2026-03-28T17:44:12 1774719852

I'm not so sure of that opinion on reproducibility. The last peer review I did was for a small journal that explicitly does not evaluate for high scientific significance, merely for correctness, which generally means straightforward acceptance. The other two reviews were positive, as was mine, except I said that the methods need to be described more and ideally the code placed somewhere. That was enough for a complete rejection of the paper, without asking for the simple revisions I requested. It was a very serious action taken merely because I requested better reproducibility!

(Personally I think the lack of reproducibility comes back mostly to peer reviewers that haven't thought through enough about the steps they'd need to take to reproduce, and instead focus on the results...)

zulban · 2026-03-28T19:25:41 1774725941

I'm not sure how one example contradicts documented huge overall trends, but okay.

epistasis · 2026-03-28T19:31:24 1774726284

I think publishers care about this a lot, but most researchers do not seem to care as much about reproducibility.

catlifeonmars · 2026-03-28T18:41:43 1774723303

> and instead focus on the results...

This points to (and everyone knows this) incentives misalignment between the funders of research and the public. Researchers are caught in the middle

epistasis · 2026-03-28T19:28:58 1774726138

Eh, I'm not so sure about the funding side there, researchers are not really caught at all and are fully responsible, IMHO. Peer reviewers exist to enforce community standards, and are not influenced to avoid reproducibility concerns by funding sources. The results are always more interesting than reproducibility, of course, and I think that's why the get the attention! Also, there needs to be greater involvement of grad students (who do most of the actual work) in peer review, IMHO, because most PIs spend their day in meetings reviewing results, setting directions, writing grants, and have little time for actual lab work, and are thus disconnected from it.

There needs to be more public naming and shaming in science social media and in conference talks, but especially when there are social gatherings at conferences and people are able to gossip. There was a bit of this with Google's various papers, as they got away with figurative murder on lack of reproducibility for commercial purposes. But eventually Google did share more.

Most journals have standards for depositing expensive datasets, but that's a clear yes/no answer. Reproducibility is a very subjective question in comparison to data deposition, and must be subjectively evaluated by peer reviewers. I'd like to see more peer review guidelines with explicit check boxes for various aspects of reproducibility.

catlifeonmars · 2026-03-29T18:13:07 1774807987

> Reproducibility is a very subjective question in comparison to data deposition

Yeah I can definitely see why this is the case because it isn’t real until someone actually tries to reproduce the results. At that point it leaves the realm of subjectivity and becomes a question of cost.

bjourne · 2026-03-28T21:52:23 1774734743

The comment is wrong -- model versions are clearly specified in the supplement.

ghywertelling · 2026-03-28T18:43:28 1774723408

The same about surveys and polls. I know no one who has ever been polled or surveyed. When will we stop this fascination with made up infographics crisis?

inetknght · 2026-03-28T19:20:05 1774725605

> Generally, published papers don't give a damn about reproducibility

While this is sadly true, it's especially true when talking about things that are stochastic in nature.

LLMs outputs, for example, are notoriously unreproducible.

zulban · 2026-03-28T19:24:13 1774725853

> LLMs outputs, for example, are notoriously unreproducible.

Only in the same way that an individual in a medical study cannot be "reproduced" for the next study. However the overall statistical outcomes of studying a specific LLM can be reproduced.

KellyCriterion · 2026-03-28T17:50:39 1774720239

Do they reproduce any submitted papers at all?

Does this happen?

I can remember this room-temperature-super-conductor guy whose experiments where replicated, but this seems rare?

linhns · 2026-03-28T18:54:38 1774724078

Yes, those are the only papers that worth a jot of reading.

zulban · 2026-03-23T13:28:05 1774272485

Not comfortable. But making choices in the real world is about choosing the best option, not the perfect option.

zulban · 2026-03-19T13:27:18 1773926838

I've learned a bit today about how often people on hn read the article when commenting. Or potentially bots who are way off. The title alone isn't enough to totally grasp what happened here, or the methods used.

Extremely conservative detection. The real number must be much higher.

zulban · 2026-03-14T13:48:27 1773496107

But then if the AI is detected that person can be permanently banned. No more AI. No new accounts.

hrimfaxi · 2026-03-14T14:14:11 1773497651

So if someone compromises your identity they can unperson you? How will the AI be detected? By another AI?

zulban · 2026-03-14T16:28:03 1773505683

"So if someone compromises your identity they can unperson you?"

You've identified a problem that unrelated systems also have. Like banks and identity theft. This solution isn't responsible for causing that problem.

"How will the AI be detected? By another AI?"

However a platform likes to. Let the best platform win.

zulban · 2026-03-07T14:55:41 1772895341

> I love it. It feels like it did back then. I’m chasing the midnight hour and not getting any sleep.

I highly recommend this blog post about vibe coding, gambling, and flow. Glad you're having a great time! Just something to consider.

https://www.fast.ai/posts/2026-01-28-dark-flow/

zulban · 2026-03-03T16:47:41 1772556461

I saw some research awhile ago that 60% of the time, "reject cookies" is ignored.

zulban · 2026-03-01T14:23:35 1772375015

That's not how enshittification, vendor lock in, and network effects work. You're participating in the collective delusion that we have perfect market competition.

simianwords · 2026-03-01T14:24:11 1772375051

explain the dynamics of how enshittification would work here?

zulban · 2026-03-01T14:26:58 1772375218

You won't get good answers asking to be spoonfed on a random discussion forum by strangers. If you're truly curious, look it up, maybe read a book by Cory Doctorow.

simianwords · 2026-03-01T14:30:59 1772375459

he's a charlatan

zulban · 2026-03-01T14:49:03 1772376543

You seem very confident for someone who just asked about enshittification basics. You're being dishonest.

simianwords · 2026-03-01T14:51:18 1772376678

you think i was asking you about the basics but i was asking about how the dynamics would work in this context - which you couldn't so you resorted to some insults.

WarmWash · 2026-03-01T14:31:28 1772375488

People stay on the shitty platform because it is convenient and still does what they need.

See extreme-enshitification-of-already-shitty Windows vs free Linux.

simianwords · 2026-03-01T14:34:50 1772375690

free linux is a way worse product

SoftTalker · 2026-03-01T17:01:44 1772384504

Linux is becomming enshitified, at least the big distros. Snaps, some would argue systemd, wayland, etc. Continually requiring more and more resources just to install and run.

zulban · 2026-02-20T17:26:02 1771608362

If your roommates leave your rent doesn't go down. Infrastructure can't be trivially cut in half.

zulban · 2026-01-26T18:03:18 1769450598

What corporate norms are notably different in this context?

zulban · 2026-01-15T18:36:01 1768502161

If you don't need to design a new product, you can focus on execution.

You may want to learn about design and novelty. Some people just want to learn about execution.