Yep. All people asked him to do was slow down a bit, because they felt it was too much change at once. He refused to slow down for any reason other than his own. He said he only saw three reasons to slow down, none of which applied, so Linus should just accept his patch now.
I never understand why some people are unwilling to make any attempt at getting along. Some people seem to feel any level of compromise is too much.
He justified breaking the guidelines to address critical issues. One can hope these kinds of problems would not happen that frequently in a stable project; besides, it is still experimental.
What he actually did was bundle the fix for the critical issue with debugging tools and a totally new experimental feature. I totally get why they stopped working with him.
It was still an entirely new and experimental feature which had not been properly reviewed. Why couldn't this feature wait until the next kernel version? Other file systems have had their recovery tools improved over many years.
Filesystems like ext2/3/4 have their recovery tools in userland. Most of the recovery features in bcachefs are in the kernel. As a result, it is inevitable that at some point there will be a need to push a new feature into a stable release for the purpose of data recovery.
Over the long term, the number of cases where such a response is needed should decrease.
Do you really want to live in a world where data losses in stable releases is considered Okay?
>it is inevitable that at some point there will be a need to push a new feature into a stable release for the purpose of data recovery
It's really not; the proper way to recover your important data is to restore from backups, not to force other people to bend longstanding rules for you.
>Do you really want to live in a world where data losses in stable releases is considered Okay?
If it really was so urgent, why didn't Kent just tell people to get those updates from his personal tree? There are rules in place if you want your stuff to get into Linus's tree; expecting Linus to pull whatever you send him without any resistance whatsoever is likely just going to end up with him deleting your project, just like what happened here.
Because distributions don't ship Kent's kernel tree, and they're not going to. Distributions like Fedora ship as close to mainline as possible these days because of the pain experienced from shipping a heavily patched kernel in the past. Release cycles are upwards of 3 months for Linus' tree. With that kind of lengthy release cycle, for an experimental codebase undergoing rapid stabilization it was the right call: you don't want old code to linger around longer than necessary when the pending changes are predominantly bug fixes that successfully pass regression tests. The choice should be with the maintainer.
No, it is not. bcachefs needs to have all the code for error recovery in the kernel, as it needs to be available when a storage device fails in any of a myriad of ways.
Maintaining a piece of code that needs to run in both user space and the kernel is messy and time-consuming. You end up running into issues where dependencies require porting gobs of infrastructure from the kernel into userspace. That's easy for some things, very hard for others. There's a better place to spend those resources: stabilizing bcachefs in the kernel, where it belongs.
Other people have tried and failed at this before, and I'm sure that someone will try the same thing again in the future and relearn the same lesson. I know because business requirements at a former employer resulted in such a beast. Other people thought they could just run their userspace code in the kernel, but they didn't know about limits on kernel stack size, or about contexts where blocking vs. non-blocking behaviour is required, or how that interacts with softirqs. Please, just don't do this or advocate for it.
The fact you got downvoted makes me shake my head. One could still interpret this as a contributor-guideline violation, and that's fair.
If I'm not mistaken, Kent pushed recovery routines in the RC to handle a catastrophic bug a user triggered by loading the current metadata format into an old 6.12 kernel.
It isn't some sinister "sneaking in of features". This fact seems to be omitted by the clickbaity coverage of the situation.
As I pointed out elsewhere, there was another -rc release put out shortly after that effectively added back in a feature that was removed 10 releases back. Granted, it was only a small thing, but it shows that there is nuance in application of the rule.
Except he had a history of rushing large changes in at the last minute that were always "critical", and he would constantly argue about policy at the same time, which is not the appropriate time or place.
He refused to acknowledge his place on the totem pole and thought he knew better than everyone else, and that they should change their ways to suit his whims.
The way I read it, it was wrapping feature and bug-fix changes together: "We found a critical bug here; to fix it, instead of backporting a fix we want to pull in all the code which was built before the bug was discovered."
I can understand the motivation. It's a PITA to support an older version of code. But that's not how Linux gets its stability.
He also had bad interactions with other developers, like constantly shitting on other file systems and generally behaving like a jerk completely unnecessarily.
Yeah, but then you run into scenarios where A+D is tested and ready, but B and/or C are not. Git does give you tools to separate them, but most people don't like doing that for various reasons.
IMHO, it may be more natural, but only during development. Trying to do a git bisect on git histories like the above is a huge pain. Trying to split things up when A is ready but B/C are not is a huge pain.
My takeaway from trying to follow the discussion on the kernel mailing list was that the Bcachefs developer wants to work in a certain way, and Linus does not think that fits in with the rest of the kernel (to put it mildly). Having Bcachefs in the kernel certainly helps with adoption, but I can't help thinking that an out-of-tree kernel module might be more in line with the development process that Bcachefs wants.
The underlying problem might have been importing Bcachefs into the mainline kernel too early in its life cycle.
No, this is a pure people problem which would have happened no matter the state of Bcachefs. Kent refuses to respect other people's time and rules since that would require him to change how he works.
> He justified breaking the guidelines to address critical issues
The claimed fix actually added new logging functionality to allow better troubleshooting, in order to eventually address critical issues.
This should have been kept out of trunk for someone to test, rather than presented as something it wasn't, strictly speaking. Especially when it's the kernel.
> I hope it eventually comes back once it is more stable.
Yes, me too.
> Would be great to have an in kernel alternative to ZFS
Yes it would.
> for parity RAID.
No.
Think of the Pareto principle here: 80% of people use only 20% of the functionality. But they don't all use the same 20%, so overall you need 80% of the functionality... or more.
ZFS is one of the rivals here.
But Btrfs is another. Stratis is another. HAMMER2 is another. MDRAID is another. LVM is another.
All provide some or all of that 20%, and all have pros and cons.
The point is that, yes, ZFS is good at RAID, and it's much, much easier than ext4 on MDRAID or something.
Btrfs can do that too.
But ZFS and Btrfs do COW snapshots. Those are important too. OpenSUSE, Garuda Linux, siduction and others depend on Btrfs COW.
OK, fine, no problem, your use case is RAID. I use that too. Good.
But COW is just as important.
Integrity is just as important and Btrfs fails at that. That is why the Bcachefs slogan is "the COW filesystem that won't eat your data."
Btrfs ate my data 2-3 times a year for 4 years.
It doesn't matter how many people praise it; what matters is the victims who have been burned when it fails. They prove that it does fail.
The point is not "I can do that with ext4 on mdraid" or "I can do that with LVM2" or "Btrfs is fine for me".
The point is something that can do _all of these_ and do it _better_ -- and here, "better" includes "in a simpler way".
Simpler here meaning "simpler to set up" and also "simpler in implementation" (compared to, say, Btrfs on LVM2, or Btrfs on mdraid, or LVM on mdraid, or ext4 on LVM on RAID).
Something that can remove entire layers of the stack and leave the same functionality is valuable.
Something that can remove 90% of the setup steps and leave identical functionality matters... Because different people do those steps in different order, or skip some, and you need to document that, and none of us document stuff enough.
The recovery steps for LVM on RAID are totally different from RAID on LVM. The recovery for Btrfs on mdraid is totally different from just Btrfs RAID.
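To make the contrast concrete, here is a rough sketch of the two approaches. All device, volume-group, and pool names below are invented for illustration; the commands assume root on a typical Linux box, and the last line assumes OpenZFS is installed:

```shell
# Layered stack: ext4 on LVM on mdraid. Each layer has its own tool,
# its own metadata, and its own recovery procedure.
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda /dev/sdb /dev/sdc
pvcreate /dev/md0
vgcreate vg0 /dev/md0
lvcreate -n data -l 100%FREE vg0
mkfs.ext4 /dev/vg0/data

# Integrated alternative: one command creates the redundancy, the pool,
# and a mounted filesystem in a single step.
zpool create tank raidz /dev/sda /dev/sdb /dev/sdc
```

Every extra layer above is also an extra place where your recovery documentation has to record which order you did things in.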
This is why tools that eliminate this matter. Because when it matters whether you have
1 - 2 - 3 - 4 - 5
or
1 - 2 - 4 - 3 - 5
Then the sword that chops the Gordian knot here is one tool that does 1-5 in a single step.
This remains true even if you only use 1 and 5, or 2 and 3, and it still matters if you only do 4.
As far as I know, ZFS is either for smart people who want to do something sophisticated or trendy people who want to do something unwise.
> ext4 on MDRAID or something
These are trivially easy to set up, expand, or replace drives in; they require no upkeep, and no setup when placed into entirely different systems. Anybody using ZFS or ZFS-like to do some trivial standard RAID setup (unless they are used to and comfortable with ZFS, which is an entirely different story) is just begging to lose data. MDADM is fine.
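For what it's worth, the routine md operations being alluded to look roughly like this (array and device names are examples, and the commands need root):

```shell
# Replace a failed member of an md RAID array.
mdadm /dev/md0 --fail /dev/sdb1      # mark the bad drive as failed
mdadm /dev/md0 --remove /dev/sdb1    # detach it from the array
mdadm /dev/md0 --add /dev/sdc1       # add the replacement; rebuild starts
cat /proc/mdstat                     # watch rebuild progress
```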
> As far as I know, ZFS is either for smart people who want to do something sophisticated or trendy people who want to do something unwise.
Or people who want data checksums.
> Anybody using ZFS or ZFS-like to do some trivial standard RAID setup (unless they are used to and comfortable with ZFS, which is an entirely different story) is just begging to lose data.
How? You just... hand it some devices, and it makes a pool. Drive replacement is a single command.
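A sketch of what that looks like in practice; the pool name and device paths are invented for illustration:

```shell
# Create a mirrored pool from two devices.
zpool create tank mirror /dev/sda /dev/sdb

# Replace one drive; resilvering starts automatically.
zpool replace tank /dev/sda /dev/sdc
zpool status tank    # shows resilver progress
```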
False. Ext4 requires the occasional check, which must be done offline. ZFS doesn't, and can be scrubbed while online and actively in use.
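Roughly, the two maintenance workflows look like this (names invented for illustration; e2fsck refuses to run a full check on a mounted filesystem, while a scrub runs against a live pool):

```shell
# ext4: the filesystem must be offline for a full check.
umount /mnt/data
e2fsck -f /dev/sdX1

# ZFS: scrub while the pool stays mounted and in use.
zpool scrub tank
zpool status tank    # reports scrub progress and any repaired errors
```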
> and no setup when placed into entirely different systems.
Same as ZFS.
> Anybody using ZFS or ZFS-like to do some trivial standard RAID setup (unless they are used to and comfortable with ZFS, which is an entirely different story) is just begging to lose data.
False.
> MDADM is fine.
I am not saying it isn't. I am saying ZFS is better.
I think you haven't tried it, because your claims betray serious ignorance of what it can do.
I built my main NAS box's RAID-Z pool with the drives in USB 3 caddies on a Raspberry Pi 4. I moved it to the built-in SATA controllers of an HP Microserver running TrueNAS Core.
Imported and just worked. No reconfig, no rebuild, nothing.
It moves seamlessly between Arm and x86, Linux and FreeBSD, no problem at all. Round trip if you want.
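The move described above uses nothing more exotic than export and import (the pool name is an example):

```shell
# On the old system: cleanly detach the pool.
zpool export tank

# Move the drives, then on the new system: list importable pools...
zpool import

# ...and import by name. Device paths may differ between systems;
# ZFS finds its members by on-disk labels, not by path.
zpool import tank
```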
I have a multi-device filesystem made up of old HDDs and one sketchy PCI SATA expansion card. This FS was assembled in 2019 and, though it went through periods of being non-writable, is still working, and I haven't lost any[1] data. That is more than 5 years, a multitude of FS version upgrades, and multiple device replacements with the corresponding data evacuation and re-replication.
[1] Technically, I did lose some, when a dying device started misbehaving and writing garbage, and I was impatient and ran a destructive fsck (with fix_errors) before waiting for a bug patch.
Don't want to compare it to other solutions but this is impressive even on its own merits.
Would be great to have an in kernel alternative to ZFS for parity RAID.