Yep. All people asked him to do was slow down a bit, because they felt it was too much change at once. He refused to slow down for any reason other than his own. He said he only saw three reasons to slow down, none of which applied, so Linus should just accept his patch now.
I never understand why some people are unwilling to make any attempt at getting along. Some people seem to feel any level of compromise is too much.
He justified breaking the guidelines to address critical issues. One can hope these kinds of problems would not happen that frequently in a stable project; besides, it is still experimental.
What he actually did was bundle the fix for the critical issue with debugging tools and a totally new experimental feature. I totally get why they stopped working with him.
It was still an entirely new and experimental feature which had not been properly reviewed. Why couldn't this feature wait until the next kernel version? Other file systems have had their recovery tools improved over many years.
Filesystems like ext2/3/4 have their recovery tools in userland. Most of the recovery features in bcachefs are in the kernel. As a result, it is inevitable that at some point there will be a need to push a new feature into a stable release for the purpose of data recovery.
Over the long term, the number of cases where such a response is needed should decrease.
Do you really want to live in a world where data losses in stable releases is considered Okay?
>it is inevitable that at some point there will be a need to push a new feature into a stable release for the purpose of data recovery
It's really not; the proper way to recover your important data is to restore from backups, not to force other people to bend longstanding rules for you.
>Do you really want to live in a world where data losses in stable releases is considered Okay?
If it really was so urgent, why didn't Kent just tell people to get those updates from his personal tree? There are rules in place if you want your stuff to get into Linus's tree; expecting Linus to pull whatever you send him without any resistance whatsoever is likely just going to end up with him deleting your project, just like what happened here.
Because distributions don't ship Kent's kernel tree, and they're not going to. Distributions like Fedora ship as close to mainline as possible these days because of the pain experienced from shipping a heavily patched kernel in the past. Release cycles are upwards of 3 months for Linus' tree. With that kind of lengthy release cycle, for an experimental codebase undergoing rapid stabilization it was the right call: you don't want old code to linger around longer than necessary when the pending changes are predominantly bug fixes that successfully pass regression tests. The choice should be with the maintainer.
No, it is not. bcachefs needs to have all the code for error recovery in the kernel, as it needs to be available when a storage device fails in any of a myriad of ways.
Maintaining a piece of code that needs to run in both user space and the kernel is messy and time-consuming. You end up running into issues where dependencies require porting gobs of infrastructure from the kernel into userspace. That's easy for some things, very hard for others. There's a better place to spend those resources: stabilizing bcachefs in the kernel, where it belongs.
Other people have tried and failed at this before, and I'm sure that someone will try the same thing again in the future and relearn the same lesson. I know because business requirements at a former employer resulted in such a beast. Other people thought they could just run their userspace code in the kernel, but they didn't know about limits on kernel stack size, or about contexts where blocking vs. non-blocking behaviour is required, or how that interacts with softirqs. Please, just don't do this or advocate for it.
The fact you got downvoted makes me shake my head. One could still interpret this as a contributor-guideline violation, and that's fair.
If I'm not mistaken, Kent pushed recovery routines in the RC to handle a catastrophic bug a user triggered by loading the current metadata format into an old 6.12 kernel.
It isn't some sinister "sneaking in of features". This fact seems to be omitted by the clickbaity coverage of the situation.
As I pointed out elsewhere, there was another -rc release put out shortly after that effectively added back in a feature that was removed 10 releases back. Granted, it was only a small thing, but it shows that there is nuance in application of the rule.
Except he had a history of rushing large changes in at the last minute that were always "critical", and he would constantly argue about policy at the same time, which is not the appropriate time or place.
He refused to acknowledge his place on the totem pole and thought he knew better than everyone else, and that they should change their ways to suit his whims.
The way I read it, it was wrapping feature and bug-fix changes together: "We found a critical bug here; to fix it, instead of backporting a fix we want to pull in all the code which was built before the bug was discovered."
I can understand the motivation. It's a PITA to support an older version of code. But that's not how Linux gets its stability.
He also had bad interactions with other developers, like constantly shitting on other file systems and generally behaving like a jerk completely unnecessarily.
Yeah, but then you run into scenarios where A+D is tested and ready, but B and/or C are not. Git does give you tools to separate them, but most people don't like doing that for various reasons.
IMHO, it may be more natural, but only during development. Trying to do a git bisect on git histories like the above is a huge pain. Trying to split things up when A is ready but B/C are not is a huge pain.
My takeaway from trying to follow the discussion on the kernel mailing list was that the Bcachefs developer wants to work in a certain way, and Linus does not think that fits in with the rest of the kernel (to put it mildly). Having Bcachefs in the kernel certainly helps with adoption, but I can't help thinking that an out-of-tree kernel module might be more in line with the development process that Bcachefs wants.
The underlying problem might have been importing Bcachefs into the mainline kernel too early in its life cycle.
No, this is a pure people problem which would have happened no matter the state of Bcachefs. Kent refuses to respect other people's time and rules since that would require him to change how he works.
> He justified breaking the guidelines to address critical issues
The claimed fix actually added new logging functionality to allow better troubleshooting, in order to eventually address critical issues.
This should have been kept out of trunk for someone to test, rather than presented as something it wasn't, strictly speaking. Especially when it's the kernel.
> I hope it eventually comes back once it is more stable.
Yes, me too.
> Would be great to have an in kernel alternative to ZFS
Yes it would.
> for parity RAID.
No.
Think of the Pareto principle here: 80% of people use only 20% of the functionality. But they don't all use the same 20%, so overall you need 80% of the functionality... or more.
ZFS is one of the rivals here.
But Btrfs is another. Stratis is another. HAMMER2 is another. MDRAID is another. LVM is another.
All provide some or all of that 20%, and all have pros and cons.
The point is that, yes, ZFS is good at RAID, and it's much, much easier than ext4 on MDRAID or something.
Btrfs can do that too.
But ZFS and Btrfs do COW snapshots. Those are important too. OpenSUSE, Garuda Linux, siduction and others depend on Btrfs COW.
OK, fine, no problem, your use case is RAID. I use that too. Good.
But COW is just as important.
Integrity is just as important and Btrfs fails at that. That is why the Bcachefs slogan is "the COW filesystem that won't eat your data."
Btrfs ate my data 2-3 times a year for 4 years.
It doesn't matter how many people praise it; what matters is the victims who have been burned when it fails. They prove that it does fail.
The point is not "I can do that with ext4 on mdraid" or "I can do that with LVM2" or "Btrfs is fine for me".
The point is something that can do _all of these_ and do it _better_ -- and here, "better" includes "in a simpler way".
Simpler here meaning "simpler to set up" and also "simpler in implementation" (compared to, say, Btrfs on LVM2, or Btrfs on mdraid, or LVM on mdraid, or ext4 on LVM on RAID).
Something that can remove entire layers of the stack and leave the same functionality is valuable.
Something that can remove 90% of the setup steps and leave identical functionality matters... Because different people do those steps in different order, or skip some, and you need to document that, and none of us document stuff enough.
The recovery steps for LVM on RAID are totally different from RAID on LVM. The recovery for Btrfs on mdraid is totally different from just Btrfs RAID.
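To make the contrast concrete, here is a rough sketch of the two approaches. All device, volume-group, and pool names below are invented for illustration; the commands assume root on a typical Linux box, and the last line assumes OpenZFS is installed:

```shell
# Layered stack: ext4 on LVM on mdraid. Each layer has its own tool,
# its own metadata, and its own recovery procedure.
mdadm --create /dev/md0 --level=5 --raid-devices=3 /dev/sda /dev/sdb /dev/sdc
pvcreate /dev/md0
vgcreate vg0 /dev/md0
lvcreate -n data -l 100%FREE vg0
mkfs.ext4 /dev/vg0/data

# Integrated alternative: one command creates the redundancy, the pool,
# and a mounted filesystem in a single step.
zpool create tank raidz /dev/sda /dev/sdb /dev/sdc
```

Every extra layer above is also an extra place where your recovery documentation has to record which order you did things in.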
This is why tools that eliminate this matter. Because when it matters whether you have
1 - 2 - 3 - 4 - 5
or
1 - 2 - 4 - 3 - 5
Then the sword that chops the Gordian knot here is one tool that does 1-5 in a single step.
This remains true even if you only use 1 and 5, or 2 and 3, and it still matters if you only do 4.
As far as I know, ZFS is either for smart people who want to do something sophisticated or trendy people who want to do something unwise.
> ext4 on MDRAID or something
These are trivially easy to set up, expand, or replace drives in; they require no upkeep, and no setup when placed into entirely different systems. Anybody using ZFS or ZFS-like to do some trivial standard RAID setup (unless they are used to and comfortable with ZFS, which is an entirely different story) is just begging to lose data. MDADM is fine.
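For what it's worth, the routine md operations being alluded to look roughly like this (array and device names are examples, and the commands need root):

```shell
# Replace a failed member of an md RAID array.
mdadm /dev/md0 --fail /dev/sdb1      # mark the bad drive as failed
mdadm /dev/md0 --remove /dev/sdb1    # detach it from the array
mdadm /dev/md0 --add /dev/sdc1       # add the replacement; rebuild starts
cat /proc/mdstat                     # watch rebuild progress
```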
> As far as I know, ZFS is either for smart people who want to do something sophisticated or trendy people who want to do something unwise.
Or people who want data checksums.
> Anybody using ZFS or ZFS-like to do some trivial standard RAID setup (unless they are used to and comfortable with ZFS, which is an entirely different story) is just begging to lose data.
How? You just... hand it some devices, and it makes a pool. Drive replacement is a single command.
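A sketch of what that looks like in practice; the pool name and device paths are invented for illustration:

```shell
# Create a mirrored pool from two devices.
zpool create tank mirror /dev/sda /dev/sdb

# Replace one drive; resilvering starts automatically.
zpool replace tank /dev/sda /dev/sdc
zpool status tank    # shows resilver progress
```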
False. Ext4 requires the occasional check, which must be done offline. ZFS doesn't, and can be scrubbed while online and actively in use.
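Roughly, the two maintenance workflows look like this (names invented for illustration; e2fsck refuses to run a full check on a mounted filesystem, while a scrub runs against a live pool):

```shell
# ext4: the filesystem must be offline for a full check.
umount /mnt/data
e2fsck -f /dev/sdX1

# ZFS: scrub while the pool stays mounted and in use.
zpool scrub tank
zpool status tank    # reports scrub progress and any repaired errors
```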
> and no setup when placed into entirely different systems.
Same as ZFS.
> Anybody using ZFS or ZFS-like to do some trivial standard RAID setup (unless they are used to and comfortable with ZFS, which is an entirely different story) is just begging to lose data.
False.
> MDADM is fine.
I am not saying it isn't. I am saying ZFS is better.
I think you haven't tried it, because your claims betray serious ignorance of what it can do.
I built my main NAS box's RAID-Z pool with the drives in USB 3 caddies on a Raspberry Pi 4. I moved it to the built-in SATA controllers of an HP Microserver running TrueNAS Core.
Imported and just worked. No reconfig, no rebuild, nothing.
It moves seamlessly between Arm and x86, Linux and FreeBSD, no problem at all. Round trip if you want.
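The move described above uses nothing more exotic than export and import (the pool name is an example):

```shell
# On the old system: cleanly detach the pool.
zpool export tank

# Move the drives, then on the new system: list importable pools...
zpool import

# ...and import by name. Device paths may differ between systems;
# ZFS finds its members by on-disk labels, not by path.
zpool import tank
```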
I have a multi-device filesystem made up of old HDDs and one sketchy PCI SATA expansion card. This FS was assembled in 2019 and, though it went through periods of being non-writable, is still working, and I haven't lost any[1] data. That is more than 5 years, a multitude of FS version upgrades, and multiple device replacements with the corresponding data evacuation and re-replication.
[1] Technically, I did lose some, when a dying device started misbehaving and writing garbage, and I was impatient and ran a destructive fsck (with fix_errors) before waiting for a bug patch.
Don't want to compare it to other solutions but this is impressive even on its own merits.
Would be great to have an in kernel alternative to ZFS for parity RAID.