Linus Torvalds Begins Expressing Regrets Merging Bcachefs
(www.phoronix.com)
Use ext4. It just works.
ext4 is intended for a completely different use case, though? bcachefs is competing with btrfs and ZFS in big storage arrays spanning multiple drives, probably with SSD cache. ext4 is a nice filesystem for client devices, but doesn't support some things which are kinda fundamental at larger scales like data checksumming, snapshots, or transparent compression.
What's cool about bcachefs is that it can have fully tiered storage. It can move data from a hard drive to an SSD and vice versa. It isn't a cache like in ZFS, as ZFS wipes the cache drive on mount and adding a cache doesn't increase capacity.
@possiblylinux127 @DaPorkchop_. ZFS has a persistent L2ARC cache now.
There's XFS for larger scale stuff.
XFS still isn't a multi-device filesystem, though... of course you can run it on top of mdraid/LVM, but that still doesn't come close to the flexibility of what these specialized filesystems can do. Being able to simply run
btrfs device add /dev/sdx1 /
and immediately having the new space available is far less hassle than adding a device to an md array, then resizing the partition and then resizing the filesystem (and removing a device is even worse). Snapshots are a similar deal - sure, LVM can let you snapshot your entire virtual block device, but your snapshots are block devices themselves which need to be explicitly mounted, while in btrfs/bcachefs a snapshot is just a directory, and can be isolated to a specific subvolume rather than the entire block device.
Data checksums are also substantially less useful when the filesystem can't address the underlying devices individually, because it makes repairing the data from a replica impossible. If you have a file on an md RAID1 device and one of the replicas has a bad block, you might be able to detect the bitrot by verifying the checksum, but you can't actually fix it, because even though there is a second copy of the data on another drive, mdadm simply exposes a simple block device and doesn't provide any way to read from "the other copy". mdraid can recover from total drive failure, but not data corruption.
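To make the checksum point concrete, here is a rough Python sketch of what a filesystem can do when it sees each replica individually. It is only an illustration, not how btrfs or bcachefs are actually implemented: the function name `read_with_repair` is made up, the replicas are plain in-memory byte buffers, and SHA-256 is used as a stand-in for whatever checksum a real filesystem uses.

```python
import hashlib


def checksum(data: bytes) -> str:
    # Stand-in checksum for illustration; real filesystems use their own algorithms.
    return hashlib.sha256(data).hexdigest()


def read_with_repair(replicas: list, expected: str) -> bytes:
    """Return the contents of a replica that matches the stored checksum,
    and overwrite any replica that does not match with the good copy."""
    good = None
    for replica in replicas:
        if checksum(bytes(replica)) == expected:
            good = bytes(replica)
            break
    if good is None:
        raise IOError("every replica failed checksum verification")
    for replica in replicas:
        if checksum(bytes(replica)) != expected:
            replica[:] = good  # heal the bad copy in place
    return good


# Hypothetical two-way mirror holding one block of data.
block = b"important data"
expected = checksum(block)
replicas = [bytearray(block), bytearray(block)]

replicas[0][0] ^= 0xFF  # simulate bitrot on the first drive
result = read_with_repair(replicas, expected)
```

The key detail is that the repair loop needs to pick which copy to read. A filesystem sitting on top of a single md block device only ever sees one merged view, so it has no equivalent of iterating over `replicas`.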
Is XFS still maintained?
One of the best filesystem codebases out there. Really a top-notch filesystem if you don't need to resize it once it's created. It is write-through, not copy-on-write, so some features such as snapshots are not possible using XFS. If you don't care about features found in btrfs, ZFS or bcachefs, and you don't need to resize the partition after creating it, XFS is a solid and very fast choice.
The ext4 codebase is known to be very complex, and some people even call it scary. It just works because everybody's using it and the bugs were fixed years ago.
xfs_growfs is a thing. I know nothing about xfs. Is this something I should avoid for some reason?
https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/6/html/storage_administration_guide/xfsgrow
No reason to avoid it. Just know that you can't easily shrink the filesystem, only grow it. To shrink you'd need to create a new FS then copy the data over manually.
I heard that ext4's best feature was its fsck utils being extremely robust and able to recover from a lot of problems. Which does not shine a great light on the filesystem itself :/ and is probably a result of the complex codebase.
It's used in RHEL.
Yes.
Honestly I'm fine with ZFS on larger scale, but on desktop I want a filesystem that can do compression (like NTFS on windows) and snapshots.
I have actually used compression a lot, and it saved me a lot of space. No, storage is not cheap, or else I'm awaiting your shipment.
Other than that I'm doing differential backups on Windows, and from time to time it's very useful that I can grab an older version of a file to which something just happened. Snapshots cost much less storage than complete copies, which I couldn't afford, but this way I have daily diffs for a few years back, and it only costs a TB or so.
Sadly I have yet to see a truly compassionate FS 🥲
Yeah, same :D
It was a typo, I meant compression. Specifically per-file controlled compression, not per-directory or per-dataset.
You might as well say use fat32 it just works.
FAT32 does not just work for my Linux OS.
To people who just want to browse the web, use Office applications and a few other things, ext4 just works and FAT32 really just doesn't.
I get the point you're trying to make: FAT32 also has a small maximum file size and is missing some features, and ext4 is in the same position relative to, for instance, bcachefs.
But FAT32 (and exFAT and a few others) have completely different use cases; I couldn't use FAT32 for Linux and expect it to work, and I also couldn't use ext4 for my USB stick and expect it to just work as a USB stick.
Why not? It can be adapted to a smaller drive size fairly easily during filesystem creation.
True, but for me and many others USBs are also just massively portable. Since macOS, Windows and many others (phones, consoles, smart TVs...) don't speak ext4 but do all speak FAT32 and exFAT, that makes exFAT the way to go on USB drives.
ExFAT and FAT32 are not the same.
I know they're not, I never said they were
Not really. It has a quite small file size limit afaik.
Like there's not a bunch of stuff ext4 can't do that BTRFS and whatever this other acronym soup can do.
It's the entire point of my post. ext4 does work, but it doesn't do the stuff these other filesystems do, so they are an advantageous choice for some things.
One point: ext4 has a maximum file size of 16TiB. To a regular user that is stupidly huge and of no concern but it's exactly the type of thing you overlook if you "just use ext4" on anything and everything then end up with your database broken at work because of said bad advice.
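For anyone wondering where the 16TiB figure comes from, here is a back-of-the-envelope check in Python. It assumes the common 4 KiB block size and a 32-bit logical block number per file, which is my understanding of the ext4 extent format; treat both constants as assumptions rather than a statement about every ext4 configuration.

```python
# Assumptions for illustration: 4 KiB blocks, 32-bit logical block numbers.
BLOCK_SIZE = 4096
LOGICAL_BLOCK_BITS = 32

max_file_bytes = (2 ** LOGICAL_BLOCK_BITS) * BLOCK_SIZE
max_file_tib = max_file_bytes / 2 ** 40  # bytes -> TiB

print(f"max file size: {max_file_tib:g} TiB")
```

A smaller block size would shrink the limit proportionally under the same assumptions.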
Use the filesystem that makes the most sense for your use case. Consider it every single time you format a disk. Don't become complacent! Also fuck around with the new shit from time to time! I decided to format my Linux desktop partitions with btrfs over a decade ago and as a result I'm an excellent user of that filesystem but you know what? I'm thinking I'll try bcachefs soon and fiddle around more with my zfs partition on my HTPC.
BTW: If you're thinking about trying out btrfs I would encourage you to learn about its non-trivial maintenance tasks. btrfs needs you to fuck with it from time to time or you'll run out of disk space "for no reason". You can schedule cron jobs to take care of everything (as I have done) but you still need to learn how it all works. It's not a "set it and forget it" FS like ext4.
For a few years I used a distro that had btrfs as default, including scheduled automatic maintenance. Never had to bother about manual balancing or fiddling with the FS.
I have 52 terabytes of BTRFS, I've been on it for about 5 years.
I think we're just talking about different priorities. For me stability is the most important in production. For you features seem to matter more. For me it's enough if a file system can store, write, read and not lose files. I guess it depends on what the use case and the budget are.
Yeah, some people have needs that you don't have. That's why I commented on your blanket statement of just use EXT4.
I have BTRFS in production all over the place. Snapshots are extremely useful for what I do.
ext4 aims to not lose data under the assumption that the single underlying drive is reliable. btrfs/bcachefs/ZFS assume that one/many of the perhaps dozens of underlying drives could fail entirely or start returning garbage at any time, and try to ensure that the bad drive can be kicked out and replaced without losing any data or interrupting the system. They're both aiming for stability, but stability requirements are much different at scale than a "dumb" filesystem can offer, because once you have enough drives one of them WILL fail and ext4 cannot save you in that situation.
Complaining that datacenter-grade filesystems are unreliable when using them in your home computer is like removing all but one of the engines from a 747 and then complaining that it's prone to crashing. Of course it is, because it was designed under the assumption that there would be redundancy.
Which is exactly why you'd want to run a CoW filesystem with redundancy.
Well, yes, use case is key. But interestingly, ext4 will never detect bitrot/errors/corruption. BTRFS will detect corrupted files because its target users want to know. That makes it difficult to say which is the more reliable FS, because first we'd have to define "reliable" and how it is perceived, and decide who or what we blame when the FS tells us it has detected a corrupted file. Do we shoot the messenger?
It also does not support unix file permissions - so for most installs it does indeed not work.
No. You can layer ext4 with LVM and LUKS to get a lot of features (but not all) that you get with BTRFS or ZFS. FAT is not suitable for anything other than legacy stuff.
My point is there are features that you don't get in ext4 that are completely reasonable to use in workflows.
When someone says just use EXT4, they're just missing the fact that people may want or need those other features.
Your response to FAT is exactly my point.
Torvalds rejected the merge, and that's pretty much what he said - no one is using bcachefs.
There's no reason for a "fix" to be 1k+ lines, these sorts of changes need to come earlier in the release cycle.
The article is not about which filesystem to use or not, but about the size and contents of the patches submitted in relation to bcachefs. It seems that the submitted changes, which should have been just fixes, also contain new functionality. Though it is very nice to see how active and enthusiastic the development of bcachefs is, mixing fixes with new functionality is hard to review and dangerous, as it can introduce additional issues. Again, while I appreciate Kent's work, I understand Linus' concerns.
I once had the whole FS corrupted and I don't remember if it was XFS or ZFS (probably the latter). Also I like messing around with interesting software that might not support less common filesystems so I just stick with ext4. XFS is great though.
I wouldn't say, "repairing XFS is much easier." Yeah, fsck -y with XFS is really all you have to do 99% of the time but also you're much more likely to get corrupted stuff when you're in that situation compared to say, btrfs which supports snapshotting and redundancy.
Another problem with XFS is its lack of flexibility. By that I don't mean, "you can configure it across any number of partitions on-the-fly in any number of (extreme) ways" (like you can with btrfs and zfs). I mean it doesn't have very many options as to how it should deal with things like inodes (e.g. tail allocation). You can increase the total amount of space allowed for inode allocation but only when you create the filesystem and even then it has a (kind of absurdly) limited number that would surprise most folks here.
As an example, with an XFS filesystem, in order to store 2 billion symlinks (each one takes an inode) you would need 1TiB of storage just for the inodes. Contrast that with something like btrfs with max_inline set to 2048 (the default) and 2 billion symlinks will take up a little less than 1GB (assuming a simplistic setup on at least a 50GB single partition).
Learn more about btrfs inlining: https://btrfs.readthedocs.io/en/latest/Inline-files.html
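The XFS side of that estimate is easy to sanity-check. The sketch below assumes 512-byte inodes, which I believe is the default for current mkfs.xfs; that number is an assumption here, not something stated in the comment above, and a different inode size changes the result proportionally.

```python
# Assumption for illustration: 512 bytes per XFS inode.
INODE_SIZE = 512
SYMLINK_COUNT = 2_000_000_000  # one inode per symlink

total_bytes = SYMLINK_COUNT * INODE_SIZE
total_tib = total_bytes / 2 ** 40  # bytes -> TiB
total_tb = total_bytes / 10 ** 12  # bytes -> TB (decimal)

print(f"{total_bytes} bytes = {total_tb:.3f} TB = {total_tib:.2f} TiB")
```

Under that assumption the inodes alone come to roughly a terabyte, which matches the order of magnitude quoted above.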
I'm pretty sure btrfs is not small or "indie"
You had corruption with btrfs? Was this with a spinning disk or an SSD?
I've been using btrfs for over a decade on several filesystems/machines and I've had my share of problems (mostly due to ignorance) but I've never encountered corruption. Mostly I just run out of disk space because I forgot to balance or the disk itself had an issue and I lost whatever it was that was stored in those blocks.
I've had to repair a btrfs partition before due to who-knows-what back when it was new but it's been over a decade since I've had an issue like that. I remember btrfs check --repair being totally useless back then haha. My memory on that event is fuzzy but I think I fixed whatever it was bitching about by remounting the filesystem with an extra option that forced it to recreate a cache of some sort. It ran for many years after that until the disk spun itself into oblivion.