[Sugar-devel] RFC: Btrfs as the embedded volume format for SoaS
Chris Murphy
lists at colorremedies.com
Wed Aug 9 19:56:06 EDT 2017
Hi,
I'm mostly involved in Fedora QA, where my specialty is being a bug
magnet. I'm also on linux-btrfs@, the upstream development list for
Btrfs.
The idea is: use Btrfs instead of ext4 inside the LiveOS/rootfs.img
found inside LiveOS/squashfs.img which is found on the SoaS ISO.
This is an idea, food for thought, perhaps a pre-proposal. I've
considered the pros and cons quite a bit up to this point including
the multiple points in Fedora policy and compose that would get
touched by this, or alternatives. I've also mentioned it to Martin
Bříza who has developed (Fedora) Media Writer, and he's open minded
about it although it doesn't necessarily require support be located
there.
Features live images would gain from Btrfs as the rootfs:
- always enabled metadata and data checksumming
- seed/sprout feature for persistent overlay
- online resize
- compression
- flash optimized
Media checking
----------------------
Existing problem:
The current rd.live.check calls isomd5sum to check the media. It's an
out-of-band check (you have to wait), and the user can opt out. When
the existing livecd-tools based persistent overlay is enabled on stick
media, it breaks isomd5sum, so the check is silently skipped even if
the user chooses it. They have no idea it isn't actually being used
unless they know what to look for in the journal.
How does Btrfs address it:
Btrfs always verifies data and metadata checksums on read; the user
can't opt out. We get it for free, even though it doesn't come with any
notification. A future feature might be to enable a Btrfs scrub on a
schedule, parse the result, and if there's a problem inform the user
in some sane way.
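
As a rough sketch of the "scrub, parse, inform" idea, something like the
following could work. The function names and the message wording are made
up for illustration, and "/" as the mount point is an assumption. It
relies on `btrfs device stats` printing one counter per line, with the
count as the last field.

```shell
#!/bin/sh
# Sum the error counters printed by `btrfs device stats <mountpoint>`.
# Each line of that output looks like: [/dev/sdb2].corruption_errs   0
count_btrfs_errors() {
    awk '{ total += $NF } END { print total + 0 }'
}

# Hypothetical check: scrub in the foreground (-B), then report any
# nonzero counters. Needs root and a mounted Btrfs volume.
check_media() {
    mnt="${1:-/}"
    btrfs scrub start -B "$mnt" >/dev/null 2>&1
    errors=$(btrfs device stats "$mnt" | count_btrfs_errors)
    if [ "$errors" -gt 0 ]; then
        echo "Media check: $errors errors recorded on $mnt"
        return 1
    fi
    echo "Media check: no errors recorded on $mnt"
}
```

How the message actually reaches the user (desktop notification, journal
entry, something Sugar specific) is left open here.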
Persistent overlay
-------------------------
Existing problem:
The overlay file method is slow, complicated, and fragile. The df
command gives incorrect information, file deletion does not free up
space on the overlay, once it's exhausted it blows up without warning,
etc.
How Btrfs addresses this:
The seed/sprout feature is explicitly designed for this use case and
most of the code is in the kernel rather than depending on initramfs
for assembly and setup. The rootfs.img is the ro seed, and any block
device is designated rw sprout. All writes go to the sprout. Deleting
files does free extents, so the space is reusable.
The initial setup in user space involves:
1. btrfstune -S1 rootfs.img
2. mount rootfs.img /sysroot
3. btrfs device add /dev/sda2 /sysroot
4. mount -o remount,rw /sysroot
Step 1 happens during image compose. Steps 2 and 3 could happen either
in the initramfs or, because the code is rather simple, in the Media
Writer, where sda2 represents a free-space partition on the stick.
Device add and remove always imply a file system resize, which is done
online.
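
For illustration, steps 2 through 4 could be wrapped roughly like this.
It is only a sketch: the `run` helper, the function name, and the
DRY_RUN switch are my own inventions, and the explicit `-o loop,ro` is an
assumption on top of the bare mount in step 2. By default it only prints
the commands.

```shell
#!/bin/sh
# Print each command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# seed_image and sprout_dev are placeholders; /sysroot matches the steps above.
setup_sprout() {
    seed_image="$1"
    sprout_dev="$2"
    run mount -o loop,ro "$seed_image" /sysroot   # step 2: seed mounts read-only
    run btrfs device add "$sprout_dev" /sysroot   # step 3: attach the writable sprout
    run mount -o remount,rw /sysroot              # step 4: writes now go to the sprout
}

# Step 1 belongs to image compose, not to boot:
#   btrfstune -S 1 rootfs.img
```

Calling `setup_sprout rootfs.img /dev/sda2` prints the three commands;
setting DRY_RUN=0 as root would actually run them.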
It is possible for the sprout to be a zram device, for a RAM based
overlay that resets at each reboot.
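
A minimal sketch of the zram variant, assuming the seed is already
mounted at /sysroot and that loading the zram module provides
/dev/zram0. The function name, the `run` helper, and the default size
are hypothetical; by default the commands are only printed.

```shell
#!/bin/sh
# Print each command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Use a zram device as the sprout, giving a RAM-backed overlay that is
# gone after reboot. size is a placeholder such as 1G.
setup_zram_sprout() {
    size="${1:-1G}"
    run modprobe zram
    run zramctl --size "$size" /dev/zram0
    run btrfs device add /dev/zram0 /sysroot
    run mount -o remount,rw /sysroot
}
```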
A new feature could give the user the option to create a persistent
overlay from within SoaS, where it would do the steps 2, 3, 4, and
modify the bootloader configuration accordingly.
Another option is per user sprouts, each of which can optionally be encrypted.
Yet another option is permanent installation to the stick or some
other media. If the partition used as sprout is at least as big as the
seed, it's possible to 'btrfs device delete' the seed. This causes
extents to be copied from the seed to the sprout, turning it into an
ordinary non-seed/non-sprout volume. Now the original partition
containing the seed (the ISO image) can be wiped, and reused as free
space. This would be done entirely online. This could be an option for
(destructive) installation or upgrades on devices running Sugar with
built-in storage rather than a stick.
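
Roughly, the seed removal could look like the sketch below. The function
name and the `run` helper are hypothetical, and the sizes are passed in
as byte counts (they could come from `blockdev --getsize64`). The size
test simply mirrors the "at least as big as the seed" condition above.

```shell
#!/bin/sh
# Print each command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Drop the seed so the sprout becomes an ordinary standalone volume.
# seed_dev is whatever device backs the seed (for example a loop device).
detach_seed() {
    seed_dev="$1"
    seed_bytes="$2"
    sprout_bytes="$3"
    if [ "$sprout_bytes" -lt "$seed_bytes" ]; then
        echo "sprout too small: $sprout_bytes < $seed_bytes bytes" >&2
        return 1
    fi
    # Extents are copied from the seed to the sprout while mounted.
    run btrfs device delete "$seed_dev" /sysroot
}
```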
Online resize
------------------
Maybe the user, after the fact, wants a regular FAT32 volume on their
stick for sharing between platforms. This could be a tickbox admin
feature, really simple, by leveraging online resize. Other use cases
are also possible.
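
As a sketch of what such a tickbox might do underneath: shrink the Btrfs
device online, then format the freed space as FAT32. The function name,
the `run` helper, and all arguments are placeholders, and the
repartitioning step in between is deliberately left out because it
depends on the partitioning tool.

```shell
#!/bin/sh
# Print each command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# devid:    Btrfs device id of the sprout (see `btrfs filesystem show`)
# amount:   how much to shrink by, e.g. 2G
# fat_part: the new partition created in the freed space (hypothetical)
make_room_for_fat() {
    devid="$1"
    amount="$2"
    fat_part="$3"
    run btrfs filesystem resize "${devid}:-${amount}" /sysroot
    # ...shrink the sprout partition and create fat_part here...
    run mkfs.vfat -F 32 "$fat_part"
}
```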
Compression
------------------
Saves space and improves performance with slower flash media. It can
apply to just the sprout, as it is a mount-time option. It could also be
a compose-time option applied to the seed image, with the consequence
that the resulting image will be bigger, since squashfs xz compresses
better. An open question is which is faster on a typical USB stick and
hardware; xz is usually more expensive.
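
For the sprout-only case, a sketch of turning compression on at mount
time could be as small as this. The function name and the `run` helper
are hypothetical, the choice of zlib as default is an assumption, and
only newly written data gets compressed.

```shell
#!/bin/sh
# Print each command when DRY_RUN=1 (the default here); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Remount /sysroot with compression for new writes.
enable_compression() {
    algo="${1:-zlib}"
    case "$algo" in
        zlib|lzo) ;;
        *) echo "unsupported algorithm: $algo" >&2; return 1 ;;
    esac
    run mount -o "remount,compress=$algo" /sysroot
}
```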
Flash optimized
----------------------
There are two ssd mount options (ssd and ssd_spread) which can help
improve the performance and life of flash media. It's admittedly a
minor feature for modern SSDs
with smarter wear leveling capabilities, but USB sticks probably still
benefit from this.
Anyway, I have used the Btrfs seed/sprout feature quite a bit with the
single profile (not raid profiles), on a stick and with a zram device
for temporary overlay. It's way more straightforward code-wise than the
existing livecd-tools and dracut code that leverages device mapper. But
that code already exists, is maintained, and does work.
So the questions are: is it interesting and useful? Does it help solve
real problems users have, and make for an overall better experience?
Is there another way to achieve some of the same things with less
work?
OK this is long enough for now. Thanks for reading.
--
Chris Murphy