[SoaS] [Sugar-devel] SOAS 2 problems

Bernie Innocenti bernie at codewiz.org
Tue Jan 26 09:06:08 EST 2010


On Mon, 2010-01-25 at 20:21 +0100, Sebastian Dziallas wrote:
> > Have you tried writing to the journal until you fill up the overlay
> > space?
> 
> On a reasonably big usb stick (with prices dropping all day long), this 
> problem will probably not be solved, but rather appear less and less.

On larger USB sticks it will take *longer* to appear, but it *will*
certainly appear after some time. As the journal fills up with activity
data, sooner or later the overlay snapshot will run out of free pages.
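
To make the argument concrete, here is a toy model in Python. It is only
a sketch of the accounting, not the real device-mapper code: the names
ToySnapshot and writes_until_full are made up for illustration, and I'm
assuming every new journal entry dirties at least one chunk that was not
dirty before.

```python
class SnapshotFull(Exception):
    """Raised when the toy overlay has no free chunks left."""


class ToySnapshot:
    """Toy copy-on-write overlay with a fixed number of chunks (hypothetical)."""

    def __init__(self, overlay_chunks):
        self.capacity = overlay_chunks
        self.store = {}    # chunk index -> data written after the snapshot
        self.valid = True  # becomes False once the overlay is exhausted

    def write(self, chunk, data):
        if not self.valid:
            raise SnapshotFull("snapshot already invalidated")
        # Rewriting an already-copied chunk is free; a new chunk needs space.
        if chunk not in self.store and len(self.store) >= self.capacity:
            self.valid = False
            raise SnapshotFull("overlay exhausted")
        self.store[chunk] = data


def writes_until_full(overlay_chunks):
    """Write to fresh chunks until the overlay fails; return (writes, valid)."""
    snap = ToySnapshot(overlay_chunks)
    count = 0
    try:
        while True:
            snap.write(count, b"x")  # each write touches a fresh chunk
            count += 1
    except SnapshotFull:
        return count, snap.valid


if __name__ == "__main__":
    print(writes_until_full(4))
    print(writes_until_full(400))
```

In this model a bigger overlay only raises the number of writes that
succeed before the failure; the failure itself still happens.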


> Note that I'm not saying that we don't need to investigate here.
> 
> > Unless I'm seriously misunderstanding how LVM snapshots work, this
> > should systematically make the flash drive inoperable until reformatted,
> > and all data inaccessible.
> 
> Walter, Dave and others have been doing tests concerning the reliability 
> of our current layout in the post-Strawberry time, if I recall 
> correctly. It turned out that the compressed read-only squashfs was not 
> at all the root of the issues some people were seeing.

Yes, the problem is definitely not squashfs itself, because all it does
is store a read-only ext3 image.

> Rather, those were directly related to a corrupted overlay file, whether 
> it was caused by unplugging the usb stick too early or filling up the 
> overlay file quickly. Resetting the overlay file took away all these 
> issues, though.

Unplugging the USB stick shouldn't be causing any major filesystem
corruption when using ext3 directly. It may be problematic with the
overlay, though, as it cannot guarantee the write ordering semantics
expected by the filesystem.

The corruption scenario I've seen with Caroline is much simpler to
reproduce:

 1. write normally to the filesystem (dd if=/dev/random of=foobar)
 2. boom!


> As much as it might be wrong: Upstream has a reason in doing so. And 
> we're going to stick with what upstream is doing. I used to explain a 
> major goal for this release to be increasing the sustainability of the 
> whole process. Diverging from upstream is not going to achieve this.

I tried to make sense of the reasons behind the current layout by asking
around. My impression is that it evolved in incremental steps over the
years, without rethinking the design for USB storage at all.

 1) a read-only squashfs of course made sense for the live CD

 2) storing an ext3 image inside squashfs provided a bug-free
    filesystem and probably better compression

 3) transferring this ext3-within-squashfs to a DOS filesystem was
    the most obvious way to create a bootable USB stick

 4) using device-mapper snapshots was a clever way to make the
    embedded ext3 filesystem also writable

See? Every step makes perfect sense if you consider the previous
situation.

But... wait a moment... all these steps, 1 through 4, are totally
superfluous when you have a USB stick!!!


> OLPC went through this. And so did I. The final Blueberry build was 
> created late in the night before I was having another school exam in the 
> morning and directly afterwards rushing to the airport to catch a plane 
> to Toronto.

You did an amazing job, this is not being questioned.


> Anything that diverges from upstream (let it be Fedora, Sugar, or any 
> other project) leaves us with a gap. Everything we hack up ourselves 
> will return to us when it comes to support.

I'm not proposing to hack a new thing. I'm proposing to _remove_ a hack.

All we have to do is install Fedora to the USB stick normally, as it
would be installed to a hard drive. We can do this using the standard
tools for installing Fedora (liveinst, or whatever).

At the end of composition, we just drop the additional steps performed
by livecd-iso-to-disk.


> And we don't want our users 
> to download either a 4 GB image (or a 700 MB one, which takes them half 
> a day to uncompress), right?

If the image is padded with zeros and compressed with xz (lzma),
it may become even smaller than the current iso image. Wanna bet? :-)
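
A quick way to check the padding half of this claim is the Python sketch
below. It uses the standard lzma module; the sizes and the random payload
are stand-ins I picked for illustration, not measurements of a real SoaS
image. It only shows that zero padding costs almost nothing once
compressed. Whether the result beats the current iso depends on how well
xz does on the actual filesystem contents.

```python
import lzma
import os


def padded_image_size(payload: bytes, total_size: int) -> int:
    """Return the xz-compressed size of payload zero-padded to total_size bytes."""
    if total_size < len(payload):
        raise ValueError("total_size is smaller than the payload")
    padding = b"\0" * (total_size - len(payload))
    return len(lzma.compress(payload + padding))


if __name__ == "__main__":
    # Random bytes are a worst case: they do not compress at all.
    payload = os.urandom(64 * 1024)
    plain = len(lzma.compress(payload))
    padded = padded_image_size(payload, 4 * 1024 * 1024)
    print("payload only:", plain, "bytes compressed")
    print("padded to 4 MiB:", padded, "bytes compressed")
```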


> > The fix for us is easier, because we don't really need any of the fancy
> > things that livecd-iso-to-disk does. All we need to do is stop
> > mentioning it in our wiki and switch to liveinst, as Peter Robinson
> > suggests, or any other tool that does the (rather trivial) job of
> > transferring an ext3 image onto a (flash) drive.
> 
> I don't think there's an "us" and "them" there. People from Fedora 
> contribute to Sugar on a Stick, in the same way as people from Sugar 
> Labs do. It's somehow what makes this project so cool, too.

I don't want to give the wrong impression that I'm anti-Fedora. I've
been using Fedora on multiple machines since it was still called
Red Hat :-)

Nevertheless, the Fedora Live CD has a different audience and different
use cases than Sugar on a Stick. We don't have to blindly imitate regular
Fedora Live if it doesn't work well for SoaS. The Fedora Live CD is
mostly for showcasing Fedora without installing it, while SoaS is
supposed to be a production environment.


> So. I was talking to Sascha yesterday on IRC and we came up with some 
> ideas how to proceed here. I don't think that we should stop using 
> squashfs images. Because I don't see a reason to stop doing so.

Let me reverse the viewpoint: do you see a reason to *continue* using
squashfs?


> On the overlay, an approach that we came up with introduced a new 
> partition (possibly ext4) for the overlay only.

Where would this new partition be mounted?


> That would mean that the 
> bootloader and the squashfs file would stay on a fat32 partition, 
> allowing users even to exchange data with their windows machines.

This is currently not possible anyway, because the journal is stored in
the ext3 partition and Windows does not have tools to read it.

Even if one could import/export files to the DOS partition, I don't
think we should (mis)design our partition layout all around this
use case.


>  On the 
> other hand, we'd reference the new overlay partition (instead of the 
> overlay file) in the syslinux.cfg.
> 
> I don't know whether this is possible. But now is the time to work it 
> out. It would allow us to continue to use the existing infrastructure, 
> while working on the roots of relevant issues directly.

Even if it were possible (but I do not understand how), this would
indeed require some additional coding and testing to get there from
where we're standing.

What useful features does squashfs bring that justify all the extra
work and complexity it requires?

/me is a minimalist. What's not there will not break ;-)

-- 
   // Bernie Innocenti - http://codewiz.org/
 \X/  Sugar Labs       - http://sugarlabs.org/


