Fedora 18 --> 19 fedup upgrade woes - read-only filesystem, with the fstab file the culprit

As the title of Red Hat Bugzilla 969648 now indicates, the problem is the "data=writeback" parameter in /etc/fstab. Fedup and Fedora 19 have a problem with this particular setting.


I have two Fedora machines - a desktop and a laptop. During the last upgrade cycle, Fedora 17-->18, the desktop upgraded without any problem whatsoever. On the laptop, I managed to shoot myself in the foot: some parts of my kernel boot command line were present in grub.cfg but not in /etc/default/grub. (These were the parts that blacklisted nouveau so that the nvidia binary driver could run.) At the end of the fedup procedure I had run

   grub2-mkconfig -o /boot/grub2/grub.cfg

as instructed, and this wiped out the parts of my command line that were needed to run the nvidia driver. Because grub.cfg is regenerated from /etc/default/grub, anything present only in grub.cfg is lost, and I had all sorts of trouble figuring out what had gone wrong. Of course, once the problem was diagnosed the fix was relatively easy.
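For anyone who wants to check for this before regenerating grub.cfg, here is a minimal sketch. The function name grub_default_has is made up for illustration, and it assumes the parameters live on a single GRUB_CMDLINE_LINUX or GRUB_CMDLINE_LINUX_DEFAULT line in /etc/default/grub.

```shell
# Hypothetical helper: succeed if PARAM is listed on a GRUB_CMDLINE_LINUX*
# line of the grub defaults file, so it will survive grub2-mkconfig.
# Usage: grub_default_has PARAM [FILE]   (FILE defaults to /etc/default/grub)
grub_default_has() {
    grep -E '^GRUB_CMDLINE_LINUX(_DEFAULT)?=' "${2:-/etc/default/grub}" |
        tr -d '"' |            # drop the surrounding double quotes
        sed 's/^[^=]*=//' |    # drop the variable name and first "="
        tr ' ' '\n' |          # one kernel parameter per line
        grep -Fxq -- "$1"      # exact, fixed-string match on a whole line
}
```

Something like `grub_default_has nouveau.modeset=0 || echo "missing from /etc/default/grub"` would then flag a parameter that is about to be lost. The parameter shown is only an example; this post does not record exactly which ones were on my command line.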

This time around, in going from 18-->19 things were, I have to say, even more difficult. It would be hard to detail step by step what went wrong and all the different things I tried. But basically a highly simplified version would be as follows.

I ran fedup-cli in the usual way (after a yum update and a reboot):

sudo fedup-cli  --network 19

This initially seemed to work fine and it went and got the needed packages to boot into the fedup-upgrade kernel.  I checked the fedup log for errors and there were none and everything seemed to finish in an orderly fashion.

I don't remember if the next boot menu "looked right" that is to say it had the fedup-upgrade option for booting. I'm supposing that it did as things really had yet to go wrong.

On booting into the upgrade image one would normally expect the fedup process or procedure to automatically go grab the approximately 2000 F19 packages it needed. However it never really got the chance to do that - the boot screen seemed to end with the message

systemd-readahead-collect[###]: Failed to read event: Value too large for datatype
(Here ### is a number which I think is not particularly relevant).

At this point I tried many, many, many different things and searched the web for this error message, but I was really getting nowhere fast.

Then I got the idea to try

sudo yum --releasever=19 distro-sync

from reading http://kashyapc.wordpress.com/2013/06/09/f18-f19-distro-sync-with-yum/ which was very helpful.

I guess I should also mention that I could get a text console at this point with CTRL-ALT-F2 and could log in as root. But when I tried to run distro-sync I got errors due to a read-only filesystem. Even a simple command from the console, such as touch foo, produced the same sort of error.
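Besides the touch test, the mount table will say directly whether root is read-only. The following is a minimal sketch that reads /proc/mounts; the function name is_readonly_mount is made up for illustration, and it looks only at the last entry for the given mount point.

```shell
# Hypothetical helper: succeed if MOUNTPOINT is currently mounted read-only.
# Usage: is_readonly_mount MOUNTPOINT [MOUNTS_FILE]
# MOUNTS_FILE defaults to /proc/mounts (fields: device, mount point, type, options, ...).
is_readonly_mount() {
    awk -v mp="$1" '
        $2 == mp { opts = $4 }      # remember options of the last matching entry
        END {
            found = 0
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++)
                if (o[i] == "ro") found = 1
            exit(found ? 0 : 1)
        }
    ' "${2:-/proc/mounts}"
}
```

For example, `is_readonly_mount / && echo "root is read-only"` from the CTRL-ALT-F2 console.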

This sent me on a chase to try to track down the causes of a read-only filesystem after an upgrade.

Which, in turn, led me to https://bugzilla.redhat.com/show_bug.cgi?id=969648 and this basically turned into the crux of the problem. In fact the title of the bugzilla perfectly describes my issue - "Fedup to F19 fails due to read-only root if "data=writeback" in mount opts".

If you read the workarounds mentioned in the aforementioned bugzilla you'll find that people that ended up with read-only filesystems were able to work around this issue by reworking their /etc/fstab files.

To make what's already a long story a little bit shorter - here's what I did (by means of a rescue disk, and the whole /mnt/sysimage drill). I commented out the original lines (the ones starting with # below) and replaced each of them with the line that follows it.


#UUID=ab686964-0ca8-4f00-8d9b-39753d89a74b     /                       ext4    defaults,discard,data=writeback,noatime,commit=15       1 1
/dev/sda2   /   ext4    defaults       1 1
#UUID=91138954-e05b-422a-8852-1ac2e77d73c3    /boot                ext4    defaults,discard,data=writeback,noatime,commit=15       1 2
/dev/sda1  /boot   ext4 defaults   1 2
#UUID=e5b99d8f-fc98-426b-a003-40743325bd6d     /home              ext4    defaults,discard,data=writeback,noatime,commit=15       1 2
/dev/sda5  /home  ext4    defaults       1 2
UUID=8bbf4a4d-6048-45ad-a4f2-3cc9878af5d8 swap                    swap    defaults       0 0

The bottom line is that these simple changes fixed the issue. After doing this I was able to run yum distro-sync. There were many more manipulations that I ended up doing, many of which were dead ends or blind alleys, but this was basically the definitive fix. I believe this even more strongly because once things were working I reverted to the old form of the file and things promptly broke again, in the same way. I then switched back to the new form, and I'm typing this now from the newly upgraded F19 system.

Several comments on the /etc/fstab.

1. The non-default parameters were put in in an attempt to optimize for the SSD in the laptop.

2. I highly doubt (although I haven't yet proven) that the UUID piece is the culprit - as one can easily see, the UUID was left in for the swap case and this caused zero issues. In fact I'll probably redo the table putting the UUIDs back in, and I expect there will be no problem there, although I haven't yet done this.

3. I suspect that one or more of the non-default parameters is at fault. I think that the key one for SSDs is "discard", and one day I may play around and try to narrow this down a bit more. In fact the bugzilla is now suggesting that it's the data=writeback that is at issue. I still need to confirm this on my own machine.
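If data=writeback really is the only offender, a narrower change than the one I made would be to drop just that option and keep the UUIDs and the other SSD settings. Here is a minimal sketch of that idea. It is untested on a real upgrade, the function name strip_mount_option is made up for illustration, and it rewrites the whitespace between fields to single spaces.

```shell
# Hypothetical helper: print an fstab read from stdin with one mount option
# removed from the options field (4th column) of every non-comment line.
# Usage: strip_mount_option OPTION < fstab
strip_mount_option() {
    awk -v drop="$1" '
        /^[[:space:]]*#/ || NF < 4 { print; next }   # comments and short lines pass through
        {
            n = split($4, o, ",")
            out = ""
            for (i = 1; i <= n; i++)
                if (o[i] != drop)
                    out = (out == "" ? o[i] : out "," o[i])
            $4 = (out == "" ? "defaults" : out)      # never leave the field empty
            print                                     # fields are re-joined with single spaces
        }
    '
}
```

From the rescue disk this could be run as `strip_mount_option data=writeback < /mnt/sysimage/etc/fstab > /tmp/fstab.new`, with the result reviewed by hand before it is copied into place.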

As with many issues, the keys are: search the web for answers, don't panic, and have a rescue disk handy.

Finally I tend to think that the systemd-readahead-collect[###]: Failed to read event: Value too large for datatype error was a bit of a red herring - once the filesystem issue was resolved, this error never reappeared.

Comments

Ole Sandum said…
Thank you thank you thank you.
Precisely identical problem. And for the exact same reason: having followed some guidelines for optimizing SSD performance.
Google + Glotzer = solution
