Details
Task
Resolution: Done
P2: Important
Description
One of the most important properties of an update system is reliability. When you buy a device, you expect it to simply work (boot), even after system updates.
OSTree is designed to be safe in power cut scenarios. This is achieved by an atomic symlink swap for the boot loader entries, performed only after all update steps have finished and file system buffers have been flushed.
The atomic swap is done via the renameat() system call, using the pattern described in the renameat man page:
"If newpath already exists, it will be atomically replaced (subject to a few conditions; see ERRORS below), so that there is no point at which another process attempting to access newpath will find it missing."
The atomicity here is meant at the software layer. If a power cut occurs during this process, it is the file system's task to ensure that the file contains either the old or the new contents, not garbage.
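The swap pattern described above can be sketched roughly as follows. This is only an illustration in Python, not OSTree's actual code (OSTree is written in C and calls renameat() directly). The function name, the ".tmp-" prefix, and the "current", "deploy.0" and "deploy.1" names are hypothetical, and a Linux system is assumed, where a directory can be opened and passed to fsync().

```python
import os
import tempfile


def swap_symlink_atomically(link_path: str, new_target: str) -> None:
    """Point link_path at new_target without link_path ever going missing."""
    directory = os.path.dirname(os.path.abspath(link_path))
    # Hypothetical temporary name, placed in the same directory as the link.
    tmp_link = os.path.join(directory, ".tmp-" + os.path.basename(link_path))

    # Remove a stale temporary link left behind by an interrupted earlier run.
    if os.path.lexists(tmp_link):
        os.unlink(tmp_link)

    # Create the new symlink under the temporary name first.
    os.symlink(new_target, tmp_link)

    # os.replace() uses rename(2) on POSIX: link_path is replaced atomically,
    # so other processes see either the old link or the new one.
    os.replace(tmp_link, link_path)

    # Ask the kernel to flush the directory entry to storage. Whether this
    # survives a power cut still depends on the file system and the hardware.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        link = os.path.join(root, "current")
        os.symlink("deploy.0", link)
        swap_symlink_atomically(link, "deploy.1")
        print(os.readlink(link))
```

Note that this only covers atomicity as seen by running processes. It says nothing about what a boot loader's file system driver sees after an unclean shutdown, which is the open question below.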
AFAIK, FS journals are replayed when mounting the FS. Mounting happens after the boot loader has done its job.
Are the FS drivers that are shipped with boot loaders smart enough to detect an inconsistent FS and replay the FS journal? Or do they simply assume that the FS is in a clean state?
If boot loaders assume that the FS is in a consistent state, how can we ensure that the file system actually is in a consistent state when the boot loader (u-boot, grub, etc.) attempts to read from it?
1) Somehow detect an unclean shutdown. Reboot into a recovery initramfs (on a read-only partition) and fix the FS state on the boot/rootfs partition. Reboot again. This time the system should see that everything is fine and boot normally.
2) Something else?
Some useful resources:
Preventing Filesystem Corruption in Embedded Linux: https://www.embeddedarm.com/about/resource.php?item=459
List of various techniques: http://stackoverflow.com/questions/14460091/embedded-file-system-and-power-off
Issue Links
relates to: QTBUG-52784 Detect and recover from hanging boot / kernel panic (Closed)