Flatcar systemd units fail at first boot after disk is resized (resize done because the storage was full) #1279

Open
ader1990 opened this issue Dec 6, 2023 · 1 comment
Labels
kind/bug Something isn't working

ader1990 commented Dec 6, 2023

Description

When the Flatcar storage is full, the instance is powered off and its disk is then resized.
After the instance is started again, the partition (vda9) is resized automatically, but some of the systemd units fail to start.

Impact

Low. After another reboot, the systemd units start normally again.

Environment and steps to reproduce

How to reproduce:

  • create a Flatcar qemu-kvm instance using the https://alpha.release.flatcar-linux.net/amd64-usr/current/flatcar_production_qemu.sh script
  • in the VM, create a file larger than the free space using dd if=/dev/zero of=toobig.file bs=1G count=1000, which stops when the disk runs out of space
  • shut down the VM
  • start the VM: several systemd services are in the failed state (expected)
  • shut down the VM
  • grow the image to a bigger size with qemu-img resize
  • start the VM: the same systemd services are in the failed state (unexpected), although vda9 has been resized
  • reboot the VM: the systemd services are no longer in the failed state
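
The host-side part of these steps could be scripted roughly as follows. This is only a sketch: it assumes the image and the launcher script were downloaded into the current directory under their upstream file names, the `+10G` growth is an arbitrary example, and the `run` wrapper with its `DRY_RUN` switch is a hypothetical convenience that is not part of Flatcar or QEMU.

```shell
#!/usr/bin/env bash
# Host-side helpers for the reproduction steps above.
# Nothing is executed until one of the functions is called.

# Assumption: both files sit in the current directory.
IMAGE="${IMAGE:-flatcar_production_qemu_image.img}"
LAUNCHER="${LAUNCHER:-./flatcar_production_qemu.sh}"

# Run a command, or only print it when DRY_RUN is set (hypothetical switch).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Grow the image while the VM is powered off.
# $1: image path (defaults to $IMAGE), $2: new size or increment (example default).
grow_image() {
    run qemu-img resize "${1:-$IMAGE}" "${2:-+10G}"
}

# Boot the VM; any extra arguments are handed to the launcher script unchanged.
start_vm() {
    run "$LAUNCHER" "$@"
}
```

A possible session: call `start_vm`, fill the disk in the guest with the dd command from the list, power the guest off, then run `grow_image` followed by `start_vm` on the host. Setting `DRY_RUN=1` first prints the commands instead of executing them.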

The image used was https://alpha.release.flatcar-linux.net/amd64-usr/current/flatcar_production_qemu_image.img

The issue is that some systemd units start before the resize has been performed.

Failed units:

Failed Units: 3
  systemd-hwdb-update.service
  systemd-journal-catalog-update.service
  systemd-update-done.service
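
To see why the units listed above failed on that boot, their journal entries can be collected inside the guest. The helper names `failed_units` and `show_failed_logs` below are made up for illustration; the sketch only relies on the standard `systemctl` and `journalctl` tools.

```shell
#!/usr/bin/env bash
# Run inside the guest after the first boot that follows the resize.

# Print the names of all failed units, one per line.
failed_units() {
    systemctl list-units --state=failed --no-legend --plain | awk '{print $1}'
}

# Show the current boot's journal entries for each failed unit.
show_failed_logs() {
    local unit
    for unit in $(failed_units); do
        echo "== $unit =="
        journalctl -b --no-pager -u "$unit"
    done
}
```

Calling `show_failed_logs` right after the resize boot, and again after the following reboot, would show whether the errors are tied to the disk still being full when the units ran.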

ader1990 added the kind/bug label Dec 6, 2023
ader1990 changed the title from "Flatcar systemd units fail at first boot after disk is resized (because storage was full)" to "Flatcar systemd units fail at first boot after disk is resized (resize done because the storage was full)" Dec 6, 2023

pothos (Member) commented Dec 6, 2023

Thanks. For systemd-hwdb-update.service, we should pre-build the DB at image generation with systemd-hwdb update --usr --root=/build/amd64-usr/. When /usr/lib/udev/hwdb.bin exists, it will be used. In the update postinst action, we can delete the file in /etc from the upperdir, or delete it at boot.

For the rest, we should do the resize from the initrd, before initrd-setup-root, to make sure that we always use the available space. (Maybe systemd-repart could do that.)
