How to shrink root filesystem without booting a livecd

Today i looked for a solution to shrink a root filesystem wihtout rebooting a livecd.

After a while i found something on a mailinglist i want to share with you.

Thanks a lot to Andrew Wood and Tom Hunt for this instructions.

  1. Ensure the system is in a stable state

    Make sure no one else is using it and nothing else important is going on. It's probably a good idea to stop service-providing units like httpd or ftpd, just to ensure external connections don't disrupt things in the middle.:

    systemctl stop httpd
    systemctl stop nfs-server
    # and so on....
    
  1. Unmount all unused filesystems:

    umount -a
    

    This will print a number of 'Target is busy' warnings, for the root volume itself and for various temporary/system FSs. These can be ignored for the moment. What's important is that no on-disk filesystems remain mounted, except the root filesystem itself. Verify this:

    # mount alone provides the info, but column makes it possible to read
    mount | column -t
    

    If you see any on-disk filesystems still mounted, then something is still running that shouldn't be. Check what it is using fuser:

    # if necessary:
    yum install psmisc
    # then:
    fuser -vm <mountpoint>
    systemctl stop <whatever>
    umount -a
    # repeat as required...
    
  2. Make the temporary root:

    mkdir /tmp/tmproot
    mount -t tmpfs none /tmp/tmproot
    mkdir /tmp/tmproot/{proc,sys,dev,run,usr,var,tmp,oldroot}
    cp -ax /{bin,etc,mnt,sbin,lib,lib64} /tmp/tmproot/
    cp -ax /usr/{bin,sbin,lib,lib64} /tmp/tmproot/usr/
    cp -ax /var/{account,empty,lib,local,lock,nis,opt,preserve,run,spool,tmp,yp} /tmp/tmproot/var/
    

    This creates a very minimal root system, which breaks (among other things) manpage viewing (no /usr/share), user-level customizations (no /root or /home) and so forth. This is intentional, as it constitutes encouragement not to stay in such a jury-rigged root system any longer than necessary.

    At this point you should also ensure that all the necessary software is installed, as it will also assuredly break the package manager. Glance through all the steps, and make sure you have the necessary executables.

  3. Pivot into the root:

    mount --make-rprivate / # necessary for pivot_root to work
    pivot_root /tmp/tmproot /tmp/tmproot/oldroot
    for i in dev proc sys run; do mount --move /oldroot/$i /$i; done
    

    systemd causes mounts to allow subtree sharing by default (as with mount --make-shared), and this causes pivot_root to fail. Hence, we turn this off globally with mount --make-rprivate /. System and temporary filesystems are moved wholesale into the new root. This is necessary to make it work at all; the sockets for communication with systemd, among other things, live in /run, and so there's no way to make running processes close it.

  4. Ensure remote access survived the changeover:

    systemctl restart sshd
    systemctl status sshd
    

    After restarting sshd, ensure that you can get in, by opening another terminal and connecting to the machine again via ssh. If you can't, fix the problem before moving on.

    Once you've verified you can connect in again, exit the shell you're currently using and reconnect. This allows the remaining forked sshd to exit and ensures the new one isn't holding /oldroot.

  5. Close everything still using the old root:

    fuser -vm /oldroot
    

    This will print a list of processes still holding onto the old root directory. On my system, it looked like this:

                 USER        PID ACCESS COMMAND
    /oldroot:    root     kernel mount /oldroot
                 root          1 ...e. systemd
                 root        549 ...e. systemd-journal
                 root        563 ...e. lvmetad
                 root        581 f..e. systemd-udevd
                 root        700 F..e. auditd
                 root        723 ...e. NetworkManager
                 root        727 ...e. irqbalance
                 root        730 F..e. tuned
                 root        736 ...e. smartd
                 root        737 F..e. rsyslogd
                 root        741 ...e. abrtd
                 chrony      742 ...e. chronyd
                 root        743 ...e. abrt-watch-log
                 libstoragemgmt    745 ...e. lsmd
                 root        746 ...e. systemd-logind
                 dbus        747 ...e. dbus-daemon
                 root        753 ..ce. atd
                 root        754 ...e. crond
                 root        770 ...e. agetty
                 polkitd     782 ...e. polkitd
                 root       1682 F.ce. master
                 postfix    1714 ..ce. qmgr
                 postfix   12658 ..ce. pickup
    

    You need to deal with each one of these processes before you can unmount /oldroot. The brute-force approach is simply kill $PID for each, but this can break things. To do it more softly:

    systemctl | grep running
    

    This creates a list of running services. You should be able to correlate this with the list of processes holding /oldroot, then issue systemctl restart for each of them. Some services will refuse to come up in the temporary root and enter a failed state; these don't really matter for the moment.

    Some processes can't be dealt with via simple systemctl restart. For me these included auditd (which doesn't like to be killed via systemctl, and so just wanted a kill -15). These can be dealt with individually.

    The last process you'll find, usually, is systemd itself. For this, run systemctl daemon-reexec.

    Once you're done, the table should look like this:

                 USER        PID ACCESS COMMAND
    /oldroot:    root     kernel mount /oldroot
    
  6. Unmount the old root:

    umount /oldroot
    

    At this point, you can carry out whatever manipulations you require. The original question needed a simple resize2fs invocation, but you can do whatever you want here; one other use case is transferring the root filesystem from a simple partition to LVM/RAID/whatever.

  7. Pivot the root back:

    mount <blockdev> /oldroot
    mount --make-rprivate / # again
    pivot_root /oldroot /oldroot/tmp/tmproot
    for i in dev proc sys run; do mount --move /tmp/tmproot/$i /$i; done
    

    This is a straightforward reversal of step 4.

  8. Dispose of the temporary root

    Repeat steps 5 and 6, except using /tmp/tmproot in place of /oldroot. Then:

    umount /tmp/tmproot
    rmdir /tmp/tmproot
    

    Since it's a tmpfs, at this point the temporary root dissolves into the ether, never to be seen again.

  9. Put things back in their places

    Mount filesystems again:

    mount -a
    

    At this point, you should also update /etc/fstab and grub.cfg in accordance with any adjustments you made during step 7.

    Restart any failed services:

    systemctl | grep failed
    systemctl restart <whatever>
    

    Allow shared subtrees again:

    mount --make-rshared /
    

    Start the stopped service units - you can use this single command:

    systemctl isolate default.target
    

And you're done.

Comments

Comments powered by Disqus