ece5c0f304
I noticed booting a system with an ext4 root which expanded to 5T took quite a long time (12 minutes in some cases, 43(!) in others.) I changed stage-1 to run `resize2fs -d 62` for extra debug output and timing information. It revealed the adjust_superblock step taking almost all of the time: [Fri Oct 30 11:10:15 UTC 2020] zero_high_bits_in_metadata: Memory used: 132k/0k (63k/70k), time: 0.00/ 0.00/ 0.00 [Fri Oct 30 11:21:09 UTC 2020] adjust_superblock: Memory used: 396k/4556k (295k/102k), time: 654.21/ 0.59/ 5.13 but when I ran resize2fs on a disk with the identical content growing to the identical target size, it would only take about 30 seconds. I looked at what happened between those two steps in the fast case with strace and found: ``` 235 getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=1795}, ru_stime={tv_sec=0, tv_usec=3590}, ...}) = 0 236 write(1, "zero_high_bits_in_metadata: Memo"..., 84zero_high_bits_in_metadata: Memory used: 132k/0k (72k/61k), time: 0.00/ 0.00/ 0.00 237 ) = 84 238 gettimeofday({tv_sec=1604061278, tv_usec=480147}, NULL) = 0 239 getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=1802}, ru_stime={tv_sec=0, tv_usec=3603}, ...}) = 0 240 gettimeofday({tv_sec=1604061278, tv_usec=480192}, NULL) = 0 241 mmap(NULL, 2564096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fa3c7355000 242 access("/sys/fs/ext4/features/lazy_itable_init", F_OK) = 0 243 brk(0xf85000) = 0xf85000 244 brk(0xfa6000) = 0xfa6000 245 gettimeofday({tv_sec=1604061278, tv_usec=538828}, NULL) = 0 246 getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=58720}, ru_stime={tv_sec=0, tv_usec=3603}, ...}) = 0 247 write(1, "adjust_superblock: Memory used: "..., 79adjust_superblock: Memory used: 396k/2504k (305k/92k), time: 0.06/ 0.06/ 0.00 248 ) = 79 249 gettimeofday({tv_sec=1604061278, tv_usec=539119}, NULL) = 0 250 getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=58812}, ru_stime={tv_sec=0, tv_usec=3603}, ...}) = 0 251 gettimeofday({tv_sec=1604061279, tv_usec=939}, NULL) = 0 252 getrusage(RUSAGE_SELF, {ru_utime={tv_sec=0, tv_usec=520411}, ru_stime={tv_sec=0, tv_usec=3603}, ...}) = 0 253 write(1, "fix_uninit_block_bitmaps 2: Memo"..., 88fix_uninit_block_bitmaps 2: Memory used: 396k/2504k (305k/92k), time: 0.46/ 0.46/ 0.00 254 ) = 88 ``` In particular the access to /sys/fs seemed interesting. Looking at the source of resize2fs: ``` [root@ip-172-31-22-182:~/e2fsprogs-1.45.5]# rg -B2 -A1 /sys/fs/ext4/features/lazy_itable_init . ./resize/resize2fs.c 923- if (getenv("RESIZE2FS_FORCE_LAZY_ITABLE_INIT") || 924- (!getenv("RESIZE2FS_FORCE_ITABLE_INIT") && 925: access("/sys/fs/ext4/features/lazy_itable_init", F_OK) == 0)) 926- lazy_itable_init = 1; ``` I confirmed /sys is mounted, and then found a bug suggesting the ext4 module is maybe not loaded: https://bugzilla.redhat.com/show_bug.cgi?id=1071909 My home server doesn't have ext4 loaded and had 3T to play with, so I tried (and succeeded with) replicating the issue locally: ``` [root@kif:/scratch]# lsmod | grep -i ext [root@kif:/scratch]# zfs create -V 3G rpool/scratch/ext4 [root@kif:/scratch]# time mkfs.ext4 /dev/zvol/rpool/scratch/ext4 mke2fs 1.45.5 (07-Jan-2020) Discarding device blocks: done Creating filesystem with 786432 4k blocks and 196608 inodes Filesystem UUID: 560a4a8f-93dc-40cc-97a5-f10049bf801f Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912 Allocating group tables: done Writing inode tables: done Creating journal (16384 blocks): done Writing superblocks and filesystem accounting information: done real 0m2.261s user 0m0.000s sys 0m0.025s [root@kif:/scratch]# zfs set volsize=3T rpool/scratch/ext4 [root@kif:/scratch]# time resize2fs -d 62 /dev/zvol/rpool/scratch/ext4 resize2fs 1.45.5 (07-Jan-2020) fs has 11 inodes, 1 groups required. fs requires 16390 data blocks. With 1 group(s), we have 22234 blocks available. Last group's overhead is 10534 Need 16390 data blocks in last group Final size of last group is 26924 Estimated blocks needed: 26924 Extents safety margin: 49 Resizing the filesystem on /dev/zvol/rpool/scratch/ext4 to 805306368 (4k) blocks. read_bitmaps: Memory used: 132k/0k (63k/70k), time: 0.00/ 0.00/ 0.00 read_bitmaps: I/O read: 1MB, write: 0MB, rate: 3802.28MB/s fix_uninit_block_bitmaps 1: Memory used: 132k/0k (63k/70k), time: 0.00/ 0.00/ 0.00 resize_group_descriptors: Memory used: 132k/0k (68k/65k), time: 0.00/ 0.00/ 0.00 move_bg_metadata: Memory used: 132k/0k (68k/65k), time: 0.00/ 0.00/ 0.00 zero_high_bits_in_metadata: Memory used: 132k/0k (68k/65k), time: 0.00/ 0.00/ 0.00 ``` here it got stuck for quite some time ... straceing this 20 minutes in revealed this in a tight loop: ``` getuid() = 0 geteuid() = 0 getgid() = 0 getegid() = 0 prctl(PR_GET_DUMPABLE) = 1 (SUID_DUMP_USER) fallocate(3, FALLOC_FL_ZERO_RANGE, 2222649901056, 2097152) = 0 fsync(3) = 0 ``` it finally ended 43(!) minutes later: ``` adjust_superblock: Memory used: 264k/3592k (210k/55k), time: 2554.03/ 0.16/15.07 fix_uninit_block_bitmaps 2: Memory used: 264k/3592k (210k/55k), time: 0.16/ 0.16/ 0.00 blocks_to_move: Memory used: 264k/3592k (211k/54k), time: 0.00/ 0.00/ 0.00 Number of free blocks: 755396/780023556, Needed: 0 block_mover: Memory used: 264k/3592k (216k/49k), time: 0.05/ 0.01/ 0.00 block_mover: I/O read: 1MB, write: 0MB, rate: 18.68MB/s inode_scan_and_fix: Memory used: 264k/3592k (216k/49k), time: 0.00/ 0.00/ 0.00 inode_ref_fix: Memory used: 264k/3592k (216k/49k), time: 0.00/ 0.00/ 0.00 move_itables: Memory used: 264k/3592k (216k/49k), time: 0.00/ 0.00/ 0.00 calculate_summary_stats: Memory used: 264k/3592k (216k/49k), time: 16.35/16.35/ 0.00 fix_resize_inode: Memory used: 264k/3592k (222k/43k), time: 0.04/ 0.00/ 0.00 fix_resize_inode: I/O read: 1MB, write: 0MB, rate: 22.80MB/s fix_sb_journal_backup: Memory used: 264k/3592k (222k/43k), time: 0.00/ 0.00/ 0.00 overall resize2fs: Memory used: 264k/3592k (222k/43k), time: 2570.90/16.68/15.07 overall resize2fs: I/O read: 1MB, write: 1MB, rate: 0.00MB/s The filesystem on /dev/zvol/rpool/scratch/ext4 is now 805306368 (4k) blocks long. real 43m1.943s user 0m16.761s sys 0m15.069s ``` I then cleaned up and recreated the zvol, loaded the ext4 module, created the ext4 fs, resized the volume, and resize2fs'd and it went quite quickly: ``` [root@kif:/scratch]# zfs destroy rpool/scratch/ext4 [root@kif:/scratch]# zfs create -V 3G rpool/scratch/ext4 [root@kif:/scratch]# modprobe ext4 [root@kif:/scratch]# time resize2fs -d 62 /dev/zvol/rpool/scratch/ext4 [root@kif:/scratch]# time mkfs.ext4 /dev/zvol/rpool/scratch/ext4 mke2fs 1.45.5 (07-Jan-2020) Discarding device blocks: done Creating filesystem with 786432 4k blocks and 196608 inodes Filesystem UUID: 5b415f2f-a8c4-4ba0-ac1d-78860de77610 Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912 Allocating group tables: done Writing inode tables: done Creating journal (16384 blocks): done Writing superblocks and filesystem accounting information: done real 0m1.013s user 0m0.001s sys 0m0.023s [root@kif:/scratch]# zfs set volsize=3T rpool/scratch/ext4 [root@kif:/scratch]# time resize2fs -d 62 /dev/zvol/rpool/scratch/ext4 resize2fs 1.45.5 (07-Jan-2020) fs has 11 inodes, 1 groups required. fs requires 16390 data blocks. With 1 group(s), we have 22234 blocks available. Last group's overhead is 10534 Need 16390 data blocks in last group Final size of last group is 26924 Estimated blocks needed: 26924 Extents safety margin: 49 Resizing the filesystem on /dev/zvol/rpool/scratch/ext4 to 805306368 (4k) blocks. read_bitmaps: Memory used: 132k/0k (63k/70k), time: 0.00/ 0.00/ 0.00 read_bitmaps: I/O read: 1MB, write: 0MB, rate: 3389.83MB/s fix_uninit_block_bitmaps 1: Memory used: 132k/0k (63k/70k), time: 0.00/ 0.00/ 0.00 resize_group_descriptors: Memory used: 132k/0k (68k/65k), time: 0.00/ 0.00/ 0.00 move_bg_metadata: Memory used: 132k/0k (68k/65k), time: 0.00/ 0.00/ 0.00 zero_high_bits_in_metadata: Memory used: 132k/0k (68k/65k), time: 0.00/ 0.00/ 0.00 adjust_superblock: Memory used: 264k/1540k (210k/55k), time: 0.02/ 0.02/ 0.00 fix_uninit_block_bitmaps 2: Memory used: 264k/1540k (210k/55k), time: 0.15/ 0.15/ 0.00 blocks_to_move: Memory used: 264k/1540k (211k/54k), time: 0.00/ 0.00/ 0.00 Number of free blocks: 755396/780023556, Needed: 0 block_mover: Memory used: 264k/3592k (216k/49k), time: 0.01/ 0.01/ 0.00 block_mover: I/O read: 1MB, write: 0MB, rate: 157.11MB/s inode_scan_and_fix: Memory used: 264k/3592k (216k/49k), time: 0.00/ 0.00/ 0.00 inode_ref_fix: Memory used: 264k/3592k (216k/49k), time: 0.00/ 0.00/ 0.00 move_itables: Memory used: 264k/3592k (216k/49k), time: 0.00/ 0.00/ 0.00 calculate_summary_stats: Memory used: 264k/3592k (216k/49k), time: 16.20/16.20/ 0.00 fix_resize_inode: Memory used: 264k/3592k (222k/43k), time: 0.00/ 0.00/ 0.00 fix_resize_inode: I/O read: 1MB, write: 0MB, rate: 5319.15MB/s fix_sb_journal_backup: Memory used: 264k/3592k (222k/43k), time: 0.00/ 0.00/ 0.00 overall resize2fs: Memory used: 264k/3592k (222k/43k), time: 16.45/16.38/ 0.00 overall resize2fs: I/O read: 1MB, write: 1MB, rate: 0.06MB/s The filesystem on /dev/zvol/rpool/scratch/ext4 is now 805306368 (4k) blocks long. real 0m17.908s user 0m16.386s sys 0m0.079s ``` Success! |
||
---|---|---|
.. | ||
doc | ||
lib | ||
maintainers | ||
modules | ||
tests | ||
COPYING | ||
default.nix | ||
README | ||
release-combined.nix | ||
release-small.nix | ||
release.nix |
*** NixOS *** NixOS is a Linux distribution based on the purely functional package management system Nix. More information can be found at https://nixos.org/nixos and in the manual in doc/manual.