As IT pros, we have a myriad of tools available to configure, tweak, and tune the systems we manage. There are so many, in fact, that everyday tools right under our noses often have applications we don’t immediately realize. In a Linux environment, tune2fs is an indispensable tool for tuning parameters on ext2/ext3/ext4 filesystems. Most Linux sysadmins who have used mdadm software RAID will recognize this utility if they’ve ever had to adjust the stride size or stripe width of a filesystem on an array.
tune2fs
First, let’s take a look at the disks on an Ubuntu file server so we can see what this tool does.
    user1@fileserver1:~$ df -Th
    Filesystem     Type      Size  Used Avail Use% Mounted on
    udev           devtmpfs  987M  4.0K  987M   1% /dev
    tmpfs          tmpfs     200M  4.3M  196M   3% /run
    /dev/sda2      ext4       11G  4.7G  5.7G  46% /
    none           tmpfs     4.0K     0  4.0K   0% /sys/fs/cgroup
    none           tmpfs     5.0M     0  5.0M   0% /run/lock
    none           tmpfs     998M     0  998M   0% /run/shm
    none           tmpfs     100M     0  100M   0% /run/user
    /dev/sdb1      ext4      382G  314G   50G  87% /mnt/multimedia
Now we can use tune2fs with the -l option to list the current contents of the filesystem superblock on /dev/sdb1.
    user1@fileserver1:~$ sudo tune2fs -l /dev/sdb1
    tune2fs 1.42.9 (4-Feb-2014)
    Filesystem volume name:   <none>
    Last mounted on:          /mnt/multimedia
    Filesystem UUID:          4b3897ea-aa0a-42cd-89eb-63bff20500ad
    Filesystem magic number:  0xEF53
    Filesystem revision #:    1 (dynamic)
    Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
    Filesystem flags:         signed_directory_hash
    Default mount options:    user_xattr acl
    Filesystem state:         clean
    Errors behavior:          Continue
    Filesystem OS type:       Linux
    Inode count:              25427968
    Block count:              101711611
    Reserved block count:     5085579
    Free blocks:              18388337
    Free inodes:              25402275
    First block:              0
    Block size:               4096
    Fragment size:            4096
    Reserved GDT blocks:      999
    Blocks per group:         32768
    Fragments per group:      32768
    Inodes per group:         8192
    Inode blocks per group:   512
    RAID stride:              32747
    Flex block group size:    16
    Filesystem created:       Wed Dec 28 20:03:43 2016
    Last mount time:          Mon Dec 11 08:18:18 2017
    Last write time:          Mon Dec 11 08:18:18 2017
    Mount count:              6
    Maximum mount count:      -1
    Last checked:             Thu Dec 29 13:56:45 2016
    Check interval:           0 (<none>)
    Lifetime writes:          322 GB
    Reserved blocks uid:      0 (user root)
    Reserved blocks gid:      0 (group root)
    First inode:              11
    Inode size:               256
    Required extra isize:     28
    Desired extra isize:      28
    Journal inode:            8
    Default directory hash:   half_md4
    Directory Hash Seed:      d3eec02d-ca75-452c-8847-10d1fb381fdb
    Journal backup:           inode blocks
Reserved Blocks?
As you can see, there are a number of filesystem parameters that we can view, several of which can be tuned with tune2fs. In this article, however, we’re going to focus on a rather simple and seemingly innocuous parameter: the reserved block count. Let’s take a look at that parameter again:
    user1@fileserver1:~$ sudo tune2fs -l /dev/sdb1 | grep 'Reserved block count:'
    Reserved block count:     5085579
At first glance, it isn’t obvious what this parameter means. In fact, I’ve worked with Linux sysadmins with years of experience who weren’t aware of this little gem. To understand this parameter, we have to put its origins in a bit of context. Once upon a time, SSDs didn’t exist, and no one knew what a terabyte was. In fact, I remember shelling out well north of $100 for my first 20GB drive. To date myself even further, I remember the first 486-DX PC I built with my father in the early ’90s, and its drive was measured in megabytes. Crazy, I know. Since drive space wasn’t always so plentiful, and the consequences of running out of disk space on the root partition of a Linux system are numerous, early filesystem developers did something smart: they reserved a percentage of filesystem blocks for privileged processes. This ensured that even if disk space ran precariously low, the root user could still log in, and the system could still execute critical processes.
That magic number? Five percent.
And while five percent of that 20GB drive back in 1998 wasn’t very much space, imagine that new 4-disk RAID1/0 array you just created with 10TB WD Red Pros. That’s five percent of 20TB of usable space, or a full terabyte. You see, though this reserve was likely intended for the root filesystem, by default it applies to every ext2/ext3/ext4 filesystem you create. Now, I don’t know about you, but at $450 for a 10TB WD Red Pro, that’s not exactly space I’d want to throw away.
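To make that arithmetic concrete, here is a minimal sketch of the calculation in shell. The function name `reserved_gb` is made up for this illustration, and the sizes are the round figures used above rather than exact drive capacities.

```shell
#!/bin/sh
# reserved_gb SIZE_GB PERCENT
# Prints how many GB of a SIZE_GB filesystem are held back at PERCENT.
# Integer arithmetic only, so fractional results are truncated.
# (reserved_gb is a hypothetical helper name, not part of e2fsprogs.)
reserved_gb() {
    echo $(( $1 * $2 / 100 ))
}

# The two drives from the text, plus a smaller reserve for comparison:
#   reserved_gb 20 5       # 20GB drive at the default 5%   -> 1
#   reserved_gb 20000 5    # 20TB usable at the default 5%  -> 1000
#   reserved_gb 20000 1    # the same array at 1%           -> 200
```

The same percentage that cost a single gigabyte on the old drive costs a thousand on the new array, which is why the default deserves a second look on large data volumes.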
We Don’t Need No Stinking Reserved Blocks!
The good news, however, is that the space isn’t lost forever. If you didn’t set this parameter when you created the filesystem, tune2fs lets you reclaim the space after the fact with the -m option, which sets the percentage of blocks held in reserve.
    user1@fileserver1:~$ sudo tune2fs -m 0 /dev/sdb1
    tune2fs 1.42.9 (4-Feb-2014)
    Setting reserved blocks percentage to 0% (0 blocks)
Here you can see we’ve set the reserved blocks on /dev/sdb1 to 0%. Again, this isn’t something you’d want to do on a root filesystem, but for our “multimedia” drive, this is fine – more on that later. Now, let’s look at our filesystem parameters once again.
    user1@fileserver1:~$ sudo tune2fs -l /dev/sdb1
    tune2fs 1.42.9 (4-Feb-2014)
    Filesystem volume name:   <none>
    Last mounted on:          /mnt/multimedia
    Filesystem UUID:          4b3897ea-aa0a-42cd-89eb-63bff20500ad
    Filesystem magic number:  0xEF53
    Filesystem revision #:    1 (dynamic)
    Filesystem features:      has_journal ext_attr resize_inode dir_index filetype needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
    Filesystem flags:         signed_directory_hash
    Default mount options:    user_xattr acl
    Filesystem state:         clean
    Errors behavior:          Continue
    Filesystem OS type:       Linux
    Inode count:              25427968
    Block count:              101711611
    Reserved block count:     0
    Free blocks:              18388337
    Free inodes:              25402275
    First block:              0
    Block size:               4096
    Fragment size:            4096
    Reserved GDT blocks:      999
    Blocks per group:         32768
    Fragments per group:      32768
    Inodes per group:         8192
    Inode blocks per group:   512
    RAID stride:              32747
    Flex block group size:    16
    Filesystem created:       Wed Dec 28 20:03:43 2016
    Last mount time:          Mon Dec 11 08:18:18 2017
    Last write time:          Tue Jan 9 21:01:10 2018
    Mount count:              6
    Maximum mount count:      -1
    Last checked:             Thu Dec 29 13:56:45 2016
    Check interval:           0 (<none>)
    Lifetime writes:          322 GB
    Reserved blocks uid:      0 (user root)
    Reserved blocks gid:      0 (group root)
    First inode:              11
    Inode size:               256
    Required extra isize:     28
    Desired extra isize:      28
    Journal inode:            8
    Default directory hash:   half_md4
    Directory Hash Seed:      d3eec02d-ca75-452c-8847-10d1fb381fdb
    Journal backup:           inode blocks
Notice that our reserved block count is now zero. Finally, let’s have a look at our free disk space to see the real-world impact. Initially, we had 50GB free out of 382GB. Now we can see that, although neither the size of the disk nor the amount of used space has changed, we have 69GB free, reclaiming 19GB of space.
    user1@fileserver1:~$ df -Th
    Filesystem     Type      Size  Used Avail Use% Mounted on
    udev           devtmpfs  987M  4.0K  987M   1% /dev
    tmpfs          tmpfs     200M  4.3M  196M   3% /run
    /dev/sda2      ext4       11G  4.7G  5.7G  46% /
    none           tmpfs     4.0K     0  4.0K   0% /sys/fs/cgroup
    none           tmpfs     5.0M     0  5.0M   0% /run/lock
    none           tmpfs     998M     0  998M   0% /run/shm
    none           tmpfs     100M     0  100M   0% /run/user
    /dev/sdb1      ext4      382G  314G   69G  82% /mnt/multimedia
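The 19GB that reappeared in df lines up with the superblock values listed before the change: 5085579 reserved blocks at 4096 bytes each. As a rough cross-check, the sketch below turns `tune2fs -l` output into a size and a percentage. The function name `reserved_summary` is invented for this example, and it assumes the field labels look exactly as they do in the listings in this article.

```shell
#!/bin/sh
# reserved_summary: reads `tune2fs -l` output on stdin and prints the reserved
# space in GiB plus its share of all blocks. (Hypothetical helper name; the
# field labels are assumed to match the tune2fs 1.42.9 listings in this post.)
reserved_summary() {
    awk -F': +' '
        $1 == "Block count"          { total = $2 }
        $1 == "Reserved block count" { reserved = $2 }
        $1 == "Block size"           { size = $2 }
        END {
            # Bail out if the expected fields were not found.
            if (total == 0 || size == 0) exit 1
            printf "%.1f GiB reserved (%.1f%% of blocks)\n",
                   reserved * size / 1073741824, 100 * reserved / total
        }'
}

# Usage on a live system (needs root to read the superblock):
#   sudo tune2fs -l /dev/sdb1 | reserved_summary
```

Fed the original values from /dev/sdb1, this prints `19.4 GiB reserved (5.0% of blocks)`, which matches both the five percent default and the jump from 50G to 69G available, since df -h reports sizes in powers of 1024.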
Defrag Implications
Lastly, I’d be remiss if I didn’t mention that these reserved blocks serve one other function. As always, in life there’s no such thing as a free lunch (or free space, in this case). The reserved blocks also help keep fragmentation down, since the filesystem can lay files out more contiguously when it always has some free blocks to work with. Clearly, removing the reserve isn’t something you’d want to do on a filesystem that holds a database, or in any other situation with a large number of writes and deletions. However, if, as in our case, you’re dealing with mostly static data in a write once, read many (WORM) type of configuration, this shouldn’t have a noticeable impact. In fact, the primary developer of tune2fs, Google’s Theodore Ts’o, has confirmed this supposition.
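Given those caveats, it may be worth wrapping the change in a small guard so it can’t be pointed at the root filesystem by accident. The following is only a sketch: `run`, `set_reserved_pct`, and the `APPLY` variable are names made up for this example, and the root device is passed in explicitly (on a live system it could come from something like `findmnt -n -o SOURCE /`). The only real command involved is `tune2fs -m`, as used earlier.

```shell
#!/bin/sh
# Dry-run helper: prints the command instead of running it unless APPLY=1.
# (run, set_reserved_pct and APPLY are hypothetical names for this sketch.)
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# set_reserved_pct DEVICE PERCENT ROOT_DEVICE
# Changes the reserved-block percentage on DEVICE, but refuses when DEVICE
# is the device that backs the root filesystem.
set_reserved_pct() {
    dev=$1 pct=$2 root_dev=$3
    if [ "$dev" = "$root_dev" ]; then
        echo "refusing: $dev holds the root filesystem" >&2
        return 1
    fi
    run tune2fs -m "$pct" "$dev"
}

# Example with the devices from this article (dry run by default):
#   set_reserved_pct /dev/sdb1 0 /dev/sda2
#   APPLY=1 set_reserved_pct /dev/sdb1 0 /dev/sda2   # actually applies it, as root
```

Keeping the percentage as a parameter leaves room for a middle ground, such as a small non-zero reserve on volumes that see more churn than a media archive.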
So there you have it. You may be missing out on some valuable space, especially in those multi-terabyte arrays out there. And though it’s no longer 1998, and terabytes do come fairly cheap these days, it’s still nice to know you’re getting all that you paid for.