Resizing the Linux Root Partition in a Gen2 Hyper-V VM


Without a doubt, modern virtualization has changed the landscape of enterprise computing forever. Since virtual machines are abstracted away from the physical hardware, changes in compute, memory, and storage resources become mere clicks of a mouse. And as hypervisors mature, many operations that were once thought of as out-of-band tasks, such as adding storage or even memory, can now be done with little or even zero downtime.

Hyper-V SCSI Disks and Linux

In many cases, hypervisors are backed by large storage area networks (SANs). This provides shared storage for hypervisor nodes that supports failover clustering and high availability. Additionally, it gives administrators the ability to scale the virtual environment, including the ability to easily add or expand storage on existing virtual servers. Microsoft’s Hyper-V 2012 introduced Generation 2 VMs, which extend this functionality. Among the many benefits of Gen2 VMs was the ability to boot from a SCSI disk rather than IDE. This requires UEFI rather than a legacy BIOS, so it’s only supported on newer operating systems. Many admins I talk to think this is limited to Windows Server 2012 and newer, probably because of the sub-optimal phrasing in the Hyper-V VM creation UI, which altogether fails to mention Linux operating systems.

The fact is, however, that many newer Linux OSes also support this ability, as shown in these tables from Microsoft.

More Disk, Please

Once you’ve built a modern Linux VM and you’re booting from synthetic SCSI disks rather than emulated IDE drives, you gain numerous advantages, not the least of which is the ability to resize the OS virtual hard disk (VHDX) on the fly. This is really handy functionality – after all, what sysadmin hasn’t had an OS drive run low on disk space at some point in their career? This is simply done from the virtual machine settings in Hyper-V Manager or Failover Cluster Manager by editing the VHDX.

Now, if you’re a Microsoft gal or guy, you already know that what comes next is pretty straightforward. Open the Disk Management MMC, rescan the disks, extend the file system, and voilà, you now automagically have a bigger C:\ drive. But what about for Linux VMs? Though it might be a little less intuitive, we can still accomplish the same goal of expanding the primary OS disk with zero downtime in Linux.

On-the-Fly Resizing

To demonstrate this, let’s start with a vanilla, Hyper-V Generation 2, CentOS 7.6 VM with a 10GB VHDX attached to a SCSI controller in our VM. Let’s also assume we’re using the default LVM partitioning scheme during the CentOS install. Looking at the block devices in Linux, we can see that we have a 10GB disk called sda which has three partitions – sda1, sda2 and sda3. We’re interested in sda3, since that contains our root partition, which is currently 7.8GB, as demonstrated here by the lsblk command.
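For reference, the layout looks roughly like this on a stock CentOS 7 LVM install (the output below is illustrative; the small EFI and /boot partitions and the exact sizes are assumptions based on the default Gen2/UEFI install):

lsblk
NAME            MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda               8:0    0   10G  0 disk
├─sda1            8:1    0  200M  0 part /boot/efi
├─sda2            8:2    0    1G  0 part /boot
└─sda3            8:3    0  8.8G  0 part
  ├─centos-root 253:0    0  7.8G  0 lvm  /
  └─centos-swap 253:1    0    1G  0 lvm  [SWAP]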

Now let’s take a look at df. Here we can see an XFS filesystem on our 7.8GB logical volume, /dev/mapper/centos-root, which is mounted at /.

Finally, let’s have a look at our LVM summary:
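A terse summary of the LVM stack can be had with the standard reporting commands (pvdisplay, vgdisplay, and lvdisplay provide the verbose equivalents):

pvs
vgs
lvs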

From this information we can see that there’s currently no room to expand our physical volume or logical volume, as the entirety of /dev/sda is consumed. In the past, with a Gen1 Hyper-V virtual machine, we would have had to shut the VM down and edit the disk, since it used an emulated IDE controller. Now that we have a Gen2 CentOS VM with a SCSI controller, however, we can simply edit the disk on the fly, expanding it to 20GB.

Once the correct virtual disk is located, select the “Expand” option.

Next, provide the size of the new disk. We’ll bump this one to 20GB.

Finally, click “Finish” to resize the disk. This process should be instant for dynamic virtual hard disks, but may take anywhere from a few seconds to several minutes for fixed virtual hard disks, depending on the size of the expansion and the speed of your storage subsystem. You can then verify the new disk size by inspecting the disk.

OK, so we’ve expanded the VHDX in Hyper-V, but we haven’t done anything to make our VM’s operating system aware of the new space. As seen here with lsblk, the OS is indifferent to the expanded drive.

Taking a look at parted, we again see that our /dev/sda disk is still showing 10.7GB. We need to make the CentOS operating system aware of the new space. A reboot would certainly do this, but we want to perform this entire operation with no downtime.
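If you want to check this yourself, parted’s print command shows the current view of the disk and its partition table:

parted /dev/sda print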



Issue the following command to rescan the relevant disk – sda in our case. This tells the system to rescan the device for changes, and will report the new space to the kernel without a restart.
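A minimal sketch, assuming the disk is sda and you’re running as root (each SCSI disk exposes a rescan trigger in sysfs):

echo 1 > /sys/class/block/sda/device/rescan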

Now, when we look at parted again, we’re prompted to move the GPT table to the back of the disk, since the secondary table is no longer in the proper location after the VHDX expansion. Type “Fix” to correct this, and then once again to edit the GPT to use all the available disk space. Once this is complete, we can see that /dev/sda is now recognized as 20GB, but our sda3 partition is still only 10GB.

Next, from the parted CLI, use the resizepart command to grow the partition to the end of the disk.
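From within parted, something along these lines should do it; partition 3 is our sda3, and 100% tells parted to extend the partition to the end of the disk:

parted /dev/sda
(parted) resizepart 3 100%
(parted) quit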

Our sda3 partition is now using the maximum space available, 20.2GB. The lsblk command also now correctly reports our disk as 20GB.

But what about our LVM volumes? As suspected, our physical volumes, volume groups and logical volumes all remain unchanged.

We need to first tell our pv to expand into the available disk space on the partition. Do this with the pvresize command as follows:
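In our example the physical volume sits on /dev/sda3, so the command is simply:

pvresize /dev/sda3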

Sure enough, our pv is now 18.8GB with 10.00GB free. Now we need to extend the logical volume and its associated filesystem into the free pv space. We can do this with a single command:
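A minimal sketch of that single command, assuming the default CentOS device name and using lvextend’s -r (--resizefs) flag so the XFS filesystem is grown along with the logical volume:

lvextend -r -l +100%FREE /dev/mapper/centos-root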

Looking at our logical volumes confirms that our root lv is now 17.80GB of the 18.80GB total, or exactly 10.0GB larger than we started with, as one would expect to see.

A final confirmation with the df command illustrates that our XFS root filesystem was also resized.

Conclusion

So there you have it. Despite some hearsay to the contrary, modern Linux OSes run just fine as Gen2 VMs on Hyper-V. Coupled with a SCSI disk controller for the OS VHDX, this yields the advantage of zero-downtime root partition resizing in Linux, though it’s admittedly a few more steps than a Windows server requires. And though Linux on Hyper-V might not seem like the most intuitive choice to some sysadmins, Hyper-V has matured significantly over the past several releases and is quite a powerful and stable platform for both Linux and Windows. And one last thing – when you run critically low on disk space on Linux, don’t forget to check those reserved blocks for a quick fix!

Disk Pooling in Linux with mergerFS


Remember the old days when we used to marvel at disk drives that measured in the hundreds of megabytes? In retrospect it seems funny now, but it wasn’t uncommon to hear someone mutter, “Man, we’ll never fill that thing up.”

If you don’t remember that, you probably don’t recall life before iPhones either, but we old timers can assure you that, once upon a time, hard drives were a mere fraction of the size they are today. Oddly enough, though, you’ll still hear the same old tripe, even from fellow IT folks. The difference now, however, is that they’re holding a helium-filled 10TB drive in their hands. But just like yesteryear, we’ll fill it up, and it’ll happen faster than you think.

In the quest to build a more scalable home IT lab, we grew tired of the old paradigm of building bigger and bigger file servers and migrating hordes of data as we outgrew the drive arrays. Rather than labor with year-over-year upgrades, we ultimately arrived at a storage solution that we feel is the best compromise of scalability, performance, and cost, while delivering SAN-NAS hybrid flexibility.

We’ll detail the full storage build in another article, but for now we want to focus specifically on the Linux filesystem that we chose for the NAS component of our storage server: mergerFS.

What is mergerFS?

Before we get into the specifics regarding mergerFS, let’s provide some relevant definitions:

Filesystem – Simply put, a filesystem is a system of organization used to store and retrieve data in a computer system. The filesystem manages the space used, filenames, directories, and file metadata such as allocated blocks, timestamps, ACLs, etc.

Union Mount – A union mount is a method for joining multiple directories into a single directory. To a user interfacing with the union, the directory would appear to contain the aggregate contents of the directories that have been combined.

FUSE – Filesystem in Userspace. FUSE is software that allows non-privileged users to create and mount their own filesystems that run as an unprivileged process or daemon. Userspace filesystems have a number of advantages. Unlike kernel-space filesystems, which are rigorously tested and reviewed prior to acceptance into the Linux kernel, userspace filesystems, since they are abstracted away from the kernel, can be developed much more nimbly. It’s also unlikely that any filesystem that did not address mainstream requirements would be accepted into the kernel, so userspace filesystems are able to address more niche needs. We could go on about FUSE, but suffice it to say that userspace filesystems provide unique opportunities for developers to do some cool things with filesystems that were not easily attainable before.

So what is mergerFS? MergerFS is a union filesystem, similar to mhddfs, UnionFS, and aufs. MergerFS enables the joining of multiple directories that appear to the user as a single directory. This merged directory will contain all of the files and directories present in each of the joined directories. Furthermore, the merged directory will be mounted on a single point in the filesystem, greatly simplifying access and management of files and subdirectories. And, when each of the merged directories themselves are mount points representing individual disks or disk arrays, mergerFS effectively serves as a disk pooling utility, allowing us to group disparate hard disks, arrays or any combination of the two. At Teknophiles, we’ve used several union filesystems in the past, and mergerFS stands out to us for two reasons: 1) It’s extremely easy to install, configure, and use, and 2) It just plain works.

An example of two merged directories, /mnt/mergerFS01 and /mnt/mergerFS02, and the resultant union.

mergerFS vs. LVM

You might be asking yourself, “why not just use LVM, instead of mergerFS?” Though the Logical Volume Manager can address some of the same challenges that mergerFS does, such as creating a single large logical volume from numerous physical devices (disks and/or arrays), they’re quite different animals. LVM sits just above the physical disks and partitions, while mergerFS sits on top of the filesystems of those partitions. For our application, however, LVM has one significant drawback: When multiple physical volumes make up a single volume group and a logical volume is carved out from that volume group, it’s possible, and even probable, that a file will be comprised of physical extents from multiple devices. Consider the diagram below.

The practical concern here is that should an underlying physical volume in LVM fail catastrophically, be it a single disk or even a whole RAID array, any data on, or that spans, the failed volume will be impacted. Of course, the same is true of mergerFS, but since the data does not span physical volumes, it’s much easier to determine which files are impacted. There’s unfortunately no easy way that we’ve found to determine which files are located on which physical volumes in LVM.



Flexibility & Scalability

As we’ve alluded to a few times, mergerFS doesn’t care if the underlying physical volumes are single disks, RAID arrays, or LVM volumes. Since the two technologies operate at different levels with respect to the underlying disks, nothing prevents the use of LVM with mergerFS. In fact, we commonly use the following formula to create large mergerFS disk pools: multiple mdadm 4-Disk RAID10 arrays > LVM > mergerFS. In our case, LVM is usually just used for management; we typically do not span multiple physical volumes with any volume group, though you easily could.

This gives you incredible flexibility and scalability – want to add an individual 5400 RPM 4TB disk to an existing 12TB RAID6 array comprised of 6+2 7200 RPM 2TB drives, for 16TB total? No problem. Want to later add an LVM logical volume that spans two 8TB 2+2 RAID10 arrays for another 16TB? MergerFS is fine with that, too. In fact, mergerFS is completely agnostic to disk controller, drive size, speed, form factor, etc. With mergerFS, you can grow your storage server as you see fit.

mergerFS & Your Data

One interesting note about mergerFS is that since it is just a proxy for your data, it does not manipulate the data in any way. Prior to being part of a mergerFS union, each of your component disks, arrays, and logical volumes will already have a filesystem. This makes data recovery quite simple – should a mergerFS pool completely crash (though unlikely), just remove the component storage devices, drop them into a compatible system, mount as usual, and access your data.

What’s more, you can just as easily add a disk to mergerFS that already has data on it. This allows you to decide at some later point if you wish to add an in-use disk to the mergerFS pool (try that with LVM). The existing data will simply show up in the mergerFS filesystem, along with whatever data is on the other volumes. It just doesn’t get any more straightforward!

mergerFS & Samba

As we stated earlier, we selected mergerFS for the NAS component of our Teknophiles Ultimate Home IT Lab storage solution. Since this is a NAS that is expected to serve files to users in a Windows Domain, we also run Samba to facilitate the Windows file sharing. Apparently, there are rumblings regarding issues with mergerFS and Samba; however, according to the author of mergerFS, this is likely due to improper Samba configuration.

Here at Teknophiles, we can unequivocally say that in server configurations based on our, “Linux File Servers in a Windows Domain,” article, Samba is perfectly stable with mergerFS. In fact, in one mergerFS pool, we’re serving up over 20TB of data spread over multiple mdadm RAID arrays. The server in question is currently sitting at 400 days of uptime, without so much as a hiccup from Samba or mergerFS.

Installing mergerFS

OK, so that’s enough background, now let’s get to the fun part. To install mergerFS, first download the latest release for your platform. We’re installing this on an Ubuntu 14.04 LTS server, so we’ll download the Trusty 64-bit .deb file. Once downloaded, install via the Debian package manager.
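For example, assuming the .deb has already been downloaded to the current directory (the exact filename varies by release, so treat the one below as a placeholder):

sudo dpkg -i mergerfs_<version>.trusty_amd64.deb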

Creating mergerFS volumes

Now we’re going to add a couple of volumes to a mergerFS pool. You can see here that we have a virtual machine with two 5GB virtual disks, /dev/sdb and /dev/sdc. You can also see that each disk has an ext4 partition.

Next, create the mount points for these two disks and mount the disks in their respective directories.
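Something along these lines should do it, borrowing the mount point names from the caption above and assuming the ext4 partitions are /dev/sdb1 and /dev/sdc1:

mkdir -p /mnt/mergerFS01 /mnt/mergerFS02
mount /dev/sdb1 /mnt/mergerFS01
mount /dev/sdc1 /mnt/mergerFS02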

Now we need to create a mount point for the union filesystem, which we’ll call ‘virt’ for our virtual directory.
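Creating that mount point is a one-liner:

mkdir -p /mnt/virt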

And finally, we can mount the union filesystem. The command follows the syntax below, where <srcmounts> is a colon delimited list of directories you wish to merge.

mergerfs -o<options> <srcmounts> <mountpoint>

Additionally, you can also use globbing for the source paths, but you must escape the wildcard character.
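Putting that together for our two disks, a command along the following lines should work; the options are the ones discussed below, and the globbed form is equivalent so long as nothing else under /mnt matches the pattern:

mergerfs -o defaults,allow_other,use_ino,fsname=mergerFS /mnt/mergerFS01:/mnt/mergerFS02 /mnt/virt

mergerfs -o defaults,allow_other,use_ino,fsname=mergerFS /mnt/mergerFS\* /mnt/virt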

There are numerous additional options that are available for mergerFS, but the above command will work well in most scenarios. From the mergerFS man page, here’s what the above options do:

defaults: a shortcut for FUSE’s atomic_o_trunc, auto_cache, big_writes, default_permissions, splice_move, splice_read, and splice_write. These options seem to provide the best performance.

allow_other: a libfuse option which allows users besides the one which ran mergerfs to see the filesystem. This is required for most use-cases.

use_ino: causes mergerfs to supply file/directory inodes rather than libfuse. While not a default it is generally recommended it be enabled so that hard linked files share the same inode value.

fsname=name: sets the name of the filesystem as seen in mount, df, etc. Defaults to a list of the source paths concatenated together with the longest common prefix removed.

You can now see that we have a new volume called, “mergerFS,” which is the aggregate 10GB, mounted on /mnt/virt. This new mount point can be written to, used in local applications, or served up via Samba just as any other mount point.

Other Options

Although getting a little into the weeds, it’s worth touching on an additional option in mergerFS that is both interesting and quite useful. The FUSE function policies determine how a number of different commands behave when acting upon the data in the mergerFS disk pool.

func.<func>=<policy>: sets the specific FUSE function’s policy. See below for the list of value types. Example: func.getattr=newest

Below we can see the FUSE functions and their category classifications, as well as the default policies.

Category   FUSE Functions
action     chmod, chown, link, removexattr, rename, rmdir, setxattr, truncate, unlink, utimens
create     create, mkdir, mknod, symlink
search     access, getattr, getxattr, ioctl, listxattr, open, readlink
N/A        fallocate, fgetattr, fsync, ftruncate, ioctl, read, readdir, release, statfs, write

Category   Default Policy
action     all
create     epmfs
search     ff

To illustrate this a bit better, let’s look at an example. First, let’s consider file or directory creation. File and directory creation fall under the “create” category. Looking at the default policy for the create category we see that it is called, “epmfs.” From the man pages, the epmfs policy is defined as follows:

epmfs (existing path, most free space)
Of all the drives on which the relative path exists choose the drive with the most free space. For create category functions it will exclude readonly drives and those with free space less than min-freespace. Falls back to mfs.

Breaking this down further, we can see that epmfs is a “path-preserving policy,” meaning that only drives that have the existing path will be considered. This gives you a bit of control over where certain files are placed. Consider, for instance if you have four drives in your mergerFS pool, but only two of the drives contain a directory called /pictures. When using the epmfs policy, only the two drives with the pictures directory will be considered when you copy new images to the mergerFS pool.

Additionally, the epmfs policy will also serve to fill the drive with the most free space first. Once drives reach equal or near-equal capacities, epmfs will effectively round-robin the drives, as long as they also meet the existing path requirement.

There are a number of equally interesting policies, including ones that do not preserve paths (create directories as needed), fill the drive with the least free space (i.e. fill drive1, then drive2, etc.), or simply use the first drive found. Though the defaults will generally suffice, it’s a good idea to become familiar with these policies to ensure that your mergerFS configuration best suits your needs.



Adding Volumes

Similar to creating a mergerFS pool, adding disks is also quite simple. Let’s say, for instance, you want to also use the space on your existing operating system disk for the mergerFS pool. We can simply create another directory in the root filesystem to use in the union. We created ours in /mnt for demonstration purposes, but your home directory might equally suit.

Next, we unmount our existing mergerFS pool and remount it including the new directory.
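A sketch of the remount, assuming the new directory is /mnt/mergerFS00 (as referenced below) and the same options as before:

mkdir -p /mnt/mergerFS00
umount /mnt/virt
mergerfs -o defaults,allow_other,use_ino,fsname=mergerFS /mnt/mergerFS00:/mnt/mergerFS01:/mnt/mergerFS02 /mnt/virt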

Notice now our total mergerFS volume space is 19GB – the original 10GB from our two 5GB disks, plus the 9GB from /dev/sda2. And since we now have truly disparate volumes, let’s test our default epmfs policy. Start by creating three files in our mergerFS pool mount:
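For example, using the file names referenced later in this section:

touch /mnt/virt/file1 /mnt/virt/file2 /mnt/virt/file3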

Based on our expectations of how epmfs works, we should see these files in the /mnt/mergerFS00 folder, since /dev/sda2 has the most free space.

Sure enough, this appears to work as we anticipated. Now let’s create a folder on one disk only.
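For example, creating the directory directly on one of the underlying disks:

mkdir /mnt/mergerFS01/pics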

Replicating our previous experiment, we’ll create a few more files, but this time in the pics directory in our mergerFS pool.
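Something along these lines, with arbitrary example file names:

touch /mnt/virt/pics/file4 /mnt/virt/pics/file5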

Since the epmfs policy should preserve the path pics/, and that path only exists on /mnt/mergerFS01, this is where we expect to see those files.

Removing Volumes

Removing a volume from the mergerFS pool follows the same procedure as adding a drive. Simply remount the pool without the path you wish to remove.
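Continuing our example, dropping /mnt/mergerFS00 back out of the pool would look like this:

umount /mnt/virt
mergerfs -o defaults,allow_other,use_ino,fsname=mergerFS /mnt/mergerFS01:/mnt/mergerFS02 /mnt/virt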

Notice now file1, file2, and file3 are no longer present, since they were located on /dev/sda2, which has been removed. Additionally, our total space is now back to its previous size.

mergerFS & fstab

Typically, you’re going to want your mergerFS pool to be persistent across reboots. To do this we can simply leverage fstab as we would for any mount point. Using our example above, the fstab entry takes the following format:
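A sketch of the corresponding entry, using the fuse.mergerfs filesystem type with the same source paths and options as above:

/mnt/mergerFS01:/mnt/mergerFS02  /mnt/virt  fuse.mergerfs  defaults,allow_other,use_ino,fsname=mergerFS  0  0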

Performance Considerations

We would be remiss to end this article without discussing the possible performance implications of mergerFS. As with any disk pooling utility, a possible weakness of this type of configuration is a lack of striping across drives. In RAID or LVM configurations, striping may be used to take advantage of simultaneous I/O and throughput of the available spindles. RAID level, array size, and LVM configuration play dramatically into exactly what this looks like, but even a 6+2 RAID6 array with commodity drives can achieve read speeds that will saturate a 10Gbps network. Use LVM to stripe across multiple arrays, and you can achieve stunning performance. If you’re only using single disks in your mergerFS pool, however, you’ll always be limited to the performance of a single drive. And maybe that’s OK, depending on your storage goals. Of course, careful planning of the disk subsystem and using disk arrays in your mergerFS pool can give you the best of both worlds – excellent performance and the flexibility and scalability of disk pooling.

Lastly, it’s worth noting that mergerFS is yet another layer on top of your filesystems, and FUSE filesystems by their very nature add some overhead. This overhead is generally negligible, however, especially in low to moderate I/O environments. You might not want to put your busy MySQL database in userspace, but you’ll likely not notice the difference in storing your family picture albums there.