Data backup is an essential component of any VMware data center. Understand the main approaches to backing up VMware virtual machines (backing up VMs as physical machines, backing up VMDK files, or using third-party virtualized backup tools) along with their pros and cons, and learn best practices to ensure backups go smoothly.
Three Approaches for Backing Up VMware Infrastructure
Backup is a critical part of any data center. When operating virtualized resources managed by VMware, you need a solid strategy to back up and restore virtual machines (VMs). We’ll cover three methods: backing up VMs as physical machines, backing up VM files, and using a dedicated virtualization backup solution.
1. Backing Up Virtual Machines as Physical Machines
From the user’s perspective, a virtual machine works exactly like a physical machine—it has a guest operating system that works in isolation from other VMs. This means that regular backup procedures used for non-virtualized workloads can work the same way for VMware VMs. Administrators can install a backup agent and schedule backups just like they would for a regular machine.
Advantages: Simple, no learning curve, allows administrators to exclude non-essential applications or data from backups to reduce backup size.
Disadvantages: Protects the operating system and applications but not the VM, so there is no way to directly restore the full virtualized environment.
2. File-Based Backup for VMware Virtual Machines
VMware stores each VM as a set of files, with the virtual disk typically held in a VMDK file. You can back up these files to protect entire VMs in one easy step. Unlike operating system backups that can take a long time and consume significant system resources, copying a VMDK file is a quick and simple operation.
Advantages: Quick, easy, does not affect the guest operating system or its applications. Backup is performed and resourced by the ESXi server.
Disadvantages: Captures a complete snapshot of a VM, with no ability to remove specific applications, or directly restore a specific application or file. You can only restore the full VMDK file, which may be large and contain irrelevant applications or data. Recovery can take longer with file-based backup and restore, which makes a tight Recovery Time Objective (RTO) harder to meet.
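The file-copy approach can be sketched as follows. This is a minimal illustration, not a production tool: it assumes the VM is powered off (so the disk files are not locked) and that the datastore is reachable as an ordinary filesystem path. The function names and directory layout are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in chunks so large disk images never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_vm_files(vm_dir: Path, backup_dir: Path) -> dict[str, str]:
    """Copy every *.vmdk file in vm_dir to backup_dir and verify each copy.

    Assumption: the VM is powered off, so the files are not locked or changing.
    Returns a mapping of file name to SHA-256 checksum.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    checksums = {}
    for source in sorted(vm_dir.glob("*.vmdk")):
        target = backup_dir / source.name
        shutil.copy2(source, target)  # copies data and file metadata
        source_hash = sha256_of(source)
        if sha256_of(target) != source_hash:
            raise IOError(f"checksum mismatch for {source.name}")
        checksums[source.name] = source_hash
    return checksums
```

A full restore would typically also need the VM's configuration file (`.vmx`), which this sketch deliberately leaves out to stay focused on the disk files.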
3. Dedicated VMware Backup and Restore Solution
In the past, VMware provided a tool called vSphere Data Protection (VDP), but End of Availability (EOA) was announced in 2017. Many users have migrated to third-party backup solutions that support virtualized environments. These solutions provide:
- Application-aware backup for Microsoft technologies like Exchange, Active Directory and Microsoft SQL Server
- Backup and recovery of guest operating systems, entire VMs or entire ESXi hosts
- Instant restore of VMs from backup
- Incremental backups using VMware Changed Block Tracking and deduplication to reduce storage space
A few examples of third-party tools are:
- Veeam Backup and Replication
- Acronis Cyber Backup
- Synology Active Backup
- ThinWare vBackup
Read more about these solutions in our article on VMware Data Protection.
VMware vSphere Backup Best Practices
Here are five best practices that can help you manage VMware backups while minimizing risk to protected data and interruption to user activities.
1. Prefer File-Based (VMDK) Backup to Guest Operating System Backup
While we explained the pros and cons of both methods, many experts agree that the preferred option to directly back up VMs is the file-based method. This method is operationally simple, and does not impose any performance penalty on the target machine. This can be especially important for high-throughput workloads like databases, email servers or web applications.
2. Application-Consistent Backups and VSS
When backing up a mission-critical VM containing applications like OLTP databases, ensure you create an application-consistent backup. This means you should pause applications (this is called “quiescing”) and take other measures to ensure you don’t lose transactions during the backup process. For Windows machines, use the Microsoft Volume Shadow Copy Service (VSS), which VMware Tools integrates with, to quiesce applications.
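As a rough sketch of how quiescing can be requested programmatically, the helper below asks vSphere for a snapshot with the quiesce flag set. It assumes `vm` is a pyVmomi `vim.VirtualMachine` object obtained from an existing vCenter or ESXi session (connection code is omitted), and the helper name `request_quiesced_snapshot` is hypothetical. `CreateSnapshot_Task` and its `name`, `description`, `memory` and `quiesce` parameters come from the vSphere API.

```python
def request_quiesced_snapshot(vm, name: str, description: str = ""):
    """Ask vSphere for a snapshot with guest file system quiescing enabled.

    Assumption: `vm` is a pyVmomi vim.VirtualMachine (or any object exposing
    the same CreateSnapshot_Task method). Returns the task object so the
    caller can wait for it to complete.
    """
    return vm.CreateSnapshot_Task(
        name=name,
        description=description,
        memory=False,  # skip the memory dump; only disk state is requested here
        quiesce=True,  # relies on VMware Tools running inside the guest
    )
```

Quiescing only works when VMware Tools is installed and running in the guest, so a backup script would usually check the returned task for errors before reading the disk.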
3. Provide Ample Bandwidth and Resources for Backup
Provision adequate resources at the backup server and network level so that you can meet your Recovery Point Objective (RPO). For example, if your RPO is 1 hour, you need to perform frequent incremental backups throughout the day. The following resources are critical:
- Network bandwidth from backup server to backup targets—take into account the data volumes that need to be transferred throughout the day
- Hardware resources on the backup server—if you maintain your own backup server, over-provision it to ensure backups never slow down due to inadequate system resources
- Hardware resources on backup targets—when provisioning machines for your application workloads, take into account the extra overhead required by your backup process. Ensure machines have enough power to handle their regular workloads and also manage their part of the backup workflow.
Cloud-based backup systems take care of the second item (backup server hardware), but you still need to consider the first and third, to ensure machines are able to transfer data to the cloud quickly enough.
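To size the network for a given RPO, a back-of-the-envelope calculation is often enough. The sketch below estimates the link speed needed to move one incremental backup within a transfer window. The function name, the decimal unit conversion and the default 70% link efficiency are illustrative assumptions, not measured values.

```python
def required_bandwidth_mbps(changed_gb_per_interval: float,
                            transfer_window_minutes: float,
                            efficiency: float = 0.7) -> float:
    """Estimate the link speed (megabits per second) needed to transfer
    one incremental backup within the given window.

    efficiency is the assumed fraction of nominal bandwidth that is usable
    after protocol overhead and competing traffic.
    """
    if transfer_window_minutes <= 0:
        raise ValueError("transfer_window_minutes must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    megabits = changed_gb_per_interval * 8 * 1000  # GB -> megabits, decimal units
    seconds = transfer_window_minutes * 60
    return megabits / seconds / efficiency
```

For example, moving 45 GB of changed data within a 60-minute window at full link efficiency (`efficiency=1.0`) works out to 100 Mbit/s. With the default 70% efficiency the same workload needs roughly 143 Mbit/s.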
4. Do Not Use VM Snapshots as Primary Backup
VM snapshots are convenient when you need to save short-term copies of a VM. But they incur a serious performance penalty, so you should not use snapshots as your primary backup mechanism. VM snapshots require a lot of disk space and can become bigger than the original disk you are backing up. Finally, merging snapshots back into the VM disk is a slow operation that can also negatively affect performance on the machine.
5. Consider Using vStorage APIs
vSphere offers a vStorage API that gives programmatic access to a VMDK file for backup and restore purposes. It also exposes advanced features that are not currently available in the VMware management interface, such as Changed Block Tracking incremental backups, and deduplication.
The third-party VMware backup tools we listed all use the vStorage API to do their work. If you have a large VMware deployment and are willing to invest in development, consider building your own automated backup mechanism.
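To illustrate what a block-level incremental backup does, here is a small self-contained sketch. It does not call the vStorage API: Changed Block Tracking is maintained by the hypervisor, whereas this example substitutes a simple per-block hash comparison to find changed blocks. All names and the tiny block size are hypothetical and chosen for readability.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size for illustration; real disks use far larger blocks


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 digest per fixed-size block of the image."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def incremental_backup(data: bytes, previous_hashes: list[str],
                       block_size: int = BLOCK_SIZE) -> tuple[dict[int, bytes], list[str]]:
    """Return (changed blocks keyed by block index, hashes of the current image).

    A block counts as changed if it is new or its hash differs from last time.
    """
    current = block_hashes(data, block_size)
    changed = {}
    for index, digest in enumerate(current):
        if index >= len(previous_hashes) or previous_hashes[index] != digest:
            start = index * block_size
            changed[index] = data[start:start + block_size]
    return changed, current


def apply_incremental(base: bytes, changed: dict[int, bytes],
                      block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the newer image from a base image plus its changed blocks.

    Assumption: the image only grows or stays the same size between backups.
    """
    blocks = [base[i:i + block_size] for i in range(0, len(base), block_size)]
    for index in sorted(changed):
        if index < len(blocks):
            blocks[index] = changed[index]
        else:
            blocks.append(changed[index])
    return b"".join(blocks)
```

Only the changed blocks need to be stored or sent over the network, which is where the storage and bandwidth savings of incremental backup come from. A real implementation would ask the hypervisor which disk areas changed rather than rehashing the whole disk.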
VMware backups can take up huge amounts of storage space, and setting up on-premises storage infrastructure can be daunting.
Cloudian HyperStore is an on-premises enterprise storage solution that is certified for use in VMware environments, and enables easy scalability from hundreds of terabytes to exabytes to support any scale of backup data. It is fully compatible with the S3 API. HyperStore is used in demanding operator-scale deployments using VMware vCloud Director.
Learn more about Cloudian’s solutions for VMware storage.