Disaster recovery is an essential element of any virtualized datacenter. VMware provides Site Recovery Manager (SRM) as part of its vSphere suite. SRM lets you automate backups and orchestrate the recovery of entire VMware environments at a disaster recovery site. It can be used both on-premises and as a Disaster Recovery as a Service (DRaaS) offering on AWS infrastructure.
In this article you will learn about:
- VMware Site Recovery Manager
- How Site Recovery Manager works
- SRM best practices
- VMware site recovery on AWS
This article is part of a series on VMware Storage.
VMware Disaster Recovery with Site Recovery Manager
VMware provides a dedicated solution for disaster recovery, as part of the vSphere virtualization suite, called VMware Site Recovery Manager (SRM). SRM is a VMware backup automation tool that works with replication technology and supports policy-based management of recovery plans. SRM orchestrates those recovery plans to minimize downtime in case of a disaster, and also lets you run non-disruptive tests of your disaster recovery plans.
VMware Site Recovery Manager is offered either on-premises or on the AWS cloud, in a Disaster Recovery as a Service (DRaaS) service model.
SRM leverages VMware vSphere Replication to provide hypervisor-based virtual machine replication. It protects VMs from partial or complete site failures by copying the virtual machines from a primary site to a secondary site, or from multiple source sites to a single disaster recovery site. vSphere Replication is configured on a per-VM basis, allowing you to control which VMs are replicated. After the initial full replication, vSphere Replication sends only the changes made since the previous cycle, which reduces network bandwidth usage.
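The incremental step described above can be illustrated with a small sketch. This is not how vSphere Replication is implemented internally; it is a minimal Python model, assuming fixed-size blocks and hash comparison, and the names `split_blocks`, `changed_blocks` and `apply_delta` are hypothetical. It only shows why sending changed blocks uses less bandwidth than re-copying the whole disk.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Split a disk image into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Return {block_index: block_data} for every block of `new` that differs from `old`."""
    old_hashes = [hashlib.sha256(b).digest() for b in split_blocks(old, block_size)]
    delta = {}
    for index, block in enumerate(split_blocks(new, block_size)):
        is_new_block = index >= len(old_hashes)
        if is_new_block or hashlib.sha256(block).digest() != old_hashes[index]:
            delta[index] = block
    return delta


def apply_delta(replica: bytes, delta: dict[int, bytes], new_length: int,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Apply the changed blocks to the replica and trim it to the source length."""
    blocks = split_blocks(replica, block_size)
    for index, block in sorted(delta.items()):
        if index < len(blocks):
            blocks[index] = block
        else:
            blocks.append(block)
    return b"".join(blocks)[:new_length]


# Hypothetical disk contents before and after a write on the primary site.
primary_v1 = b"AAAABBBBCCCCDDDD"
primary_v2 = b"AAAAXXXXCCCCDDDD"

delta = changed_blocks(primary_v1, primary_v2)
replica = apply_delta(primary_v1, delta, len(primary_v2))
print(f"blocks sent: {len(delta)} of {len(split_blocks(primary_v2))}")
```

In this toy run only one of the four blocks crosses the network, and the replica ends up identical to the updated primary.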
VMware Site Recovery Manager tightly integrates with other VMware technologies and architectures:
- vSphere Replication and replication solutions from VMware storage partners, like VMware’s close partner Cloudian (read more about Cloudian’s low-cost S3-compatible VMware storage solution)
- VMware Software-Defined Data Center (SDDC) architecture
- VMware NSX network virtualization
- VMware vSAN hyperconverged infrastructure
Site Recovery Manager can support the following use cases:
- Disaster recovery
- Disaster avoidance
- Data center migrations
- Site-level load balancing
- Maintenance testing for enterprise applications
How Does VMware Site Recovery Manager Work?
Site Recovery Manager is based on an application server that runs on a Windows server, with its own database and a plug-in that connects it to the vSphere Client. Both the protected site and the recovery site must have an SRM server and a vCenter Server instance. These servers can be deployed on physical servers or virtual machines. vCenter Server provides central visibility and control of the SRM servers.
SRM has the following additional deployment requirements:
- A trusted high-speed network connection between production sites and the disaster recovery site, ideally a dedicated link, to avoid relying on VPN connections across the Internet.
- Array-based replication between the recovery site and the protected site, using a replication adapter supported by SRM (see a full list).
- The recovery site must have access to the same public and private IP networks as the protected site.
- The recovery site must have sufficient hardware, storage and network resources to run VM workloads of the protected site.
- You must ensure databases and other systems in your protected site are supported by Site Recovery Manager—see the full compatibility matrix.
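The capacity requirement in the list above lends itself to a quick pre-check. The following is a minimal sketch, not an SRM feature or API: the `Resources` type, the `total_demand` and `shortfalls` helpers, and all the numbers are hypothetical, and a real sizing exercise would also account for overhead, failover headroom and network capacity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """CPU, memory and storage, used for both VM demand and site capacity."""
    cpu_ghz: float
    memory_gb: float
    storage_tb: float


def total_demand(vms: list[Resources]) -> Resources:
    """Sum the resources required by all protected VMs."""
    return Resources(
        cpu_ghz=sum(vm.cpu_ghz for vm in vms),
        memory_gb=sum(vm.memory_gb for vm in vms),
        storage_tb=sum(vm.storage_tb for vm in vms),
    )


def shortfalls(demand: Resources, capacity: Resources) -> dict[str, float]:
    """Return only the resources where the recovery site falls short, with the gap."""
    gaps = {}
    for field in ("cpu_ghz", "memory_gb", "storage_tb"):
        missing = getattr(demand, field) - getattr(capacity, field)
        if missing > 0:
            gaps[field] = missing
    return gaps


# Hypothetical inventory: two protected VMs and one recovery site.
protected_vms = [
    Resources(cpu_ghz=8, memory_gb=32, storage_tb=1),
    Resources(cpu_ghz=4, memory_gb=16, storage_tb=2),
]
recovery_site = Resources(cpu_ghz=16, memory_gb=40, storage_tb=4)

print(shortfalls(total_demand(protected_vms), recovery_site))
```

With these sample figures the recovery site has enough CPU and storage but is 8 GB short on memory, which is the kind of gap worth finding before a failover rather than during one.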
VMware Site Recovery Manager Best Practices
VMware recommends the following best practices when operating VMware Site Recovery Manager:
- The SRM database should be colocated with the SRM server if possible, or as close as possible, to decrease round-trip time of data transfers.
- Prefer fewer but larger NFS volumes, so that less time is needed to mount them. This can also reduce recovery time by letting you define fewer protection groups.
- Prefer more hosts, enabling higher concurrency when recovering VMs, resulting in faster recovery times.
- Before protecting VMs, ensure recovery site hosts are not in standby mode, so they are ready to create placeholder VMs.
- Prefer to express VM dependencies through priority groups, instead of configuring dependencies individually per VM.
- Install VMware Tools in protected VMs. This will allow SRM to check heartbeats and network performance.
- Make sure no custom script or UI dialog box blocks the recovery process.
- Swap files should be stored in a non-replicated datastore, to avoid the need to replicate between two sites and make remote calls to vCenter Server.
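Two of the practices above, ordering VMs by priority group and adding hosts for higher concurrency, can be illustrated with a rough model. This is a sketch under stated assumptions, not SRM's actual scheduler: the function names, the fixed per-VM recovery time, and the idea that each host recovers a fixed number of VMs at once are all hypothetical simplifications.

```python
from collections import defaultdict
from math import ceil


def build_recovery_waves(vm_priorities: dict[str, int]) -> list[list[str]]:
    """Group VMs by priority; lower numbers are recovered first."""
    groups = defaultdict(list)
    for name, priority in vm_priorities.items():
        groups[priority].append(name)
    return [sorted(groups[priority]) for priority in sorted(groups)]


def estimate_recovery_minutes(waves: list[list[str]], hosts: int,
                              per_host_concurrency: int, minutes_per_vm: float) -> float:
    """Estimate total time when each wave must finish before the next one starts."""
    slots = hosts * per_host_concurrency  # VMs that can be recovered at the same time
    return sum(ceil(len(wave) / slots) * minutes_per_vm for wave in waves)


# Hypothetical three-tier application: database first, then app servers, then web servers.
vm_priorities = {
    "db01": 1,
    "app01": 2, "app02": 2,
    "web01": 3, "web02": 3, "web03": 3,
}

waves = build_recovery_waves(vm_priorities)
for hosts in (1, 3):
    minutes = estimate_recovery_minutes(waves, hosts, per_host_concurrency=1, minutes_per_vm=5)
    print(f"{hosts} host(s): about {minutes} minutes")
```

In this model, going from one host to three halves the estimate (30 to 15 minutes), because VMs within a priority group can be recovered in parallel while the groups themselves still run in sequence.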
VMware Site Recovery on AWS
VMware provides VMware Cloud on AWS, a hybrid service that enables cloud migration, hybrid data centers and disaster recovery. As part of the service, VMware provides Site Recovery Manager (SRM) as a managed service, in a Disaster Recovery as a Service (DRaaS) model.
VMware Site Recovery enables replication, orchestration and backup automation on AWS to protect cloud applications from failure. It provides an end-to-end disaster recovery solution with two key capabilities:
- Multi-site topologies: use one VMware Cloud on AWS Software-Defined Data Center (SDDC) for multiple on-premises sites, reducing costs by consolidating enterprise resources.
- One-click deployment and testing of Site Recovery Manager: check connectivity to VMware Cloud on AWS from your on-premises data center with one click, and instantly connect local workloads to backup policies.
Read also our guide to VMware Data Protection.
VMware Storage Made Simple with Cloudian
One of the advantages of VMware Site Recovery Manager is that it lets you set up storage using a wide variety of connectors, which support many storage vendors. One of those vendors is Cloudian, which provides a solution for on-premises storage. Cloudian's solution is cost-effective and elastically scalable, like popular cloud storage services.
Cloudian HyperStore is an on-premises enterprise storage solution that is certified for use in VMware environments, and scales from hundreds of terabytes to exabytes. HyperStore supports traditional storage protocols like SAN and NAS, but at its core it is based on a Software-Defined Storage paradigm, managing dynamic pools of object storage.
HyperStore is used in demanding operator-scale deployments using VMware vCloud Director.
Learn more about Cloudian’s solutions for VMware storage.