Disaster Recovery vs. High Availability


Disaster Recovery vs. High Availability: What Is the Difference and How Can They Work Together?

What Is Disaster Recovery?

Disaster recovery (DR) is an organization’s ability to respond to and recover from events that negatively affect business operations.

A disaster recovery plan is a documented policy or process designed to standardize an organization’s response to disasters, and enable faster and more effective recovery. It details all actions the relevant roles must take before, during, and after a disaster.

A disaster recovery plan must address:

Man-made disasters—including cyber attacks, terrorism, and human error.
Natural disasters—including earthquakes, landslides, lightning, volcanic eruptions, wildfires, tornadoes, floods, hurricanes, and extreme weather conditions.

What Is High Availability?

High availability (HA) is a system’s ability to operate continuously without experiencing failure for a predefined period.

High availability processes help ensure the system meets a predefined operational performance level. A common target is five-nines availability, meaning the system or product is operational 99.999% of the time.

High availability systems strive to keep critical systems operational even during disasters.

High availability is critical for business continuity, and can even save lives. For example, autonomous vehicles, military control systems, healthcare, and industrial systems are critical and must remain available and operational at all times.


In this article:

How Disaster Recovery Works
How Does High Availability Work?
Disaster Recovery vs High Availability
High Availability and Disaster Recovery Reinforce One Another
Built-in Data Protection for Disaster Recovery with Cloudian

This article is part of a series on Disaster Recovery.

How Disaster Recovery Works

Today, cloud and virtualization technologies make disaster recovery practical for businesses of all sizes. Advances in these fields make it possible to package complex systems, save them to a remote site, and restore them at the click of a button.

Public cloud platforms provide a scalable, resilient infrastructure that can serve as a company’s remote DR site. Some companies still maintain their own DR infrastructure, but they also use virtualization and containerization to make backup and restore easier and more effective.

Organizations use a DR site, whether internal, external, or cloud-based, to back up data, technical infrastructure, and business applications. When the primary data center is unavailable, they can transition operations to systems running in the DR site. Once the primary systems are back online, they can be restored from the backups.

Disaster recovery solutions take many forms. DR providers can offer backup and recovery software, infrastructure hosting services, DR management services, or end-to-end disaster recovery as a service (DRaaS) solutions.

Disaster recovery is not just an IT concern. It also has organizational components, such as security, risk management, and compliance. For this reason, some vendors combine disaster recovery with other aspects of security planning, such as incident response and contingency planning.


How Does High Availability Work?

No system can be 100% available, but a high-availability system strives for an operational performance standard of 99.999%. Here are some main principles to consider when designing an HA system:

Avoid a single point of failure: the system must not depend on any one component whose failure would bring down the entire system.
Ensure reliable crossover: the system must have built-in redundancy so that backup components can take over from failed ones reliably.
Detect failures: failures must be visible. The system should automatically identify and avoid issues that may result in failures.
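The crossover and failure-detection principles above can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `Node` class, the `pick_active` function, and the five-second heartbeat timeout are hypothetical names and values chosen for the example, and a real system would also need to guard against split-brain and flapping.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    last_heartbeat: float  # time of the last heartbeat, in seconds on the monitor's clock


def is_healthy(node: Node, now: float, timeout: float = 5.0) -> bool:
    """Treat a node as failed once its last heartbeat is older than `timeout` (assumed value)."""
    return now - node.last_heartbeat <= timeout


def pick_active(nodes: list[Node], now: float, timeout: float = 5.0) -> Node:
    """Return the first healthy node in priority order (primary first, then standbys)."""
    for node in nodes:
        if is_healthy(node, now, timeout):
            return node
    raise RuntimeError("no healthy node available")


# The primary last reported 12 seconds ago, the standby 1 second ago.
nodes = [Node("primary", last_heartbeat=88.0), Node("standby", last_heartbeat=99.0)]
print(pick_active(nodes, now=100.0).name)  # standby
```

Because the standby is listed after the primary, traffic returns to the primary as soon as its heartbeats resume. Whether automatic failback is desirable depends on the system.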

Load balancing is important for ensuring high availability when multiple users access the system. The load balancer distributes workloads automatically, determining which system resource can best handle each workload. Using several load balancers helps prevent resources from being overwhelmed.
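As a rough illustration of how a load balancer might decide which resource can best handle a workload, the sketch below uses a least-connections policy and skips backends marked unhealthy. The policy, the `Backend` and `LeastConnectionsBalancer` classes, and the sample connection counts are assumptions made for this example. Real load balancers offer several policies, such as round robin or weighted routing.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    active_connections: int = 0
    healthy: bool = True


class LeastConnectionsBalancer:
    """Send each request to the healthy backend with the fewest active connections."""

    def __init__(self, backends: list[Backend]) -> None:
        self.backends = backends

    def route(self) -> Backend:
        candidates = [b for b in self.backends if b.healthy]
        if not candidates:
            raise RuntimeError("no healthy backend available")
        # min() returns the first backend in list order when counts are tied.
        chosen = min(candidates, key=lambda b: b.active_connections)
        chosen.active_connections += 1
        return chosen

    def release(self, backend: Backend) -> None:
        """Call when a request finishes so the backend's count drops again."""
        backend.active_connections -= 1


balancer = LeastConnectionsBalancer([
    Backend("a", active_connections=2),
    Backend("b", healthy=False),  # idle but failed, so never selected
    Backend("c", active_connections=1),
])
print([balancer.route().name for _ in range(3)])  # ['c', 'a', 'c']
```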

HA systems have tiered architectures with multiple servers in different clusters enabling failover. If one cluster or server fails, another can take over without impacting performance. Ensuring high availability is more challenging in complex systems with more potential points of failure.
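The effect of redundancy, and the cost of complexity, can be estimated with basic probability. The sketch below assumes that component failures are independent, which is a simplification: shared power, network, or software faults often violate it. The function names and the 99% figure are illustrative.

```python
from math import prod


def series_availability(components: list[float]) -> float:
    """Every component must be up, so the availabilities multiply."""
    return prod(components)


def parallel_availability(components: list[float]) -> float:
    """The group is up as long as at least one redundant component is up."""
    return 1 - prod(1 - a for a in components)


single = 0.99  # one server that is available 99% of the time (assumed figure)

# Two redundant servers: the pair is down only when both are down.
print(f"{parallel_availability([single, single]):.4f}")  # 0.9999

# Three components chained with no redundancy: any one failure is an outage.
print(f"{series_availability([single, single, single]):.4f}")  # 0.9703
```

Under these assumptions, adding a redundant server raises availability from two nines to four, while chaining three non-redundant components lowers it below that of any single component.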

Disaster Recovery vs. High Availability

Here are key similarities:

Business continuity: high availability and disaster recovery are both subsets of business continuity, the overall program that defines how the organization ensures business operations continue during and after disruptions.
Redundancy: successful high availability and disaster recovery programs both rely on redundancy to eliminate single points of failure.
Risk assessment: both programs start from a risk assessment that weighs the cost of protection against the likelihood of each risk. For example, the risk of an earthquake can be high in some locations and nearly non-existent in others. Comparing costs to risks enables organizations to build appropriate high availability and disaster recovery programs on a budget.
Predefined objectives and measures: both programs require predefined objectives and measures, although the objectives differ. High availability defines availability targets for critical systems, while disaster recovery sets recovery point and recovery time objectives.

Here are key differences:

System design vs. organizational policies: high availability strives to ensure systems are designed and built to prevent overall system failure. It involves automating recovery or failover procedures and eliminating single points of failure. Disaster recovery aims to standardize recovery by defining, implementing, and enforcing policies, procedures, and tools. Disaster recovery assumes the primary system has already failed and that system recovery might take some time.
Different objectives and measures: high availability systems focus on availability as the objective, typically expressed as a percentage of the time the system is expected to be available. For example, for a system that runs 24×7, 99.99% availability allows approximately 60 seconds of downtime each week, while 99.999% allows a little over six seconds per week. Disaster recovery systems use recovery times and recovery points to measure objectives. For example, recovering a system from a data center fire within an hour while losing only five minutes’ worth of transactions.
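The downtime that an availability target permits is simple arithmetic: multiply the length of the period by the fraction of time the system is allowed to be unavailable. The following sketch assumes a service that is expected to run 24×7, so a week contains 604,800 seconds. The function name is illustrative.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds for a 24x7 service


def allowed_downtime_seconds(availability_percent: float,
                             period_seconds: float = SECONDS_PER_WEEK) -> float:
    """Downtime budget for the period at the given availability target."""
    return period_seconds * (1 - availability_percent / 100)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {allowed_downtime_seconds(target):.1f} s per week")
# 99.9% -> 604.8 s per week
# 99.99% -> 60.5 s per week
# 99.999% -> 6.0 s per week
```

For a service with shorter operating hours, pass the number of scheduled seconds as `period_seconds`; the budget shrinks in proportion.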

High Availability and Disaster Recovery Reinforce One Another

Organizations need both high availability and disaster recovery to ensure business continuity. Each plays a critical role: high availability maintains day-to-day uptime, while disaster recovery ensures that systems and data can be recovered after a major disaster.

High availability helps protect against day-to-day events that might interrupt system availability, such as network failures, hardware failures, application failures, or load-induced outages. It ensures these failures have minimal or no impact. Disaster recovery processes kick in when a major outage occurs due to man-made or natural disasters.

Creating backups of business-critical systems and storing offsite data recovery copies ensures organizations have resilient backups available for restoring data during data loss events. Replication helps protect the organization during a site-wide disaster that takes an entire site offline. Replicating virtual machines (VMs) to a data recovery facility enables organizations to reroute resources to the data recovery site when the main production site goes down.
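How often backups or replicas are taken determines how much data can be lost in a disaster, and that loss can be checked against recovery point and recovery time objectives. The sketch below is a simplified model under stated assumptions: the function names, the five-minute replication interval, and the sample timestamps are hypothetical, and it assumes every copy listed is complete and restorable.

```python
from datetime import datetime, timedelta


def data_loss_window(copy_times: list[datetime], disaster_time: datetime) -> timedelta:
    """Time between the most recent copy taken before the disaster and the disaster itself."""
    usable = [t for t in copy_times if t <= disaster_time]
    if not usable:
        raise ValueError("no backup or replica predates the disaster")
    return disaster_time - max(usable)


def meets_objectives(loss: timedelta, recovery_duration: timedelta,
                     rpo: timedelta, rto: timedelta) -> bool:
    """True when data loss is within the RPO and the recovery time is within the RTO."""
    return loss <= rpo and recovery_duration <= rto


# Replicas taken every five minutes; the site goes down at 09:13.
copies = [datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 9, 5), datetime(2024, 1, 1, 9, 10)]
loss = data_loss_window(copies, disaster_time=datetime(2024, 1, 1, 9, 13))
print(loss)  # 0:03:00

ok = meets_objectives(loss, recovery_duration=timedelta(minutes=45),
                      rpo=timedelta(minutes=5), rto=timedelta(hours=1))
print(ok)  # True
```

In this model, the worst-case data loss equals the interval between copies, so a five-minute recovery point objective requires copies at least every five minutes.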

Built-In Data Protection for Disaster Recovery with Cloudian

Do you need to back up data to on-premises storage as part of your disaster recovery setup? Cloudian offers a low-cost disk-based storage technology that lets you back up data locally with a capacity of up to 1.5 petabytes. You can also set up a Cloudian appliance in a remote site and use our integrated data management tools to save data there.

Another deployment option is a hybrid cloud configuration. You can back up data to a local Cloudian appliance, then replicate it to the cloud for DR purposes. This combines the low latency of local storage with the resilience of the cloud.

Learn more about Cloudian’s data protection solution.