Data Backup in Depth: Concepts, Techniques, and Storage Technologies

In an increasingly digitized business landscape, data backup is vital for the survival of an organization. You can get hacked or ransomed, and lose your data to thieves who’ll sell your trade secrets to the highest bidder. Injected malware can corrupt your hard-earned information. Disgruntled employees or other insider threats can delete your valuable digital assets. Can you recover from data loss?

Data backup is a practice that combines techniques and solutions for efficient and cost-effective backup. Your data is copied to one or more locations, at pre-determined frequencies, and at different capacities. You can set up a flexible data backup operation using your own architecture, or use available Backup as a Service (BaaS) solutions and combine them with local storage. Today, plenty of corporate storage total cost of ownership (TCO) tools can help you calculate costs while you plan how to avoid data loss.

In this article:

• What Is Data Backup?
• The Importance of a Disaster Recovery Plan: Alarming Statistics
• 6 Data Backup Options
• Backup Storage Technology

What Is a Data Backup?

Data backup is the practice of copying data from a primary to a secondary location, to protect it in case of a disaster, accident or malicious action. Data is the lifeblood of modern organizations, and losing data can cause massive damage and disrupt business operations. This is why backing up your data is critical for all businesses, large and small.

What does backup data mean?

Typically backup data means all necessary data for the workloads your server is running. This can include documents, media files, configuration files, machine images, operating systems, and registry files. Essentially, any data that you want to preserve can be stored as backup data.

Data backup includes several important concepts:

  • Backup solutions and tools—while it is possible to back up data manually, to ensure systems are backed up regularly and consistently, most organizations use a technology solution to back up their data.
  • Backup administrator—every organization should designate an employee responsible for backups. That employee should ensure backup systems are set up correctly, test them periodically and ensure that critical data is actually backed up.
  • Backup scope and schedule—an organization must decide on a backup policy, specifying which files and systems are important enough to be backed up, and how frequently data should be backed up.
  • Recovery Point Objective (RPO)—RPO is the amount of data an organization is willing to lose if a disaster occurs, and is determined by the frequency of backup. If systems are backed up once per day, the RPO is 24 hours. The lower the RPO, the more data storage, compute and network resources are required to achieve frequent backups.
  • Recovery Time Objective (RTO)—RTO is the maximum acceptable time for an organization to restore data or systems from backup and resume normal operations. For large data volumes and/or backups stored off-premises, copying data and restoring systems can take time, and robust technical solutions are needed to ensure a low RTO.
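
To make RPO and RTO concrete, here is a minimal sketch of the arithmetic behind them. It assumes that the worst-case data loss window equals the backup interval, and that restore time is bounded below by data volume divided by sustained throughput. The function names and sample figures are illustrative assumptions, not part of any backup product.

```python
from datetime import timedelta


def worst_case_rpo(backup_interval: timedelta) -> timedelta:
    """Worst-case data loss window: a failure just before the next backup
    loses everything written since the previous one."""
    return backup_interval


def estimated_restore_time(data_gb: float, throughput_mbps: float) -> timedelta:
    """Lower bound on restore time: data volume divided by throughput.

    data_gb is in gigabytes (10**9 bytes); throughput_mbps is in megabits
    per second. Real restores also need setup, verification, and restart time.
    """
    bits = data_gb * 8 * 10**9
    seconds = bits / (throughput_mbps * 10**6)
    return timedelta(seconds=seconds)


def meets_rto(data_gb: float, throughput_mbps: float, rto: timedelta) -> bool:
    """True if the estimated restore time fits within the RTO target."""
    return estimated_restore_time(data_gb, throughput_mbps) <= rto


if __name__ == "__main__":
    # Hypothetical example: daily backups, 500 GB restored over a 1 Gbps link.
    print(worst_case_rpo(timedelta(hours=24)))
    print(estimated_restore_time(500, 1000))
    print(meets_rto(500, 1000, timedelta(hours=1)))
```

In this hypothetical case the transfer alone takes longer than a one-hour RTO, which suggests either a faster link or a backup copy kept closer to the systems being restored.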

The Importance of a Disaster Recovery Plan: Alarming Statistics

To understand the potential impact of disasters on businesses, and the importance of having a data backup strategy as part of a complete disaster recovery plan, consider the following statistics:

  • Cost of downtime—according to Gartner, the average cost of downtime to a business is $5,600 per minute.
  • Survival rate—another Gartner study found only 6% of companies affected by a disaster that did not have disaster recovery in place survived and continued to operate more than two years after the disaster.
  • Causes of data loss—the most common causes of data loss are hardware/system failure (31%), human error (29%), and viruses, malware, or ransomware (29%).
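
As a rough illustration of what the per-minute downtime figure implies, the sketch below multiplies it by the length of an outage. The constant reuses the average cited above, and the function name is an assumption for illustration. Your own cost per minute will differ.

```python
# Average downtime cost cited above, in US dollars per minute.
COST_PER_MINUTE_USD = 5_600


def downtime_cost(minutes: float, cost_per_minute: float = COST_PER_MINUTE_USD) -> float:
    """Estimated cost of an outage lasting the given number of minutes."""
    return minutes * cost_per_minute


if __name__ == "__main__":
    for hours in (1, 4, 24):
        print(f"{hours} h outage: ${downtime_cost(hours * 60):,.0f}")
```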

6 Data Backup Options

There are many ways to back up your files. Choosing the right option can help ensure that you are creating the best data backup plan for your needs. Below are six of the most common techniques or technologies:

 

  1. Removable media
  2. Redundancy
  3. External hard drive
  4. Hardware appliances
  5. Backup software
  6. Cloud backup services

 

  1. Removable Media

A simple option is to back up files on removable media such as CDs, DVDs, newer Blu-ray discs, or USB flash drives. This can be practical for smaller environments, but for larger data volumes, you’ll need to back up to multiple disks, which can complicate recovery. Also, you need to make sure you store your backups in a separate location, otherwise they may also be lost in a disaster. Tape backups also fall into this category.

  2. Redundancy

You can set up an additional hard drive that is a replica of a sensitive system’s drive at a specific point in time, or an entire redundant system. For example, you can keep a second email server on standby as a backup for your main email server. Redundancy is a powerful technique but is complex to manage. It requires frequent replication between cloned systems, and unless the redundant systems are in a remote site, it only protects against the failure of a specific system.

  3. External Hard Drive

You can deploy a high-volume external hard drive in your network, and use archive software to save changes to local files to that hard drive. Archive software allows you to restore files from the external hardware with an RPO of only a few minutes. However, as your data volumes grow, one external drive will not be enough, or the RPO will substantially grow. Using an external drive necessitates having it deployed on the local network, which is risky.

  4. Hardware Appliances

Many vendors provide complete backup appliances, typically deployed as a 19” rack-mounted device. Backup appliances come with large storage capacity and pre-integrated backup software. You install backup agents on the systems you need to back up, define your backup schedule and policy, and the data starts streaming to the backup device. As with other options, try to isolate the backup device from the local network and, if possible, place it in a remote site.

  5. Backup Software

Software-based backup solutions are more complex to deploy and configure than hardware appliances, but offer greater flexibility. They allow you to define which systems and data you’d like to back up, allocate backups to the storage device of your choice, and automatically manage the backup process.

  6. Cloud Backup Services

Many vendors and cloud providers offer Backup as a Service (BaaS) solutions, where you can push local data to a public or private cloud and, in case of disaster, recover it from the cloud. BaaS solutions are easy to use and have the strong advantage that data is saved in a remote location. However, if using a public cloud, you need to ensure compliance with relevant regulations and standards, and consider that over time, data storage costs in the cloud can be much higher than the cost of deploying similar storage on-premises.

What Is a 3-2-1 Backup Strategy?

A 3-2-1 backup strategy is a method for ensuring that your data is adequately duplicated and reliably recoverable. In this strategy, three copies of your data are created on at least two different storage media and at least one copy is stored remotely: 

 

  • Three copies of data—your three copies include your original data and two duplicates. This ensures that a lost backup or corrupted media do not affect recoverability.
  • Two different storage types—reduces the risk of failures related to a specific medium by using two different technologies. Common choices include internal and external hard drives, removable media, or cloud storage.
  • One copy off-site—removes the single point of failure created by keeping every copy in one location. Off-site duplicates are needed for robust disaster recovery and data backup strategies and can allow for failover during local outages. 

 

This strategy is considered a best practice by most information security experts and government authorities. It protects against both accidents and malicious threats, such as ransomware, and ensures reliable data backup and restoration.
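
The 3-2-1 rule can be expressed as a simple check over an inventory of copies. The sketch below is one possible way to encode it. The `DataCopy` class, its fields, and the sample inventory are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataCopy:
    name: str
    media: str      # e.g. "internal-disk", "external-disk", "tape", "cloud"
    offsite: bool   # True if the copy is stored away from the primary site


def satisfies_3_2_1(copies: List[DataCopy]) -> bool:
    """Check the three conditions of the 3-2-1 rule.

    The list is expected to include the original data as one of the copies.
    """
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


if __name__ == "__main__":
    inventory = [
        DataCopy("production", "internal-disk", offsite=False),
        DataCopy("local-backup", "external-disk", offsite=False),
        DataCopy("cloud-backup", "cloud", offsite=True),
    ]
    print(satisfies_3_2_1(inventory))
```

A check like this could run as part of a periodic backup audit, so that a retired device or a cancelled cloud subscription does not silently break the strategy.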

Server Backup: Backing Up Critical Business Systems

The easiest way to back up a server is with a server backup solution. These solutions can come in the form of software or appliances. 

 

Server backup solutions are typically designed to help you back up server data to another local server, a cloud server, or a hybrid system. In particular, backup to hybrid systems is becoming more popular. This is because hybrid systems enable you to optimize resources, support easy multi-region duplication, and can enable faster recovery and failover.

 

In general, server backup solutions should include the following features:

 

  • Support for diverse file types—solutions should not exclude any file types. In particular, they should support documents, spreadsheets, media, and configuration files. 
  • Backup location—you should be able to specify backup locations. The solution should support backup to a variety of locations and media, including on and off-site resources.
  • Scheduling and automation—in addition to enabling manual backups, solutions should support backup automation through scheduling. This helps ensure that you always have a recent backup and that backups are created in a consistent manner.
  • Backup management—you should be able to manage the lifecycle of backups, including number stored and length of time kept. Ideally, solutions also enable easy export of backups for transfer to external resources or for use in migration. 
  • Partition selection—partitions are isolated segments of a storage resource and are often used to separate data within a system. Solutions should enable you to back up and restore partitions independently.
  • Data compression—to minimize the storage needed for numerous backups, solutions should compress backup data. This compression needs to be lossless and maintain the integrity of all data. 
  • Backup type selection—you should be able to create a variety of backup types, including full, differential, and incremental backups. A differential backup captures all changes since the last full backup, while an incremental backup captures only the changes since the most recent backup of any type. These types can help you reduce the size of your backups and shorten backup time.
  • Scaling—backup abilities should not be limited by the volume of data on your servers. Solutions should scale as your data does and support backups of any size. 
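
The difference between full, differential, and incremental backups comes down to which cutoff time is used to select changed files. The following is a minimal sketch of that selection logic, assuming file modification timestamps are a reliable change indicator. The function name and parameters are hypothetical, and here an incremental backup is taken relative to the most recent backup of any type.

```python
from typing import Dict, List


def files_to_back_up(
    mtimes: Dict[str, float],
    backup_type: str,
    last_full: float,
    last_backup: float,
) -> List[str]:
    """Return the paths a backup run should copy.

    mtimes maps each path to its last-modified timestamp.
    last_full is the time of the most recent full backup.
    last_backup is the time of the most recent backup of any type.
    """
    if backup_type == "full":
        cutoff = float("-inf")   # copy everything
    elif backup_type == "differential":
        cutoff = last_full       # changes since the last full backup
    elif backup_type == "incremental":
        cutoff = last_backup     # changes since the most recent backup
    else:
        raise ValueError(f"unknown backup type: {backup_type}")
    return sorted(path for path, mtime in mtimes.items() if mtime > cutoff)


if __name__ == "__main__":
    # Hypothetical timeline: full backup at t=10, incremental at t=20.
    mtimes = {"a.txt": 5.0, "b.txt": 15.0, "c.txt": 25.0}
    for kind in ("full", "differential", "incremental"):
        print(kind, files_to_back_up(mtimes, kind, last_full=10.0, last_backup=20.0))
```

The trade-off shows up at restore time: a differential restore needs the last full backup plus one differential, while an incremental restore needs the last full backup plus every incremental taken since.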

Backup Storage Technology

Whichever technique you use to back up, at the end of the day, data must be stored somewhere. The storage technology used to hold your backup data is very significant:

  • The more cost-effective it is, the more data it is able to store, and the faster the storage and retrieval over a network, the lower your RPO and RTO will be.
  • The more reliable the storage technology, the safer your backups will be.

Below, you’ll find a review of backup storage technologies and their unique advantages.

Network Shares and NAS

You can set up centralized storage such as Network Attached Storage (NAS), Storage Area Network (SAN), or regular hard disks mounted as a network share using the Network File System (NFS) protocol. This is a convenient option for making large storage available to local devices for backup. However, it is susceptible to disasters affecting your entire data center, such as natural disasters or cyberattacks.

Tape Backup

Modern tape technology such as Linear Tape-Open 8 (LTO-8) can store up to 12 TB of uncompressed data (about 30 TB compressed) on a single tape. You can then ship the tape to a distant location, preferably at least 100 miles away from your primary location. Tape backups have been used for decades, but their obvious downside is the extremely high RTO and RPO due to the need to physically ship the tapes to and from a backup location. They also require a tape drive and an autoloader to perform backup and recovery, and this equipment is expensive.

Cloud-Based Object Storage

When using cloud providers, you have access to a variety of storage services, including object storage such as Amazon S3. Cloud providers typically charge a flat price per gigabyte stored, but costs can add up with frequent access. There are multiple tools that let you back up data to S3 automatically, both from within the cloud and from on-premises machines.
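
To see how access charges add to the flat storage price, consider this simplified cost sketch. All prices are hypothetical placeholders, not actual rates from any provider, and real bills usually include further items such as per-request fees.

```python
def monthly_cloud_cost(
    stored_gb: float,
    storage_price_per_gb: float,
    retrieved_gb: float = 0.0,
    retrieval_price_per_gb: float = 0.0,
) -> float:
    """Simplified monthly cost: flat storage charge plus data retrieval charge."""
    storage = stored_gb * storage_price_per_gb
    access = retrieved_gb * retrieval_price_per_gb
    return storage + access


if __name__ == "__main__":
    # Hypothetical prices, for illustration only.
    storage_only = monthly_cloud_cost(10_000, 0.02)
    with_restores = monthly_cloud_cost(10_000, 0.02, retrieved_gb=2_000,
                                       retrieval_price_per_gb=0.09)
    print(f"storage only:  ${storage_only:,.2f}")
    print(f"with restores: ${with_restores:,.2f}")
```

Running the same calculation with your provider's published prices and your expected restore volume gives a first estimate to compare against on-premises storage.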

Local Object Storage with Cloudian

Cloudian® HyperStore® is a massive-capacity object storage device that is fully compatible with Amazon S3. It can store up to 1.5 Petabytes in a 4U Chassis device, allowing you to store up to 18 Petabytes in a single data center rack. HyperStore comes with fully redundant power and cooling, and performance features including 1.92TB SSD drives for metadata, and 10Gb Ethernet ports for fast data transfer.


HyperStore is an on-premise data storage solution that can help you perform backups with RPO and RTO near zero, for almost any data volume.

Learn more about Cloudian® HyperStore®.

Learn More About Data Backup and Archive

Data backup is the process of protecting data in case of a disaster, accident, or malicious action, by copying it from one location to another. Data is the lifeblood of any organization, and losing it can lead to serious damage and interrupt business operations. Therefore, backing up your data is critical for both large and small businesses.

 

Data backup is a broad topic. There’s a lot more to learn about data backup and archive. To continue your research, take a look at the rest of our blogs on this topic:

 

Ensuring Your Data with Effective Backup Storage

Backup storage refers to physical locations or devices for storing copies of data for recovery in the event of failure or data loss. Backup storage systems usually include both the hardware and the software for managing copies and recovery. This includes anything from a simple thumb drive to a hybrid system of local physical storage and remote cloud storage. 

 

This article explains the concept of backup storage, and shows the different types of backup storage methods, including Network Attached Storage (NAS), external hard drives, and cloud storage.

 

Read more: Ensuring Your Data with Effective Backup Storage

 

NAS Backup: Supporting the Shared Environment

NAS is a dedicated file storage system that enables multiple users to share data. You can access this shared storage on a Local Area Network (LAN) via an Ethernet connection. NAS is designed for handling unstructured data like video, audio, text files, websites, and Microsoft Office documents.

 

NAS devices usually store data essential to the daily operations of an organization. Therefore, you need to protect NAS devices to keep data safe in the event of device failure, natural disaster, or human error.

 

Read more: NAS Backup: Supporting the Shared Environment

 

Using Storage Archives to Secure Data and Reduce Costs

A storage archive is a device or location for storing data that is rarely if ever accessed. Archives are usually more cost-effective than regular storage solutions, and they are frequently used for storing compliance data, log data, historical data, or legacy applications data.

 

There are three main types of data archives—governance archives, active archives, and cold data archives. This article explores the different storage archive options so you can build an effective archive strategy.

Read more: Using Storage Archives to Secure Data and Reduce Costs

 

Data Archives and Why You Need Them

Data archives and backups are not the same. Even though they are both used to store data, you should use them for different purposes. Data backups protect data that is currently in use. This enables you to restore corrupted or lost data from a single point in time. 

 

Data archives store data that is not currently in use. This enables you to restore data across a period of time. Archives store data in an indexed fashion, through the use of metadata. To retrieve data, you need to know the search parameters like author name or file contents.

 

Read more: Data Archives and Why You Need Them

 

Distributed Storage: What’s Inside Amazon S3?

Distributed storage systems are designed to split data across multiple physical servers, and usually across more than one data center. Distributed storage systems take the form of a cluster of storage units. Each cluster has a mechanism for data coordination and synchronization between cluster nodes.

 

Scalable cloud storage systems like Amazon S3 and Microsoft Azure Blob Storage are based on distributed storage. This article explains the concept of distributed storage technologies and services like Amazon S3.

 

Read more: Distributed Storage: What’s Inside Amazon S3?

 

Backup Cloud Storage: Ensuring Business Continuity

Cloud backup refers to the procedure of storing copies of cloud data in another location. This enables you to restore information in case of data compromise, downtime or damage. Additionally, organizations often need to back up cloud data to comply with regulations. They can face penalties and fines if they neglect to do so.

 

This article explains the concept of cloud backup and its importance, discusses on-premise backup solutions and compares the pros and cons of on-premise and cloud-based backup models.

 

Read more: Backup Cloud Storage: Ensuring Business Continuity

 

Storage Tiering: Making the Most of Your Storage Investment

Storage tiering is a method for using storage systems efficiently by placing data according to its importance or business value. A tiered storage solution provides several types of storage, including solid-state drives (SSDs), magnetic disk drives, and tape storage. The most frequently accessed or important data is stored on the fastest and most expensive SSDs, and the least important data on the slowest, cheapest media.

Read more: Storage Tiering: Making the Most of Your Storage Investment

 

Private Cloud Storage: Bringing True Cloud Storage In-House

Private cloud storage is a service model for provisioning storage to users in an organization. This service model offers storage on-demand, with the same private cloud capabilities: on-demand access, resource pooling, elasticity and metering.

 

Companies usually invest in private cloud storage to address compliance or security requirements. Another use case is on-premises applications that require low-latency or high-throughput access to data, making it necessary to place the storage physically near the storage consumer.

 

Read more: Private Cloud Storage: Bringing True Cloud Storage In-House

 

Ransomware Backup: How to Get Your Data Back

Ransomware is a type of malware that prevents users from accessing their files. When ransomware infects a system, it starts searching for valuable files and encrypting them. Files are encrypted using asymmetric key encryption, where attackers hold the private key that can decrypt the files.

 

Data backup is the best way to protect yourself against ransomware. If you have a clean backup of your data when ransomware strikes, and are able to prevent ransomware from reaching the backup and encrypting it too, you have a safe and easy way to recover without paying the ransom.

 

Read more: Ransomware Backup: How to Get Your Data Back

 

See Our Additional Guides on Key Data Storage Topics:

We have authored in-depth guides on several other data storage topics that can also be useful as you explore the world of data backup.

Data Protection Guide

Data protection relies on technologies such as data loss prevention (DLP), storage with built-in data protection, firewalls, encryption, and endpoint protection. Learn the difference between data protection and data privacy, and how to apply best practices to ensure the continual protection of your data.

 


 

Hybrid IT Guide

Hybrid IT is a blend of on-premise and cloud-based services that has emerged with the increasing migration of businesses to cloud environments. Learn about hybrid IT, implementation solutions, and practices, and discover how Cloudian can help optimize your implementation.

 


IT Disaster Recovery Guide

IT disaster recovery is the practice of anticipating, planning for, surviving, and recovering from a disaster that may affect a business. Learn what disaster recovery is, how it can benefit your business, and four essential features any disaster recovery program must include to be effective.

 


 

VMware Storage Guide

VMware provides a variety of ways for virtual machines to access storage. It supports multiple traditional storage models including SAN, NFS, and Fibre Channel (FC), which allow virtualized applications to access storage resources in the same way as they would on a regular physical machine. 

 


 

Health Data Management Guide

Health Data Management (HDM), also known as Health Information Management (HIM), is the systematic organization of health data in digital form. Learn what health data management is, the types of data it encompasses, and the unique challenges and considerations for storing petabytes of health data.


 

Splunk Architecture Guide

Splunk is a distributed system that aggregates, parses and analyses log data. This article explains how the Splunk big data pipeline works, how components like the forwarder, indexer and search head interact, and the different topologies you can use to scale your Splunk deployment.

 
