Using Storage Archives to Secure Data and Reduce Costs

Modern businesses are constantly creating and modifying data, much of which is used only briefly, but must be retained for compliance reasons or for historical analysis. When you have data that must be kept long term, you can save costs and resources by archiving it. This article will help clarify your options so you can build an effective archive strategy.

In this article:

• What a storage archive is

• Types of archives

• Types of storage media

• Features of good solutions

• Archiving with Cloudian

What Is a Storage Archive?

A storage archive is used to preserve data that is rarely, if ever, accessed, often for long periods of time. It is more cost-effective than primary storage and is frequently used for data related to compliance or auditing, log data, historical data, or data generated by retired applications.

Types of Data Archives

There are three main types of data archives:

Governance Archive

Governance archives are designed in response to regulatory and audit requirements and typically fall under the areas of record management, risk management, or compliance readiness. These archives contain primarily communications data, like emails or instant messages, but can also include documents, images, websites, or social media information. They must be easily searchable, and their data must be quickly retrievable in case of eDiscovery or audit.

Active Data Archive

Active archives are useful for data that is infrequently accessed but still needs to be available. The data they store usually isn’t read-write intensive and is often static, allowing the use of lower-performance media, like tape. Active solutions tend to be user-centric and sometimes include software meant to simplify retrieval and searching of records. Data in active archives is often replicated in other archive systems.

Cold Data Archive

Cold data archives are useful for data that is infrequently or never accessed, such as backups or data from legacy applications, with the aim of storing this data as cheaply as possible. These archives typically have very slow data retrieval times and no integrated user access. These limitations can make them a liability in cases of eDiscovery or audit and often lead organizations to spend additional money developing or purchasing a UI to simplify use.

Storage Archive Media

In order to find a solution that best suits your needs, you’ll need to weigh the benefits and drawbacks of the media available and choose accordingly. Many strategies use multiple media types to accommodate user needs and data priority.

Tape Drive

Tape is a cheap and reliable medium with a long history of use. Its offline nature makes it especially useful for protecting data from cyber threats and malware.

Advantages include:

  • Significant storage capacity at good transfer speeds
  • Minimal storage requirements with a long shelf life
  • Reliable error detection and correction with built-in read-after-write verification
  • Backward compatibility with one or two previous tape generations, depending on the format and drive

Disadvantages include:

  • Sequential access makes retrieval and searching slower
  • Requires special drive or tape library to read or write data
  • Prone to wear with use and sensitive to environmental conditions

Optical Media Storage

Optical discs, such as CDs and DVDs, are commonly used as write once, read many (WORM) storage. They are useful when you need highly portable storage that you don’t want to be overwritten.

Advantages include:

  • Longest shelf life
  • Less vulnerable to wear and tear, with no moving parts in the media itself to fail
  • Compact size makes it highly portable

Disadvantages include:

  • Low storage capacity
  • Slow read times and slow write performance
  • Requires an optical drive to read data and a drive with write capability to write data


Disk Drive

Disk storage offers a good storage-to-cost ratio and can include features for local and remote replication, data deduplication, and faster searching.

Advantages include: 

  • Random access allows faster reads and writes
  • Protection against a single drive failure when using RAID
  • Can be paired with indexing engines for faster searching


Disadvantages include:

  • Expensive to purchase, maintain, store, and upgrade
  • Relatively short lifespan and high failure rate
  • Energy-intensive operation requires environmental controls like cooling and air filtering


Removable Disk

Removable disk storage, such as thumb drives or external hard drives, trades capacity for portability and is therefore used primarily by individuals or small to medium-sized businesses.

Advantages include:

  • Random access allows faster reads and writes
  • Available in multi-disk configurations
  • Portable and allows offline storage


Disadvantages include:

  • Poor cost-to-capacity ratio
  • Requires media handling, increasing risk of damage


Cloud Platforms

Cloud storage is a good option for businesses of all sizes, particularly if they operate in a decentralized fashion. This medium’s remote nature makes it easier to support globally distributed operations and protects against localized disasters.

Advantages include:

  • Reduced costs since you don’t need to purchase, store, or maintain equipment
  • Highly flexible medium with good scalability and application integration capabilities
  • Data is remotely accessible, and providers typically offer built-in encryption


Disadvantages include:

  • Requires network or internet access for use
  • Requires specialized software for transfer and access to data
  • Reliance on the provider can create lock-in

Features of a Good Archiving Solution

If an archiving solution lacks certain key features, the time and effort required to use it can outweigh any benefits.

Solutions must include efficient search capabilities. You should be able to search for data based on type (document, PDF, email, etc.), source of origin (server, application, device, etc.), author, and by the structure of the data contained within (SSNs, bank routing numbers, credit card numbers, etc.).
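As a rough illustration of searching by the structure of the data, the Python sketch below scans text for values shaped like U.S. Social Security numbers or 16-digit card numbers. The patterns, the `find_sensitive_patterns` name, and the use of a Luhn checksum to filter card candidates are assumptions made for this example; commercial archiving tools validate far more strictly.

```python
import re

# Simplified, illustrative patterns (assumptions, not a vendor's rule set).
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_sensitive_patterns(text: str) -> dict:
    """Map each pattern label to the matching strings found in `text`."""
    hits = {}
    for label, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if label == "credit_card":
            # Drop 16-digit strings that cannot be real card numbers.
            matches = [m for m in matches if luhn_valid(m)]
        if matches:
            hits[label] = matches
    return hits
```

An indexing job could run a function like this over each archived document and store the resulting labels as searchable metadata.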

Audit tracking features are essential. Solutions should provide audit trails that show who is accessing data, when they’re accessing it, and what specifically is being accessed.
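One way to picture an audit trail entry is as a small structured record that captures the who, when, and what of each access. The sketch below is a minimal Python example; the `audit_record` function and its field names are hypothetical, not a format used by any particular product.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def audit_record(user: str, action: str, object_id: str,
                 when: Optional[datetime] = None) -> str:
    """Build one JSON line recording who did what to which object, and when."""
    when = when or datetime.now(timezone.utc)
    return json.dumps(
        {
            "who": user,
            "when": when.isoformat(),
            "action": action,
            "what": object_id,
        },
        sort_keys=True,
    )
```

Appending one such line per access to a write-once log gives auditors a chronological record they can filter by user, time range, or object.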

Data deduplication features are key to keeping archive size, and thus cost, low. Deduplication stores each unique piece of data only once and replaces repeated copies with references to that baseline copy, so only new or changed data consumes additional space. These features can operate at the file, block, or bit level, with bit-level deduplication leaving the least redundancy.
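To make the idea concrete, here is a minimal Python sketch of block-level deduplication. It assumes fixed-size blocks identified by their SHA-256 digest; the `DedupStore` class is a toy written for this article, and real products typically use more sophisticated chunking and on-disk indexes.

```python
import hashlib


class DedupStore:
    """Toy block-level deduplicating store using fixed-size blocks."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.blocks = {}  # SHA-256 hex digest -> block bytes, stored once
        self.files = {}   # file name -> ordered list of block digests

    def put(self, name: str, data: bytes) -> None:
        """Split data into blocks and store only blocks not seen before."""
        digests = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # keep the first copy only
            digests.append(digest)
        self.files[name] = digests

    def get(self, name: str) -> bytes:
        """Rebuild a file by following its block references."""
        return b"".join(self.blocks[d] for d in self.files[name])

    def stored_bytes(self) -> int:
        """Total bytes physically held after deduplication."""
        return sum(len(b) for b in self.blocks.values())
```

Storing two files that share most of their blocks consumes space only for the blocks that differ, which is why archives full of near-identical backups tend to deduplicate well.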

Good solutions are flexible and prevent media or vendor lock-in. They allow multiple data platforms to be used for both data writing and retrieval, making it easier for you to change or update systems as needed. They need to be able to handle multiple data types, from application logs to archives of social networking sites.

Automation is vital to reduce the amount of time spent creating, auditing, and modifying archives. Good solutions allow you to create policies that schedule when data is archived, define its lifecycle, and manage access permissions. They should also provide logging of these processes and alerts in case of write failure.
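A policy of this kind can be reduced to a simple age-based rule. The Python sketch below is illustrative only: the `ArchivePolicy` fields and the `next_action` function are hypothetical names, and the thresholds in the usage note are example values rather than recommendations.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ArchivePolicy:
    archive_after_days: int  # age at which data moves to the archive
    delete_after_days: int   # age at which retention ends


def next_action(last_modified: date, today: date, policy: ArchivePolicy) -> str:
    """Return 'keep', 'archive', or 'delete' based on the age of the data."""
    age = (today - last_modified).days
    if age >= policy.delete_after_days:
        return "delete"
    if age >= policy.archive_after_days:
        return "archive"
    return "keep"
```

For example, with `ArchivePolicy(archive_after_days=90, delete_after_days=2555)`, a file last modified 31 days ago is kept in place, while one untouched for more than 90 days is scheduled for archiving. A scheduler would run this check periodically and log each decision.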

Archiving with Cloudian

Over time, it is likely that you will accumulate data that still holds value for your business but that doesn’t need to be available instantly. Archiving this data is a good way to ensure that it is kept safe without taking up expensive resources. The variety of archive options available allows you to create a solution that suits your needs and, if you select strategically, can simplify future archiving and retrieval.

You can simplify the process of archiving data with solutions like Cloudian HyperStore, which is an on-premises object storage platform available as an appliance or software. This solution is scalable and can be integrated with cloud and third-party migration services, making it adaptable to your needs.

HyperStore is fully S3 API compliant and includes automatic data verification and encryption. It allows you to tag your data with custom metadata for intelligent search or analytic functions, and to manage stored data with bucket-level policies that determine replication schedules and lifecycle durations. You can also create policies dictating erasure coding and replication according to data type. HyperStore can help you store your data securely and efficiently while keeping it accessible to your broader storage systems.
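Because the platform exposes the S3 API, standard S3 client libraries should in principle work against it. The Python sketch below shows what tagging an object with custom metadata and setting a bucket lifecycle rule might look like. It assumes a client object with the `put_object` and `put_bucket_lifecycle_configuration` methods of boto3's S3 client; the endpoint URL, bucket name, and helper function names are placeholders, and which lifecycle options a given deployment supports should be confirmed in Cloudian's documentation.

```python
def archive_object(s3_client, bucket: str, key: str, body: bytes, metadata: dict) -> None:
    """Upload one object with custom metadata via the S3 PutObject call."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=body, Metadata=metadata)


def expiration_rule(prefix: str, days: int) -> dict:
    """Build an S3 lifecycle configuration that expires objects under `prefix`."""
    return {
        "Rules": [
            {
                "ID": f"expire-{prefix.strip('/')}-after-{days}d",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_lifecycle(s3_client, bucket: str, prefix: str, days: int) -> None:
    """Attach the expiration rule to the bucket."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=expiration_rule(prefix, days)
    )


# Example wiring (requires the boto3 package; the endpoint is a placeholder):
#   import boto3
#   s3 = boto3.client("s3", endpoint_url="https://s3.example.internal")
#   archive_object(s3, "archive", "logs/2020/app.log", b"...", {"department": "finance"})
#   apply_lifecycle(s3, "archive", "logs/", 2555)
```

Passing the client in as a parameter keeps the helpers independent of any one endpoint, so the same code can target an on-premises S3-compatible store or a public cloud bucket.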