David Axler, Principal Architect



How to tier data to (or among) clouds with Cloudian HyperStore

To help build out a hybrid cloud deployment, Cloudian HyperStore can perform an important data management function by moving or copying data to the cloud. This blog describes how to configure your Cloudian system for this, and how to make that data available in native formats.

Configurable lifecycle policies

Tiering in a Cloudian HyperStore cluster is configured via lifecycle policies. These flexible policies support a variety of data management options, such as:
• Move and/or expire data locally
• Move data to remote systems or clouds
• Keep local and remote copies of data
• Create immutable backup copies of data

For all data that is tiered between systems, Cloudian HyperStore maintains a local copy of the metadata and sends a full copy of the metadata to the destination system. This ensures that you can access the data using the native tools of that cloud. (This is demonstrated in a hybrid cloud configuration with Azure in this video.)

In a tiering configuration, Cloudian HyperStore supports various destinations including, but not limited to, AWS, Azure, GCP and other Cloudian HyperStore systems.

A lifecycle policy can apply to all objects in a bucket, or to a subset of objects filtered by object name prefix and/or by object size. You can also create multiple lifecycle rules for a bucket, so that different sets of objects in the same bucket are handled differently.

If versioning is enabled, you can tier either current or previous versions.
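The filtering and versioning options above map naturally onto the standard S3 lifecycle configuration format. The following is a minimal sketch, not Cloudian's documented procedure: the helper name build_tiering_rule, the bucket and rule names, and the "GLACIER" storage class are placeholders chosen for illustration, and the way HyperStore identifies the actual tiering destination is not shown here (check the HyperStore documentation for that). The field names (Filter, Prefix, ObjectSizeGreaterThan, Transitions, NoncurrentVersionTransitions) are those of the S3 lifecycle API.

```python
def build_tiering_rule(rule_id, prefix="", min_size_bytes=None,
                       current_after_days=None, noncurrent_after_days=None,
                       storage_class="GLACIER"):
    """Build one S3-style lifecycle rule as a plain dict (hypothetical helper)."""
    if current_after_days is None and noncurrent_after_days is None:
        raise ValueError("rule needs at least one transition")

    # Collect the filter conditions: object name prefix and/or object size.
    conditions = {}
    if prefix:
        conditions["Prefix"] = prefix
    if min_size_bytes is not None:
        conditions["ObjectSizeGreaterThan"] = min_size_bytes

    if len(conditions) > 1:
        rule_filter = {"And": conditions}   # several conditions must be wrapped in "And"
    elif conditions:
        rule_filter = conditions
    else:
        rule_filter = {"Prefix": ""}        # empty prefix: rule covers every object

    rule = {"ID": rule_id, "Status": "Enabled", "Filter": rule_filter}
    if current_after_days is not None:
        # Applies to the current version of each object.
        rule["Transitions"] = [
            {"Days": current_after_days, "StorageClass": storage_class}
        ]
    if noncurrent_after_days is not None:
        # Applies to previous versions in a versioned bucket.
        rule["NoncurrentVersionTransitions"] = [
            {"NoncurrentDays": noncurrent_after_days, "StorageClass": storage_class}
        ]
    return rule


# Two rules on one bucket: large log objects, and previous versions of everything.
lifecycle = {
    "Rules": [
        build_tiering_rule("tier-large-logs", prefix="logs/",
                           min_size_bytes=1024 * 1024, current_after_days=30),
        build_tiering_rule("tier-old-versions", noncurrent_after_days=7),
    ]
}


def apply_lifecycle(bucket, lifecycle_config, endpoint_url):
    """Send the configuration to an S3-compatible endpoint (requires boto3)."""
    import boto3  # third-party; imported here so the builder above runs without it
    s3 = boto3.client("s3", endpoint_url=endpoint_url)
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=lifecycle_config
    )
```

Calling apply_lifecycle("my-bucket", lifecycle, "https://s3.example.com") would submit both rules in one request; the bucket name and endpoint URL here are placeholders.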

 


 

If you are using tiering to create additional data copies, you can choose to retain a local copy and to use bridge mode, in which all objects are tiered to the destination immediately.

There are multiple options for retrieving data (objects) from the external location, including:
• Stream: Gets the object from the destination and immediately streams it through to the client.
• Require Restore: Requires S3 client applications to first execute a RestoreObject request.
• Cache (Stream & Restore): The local system retrieves (using a GET command) the object from the tiering destination system, immediately streams it through to the client, and simultaneously saves a copy of the object to local storage.
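From the client's side, the Stream and Cache options look like an ordinary GET, while Require Restore needs an extra step. The sketch below shows one way a client might handle both cases. It assumes a boto3-style S3 client and assumes that a GET on a tiered object which has not been restored fails with the error code InvalidObjectState, as it does on AWS S3; confirm the exact behavior for your HyperStore version. The function name get_or_request_restore is hypothetical.

```python
def get_or_request_restore(client, bucket, key, restore_days=1):
    """Return the object's bytes, or None if a restore had to be requested.

    `client` is expected to behave like a boto3 S3 client.
    """
    try:
        response = client.get_object(Bucket=bucket, Key=key)
        return response["Body"].read()
    except Exception as err:
        # botocore's ClientError carries the S3 error code in err.response.
        error_code = getattr(err, "response", {}).get("Error", {}).get("Code")
        if error_code != "InvalidObjectState":
            raise  # unrelated failure: let the caller see it

    # The object is tiered and must be restored before it can be read.
    client.restore_object(
        Bucket=bucket,
        Key=key,
        RestoreRequest={"Days": restore_days},  # how long to keep the restored copy
    )
    return None
```

A caller that receives None would retry the GET later, once the restore has completed.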

Lifecycle policies, along with storage policies, allow full control over where your data resides. While storage policies manage the protection of objects, lifecycle policies deal with the movement and migration of those objects. Keeping a copy of the metadata locally allows Cloudian HyperStore to act as a single S3-compatible storage endpoint, while providing access to the data sitting on any public cloud.

With Cloudian HyperStore version 7.5.2 or later, the same technique can be used to create immutable backup copies of data across multiple HyperStore systems. This strategy will be covered in a later blog post.