Not all data is equal. Frequency of access, security needs, and cost considerations vary, so data storage architectures need to provide different storage tiers to address these varying requirements. Storage tiers can differ by disk drive type, RAID configuration, or even be completely different storage sub-systems, each offering a different I/O profile and cost impact.
Data tiering is the movement of data between different storage tiers, ensuring that the appropriate data resides on the appropriate storage technology. In modern storage architectures, this data movement is invisible to the end-user application and is typically controlled and automated by storage policies. Typical data tiers may include:
- Flash storage – High value, high-performance requirements, usually smaller data sets; cost is less important compared to the performance Service Level Agreement (SLA) required
- Traditional SAN/NAS Storage arrays – Medium value, medium performance, medium cost sensitivity
- Object Storage – Less frequently accessed data with larger data sets. Cost is an important consideration
- Public Cloud – Long-term archival for data that is rarely, if ever, accessed
Typically, structured data sets belonging to applications and data sources such as OLTP databases, CRM, email systems, and virtual machines are stored on tiers 1 and 2 above. Unstructured data is increasingly moving to tiers 3 and 4, as these are typically much larger data sets where performance is less critical and cost becomes a more significant factor in management and purchasing decisions.
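As a concrete illustration of the policy-driven movement described above, the sketch below uses the S3 lifecycle API (via boto3) to transition objects to colder storage classes as they age. The bucket name, prefix, and day thresholds are hypothetical assumptions and would be tuned to an organization's own SLAs.

```python
# Minimal sketch of an automated tiering policy using the S3 lifecycle API.
# Bucket name, prefix, and transition ages are illustrative assumptions only.
import boto3

s3 = boto3.client("s3")

lifecycle_policy = {
    "Rules": [
        {
            "ID": "tier-unstructured-data",
            "Filter": {"Prefix": "unstructured/"},   # apply only to this data set
            "Status": "Enabled",
            "Transitions": [
                # After 30 days, move to infrequent-access storage (a tier 3 analogue).
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                # After 180 days, archive to Glacier (a tier 4 analogue).
                {"Days": 180, "StorageClass": "GLACIER"},
            ],
        }
    ]
}

s3.put_bucket_lifecycle_configuration(
    Bucket="example-archive-bucket",   # hypothetical bucket
    LifecycleConfiguration=lifecycle_policy,
)
```

Once a policy like this is in place, the storage platform moves the data automatically while applications continue to address the same bucket and keys, which is what makes the tiering invisible to end users.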
Some Shortcomings of Data Tiering to Public Cloud
Public cloud services have become an attractive data tiering solution, especially for unstructured data, but there are considerations around public cloud use:
- Performance – Public network access will typically be a bottleneck when reading and writing data to public cloud platforms, along with data retrieval times (which depend on the SLA provided by the cloud service). For backup data especially, backup and recovery windows remain critically important, so it is worth considering holding the most recent backup sets onsite and archiving only older backup data to the cloud.
- Security – Certain data sets/industries have regulations stipulating that data must not be stored in the cloud. Being able to control what data is sent to the cloud is of major importance.
- Access patterns – Data that is re-read frequently may incur additional network bandwidth costs imposed by the public cloud service provider. Understanding your use of data is vital to control the costs associated with data downloads.
- Cost – As well as the bandwidth costs associated with reading data back, storing large quantities of data in the cloud may not make the most economical sense, especially when compared to the economics of on-premises cloud storage. These trade-offs should be evaluated; a rough cost sketch follows this list.
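As a rough illustration of how access patterns drive cost, the sketch below estimates monthly storage and egress charges for a data set that is re-read periodically. The per-GB rates are placeholder assumptions, not actual provider pricing, and should be replaced with the figures from your own cloud agreement.

```python
# Rough cost model for cloud-tiered data: storage plus egress for re-read data.
# All rates below are illustrative placeholders, not actual provider pricing.

def monthly_cloud_cost(stored_tb: float,
                       reread_fraction: float,
                       storage_rate_per_gb: float = 0.004,   # assumed archive-tier rate
                       egress_rate_per_gb: float = 0.09):    # assumed egress rate
    """Estimate monthly cost of storing data in the cloud and re-reading part of it."""
    stored_gb = stored_tb * 1024
    storage_cost = stored_gb * storage_rate_per_gb
    egress_cost = stored_gb * reread_fraction * egress_rate_per_gb
    return storage_cost, egress_cost

# Example: 200 TB archived, with 10% of it re-read each month.
storage, egress = monthly_cloud_cost(stored_tb=200, reread_fraction=0.10)
print(f"Storage: ${storage:,.0f}/month, Egress: ${egress:,.0f}/month")
```

Even with a modest re-read rate, the egress charge can rival or exceed the storage charge under these assumed rates, which is why understanding access patterns (and keeping frequently re-read data on premises) matters when deciding what to tier to the public cloud.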
Using Hybrid Cloud for a Balanced Data Tier Strategy
For unstructured data, a hybrid approach to data management is key, with an automation engine, data classification, and granular control of data being the requirements necessary to really deliver on this premise.
A hybrid cloud approach lets you push any data to the public cloud while also affording you the control that comes with on-premises storage. For any data storage system, granularity of control and management is extremely important, as different data sets have different management requirements and need different SLAs applied according to the value of the data to an organization.
Cloudian HyperStore is a solution that gives you that flexibility, easily moving data between tiers 3 and 4 listed earlier in this post. Not only do you get the control and security of your own data center, you can also integrate HyperStore with many different destination cloud storage platforms, including Amazon S3/Glacier, Google Cloud Platform, and any other cloud service offering S3 API connectivity.
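Because HyperStore exposes an S3-compatible API, standard S3 tooling can address the on-premises tier directly. The sketch below points a boto3 client at a hypothetical on-premises endpoint; the endpoint URL, credentials, and bucket name are assumptions for illustration, not actual HyperStore configuration values.

```python
# Minimal sketch: using standard S3 tooling against an S3-compatible
# on-premises endpoint. Endpoint, credentials, and bucket are hypothetical.
import boto3

onprem = boto3.client(
    "s3",
    endpoint_url="https://hyperstore.example.internal",  # assumed on-prem endpoint
    aws_access_key_id="EXAMPLE_KEY",
    aws_secret_access_key="EXAMPLE_SECRET",
)

# Write an object to the on-premises tier exactly as you would to public S3.
onprem.put_object(
    Bucket="onprem-tier3",            # hypothetical bucket
    Key="backups/2024/weekly.tar.gz",
    Body=b"example payload",
)
```

Keeping the same S3 API on both sides means moving data between the on-premises tier and public cloud destinations becomes largely a matter of policy rather than application changes.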
Learn more about our solutions today.
Learn more about NAS backup here.