What is Cloud Native Storage?
Cloud native is a new paradigm for developing and operating software applications, including technology trends like cloud computing, containerization, serverless, and microservices. Cloud-native storage is a storage technology designed for use in a cloud-native environment.
A cloud-native storage platform provides data management for stateful applications and addresses ongoing data storage challenges in cloud-native environments based on Kubernetes or other cloud native infrastructure. Cloud-native storage solutions can be based on modern object storage technology, block storage, or traditional disk drives in a distributed architecture.
Most cloud-native storage solutions mimic the nature of cloud-native tools and environments as described by the Cloud Native Computing Foundation (CNCF). These features include scalability, high availability, vendor neutrality, security, resiliency, manageability, observability, declarative deployment and API-based automation.
In this article, you will learn:
- Cloud-Native Storage: Key Characteristics
- Common Cloud-Native Storage Solution Models
- On-Premise Cloud Native Storage with Cloudian
Cloud-Native Storage: Key Characteristics
The key characteristics that define cloud native storage are availability, scalability, performance, consistency, durability, and dynamic deployment.
Cloud native storage must be highly available. Storage system availability is the ability to access data in the event of a failure—whether in the storage medium, transmission system, controller, or any other component. There are three elements to storage availability:
- Maintaining redundant copies of data on another storage device
- Handling failover to redundant devices in case of failure
- Healing and restoring failed components
Availability can be measured using several metrics:
- Recovery time objective (RTO)—the time from failure to restoration of service
- Recovery point objective (RPO)—the age of the most recent recoverable copy of the data, which determines the maximum amount of data that can be lost in case of failure
- Percentage of uptime—% of total time the service is up and available
- Mean time between failures (MTBF)—the average time the service runs between one fault and the next
- Mean time to recovery (MTTR)—the average time it takes the service to recover from a failure
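The last three metrics are related: under the common steady-state model, the fraction of time a service is up equals MTBF / (MTBF + MTTR). The short Python sketch below illustrates that relationship. The function names and the sample figures are illustrative assumptions, not measurements from any real storage system.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of total time the service is up."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR must not be negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def annual_downtime_hours(avail: float, hours_per_year: float = 8760.0) -> float:
    """Expected downtime per year for a given availability fraction."""
    return (1.0 - avail) * hours_per_year


# Hypothetical figures: a fault roughly every 999 hours, one hour to recover.
a = availability(mtbf_hours=999.0, mttr_hours=1.0)
print(f"uptime: {a:.3%}")
print(f"expected downtime per year: {annual_downtime_hours(a):.2f} hours")
```

With these sample figures the uptime works out to 99.9%, which corresponds to a little under nine hours of downtime per year.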
Cloud native storage must be easily scalable. The scalability of a storage system can be defined in four dimensions:
- Client scalability—ability to grow the number of clients or users accessing the storage system
- Throughput scalability—ability to deliver higher throughput, measured as MB/sec or GB/sec, or a larger number of operations per second, through the same interface
- Capacity scalability—ability to grow storage capacity in a single deployment of storage systems, measured either in the number of gigabytes, terabytes, or petabytes that can be stored, or in the number of files or objects the service can store
- Cluster scalability—ability to grow a cluster of storage components by adding more components as needed.
Cloud native storage should support predictable, scalable performance and service levels. Storage system performance is typically measured from one or more of the following perspectives:
- Time to complete a read or write operation
- Maximum number of storage operations per second
- Throughput of data that can be stored or retrieved in MB/s or GB/s
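These three perspectives are linked by simple arithmetic. Assuming a fixed I/O size and a fixed number of operations in flight, the operation rate is bounded by the per-operation latency (Little's law), and throughput is the operation rate multiplied by the I/O size. The Python sketch below shows the calculation. The function names and sample values are illustrative assumptions, and real systems rarely hit these ideal bounds.

```python
def max_ops_per_second(latency_ms: float, outstanding_ops: int = 1) -> float:
    """Upper bound on operations per second when each operation takes
    latency_ms and outstanding_ops operations are in flight at once."""
    if latency_ms <= 0 or outstanding_ops < 1:
        raise ValueError("latency must be positive and at least one op in flight")
    return outstanding_ops * 1000.0 / latency_ms


def throughput_mib_per_s(ops_per_second: float, io_size_kib: float) -> float:
    """Throughput implied by an operation rate and a fixed I/O size."""
    return ops_per_second * io_size_kib / 1024.0  # KiB/s -> MiB/s


# Hypothetical workload: 0.5 ms per operation, 32 operations in flight, 4 KiB I/O.
ops = max_ops_per_second(latency_ms=0.5, outstanding_ops=32)
print(ops)                             # 64000.0
print(throughput_mib_per_s(ops, 4.0))  # 250.0
```

The same operation rate with a 64 KiB I/O size would imply sixteen times the throughput, which is why operation rate and throughput are usually reported together with the I/O size used.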
Cloud native storage should support consistency as follows:
- Read operations should return the correct, updated data after write, update, or delete operations
- If there is no delay between modification of data and availability of the new data to read operations by clients, the system is “strongly consistent”
- If there is a delay until read operations return the updated data, the system is “eventually consistent”
In an eventually consistent system, the read delay can be treated as a recovery point objective (RPO): updates that have not yet propagated to other copies are the data that may be lost if a component fails.
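The difference between the two consistency models can be illustrated with a toy Python model of a primary copy and a single replica. This is a minimal sketch under simplifying assumptions (one replica, in-memory dictionaries, replication triggered by hand); the class and method names are hypothetical and do not correspond to any particular storage product.

```python
class ReplicatedStore:
    """Toy model: writes land on a primary, reads are served from a replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []  # writes accepted by the primary, not yet on the replica

    def replicate(self):
        """Apply all pending writes to the replica, in order."""
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()

    def write(self, key, value, synchronous):
        self.primary[key] = value
        self.pending.append((key, value))
        if synchronous:
            # Strong consistency: the write completes only once the replica has it.
            self.replicate()

    def read(self, key):
        return self.replica.get(key)


store = ReplicatedStore()

store.write("a", 1, synchronous=True)
print(store.read("a"))     # 1: the update is visible immediately

store.write("a", 2, synchronous=False)
print(store.read("a"))     # 1: stale read until replication catches up
print(len(store.pending))  # 1: this write would be lost if the primary failed now

store.replicate()
print(store.read("a"))     # 2: the replica has converged
```

The `pending` list is the model's version of the replication lag: its contents are exactly the updates at risk if the primary fails before replication completes.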
Cloud native storage should be durable, meaning it protects data against loss. Durability goes beyond accessibility—it describes the system’s ability to ensure the data remains stored for a long period of time. The following factors affect the durability of a storage system:
- Layers of data protection, such as the number of data copies available
- Levels of redundancy—for example, local redundancy, redundancy to a remote site, redundancy over public cloud availability zones, and redundancy over regions
- Durability characteristics of storage media—for example, SSD, rotating disk, tape
- The system’s ability to detect corruption due to component failure, wearout of storage media, and so on, and automatically reconstruct or restore corrupted data
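The last factor, detecting corruption and restoring from a healthy copy, is often implemented by storing a checksum with the data and periodically verifying every copy against it. The Python sketch below illustrates the idea using SHA-256 from the standard library. The `scrub` function and the sample data are hypothetical, and production systems typically use erasure coding or more elaborate repair logic than this.

```python
import hashlib
from typing import List, Tuple


def checksum(data: bytes) -> str:
    """Content checksum recorded when the data is first written."""
    return hashlib.sha256(data).hexdigest()


def scrub(copies: List[bytes], expected: str) -> Tuple[List[bytes], int]:
    """Verify each copy against the expected checksum and replace corrupted
    copies with a healthy one. Returns the repaired copies and the repair count."""
    healthy = next((c for c in copies if checksum(c) == expected), None)
    if healthy is None:
        raise RuntimeError("all copies are corrupted; the data cannot be restored")
    repaired = 0
    result = []
    for copy in copies:
        if checksum(copy) == expected:
            result.append(copy)
        else:
            result.append(healthy)  # overwrite the bad copy with a good one
            repaired += 1
    return result, repaired


original = b"customer-records"
expected = checksum(original)

# Three copies, one of which has been silently corrupted.
copies = [original, b"customer-rec0rds", original]
copies, repaired = scrub(copies, expected)
print(repaired)                             # 1
print(all(c == original for c in copies))   # True
```

The sketch also shows why the number of copies matters: with a single copy, corruption can be detected but not repaired.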
The final desired criterion in cloud native storage systems is the ability to deploy or provision them easily on demand. Storage systems can be deployed or instantiated in a variety of ways, including:
- Hardware deployment—physical storage equipment deployed in a data center. Cloud native storage using this deployment model should be built of standardized components that can be added to a cluster with no special configuration, and removed or swapped when needed.
- Software deployment—storage defined as software components running on commodity hardware, devices, or cloud instances. Cloud native software solutions can typically be installed in both local and cloud environments. Some software-defined storage systems are built as containers and can be deployed automatically using orchestrators.
- Cloud service—cloud services managed by public cloud providers and delivered as a service, with abstraction of the underlying storage implementation. Users provision new instances or additional storage using a web interface or API.
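In Kubernetes-based environments, on-demand provisioning is usually declarative: an application describes the storage it needs in a PersistentVolumeClaim, and the cluster satisfies the claim from whatever storage backend is configured. The Python sketch below builds such a claim as a plain dictionary and prints it as JSON. The claim name, size, and the storage class name `standard` are hypothetical placeholders; the storage classes actually available depend on the cluster.

```python
import json


def pvc_manifest(name: str, size_gib: int, storage_class: str) -> dict:
    """Build a Kubernetes PersistentVolumeClaim manifest as a dictionary."""
    if size_gib <= 0:
        raise ValueError("requested size must be positive")
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],   # mounted read-write by one node
            "storageClassName": storage_class,  # selects the storage backend
            "resources": {"requests": {"storage": f"{size_gib}Gi"}},
        },
    }


# Hypothetical claim: 10 GiB for a database, from a class named "standard".
manifest = pvc_manifest("orders-db-data", 10, "standard")
print(json.dumps(manifest, indent=2))
```

Because the manifest only states the desired outcome, the same claim can be satisfied by a hardware array, a software-defined storage cluster, or a public cloud disk service without changes to the application.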
Common Cloud-Native Storage Solution Models
Here are the most common models under which cloud native storage is consumed.
Public Cloud Storage
Public clouds provide a range of cloud native storage options, including object storage (such as Amazon S3 and Azure Blob Storage), cloud-based file shares (such as Amazon EFS or Azure Files), and managed disks attached to compute instances (such as Amazon EBS and Azure Managed Disks).
Commercial Cloud Storage
When organizations build private clouds, they often turn to commercial cloud storage services that can provide high data reliability, easy scalability, and convenience. Many of these services offer post-production support, as well as operations and maintenance (O&M) services. As the demand for cloud-native storage grows, private cloud infrastructure vendors offer more mature cloud-native interfaces that allow on-premises resources to consume cloud storage.
Self-Maintained Storage Services
There are two main types of storage services companies can build in-house: block storage and simple file storage. For block storage, Ceph RBD and storage area networks (SANs) are considered relatively mature solutions. However, due to their complexity, they often require a specialized support and maintenance team.
Services like NFS, GlusterFS, and CephFS provide file storage for companies that decide to create their own distributed storage systems. Although NFS is relatively mature, it is usually insufficient to address high-performance application requirements. GlusterFS and CephFS are often unable to meet the performance and reliability requirements of mission-critical applications.
A new trend in on-premises cloud native storage is S3 compatible storage—local storage devices that support the S3 API, and provide similar capabilities to elastic cloud services such as Amazon S3.
In cloud native applications, there are many use cases in which it doesn’t make sense to use a distributed storage service. Here are two common cases in which edge devices or components in a cloud native system use local storage:
- Databases—cloud native applications still use traditional databases, both SQL and NoSQL. In many cases cloud native storage does not provide the throughput and low latency required for production databases. Also, databases may already be replicated or set up for redundancy, making the high availability built into cloud native storage unnecessary.
- Caching—in many cases, components use local storage as a cache for temporary information, and it is not necessary to persist or protect the data. A common example is ephemeral storage used by containers, which is erased when the container shuts down.
On-Premise Cloud Native Storage with Cloudian
Cloudian HyperStore is an on-premises enterprise storage solution that uses a fully distributed architecture to eliminate single points of failure and enable easy scalability from hundreds of terabytes to exabytes. It is cloud native and fully compatible with the Amazon S3 API.
The HyperStore software implementation builds on three or more distributed nodes, allowing you to replicate your objects for high availability. It lets you add as many storage devices as needed, and the additional devices automatically join an elastic storage pool.