Storage capacity is the most fundamental criterion of a storage device. The second most fundamental criterion is what happens when you need to add capacity.
The prospect of adding capacity keeps storage managers up at night. When you run out of capacity, your options may be as simple as adding a new shelf of drives, or as complex as building a new data center. Obviously, simpler is better. And the key architectural difference that drives simplicity is whether your system design is scale-up or scale-out.
“Scale-up” architecture has been a long-running standard for storage. However, as data volume grows, it becomes increasingly difficult to ignore the many limitations and flaws of scale-up storage. The solution may be “scale-out” architecture. This blog post will highlight the key differences between the two types of storage.
Scale-Up Storage is Showing Its Age
Scale-up is the most common architecture for traditional block and file storage platforms. The system consists of a pair of controllers and multiple shelves of drives. When you run out of space, you add another shelf of drives. How far the system can grow is bounded by the limits of its storage controllers.
Figure 1 – Modular/Scale-up Storage Architecture
Once the performance and/or capacity limits of the storage controllers are reached, the only option is to add a new system to sit alongside the existing one. At this point, your workload grows as you migrate storage and manage the load between the two independent silos of storage that now exist.
As an organisation’s data volume grows, complete new systems need to be added to cope with the additional demands. Ultimately, this architecture becomes highly complex to manage. Resources are allocated inefficiently, because every workload has to be assigned to one particular silo.
Figure 2 shows the potential for storage system sprawl.
Figure 2 – Modular/Scale-up Storage Silos
Drive failure protection in these systems is provided by RAID (Redundant Array of Inexpensive Disks), which makes it impossible to scale a modular storage system out by adding more storage controllers. The physical drives are tied to a specific controller, and that controller decides how storage blocks are laid out across the drives to provide drive resilience. This holds irrespective of the RAID level: 1, 10, 5 or 6. In short, RAID does not scale across multiple storage controllers, and that is the root of the scalability problem.
The Difference With Scale-Out and Object Storage
Contrast the above with a scale-out solution using object storage capabilities, as shown in Figure 3. This architecture is built from industry-standard commodity x86 servers, with the disk storage tied to each node, similar to the Direct Attached Storage (DAS) architectures of old. Whether the disks are internal to the server chassis or in an external JBOD connected by SAS, only the individual server has access to this physical storage.
Figure 3 – Object/Scale-out Storage Architecture
Object storage software is installed on each node and combines all the nodes into a single cluster. All storage tied to each individual node is brought into a single storage pool and presented to the user/application network as a single unified name space. Essentially, this means that the user/application is not aware of which node its data resides on; it is simply presented with a storage container with a Fully Qualified Domain Name (FQDN). The object storage system manages the storage, retrieval, and protection of the data objects, and manages data placement across all the nodes within the cluster.
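To make the single name space idea concrete, here is a minimal Python sketch. It is an illustration only, not how any particular object storage product works: the `ToyObjectCluster` class, the node names, and the hash-based placement rule are all assumptions made for this example. The point is that the client calls `put` and `get` with a key and never names a node.

```python
import hashlib


class ToyObjectCluster:
    """Hypothetical single-namespace cluster: clients see keys, not nodes."""

    def __init__(self, node_names):
        # Each node owns its own local storage, modelled here as a dict.
        self.nodes = {name: {} for name in node_names}

    def _place(self, key):
        # Assumed placement rule: hash the key and map it onto one node.
        names = sorted(self.nodes)
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return names[int.from_bytes(digest[:8], "big") % len(names)]

    def put(self, key, data):
        # The cluster, not the client, decides where the object lives.
        self.nodes[self._place(key)][key] = data

    def get(self, key):
        # The same rule locates the object again on read.
        return self.nodes[self._place(key)][key]


cluster = ToyObjectCluster(["node-a", "node-b", "node-c"])
cluster.put("reports/q1.pdf", b"quarterly report")
print(cluster.get("reports/q1.pdf"))
```

A real system would also protect each object against node failure, which this toy version does not attempt.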
As mentioned already, RAID cannot be used to provide protection against drive failure in a multi-node architecture, so a similar approach called RAIN is used. RAIN stands for Redundant Array of Independent Nodes, and it provides data protection capabilities similar to those RAID provides within a disk-based storage system. In this case, RAIN protects against an entire node failing rather than just individual disks. Like RAID, RAIN comes in multiple flavours:
- Replicas – complete copies of a data object which are distributed across multiple nodes – similar to mirroring in RAID concepts, but commonly more than just two replicas are used, especially in a multi DC architecture.
- Erasure Coding – offers data protection capabilities similar to RAID 5 and RAID 6 by creating parity for data sets in order to provide protection against failures (but without the capacity overhead of mirroring data).
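The capacity trade-off between the two flavours can be sketched in a few lines of Python. This is a simplified illustration: it uses a single XOR parity fragment, the simplest parity scheme (comparable to RAID 5), which survives the loss of one fragment. Production erasure coding typically uses more general codes with several parity fragments. The fragment contents, the 4+1 layout, and the three-replica comparison are assumptions chosen for the example.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(fragments):
    """Return the data fragments plus one XOR parity fragment."""
    parity = reduce(xor_bytes, fragments)
    return list(fragments) + [parity]


def recover(stored, lost_index):
    """Rebuild one missing fragment by XOR-ing all the survivors."""
    survivors = [f for i, f in enumerate(stored) if i != lost_index]
    return reduce(xor_bytes, survivors)


data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]  # 4 equal-sized data fragments
stored = encode(data)                        # 5 fragments, one per node
rebuilt = recover(stored, lost_index=2)      # pretend the node holding fragment 2 failed
print(rebuilt == data[2])

# Raw capacity consumed per unit of usable data.
raw_per_usable_ec = len(stored) / len(data)  # 5 fragments for 4 of data
raw_per_usable_replicas = 3                  # three full copies
print(raw_per_usable_ec, raw_per_usable_replicas)
```

With these assumed numbers, parity costs 25% extra raw capacity, while three replicas cost 200% extra. The replicas survive more simultaneous failures, though, so the two settings are not equivalent in protection.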
Scale-Out Means Better Scalability
Because the client addresses the object storage system through a virtual FQDN, it is simple to add further nodes to the cluster: the object storage software brings the new nodes into the cluster and redistributes the data across them. Each new node adds not only disk capacity (and disk performance) but also RAM, CPU and networking resources, improving the overall performance of the cluster. This method of scaling avoids the limitation of traditional scale-up storage architecture, where every IO has to be processed by one of only two storage controllers.
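One way a cluster can redistribute data when a node joins, without reshuffling everything, is rendezvous (highest-random-weight) hashing. The Python sketch below is an assumption for illustration; object storage products use their own placement schemes, and the `owner` function and node names here are hypothetical. It shows that when a fifth node is added, only the objects that now belong to the new node move.

```python
import hashlib


def owner(key, nodes):
    """Rendezvous hashing: the node with the highest hash score owns the key."""
    def score(node):
        return hashlib.sha256(f"{node}:{key}".encode("utf-8")).digest()
    return max(nodes, key=score)


old_nodes = ["node-1", "node-2", "node-3", "node-4"]
new_nodes = old_nodes + ["node-5"]  # one node added to the cluster

keys = [f"object-{i}" for i in range(10_000)]
before = {k: owner(k, old_nodes) for k in keys}
after = {k: owner(k, new_nodes) for k in keys}

# Objects whose owner changed once the new node joined.
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved) / len(keys):.1%} of objects moved")
print(all(after[k] == "node-5" for k in moved))
```

Roughly one fifth of the objects should move, all of them onto the new node, while the existing nodes keep everything else in place.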
Moving to a multi-DC architecture, the cluster can be extended by adding nodes in different geographical areas. Because object storage can be geo-aware, policies can be established to distribute data to these other locations. As a user accesses the storage via the FQDN, the object storage system returns data from the node that provides the best response time to that user, whether it is in a data centre in London or a data centre in New York. A user in New York would therefore expect faster access to a data object from the copy stored in the New York data centre than from the copy in London, and the reverse holds for a user accessing the same data set in London. What is key is maintaining data consistency between locations, so that the same data is returned regardless of location.
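The read-path decision described above can be sketched as a simple lookup. The site names and latency figures below are made-up values for illustration, and `nearest_replica` is a hypothetical helper; a real system would measure or infer response times itself.

```python
def nearest_replica(replica_sites, latency_ms):
    """Pick the replica site with the lowest latency for this client."""
    return min(replica_sites, key=lambda site: latency_ms[site])


# Sites that hold a copy of the requested object (assumed).
replica_sites = ["london", "new-york"]

# Assumed round-trip latencies in milliseconds, as seen by each client.
latency_from_new_york = {"london": 75, "new-york": 4}
latency_from_london = {"london": 3, "new-york": 75}

print(nearest_replica(replica_sites, latency_from_new_york))
print(nearest_replica(replica_sites, latency_from_london))
```

Both clients use the same FQDN and request the same object; only the site that serves the copy differs.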
As data storage needs grow rapidly over the next few years, it’s important to consider moving towards scale-out architecture rather than scale-up. The scalability that scale-out storage offers will help contain costs, reduce complexity, and simplify resource allocation. For more information, watch our overview video on object storage or read the first part of our series on object storage vs. file storage.