Object storage is still fairly new compared with more traditional storage systems such as file or block storage. So, what is object storage, exactly? You can think of it as a form of unstructured data storage that eliminates the scaling limitations of traditional file storage.
Unlike the hierarchical addressing used by traditional file storage, object-based storage employs a flat address space that can grow without limit. It can therefore hold huge volumes of unstructured data such as audio, video, emails, health records, and documents.
What is Object Storage?
Object storage is a technology that manages data as objects. All data is stored in one large repository, which may be distributed across multiple physical storage devices, instead of being divided into files or folders.
It is easier to understand object-based storage when you compare it to more traditional forms of storage – file and block storage.
File storage stores data in folders. This method, also known as hierarchical storage, simulates how paper documents are stored. When data needs to be accessed, a computer system must look for it using its path in the folder structure.
File storage uses TCP/IP as its transport, and devices typically use the NFS protocol in Linux and SMB in Windows.
Block storage splits a file into separate data blocks, and stores each of these blocks as a separate data unit. Each block has an address, and so the storage system can find data without needing a path to a folder. This also allows data to be split into smaller pieces and stored in a distributed manner. Whenever a file is accessed, the storage system software assembles the file from the required blocks.
Block storage uses FC or iSCSI for transport, and devices operate as direct attached storage or via a storage area network (SAN).
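The block model described above can be sketched in a few lines of Python. This is only an illustration: the 4-byte block size and the function names are assumptions chosen for readability, and real block devices use much larger blocks and a separate layer to track which addresses belong to which file.

```python
BLOCK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Split data into fixed-size blocks, each keyed by a numeric address."""
    return {
        address: data[offset:offset + block_size]
        for address, offset in enumerate(range(0, len(data), block_size))
    }


def assemble(blocks: dict[int, bytes], addresses: list[int]) -> bytes:
    """Rebuild the file by reading the listed block addresses in order."""
    return b"".join(blocks[address] for address in addresses)


blocks = split_into_blocks(b"hello block storage")
file_map = sorted(blocks)            # the addresses that make up this file
restored = assemble(blocks, file_map)
```

Note that no folder path is involved: the storage layer only needs the list of block addresses to reassemble the file.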
In object storage systems, the data blocks that make up a file, or “object”, are kept together with the object’s metadata. Extra metadata is added to each object, which makes it possible to access data with no hierarchy. All objects are placed in a unified address space. To find an object, users provide its unique ID.
Object-based storage uses TCP/IP as its transport, and devices communicate using HTTP and REST APIs.
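As a rough mental model, a flat object namespace behaves like a single dictionary keyed by object ID. The sketch below is an in-memory toy, not how any particular product is implemented; the class name `FlatObjectStore` and the use of random UUIDs as IDs are assumptions made for illustration.

```python
import uuid


class FlatObjectStore:
    """Toy in-memory object store: one flat namespace, no folders."""

    def __init__(self):
        self._objects = {}  # object ID -> (data, metadata)

    def put(self, data: bytes, metadata: dict) -> str:
        """Store data with its metadata and return a unique object ID."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = (data, dict(metadata))
        return object_id

    def get(self, object_id: str):
        """Return (data, metadata) for the given ID; no path lookup needed."""
        return self._objects[object_id]


store = FlatObjectStore()
oid = store.put(b"...scan bytes...", {"content-type": "image/png"})
data, metadata = store.get(oid)
```

The caller keeps only the ID. Where the bytes physically live is the store's concern, which is what lets a real system spread objects across many devices.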
Metadata is an important part of object storage technology. Metadata is determined by the user, and allows flexible analysis and retrieval of the data in a storage pool, based on its function and characteristics.
The main advantage of object storage is that you can group object storage devices into large storage pools, and distribute those pools across multiple locations. This not only allows unlimited scale, but also improves resilience and high availability of the data.
Object Storage Architecture
According to ESG, large-scale object storage systems should be based on the following architectural principles:
Object storage technology should be easy to use and implement, and require minimal effort for ongoing maintenance. Operations like clustering, healing and tuning should be fully automated.
Data in an object storage system should be accessible via an API, typically an HTTP-based RESTful API. Developers should be able to perform any action on storage pools programmatically. Applications should be able to query objects using their metadata, to find the required objects no matter where they are stored in a large storage pool.
Administrators should be able to use a variety of storage devices and platforms, combining heterogeneous hardware into one storage pool. Object storage should also easily extend from on-premises to the public cloud and vice versa.
4. Cloud-like consumption
An object storage solution, whether based in the cloud or on-premises, should have a way to meter the usage of different parts of the organization, and make it possible to bill each group according to its actual usage.
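The metering idea in the last principle can be sketched as a running total of stored bytes per group, converted into a charge. Everything specific here is hypothetical: the group names, the rate in `PRICE_PER_GB_MONTH`, and the helper functions are illustrative only.

```python
from collections import defaultdict

PRICE_PER_GB_MONTH = 0.02  # hypothetical rate, in any currency

usage_bytes = defaultdict(int)  # group name -> bytes currently stored


def record_put(group: str, size_bytes: int) -> None:
    """Add the size of a newly stored object to the group's running total."""
    usage_bytes[group] += size_bytes


def monthly_bill(group: str) -> float:
    """Charge the group for the capacity it actually uses."""
    gigabytes = usage_bytes[group] / 10**9
    return round(gigabytes * PRICE_PER_GB_MONTH, 2)


record_put("radiology", 500 * 10**9)
record_put("radiology", 250 * 10**9)
record_put("finance", 100 * 10**9)
```

With these figures, `monthly_bill("radiology")` charges for 750 GB and `monthly_bill("finance")` for 100 GB, so each group pays in proportion to what it stores.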
Object Storage Benefits
Unlike file or block storage, object storage services enable scalability that goes beyond exabytes. While file storage can hold many millions of files, you will eventually hit a ceiling. With unstructured data growing at more than 50% per year, more and more users are hitting those limits, or expect to in the future.
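To get a feel for how quickly compound growth reaches a fixed ceiling, the short calculation below counts the years until a data set outgrows its system. The starting size (100 TB) and ceiling (1,000 TB) are made-up figures; only the 50% growth rate comes from the text above.

```python
def years_until_full(current_tb: float, ceiling_tb: float, annual_growth: float) -> int:
    """Count whole years of compound growth until the ceiling is reached."""
    if annual_growth <= 0:
        raise ValueError("annual_growth must be positive")
    years = 0
    while current_tb < ceiling_tb:
        current_tb *= 1 + annual_growth
        years += 1
    return years


years = years_until_full(current_tb=100, ceiling_tb=1000, annual_growth=0.50)
```

Under these assumptions the data set grows tenfold in six years (100, 150, 225, 337.5, 506.25, 759.4, then about 1,139 TB).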
Scale Out Architecture
Object storage makes it easy to start small and grow. In enterprise storage, a simple scaling model is golden. And scale-out storage is about as simple as it gets: you simply add another node to the cluster and that capacity gets folded into the available pool.
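The scale-out model can be illustrated with a toy cluster whose pool capacity is simply the sum of its nodes. The object placement shown here uses rendezvous hashing, which is one common technique and is an assumption of this sketch rather than a description of any specific product; the node names and capacities are also hypothetical.

```python
import hashlib


class Cluster:
    """Toy scale-out cluster: capacity is the sum of all member nodes."""

    def __init__(self):
        self.nodes = {}  # node name -> capacity in TB

    def add_node(self, name: str, capacity_tb: int) -> None:
        self.nodes[name] = capacity_tb

    @property
    def pool_capacity_tb(self) -> int:
        return sum(self.nodes.values())

    def node_for(self, object_id: str) -> str:
        """Rendezvous hashing: pick the node with the highest hash score."""
        def score(node: str) -> int:
            digest = hashlib.sha256(f"{node}:{object_id}".encode()).digest()
            return int.from_bytes(digest[:8], "big")
        return max(self.nodes, key=score)


cluster = Cluster()
cluster.add_node("node-1", 100)
cluster.add_node("node-2", 100)
object_ids = [f"obj-{i}" for i in range(1000)]
before = {oid: cluster.node_for(oid) for oid in object_ids}

cluster.add_node("node-3", 100)  # capacity is folded into the pool
after = {oid: cluster.node_for(oid) for oid in object_ids}
moved = [oid for oid in object_ids if before[oid] != after[oid]]
```

With this placement scheme, the only objects that change location after the expansion are the ones that now belong on the new node, so existing nodes do not reshuffle data among themselves.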
HyperStore is an S3-compatible storage system. HyperFile is a connector that allows files to be stored on HyperStore.
While file systems have metadata, the information is limited and basic (date/time created, date/time updated, owner, etc.). Object storage allows users to customize and add as many metadata tags as they need to easily locate the object later. For example, an X-ray could have information about the patient’s age and height, the type of injury, etc.
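The X-ray example can be made concrete with a small sketch that finds objects by their custom tags. The object IDs, tag names, and the `find` helper are all hypothetical, and a real object store would index metadata rather than scan every object.

```python
# Object ID -> user-defined metadata tags (all values are invented).
objects = {
    "a1f3": {"type": "x-ray", "patient_age": 42, "injury": "fracture"},
    "b7c9": {"type": "x-ray", "patient_age": 67, "injury": "dislocation"},
    "c2d8": {"type": "mri", "patient_age": 42, "injury": "fracture"},
}


def find(catalog: dict, **criteria) -> list:
    """Return the IDs of objects whose metadata matches every criterion."""
    return sorted(
        object_id
        for object_id, metadata in catalog.items()
        if all(metadata.get(key) == value for key, value in criteria.items())
    )


xray_fractures = find(objects, type="x-ray", injury="fracture")
age_42 = find(objects, patient_age=42)
```

Here `xray_fractures` contains only `"a1f3"`, while `age_42` contains both `"a1f3"` and `"c2d8"`. Neither query depends on where the objects are stored or what they are named.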
High Sequential Throughput Performance
Early object storage systems did not prioritize performance, but that has changed. Object stores can now provide high sequential throughput, which makes them great for streaming large files. Object storage services also help eliminate networking limitations: files can be streamed in parallel over multiple pipes, boosting usable bandwidth.
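Parallel streaming usually means fetching different byte ranges of the same object at once and stitching them back together. The sketch below simulates that pattern locally: `read_range` is a stand-in for a ranged read against a real store, and the 256 KiB part size and four workers are arbitrary choices.

```python
from concurrent.futures import ThreadPoolExecutor

OBJECT = bytes(range(256)) * 4096  # 1 MiB of data standing in for a stored object


def read_range(start: int, end: int) -> bytes:
    """Stand-in for a ranged read; returns bytes in [start, end)."""
    return OBJECT[start:end]


def parallel_download(size: int, part_size: int, workers: int = 4) -> bytes:
    """Fetch fixed-size parts concurrently and join them in order."""
    ranges = [(start, min(start + part_size, size)) for start in range(0, size, part_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the order of `ranges`, not completion order.
        parts = pool.map(lambda r: read_range(*r), ranges)
        return b"".join(parts)


result = parallel_download(len(OBJECT), part_size=256 * 1024)
```

Against a real object store, each worker would issue its own ranged request over a separate connection, which is where the extra bandwidth comes from.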
Flexible Data Protection Options
To safeguard against data loss, most traditional storage options utilize fixed RAID groups (groups of hard drives joined together), sometimes in combination with data replication. The problem is that these solutions generally lead to one-size-fits-all data protection. You cannot vary the protection level to suit different data types.
Object storage solutions employ a flexible tool called erasure coding that is similar to old-fashioned RAID in some ways, but is far more flexible. Data is striped across multiple drives or nodes as needed to achieve the needed protection for that data type. Between erasure coding and configurable replication, data protection is both more robust and more efficient.
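The simplest possible erasure code is a single XOR parity shard, which tolerates the loss of any one shard. The sketch below uses that scheme purely for illustration; production object stores typically use stronger codes such as Reed-Solomon that survive multiple failures, and the function names here are assumptions.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list:
    """Split data into k equal shards (zero-padded) plus one XOR parity shard."""
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    shards = [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]
    return shards + [reduce(xor_bytes, shards)]


def recover(shards: list, lost_index: int) -> bytes:
    """Rebuild the shard at lost_index by XOR-ing all surviving shards."""
    survivors = [s for i, s in enumerate(shards) if i != lost_index]
    return reduce(xor_bytes, survivors)


original = b"object storage!!"      # 16 bytes -> four 4-byte data shards
shards = encode(original, k=4)      # 4 data shards + 1 parity shard
damaged = list(shards)
damaged[2] = None                   # simulate a failed drive or node
rebuilt = recover(damaged, lost_index=2)
```

In this 4+1 layout the five shards occupy 1.25 times the size of the original data, whereas keeping one full extra copy would occupy 2 times. Choosing different shard counts per data type is what makes the protection level adjustable.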
Support for Industry Standards
Back when object storage solutions were first launched, the interfaces were proprietary, and few application developers showed interest in writing to them. Then Amazon created the Simple Storage Service, or “S3”, along with a new object storage interface, the “S3 API”. The S3 interface has since become a de facto standard for object storage data transfer.
The existence of a standard changed the game. Now, S3-compatible application developers have a stable and growing market for their applications. And service providers and S3-compatible storage vendors such as Cloudian have a growing user set deploying those applications. The combination sets the stage for rapid market growth.
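To show what writing to the S3 interface looks like at the HTTP level, the sketch below builds (but does not send) a path-style PUT request that stores an object with user-defined metadata in `x-amz-meta-` headers. The endpoint, bucket, and key are hypothetical, and a real request would also need authentication headers, which are omitted here. In practice most applications use an S3 SDK rather than raw HTTP.

```python
import urllib.request

ENDPOINT = "https://s3.example.com"   # hypothetical S3-compatible endpoint
BUCKET = "medical-images"             # hypothetical bucket name
KEY = "xray-0001.png"                 # hypothetical object key


def build_put_request(body: bytes, metadata: dict) -> urllib.request.Request:
    """Build an unsigned PUT request for one object; nothing is sent."""
    request = urllib.request.Request(
        f"{ENDPOINT}/{BUCKET}/{KEY}", data=body, method="PUT"
    )
    for name, value in metadata.items():
        # S3 carries user-defined metadata in x-amz-meta-* headers.
        request.add_header(f"x-amz-meta-{name}", str(value))
    return request


request = build_put_request(b"...image bytes...", {"patient-age": 42})
```

Because the request shape is the same for any S3-compatible store, pointing an application at a different provider is largely a matter of changing the endpoint and credentials.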
Lower Total Cost of Ownership (TCO)
Cost is always a factor in storage, and object storage services offer the most compelling story, both in hardware/software costs and in management expenses. By allowing you to start small and scale, object storage technology minimizes waste, both in the form of extra headcount and unused space. Additionally, object storage systems are inherently easy to manage. With limitless capacity within a single namespace, configurable data protection, geo-replication, and policy-based tiering to the cloud, object storage is a powerful tool for large-scale data management.
To learn more about Cloudian’s fully native S3-compatible object storage in your data center, and how it can cut down your TCO, check out our free trial today. Or visit Cloudian.com for more information.