What is Object Storage
Object storage is relatively new when compared with more traditional storage systems such as file or block storage. So, what is object storage, exactly? In short, it is storage for unstructured data that eliminates the scaling limitations of traditional file storage. Limitless scale is the reason that object storage is the storage of the cloud. All of the major public cloud services, including Amazon, Google and Microsoft, employ object storage as their primary storage.
Object storage delivers limitless scale because, unlike the addressing hierarchy used with traditional file storage, object-based storage employs a flat address space that has no built-in limits. Hence, it’s able to hold huge volumes of unstructured data such as audio, video, emails, health records, and documents.
Object Storage Definition
Object storage is a technology that manages data as objects. All data is stored in one large repository which may be distributed across multiple physical storage devices, instead of being divided into files or folders.
It is easier to understand object-based storage when you compare it to more traditional forms of storage – file and block storage.
File storage stores data in folders. This method, also known as hierarchical storage, simulates how paper documents are stored. When data needs to be accessed, a computer system must look for it using its path in the folder structure.
File storage uses TCP/IP as its transport, and devices typically use the NFS protocol in Linux and SMB in Windows.
Block storage splits a file into separate data blocks, and stores each of these blocks as a separate data unit. Each block has an address, and so the storage system can find data without needing a path to a folder. This also allows data to be split into smaller pieces and stored in a distributed manner. Whenever a file is accessed, the storage system software assembles the file from the required blocks.
Block storage uses FC or iSCSI for transport, and devices operate as direct attached storage or via a storage area network (SAN).
In object storage systems, the data that makes up a file, or “object”, is kept together with its metadata. Extra metadata is added to each object, which makes it possible to access data with no hierarchy. All objects are placed in a unified address space. To find an object, users provide its unique ID.
Object-based storage uses TCP/IP as its transport, and devices communicate using HTTP and REST APIs.
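The flat address space described above can be pictured with a minimal in-memory sketch. This is an illustration only, not how any particular product is implemented; the names `FlatObjectStore` and `StoredObject` are hypothetical, and a real system would spread the objects across many devices rather than hold them in one dictionary.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """Data and its metadata travel together as one unit."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class FlatObjectStore:
    """Toy store (hypothetical): one flat address space, no folders."""

    def __init__(self):
        self._objects = {}  # object ID -> StoredObject

    def put(self, data, metadata=None):
        # The store assigns the unique ID; the caller keeps it for retrieval.
        object_id = uuid.uuid4().hex
        self._objects[object_id] = StoredObject(data, dict(metadata or {}))
        return object_id

    def get(self, object_id):
        # Lookup needs only the ID; there is no path to walk.
        return self._objects[object_id]


store = FlatObjectStore()
report_id = store.put(b"quarterly numbers", {"kind": "report"})
report = store.get(report_id)
print(report.metadata["kind"])
```

The point of the sketch is that retrieval depends on a single key lookup, which is why adding more objects does not deepen any hierarchy.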
Metadata is an important part of object storage technology. Metadata is determined by the user, and allows flexible analysis and retrieval of the data in a storage pool, based on its function and characteristics.
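As a rough sketch of metadata-based retrieval, the snippet below filters a small catalog by user-defined tags. The function `find_by_metadata`, the object IDs, and the tag names are all made up for illustration; production systems typically index metadata instead of scanning every object.

```python
def find_by_metadata(catalog, **criteria):
    """Return the IDs of objects whose metadata matches every criterion."""
    return sorted(
        object_id
        for object_id, metadata in catalog.items()
        if all(metadata.get(key) == value for key, value in criteria.items())
    )


# Hypothetical catalog: object ID -> user-defined metadata.
catalog = {
    "obj-001": {"kind": "x-ray", "department": "radiology"},
    "obj-002": {"kind": "invoice", "department": "finance"},
    "obj-003": {"kind": "x-ray", "department": "orthopedics"},
}

xrays = find_by_metadata(catalog, kind="x-ray")
print(xrays)
```

Because the query looks only at tags, it works the same way regardless of where in the pool each object physically lives.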
The main advantage of object storage is that you can group devices into large storage pools, and distribute those pools across multiple locations. This not only allows unlimited scale, but also improves resilience and high availability of the data.
According to ESG, large-scale object storage systems should be based on the following architectural principles:
Object storage technology should be easy to use and implement, and require minimal effort for ongoing maintenance. Operations like clustering, healing and tuning should be fully automated.
Data in an object storage system should be accessible via an API, typically an HTTP-based RESTful API. Developers should be able to perform any action on storage pools programmatically. Applications should be able to query objects using their metadata, to find the required objects no matter where they are stored in a large storage pool.
Administrators should be able to use a variety of storage devices and platforms, combining heterogeneous hardware into one storage pool. Object storage should also easily extend from on-premises to the public cloud and vice versa.
4. Cloud-like consumption
An object storage solution, whether based in the cloud or on-premises, should have a way to meter the usage of different parts of the organization, and make it possible to bill each group according to its actual usage.
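The metering idea in the last principle can be sketched in a few lines. Everything here is an assumption made for illustration: the group names, the flat price per gigabyte, and the helper names `usage_by_group` and `monthly_charges` do not come from any specific product.

```python
from collections import defaultdict

GB = 1000 ** 3  # decimal gigabyte, in bytes


def usage_by_group(stored_objects):
    """Sum stored bytes per group from (group, size_in_bytes) pairs."""
    totals = defaultdict(int)
    for group, size_bytes in stored_objects:
        totals[group] += size_bytes
    return dict(totals)


def monthly_charges(totals, price_per_gb):
    """Bill each group for the capacity it actually uses."""
    return {
        group: round(size_bytes / GB * price_per_gb, 2)
        for group, size_bytes in totals.items()
    }


# Hypothetical usage records for two departments.
records = [
    ("radiology", 400 * GB),
    ("finance", 50 * GB),
    ("radiology", 100 * GB),
]
totals = usage_by_group(records)
charges = monthly_charges(totals, price_per_gb=0.02)
print(charges)
```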
Object Storage Benefits
Unlike file or block storage, object storage services enable scalability that goes beyond exabytes. While file storage can hold many millions of files, you will eventually hit a ceiling. With unstructured data growing at 50+% per year, more and more users are hitting those limits, or they expect to in the future.
Scale Out Architecture
Object storage makes it easy to start small and grow. In enterprise storage, a simple scaling model is golden. And scale-out storage is about as simple as it gets: you simply add another node to the cluster and that capacity gets folded into the available pool.
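A toy model of that scaling step, assuming a hypothetical `Cluster` class that tracks only node names and capacities, might look like this. Real clusters also rebalance data when a node joins, which the sketch leaves out.

```python
class Cluster:
    """Hypothetical cluster: a pool whose capacity is the sum of its nodes."""

    def __init__(self):
        self.nodes = {}  # node name -> capacity in TB

    def add_node(self, name, capacity_tb):
        if name in self.nodes:
            raise ValueError(f"node {name!r} is already in the cluster")
        self.nodes[name] = capacity_tb

    @property
    def pool_capacity_tb(self):
        # New capacity is folded into the pool as soon as the node joins.
        return sum(self.nodes.values())


cluster = Cluster()
cluster.add_node("node-1", 100)
cluster.add_node("node-2", 100)
before = cluster.pool_capacity_tb
cluster.add_node("node-3", 200)  # scale out by adding one more node
after = cluster.pool_capacity_tb
print(before, after)
```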
While file systems have metadata, the information is limited and basic (date/time created, date/time updated, owner, etc.). Object storage allows users to customize and add as many metadata tags as they need to easily locate the object later. For example, an X-ray could have information about the patient’s age and height, the type of injury, etc.
High Sequential Throughput Performance
Early object storage systems did not prioritize performance, but that has changed. Object stores can now provide high sequential throughput, which makes them great for streaming large files. Object storage services also help eliminate networking limitations: files can be streamed in parallel over multiple pipes, boosting usable bandwidth.
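One common way to stream a large object over several connections is to request a different byte range on each one. The sketch below only computes the ranges and the matching HTTP `Range` header values; it does not open any connections, and the helper names `split_into_ranges` and `range_header` are illustrative.

```python
def split_into_ranges(object_size, part_count):
    """Split an object into contiguous, inclusive (start, end) byte ranges."""
    if object_size <= 0 or part_count <= 0:
        raise ValueError("object_size and part_count must be positive")
    part_count = min(part_count, object_size)  # never create empty parts
    base, extra = divmod(object_size, part_count)
    ranges = []
    start = 0
    for index in range(part_count):
        # The first `extra` parts carry one additional byte each.
        length = base + (1 if index < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def range_header(start, end):
    """Format an HTTP Range header value; both ends are inclusive."""
    return f"bytes={start}-{end}"


parts = split_into_ranges(1000, 4)
headers = [range_header(start, end) for start, end in parts]
print(headers)
```

Each header could then be sent on its own connection, with the client reassembling the parts in order once they arrive.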
Flexible Data Protection Options
To safeguard against data loss, most traditional storage options utilize fixed RAID groups (groups of hard drives joined together), sometimes in combination with data replication. The problem is, these solutions generally lead to one-size-fits-all data protection. You cannot vary the protection level to suit different data types.
Object storage solutions employ a tool called erasure coding that is similar to old-fashioned RAID in some ways, but is far more flexible. Data is striped across multiple drives or nodes as required to achieve the right level of protection for that data type. Between erasure coding and configurable replication, data protection is both more robust and more efficient.
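To give a feel for how erasure coding recovers lost data, here is a deliberately simplified sketch that uses a single XOR parity fragment. This is a stand-in chosen for brevity: it tolerates only one lost fragment, whereas production systems generally use codes such as Reed-Solomon that can survive several. The function names and the zero-byte padding scheme are assumptions of the sketch.

```python
from functools import reduce


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data, k):
    """Split data into k equal fragments plus one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")  # naive zero padding
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def recover(fragments):
    """Rebuild the single missing fragment (marked None) from the others."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) != 1:
        raise ValueError("single parity can rebuild exactly one fragment")
    present = [frag for frag in fragments if frag is not None]
    repaired = list(fragments)
    # XOR of all surviving fragments equals the lost one.
    repaired[missing[0]] = reduce(xor_bytes, present)
    return repaired


original = b"object storage"
fragments = encode(original, k=4)   # 4 data fragments + 1 parity
damaged = list(fragments)
damaged[2] = None                   # simulate a failed drive or node
repaired = recover(damaged)
restored = b"".join(repaired[:4]).rstrip(b"\0")
print(restored)
```

With four data fragments and one parity fragment, the stored size is 1.25 times the original, compared with 2 times for keeping a full second copy. Varying the number of data and parity fragments is what lets the protection level be tuned per data type.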
Support for the S3 API
Back when object storage solutions were launched, the interfaces were proprietary. Few application developers wrote to these interfaces. Then Amazon created the Simple Storage Service, or “S3”. They also created a new interface, called the “S3 API”. The S3 API has since become a de facto standard for object storage data transfer.
The existence of a de facto standard changed the game. Now, S3-compatible application developers have a stable and growing market for their applications. And service providers and S3-compatible storage vendors such as Cloudian have a growing user set deploying those applications. The combination sets the stage for rapid market growth.
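For readers who have not seen the S3 style of access, the sketch below assembles the pieces of an upload request without sending it. It assumes path-style addressing (`endpoint/bucket/key`) and the `x-amz-meta-` header prefix that S3 uses for user-defined metadata. The endpoint, bucket, key, and the `build_put_object` helper are hypothetical, and request signing is omitted entirely.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass
class HttpRequest:
    """Plain description of a request; nothing is sent over the network."""
    method: str
    url: str
    headers: dict


def build_put_object(endpoint, bucket, key, metadata):
    """Describe an S3-style object upload (authentication not included)."""
    # Slashes in the key are kept; they are part of the name, not folders.
    url = f"{endpoint.rstrip('/')}/{bucket}/{quote(key)}"
    # User-defined metadata travels as x-amz-meta-* headers.
    headers = {f"x-amz-meta-{name}": str(value) for name, value in metadata.items()}
    return HttpRequest("PUT", url, headers)


request = build_put_object(
    "https://s3.example.com",   # hypothetical S3-compatible endpoint
    "scans",
    "2024/xray 17.png",
    {"injury": "fracture"},
)
print(request.method, request.url)
```

In practice an S3 client library would build, sign, and send such a request; the sketch only shows the shape that S3-compatible applications and storage systems agree on.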
Lower Total Cost of Ownership (TCO)
Cost is always a factor in storage, and object storage services offer a compelling story, both in hardware and software costs and in management expenses. By allowing you to start small and scale, this technology minimizes waste, both in the form of extra headcount and unused space. Additionally, object storage systems are inherently easy to manage. With limitless capacity within a single namespace, configurable data protection, geo-replication, and policy-based tiering to the cloud, object storage is a powerful tool for large-scale data management.
Object Storage vs. File Storage: What’s the Difference?
Object storage and file storage both offer scalable data management. But there are key differences. Read what sets them apart here.
Object Storage vs. Block Storage: What’s the Difference?
Learn about the difference between object storage and block storage and the benefits of each, including a side-by-side comparison.
6 Best Practices
Six best practices for getting the most from your deployment. Learn how to get started and what to look for when considering petabyte-scalable enterprise storage.
How Object Storage Protects You From Ransomware
Learn how to protect your data from ransomware.
Analytics: Adding Metadata Labels to S3 Images with TensorFlow
To make data useful for analytics, metadata about the objects sometimes needs to be added. Read a case study of adding and then using metadata of S3 objects with Cloudian’s HyperStore Analytics Platform (HAP).
S3 Compatible Storage Solutions Compared
Today’s emerging on-prem enterprise storage medium is S3 compatible storage. Initially used only in the cloud, S3 compatible storage is now becoming very common in on-prem and private cloud deployments. Learn what the S3 API is, how it is changing the world of enterprise storage, and the key differences between S3 compatible storage solutions.