IBM Spectrum Protect with Amazon S3 Cloud Storage

The IBM Spectrum Protect (formerly IBM Tivoli Storage Manager) solution provides the following benefits:

  • Supports software-defined storage environments
  • Supports cloud data protection
  • Easily integrates with VMware and Hyper-V
  • Enables data protection by minimizing data loss with frequent snapshots, replication, and DR management
  • Reduces the cost of data protection with built-in efficiencies such as source-side and target-side deduplication

IBM Spectrum Protect has also enhanced its offerings by adding support for Amazon S3 cloud storage in version 7.1.6 and later, and version 7.1.6 was released on June 17, 2016. I was a little nervous and excited at the same time. Why? Because Cloudian HyperStore comes with an S3 guarantee. What better way to validate that guarantee than by trying a plug-and-play with a solution that has just implemented support for Amazon S3?

Overview of IBM Spectrum Protect with Amazon S3 cloud storage

And the verdict? Cloudian HyperStore configured as “Cloud type: Amazon S3” works right off the bat with IBM Spectrum Protect. You can choose to add a cloud storage pool from the V7.1.6 Operations Center UI or use the Command Builder. The choice is yours.

We’ll look at both the V7.1.6 Operations Center UI and the Command Builder to add our off-premise cloud storage.

NOTE: Cloudian HyperStore can be deployed as your on-premise S3 cloud storage, but it has to be identified to IBM Spectrum Protect as Amazon S3 off-premise cloud storage, and you have to use a signed SSL certificate.

Here’s how you can add an Amazon S3 cloud storage or a Cloudian HyperStore S3 cloud storage into your IBM Spectrum Protect storage pool:

From the V7.1.6 Operations Center UI

 

From the V7.1.6 Operations Center console, select “+Storage Pool”.

Adding 'Storage Pool' to the IBM Spectrum Protect V7.1.6 Operations Center console

In the "Add Storage Pool: Identity" pop-up window, provide a name and description for your cloud storage pool. In the next step, "Add Storage Pool: Type", select "Container-based storage: Off-premises cloud".

IBM Spectrum Protect cloud storage description

Click on "Next" to continue. The next step, the "Add Storage Pool: Credentials" page, is where it gets exciting. This is where we provide the information for:

  • Cloud type: Amazon S3 (Amazon S3 cloud type is also used to identify a Cloudian HyperStore S3)
  • User Name: YourS3AccessKey
  • Password: YourS3SecretKey
  • Region: Specify your Amazon S3 region (for Cloudian HyperStore S3, select “Other”)
  • URL: If you selected an Amazon S3 region, this field updates automatically to that region's URL. If you are using Cloudian HyperStore S3 cloud storage, enter the S3 endpoint access URL (HTTPS).

Complete the process by clicking on “Add Storage Pool”.

IBM Spectrum Protect

NOTE: Be aware that there is currently no validation of your entries when you click on "Add Storage Pool"; your S3 cloud storage pool will be created regardless. I believe the IBM Spectrum Protect team is addressing this by adding a validation step to the creation of an S3 cloud storage pool. I hope the step-by-step process I have provided helps minimize errors in your Amazon S3 cloud storage pool setup.
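Since the wizard does not validate anything, it can help to sanity-check the endpoint and keys yourself before creating the pool. Here is a minimal sketch using the Python boto3 SDK (boto3 is an assumption on my part, not something Spectrum Protect requires, and the endpoint URL and key values below are placeholders for your own):

    # Sanity-check the S3 endpoint and credentials before creating the cloud pool.
    import boto3
    from botocore.exceptions import ClientError, EndpointConnectionError

    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.cloudianstorage.com",   # your HyperStore S3 endpoint (HTTPS)
        aws_access_key_id="YourS3AccessKey",
        aws_secret_access_key="YourS3SecretKey",
    )

    try:
        buckets = s3.list_buckets().get("Buckets", [])
        print("Credentials accepted, %d bucket(s) visible" % len(buckets))
    except EndpointConnectionError as err:
        print("Cannot reach the endpoint:", err)
    except ClientError as err:
        print("Endpoint reachable but the request failed:", err.response["Error"]["Code"])

Note that boto3 verifies the TLS certificate against the local CA bundle, so an untrusted certificate will also surface here, just as it will for the Spectrum Protect JVM.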

From the V7.1.6 Operations Center Command Builder

 

From the V7.1.6 Operations Center Command Builder, you can use the following define stgpool command and you are done adding your off-premise S3 cloud storage pool:

  • define stgpool YourCloudName stgtype=cloud pooltype=primary cloudtype=s3 cloudurl=https://s3.cloudianstorage.com:443 access=readwrite encrypt=yes identity=YourS3AccessKey password=YourS3SecretKey description="Cloudian"

NOTE: If there are errors, you can review the dsmffdc log, which is located in the server instance directory. One likely cause is that the signed SSL certificate is not correct or not trusted.

For example:

[06-20-2016 11:58:26.150][ FFDC_GENERAL_SERVER_ERROR ]: (sdcloud.c:3145) com.tivoli.dsm.cloud.api.ProviderS3 handleException com.amazonaws.AmazonClientException Unable to execute HTTP request: com.ibm.jsse2.util.h: PKIX path building failed: java.security.cert.CertPathBuilderException: unable to find valid certification path to requested target
[06-20-2016 11:58:26.150][ FFDC_GENERAL_SERVER_ERROR ]: (sdcntr.c:8166) Error 2903 creating container ibmsp.a79378e1333211e6984b000c2967bf98/1-a79378e1333211e6984b000c2967bf98
[06-20-2016 11:58:26.150][ FFDC_GENERAL_SERVER_ERROR ]: (sdio.c:1956) Did not get cloud container. rc = 2903
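The PKIX error above means the Java runtime does not trust the certificate presented by the S3 endpoint. Before importing anything, you can quickly inspect what the endpoint is actually serving. Here is a rough Python sketch (the host name is a placeholder; it checks against the operating system's trust store, not the JVM's cacerts, so treat it only as a first approximation):

    # Inspect the certificate presented by the S3 endpoint.
    import socket
    import ssl

    host, port = "s3.cloudianstorage.com", 443   # replace with your S3 endpoint

    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=10) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
                print("Chain verified against the local trust store")
                print("Issuer :", dict(item[0] for item in cert["issuer"]))
                print("Expires:", cert["notAfter"])
    except ssl.SSLCertVerificationError as err:
        print("Verification failed; the Spectrum Protect JVM will likely fail too:", err)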

 

Importing A Signed SSL Certificate

 

You can use the keytool -import command shipped with IBM Spectrum Protect to import the signed SSL certificate. However, before you perform the keytool import, make a copy of the original Java cacerts file.

The Java cacerts file is located in the IBM_Spectrum_Protect_Install_Path > TSM > jre > lib > security directory.

You can run the command from IBM_Spectrum_Protect_Install_Path > TSM > jre > bin directory.
For example, on Windows:

    • keytool -import -keystore ../lib/security/cacerts -alias Cloudian -file c:/locationofmysignedsslcert/admin.crt

 

Enter the keystore password when prompted. If you haven't updated your keystore password, the default is changeit (you should change it for production environments). When you are prompted with "Trust this certificate?", enter "yes".

NOTE: Keep track of the "Valid from: xxxxxx" dates of your signed SSL certificate; you will have to import a new certificate when the current one expires.

By the way, if you encounter the error "ANR3704E sdcloud.c(1636): Unable to load the jvm for the cloud storage pool" on Windows 2012 R2, add IBM_Spectrum_Install_Path\Tivoli\TSM\jre\bin\j9vm to the PATH environment variable on the Spectrum Protect server, and set JVM_LIB to jvm.dll.

Here’s what your Amazon S3 cloud storage type looks like from IBM Spectrum Protect V7.1.6 Operations Center console:

Operations Center console final result after adding Amazon S3 cloud storage to IBM Spectrum Protect V7.1.6

And you’re off! If you encounter any issues during this process, feel free to reach out to our support team.

You can also learn more by downloading our solution brief.

How-To: S3 Your Data Center

As the Storage Administrator or Data Protection Specialist in your data center, you are likely looking for an alternative storage solution to help meet your big data growth needs. And with all that's been reported by Amazon (stellar growth, a strong quarterly earnings report), I am pretty sure their Simple Storage Service (S3) is on your radar. S3 is a secure, highly durable, highly scalable, and robust cloud storage service. Here's an API view of what you can do with S3:

S3 API view
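To make that concrete, here is a brief sketch of a few of those operations using the boto3 SDK; the bucket name, object key, and data are illustrative only, and you would add an endpoint_url to point at an S3-compatible private cloud:

    # A few representative S3 API calls: create a bucket, store and fetch an
    # object, and generate a time-limited URL for sharing.
    import boto3

    s3 = boto3.client("s3")   # add endpoint_url=... to target an S3-compatible endpoint

    s3.create_bucket(Bucket="my-example-bucket")
    s3.put_object(Bucket="my-example-bucket", Key="reports/q1.csv", Body=b"region,revenue\nwest,100\n")

    obj = s3.get_object(Bucket="my-example-bucket", Key="reports/q1.csv")
    print(obj["Body"].read().decode())

    # Pre-signed URL: anyone holding this link can download the object for one hour.
    url = s3.generate_presigned_url(
        "get_object",
        Params={"Bucket": "my-example-bucket", "Key": "reports/q1.csv"},
        ExpiresIn=3600,
    )
    print(url)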

As a user or developer, you can securely manage and access your buckets and your data, anytime and anywhere in the world where you have web access. As a storage administrator, you can easily manage and provision storage for any group and any user on always-on, highly scalable cloud storage. So if you are convinced that you want to explore S3 as a cloud storage solution, Cloudian HyperStore should be on your radar as well. I believe a solution that is easy to deploy and use helps accelerate the adoption of the technology. Here's what you will need to deploy your own cloud storage solution:

  • Cloudian’s HyperStore Software – Free Community Edition
  • Recommended minimum hardware configuration
    • Intel-compatible hardware
    • Processor: 1 CPU, 8 cores, 2.4GHz
    • Memory: 32GB
    • Disk: 12 x 2TB HDD, 2 x 250GB HDD (12 drives for data, 2 drives for OS/Metadata)
    • RAID: RAID-1 recommended for the OS/Metadata, JBOD for the Data Drives
    • Network: 1x1GbE Port


You can install a single Cloudian HyperStore node for non-production purposes, but it is best practice to deploy a minimum 3-node HyperStore cluster so that you can use logical storage policies (replication and erasure coding) to ensure your S3 cloud storage is highly available in your production cluster. It is also recommended to use physical servers for production environments.
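To see why the storage policy matters, here is a rough back-of-the-envelope comparison of usable capacity under 3x replication versus a small erasure-coding scheme. The node count and drive sizes follow the recommended configuration above; the 2+1 erasure-coding scheme is purely illustrative, and the policies you can actually configure depend on your HyperStore cluster:

    # Rough usable-capacity comparison for a small 3-node cluster (illustrative numbers).
    nodes = 3
    raw_per_node_tb = 24                # e.g. 12 x 2TB data drives per node
    raw_tb = nodes * raw_per_node_tb

    replicas = 3                        # 3x replication: each object stored on 3 nodes
    usable_replication_tb = raw_tb / replicas

    k, m = 2, 1                         # erasure coding k+m: k data fragments + m parity fragments
    usable_ec_tb = raw_tb * k / (k + m)

    print(f"Raw capacity:             {raw_tb} TB")
    print(f"Usable with 3x replicas:  {usable_replication_tb:.0f} TB")
    print(f"Usable with EC {k}+{m}:       {usable_ec_tb:.0f} TB")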

Here are the steps to set up a 3-node Cloudian HyperStore S3 Cluster:

  1. Use the Cloudian HyperStore Community Edition ISO for OS installation on all 3 nodes. This will install CentOS 6.7 on your new servers.
  2. Log on to your servers
    1. The default root password is password (Update your root access for production environments)
  3. Under /root, there are 2 Cloudian directories:
    1. CloudianTools
      1. configure_appliance.sh allows you to perform the following tasks:
        1. Change the default root password
        2. Change time zone
        3. Configure network
        4. Format and mount available disks for Cloudian S3 data storage
          1. Available disks that were automatically formatted and mounted during the ISO install for S3 storage will look similar to the following /cloudian1 mount:
            Format and mount available disks for Cloudian S3 data storage
    2. CloudianPackages
      1. Run ./CloudianHyperStore-6.0.1.2.bin cloudian_xxxxxxxxxxxx.lic on one of your nodes to extract the package content. This node will be the Puppet master node.
        S3 Puppet master node
      2. Copy sample-survey.csv to survey.csv
        sample-survey.csv
      3. Edit the survey.csv file
        Edit survey.csv
        In the survey.csv file, specify the region, the node name(s), IP address(es), DC, and RAC of your Cloudian HyperStore S3 Cluster.

        NOTE: You can specify an additional NIC on your x86 servers for internal cluster communication.

      4. Run ./cloudianInstall.sh and select “Install Cloudian HyperStore”. When prompted, input the survey.csv file name. Continue with the setup.
        NOTE: If you are deploying in a non-production environment, your servers (virtual or physical) may not meet the minimum resource requirements or have a DNS server available. In that case you can run the install with ./cloudianInstall.sh dnsmasq force. Cloudian HyperStore includes an open-source domain resolution utility to resolve all HyperStore service endpoints.
      5. In the following screenshot, the information we provided in the survey.csv file is used in the Cloudian HyperStore cluster configuration. In this non-production setup, I am also using a DNS server for domain name resolution in my virtual environment.
        Cloudian HyperStore cluster configuration
      6. Your Cloudian HyperStore S3 Cloud Storage is now up and running.
        Cloudian HyperStore S3 cloud storage
      7. Access your Cloudian Management Console. The default System Admin group user ID is admin and the default password is public.
        Cloudian Management Console
      8. Complete the Storage Policies, Group, and SMTP settings.
        Cloudian HyperStore - near final

Congratulations! You have successfully deployed a 3-node Cloudian HyperStore S3 Cluster.
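A quick way to confirm the cluster is serving S3 traffic is to round-trip a test object against its S3 endpoint. Here is a minimal sketch with boto3 (the endpoint, credentials, and bucket name are placeholders for the values you configured during the install):

    # Round-trip a test object against the new HyperStore S3 endpoint.
    import boto3

    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3-region1.yourdomain.com",   # placeholder HyperStore S3 endpoint
        aws_access_key_id="YourS3AccessKey",
        aws_secret_access_key="YourS3SecretKey",
    )

    s3.create_bucket(Bucket="smoke-test")
    s3.put_object(Bucket="smoke-test", Key="hello.txt", Body=b"hello hyperstore")

    body = s3.get_object(Bucket="smoke-test", Key="hello.txt")["Body"].read()
    assert body == b"hello hyperstore"
    print("S3 round trip succeeded")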

Can Scale-Out Storage Also Scale-Down?

Private cloud storage can scale-out to meet the demands for additional storage capacity, but can it scale-down to meet the needs of small and medium-sized organizations who don’t have petabytes of data?

The answer is, yes it can, and you should put cloud storage vendor claims to the test before making your decision to build a private storage cloud.

Scale-out cloud storage

The Importance of Scale-Down Private Cloud Storage

A private storage cloud that can cost-efficiently store and manage data on a smaller scale is important if you don't need petabyte capacity to get started. A petabyte is a lot of data; it is equivalent to 1,000 terabytes. If you have 10 or 100 terabytes of data to manage and protect, a scale-down private storage cloud is what you need. And in the future, when you need additional storage capacity, you must be able to add it without having to rip and replace the storage you started with.

Key Characteristics of Scale-Down Private Cloud Storage

The characteristics of scale-down, private cloud storage make it attractive for organizations with sub-petabyte data storage requirements.

It's important for storage to be both scale-out and scale-down

You can start with a few storage servers and grow your storage capacity using a mix of storage servers and storage capacities from different manufacturers. A private storage cloud is storage server hardware agnostic so you can buy what you need when you need it.

Peer-to-Peer Architecture in Scale-Down Private Cloud Storage

Scale-down, private cloud storage should employ a “peer-to-peer” architecture, which means the same software elements are running on each storage server.

A “peer-to-peer” storage architecture doesn’t use complex configurations that require specialized and/or redundant servers to protect against a single point of failure. Complexity is not a good thing in data storage. After all, why would you choose a private cloud storage solution that is too complex for your needs?

Ease of Use and Management

Scale-down, private cloud storage should also be easy-to-use and easy-to-manage.

Easy-to-use means simple procedures to add, remove or replace storage servers. It also means using storage software with built-in intelligence that can protect your data and keep it accessible without a lot of fine tuning or tinkering to do it.

Easy-to-manage means you don’t need a dedicated storage administrator to keep your private cloud storage cluster running. An in-house computer systems administrator can do it or you can hire out administration to a managed services provider who can do it remotely.

Determining the Right Storage Size for Your Needs

So just how small is small when it comes to building your own private cloud storage? Small is a relative term, but a practical minimum from a hardware perspective would be about 10 terabytes of usable storage. There is nothing hard and fast about starting with 10 terabytes of usable storage, but once you start moving data into your private storage cloud, you should have an amount of usable storage that is appropriate for the uses you have in mind.

Choosing the Right Private Cloud Storage Vendor

If you have never built your own private cloud storage, you will need to determine which private storage cloud vendor has a simple, easy-to-use and easy-to-manage, private cloud storage solution that will work for you.

Conducting a Proof-of-Concept (POC)

The best way to help you make your decision is to conduct a Proof-of-Concept (POC) to determine which vendor will best meet your requirements for private cloud storage. Every vendor will tell you how easily their cloud storage scales out, but they may not mention if it can easily scale-down to meet the needs of organizations with sub-petabyte data storage requirements.

A Proof-of-Concept is not a whiteboard exercise or a slide presentation. A POC is done by having vendors show you how their storage software works, running on their hardware or yours. A vendor who cannot commit to a small-scale POC may not be a good fit for your requirements.

Consideration of Vendor Ecosystems and Compatibility

The applications you plan to use with your private storage cloud should also be included in your POC. If you are not writing your own applications, then it is important to consider the size of the application “ecosystem” supported by the storage vendors participating in your POC.

After ten years in the public cloud storage business, Amazon Web Services (AWS) has the largest “ecosystem” of third-party applications written to use their Simple Storage Service (S3). The AWS S3 Application Programming Interface (API) constitutes a de facto standard that every private storage cloud vendor supports to a greater or lesser degree, but only Cloudian guarantees that applications that work with AWS S3 will work with Cloudian HyperStore. The degree of AWS S3 API compliance among storage vendors is something you can test during your POC.
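One practical way to run that test is to exercise a handful of representative S3 API calls against each candidate endpoint and record which ones succeed. Below is a rough sketch using the boto3 SDK; the endpoint, credentials, and chosen operations are placeholders, and you would extend the list to cover the calls your own applications actually make:

    # Probe a candidate S3-compatible endpoint with a few representative API calls.
    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.candidate-vendor.example",   # placeholder endpoint
        aws_access_key_id="POC_ACCESS_KEY",
        aws_secret_access_key="POC_SECRET_KEY",
    )
    bucket = "poc-compat-test"

    def probe(name, call):
        try:
            call()
            print("PASS", name)
        except ClientError as err:
            print("FAIL", name, "-", err.response["Error"].get("Code"))

    probe("CreateBucket", lambda: s3.create_bucket(Bucket=bucket))
    probe("PutObject", lambda: s3.put_object(Bucket=bucket, Key="probe.txt", Body=b"x"))
    probe("GetObject", lambda: s3.get_object(Bucket=bucket, Key="probe.txt"))
    probe("GetBucketVersioning", lambda: s3.get_bucket_versioning(Bucket=bucket))
    probe("PutBucketLifecycle", lambda: s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": [{"ID": "expire-probes", "Filter": {"Prefix": ""},
                                           "Status": "Enabled", "Expiration": {"Days": 30}}]}))
    probe("MultipartUpload", lambda: s3.abort_multipart_upload(
        Bucket=bucket, Key="big.bin",
        UploadId=s3.create_multipart_upload(Bucket=bucket, Key="big.bin")["UploadId"]))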

The Value of a Proof-of-Concept for Private Cloud Storage

Running a POC will cost you some time and money, but it is a worthwhile exercise because storage system acquisitions have meaningful implications for your data. It is worth spending a small percentage of the acquisition cost on a POC in order to make a good decision.

The Future of Software-Defined Private Cloud Storage

The future of all data storage is being defined by software. Storage software running on off-the-shelf storage server hardware defines how a private storage cloud works. A software-defined private storage cloud gives you the features and benefits of large public cloud storage providers, but does it on your premises, under your control, and on a scale that meets your requirements. Scale-down private cloud storage is useful because it is where many small and medium-sized organizations need to start.

Tim Wessels is the Principal Consultant at MonadCloud LLC, which designs and builds private cloud storage for customers using Cloudian HyperStore. Tim is a Cloudian Certified Engineer and MonadCloud is a Preferred Cloudian Reseller Partner. You can call Tim at 978.413.0201, email [email protected], tweet @monadcloudguy, or visit http://www.monadcloud.com



Harness the Power of Software-Defined Infrastructure to Help Solve the Biggest Big Data Storage Challenges

With the popularity of rich media, the proliferation of mobile devices, and the digitization of content, there has been exponential growth in the amount of unstructured data that IT is managing (think medical images or large research data sets). And this growth is not slowing; it is accelerating. This unprecedented growth is simply not sustainable for IT organizations as they try to control costs and limit or reduce operational complexity in the datacenter. Enter the need for an evolution in IT: Software-Defined Infrastructure, and software-defined storage solutions to help store and manage the Big Data tidal wave.

 

Intel and Cloudian understand the powerful benefits behind software-defined infrastructure, and the real value it delivers to businesses and to the IT organizations behind them. We have been working together to deliver the true benefits of SDI to our joint customers with the world’s largest Big Data storage needs. From enterprises in Life Sciences, Media & Entertainment, and Financial Services, to Service Providers like ScaleMatrix, who serve their customers with new, innovative hosted cloud storage solutions, we help them store and manage petabyte-scale data, easily, securely, and cost-effectively.

Cloudian develops 100-percent software-defined object storage software. We've joined Intel's Storage Builders alliance, an elite group of independent software vendors focused on delivering the benefits of software-defined infrastructure with innovative software-defined storage solutions. We're pleased to join the larger Intel Builders family of independent software vendors (ISVs), operating system vendors (OSVs), and original equipment manufacturers (OEMs) to support the group's mission to accelerate global adoption of software-defined infrastructure, including cloud, storage, and networking technologies. We've contributed Cloudian's validated software-defined object storage designs, along with Cloudian HyperStore, our scale-out, ready-to-go object storage software, to give customers all the benefits of software-defined storage: scalability, agility, and the choice to build on-premise or hybrid clouds at the lowest cost. We've qualified the Intel Xeon processor D-1500 product family for use with Cloudian HyperStore. We've also developed a comprehensive reference architecture for Cloudian HyperStore powered by Lenovo hardware.

Customers love the flexibility and scalability of software-defined storage. They can quickly deploy HyperStore for on-premise cloud storage that’s easy to manage but that also allows choice to automatically tier some or all of their data to any S3-compatible cloud in a hybrid model. Global enterprises use HyperStore to easily and securely manage their data stored across multiple datacenters, and Service Provider businesses take advantage of HyperStore’s software-defined architecture to deliver Storage as a Service (StaaS) product offerings to their customers. They are able to quickly deploy and manage these services while easily making updates to continuously deliver new features to their customers.

Together, Cloudian, Intel, and other leading vendors, including Lenovo, are helping organizations embrace the business value of software-defined infrastructure. We put the power of software-defined object storage in the hands of our joint customers to make their IT organizations and their infrastructures strong, agile, and ready to meet the demands of their Big Data.

Learn more about Cloudian and Intel Storage Builders.

Cloudian HyperStore Integration with Symantec NetBackup

Starting with Symantec NetBackup 7.7, administrators will find an exciting new feature for cloud storage backup: Cloudian HyperStore®. The NetBackup Cloud Storage Connector enables the NetBackup software to back up data to and from Cloudian HyperStore straight out of the box without additional software installations or plugins. HyperStore is an option in the “Cloud Storage Server Configuration Wizard”. Users can simply add their S3 account information such as endpoint, access key, and secret key to begin the process of backing up their data to Cloudian HyperStore storage.

cloudian hyperstore 4000

Cloudian HyperStore and Symantec NetBackup together deliver the following benefits:

  • Enterprise-level backup
  • Complete integrated data center solution: computing, networking, and storage
  • Reduced total cost of ownership (TCO) that continues to improve as the solution scales out
  • Operational efficiency
  • Agility and scalability with the scale-out architectures of Cloudian HyperStore
  • Complete Amazon Simple Storage Service (S3) API–compatible geographically federated object storage platform
  • Enterprise-class features: multi-tenancy, quality of service (QoS), and dynamic data placement in a completely software-defined package
  • Policy-based tiering between on-premises hybrid cloud storage platform and any S3 API–compliant private or public cloud
  • Investment protection: mix and match different generations and densities of computing platforms to build your storage environment; more than 400 application vendors support S3

The seamless integration allows IT Departments to manage cloud storage for backup and recovery as easily as on-premise storage, but with lower costs. Finally, this integrated solution helps deliver an automated and policy-based backup and recovery solution. Organizations can also leverage the cloud as a new storage tier or as a secondary off-site location for disaster recovery.

For more information, please see the Symantec NetBackup and Cloudian HyperStore Solution Brief.

 

Next Generation Storage: integration, scale & performance

Guest Blog Post by Colm Keegan from Storage Switzerland

Various industry sources estimate that data is doubling approximately every two years and the largest subset of that growth is coming from unstructured data. User files, images, rich multimedia, machine sensor data and anything that lives outside of a database application can be referred to collectively as unstructured data.

Storage Scaling Dilemma

The challenge is that traditional storage systems, which rely on "scale-up" architectures (populating disk drives behind a dual-controller system) to increase storage capacity, typically don't scale well to meet the multi-PB data growth now occurring within most enterprise data centers. On the other hand, while some "scale-out" NAS systems can scale to support multiple PBs of storage within a single filesystem, they are often not a viable option since adding storage capacity to these systems often requires adding CPU and memory resources at the same time, resulting in a high total cost of ownership.

Commoditized Storage Scaling

Businesses need a way to cost-effectively store and protect their unstructured data repositories utilizing commodity, off-the-shelf storage resources and/or low-cost cloud storage capacity. In addition, these repositories need to be capable of scaling massively to support multiple PBs of data and enable businesses to seamlessly share this information across wide geographical locations. But in addition to storage scale and economy, these resources should also be easy to integrate with existing business applications. And ideally, they should be performance-optimized for unstructured data files.

Software Driven Capacity

Software defined storage (SDS) technologies are storage hardware agnostic solutions which allow businesses to use any form of storage to build-out a low cost storage infrastructure. Internal server disk, conventional hard disk drives inside a commodity disk array or even a generic disk enclosure populated with high density disk can be used. Likewise, with some SDS offerings, disk resources in the data center can be pooled with storage in secondary data center facilities located anywhere in the world and be combined with cloud storage to give businesses a virtually unlimited pool of low-cost storage capacity.

Plug-and-Play Integration

From an integration perspective, some of these solutions provide seamless integration between existing business applications and cloud storage by providing native support for NFS and CIFS protocols. So instead of going through the inconvenience and expense of re-coding applications with cloud-storage APIs like REST, OpenStack Swift, or Amazon's S3 protocol, these technologies essentially make a private or hybrid cloud data center object storage deployment a plug-and-play implementation, while still providing the option to go "native" in the future.

Tapping Into Cloud Apps

But storage integration isn't just limited to on-premise applications; it also applies to cloud-based applications. Today there is a large ecosystem of Amazon S3-compatible applications that businesses may want to leverage. Examples include backup and recovery, archiving, file sync and share, and more. Gaining access to these software offerings by utilizing an S3-compatible object storage framework gives businesses even more use cases and value for leveraging low-cost hybrid cloud storage.

Data Anywhere Access

Now businesses can provision object storage resources on-premises and/or across public cloud infrastructure to give their end-users ubiquitous access to data regardless of their physical location. This enables greater data mobility and can serve to enhance collaborative activities amongst end-users working across all corners of the globe. Furthermore, by replicating data across geographically dispersed object storage systems, businesses can automatically back up data residing in remote offices/branch offices to further enhance data resiliency.

With data intensive applications like big data analytic systems and data mining applications clamoring for high speed access to information, object storage repositories need to be capable of providing good performance as well. Ideally, the storage solution should be tuned to read, write and store large objects very efficiently while still providing high performance.
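As an illustration of what that looks like from the application side, the S3-style SDKs split large files into parts and upload them in parallel. A short sketch with boto3 (the file, bucket, and tuning values are illustrative):

    # Upload a large file using multipart upload with parallel parts.
    import boto3
    from boto3.s3.transfer import TransferConfig

    s3 = boto3.client("s3")   # add endpoint_url=... for an S3-compatible object store

    config = TransferConfig(
        multipart_threshold=64 * 1024 * 1024,   # switch to multipart above 64 MB
        multipart_chunksize=64 * 1024 * 1024,   # 64 MB parts
        max_concurrency=8,                      # upload up to 8 parts in parallel
    )

    s3.upload_file("dataset.tar", "research-archive", "2016/dataset.tar", Config=config)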

Stem The Data Tide

Businesses today need a seamless way to grow low-cost, abundant hybrid cloud storage resources across the enterprise to meet the unstructured data tsunami that is flooding their data center environments. In addition to providing virtually unlimited scaling from a storage capacity perspective, these resources need to integrate easily into existing application environments and provide optimal performance when accessing large unstructured data objects. Cloudian's HyperStore solution provides all of these capabilities through a software-defined storage approach, which gives businesses the flexibility to choose amongst existing commodity disk assets in the data center and/or low-cost object storage in the cloud to help stem the unstructured data tide.

 

About Author

Colm Keegan is a 23-year IT veteran whose focus at Storage Switzerland is enterprise storage, backup, and disaster recovery solutions.

What is Hybrid Cloud Storage and What Can it Do For You?

IT departments today are forced to come up with innovative ways to deal with the amount of unstructured data they must manage. Many of these organizations are now looking to merge the flexibility and scale of the cloud with the security and control of their on-premises IT environment by using a hybrid cloud storage solution. Until recently, however, such a solution was a mere dream.

CloudianHybridCloud

With the release of HyperStore 4.0, Cloudian has bridged this split and created a cost-effective data storage solution that begins in the on-premises IT environment but also integrates with the Amazon cloud infrastructure.

Basic Benefits of Using Hybrid Cloud Storage

By moving to a hybrid cloud storage system such as the newly developed Cloudian HyperStore, companies are now able to reap a number of previously unattainable benefits, including:

  • Reducing storage acquisition costs
  • Reducing the cost of managing the storage environment
  • Enjoying Amazon S3-compatible and complementary applications
  • Easily expanding from small (terabytes) to petabytes as unstructured data grows
  • Balancing SLAs (service level agreements) using S3 bucket lifecycle policies (see the sketch after this list)
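For example, a bucket lifecycle policy can transition aging data to a cheaper tier and expire it automatically, so that only the data that needs fast, local storage stays there. Here is a sketch using boto3; the bucket name, prefix, and day counts are illustrative, and how transitions map to actual tiers depends on the target platform:

    # Apply a lifecycle policy: transition older log objects after 30 days and
    # expire them after a year. Bucket, prefix, and day counts are illustrative.
    import boto3

    s3 = boto3.client("s3")   # add endpoint_url=... for an S3-compatible private cloud

    s3.put_bucket_lifecycle_configuration(
        Bucket="app-logs",
        LifecycleConfiguration={
            "Rules": [
                {
                    "ID": "age-out-logs",
                    "Filter": {"Prefix": "logs/"},
                    "Status": "Enabled",
                    "Transitions": [{"Days": 30, "StorageClass": "STANDARD_IA"}],
                    "Expiration": {"Days": 365},
                }
            ]
        },
    )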

 

Cloudian Hyperstore Infrastructure

 

Cloudian Hybrid Cloud

Cloudian HyperStore is a hybrid cloud storage software solution for service providers and enterprises. It is fully Amazon S3 compliant and can be integrated seamlessly. It begins with an on-premises data solution and integrates with a cloud data solution for the best of both worlds, particularly for those with large and growing amounts of unstructured data.

Hybrid Cloud Storage with Cloudian Hyperstore and Amazon S3

 

Simone Morellato, Director of Technical & Solutions Marketing

The Right S3-Compatible Storage System: Choose Wisely

Applications are driving the escalating need for cloud storage, and the public cloud providers saw it coming years ago. Amazon's foresight dates back to early 2006, when it launched the Amazon Simple Storage Service (S3), a massively scalable and cost-effective cloud storage service developed specifically to house the influx of data being created by organizations worldwide. We have since witnessed several milestones for the public cloud giant, including the number of objects stored in S3 growing to over 2 trillion, 40 price drops on the service, and the development of a burgeoning ecosystem of more than 350 compatible applications. It's clear that Amazon has established itself as the dominant leader in public cloud storage. Now hybrid cloud storage is getting its turn in the limelight. For the enterprise, flexibility, control, and resilience are at the forefront of their concerns, and hybrid cloud is rapidly shaping up to be the solution of choice for their storage needs. In fact, a recent Rackspace survey indicated that over 60% of the enterprise IT departments surveyed planned to deploy hybrid cloud over the next 3 years.

Amazon S3 holds twice the market share of all its closest competitors combined, so the storage platform you choose for on-premise hybrid or private cloud deployments will likely depend largely on its compatibility with S3. With no standards enforced for claiming S3 compatibility, choosing the right storage platform can be difficult.

So what does it mean to be S3 compatible? And why does it matter?

Why it matters…

If you have applications that speak to S3 and you are looking to deploy a hybrid cloud storage solution that allows for applications to simply switch between storage targets, then compatibility matters. If you utilize any of the 350+ applications that speak S3 and you want them to operate seamlessly in an on-premise or hybrid cloud environment, compatibility matters. If you have written or plan to write applications that utilize S3 and want them to continue to run without having to rewrite their APIs, then compatibility matters. If ease of migration, cost efficiency, and TCO are important to you, then compatibility matters.

Compatibility with the S3 API can mean the difference between a good decision and a costly, time-intensive mistake. Understanding the S3 API and the varying levels of compatibility that storage platforms offer can seriously impact the outcome for those of you who plan to stand up hybrid or on-premise clouds.
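The practical payoff of compatibility is that an application written against S3 should only need its endpoint and credentials changed to target a private cloud; the calls themselves stay the same. A sketch with boto3 (the endpoint URL is a placeholder):

    # The same application code, pointed either at Amazon S3 or at an
    # S3-compatible private cloud. Only the endpoint changes.
    import boto3

    def make_client(private_cloud=False):
        if private_cloud:
            # No API rewrite is needed, provided the platform implements the
            # operations the application actually uses.
            return boto3.client("s3", endpoint_url="https://s3.private-cloud.example")
        return boto3.client("s3")   # defaults to Amazon S3

    s3 = make_client(private_cloud=True)
    s3.put_object(Bucket="shared-bucket", Key="notes.txt", Body=b"same code, different target")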

The S3 API

To make S3 simple for applications to access, Amazon built and continues to refine the operations available through the S3 API. The API enables operations to be performed on the S3 Service, S3 Buckets, and S3 Objects. There are 51 total operations available through it, and compatibility is based on a storage platform's ability to perform some, many, or all of those 51 operations.

S3 Compatibility (1)

Simple Compatibility

There are 9 simple operations available through the S3 API. This "Simple" subset should act as the bare-bones set of operations that a storage platform must perform in order to claim compatibility. These 9 operations allow for very basic manipulation of data through the API, though it is important to remember that a platform in this category still leaves the other 42 operations unimplemented, even though it may boast "S3 compatibility". The chart below shows the 9 "Simple" operations:

Screen Shot 2014-04-24 at 2.57.08 PM

An example of a storage platform that can claim simple compatibility is SwiftStack. Using middleware called Swift3, SwiftStack can perform the above operations through the S3 API.

Moderate Compatibility

If an application requires even one operation beyond the 9 "Simple" operations listed above, and you don't want to have to rewrite your application APIs, then you need to look to storage platforms that offer more robust compatibility. The chart below shows the 9 simple operations (in red) with the addition of 18 moderately complex operations (in yellow). In order to be considered moderately compatible, a storage platform should be able to perform the majority of the 27 operations listed below:

Screen Shot 2014-04-24 at 2.57.21 PM

 

One such storage platform is Ceph, the open-source unified storage platform. Based on the above chart, it can boast "Moderate" S3 compatibility, though, robust as it is, even Ceph cannot perform 100% of the operations listed above.
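To give a feel for what lies beyond the simple operations, here is the explicit multipart upload sequence that many backup and large-file applications depend on. The bucket, key, and part data are illustrative, and exactly which tier each call falls into follows the charts in this post:

    # Explicit multipart upload: initiate, upload parts, then complete.
    import boto3

    s3 = boto3.client("s3", endpoint_url="https://s3.private-cloud.example")   # placeholder endpoint

    bucket, key = "compat-demo", "large-object.bin"
    upload = s3.create_multipart_upload(Bucket=bucket, Key=key)

    parts = []
    for number in (1, 2):
        data = bytes([number]) * (5 * 1024 * 1024)   # every part except the last must be at least 5 MB
        part = s3.upload_part(Bucket=bucket, Key=key, UploadId=upload["UploadId"],
                              PartNumber=number, Body=data)
        parts.append({"PartNumber": number, "ETag": part["ETag"]})

    s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=upload["UploadId"],
                                 MultipartUpload={"Parts": parts})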

Advanced Compatibility

If you want the peace of mind of knowing that the applications you have written to speak S3, and/or all 350+ S3-compatible applications, will continue to work seamlessly with your hybrid or on-premise cloud, then choosing a storage platform that boasts advanced compatibility with the S3 API is vital. Of the 51 operations available through the S3 API, 24 of them are considered advanced. Below is the total set of operations available through the API, including simple (red), moderate (yellow), and advanced (green):

Cloudian is the only storage platform that can boast full “Advanced Compatibility”. Additionally, Cloudian is the only storage platform in the “Advanced Compatibility” tier that allows developers continued use of Amazon’s S3 SDK, which significantly eases their workload. Finally, Cloudian is the only storage platform that automatically tiers data between on-premise deployments and Amazon’s S3 public cloud while representing it under a single name space. With this set of advanced functionality, Cloudian stands by its claim as a bug-for-bug match with S3 for on-premise and hybrid cloud deployments.

Too Long; Didn’t Read…

When looking at deploying an open hybrid cloud and/or moving data between S3 and your private cloud, it is of the utmost importance that you understand the level of compatibility your storage platform claims versus its compatibility in reality. S3 is not going anywhere, and if you're reading this, there is a good chance you currently use it or plan to use it, either exclusively or in a hybrid cloud environment. Choosing the right storage platform for your hybrid or private cloud can save you a great deal of money and shave months off your time to deploy. Compatibility matters.

So, choose wisely.

Steven Walchek, Business Development