
How to Deploy Cloudian S3-compatible Storage with PyTorch

Machine learning workflows can require significant storage capacity, particularly when they involve image data.

PyTorch users now have a straightforward way to deploy scalable storage capacity on-prem. Cloudian's contribution to the PyTorch Amazon S3 Connector repository lets PyTorch users connect to Cloudian HyperStore S3-compatible storage, providing local capacity that is secure and exabyte-scalable.

By enabling direct access to a cost-effective, scalable data repository, Cloudian is simplifying the ML process, reducing both complexity and costs associated with data analysis.

Here are the steps to connect your Cloudian HyperStore object storage system to your PyTorch projects.

Getting Started

Prerequisites

Installation
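A quick way to confirm the installation from Python is sketched below. It assumes the connector is installed under its PyPI package name, `s3torchconnector`, alongside `torch`; the helper name `missing_packages` is hypothetical and only for illustration.

```python
import importlib.util

# Top-level package names this walkthrough relies on (assumed, see lead-in).
REQUIRED = ["torch", "s3torchconnector"]


def missing_packages(names):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# An empty list means both packages are importable in this environment.
print("Missing packages:", missing_packages(REQUIRED))
```

If the list is not empty, install the named packages with your usual package manager before continuing.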

Configuration

To use s3torchconnector, AWS credentials must be provided through one of the following methods:
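As a minimal sketch, assuming the connector honors the standard AWS credential sources (the `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` environment variables, or the shared `~/.aws/credentials` file), the check below reports which of those sources appear to be present. With HyperStore, the values would be the access key and secret key issued for your Cloudian user. The function name `credential_sources` is hypothetical.

```python
import os
from pathlib import Path


def credential_sources(env=None, home=None):
    """List the standard AWS credential sources that appear to be configured.

    Simplification: this only looks at the two most common sources and does
    not honor overrides such as AWS_SHARED_CREDENTIALS_FILE.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    found = []
    # Source 1: access key and secret key exported as environment variables.
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        found.append("environment")
    # Source 2: the shared credentials file in the user's home directory.
    if (home / ".aws" / "credentials").is_file():
        found.append("shared-credentials-file")
    return found
```

Calling `credential_sources()` with no arguments inspects the current process environment and home directory; an empty result suggests that credentials still need to be configured.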

Example with Cloudian Endpoint

The simplest way to use the S3 Connector for PyTorch is to create a dataset, which can be either map-style or iterable-style. To do so, define an S3 URI (a bucket and, optionally, a prefix), and specify the region where the bucket resides and the custom S3 endpoint URL.
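The dataset setup can be sketched as follows. This is an illustration under assumptions rather than the exact code from the connector's documentation: it assumes the `from_prefix` constructors of `S3MapDataset` and `S3IterableDataset` accept `region` and `endpoint` keyword arguments, and the helper names, bucket, prefix, and endpoint values are hypothetical placeholders.

```python
def dataset_uri(bucket, prefix=""):
    """Build the S3 URI from a bucket name and an optional prefix."""
    if not bucket:
        raise ValueError("bucket name is required")
    return f"s3://{bucket}/{prefix.lstrip('/')}"


def open_datasets(bucket, prefix, region, endpoint):
    """Create a map-style and an iterable-style dataset over the same prefix."""
    # Imported lazily so dataset_uri() works without the connector installed.
    from s3torchconnector import S3IterableDataset, S3MapDataset

    uri = dataset_uri(bucket, prefix)
    # Map-style: supports random access by index (map_ds[0]).
    map_ds = S3MapDataset.from_prefix(uri, region=region, endpoint=endpoint)
    # Iterable-style: streams objects sequentially (for item in iter_ds).
    iter_ds = S3IterableDataset.from_prefix(uri, region=region, endpoint=endpoint)
    return map_ds, iter_ds


# Hypothetical usage; replace every value with your own HyperStore settings:
# map_ds, iter_ds = open_datasets(
#     bucket="training-images",
#     prefix="cats/",
#     region="us-east-1",
#     endpoint="https://s3.example.internal",
# )
```

Either dataset can then be passed to a `torch.utils.data.DataLoader`. A map-style dataset suits shuffled random access, while an iterable-style dataset suits sequential streaming of large prefixes.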

In addition to data loading primitives, the S3 Connector for PyTorch also provides an interface for saving and loading model checkpoints directly to and from an S3 bucket.
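A checkpoint round trip might look like the sketch below. It assumes `S3Checkpoint` accepts `region` and `endpoint` keyword arguments and exposes `writer()` and `reader()` objects that work with `torch.save` and `torch.load`; the helper names and the `epoch<N>.ckpt` naming scheme are hypothetical.

```python
def checkpoint_key(base_uri, epoch):
    """Build a per-epoch checkpoint URI under a base S3 URI."""
    return f"{base_uri.rstrip('/')}/epoch{epoch}.ckpt"


def save_checkpoint(model, base_uri, epoch, region, endpoint):
    """Write the model's state_dict straight to the S3 bucket."""
    # Imported lazily so checkpoint_key() works without torch installed.
    import torch
    from s3torchconnector import S3Checkpoint

    checkpoint = S3Checkpoint(region=region, endpoint=endpoint)
    uri = checkpoint_key(base_uri, epoch)
    with checkpoint.writer(uri) as writer:
        torch.save(model.state_dict(), writer)
    return uri


def load_checkpoint(model, uri, region, endpoint):
    """Read a state_dict back from the S3 bucket into the given model."""
    import torch
    from s3torchconnector import S3Checkpoint

    checkpoint = S3Checkpoint(region=region, endpoint=endpoint)
    with checkpoint.reader(uri) as reader:
        model.load_state_dict(torch.load(reader))
    return model
```

Writing through the connector avoids staging the checkpoint file on local disk before uploading it.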

Conclusion

Integrating PyTorch with on-premises S3 storage powered by Cloudian gives organizations an efficient and scalable foundation for deep learning workflows. By pairing PyTorch with Cloudian’s storage infrastructure, users can train their models while securely storing and accessing data on their own premises.

This setup supports data privacy and compliance requirements, and it can also improve performance and reduce latency by keeping data close to compute resources. As deep learning continues to drive innovation across industries, the combination of PyTorch and Cloudian’s S3 storage offers organizations a compelling platform for getting more from their data and accelerating their AI initiatives.

The enhanced S3 connector is available from the GitHub repositories of AWS Labs and Cloudian.

View a demonstration of this installation process here:

Learn more at cloudian.com

Or, sign up for a free trial
