At a recent NVIDIA GTC conference panel session focused on storage for AI workloads, the room was packed with AI innovators eager to understand how the storage industry is helping to drive the AI revolution. At Cloudian, we’ve been laser-focused on evolving our approach specifically for AI workloads over the past few years.
While Cloudian is relatively new in AI, being new has its advantages. We’re not wrestling with legacy architectures that were designed for traditional workloads and now need retrofitting for GPU clusters. We get to ask different questions: not “how do we make our existing systems work with GPUs?” but rather “what would storage look like if we designed it from scratch for AI?”
The S3-RDMA Breakthrough Moment
Object storage over RDMA, also known as “S3 over RDMA,” is changing everything, I told the room. And I meant every word. The audience of engineers understood immediately why this combination is revolutionary.
For years, object storage and high-performance computing lived in separate worlds. Object storage was for scale and cost-effectiveness; high-performance storage was for throughput and low latency. But AI workloads demand both—the ability to store massive datasets economically while feeding hungry GPUs at wire speed.
RDMA networking provides the ultra-low latency and high bandwidth that GPUs crave, bypassing traditional networking bottlenecks. When you combine that with Cloudian HyperStore’s proven S3 API compatibility, scalability, and enterprise-friendly management model, you get something that wasn’t possible before: object storage that can actually meet and exceed the demands of modern AI, across all AI workloads.
Our combination of speed and exabyte scale enables entirely new categories of AI applications.
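Part of what makes this practical is that, from the application’s point of view, nothing changes: the code keeps speaking standard S3, and any RDMA-accelerated data path sits below that API layer. Here is a minimal sketch of a GPU node pulling training shards over the S3 API; the endpoint, bucket, and key names are hypothetical, and the RDMA acceleration is assumed to be transparent to this code.

```python
# Minimal sketch: feeding training data to a GPU node via the standard S3 API.
# Endpoint, bucket, and key names are hypothetical; any RDMA-accelerated data
# path would sit beneath this API layer, so application code stays unchanged.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://hyperstore.example.internal",  # hypothetical endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

def iter_training_shards(bucket: str, prefix: str):
    """Yield the raw bytes of every object under a prefix (e.g., dataset shards)."""
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            body = s3.get_object(Bucket=bucket, Key=obj["Key"])["Body"]
            yield obj["Key"], body.read()

for key, data in iter_training_shards("training-data", "dataset/shard-"):
    print(f"loaded {key}: {len(data)} bytes")  # hand off to the GPU data loader here
```

The design point is that the acceleration lives in the storage and network stack, not in the application, so existing S3-based pipelines can benefit without a rewrite.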
The Jensen Factor
The panel took an interesting turn when we started discussing our relationship with NVIDIA. One of my fellow panelists captured it perfectly: “Jensen said it best—NVIDIA builds brains and brains need fuel, and [storage companies] are the fuel to the brain.”
This wasn’t just a clever soundbite; it represented a fundamental shift in how we think about our role in the AI ecosystem. We’re not just storage vendors anymore—we’re enablers of AI innovation. Every millisecond of latency we eliminate, every bottleneck we remove, directly translates to faster training times, more efficient inference, and ultimately, better AI outcomes.
But here’s what struck me during that discussion: NVIDIA doesn’t just innovate in isolation. As one panelist noted, “NVIDIA makes it easy to work with them. This is an engineering conference for engineers, and NVIDIA puts those people in the room to talk to us about things.” The fact that NVIDIA engineers were literally in our audience, asking questions, demonstrated this collaborative approach.
Of course, this collaboration comes with challenges. “Things are moving fast and things are going to be broken along the way,” another panelist acknowledged. The pace isn’t just fast—it’s “exponential” rather than incremental, as one engineer described it. But that’s the price of being part of an industry transformation this significant.
Multi-Modal Reality Check
When the discussion turned to real-world deployments, the complexity of modern AI became clear. It’s not just about training large language models anymore. Organizations are building systems that need to handle video, images, audio, and traditional files simultaneously.
The example of searching for “cats” across all these different data types got some chuckles from the engineering audience, but it perfectly illustrated the challenge. Modern AI doesn’t just process one type of data—it synthesizes insights across multiple modalities. This means storage systems need to efficiently handle completely different access patterns within the same platform.
Video streams need massive sequential throughput. Image processing often requires random access patterns. Traditional file operations demand low-latency metadata performance. And increasingly, applications need to search and correlate across all these data types simultaneously.
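One way to picture those divergent patterns against a single object store is to contrast large sequential streaming with small byte-range reads. The sketch below uses the standard S3 API (bucket and key names are hypothetical) to show both shapes of access side by side.

```python
# Sketch of two access patterns against the same object store (hypothetical
# bucket/key names): large sequential streaming for video vs. small ranged
# reads for random access into an image archive.
import boto3

s3 = boto3.client("s3")  # assumes credentials and endpoint are configured elsewhere

def stream_video(bucket: str, key: str, chunk_size: int = 8 * 1024 * 1024):
    """Sequential throughput: read a large object front to back in big chunks."""
    body = s3.get_object(Bucket=bucket, Key=key)["Body"]
    while chunk := body.read(chunk_size):
        yield chunk  # e.g., feed a video decoder or preprocessing stage

def read_tile(bucket: str, key: str, offset: int, length: int) -> bytes:
    """Random access: fetch only the bytes needed, via an HTTP Range request."""
    resp = s3.get_object(
        Bucket=bucket, Key=key, Range=f"bytes={offset}-{offset + length - 1}"
    )
    return resp["Body"].read()
```

Both functions hit the same platform; the storage system has to serve the big sequential reads and the small ranged reads efficiently at the same time, which is exactly the multi-modal challenge the panel described.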
Our object storage foundation, enhanced with intelligent tiering and caching, positions us well for this multi-modal reality. But more importantly, being new to the space means we designed with these requirements in mind rather than trying to bolt them onto existing architectures.
Security Isn’t Optional Anymore
One of the more sobering moments in the panel came when we discussed encryption. “Encryption is not a choice anymore,” one panelist stated flatly, and the room nodded in agreement. The days of treating security as a performance trade-off are over.
This hits particularly hard in AI deployments, where training data often contains sensitive information, proprietary algorithms represent significant intellectual property, and compliance requirements are becoming increasingly stringent. In addition, there is an ever-present need to guarantee that the data used to train or augment AI is unchanged from its origin. That requirement, along with the need to audit precisely who has accessed data and when, places ever-greater demands on security.
We’ve learned that customers won’t accept “fast but insecure” anymore—they need both.
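In S3 terms, those requirements map onto familiar primitives: server-side encryption at rest, plus checksums that let you confirm data hasn’t drifted from its origin before it feeds a training run. A minimal sketch follows; the bucket and key names are hypothetical, and it assumes an S3-compatible endpoint that supports server-side encryption.

```python
# Sketch: store an object encrypted at rest, then verify its integrity before use.
# Bucket/key names are hypothetical; assumes the endpoint supports SSE (AES256).
import hashlib
import boto3

s3 = boto3.client("s3")

with open("train_manifest.json", "rb") as f:
    payload = f.read()
original_sha256 = hashlib.sha256(payload).hexdigest()

# Upload with server-side encryption requested.
s3.put_object(
    Bucket="ai-training-data",
    Key="manifests/train_manifest.json",
    Body=payload,
    ServerSideEncryption="AES256",
)

# Later: re-read and confirm the bytes still match their origin before training.
body = s3.get_object(
    Bucket="ai-training-data", Key="manifests/train_manifest.json"
)["Body"].read()
assert hashlib.sha256(body).hexdigest() == original_sha256, "data changed since origin"
```

The encryption happens server-side, so the client keeps its normal read/write path; the checksum comparison is the provenance check, done in a couple of lines rather than as a separate system.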
The Deployment Reality
Toward the end of the panel, we got into the nitty-gritty of real-world deployments. When asked about the percentage split between GPU, CPU, and mixed environments, the answers revealed just how varied the AI landscape really is.
Training workloads tend to be GPU-heavy, as expected. But the broader enterprise AI picture includes significant mixed environments. Inference workloads, in particular, often benefit from hybrid approaches where different tasks run on the most appropriate hardware.
This reality has shaped our platform design philosophy. Rather than optimizing exclusively for one type of workload, we focus on efficiently serving diverse computational patterns. Sometimes that means feeding data to GPU clusters at maximum throughput; other times it means providing low-latency access for CPU-based analytics.
One panelist’s honest admission—”I don’t know if I can give a proportion… maybe 5 to 10% GPUs versus CPUs”—followed by good-natured ribbing about being “a real engineer” who “has no idea,” perfectly captured the experimental nature of many current deployments. We’re all still figuring this out together.
Engineering at Warp Speed
Perhaps the most revealing moment came when we discussed the pace of innovation itself. One panelist described how their engineering teams are “putting out things so quickly, it’s mind-boggling,” even for someone who’s been at the company for years and is “trying to keep pace with all the innovations.”
This acceleration isn’t unique to any one company—it’s become the new normal across the entire AI infrastructure ecosystem. Features that might have taken years to develop are now shipping in months. The traditional software development lifecycle has been compressed into something almost unrecognizable.
For Cloudian, this means embracing rapid iteration and maintaining the engineering agility that comes with our focused approach. We can’t match NVIDIA’s scale, but we can match their commitment to innovation and speed.
What Comes Next
As the panel wound down, the forward-looking perspective was both exciting and slightly daunting. “A year from now, we’ll have whole new topics of conversation and things we weren’t discussing this year,” one panelist predicted. “Jensen will tell us what we’re gonna work on tomorrow.”
That last comment got the biggest laugh of the session, but it also captured a fundamental truth about our industry right now. We’re all essentially building the future in real-time, responding to requirements that didn’t exist a year ago and preparing for use cases that haven’t been invented yet.
For storage companies like Cloudian, this represents both a massive challenge and an unprecedented opportunity. Our relatively fresh perspective means we’re not constrained by legacy assumptions about how storage “should” work. We can adapt quickly to whatever comes next, building solutions that enterprises actually need rather than what storage vendors have traditionally provided.
As I left that GTC panel session, one thing was crystal clear: the AI revolution isn’t slowing down—it’s accelerating. The question isn’t whether we can keep up with NVIDIA’s pace of innovation, but whether we can position ourselves to help our customers succeed in whatever comes next. After all, even the most advanced AI brain is only as good as the fuel that powers it.
And sometimes, that fuel needs to be delivered by the new kid who’s willing to rebuild everything from scratch.
Learn more at cloudian.com
Or sign up for a free trial.