Nasuni CEO: ‘We’re going to liberate you from the bottleneck around your files’

Company challenging EMC and NetApp with cloud-powered virtual appliances and services


When it comes to file systems, scale is the enemy, according to Andres Rodriguez, CEO of Nasuni. And the best weapon in the battle for scale is the cloud. Nasuni claims to have developed the first cloud-native file system, delivering not only virtually unlimited scale in the cloud but rapid access to files from locations around the world.

Instead of deploying more and more on-site storage – and dealing with costly, painful upgrades – Nasuni delivers a virtual machine (or appliance) that handles local needs while using the cloud for the heavy storage lifting. You buy capacity, not boxes, and that makes it easier to budget and plan for growth, per Rodriguez.

In this installment of the IDG CEO Interview Series, Rodriguez spoke with Chief Content Officer John Gallant about how the Nasuni UniFS file system works, how customers deploy it and what kind of savings and flexibility they can expect. He also talked about what it means for your existing EMC and NetApp systems, and how those competitors are responding to the Nasuni challenge. Rodriguez also explored why it’s not easy to try to replicate what Nasuni does on your own.

What does Nasuni do and why do you do it?

We handle all file workloads for enterprise customers. I started the company because I was the CTO of the New York Times and we had very big file problems that were getting worse. I felt, 15 years ago, the architecture for file storage was going to run out of steam. So I set out to build a new one. In most businesses, files are the work product. If you're talking about an architecture firm or an engineering firm, you're talking about design documents. If you're talking about a media firm you're talking about movies they may be working on. If you're talking about a software development house you're talking about the software itself. That's all unstructured file data.

I felt that there was a looming problem of scale around file data. The architectures of the time did not scale to what I thought file systems were going to have to reach to handle future workloads. The files were getting bigger and everything was being captured in digital form in some file format or another and there wasn’t an integrated approach to addressing that issue.

Nasuni asks: Where are your files? What kind of pain are you having around files? It’s typically: We’re having a hard time storing them because the scale is killing us. We’re having a hard time protecting them. (That’s a byproduct of scale as well because the backup systems are getting too large.) We’re having a very hard time moving these files around. The files are so big and they need to be accessed from so many places around the world and at various degrees of performance. Bad performance is always easy. It’s getting great performance around the world that’s really difficult.

We help clients address that and we do it all from an integrated platform. We'll take any file workload. We'll take your Office documents, your design files, your movie files. Every file workload you have, we'll take it off the traditional storage system but we will give you something that is fully compatible with anything above the file layer - all the applications, all the security access control systems - and it will scale forever. The protection will be integrated in the system and you will be able to move and access these files around the world at any level of performance, no matter how high.

We do it all as an integrated capacity license so that a CIO can look at this and say: We’re getting a 100TB license from Nasuni this year. Next year we need another 100TB from Nasuni. It’s incredibly clean and predictable, like the way you would budget for and purchase something like Salesforce or a SaaS application. It’s not done by selling you more boxes at your data center, it’s done as a service.

As I understand it there are two critical components to the Nasuni offering. One is an onsite device and then there’s the cloud storage capability that you offer. What does the onsite device do?

A good analogy for this is the Nest thermostat. The Nest thermostat serves two functions. The first is it needs to be an awesome thermostat. It needs to be able to control your HVAC system. That on-prem appliance is essentially doing the job of an equivalent thermostat, which - in the data center for files - is a network-attached storage device or NAS appliance. When you put one of our appliances in your data center you ensure complete compatibility with anything that was attached before to your file systems. It will attach to Active Directory and DFS. It will have NFS; all of the file protocols that exist in the data center will be part of what this device can talk to.

The magic then happens inside the device, just like the Nest thermostat, as it creates the file system that’s going to live in the cloud. With Nest, the magic is that you can control your thermostat remotely. The magic is that the thermostat is learning because there is intelligence in the cloud that is figuring out when you are out and when you’re not. Our file system resolves all of the issues that have been plaguing the data center in file systems for decades by removing the file storage from the data center.

We’re creating this massively scalable file system in the cloud that can scale forever. It is completely versioned and protected and you can access it from anywhere in the world from other appliances that look just like the appliance you first deployed in your data center. The cloud is where all the heavy lifting is actually getting done but you need this local appliance that sits in your data center so you can talk to the things that already exist in the data center and have the performance levels that are required in the data center.

How do you ensure performance where you have the local device but the files really are stored up in the cloud?

The wonderful thing about file systems is that they can be improved a great deal with caching. When I was looking at this problem, my biggest realization was that most file systems in the enterprise are largely unused. Most of the data that accumulates in big companies is almost never touched. Companies are typically working on a high-performance edge of that data called a working set, which can be as little as 5% of what a company is actually storing. This data is incredibly active, incredibly high performance, read, write all the time.

We leveraged that observation to build cache into the appliance that goes into the data center. The single most important thing those appliances are doing is, essentially, continuously evicting, continually getting rid of the data that’s not being actively used. All of the data moves to the cloud and what remains local in the data center is that very-high-performance layer that needs to be accessed by applications locally. There is an incredible leverage. That leverage is as dramatic as a 5% to 95% ratio in terms of what you need in your data center versus what you could have stored in the back end.
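The eviction behavior he describes - keep the hot working set local, push everything else to the cloud - can be sketched as a simple LRU cache over a backing store. This is a toy illustration of the general technique, not Nasuni's actual implementation; all names here are invented:

```python
from collections import OrderedDict

class EdgeCache:
    """Toy LRU cache in the spirit of the edge appliance described above:
    hot files stay local, cold files are continuously evicted, and the
    backing 'cloud' store always holds the full copy."""

    def __init__(self, capacity, cloud):
        self.capacity = capacity          # max number of files held locally
        self.local = OrderedDict()        # path -> data, in LRU order
        self.cloud = cloud                # backing store (a plain dict here)

    def read(self, path):
        if path in self.local:
            self.local.move_to_end(path)  # mark as recently used
            return self.local[path]
        data = self.cloud[path]           # cache miss: stream from the cloud
        self._admit(path, data)
        return data

    def write(self, path, data):
        self._admit(path, data)
        self.cloud[path] = data           # the cloud keeps the full footprint

    def _admit(self, path, data):
        self.local[path] = data
        self.local.move_to_end(path)
        while len(self.local) > self.capacity:
            self.local.popitem(last=False)  # evict least recently used
```

With a capacity sized to roughly the working set (the 5% figure above), most reads hit locally while the data center holds only a fraction of what is stored in the back end.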

Under circumstances where, for instance, you need that data accessible in multiple locations around the world, you can synchronize the caching in the devices so that everywhere in the world they’re looking at the same working set. You can play a trick around globalizing the cache across multiple geos. When that data is required and say it doesn’t exist there, you also have an advantage because file systems - unlike traditional databases - have built-in resiliency against latency issues. Other than the fact that you have to wait, the applications will not produce errors or hang because a file needs to be streamed from the back end.

Think about the way movie streaming has changed for consumers: It's the same principle. The old TV streaming devices had tons and tons of storage in them and they were terrible to use because the movies had to be downloaded before you could watch them. All modern streaming devices exploit the fact that movies have been optimized for streaming, and then they have a tiny little cache that basically makes you feel like your movie is there even though it's being streamed from the cloud. That's a very narrow use case. We typically have larger caches than that and account for the fact that you need to be able to read and write the files.


But that experience is basically the experience that people get when the files don’t exist in the local appliance. The device pauses before it gives you the first bit but as soon as it gets it, it just begins streaming like crazy because we’ve done all the work necessary before we put it in the cloud to compress the files, deduplicate them and make them very stream-friendly. They can just shoot right back to wherever they’re needed. That helps the system overall perform better.
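The deduplication step mentioned here - storing each unique chunk of data only once, keyed by its hash - is a standard technique that can be sketched in a few lines. This is a generic illustration, not Nasuni's on-disk format; the tiny chunk size is for demonstration only (real systems use kilobyte-to-megabyte chunks):

```python
import hashlib

CHUNK = 4  # tiny chunk size for the demo; real systems chunk in KB-MB ranges

def dedup_store(data, store):
    """Split data into fixed-size chunks, keep each unique chunk once
    (keyed by its SHA-256 hash), and return the recipe of hashes
    needed to rebuild the original bytes."""
    recipe = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        h = hashlib.sha256(chunk).hexdigest()
        store.setdefault(h, chunk)      # duplicate chunks are stored only once
        recipe.append(h)
    return recipe

def rebuild(recipe, store):
    """Reassemble the file by streaming its chunks back in order."""
    return b"".join(store[h] for h in recipe)
```

Because repeated content collapses to a single stored chunk, less data has to travel over the wire, which is part of why files can "shoot right back" to wherever they are needed.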

Putting a price on Nasuni technology

How does somebody buy it? How do you price this?

One of my biggest frustrations when I used to buy storage was having to deal with all the nickel-and-diming features of the storage company. When I started this company I decided we were going to take a really straightforward approach. Basically, I'm selling you capacity and integrated protection. Protection is really two parts. I'm versioning my file systems so that customers - say they get attacked with ransomware - can always go back to pristine versions of their file systems that are untouched by the malware attack.

The other thing is you need to be able to replicate the back end. We use the cloud to replicate the data asset. When we create this file system in the cloud it is one logical file system, if that’s what the customer wants, but it is many, many physical file systems that are distributed throughout our partners, cloud providers, companies like Microsoft with Azure and AWS with S3. Then we allow you to access that file system from many, many locations around the world. You want two locations? That’s great. You want 20 locations? That’s great too. We charge for one thing in that whole equation: How much usable capacity is in the file system. We don’t charge you for the replicated copy. We don’t charge you for how many points of access you have to the file system and we don’t charge you for how many versions you want to keep in the file system.

That's truly revolutionary. Almost every storage company, because it was selling you storage in a box, charged you for raw storage capacity, and then you compromised because you wanted to keep a thousand versions of your file system to be able to go back in time. We give our customers infinite versions of the file systems and we don't charge them anything extra for it. Most of our clients are operating in an infinite retention mode, meaning if they've been our clients for five years they can go back in time in five-minute intervals for five years and restore at any point in time a file, a directory structure, a complete file system. That whole equation is licensed just on the usable capacity, just on how much storage the users or the applications have direct access to.
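The "restore at any point in time" model amounts to keeping an append-only history of snapshots and answering the question "what did this look like at time t?" A minimal sketch of that idea (illustrative only, not UniFS internals):

```python
import bisect

class VersionedFile:
    """Append-only version history: every save keeps a (timestamp, content)
    pair, and restore(ts) returns the file as it was at time ts - the
    'go back in time at five-minute intervals' idea in miniature."""

    def __init__(self):
        self.times = []   # snapshot timestamps, kept in ascending order
        self.snaps = []   # content at each timestamp

    def save(self, ts, content):
        self.times.append(ts)
        self.snaps.append(content)

    def restore(self, ts):
        i = bisect.bisect_right(self.times, ts)  # latest snapshot <= ts
        if i == 0:
            raise KeyError("no version existed at that time")
        return self.snaps[i - 1]
```

Because history only ever grows, retention is limited by back-end capacity rather than by the appliance, which is why the versions can live in the cloud "forever."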

That pricing includes the cost of the device onsite as well?

That device is an edge appliance and it’s a virtual appliance which runs on Hyper-V, on VMware. It can also run - and this is important for the access part of the equation - natively in AWS or in Microsoft Azure. That means you have access to your file systems wherever you want them. This is one of the big transformations that is happening now with the cloud. Customers want to be able to have their data in a kind of data Switzerland. They don’t want the data to be captured in any one provider because they want to bring the best-in-class services to that file data.

If you want to run a VDI environment or if you want to do transcoding for movies you may want to have access in AWS or Azure or in other providers that are specific to those services. You’re able to run on virtual appliances that are native to those environments and have full access to your file system even if your file system is hundreds of terabytes or tens of petabytes large. Regardless of the size of the file system, I want to be able to access that file system from anywhere, from any service provider.

The virtual machines allow you to do that and those are licensed at no additional cost. If you want a bare metal, dedicated appliance we have an OEM partnership with Dell where, basically, customers can buy a Dell server that comes preloaded with our software for the data center. Our goal there is like a cable company's. We want to give you the highest quality hardware appliance at the lowest cost because that's going to make you consume more of the service. You're going to be a happier client. It's a completely non-hardware model for how you pay for this. What you're really paying for is the subscription service based on the capacity. That's how my sales reps are compensated. That's how the company makes money and that's how we add value to our customers.


Are there situations where this solution is really applicable and are there situations where it’s not as applicable? Are there any specific instances where this is not a great solution for a customer problem?

It is a great solution for unstructured file data and it works pretty much at any scale. We focus on the high end of the market because the solution has so many benefits when you have scale, when you’re trying to store hundreds of terabytes, when you’re trying to deliver those files to 20 to 30 locations around the world. That’s when the solution is really valuable.

If you’re trying to run an ERP system with databases, that is high-performance block data. There are many great storage companies that do an awesome job at high-performance block data, and there are application vendors that do great block synchronous replication, typically across two locations for active/passive DR [disaster recovery] of databases. That is not what Nasuni does. If you want to protect an Oracle system, if you want to protect your SAP system, that’s not what Nasuni does. That is the block database world. That is the traditional SAN world.

We are the NAS world. If you have files at scale, it’s a general purpose file system. I love calling it that because that’s the only useful kind of file system. We actually have many deployments where clients will upgrade to the latest flash array SAN storage product, something like Pure or Nutanix, and immediately have great performance. But, they say: What do we do with all the files? They’ll get a virtual machine inside these high-performance, solid state flash arrays and put their files there and their files will drain right out the back into the cloud. Now you have a really high-performance environment for your files, but you don’t have the complete file footprint in your data center. That kind of fine optimization can be extremely powerful.

It’s not unlike the world of thin laptops or smartphones in that everything has gone to solid state. Everything is super high performance on the local handheld device that you’re carrying. The thing we’ve done with those devices though is that we squeeze the capacity out of them. There is much less capacity in those devices than we used to carry around five or 10 years ago because the data is really in the network. Whether it’s your email, your music collection, your photo collection, the smartest thing that Apple ever did was put all the consumer data in the cloud so that people could get thinner, faster laptops at an affordable price.

When I want to move from one generation of Nasuni hardware or Nasuni VMs to the next generation of my NAS hardware infrastructure, there is no forklift upgrade. It's no longer: Oh my God, I have to take 100TB from this monolithic NAS storage controller to this other one and it's going to take months to do that. The upgrades look a lot more like getting a new iPhone. You get the new hardware, you put it online, you tell it: This is my new NAS device. This is my new Nasuni edge appliance. You connect it to the service and it resynchronizes with everything that was there in the old appliance - except nothing comes from the old appliance. Everything streams back from the cloud. It's safe and it's much more convenient than doing this bulk migration every three or four years.

Do you tell customers to replace your existing storage with Nasuni, augment it with Nasuni? What is the strategy?

We tell them we will change their business. File systems are strategic. File systems change how people work in organizations. They are the ultimate duct tape for an organization. You're going to have your document management systems to create some workflows and yes, you're going to have your ERP systems for supply chain management. But when you're talking about groups of people getting work done, it's about the files: Who has access to the files, who can change the files and who can see the files around the world? There has been a bottleneck in the industry around file systems for at least the last decade. We're going to liberate you from the bottleneck around your files. You can't get great collaboration when the performance is really terrible across all these sites. We can get rid of that. We can give you high performance on every single site.

This is, I think, one of the more interesting things. We are seeing the tip of the iceberg around machine-generated data in the enterprise. We have several large manufacturing companies that have been constrained, financially and technically, in how much data they can store and analyze in their existing file systems. All of a sudden you're seeing the same kind of scale that hit the web companies 10 to 15 years ago, the scale that caused things like S3 to develop and caused Google to scale its infrastructure.

Those are now hitting the larger segment of the enterprise where you will have a big manufacturer saying: Wouldn’t it be amazing if we could have telemetry on every single device we have out there? Where are we going to store it? We’re not going to want to access it from one point; we’re going to want to access it from everywhere, anyone that wants to consume this service to make our products better. Or we want the test data for how we manufacture things to be stored forever so that when something starts failing in the field we can determine where we missed things and what happened. Nasuni will get rid of all your previous problems around files but, more important, we give you a file system strategy for the future that is going to make your business far more productive, far more responsive to your own customers by helping you build better stuff.

Are your customers taking out existing storage or just not buying additional future storage?

There is a combination. Many customers have taken out their storage and their backup systems. That’s kind of a bread-and-butter thing that we do. It could be an EMC-native environment and they’ll have backup environments around that. We can eliminate all of that and you can store with our edge appliance and Nasuni service and you’re done. You don’t have to worry about making backups anymore and you don’t have to worry about ever running out of space. That’s one side of the equation. The other side is global access to the data. That is brand new. Customers have been stuck for years or decades with essentially [this problem]: If you have a lot of files in one location you have to go over the wide-area network to get those files with some kind of network acceleration product.

That approach is super frustrating to companies. When you want to have 40 locations around the world doing software development/testing and have them all feel like they’re operating on a local file system, the approach should be that the file sits in the cloud but it’s replicated locally to every site where that file system needs to be in use. That radically changes what could be done in the past. It creates new workflows that for this client are one of the most exciting things they’ve seen in their business in a long time in terms of being able to get more work done faster.

Just a very quick tactical question. What happens if your site is offline and can’t get to the cloud?

If you are not trying to do one of these global synchronization things the system behaves like any kind of local storage system. The same with your Nest thermostat; when it’s offline it’s still a good thermostat that you can control and you can make rooms warmer or cooler. We fall back to the behavior of essentially being like a traditional NAS box. Everyone can keep writing and reading from it. The cache helps you a ton because typically everything that people need is right there and when the internet comes back online the device is synchronized to the back end. They are all designed to work offline.

I've been talking about one file system but our customers typically have dozens of file systems, not because any one file system couldn't scale forever but because they want different behaviors for their file systems. They may want to have a file system that is local to a certain site and another file system that is local to a region or that is global. You can change the behavior at the file-system level. If you have this kind of global access and the internet goes down, you can tell the file system at that site to exist in read-only mode for the data it already has, so users cannot make modifications and those modifications don't conflict with what people are doing elsewhere in the world. Or you can have a site fall into read/write mode, and then our system will identify conflicts. It never throws away any data; it says: Your end users need to now reconcile these files because you've allowed conflicts to happen in the system.
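The conflict handling described - detect when an offline site and the rest of the world both changed a file, and keep both copies rather than discard either - is essentially a three-way comparison against the last common version. A minimal sketch of that check (purely illustrative; the function name and return shape are invented here):

```python
def resync(base, local, remote):
    """Three-way check after an offline site comes back online:
    compare the site's local copy and the cloud copy against the
    common base version. Returns (result, conflict_flag); on a real
    conflict both copies are kept for users to reconcile - data is
    never thrown away."""
    if local == remote:
        return remote, False             # both sides agree: nothing to do
    if local == base:
        return remote, False             # only the cloud changed: take it
    if remote == base:
        return local, False              # only the offline site changed
    return (local, remote), True         # both changed: conflict, keep both
```

The read-only fallback mentioned above is the degenerate case: if the offline site never writes, `local == base` always holds and resynchronization can never produce a conflict.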

Why couldn’t I do what you’re doing today with my existing file systems and cloud storage? Why couldn’t I create that same hybrid cloud approach with what I already have?

Because it’s all about the file system. To solve your file problem, you need an awesome file system. We have built the first cloud-native file system. Before I started Nasuni I started an object storage company so I knew how cloud storage systems had to be built. My previous experience was all in distributed systems and I knew we needed a file system that could be replicated and live comfortably among tens of thousands of servers distributed all over the world. That’s what UniFS is. It’s a file system for the times. It’s a file system that, at a technical level, has a limitless pool of metadata to draw from in terms of the cloud and therefore it can scale forever.

You can take a UniFS file system and go from a million files to a billion files and not change the device. That appliance in the front doesn’t change at all. All that file system growth happens in the cloud and it scales forever. We use that same technique of scale in the cloud to scale across time. The reason we can store versions forever is that the file system is able to map in time the same thing that it’s doing in space with all the metadata and basically give you full-on versions of your files every five minutes for an infinite amount of time. There is no file system that can do that in the cloud providers, in the traditional vendors. That’s what we do.

Finally, providing access to that file system from multiple locations is not just about having one file system that can live centrally - which we have because our file system is in the cloud - it’s about the orchestration and management of all those edge appliances around that file system core. It’s really important to be able to maneuver the blocks and the state of the different synchronizations that are happening with the edge appliances from each location to that file system core. That is what Nasuni back-end services support. They’re orchestration wrappers around UniFS to make sure we are not stepping on our own toes as we’re trying to synchronize 30 locations around the world on read/write, active/active endpoints.

Nasuni vs. legacy players

I know that there are some specific competitors that come up in relation to your company but I want to talk about the traditional storage suppliers at this point. Is there any indication that they’re doing something or are planning on doing something similar?

The traditional competitors are NetApp and EMC. Those are the ones that we see all the time. EMC has VNX, which is their mid-market NAS array, and they have Isilon, which is what they use for scale. NetApp has its clustered systems and its WAFL file system. What's limiting about their approach is they are stuck in the mode of thinking that everything has to live in their devices, everything has to be in their arrays.

Recently, what they've finally said is: We are going to keep the file system in the arrays and all the metadata for the file systems in the arrays, but we're going to allow you to tier to the object store. (I use object store and cloud storage as synonyms.) We're going to use the cloud storage systems to tier. This is typical of dominant players. They see this new technology as a second-class citizen and relegate it to that role in their architectures.

If you have an Isilon array and they allow you to tier to the cloud, they are essentially keeping all of that file system metadata in the Isilon array and then there are pointers back to the cloud. The problem is that to get a bigger file system you basically need to scale the front end of the cluster. And to be able to distribute the cluster around the world it’s the same old approach. You need to go back to that cluster because the file system is located on that physical cluster that you still have in your data center. The way to think about this is to completely turn the file system upside down and say: No! The cloud storage system is the first-class citizen here. The file system needs to exist there and then you need to have this disposable, stateless device that extends it.

That's the beauty for consumers. That's the Dropbox approach. Your files live with the service, and that gets extended to wherever you want to use them. That is a much better experience than trying to synchronize all your devices against each other. Traditional vendors are still in that world.

What do they tell customers about you when they’re selling? What do they warn them about?

They are trying to sell them a thermostat with tons and tons of bells and whistles, and they tell them: Look at all the buttons, look at all the configurations you can have with our stuff, because we've been doing data center devices for the last 20 years. Nasuni is this streamlined, very simple appliance that cannot possibly be complete. Every once in a while there will be some bell or whistle that we don't have and we have to put it in the road map.

We are a startup. We've been around seven years. We don't have everything. We don't have the kitchen sink that they have but in general, we address 80% of the problem really, really well. They are saying there is no dual battery protection in that appliance and there is no Fibre Channel option on that thing - all these secondary things. It depends on how much pain customers have around the new workloads. If you're a customer that's dying on scale or dying because you don't have a DR plan around half a petabyte of data, you're going to listen to us a lot more carefully than if you're a customer that only has 20TB and a single site and really could get by quite happily with just about any traditional storage vendor out there.

Give us some sense of the success you’re having. How many customers? What kind of inroads are you making?

We’re doing great. I started the company ahead of its time and I’ve been trying to not do this throughout my career. I’ve been a colossal failure at this. I always start a company probably five years before its time. The benefit is we managed to develop a rock-solid file system and the times have changed. When I was talking to CIOs six years ago, the idea of using the cloud as your core for infrastructure was ludicrous.

There were all these concerns: security and performance and this and that. Now, in the last 18 months to two years, there isn't a single CIO that doesn't tell me: 'We have a cloud-first strategy. As we look at infrastructure refresh, we are thinking how we can leverage the cloud first.' That is a huge boon for us. Our client base has gone from companies that used to store 10 to 20TB with us to companies that are storing 600TB and a petabyte with us and multi-petabyte systems.

The top 500 enterprise accounts are buying Nasuni as opposed to the people that are the mid-market and low mid-market. We're pretty much doubling year over year. Our entire business is predicated on the subscription part of the business and we did $5 million two years ago for the year. We did close to $10 million last year. We'll do close to $20 million in what we call annual contract value of our customer base. It's just steady. I can see every single quarter is twice what we did a year ago. {Ed. Note: A representative of Nasuni provided the following clarification on the above revenue numbers: 'The numbers are estimates for revenue under contract/recurring revenue, not GAAP revenue, which is significantly larger, because GAAP would also include services and hardware sales.'}

In December you got a significant round of funding. How are you investing those dollars?

From a technology perspective this is revolutionary. From a go-to-market perspective this is traditional infrastructure sales, which means we invest a ton of money on two things. You have to have field sales and technical knowledge that can go to the accounts and conduct proofs of concept. Then you have to have incredible professional services and a support organization that comes behind that to make sure that things are running really well for clients. We’re investing in those areas and we’re always investing in engineering.

The first half of our year is going to be spent basically increasing field presence. This is the kind of model where all our customers, within the first year to year-and-a-half, will double their contracts with Nasuni. Acquiring those customers early helps the company a ton in terms of growth and cash flow. We're investing, essentially, in acquisition in the first half of the year and then we're doubling up on engineering and support services in the second half of the year.
