
May 15 2017

The story of ‘Open Convergence’ and why I joined Datrium

It is not often that the opportunity to join a cutting-edge technology company comes along in a lifetime, and I have now had it twice – I consider myself very lucky.

First, the Datrium team includes members of the founding teams of Data Domain and VMware. Data Domain revolutionized the enterprise storage industry by introducing deduplication technology, and VMware needs no introduction. Diane Greene and Mendel Rosenblum, VMware’s founders, are early investors in Datrium, and some of the top Valley VCs, such as Lightspeed and NEA (Nutanix, Riverbed, Nest, Uber, Workday, etc.), are also investors.

Second, Datrium is growing faster than most datacenter infrastructure companies at the same stage. Merely one year after product launch, they already have 100+ production deployments across more than 70 leading customers, and CRN selected them as one of the coolest storage startups last year.

 

The Technical Background – What You Need To Know

 

SSD-class storage has changed the way datacenters are architected, and workloads are increasingly designed to demand SSD-class storage performance.

Today’s Hyperconverged architectures leverage SSD-class storage to deliver performance by placing SSDs close to the CPU on each server’s PCIe bus, with a software stack responsible for coordinating replicas of the application data across multiple clustered servers for resiliency.

A hypervisor and virtual machines run on each host, and the most advanced solutions implement some form of data locality to guarantee that data stays local to the host where the application is running.

Hyperconvergence brought cloud-type consumption-based economics and flexibility to enterprise IT. Rather than making big buys every few years, IT simply adds building blocks of infrastructure to the data center as needed. The approach gives the business faster time-to-value for the expanded environment.

Another main selling point of Hyperconvergence is the ability to scale out performance with each newly added server, removing the dual-controller performance bottleneck of 3-tier SAN architectures. Each new node is a new controller powered by software.

This Integrated Systems market is growing at the expense of the SAN market, and both Goldman & Gartner are predicting it is going to be a $30B+ market in the next few years. As you can imagine, there are several horses in this race, from upstarts to tech behemoths.

Ultimately, Convergence (CI) and Hyperconvergence (HCI) are about making IT more responsive and agile, providing a cloud-like experience and economics, and moving enterprise IT from intuition to analytics.

 

The Short History of Storage and the HCI Evolution

 

You may remember that servers used to be stateless in 3-tier SAN architectures because data integrity and reliability were handled by high-end storage arrays, with dual-controller technology often bottlenecking performance. We loved the stateless nature of servers!

To eliminate SAN performance bottlenecks, acceleration products were developed for both the server and SAN edges. These implementations operated decoupled from each other, providing neither a globally unified approach nor a complete removal of the SAN controller bottleneck.

Later on, All-Flash Arrays (AFAs) introduced the ability to deliver plenty of IOPS and throughput, but the array-based controller dilemma continued to limit flash capabilities, and today network bandwidth still limits performance-hungry workloads and scalability.

On the other hand, with HCI, both data and metadata began to live in stateful servers with sophisticated algorithms for distributed data and metadata placement. Data is written to local SSDs and synchronously replicated and written to other hosts over the network for data resiliency.
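
To make the contrast with what comes later concrete, here is a minimal, hypothetical sketch (in Python, with made-up Host objects, not any vendor's actual code) of the HCI write path described above: the block is persisted to the local SSD tier and synchronously replicated to peer hosts before the application is acknowledged.

    import hashlib

    class Host:
        """A cluster node with a local flash tier, keyed by content hash."""
        def __init__(self, name):
            self.name = name
            self.ssd = {}

    REPLICA_COUNT = 2  # one local copy plus one remote copy

    def hci_write(block, local_host, remote_hosts):
        """Persist a block locally, then replicate synchronously before acking."""
        key = hashlib.sha256(block).hexdigest()
        local_host.ssd[key] = block                    # local SSD write (data locality)
        for peer in remote_hosts[:REPLICA_COUNT - 1]:
            peer.ssd[key] = block                      # east-west replica over the network
        return key                                     # the VM is acked only after all copies land

    hosts = [Host("esx-01"), Host("esx-02"), Host("esx-03")]
    hci_write(b"application data", hosts[0], hosts[1:])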

What’s clear is that SSDs have a place in the server for accelerating workloads. A single NVMe PCIe device can saturate a next-generation 40 GbE link, and 3D XPoint technology delivers latencies as low as 60µs, faster than even next-next-generation 100 GbE network connections.
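
A quick back-of-envelope check of that claim; the device throughput and network overhead figures below are illustrative assumptions, not a specific product spec.

    nvme_read_GBps = 5.0                            # assumed sequential read rate of one NVMe PCIe device
    link_40gbe_GBps = 40 / 8                        # 40 GbE is roughly 5 GB/s of raw bandwidth
    print(nvme_read_GBps >= link_40gbe_GBps)        # True: a single device can fill the link

    device_latency_us = 60                          # 3D XPoint-class latency quoted above
    network_overhead_us = 20                        # assumed NIC plus switch round-trip overhead
    print(network_overhead_us / device_latency_us)  # ~0.33: the network adds a large fraction of total latency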

(Image source: [5])

 

The Datrium Technology and the Creation of ‘Open Convergence’

(friendly warning, it may get a little tech deep… skip to the conclusion if you lack the appetite!)

 

With Datrium, speed on servers scales independently from persistent data. For this reason, new or existing standard servers (rackmount or blades) may be added to a Datrium cluster – compute nodes are stateless, and the loss of one or multiple servers does not affect the cluster or data resiliency – component failures are the norm rather than the exception.

A benefit of this approach is that enterprise IT does not need to forklift workloads or select distinct initial pet projects to start on the Convergence path, offering the business a much faster time-to-value.

Datrium virtual controllers are a distributed, host-based software stack that enables a global approach to data management and security (erasure coding, deduplication, compression, encryption, etc.). At a high level, application data is stored on SSDs and/or RAM on the host where the applications are running, and another copy is synchronously committed to NVRAM on a Data Node (DN).
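
As a rough mental model only (the ComputeNode and DataNode classes below are hypothetical, not Datrium's actual code), the split write path looks something like this: the host keeps a cache copy for local reads, while the durable copy must land in both NVRAM mirrors on the Data Node before the write is acknowledged.

    import hashlib

    class ComputeNode:
        def __init__(self):
            self.flash_cache = {}          # local SSD/RAM, used as a read cache only

    class DataNode:
        def __init__(self):
            self.nvram = [{}, {}]          # mirrored, battery-backed NVRAM buffers
            self.capacity_pool = {}        # durable capacity behind the NVRAM

        def commit(self, key, block):
            for mirror in self.nvram:      # the write must land in both mirrors
                mirror[key] = block

    def dvx_write(block, cn, dn):
        key = hashlib.sha256(block).hexdigest()
        cn.flash_cache[key] = block        # stateless cache copy stays on the host
        dn.commit(key, block)              # synchronous commit off-host
        return key                         # only now is the VM's write acknowledged

    dvx_write(b"app data", ComputeNode(), DataNode())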

 

Data Nodes are 2U commodity x86 servers spec’d for storage capacity (60-180 TB with 2-6x data reduction), with redundant, hot-swappable, mirrored, battery-backed NVRAM for ultra-fast write I/O acknowledgment and durable data storage.

 

Taking the array controller CPU out of the data path removes it as a bottleneck and enables a future implementation of NVMe over Fabrics (NVMf), for example over RDMA. This is really cool stuff for another post.

 

Datrium has already moved data management functions to the servers. Every Compute Node (CN) is responsible for data coalescing and for the hashing functions behind erasure coding, compression, deduplication, and encryption before the data is sent to a Data Node, so no data processing happens on the Data Nodes.
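
As an illustration of that host-side pipeline only (Datrium's actual algorithms are not public; zlib and the toy XOR "cipher" below are stand-ins), a data reduction step running on the Compute Node might look roughly like this.

    import hashlib, zlib

    def reduce_on_host(block, dedupe_index, key_byte=0x5A):
        """Fingerprint, dedupe, compress and 'encrypt' a block before shipping it."""
        fingerprint = hashlib.sha256(block).hexdigest()
        if fingerprint in dedupe_index:
            return fingerprint, None                          # duplicate: send only the reference
        dedupe_index.add(fingerprint)
        compressed = zlib.compress(block)                     # compress before the wire
        encrypted = bytes(b ^ key_byte for b in compressed)   # toy XOR standing in for real encryption
        return fingerprint, encrypted                         # the Data Node stores this as-is

    index = set()
    print(reduce_on_host(b"hello world" * 100, index))
    print(reduce_on_host(b"hello world" * 100, index))        # second copy ships only the fingerprint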

That said, Data Nodes are the last resort for read I/O operations. If a VM is migrated to a different server, the Datrium software first tries to retrieve data from the source host before going to a Data Node. However, if identical deduplicated data is already in use by another VM on the destination host, the index lookup succeeds and the data is read from local flash, retaining data locality.
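
A hypothetical sketch of that read preference order, where plain dictionaries stand in for the flash caches and the Data Node:

    def dvx_read(fingerprint, local_flash, source_host_flash, data_node):
        if fingerprint in local_flash:            # dedupe hit: identical data already on this host
            return local_flash[fingerprint]
        if fingerprint in source_host_flash:      # VM just migrated: fetch from the old host's cache
            block = source_host_flash[fingerprint]
            local_flash[fingerprint] = block      # re-warm locality on the new host
            return block
        return data_node[fingerprint]             # last resort: read from the Data Node

    local, source, dn = {}, {"abc": b"data"}, {"abc": b"data"}
    print(dvx_read("abc", local, source, dn))     # served from the source host, then cached locally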

As data is written, Datrium also protects it with error-correcting codes (ECC) and cryptographic fingerprints, and as data is read, the system continuously verifies that it matches these integrity codes.
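
The verify-on-read idea can be illustrated with a toy fingerprint check; the product's actual ECC and fingerprint formats are not public, so this is only a sketch of the concept.

    import hashlib

    def store(block, backing):
        fp = hashlib.sha256(block).hexdigest()        # cryptographic fingerprint kept with the data
        backing[fp] = block
        return fp

    def verified_read(fp, backing):
        block = backing[fp]
        if hashlib.sha256(block).hexdigest() != fp:   # recompute and compare on every read
            raise IOError("integrity check failed; rebuild from redundant copies")
        return block

    backing = {}
    fp = store(b"payload", backing)
    assert verified_read(fp, backing) == b"payload"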

I am not covering the I/O data path in this post, but one of its most interesting aspects is the use of host RAM and the Data Node’s mirrored NVRAM to acknowledge and accelerate write I/O, which likely makes Datrium one of the most performant enterprise storage solutions on the market today. Datrium’s architecture also eliminates most of the undesired East-West traffic between hosts that is common in HCI implementations.

Based on some of my previous articles, some people may turn up their noses at what I am about to describe as a better implementation, but it is always good to review technology advancements and be willing to change opinions and perspectives.

There’s no requirement for external Controller VMs on each server, nor is Datrium hard-coded into the hypervisor kernel. Datrium runs in ‘userspace’ within the hypervisor, while still providing independent management and upgrade cycles. The software approach allows for faster release cycles, and whenever server hardware improves you get to enjoy the performance benefits right away.

Datrium is currently available for vSphere, and the solution is auto-deployed using a VIB (vSphere Installation Bundle). Other hypervisors may be supported in the future.

Datrium is VM-centric, and as you would expect there are no LUNs, zoning, queue depths, RAID levels, cache sizing or complex configurations to manage. Operational and data management is done via a seamless multi-datacenter and multi-cloud single pane of glass, completely integrated into vCenter, that helps move enterprise IT from intuition to real-time analytics.

In Summary

 

I have a ton of respect for the market that Hyperconvergence vendors created, and heck, until yesterday I was marketing and selling it. There are strong use cases for HCI; however, in a diversified market there are also compelling alternative choices for an array of different customer requirements and use cases.

As the SAN market declines, there is a clear market gap to be filled by intelligent Software-Defined Storage platforms that can efficiently leverage Moore’s Law gains in server-side performance with modern SSD-class storage, like NVMe and 3D XPoint, while at the same time providing the robustness, reliability and cost-effectiveness ($/IOPS and $/GB) of disaggregated storage.

Managing tens of thousands of servers requires automation and analytics, but in datacenters like Google’s and Facebook’s, application servers have no private state (they are stateless), so they can be easily replicated. For them, refreshing the network and servers must be constant, and the old must live with the new without interruption. Web-scale companies have learned that storage capacity has to grow through disaggregation, while still presenting global datacenter storage as if it were local.

 

“On the surface, hyper-convergence and disaggregation appear to be orthogonal to one another, but in my view, disaggregation is simply an evolution of the concept of hyper-convergence. It allows for the creation of more powerful web-scale clouds where massive pools of resources are available to be used and shifted based on need or policy across all applications. With disaggregated computing, utilization rates would soar compared to where they are today — reducing the need for space, power and cooling.”

– Arun Taneja, founder and president at Taneja Group [2]

 

Resource utilization studies of Microsoft Azure and of a large private cloud at Google have also shown that storage capacity and I/O rates are not necessarily correlated [1]. Bryce Canyon is Facebook’s next-generation disaggregated storage platform for Open Rack [3]. Google’s GFS operates with a master metadata server and multiple chunkservers accessed by GFS clients; here too, storage capacity can be disaggregated from application servers, and Google now appears to be looking at disaggregated storage models once again [4].

Your datacenter may not be Google, Amazon or Facebook, but some of their hard-learned lessons apply to the business needs of all organizations.

 

Some key points that made me very interested in Datrium are:

 

  • Better density, scalability, and performance by bringing the newest servers into the cluster. Because only a couple of SSDs per server are required, blades and small-footprint rackmount servers are a perfect fit for expanding the footprint with higher consolidation ratios. The storage stack is managed entirely via software, enabling performance to scale independently from capacity. Every new host adds I/O performance, yet hosts remain stateless and fungible.

 

  • No significant financial modeling exercises are necessary to justify the transition from 3-tier SAN architectures to Open Convergence. Existing hosts may be used as a starting point, and storage capacity averages half the cost/GB of arrays. Datrium also sells its 1U ODM compute node (CN2000) for customers who want an end-to-end, single-throat-to-choke support experience.

 

  • Datrium customers are widely deploying Tier-1 apps, databases and data warehouse workloads, but given the extreme performance and lower cost/GB, it is also perfect for VDI. I will also add that Datrium makes excellent performance and financial sense for Citrix PVS (more on that later).

 

  • Enterprise-ready, with features like VAAI support, erasure coding, always-on dedupe/compression, native replication, cloning, snapshots and novel end-to-end encryption with data reduction to secure data in use, in flight and at rest.

 

  • Customers love the tech. Read this Reddit thread and see for yourself.

 

I still have much to learn about Datrium’s Open Convergence, such as their blanket encryption, Rack Scale systems, and the new Data Cloud management. There are also some product announcements coming soon!

If you are interested in a deep dive, watch this video with CTO and Co-Founder Hugo Patterson and Consulting Technologist Devin Hamilton, or download the technical architecture paper here for an architecture deep dive.

I will remain a relentless blogger (myvirtualcloud.net) and look forward to continuing to share my thoughts, ideas, findings and experiences on my career journey with you.

My new role at Datrium is Field CTO and Vice President of Solutions and Alliances, which will include Partnerships, Verticals, Performance Engineering and “real world” benchmarks. I’m excited about the innovation of the Datrium technology solution and look forward to sharing the next chapter in my journey with you.

I would like to end this post with a big thank you to Nutanix for the extraordinary opportunity and learnings over the past 3 1/2 years. I have evolved into a more broadly experienced professional with a newly discovered passion for building startups.

 

Thank You!

 

References
[1] Flash Storage Disaggregation
http://csl.stanford.edu/~christos/publications/2016.flash.eurosys.pdf
[2] Disaggregation marks an evolution in hyper-convergence
http://searchstorage.techtarget.com/opinion/Disaggregation-marks-an-evolution-in-hyper-convergence
[3] How Facebook Does Storage
https://thenewstack.io/facebook-storage/
[4] Some Food For Thought About Hyper-Converged Infrastructure
https://idc-community.com/groups/it_agenda/infrastructureanddatamanagement/some_food_for_thought_about_hyper_converged_infrastructure
[5] Your Network Is Too Slow For Flash And What To Do About It (Image)
http://longwhiteclouds.com/2016/06/05/your-network-is-too-slow-for-flash-and-what-to-do-about-it/
