Ok, you got here! However, before you start reading and give up in the middle of a long post, let me tell you that what Datrium is announcing today is a generational improvement to existing Disaster Recovery and Hybrid Cloud platforms for many reasons – so it is crucial to understand how we got here.
If you just want to learn about the announcement, go to the section “Welcome, CloudShift” or read the following: “CloudShift delivers a complete but straightforward run-book orchestration for multiple use-cases, and Cloud DR provided as a cloud service. CloudShift enables customers to utilize the data already in the cloud, via Cloud DVX, to failover, failback, instantiate and migrate applications on various clouds, with the first release supporting the VMware Cloud on AWS (VMC).” Here is the Press Release.
Delivering a true Hybrid cloud experience is hard, really hard, and anyone who says that it can be achieved without an integrated data fabric that spans on-prem and public clouds with cost-effective data movement is just fooling themselves. I don’t say that lightly, because a Hybrid cloud is a journey and requires multiple parts of the stack to be perfectly aligned for successful delivery.
Datrium Hybrid Cloud has always been on the roadmap, but to deliver the experience customers desire we had to go through multiple product phases.
We started by building the most scalable and fastest converged platform in the world, one that can run any tier-1 app at any scale – DVX has been validated as 10x faster than the fastest HCI and 5x faster than the fastest AFA, at 1/3 of the latency.
We also knew that at any serious scale data services cannot be optional, so we took a zero-knobs approach that brings simplicity to the platform and, more importantly, enables Datrium to defeat data gravity (more on that later). As part of the DVX platform, we also delivered a native and robust scale-out backup that eliminates air-gaps between traditional primary storage and backup, providing data fidelity across the entire stack.
The 2nd delivery phase was the ability to leverage integrated universal deduplication combined with replication to create availability zones across on-premise and public cloud. As part of the public cloud component, we shipped Cloud DVX, a native and integrated Backup-as-a-Service solution that lives on the cloud and provides cost-effective long-term archiving.
Now comes the 3rd delivery phase of the vision, and we are shipping the tools necessary to leverage Cloud DVX to orchestrate and instrument Disaster Recovery-as-a-Service on public clouds, with the first release supporting VMware Cloud on AWS. Yes, all that with great partnership from VMware.
There’s a lot more to come as part of the platform vision, but let’s now take a step back to understand how all the pieces fit together.
To make Hybrid Cloud work, we need to defeat Data Gravity
Data Gravity is the term used to describe the hypothesis that data, like a planet, has mass, and that applications and services are naturally attracted to it – the same effect gravity has on objects around a planet.
Dave McCrory first coined the term Data Gravity to explain that because of Latency and Throughput constraints applications and services will or should always be executing in proximity to Data — “Latency and Throughput, which act as the accelerators in continuing a stronger and stronger reliance or pull on each other.”
This Data Gravity theory is highly applicable to the evolution of datacenters and clouds. The rise of host-attached flash devices, and the ability to use them over local computing buses rather than over a network, is a clear indication that applications benefit from data proximity.
In the case of data movement between clouds, the real puzzle is how to reduce data to its most essential form: a sequence of bits and bytes that never repeats itself. Also known as data deduplication, this technology has been around for many years, but it has always been applied in a self-contained manner – data is deduplicated within a container, a drive, a host, a cluster, or on the wire. When it comes to application and data mobility, however, we are still constrained by latency and throughput, making data movement hard, particularly when addressing vast data lakes.
If it were possible to de-duplicate application data at a global level – across datacenters, clouds, data lakes, and systems – then we would guarantee a very high level of data availability in every part of the globe, because data becomes ubiquitous and universal.
With Datrium, an application running in a private datacenter has each data block de-duplicated and hashed locally, creating a unique fingerprint; these fingerprints are then compared to hashes available on multiple Cloud DVX deployments. Only the outstanding, unique data is transferred before migrating the application and data from one location to another, in a fraction of the time and bandwidth required by traditional mechanics.
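Conceptually, this fingerprint comparison works like the following sketch. This is a simplified illustration, not Datrium’s actual implementation: the block size, the use of SHA-256, and the function names are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical fixed block size, for illustration only

def fingerprints(data: bytes) -> list:
    """Split data into fixed-size blocks and fingerprint each with SHA-256."""
    return [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]

def blocks_to_send(local_data: bytes, remote_fingerprints: set) -> list:
    """Return only the blocks whose fingerprints the remote site lacks."""
    missing = []
    for i in range(0, len(local_data), BLOCK_SIZE):
        block = local_data[i:i + BLOCK_SIZE]
        fp = hashlib.sha256(block).hexdigest()
        if fp not in remote_fingerprints:
            missing.append((fp, block))
            remote_fingerprints.add(fp)  # don't resend duplicates within this run
    return missing

# Example: the remote site already holds three identical blocks,
# so only the one genuinely new block would cross the wire.
existing = b"A" * BLOCK_SIZE * 3
updated = existing + b"B" * BLOCK_SIZE
remote = set(fingerprints(existing))
print(len(blocks_to_send(updated, remote)))  # -> 1
```

The larger the pool of fingerprints already present at the destination, the fewer blocks survive this filter – which is exactly why a bigger de-duplicated pool means less bandwidth consumed.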
Universal de-duplication makes data ubiquitous and universal, common to every application and system, while metadata takes on a vital role: building datasets, enforcing policies, and governing distribution. The bigger the pool of de-duplicated data available in a given location (AWS, Azure, GCP, on-prem), the less bandwidth is required, because most of the necessary data is already there. Data contention and distribution issues disappear because data is common to all systems.
Unless we defeat data gravity it’s not possible to build a Hybrid Cloud that won’t incur high networking and cloud costs. You can read more about Data Gravity here.
To make Hybrid Cloud work, we need Backup-as-a-Service
Breaking data gravity is not enough. To deliver a reliable Hybrid cloud experience that is also cost-effective, there is a need for a well-thought-out backup engine that continuously protects and archives data in a cloud repository, and also consolidates data from multiple locations – that’s what Cloud DVX delivers.
Cloud DVX is a zero-administration SaaS solution of the overall Datrium DVX platform that lives on the cloud (AWS today but Azure and GCP in the future). As a part of the service offering, Datrium manages the service availability, software upgrades as well as proactive support and self-healing functions.
Cloud DVX is the brains for on-premise DVX instances, and the software is built on the same split provisioning foundation as the on-premise DVX system, enabling massive on-demand scalability of compute and capacity. Furthermore, the same superpowers of the Log-Structured Filesystem (LFS) are behind Cloud DVX.
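For readers unfamiliar with the idea, a log-structured design turns every write into an append and never modifies data in place; an index tracks where the newest version of each object lives. The toy below illustrates only that general principle – it is not DVX’s filesystem.

```python
class TinyLogStore:
    """Toy illustration of a log-structured store: every write appends
    to an immutable log, and an in-memory index maps each key to the
    offset of its newest record, so nothing is overwritten in place."""

    def __init__(self):
        self.log = []    # append-only sequence of (key, value) records
        self.index = {}  # key -> offset of the newest record for that key

    def put(self, key, value):
        self.index[key] = len(self.log)  # point the index at the new record
        self.log.append((key, value))    # append; old records remain intact

    def get(self, key):
        return self.log[self.index[key]][1]

store = TinyLogStore()
store.put("vm-01", "snapshot-a")
store.put("vm-01", "snapshot-b")  # the old record stays in the log
print(store.get("vm-01"))         # -> snapshot-b
print(len(store.log))             # -> 2 records retained
```

Because old records survive in the log, point-in-time snapshots fall out of the design almost for free – one reason log-structured filesystems pair so naturally with backup workloads.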
One of the use-cases for Cloud DVX is Backup-as-a-Service. Traditionally IT organizations provide incremental and differential snapshots and backups of running systems and store an extra copy on secondary storage for quick retrieval (low RTO), and later the same data is archived to tape for long-term retention.
Cloud DVX BaaS delivers native dedupe-aware backup and archival capabilities to AWS. Cloud DVX collapses the long-term archiving tier, traditionally owned by tape vendors, and enables organizations to go to the cloud with an extremely secure, cost-effective and remarkable RTO.
The service is self-managed and supports multi-site, multi-system, and multi-object global de-duplication with full data efficiency and encryption on the wire and at rest. Further, because the service supports end-to-end encryption, there is no need to add a separate VPN and incur the related AWS charges.
In the context of offering the World’s First Hybrid and Multi-Cloud Data Fabric, Cloud DVX is the data repository holding backup and replicated data. For now, this repository is on AWS (under the covers it uses EC2 and S3), but it will also have placement in Azure and GCP, enabling intercommunication between clouds.
You can read more about Cloud DVX here.
Cloud DR and Hybrid Cloud are HARD!
Cloud DR holds promise, but the reality is that technology vendors have not been able to deliver solutions that are simple and cost-effective. At the same time, there’s so much complexity in the datacenter that most DR options end up fragile. Furthermore, inefficient data protection infrastructure forces a choice between low RPO and low cloud costs.
Solutions like VMware SRM are solid and have evolved to handle very complex datacenters, but the underlying complexity of dealing with multiple infrastructure silos, many copies of data, and data transformations makes DR solutions complex and imposes high management overhead.
Below is an example of all the components involved in a traditional DR implementation using legacy technology. What could go wrong?
CloudShift is another element of Cloud DVX that delivers a complete but straightforward run-book orchestration, and Cloud DR provided as a cloud service. CloudShift enables Cloud DVX BaaS customers to use the data already in the cloud to instantiate applications on different clouds, with the first release supporting VMware Cloud on AWS (VMC).
VMC is our first pick because the vast majority of organizations are happy VMware customers and many of them are looking to leverage VMC in multiple ways. Furthermore, with VMC there are no risky VM conversions necessary, making the DR process very simple and integrated.
Before we move ahead, it’s important to highlight that CloudShift Runbook Automation for DR and Workload Mobility provides support for traditional Prem-to-Prem, Prem-to-Cloud and Cloud-to-Prem. Also, Datrium supports VMware SRM and can be orchestrated as part of a broader datacenter runbook automation.
CloudShift is part of the Datrium data fabric and provides on-demand VMware DR to VMC, without even requiring that a VMC SDDC be pre-created in order to kick off a DR plan. In that case, of course, the DR plan may take a couple of hours, because first an SDDC must be created and then data on Cloud DVX must be copied to the SDDC – in such cases, we are looking at a couple of hours of RTO to re-instantiate a completely lost datacenter. Another option is to have an SDDC pre-created, in which case the RTO would be much lower.
Cloud DVX delivers a very low RPO, because it utilizes DVX snaps and forever-incremental replication, while at the same time keeping costs low so that CloudShift can deliver just-in-time DR infrastructure. When compared to synchronous replication or stretched-cluster solutions that require the VMC SDDC to be fully operational, CloudShift becomes attractive, especially when cost and availability are important to organizations.
The picture below demonstrates how legacy data protection architecture works in a DR scenario and how CloudShift is a generational improvement.
The orchestration is the only thing that must be configured by IT teams to make CloudShift work, and as part of this configuration, DR and test plans must be created. Using those plans, IT is able to execute non-disruptive testing and build site-to-site mappings with guest VM re-IP when necessary – CloudShift integrates with the VMware stack, including NSX, for seamless networking migration.
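To make the idea of a plan concrete, here is a minimal sketch of what a run-book plan’s data might look like: a site mapping, a set of protected VMs, and optional re-IP rules. All field and class names here are illustrative assumptions, not the actual CloudShift API.

```python
from dataclasses import dataclass, field

@dataclass
class ReIPRule:
    """Hypothetical guest re-IP mapping applied during failover."""
    source_subnet: str
    target_subnet: str

@dataclass
class DRPlan:
    """Minimal sketch of a run-book plan (illustrative names only)."""
    name: str
    source_site: str
    target_site: str
    vms: list
    reip_rules: list = field(default_factory=list)

    def validate(self) -> bool:
        """Stand-in for continuous plan verification: a plan is runnable
        only if it protects at least one VM and the sites differ."""
        return bool(self.vms) and self.source_site != self.target_site

plan = DRPlan(
    name="tier1-failover",
    source_site="on-prem-dc1",
    target_site="vmc-sddc-east",
    vms=["erp-db", "erp-app"],
    reip_rules=[ReIPRule("10.0.1.0/24", "192.168.10.0/24")],
)
print(plan.validate())  # -> True
```

A real system would check far more (network mappings, capacity, credentials), but the shape is the same: a declarative plan that can be verified continuously and executed non-disruptively for testing.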
Furthermore, the system runs continuous plan verification and compliance checks to ensure everything will work just fine when you need it most. Finally, CloudShift also offers the ability to completely fail back all applications and data to your primary datacenter.
Here are some Beta product screenshots.
We have a rock-solid platform that lives in your datacenter and delivers extreme performance, scalability and native data protection; this platform is natively integrated with an extremely efficient cloud backup and data cloud repository; this data cloud repository is used to deliver seamless disaster recovery to VMC today, and soon, with VM conversion, to multiple clouds. In the near future, these data cloud repositories will communicate with each other, creating a cross-cloud data fabric and forming a global data ledger that provides native global data governance.
- CloudShift Press Release
- CloudShift page:
- CloudShift blog – Brian Biles
- CloudShift: Failproof Hybrid Cloud Disaster Recovery – Sazzala Reddy
- CloudShift Product Brief:
- CloudShift WhitePaper:
This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net