More Network Performance and Resiliency without LACP, at least with Datrium Adaptive Pathing

Isn’t it awesome when a function that once required specialized hardware gets implemented in software? Software Is Eating the World, and that’s the case with the new Adaptive Pathing.

A new Adaptive Pathing feature introduced with DVX 2.0 improves aggregate storage bandwidth and reliability without adding the usual LACP (Link Aggregation Control Protocol) management overhead and complexity.

 

What is link aggregation, LACP?

In a nutshell, it allows one to aggregate multiple network connections in parallel to increase throughput beyond what a single connection could sustain, and to provide redundancy in case one link goes down. LACP is a vendor-independent standard that stands for Link Aggregation Control Protocol, defined in IEEE 802.1AX (formerly 802.3ad). LACP links need to be manually configured on the physical network switch to allow both links to appear as one logical aggregated link. MAC address(es) from the host side can then appear on both links simultaneously without the switch freaking out and thinking there’s a loop on the network. – Wen Yu

 

Datrium DVX is split between hosts and data nodes, and disaggregates performance from capacity (for an understanding of the DVX architecture read this). This approach means that, unlike traditional storage arrays, DVX has complete control over both ends of the communication stack: hosts and data nodes.

 

Performance

DVX manages and routes storage traffic at a higher layer in the networking stack rather than relying on a network link-level protocol. DVX data nodes actively heartbeat hosts using each of the network interfaces and can detect which interfaces have good connectivity to hosts. (Each data node has two controllers in active/passive mode, and each controller presents two network interfaces.)
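The heartbeat idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not Datrium's code: the `HeartbeatTable` class, the interface names, and the 3-second timeout are all assumptions made for the example.

```python
import time


class HeartbeatTable:
    """Track the last heartbeat reply seen per (host, interface) pair."""

    def __init__(self, timeout_s=3.0):
        # timeout_s is an assumed value, not a documented DVX setting.
        self.timeout_s = timeout_s
        self.last_seen = {}

    def record_reply(self, host, interface, now=None):
        """Note that `host` answered a heartbeat sent from `interface`."""
        stamp = time.monotonic() if now is None else now
        self.last_seen[(host, interface)] = stamp

    def healthy_interfaces(self, host, now=None):
        """Return interfaces whose latest reply from `host` is recent enough."""
        now = time.monotonic() if now is None else now
        return sorted(
            nic
            for (h, nic), seen in self.last_seen.items()
            if h == host and now - seen <= self.timeout_s
        )


table = HeartbeatTable(timeout_s=3.0)
table.record_reply("host-a", "nic0", now=10.0)
table.record_reply("host-a", "nic1", now=5.0)  # last reply is old
healthy_at_12 = table.healthy_interfaces("host-a", now=12.0)
print(healthy_at_12)
```

With explicit timestamps the behavior is easy to follow: at time 12.0 only `nic0` has answered within the timeout, so only that interface is considered healthy for `host-a`.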

 

When a host has functioning network paths to both interfaces on the data node’s active controller, DVX automatically spreads host traffic across the data node controller interfaces, increasing the available bandwidth across all hosts in the cluster, for a total of 20Gb of throughput per data node.

 

Benefit: Improved network bandwidth

The performance increase depends on IO patterns. However, internal FIO tests with three hosts and 64KB block IO demonstrated up to 65% performance improvement for sequential write workloads and a 53% improvement for random write workloads. For small 4KB random IO, the expected improvement is lower due to the nature of the workload.

The picture below perhaps makes it easier to understand how Adaptive Pathing improves network bandwidth, ensuring that both network interfaces on active controllers are utilized simultaneously.

 

Note that Adaptive Pathing is not Multi-Pathing: an individual host is still limited to the bandwidth of an active interface in the network team/bond. Adaptive Pathing eliminates network bonding of the data interfaces on the controller. It spreads the data traffic from the hosts across the data node controller interfaces, which results in greater aggregate bandwidth to the controllers, but it does not increase the bandwidth available to an individual host.
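A bit of arithmetic makes the distinction between aggregate and per-host bandwidth clearer. The sketch below assumes 10Gb per controller interface (inferred from the 20Gb total across two interfaces) and a simple proportional-share model when an interface is oversubscribed. The function name `delivered_gbps` and the model itself are illustrative assumptions, not a description of how DVX schedules traffic.

```python
LINK_GBPS = 10.0  # assumed per-interface speed (two interfaces give 20Gb)


def delivered_gbps(demand, assignment, link_gbps=LINK_GBPS):
    """Return (per-host, aggregate) throughput under a proportional-share model.

    demand:     host -> Gb/s the host would like to push
    assignment: host -> the single controller interface it uses
    """
    per_host = {}
    for nic in set(assignment.values()):
        hosts = [h for h in assignment if assignment[h] == nic]
        # A host can never exceed the speed of the one interface it uses.
        wanted = sum(min(demand[h], link_gbps) for h in hosts)
        # If the interface is oversubscribed, scale every host down equally.
        scale = min(1.0, link_gbps / wanted) if wanted else 0.0
        for h in hosts:
            per_host[h] = min(demand[h], link_gbps) * scale
    return per_host, sum(per_host.values())


demand = {"host-a": 10.0, "host-b": 10.0}

# Both hosts forced onto one interface: they share 10Gb.
_, one_nic_total = delivered_gbps(demand, {"host-a": "nic0", "host-b": "nic0"})

# Hosts spread across both interfaces: aggregate doubles.
_, two_nic_total = delivered_gbps(demand, {"host-a": "nic0", "host-b": "nic1"})

# A single greedy host still tops out at one interface's speed.
solo, _ = delivered_gbps({"host-a": 20.0}, {"host-a": "nic0"})

print(one_nic_total, two_nic_total, solo["host-a"])
```

Under these assumptions the aggregate goes from 10Gb to 20Gb once the hosts are spread out, while a single host wanting 20Gb still gets only 10Gb.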

 

Reliability

DVX data nodes do not route storage traffic through the passive controller’s interfaces, but they do use those interfaces to monitor connectivity to all compute hosts and network interfaces.

If a host is only able to talk to one of the network interfaces on the active data node controller, the DVX software automatically routes all host traffic to the functional interface. Likewise, if a data node network interface can communicate with some of the hosts, but not others, each host will communicate with the most appropriate controller based on its unique connectivity status.

The intelligence is built into the DVX software stack, and at any point in time compute hosts may be using different paths to transport data, even during network connectivity failures, always automatically choosing the best network path.

If a DVX data node detects that the passive controller can talk to all hosts that have connectivity to the active controller, plus additional hosts that the active controller cannot reach, an automatic failover is triggered to increase host-controller connectivity.
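One way to picture per-host path selection is a small assignment routine that takes each host's reachable interfaces and picks one. The sketch below is an assumption-laden illustration: `assign_interfaces`, the least-loaded policy, and the host and interface names are invented for the example and do not come from the DVX software.

```python
from collections import Counter


def assign_interfaces(reachable):
    """Assign each host to one reachable interface, balancing host counts.

    `reachable` maps host name -> set of active-controller interfaces the
    host can currently reach (hypothetical input, e.g. from heartbeats).
    """
    load = Counter()
    assignment = {}
    # Place hosts with the fewest options first, then by name for stability.
    for host in sorted(reachable, key=lambda h: (len(reachable[h]), h)):
        options = sorted(reachable[host])
        if not options:
            assignment[host] = None  # no working path to the active controller
            continue
        # Pick the least-loaded interface; ties break on interface name.
        choice = min(options, key=lambda nic: (load[nic], nic))
        assignment[host] = choice
        load[choice] += 1
    return assignment


paths = assign_interfaces({
    "host-a": {"nic0", "nic1"},
    "host-b": {"nic0", "nic1"},
    "host-c": {"nic1"},  # can only reach one interface
    "host-d": set(),     # cannot reach the active controller at all
})
print(paths)
```

Here `host-c` is pinned to the only interface it can reach, the fully connected hosts fill in around it, and `host-d` is left unassigned, which is the kind of situation the failover logic described below is meant to address.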
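The failover condition described above amounts to a strict-superset test on two sets of hosts. A minimal sketch, assuming the reachable hosts for each controller are already known (the function name `should_fail_over` is hypothetical):

```python
def should_fail_over(active_hosts, passive_hosts):
    """Return True when the passive controller reaches every host the
    active controller reaches, plus at least one more."""
    # `<` on sets is the strict-subset test.
    return set(active_hosts) < set(passive_hosts)


gains_a_host = should_fail_over({"host-a", "host-b"}, {"host-a", "host-b", "host-c"})
loses_a_host = should_fail_over({"host-a", "host-b"}, {"host-a", "host-c"})
no_change = should_fail_over({"host-a", "host-b"}, {"host-a", "host-b"})
print(gains_a_host, loses_a_host, no_change)
```

Only the first case triggers a failover: the passive controller keeps every existing host and adds one. If a failover would drop any currently connected host, or would change nothing, the rule says to stay put.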

 

Monitoring

The DVX platform fully understands the network connectivity and topology, and the GUI provides administrators with insights into networking issues and connectivity statuses such as redundant or degraded components, hosts and data nodes.
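As a rough illustration of how such statuses could be derived, the sketch below classifies a host by how many healthy paths it has. The labels "redundant" and "degraded" come from the text above; the "disconnected" label, the `path_status` function, and the default of two expected paths are assumptions for the example.

```python
def path_status(healthy_paths, expected_paths=2):
    """Classify connectivity from the number of healthy paths."""
    if healthy_paths <= 0:
        return "disconnected"  # assumed label, not taken from the DVX GUI
    if healthy_paths < expected_paths:
        return "degraded"
    return "redundant"


report = {
    host: path_status(count)
    for host, count in {"host-a": 2, "host-b": 1, "host-c": 0}.items()
}
print(report)
```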

Not only will DVX route around network issues when they occur, but it also makes it easier to resolve those issues and return the network to full redundancy.

 

Benefit: Continuous network monitoring

 

 

Conclusion

Simple and Easy! Just make sure that all network interfaces are properly cabled, add two new IP addresses, and Adaptive Pathing is enabled. Datrium DVX 2.0 Adaptive Pathing enables customers to get more bandwidth, improved network resiliency and increased availability with zero management overhead.

 

 

References:
To LACP or NOT to LACP?


This article was first published by Andre Leibovici (@andreleibovici) at myvirtualcloud.net.

 
