
Wednesday 12 October 2011

ESXi 5.0 Load Balancing Test: Route based on IP hash

ESXi 5.0 offers four load balancing methods. The help describes them as follows:

Route based on the originating port ID: Select an uplink based on the virtual port where the traffic entered the standard switch.

Route based on IP hash: Select an uplink based on a hash of the source and destination IP addresses of each packet. For non-IP packets, whatever is at those offsets is used to compute the hash.

Route based on source MAC hash: Select an uplink based on a hash of the source Ethernet.

Use explicit failover order: Always use the highest order uplink from the list of Active adapters that passes failover detection criteria.

The test below studies what "Route based on IP hash" does and how it performs traffic load balancing.


Below is my test setup

Six Windows XP virtual machines are connected to two port groups on vSwitch1, and three pNICs are connected to vSwitch1.


In the vSwitch NIC Teaming setup, Route based on IP hash is selected as the Load Balancing policy. The three pNICs are listed under Active Adapters in the following order: vmnic1, vmnic2 and vmnic3.



Because IP hash is selected in ESXi, the physical switch must also be configured with EtherChannel to match. The physical switch is set up with 3 physical ports grouped into one EtherChannel port, and IP hash is also selected as the load balancing algorithm for the EtherChannel. Note that LACP is disabled and the EtherChannel mode is set to "on".

One more thing to note: when physical switch ports are configured as an EtherChannel, the way MAC addresses are presented to the physical switch changes.


After ports 2, 3 and 4 are grouped into a single EtherChannel port, the EtherChannel port takes over the MAC address table entries of ports 2, 3 and 4 as a single group. This means the MAC addresses of all 6 test machines appear on the EtherChannel port rather than on the individual physical ports. Once you understand this, you will see why beacon probing does not work with IP hash. To prevent network loops, a physical switch forwards a frame out of other ports but never back out of the port it arrived on. Beacon probing needs to send from one physical NIC's MAC address to another's, but when the ports are grouped as one EtherChannel port, the switch treats them as a single port and never sends the beacon back to the original/source port.


Because IP hash uses the IP address of the virtual machine, I needed to gather the IP addresses of all six virtual machines from ESXi. A quick way to list them is vmware-vimdump.

Because all my test machines are named in the format "TESTXPxx", I ran a "grep" for any text containing "TESTXP" plus the 30 lines below each match. I then filtered a second time to extract just two pieces of information from those 30 lines: "TESTXP" and "ipaddress". Below is the result.
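The two-pass filter described above can be sketched in Python as follows. This is only an illustration: the sample text and its `name = ...` / `ipaddress = ...` layout are made up for the example and are not real vmware-vimdump output, and `extract_vm_ips` is a hypothetical helper name.

```python
def extract_vm_ips(lines, marker="TESTXP", window=30, keys=("TESTXP", "ipaddress")):
    """Keep lines within `window` lines of a `marker` match that contain any of `keys`."""
    keep_until = -1                      # index of the last line inside a context window
    selected = []
    for i, line in enumerate(lines):
        if marker in line:
            keep_until = i + window      # (re)open a window below this match
        if i <= keep_until and any(k in line for k in keys):
            selected.append(line.strip())
    return selected


# Hypothetical sample text; the real dump layout may differ.
sample = """\
name = TESTXP01
guestState = running
ipaddress = 192.0.2.11
name = OTHERVM
ipaddress = 192.0.2.99
""".splitlines()

for row in extract_vm_ips(sample, window=2):
    print(row)
```

A small `window` is used in the demo only so that the unrelated machine falls outside the context; the post itself used 30 lines.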

I am using my NAS box for this test, so the destination IP for every virtual machine is the NAS box. Each XP virtual machine connects to the NAS box and copies one directory with 1GB files down to its C: drive. I then use esxtop ("n" for the network view) to see which vmnic is actually used during the copy.


TestXP01 uses vmnic1 to send traffic

TestXP02 uses vmnic3 to send traffic

TestXP03 uses vmnic2 to send traffic

TestXP04 uses vmnic2 to send traffic

TestXP05 uses vmnic2 to send traffic

TestXP06 uses vmnic3 to send traffic

Note that in the test results above, the vmnic used for RX (receive traffic) may not be the same as the one used for TX (transmit traffic). This is because ESXi controls the vmnic selection only for outgoing traffic; incoming traffic is governed by the physical switch, not ESXi. In this case, IP hash load balancing is also selected on the physical switch, but the switch may calculate its IP hash differently, so the return path can differ.

Based on the VMware support website's description of how the vmnic is selected for IP hash, I worked out the calculation in an Excel spreadsheet (NIC#) and compared the calculated result with the actual result (Actual NIC#) seen in esxtop.




Excel formulas for those who want to create such a spreadsheet.

Source or destination IP in decimal, if the IP address is in C7:
=((VALUE(LEFT(C7, FIND(".", C7)-1)))*256^3)+((VALUE(MID(C7, FIND(".", C7)+1, FIND(".", C7, FIND(".", C7)+1)-FIND(".", C7)-1)))*256^2)+((VALUE(MID(C7, FIND(".", C7, FIND(".", C7)+1)+1, FIND(".", C7, FIND(".", C7, FIND(".", C7)+1)+1)-FIND(".", C7, FIND(".", C7)+1)-1)))*256)+(VALUE(RIGHT(C7, LEN(C7)-FIND(".", C7, FIND(".", C7, FIND(".", C7)+1)+1))))
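For readers who prefer a script to a spreadsheet, the same dotted-decimal to integer conversion can be sketched in Python. The function name `ip_to_int` is my own choice for illustration, and the sample address is the VMKnic address from the KB example quoted later in this post.

```python
def ip_to_int(ip):
    """Convert dotted-decimal IPv4 text to its 32-bit integer value."""
    a, b, c, d = (int(part) for part in ip.split("."))
    # Same weighting as the Excel formula: 256^3, 256^2, 256, 1
    return a * 256**3 + b * 256**2 + c * 256 + d


print(ip_to_int("10.0.0.10"))  # 167772170
```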


XOR result (as an integer), if the source IP is in D7 and the destination IP is in F7:
=SUMPRODUCT(MOD(MOD(INT(D7/(2^(32-ROW(INDIRECT("1:32"))))),2)+MOD(INT(F7/(2^(32-ROW(INDIRECT("1:32"))))),2),2),(2^(32-ROW(INDIRECT("1:32")))))
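The SUMPRODUCT formula works through the 32 bit positions one at a time, because this Excel setup has no direct bitwise XOR. As a rough sketch, the Python below mirrors that bit-by-bit approach and compares it with Python's built-in `^` operator. The helper name `xor_bitwise` is hypothetical, and the two sample values are the KB example addresses 10.0.0.10 and 10.0.0.20.

```python
def xor_bitwise(x, y, bits=32):
    """XOR computed one bit at a time, mirroring the SUMPRODUCT formula."""
    total = 0
    for row in range(1, bits + 1):          # ROW(INDIRECT("1:32"))
        weight = 2 ** (bits - row)          # 2^(32-ROW(...))
        bit_x = (x // weight) % 2           # MOD(INT(D7/weight),2)
        bit_y = (y // weight) % 2           # MOD(INT(F7/weight),2)
        total += ((bit_x + bit_y) % 2) * weight
    return total


src, dst = 0x0A00000A, 0x0A000014           # 10.0.0.10 and 10.0.0.20
print(xor_bitwise(src, dst), src ^ dst)     # 30 30
```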


Remainder, where G7 is the XOR result and C4 is the number of NICs:
=MOD(G7,$C$4)


NIC#: the mod result starts at 0 for the first NIC, so always add 1
Take Remainder + 1
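The remainder and NIC# steps together amount to one line of arithmetic. A minimal sketch, assuming the XOR result has already been computed (`nic_number` is a name I made up for the example; 0x1E and 0x1C are the XOR results from the KB example):

```python
def nic_number(xor_result, nic_count):
    """Return the 1-based NIC# for a given XOR result and team size."""
    return xor_result % nic_count + 1


print(nic_number(0x1E, 2))  # 1
print(nic_number(0x1C, 3))  # 2
```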

Source/destination IP in hex, if the IP address is in C7:
=DEC2HEX(((VALUE(LEFT(C7, FIND(".", C7)-1)))*256^3)+((VALUE(MID(C7, FIND(".", C7)+1, FIND(".", C7, FIND(".", C7)+1)-FIND(".", C7)-1)))*256^2)+((VALUE(MID(C7, FIND(".", C7, FIND(".", C7)+1)+1, FIND(".", C7, FIND(".", C7, FIND(".", C7)+1)+1)-FIND(".", C7, FIND(".", C7)+1)-1)))*256)+(VALUE(RIGHT(C7, LEN(C7)-FIND(".", C7, FIND(".", C7, FIND(".", C7)+1)+1)))))
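The hex form is handy for checking values against the KB article. A short Python sketch of the same conversion, with `ip_to_hex` as a hypothetical helper name:

```python
def ip_to_hex(ip):
    """Return the IPv4 address as uppercase hex without leading zeros, like DEC2HEX."""
    value = 0
    for part in ip.split("."):
        value = value * 256 + int(part)     # shift previous octets up by one byte
    return format(value, "X")


print(ip_to_hex("10.0.0.20"))  # A000014
```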

Findings from the Test Result

The VMware Knowledge Base article "Troubleshooting IP-Hash outbound NIC selection"
 http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1007371 
states the following:

To convert the NFS IP addresses to Hex:
  1. Use any online IP Hex Converter tool to convert the IP addresses to Hex.

    This is an example we used a vSwitch with 2 uplinks, EtherChannel and IP-Hash:

    Links = 2 (0 and 1)
    VMKnic 10.0.0.10 = 0xa00000a
    NFS1 10.0.0.20 = 0xa000014
    NFS2 10.0.0.22 = 0xa000016
  2. Use the following IP-Hash formula to calculate the outbound uplink:

    VMKnic > NFS1 (0xa00000a Xor 0xa000014 =1E) % 2= 0
    VMKnic > NFS2 (0xa00000a Xor 0xa000016 =1C) % 2= 0
    1. On any scientific calculator, select Hex and Qword.
    2. Enter the VMKnic IP in HEX format (a00000a) and click Xor.
    3. Enter NFS1 IP in HEX format (a000014) and click =.
    4. Press Mod, press 2 for the number of uplinks, then click =. The result is 0.
    5. Repeat steps 1-4, using the NFS2 IP (a000016) in step 3. The result is also 0.

      IP-Hash chooses the first uplink in the team because they both have result of 0.
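
The KB arithmetic above can be checked without a calculator. The snippet below is a small sketch that uses only the values quoted from the KB; the function name `kb_uplink` is my own.

```python
def kb_uplink(src, dst, links):
    """Uplink index per the KB formula: (src XOR dst) mod number of links."""
    return (src ^ dst) % links


links = 2
vmknic = 0xA00000A                                  # 10.0.0.10
targets = {"NFS1": 0xA000014, "NFS2": 0xA000016}    # 10.0.0.20 and 10.0.0.22

for name, addr in targets.items():
    print(name, format(vmknic ^ addr, "X"), kb_uplink(vmknic, addr, links))
# NFS1 1E 0
# NFS2 1C 0
```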

Based on the XOR logic in the KB, I created the Excel spreadsheet formulas and entered my test machines' IP addresses. I then compared the calculated results with esxtop. To my surprise, half of the calculated results differed from the actual esxtop results. After rechecking the calculations, I realised that the VMware IP hash appears to use only the last 8 bits for the XOR rather than the full 32 bits. This means either the KB is wrong or VMware changed the IP hash calculation in ESXi 5.0. In any case, 32 bits or 8 bits does not make much difference, as the result is still that 1 of the 3 NICs carries the traffic out of ESXi.
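
To see how the two variants can disagree, here is a hedged Python sketch that places the 32-bit calculation from the KB next to the 8-bit behaviour I observed. The function names and the two 192.168.x.x addresses are hypothetical and chosen only for illustration; they are not my test addresses, and the 8-bit version models my observation rather than any documented VMware algorithm.

```python
def to_int(ip):
    """Dotted-decimal IPv4 text to a 32-bit integer."""
    a, b, c, d = map(int, ip.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def uplink_32bit(src, dst, nics):
    """KB-style selection: XOR of the full addresses, mod team size."""
    return (to_int(src) ^ to_int(dst)) % nics


def uplink_8bit(src, dst, nics):
    """Observed behaviour (assumption): XOR of the last octets only, mod team size."""
    return ((to_int(src) ^ to_int(dst)) & 0xFF) % nics


# Hypothetical addresses whose third octets differ
print(uplink_32bit("192.168.1.11", "192.168.0.50", 3))  # 1
print(uplink_8bit("192.168.1.11", "192.168.0.50", 3))   # 0
```

When source and destination differ only in the last octet, the upper 24 bits XOR to zero and both variants return the same index.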

In this test, the order of the vmnics in the Active Adapters list also matters if you want to predict the result. I reordered the vmnics from 3 to 1 in the Active Adapters list, and esxtop followed the same logic: NIC# 1 now goes to vmnic3.
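
The effect of the adapter order can be sketched as a simple list lookup. This assumes, as my test suggests, that the hash result is used as an index into the Active Adapters list; `pick_vmnic` is a hypothetical name and 0x1E is the XOR result from the KB example.

```python
def pick_vmnic(xor_result, active_adapters):
    """Map the hash result onto the Active Adapters list, in list order."""
    index = xor_result % len(active_adapters)   # 0 selects the first adapter in the list
    return active_adapters[index]


print(pick_vmnic(0x1E, ["vmnic1", "vmnic2", "vmnic3"]))  # vmnic1
print(pick_vmnic(0x1E, ["vmnic3", "vmnic2", "vmnic1"]))  # vmnic3
```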

5 comments:

  1. more to come...
    Thomas Low

  2. Have you read this great document from Frank Denneman: http://frankdenneman.nl/2009/11/nfs-and-ip-hash-loadbalancing/

    I have used it to calculate load balancing with NFS and NetApp storage.

  3. Replies
    1. Here, calculations are done by ESXi Host, and ESXi host reads the IP. On vSphere Standard Switch, Route Based on IP Hash is configured as Load Balancing technique.

  4. Excellent, great job making things understandable.
