Beacon Probing Deep-Dive
A few weeks ago, during one of my on-site training sessions, a student asked about beacon probing. How exactly does the technology work, and which scenarios can it cover? Moreover, will beacon probing work with only two physical network adapters?
I found this question interesting because, from the configuration perspective, beacon probing can be enabled with a single option, and the best-practice guide recommends using it. However, how does this technology work internally?
Let’s have a look at the following network topology:
Beacon probing overview
Beacon probing is an advanced failure detection mechanism that can be configured on the vSwitch. The default configuration relies on link status only, but there are situations in which traffic does not pass through the network even though the physical links on the ESXi side are up and connected to the upstream switch.
The following options are available for Network failure detection:
- Link status only: Relies only on the link status that the network adapter provides. This option detects failures such as removed cables and physical switch power failures.
- Beacon probing: Sends out and listens for beacon probes on all NICs in the team, and uses this information, in addition to link status, to determine link failure. ESXi sends beacon packets every second. The NICs must be in an active/active or active/standby configuration because NICs in an unused state do not participate in beacon probing.
Let’s have a look at the following diagram:
When the link status only option is selected, from the ESXi perspective the link is up, and it is used to forward the traffic, but the traffic will never reach its destination because the connection between the upstream switch and the core switch is broken.
When beacon probing is used, the physical NIC is considered down, because the beacon probing mechanism tells the ESXi hypervisor that the link cannot be used to forward traffic.
Beacon probing mechanism
When you enable beacon probing, each physical NIC connected to a vSwitch (or port group) that has Network failure detection set to Beacon probing starts sending specific beacons over the network.
Every NIC sends a specific Ethernet frame to all other physical NICs. If the destination NIC does not receive this beacon frame, it is considered "down". A common misconception is that these frames are sent to a broadcast address so that all NICs receive them.
That is not true. The frames are sent between source and destination NICs (MAC addresses).
ESXi with two NICs
If you have three or more NICs, the mechanism is pretty straightforward, but what happens if you have only two NICs?
In this situation, you cannot determine whether the problem is caused by the sending NIC or the receiving NIC.
A fallback mechanism is used in this situation. Since ESXi does not know which NIC is affected (it only knows that one of them is), it simply starts duplicating the frames on both NICs.
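To make the difference between three NICs and two NICs concrete, here is a minimal Python sketch. It is a simplified model of the reasoning described above, not VMware's actual implementation: the function name `classify_uplinks` and the input format (a set of sender/receiver pairs whose beacon arrived) are assumptions made purely for illustration.

```python
from itertools import permutations


def classify_uplinks(nics, delivered):
    """Simplified, hypothetical model of beacon-based failure detection.

    nics      -- list of uplink names in the team
    delivered -- set of (sender, receiver) pairs whose beacon arrived
    Returns (failed, ambiguous): `failed` is the set of NICs that can be
    blamed; `ambiguous` is True when beacons are lost but no single NIC
    can be singled out (the two-NIC case).
    """
    expected = set(permutations(nics, 2))
    lost = expected - set(delivered)
    if not lost:
        return set(), False

    failed = set()
    for nic in nics:
        # At least one beacon between the *other* NICs must have arrived,
        # otherwise nothing proves that they are healthy.
        others_ok = any(nic not in pair for pair in delivered)
        # Every lost beacon must involve this NIC for it to take the blame.
        all_lost_involve_nic = all(nic in pair for pair in lost)
        if others_ok and all_lost_involve_nic:
            failed.add(nic)
    return failed, not failed


# Three NICs, vmnic8 is cut off: vmnic6 and vmnic7 still hear each other.
three = ["vmnic6", "vmnic7", "vmnic8"]
print(classify_uplinks(three, {("vmnic6", "vmnic7"), ("vmnic7", "vmnic6")}))
# -> ({'vmnic8'}, False)

# Two NICs, no beacon arrives: impossible to tell which side is broken.
two = ["vmnic6", "vmnic7"]
print(classify_uplinks(two, set()))
# -> (set(), True)
```

With three uplinks, the beacons still exchanged between the two healthy NICs single out the third one. With two uplinks there is no such witness, which is why ESXi has to fall back to sending the frames on both NICs.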
Detailed frame examination
I think it is essential to understand the content of the frame, so let’s have a look at the frame structure itself:
You can see that the source of the frame is the MAC address 00:50:56:5d:f7:cb and the destination MAC address is 00:50:56:53:ad:07.
Those MAC addresses should look familiar: they look like the addresses of a virtual machine or a VMkernel interface, not like the MAC address of a physical interface.
Let’s check on the ESXi level:
    [root@esxi-prod-1:~] esxcfg-nics -l |grep -i vmnic6
    vmnic6 0000:1c:00.0 nvmxnet3 Up 10000Mbps Full 00:50:56:a9:d6:95 1500 VMware Inc. vmxnet3 Virtual Ethernet Controller
    [root@esxi-prod-1:~] esxcfg-nics -l |grep -i vmnic7
    vmnic7 0000:05:00.0 nvmxnet3 Up 10000Mbps Full 00:50:56:a9:d8:af 1500 VMware Inc. vmxnet3 Virtual Ethernet Controller
    [root@esxi-prod-1:~] esxcfg-nics -l |grep -i vmnic8
    vmnic8 0000:15:00.0 nvmxnet3 Up 10000Mbps Full 00:50:56:a9:db:41 1500 VMware Inc. vmxnet3 Virtual Ethernet Controller
    [root@esxi-prod-1:~] esxcfg-nics -l |grep -i vmnic9
    vmnic9 0000:0d:00.0 nvmxnet3 Up 10000Mbps Full 00:50:56:a9:98:12 1500 VMware Inc. vmxnet3 Virtual Ethernet Controller
Don't be confused by the MAC addresses in my example: my lab is based on nested ESXi servers, whereas on a physical server you would see the MAC addresses of the physical NICs themselves.
However, the frames are coming from different MAC addresses. Where do those come from? Each physical NIC has its own virtual MAC address that is used for beacon probing.
    [root@esxi-prod-1:~] vsish -e get /net/pNics/vmnic6/virtualAddr
    Virtual MAC Address {
       Virtual MAC address:00:50:56:53:ad:07
    }
    [root@esxi-prod-1:~] vsish -e get /net/pNics/vmnic7/virtualAddr
    Virtual MAC Address {
       Virtual MAC address:00:50:56:5d:f7:cb
    }
    [root@esxi-prod-1:~] vsish -e get /net/pNics/vmnic8/virtualAddr
    Virtual MAC Address {
       Virtual MAC address:00:50:56:55:d4:e7
    }
    [root@esxi-prod-1:~] vsish -e get /net/pNics/vmnic9/virtualAddr
    Virtual MAC Address {
       Virtual MAC address:00:50:56:5c:c6:8d
    }
So now we know which source and destination MAC addresses are used.
Let’s have a look at the raw frame:
The first six octets are the destination MAC address, followed by another six octets for the source MAC address:
The next two octets identify the EtherType. For the beacon frames, VMware decided to use the custom type 0x8922, as shown in the Ethernet frame:
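The header layout can be checked with a few lines of Python. This is only a sketch: in a standard Ethernet II header the destination address comes first, then the source address, then the EtherType. The 14 header bytes below are rebuilt by hand from the addresses and the EtherType quoted in this article (they are not copied from a real capture, and the beacon payload is omitted), and the helper names are hypothetical.

```python
import struct

BEACON_ETHERTYPE = 0x8922  # EtherType quoted in this article

# Virtual MAC -> uplink, taken from the vsish output shown earlier.
VIRTUAL_MACS = {
    "00:50:56:53:ad:07": "vmnic6",
    "00:50:56:5d:f7:cb": "vmnic7",
    "00:50:56:55:d4:e7": "vmnic8",
    "00:50:56:5c:c6:8d": "vmnic9",
}


def format_mac(raw):
    """Render six raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_ethernet_header(frame):
    """Return (destination MAC, source MAC, EtherType) of an Ethernet II frame."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return format_mac(dst), format_mac(src), ethertype


# Hand-built header: destination, source, EtherType (no payload).
header = bytes.fromhex("00505653ad07 0050565df7cb 8922")

dst, src, ethertype = parse_ethernet_header(header)
print(f"destination {dst} ({VIRTUAL_MACS.get(dst, 'unknown')})")
print(f"source      {src} ({VIRTUAL_MACS.get(src, 'unknown')})")
print(f"beacon frame: {ethertype == BEACON_ETHERTYPE}")
```

Looking the addresses up in the virtual MAC table immediately tells you which uplinks the frame travelled between.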
There are several interesting values in the frame content as well.
UUID of the ESXi host:
You can check against your ESXi host using esxcfg-info:
    [root@esxi-prod-1:~] esxcfg-info |grep "System UUID"
    |----World Command Line.................................grep System UUID
    |----System UUID..............................................5c94a57e-bc3e-e397-d446-005056a9e83a
Source and destination uplink ports (vmnics) used to send and receive the frame:
To determine the vmnic name, convert the PortNum of the uplink to hexadecimal format and compare it with the value in the frame. The PortNum can be obtained with the net-stats command, as shown in the following output:
    [root@esxi-prod-1:~] net-stats -l
    PortNum    Type  SubType  SwitchName    MACAddress         ClientName
    100663298  4     0        DvsPortset-0  00:50:56:a9:d8:af  vmnic7
    100663300  4     0        DvsPortset-0  00:50:56:a9:d6:95  vmnic6
    100663303  4     0        DvsPortset-0  00:50:56:a9:98:12  vmnic9
    100663305  4     0        DvsPortset-0  00:50:56:a9:db:41  vmnic8
Decimal to hexadecimal conversion:
- 100663298 = 0x6000002 = vmnic7
- 100663300 = 0x6000004 = vmnic6
- 100663303 = 0x6000007 = vmnic9
- 100663305 = 0x6000009 = vmnic8
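If you do not want to convert the values by hand, the conversion above can be reproduced with a short Python sketch. The port numbers are copied from the net-stats output; the names `PORTS`, `port_to_hex` and `nic_for_hex` are made up for this example.

```python
# PortNum -> client name, copied from the net-stats -l output above.
PORTS = {
    100663298: "vmnic7",
    100663300: "vmnic6",
    100663303: "vmnic9",
    100663305: "vmnic8",
}


def port_to_hex(port_num):
    """Return the port number as a hexadecimal string without the 0x prefix."""
    return format(port_num, "x")


def nic_for_hex(hex_value):
    """Map a hexadecimal port value (as seen in the frame) back to a vmnic."""
    return PORTS.get(int(hex_value, 16), "unknown")


for port_num, nic in PORTS.items():
    print(f"{port_num} = 0x{port_to_hex(port_num)} = {nic}")

print(nic_for_hex("6000004"))  # -> vmnic6
```

The reverse lookup is the more practical direction: take the hexadecimal value from the captured frame and map it straight to the uplink name.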
So we can see that the beacon was exchanged between vmnic6 and vmnic7, which we can cross-check against the virtual MAC addresses as well:
- vmnic6 = 00:50:56:53:ad:07
- vmnic7 = 00:50:56:5d:f7:cb
Conclusion
I hope that after reading this article you understand both the benefits of beacon probing and the mechanism behind it.
I strongly suggest enabling beacon probing in your environment, since it can noticeably increase network availability, depending on your physical network design.
Feel free to check official documentation as well:
Thank you very much for sharing this
Thank you very much
Hmm. So does that mean that if we have two NICs for a vSwitch, we should not enable beacon probing?
I read online, and some articles suggest using 3 or more NICs.
My ESXi servers have 9 NICs, but of course the other NICs are used for other services like vMotion, NFS etc. The MGMT network, however, uses 2 NICs on a standard switch and has “link state only” configured; we had issues when the core switches were rebooted. I wanted to enable beacon probing, but with the conflicting information I am unable to decide.
It will be great if you can clarify.
Thank you
Ven