|
The host with the master role periodically reports state to vCenter. The slaves learn that the master is alive through heartbeats. Each slave monitors the state of its locally running VMs and transmits any changes to the master. The slaves also send heartbeats to the master, and if the master fails, a re-election process takes place. vCenter knows when a new master has been elected, because the new master contacts vCenter once the re-election process has finished.
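The slave side of this exchange can be sketched as a simple timeout check. This is only an illustration, not FDM's actual implementation: the class name `MasterWatch`, its method names, and the 10-second timeout are all assumptions made for the example.

```python
import time


class MasterWatch:
    """Hypothetical slave-side tracker for master heartbeats (illustration only)."""

    def __init__(self, timeout_s=10.0, clock=time.monotonic):
        # timeout_s is an assumed value, not a documented FDM setting.
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_seen = clock()

    def on_master_heartbeat(self):
        # Record the arrival time of the latest heartbeat from the master.
        self.last_seen = self.clock()

    def master_alive(self):
        # The master counts as alive while its last heartbeat is recent enough.
        return (self.clock() - self.last_seen) <= self.timeout_s

    def should_start_reelection(self):
        # A slave would trigger a re-election once the master has gone silent.
        return not self.master_alive()
```

A slave would call `on_master_heartbeat()` whenever a heartbeat arrives and poll `should_start_reelection()` on a timer. The injectable `clock` is there only to make the sketch easy to test.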
The secondary channel through datastores is known as Heartbeat Datastores. This secondary channel is not used in normal situations, only when the primary network goes down. It lets the master stay aware of all slave hosts and of the VMs running on those hosts. Through the heartbeat datastores the master can also determine whether a host has really failed (for example with a PSOD) or has merely become isolated or network partitioned. HA prefers at least 2 shared datastores for each ESXi host. You can enable it with just one shared datastore, but you will get a warning message on the host's front page in the VI Client.
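The decision the master makes from these two channels can be summarised in a few lines. The sketch below is a simplification under stated assumptions: the names `HostState` and `classify_host` are hypothetical, and the real agent considers more inputs than two booleans.

```python
from enum import Enum


class HostState(Enum):
    LIVE = "live"
    ISOLATED_OR_PARTITIONED = "isolated or partitioned"
    FAILED = "failed"


def classify_host(network_heartbeat_ok: bool, datastore_heartbeat_ok: bool) -> HostState:
    """Simplified view of how a master might classify a slave host."""
    if network_heartbeat_ok:
        # Primary channel works, so the datastore channel is not consulted.
        return HostState.LIVE
    if datastore_heartbeat_ok:
        # No network heartbeat, but the host still updates its datastore
        # heartbeat: it is running, just unreachable over the network.
        return HostState.ISOLATED_OR_PARTITIONED
    # Neither channel shows signs of life: treat the host as failed.
    return HostState.FAILED
```

For example, `classify_host(False, True)` returns `HostState.ISOLATED_OR_PARTITIONED`, which is the case where restarting the VMs elsewhere would be premature.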
How does this heartbeating mechanism work? HA leverages the existing VMFS filesystem locking mechanism. The locking mechanism uses a so-called “heartbeat region”, which is updated as long as the lock on a file exists. In order to update a datastore heartbeat region, a host needs to have at least one open file on the volume. HA ensures there is at least one file open on this volume by creating a file specifically for datastore heartbeating. In other words, a per-host file is created on the designated heartbeating datastores, as shown in the screenshot below. HA then simply checks whether the heartbeat region has been updated.
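The "has the heartbeat region been updated?" check amounts to comparing two successive reads. The following is a rough sketch of that idea only: `HeartbeatRegionCheck` and the `read_region` callable are hypothetical stand-ins, and nothing here reflects the actual VMFS on-disk format.

```python
class HeartbeatRegionCheck:
    """Hypothetical checker: did a host's heartbeat region change since the last look?"""

    def __init__(self, read_region):
        # read_region is an assumed callable that returns the current contents
        # of one host's heartbeat region (any comparable value will do).
        self.read_region = read_region
        self.previous = read_region()

    def updated_since_last_check(self):
        # Compare the current contents with the previous read, then remember
        # the current contents for the next comparison.
        current = self.read_region()
        changed = current != self.previous
        self.previous = current
        return changed
```

A host that still holds its lock keeps changing the region, so successive checks return `True`. A host that has stopped leaves the region untouched and the check returns `False`.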
So in my opinion, for this to work you need to have your iSCSI and NAS network physically separated from the ESXi management network that carries the heartbeats. FC networks are separate by default, of course.
If an ESXi host goes down while a re-election process is under way, the VMs on that host will be restarted as soon as the new master is elected.
When a geo-dispersed cluster is split into two sites by a link failure, each “partition” gets its own master. Still, only one master communicates with vCenter, so the data shown by vCenter might not be 100% accurate.
FDM works with vCenter 5 and ESXi 4 hosts, and it replaces the AAM agent on ESXi 4 hosts. The HA concept is completely different from ESX 4 and before. For instance, enabling HA on 32 nodes takes only a minute or so, because the HA agent is pushed out in parallel rather than serially as in 4.1 and prior.
One more important thing: HA no longer uses DNS, which means there is no dependency on DNS or hosts files.
|