Thursday, January 27, 2011

Failover Cluster for use with Hyper-V (requirements)

I want to present a checklist of what you need to consider if you intend to create a Failover Cluster for Hyper-V.

The key word is common components: the nodes should be built from the same parts. If you want a successful implementation, you had better make sure that every component is supported.

Server Hardware:
I've created some clusters by now, and since I've been part of the planning process as well, I have made it clear from the start that the servers in a cluster should have the exact same configuration and components. This is really basic. You may have cluster nodes with different CPUs, but they should at least come from the same manufacturer. We'll come back to that later.
Remember that the whole idea of a High Availability solution is that if one node fails, the workloads should fail over to a second (or a third) node. That brings us to the RAM. Install enough RAM on every node so that the VMs on node 1 can also run on node 2. You should even size the RAM so that each host is able to run the entire workload on its own, especially if you're planning an active-active cluster.
So let's assume that the CPUs are identical and so is the amount of RAM. What about the firmware, BIOS settings, NICs, and storage? If you are not familiar with the Cluster Validation wizard, you should spend some time with it. It will pick up on every little detail that may affect the cluster's stability and point out whatever you have configured wrong, or at least tell you what it suggests doing about the errors and warnings.
One great thing about Cluster Validation is that it will not block the creation of the cluster even if there are not ‘enough’ NICs installed. This is nice if you intend to use the cluster for testing/training.
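
The RAM sizing rule above can be sketched as a small calculation. This is only a rough illustration, not a sizing tool: the function name, the example numbers, and the 2 GB reserved for the host operating system are all assumptions of mine, and the check only compares totals, so it does not verify that each individual VM fits on a particular surviving node.

```python
def can_survive_node_failure(node_ram_gb, vm_ram_gb, host_reserve_gb=2):
    """Return True if all VMs still fit in RAM after one node is lost.

    node_ram_gb     -- installed RAM per cluster node, in GB
    vm_ram_gb       -- RAM assigned to each VM, in GB
    host_reserve_gb -- RAM kept back for the host OS (assumed value)
    """
    if len(node_ram_gb) < 2:
        # A single node has nowhere to fail over to.
        return False
    usable = [ram - host_reserve_gb for ram in node_ram_gb]
    # Worst case: the node with the most usable RAM is the one that fails.
    surviving_capacity = sum(usable) - max(usable)
    return sum(vm_ram_gb) <= surviving_capacity


# Two identical 32 GB nodes, 24 GB of VMs in total: one node can carry it all.
print(can_survive_node_failure([32, 32], [8, 8, 4, 4]))
# Same nodes, 32 GB of VMs: the surviving node would be overcommitted.
print(can_survive_node_failure([32, 32], [16, 16]))
```

In a two-node cluster this boils down to the rule in the text: every VM has to fit on whichever single node is left.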

Network:
The network is at the core of everything today, and clusters are no exception: communication between the nodes, iSCSI to storage, host management, VM traffic, and so on. Best practice is to dedicate at least one NIC to each of these purposes. This ensures adequate performance, security, availability and stability. One example is Live Migration, which relies on a good network configuration and should have a dedicated network of its own.
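
If you keep your NIC plan in a simple list, the one-NIC-per-purpose rule is easy to check. The sketch below is only an illustration; the role names and NIC labels are made up, and it does not talk to Hyper-V or Windows in any way.

```python
from collections import defaultdict


def shared_nics(role_to_nic):
    """Return {nic: [roles]} for every NIC assigned to more than one role."""
    by_nic = defaultdict(list)
    for role, nic in role_to_nic.items():
        by_nic[nic].append(role)
    return {nic: sorted(roles) for nic, roles in by_nic.items() if len(roles) > 1}


# Hypothetical plan for one node: VM traffic is sharing the management NIC.
plan = {
    "management": "NIC1",
    "live_migration": "NIC2",
    "iscsi": "NIC3",
    "cluster_heartbeat": "NIC4",
    "vm_traffic": "NIC1",
}
print(shared_nics(plan))
```

An empty result means every purpose in the plan has its own NIC.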

CPU:
In an ideal world the CPUs would be identical. That is not always possible, but they should at least come from the same manufacturer.
CPUs differ in the way they manage memory and in which instructions they make available. If you do not have identical CPUs, you must enable ‘Migrate to a physical computer with a different processor version’ in the CPU settings of the VM.

Storage (SAN):
If you want HA and a Failover Cluster, you must have some sort of shared storage, and the storage provider must support iSCSI or FC. In addition, the storage must support SCSI-3 Persistent Reservations, the commands that control disk arbitration. The focus is quite often on the hosts, with their CPUs and RAM, but a key to a successful Hyper-V implementation is well-configured storage. A VM is nothing but a set of files on a disk, and it may be very I/O intensive. And since the VMs are located on shared storage that may be connected through iSCSI, the network throughput must be adequate.

Quorums and the voting:
For your cluster to act as a cluster, there must be some mechanism that detects the failure of a node and determines the health of the cluster as a whole. For this we have majorities, quorums, and voting.
It's actually explained very well here: http://technet.microsoft.com/en-us/library/cc731739.aspx
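
The basic idea behind majority voting can be shown with a couple of lines of arithmetic. This is a simplified sketch of the general majority rule, not the cluster service's actual implementation: the function names are my own, a "vote" here can be a node or a witness (disk or file share), and the disk-only quorum mode is not modelled.

```python
def has_quorum(total_votes, online_votes):
    """A cluster keeps quorum while more than half of all votes are online."""
    return online_votes > total_votes // 2


def tolerated_failures(total_votes):
    """How many votes can be lost before the majority is gone."""
    return (total_votes - 1) // 2


# Compare a few cluster sizes (votes = nodes plus an optional witness).
for votes in (2, 3, 4, 5):
    print(votes, "votes can lose", tolerated_failures(votes))
```

Note that four votes tolerate no more failures than three, which is why a witness is normally added when the number of nodes is even.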

Operating Systems:
One thing to note is that you can't mix Server Core and Full installations of Windows Server Enterprise/Datacenter in the same cluster. You can, however, mix Enterprise and Datacenter (the validation will give a warning).
If your budget is low and you are familiar with Hyper-V, you could also use the free Hyper-V Server 2008 R2 to build a cluster. It runs the Windows kernel and is based on the Enterprise/Datacenter editions, which support Failover Clustering.
