Pushing Virtual Limits

Case in point why clustering (*) is not the best way to deploy eRoom (especially when it is virtualized)
June 9, 2008, 6:45 pm
Filed under: Documentation, vmware

Let’s assume the following default configuration as our existing infrastructure –

3 MSFT 2k3 clusters + SQL back-end clusters. For this discussion the SQL back end just needs to be available; scaling it is a whole different discussion.

This gives us SIX multi-CPU boxes (2 x 2-CPU and 4 x 4-CPU), of which only half are ever doing anything. Compound that with the fact that these boxes have huge amounts of RAM (in excess of 10GB), and yet the usefulness of that much RAM has been called into question. Microsoft lists the limits here.

If we assume that MSFT Clustering Services are perfect (*cough*) and simple to configure, we still have to deal with the limiting factors of eRoom. eRoom is configured to fail over in response to only ONE condition: Deadlock Detection. Now, if someone unplugs the other box the system should fail over, but eRoom itself will only ever instigate the failover when it sees that one specific problem.

If IIS dies will eRoom failover? No

If erScheduler dies will we failover? No

etc etc etc

When we go to an eRoom advanced configuration using multiple web servers, we take those 3 passive nodes and make them active. This gives us 6 active web servers to share the load. Using the default provisioning, that means that if one of those nodes catches on fire we should expect only ~16.7% (one sixth) of services to be interrupted, compared with 33% in the cluster configuration. In theory the cluster should fail over and minimize the outage without intervention, and it works pretty well.
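The arithmetic behind those percentages can be sketched in a few lines. This is only an illustration, assuming the default provisioning spreads facilities evenly across the active nodes; the function name `share_interrupted` is made up for this example and is not part of eRoom.

```python
def share_interrupted(active_nodes: int, failed_nodes: int = 1) -> float:
    """Fraction of services hit when `failed_nodes` of `active_nodes` die.

    Assumption (not from eRoom itself): facilities are spread evenly
    across the active nodes, so each node carries an equal share.
    """
    if active_nodes <= 0:
        raise ValueError("need at least one active node")
    if not 0 <= failed_nodes <= active_nodes:
        raise ValueError("failed_nodes must be between 0 and active_nodes")
    return failed_nodes / active_nodes


# 3 active/passive clusters: only 3 of the 6 boxes serve traffic.
cluster = share_interrupted(active_nodes=3)

# Multiple web servers: all 6 boxes serve traffic.
multi_web = share_interrupted(active_nodes=6)

print(f"cluster:   {cluster:.1%}")    # 33.3%
print(f"multi web: {multi_web:.1%}")  # 16.7%
```

In the cluster case the 33% is the share affected until the failover completes; in the multi web server case the ~16.7% stays down until the facilities on the dead node are reprovisioned.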


If you are willing to take a slightly more manual approach you can reprovision the Facilities hosted on that server to any of the other servers on the fly. Nothing is stored on the application server so losing one merely causes that server to go down without directly impacting the other systems. Reprovisioning can be done without any downtime or impact to the other facilities.

Now to explain the asterisk. If you are going to run eRoom in a cluster here is how you should do it.

Take our original 6 boxes and install ESX on one of them. On that host, create the B nodes for the 3 primary clusters as VMs. Turn off the spare hardware to save the electricity, and run your Active/Passive configuration with physical hardware for the active nodes while the passive nodes idle as VMs. Physical-to-virtual clusters are the only reasonable way to do an active/passive cluster configuration.

This type of cluster isn’t any “better” than a physical/physical model, but it is cheaper in the long run to set up and maintain.