Pushing Virtual Limits


VMware Management Suite Showdown - vCharterPro vs V-Kernel Capacity Bottleneck Analyzer

This review is long overdue. A quick trip around the country to meet with customers and set up a new Virtual Infrastructure Lab has put me behind schedule. Mea culpa.

First, I want to comment on the support I received from both Vizioncore and V-Kernel. Glen P from Vizioncore kept them in the running a lot longer than anyone else could have. Kudos to him and that team for trying to work through all the “glitches” we ran into. The whole team at V-Kernel was also very helpful and successful in diagnosing and resolving the defects I hit.

That being said, I hit major defects with both of these products. Both teams released patches or provided workarounds in short order, but the ability of V-Kernel to quickly adapt and address new problems was definitely a positive mark for them.

In the end we never got vCharterPro working 100% due to some data collection issues. After two months of working with support and their team I had to compare their “sort of” product against one from V-Kernel that was now doing everything it promised it would do.

When comparing features I found that my deployment was not par for the course. Anyone who has read my previous posts understands that my mantra is “Hosts Don’t Matter!” All my VMs live in the Virtual Machines and Templates view in a heavily nested folder structure. Running reports against business units is a breeze if the application is aware of this folder structure.

vCharterPro had no awareness of the folder structure. It assumed a Host or Cluster based reporting model (kind of silly when you have a huge Cluster running DRS and two or three departments sharing it).

V-Kernel has a native folder level awareness. They let me easily create groups for analysis based on the nested folder structure or via the traditional host/cluster method. This flexible group creation was ultimately the winning feature. Having lots of data in bad groups is useless but if we can get the data into meaningful reports or views it adds immediate value.
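To make the folder-versus-host point concrete, here is a minimal Python sketch of grouping VMs by their nested folder path. It is not how either product works internally; the function name, the folder names, and the VM names are all made up for illustration.

```python
from collections import defaultdict

def group_by_folder(vm_paths, depth=1):
    """Group VM names by the first `depth` levels of their folder path.

    Each entry in vm_paths is a hypothetical inventory path such as
    "Finance/Payroll/pay-db01", where the last element is the VM name.
    """
    groups = defaultdict(list)
    for path in vm_paths:
        parts = path.strip("/").split("/")
        folders, vm_name = parts[:-1], parts[-1]
        # VMs that sit outside any folder land in a "(root)" group.
        key = "/".join(folders[:depth]) or "(root)"
        groups[key].append(vm_name)
    return dict(groups)

# Hypothetical inventory: two departments that could share one DRS cluster.
inventory = [
    "Finance/Payroll/pay-db01",
    "Finance/AP/ap-web01",
    "Engineering/Build/build01",
]
print(group_by_folder(inventory))
```

Grouping by folder keeps each business unit together no matter which host DRS happens to place a VM on, which is the whole point of “Hosts Don’t Matter!”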

Where vCharterPro clearly excelled was in the looks department. Both products are bringing back roughly the same data (vCharterPro has a fixation on disk I/O vs actual space used) but vCharterPro presents it in a very pleasing fashion. Utilizing their parent company’s framework, they provide a seriously customizable interface that lets you tweak the dashboard view to be exactly what you want.

V-Kernel CBA has clean looks but it is nothing to get excited about.

As I am sure is now apparent, I went with V-Kernel’s product suite. We chose it because it had intelligent grouping that worked with my environment as is rather than requiring me to reorganize everything from scratch. It collected all the data I needed accurately and efficiently (vCharterPro wanted a 4 CPU box with 2GB+ RAM versus CBA, which needs 1 CPU with 1GB RAM). I really appreciate that V-Kernel is going with a ready-made appliance that is easy to deploy and just as easy to upgrade.

In the near future I will start sharing some of the reports that I am running and the value add I get out of them. I am very interested to see what other people see from this data. We will also be implementing V-Kernel’s ChargeBack product within the next few weeks (pending the next release). At that point I will share some pictures of that.

Also - Check out Rob’s blog over at the V-Kernel main site to get an interesting take on a variety of challenges facing the virtualization industry. I promise it is worth at least a quick perusal.



Fort Collins and a week of lab construction
June 24, 2008, 9:25 pm
Filed under: Uncategorized

Safe in Fort Collins, Colorado getting ready to rack up a new CX3 box and a group of ESXi servers. Crossing my fingers that everything works according to plan.

Friday night to Orlando then Monday night to Denver with a drive to FC and then back home Saturday night. It is going to be a very long week but good stuff will get completed and that is what matters.



New Mozy Error
June 14, 2008, 2:55 am
Filed under: Uncategorized

If you have read my earlier posts you will know I am a big fan of Mozy. I think it is a great idea and I have an actual account (not just the free ones). My problem is that I am more than a month out of sync with Mozy because of stupid errors.

The latest error is - DB_ERROR[1]: no such table: restores

To find out what errors are causing your Mozy to wander off in the wrong direction, check \program files\mozyhome\data. There is a file in there called mozy. Open it in TextPad and see what this little program has been up to.
I am at my wits’ end now and am thinking of canning the whole approach entirely. After multiple uninstalls and reinstalls the Mozy service won’t start at all now. I get “cannot find the file specified” when I try to start it.
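If scrolling through that file by hand gets old, a few lines of Python can pull out the error entries. This is only a sketch: I am assuming that errors always look like the one line quoted above (a NAME_ERROR tag, a number in brackets, then a message), and the function name is my own.

```python
import re

# Assumed format, based only on the single line quoted above:
#   DB_ERROR[1]: no such table: restores
ERROR_PATTERN = re.compile(r"\b([A-Z]+_ERROR)\[(\d+)\]:\s*(.*)")

def find_errors(lines):
    """Return (error_type, code, message) tuples for lines that look like errors."""
    hits = []
    for line in lines:
        match = ERROR_PATTERN.search(line)
        if match:
            hits.append((match.group(1), int(match.group(2)), match.group(3).strip()))
    return hits

# Made-up log lines for demonstration; only the DB_ERROR line comes from my real log.
sample = [
    "starting backup",
    "DB_ERROR[1]: no such table: restores",
]
print(find_errors(sample))
```

To run it against the real log, pass an open file handle instead of the sample list, since a file object yields its lines one at a time.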

For the record I don’t mind this kind of additional effort when getting FreeNAS to work or building a Gentoo box. I do mind it when it is my backup solution and my data is at risk because it fails to function. Another continuing concern is what all these threats of charging for bandwidth used will do to cloud based computing. I don’t think the model can sustain the ISPs siphoning off revenue. Just my two cents.



Magellan on Youtube
June 13, 2008, 5:20 pm
Filed under: Magellan

Youtube posting of Magellan

These are the latest builds being demoed on Youtube. Unfortunately I can’t go into too much more detail on this upcoming release other than giving pointers to what product management has already released into the wild.

Take a look - it is very cool stuff!

*Edit - OK so it is just a rehash of EMC World. I will see what I can do to beg some fresh kibble for us all.



VMware on a CX3-40
June 11, 2008, 5:31 pm
Filed under: Documentation, vmware

I have the good fortune to run VMware on a single CX3-40. Right now I have approximately 30TB of usable disk space. Lots of space is great but with the frequent snapshot usage and the constant resizing of disks in a development/testing/replication lab I chose to go with smaller LUNs.

How small?

400GB per LUN. 20TB of allocated storage / 400GB LUNs = ~50 LUNs!
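For anyone who wants to check the arithmetic with their own numbers, here is a minimal Python sketch. The function name is mine, and whether you count 1000 or 1024 GB per TB is an assumption you have to pick for yourself.

```python
import math

def lun_count(allocated_tb, lun_size_gb, gb_per_tb=1000):
    """Number of fixed-size LUNs needed to hold the allocated storage."""
    return math.ceil(allocated_tb * gb_per_tb / lun_size_gb)

print(lun_count(20, 400))        # decimal units (1 TB = 1000 GB)
print(lun_count(20, 400, 1024))  # binary units (1 TB = 1024 GB)
```

With decimal units this comes out to exactly 50. With binary units it is 51.2, which rounds up to 52, so “~50” is the honest way to say it.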

I am going to continue with the 400GB LUNs even as I expand out to two additional CX boxes (Probably CX4-40c’s) and add another 20TB of storage in two more locations. My concern is that my naming convention is sub optimal.

SAN_1 through SAN_30 vs my local storage naming convention LOCAL_(first letter of machine name)_(Volume) <Local_Q_1>

I think I will begin naming the child locations SAN_L_1, SAN_T_1, etc. Using the letter of the site in the name of the LUN keeps the friendly names presented in the datastores view clear. I hate nothing more than when I go and see someone’s infrastructure and they have local(1)-local(26). The datastores view has serious value if you utilize it correctly.
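The naming scheme is simple enough to generate rather than type. A quick Python sketch, assuming the patterns described above; the function names and the machine name "quark" are made up for the example.

```python
def san_names(site_letter, count):
    """Datastore names for one site, e.g. SAN_L_1, SAN_L_2, ..."""
    return [f"SAN_{site_letter.upper()}_{i}" for i in range(1, count + 1)]

def local_name(machine_name, volume):
    """Local datastore name: LOCAL_<first letter of machine name>_<volume>."""
    return f"LOCAL_{machine_name[0].upper()}_{volume}"

print(san_names("L", 3))
print(san_names("T", 2))
print(local_name("quark", 1))  # "quark" is a hypothetical machine name
```

Generating the names this way also makes it obvious when two sites would collide on the same first letter, which is the one weakness of a single-letter site code.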

This post begins to detail one alternative plan to my design - VMEtc



Mozy state.dat part2
June 9, 2008, 6:51 pm
Filed under: Uncategorized

This problem has come up again to break my Mozy. I did four installations and uninstallations, removed all the registry keys, etc.
Somehow, someway, this state.dat stayed in my system and whacked me. The only thing I missed was the Windows temp directory. I believe there must be remnants of the program there which cause this file to be recreated during the new installation.



Case in point why clustering (*) is not the best way to deploy eRoom (especially when it is virtualized)
June 9, 2008, 6:45 pm
Filed under: Documentation, vmware

Let’s assume the following default configuration as our existing infrastructure -

3 MSFT 2k3 clusters + SQL backend clusters. For this discussion the SQL back end just needs to be available and is a whole different scaling discussion.

This gives us SIX multi-CPU boxes (2 x 2 CPU and 4 x 4 CPU) where only half of them are ever doing anything. Compound that with the fact that these boxes have huge amounts of RAM (in excess of 10GB), and yet the utility of having that much RAM has been called into question. Microsoft lists the limits here.

If we assume that MSFT Clustering Services are perfect (*cough*) and simple to configure, we still have to deal with the limiting factors of eRoom. eRoom is configured to fail over as a result of ONE condition - Deadlock Detection. Now if someone unplugs the other box the system should fail over, but eRoom will only ever instigate the failover when it sees that one specific problem.

If IIS dies will eRoom failover? No

If erScheduler dies will we failover? No

etc etc etc

When we go to an eRoom Advanced configuration using multiple web servers, we take those 3 passive nodes and make them active. This gives us 6 active web servers to share the load. Using the default provisioning, that means that if one of those nodes catches on fire we should expect only ~16.7% (one sixth) of services to be interrupted, compared with 33% in the cluster configuration. In theory the cluster should fail over and minimize the outage without intervention, and it works pretty well.
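The outage math is just one over the number of active servers. Here is a tiny Python sketch, assuming the Facilities are spread evenly across the active nodes (which is my simplification, and the function name is mine as well).

```python
def outage_share(active_servers):
    """Percent of services hit when one of N evenly loaded active servers dies."""
    return 100.0 / active_servers

print(round(outage_share(6), 1))  # six active web servers: roughly one sixth
print(round(outage_share(3), 1))  # three active cluster nodes: one third
```

Every active server you add shrinks the blast radius of a single failure, which a passive node never does.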

BUT

If you are willing to take a slightly more manual approach you can reprovision the Facilities hosted on that server to any of the other servers on the fly. Nothing is stored on the application server so losing one merely causes that server to go down without directly impacting the other systems. Reprovisioning can be done without any downtime or impact to the other facilities.

Now to explain the asterisk. If you are going to run eRoom in a cluster here is how you should do it.

Take our original 6 boxes and install ESX on one of them. On that host create B nodes for the 3 primary clusters. Turn off the spare box to save the electricity and run your Active/Passive configuration with physical hardware for the active node, and let the passive nodes idle on VMs. Physical to Virtual Clusters are the only reasonable way to do an active/passive cluster configuration.

This type of cluster isn’t any “better” than a physical/physical model but it is cheaper in the long run to set up and maintain.



Running Documentum eRoom 7 in VMware - Notes and tricks
June 9, 2008, 1:41 pm
Filed under: Documentation, vmware

Frequently I get asked about how to deploy eRoom in a VMware infrastructure. Some people don’t even know that we fully support VMware (I blame this on the fact that VMware changed its logo to remove the “an EMC Company” subscript).

eRoom is a fantastic application to run in a VM. We fully support VMware ESX 2.5-3.5 (2.5 support may fall off the chart soon - see EMC Support Note ESG25111 for the most current supportability matrix).

The problem I have with this basic eRoom v7.4 design and implementation is that so much depends on load and use cases. Let’s assume 1,000 licensed users and 2 TB of data. Here is how I would design and deploy this environment:

5 total VMs

2 - Application VMs, 1 CPU, 2GB RAM, 20GB HDD - no reservations on RAM or CPU - Server 2003 SP2

1 - Index server/File Server - 4 vCPU, 2GB ram, 2 HDDs (OS on one and file share/indexing data on the second) - There are a few reasons for the multiple CPU count. It should always be N+1 where N = the number of application servers. The primary reason for multiple vCPUs is a VMware defect with their hardware acceleration feature which can cause a problem with the indexing engine.

1 - IRM server - Same as the application server above

1 - SQL 2005 DB Vm (Build per MSFT spec)

This system plan allows the greatest flexibility through the eRoom Advanced feature set. I will write about provisioning at a later date but it is still one of the most impressive features within eRoom.

Given a choice between eRoom in a cluster and multiple eRoom servers there should be no question in the choice to use multiple eRoom servers rather than a cluster. That deserves a post all for itself but the short answer is that eRoom in a cluster provides an active/passive configuration whereas eRoom advanced with multiple web servers provides active/active/active/etc configurations which allow you to truly scale your installation.



Chi.mp - Cool name - Cool Service?
June 3, 2008, 2:26 am
Filed under: Uncategorized

Chi.mp has a very cool name, so I figured it was worth signing up for. Supposedly they are going to rework the way we think of our presence online. Anyone who is interested in checking out what could be the next big thing should take a look.

I signed up and we will see what it brings in the future.



Really interesting article on a Las Vegas Company - Switch
May 25, 2008, 12:03 am
Filed under: Documentation