Pushing Virtual Limits


Iomega Ix2 – I have one! Very cool
October 30, 2008, 9:43 pm
Filed under: Uncategorized

So I ended up getting my hands on an Ix2 for a pilot program with our field teams. Just doing the initial configuration now. Super simple – 5 minutes and it is available.

This week I will load some VMs on it, run it over the wire, and configure it to hook up to my 360 (as well as handle the mundane backup duties it is meant for). Everyone should go buy one.



Sun xVM finally entering limited beta
August 8, 2008, 12:53 am
Filed under: Uncategorized

See this post here and then here for details on how to sign up. This is definitely a good thing, because I don’t see VMware spending the time to build a SPARC-compatible VM platform anytime soon.
With a number of SPARC boxes in my infrastructure, I am dying to get my little hands on this and tweak it to my heart’s content. Will it ever rival VMware? I doubt it, but for those of us who desperately need to leverage SPARC-based hardware for platform testing, this could be a godsend.
Maybe someone in the blogosphere will take pity on me and let me in behind the velvet rope to put this through its paces. My contact info is there. Reach out and I will definitely run this stuff until it drops!



Managing a Growing VMware deployment in a Software Development and Testing Environment
July 20, 2008, 3:29 am
Filed under: vmware

Big title, Big problem.

I think anyone here gets the basics and probably has a VCP or some other certification to prove they know what they are doing. On a technical level I have my fair share of challenges right now (a SQL2k to SQL2k5 upgrade of the VC, an ESX host that has decided two of its NICs are dead, and a variety of client issues), but those are pretty straightforward. VMware support has been challenging lately (see my earlier post on how they told me SQL2k was no longer a supported DB backend with VC 2.5 and that we HAD to upgrade), but with the forums and other people out there posting, I don’t think anyone ever hits a truly “unique” technical problem.

What we all hit that is unique is our management structure, our IT structure, and the ever-changing requirements of the security teams. These non-technical obstacles have always proven to be the limiting factor in my deployment, and I doubt I am alone.

Let me set the stage a bit for the discussion that will follow. Right now our environment is working on at least three major new products and providing sustaining support for at least twelve others. Our average machine profile: 1 CPU, less than 2GB of RAM (1GB on average), and under 60GB of total hard disk space. The catch is that we add or remove a dozen or more machines a day and have 100+ users with Virtual Machine Administrator rights.

The tricky part comes in when you take a look at 26 hosts spread across more than five business units. Now that we are fully utilizing DRS, the “I bought this host and therefore it is all mine” mentality becomes a real challenge. If one team has excess capacity, shouldn’t they be part of the solution rather than hoarding an easily reclaimed resource?

At EMC we have this concept of “One|EMC” to try and bring all the acquisitions together. There are good things and bad things about this policy, but I think this is an opportunity to do some real good. In this effort my management team has been very supportive of “lending” our excess capacity to other teams.

My BU owns the hardware, licensed the software, and pays for all upgrades and maintenance. There are a ton of costs associated with this effort, and we have no intention of “charging” for utilizing idle assets (utilizing idle assets being exactly what VMware excels at). What I do need to do is provide “cost visibility” to my management and the business units we work with. To do this we have purchased and are implementing VKernel’s Chargeback Appliance. The plan is to provide scheduled reporting at the following levels:

Deployment Total

Business Unit

Project teams within each BU

(Other reports as necessary)

The great thing is that these reports will be ready at any time, and I can hand a login straight to my management structure so they don’t have to ask me to generate reports for them. We will also go one step further to show just how much we save by buying big iron: we will compare the cost of each system in VMware against the cost of an equivalent physical system. VKernel has provided a great baseline for costing out the big numbers as well as all those little things I just assume will be there (like electricity). Metrics matter, and here they matter more than at most places. We know we have had a great thing going for the past few years, but now I finally have the tools to collect the metrics and show the big guys exactly how much money we are saving.
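For the curious, the rollup math behind these reports is nothing exotic. Here is a back-of-the-napkin sketch in Python of the deployment/BU/project rollup (the rates and VM list are made-up placeholders; VKernel’s appliance does the real data collection and costing):

```python
# Rough sketch of a hierarchical chargeback rollup. The rates and VM
# inventory below are made-up placeholders, not real numbers.

# Hypothetical monthly rates per unit of allocated resource (dollars).
RATES = {"cpu": 25.00, "ram_gb": 15.00, "disk_gb": 0.50}

# (vm name, business unit, project, cpus, ram in GB, disk in GB)
VMS = [
    ("build01", "BU-A", "ProjectX", 1, 1, 40),
    ("test07",  "BU-A", "ProjectY", 1, 2, 60),
    ("demo03",  "BU-B", "ProjectZ", 1, 1, 40),
]

def vm_cost(cpus, ram_gb, disk_gb):
    """Monthly cost of one VM under the placeholder rates."""
    return cpus * RATES["cpu"] + ram_gb * RATES["ram_gb"] + disk_gb * RATES["disk_gb"]

by_bu, by_project = {}, {}
deployment_total = 0.0
for name, bu, project, cpus, ram, disk in VMS:
    cost = vm_cost(cpus, ram, disk)
    deployment_total += cost                      # level 1: deployment total
    by_bu[bu] = by_bu.get(bu, 0.0) + cost         # level 2: business unit
    key = (bu, project)
    by_project[key] = by_project.get(key, 0.0) + cost  # level 3: project team

print("Deployment total: $%.2f" % deployment_total)
for bu, cost in sorted(by_bu.items()):
    print("  %s: $%.2f" % (bu, cost))
for (bu, project), cost in sorted(by_project.items()):
    print("    %s/%s: $%.2f" % (bu, project, cost))
```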



Diane Greene Out – Paul is in!
July 9, 2008, 9:40 pm
Filed under: Uncategorized

Apparently Pi is hot. I am very excited that Paul is going to take over VMware. I hope my colleagues (sort-of EMC’ers) over there will welcome him and embrace his enthusiasm. If he can keep bringing the top-notch effort I have seen coming out of his former division, then I think we have good things coming.

As always, Storagezilla sums it up nicely – “Have you stopped freaking out yet” covers all the bases.



VMware Management Suite Showdown – vCharterPro vs V-Kernel Capacity Bottleneck Analyzer

This review is long overdue. A quick trip around the country to meet with customers and set up a new Virtual Infrastructure Lab put me behind schedule. Mea culpa.

First, I want to comment on the support I received from both Vizioncore and V-Kernel. Glen P from Vizioncore kept them in the running a lot longer than anyone else could have. Kudos to him and that team for trying to work through all the “glitches” we ran into. The whole team at V-Kernel was also very helpful, and successful in diagnosing and resolving the defects I hit.

That being said, I hit major defects in both of these products. Both teams released patches or provided workarounds in short order, but V-Kernel’s ability to quickly adapt and address new problems was definitely a mark in their favor.

In the end we never got vCharterPro working 100% due to some data collection issues. After two months of working with their support team, I had to compare their “sort of working” product against one from V-Kernel that was by then doing everything it promised to do.

When comparing features I found that my deployment was not par for the course. Anyone who has read my previous posts knows my mantra: “Hosts don’t matter!” All my VMs live in the Virtual Machines and Templates view in a heavily nested folder structure. Running reports against business units is a breeze if the application is aware of this folder structure.

vCharterPro had no awareness of the folder structure. It assumed a host- or cluster-based reporting model (kind of silly when you have a huge cluster running DRS with two or three departments sharing it).

V-Kernel has native folder-level awareness. It let me easily create groups for analysis based on the nested folder structure or via the traditional host/cluster method. This flexible group creation was ultimately the winning feature. Having lots of data in bad groups is useless, but if we can get the data into meaningful reports or views, it adds immediate value.
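To make the folder-awareness point concrete, here is a minimal sketch of the kind of tree walk an application has to do to group VMs the way I need. I’ve sketched it against VMware’s pyVmomi Python SDK purely for illustration (the vCenter hostname and credentials are placeholders, and I’m assuming top-level folders map one-to-one to business units, which is how my environment is laid out):

```python
# Minimal sketch: group VMs by their top-level folder under "VMs and Templates".
# Hostname/credentials are placeholders; this is illustrative, not V-Kernel's code.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab-only: skip certificate validation
si = SmartConnect(host="vcenter.example.com", user="admin",
                  pwd="secret", sslContext=ctx)

def collect(folder, group, groups):
    """Recurse through a folder, tagging every VM with its top-level group."""
    for child in folder.childEntity:
        if isinstance(child, vim.Folder):
            collect(child, group, groups)
        elif isinstance(child, vim.VirtualMachine):
            groups.setdefault(group, []).append(child.name)

groups = {}
for dc in si.RetrieveContent().rootFolder.childEntity:
    if isinstance(dc, vim.Datacenter):
        # Top-level folders in the VMs and Templates view = business units here.
        for top in dc.vmFolder.childEntity:
            if isinstance(top, vim.Folder):
                collect(top, top.name, groups)

for bu, vms in sorted(groups.items()):
    print("%s: %d VMs" % (bu, len(vms)))

Disconnect(si)
```

A host- or cluster-only tool simply cannot produce this view when several departments share one DRS cluster; the folder tree is the only place the organizational boundaries actually live.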

Where vCharterPro clearly excelled was in the looks department. Both products bring back roughly the same data (though vCharterPro has a fixation on disk I/O versus actual space used), but vCharterPro presents it in a very pleasing fashion. Utilizing their parent company’s framework, they provide a seriously customizable interface that lets you tweak the dashboard view to be exactly what you want.

V-Kernel CBA has clean looks but it is nothing to get excited about.

As I am sure is now apparent, I went with V-Kernel’s product suite. We chose it because its intelligent grouping worked with my environment as-is rather than requiring me to reorganize everything from scratch. It collected all the data I needed accurately and efficiently (vCharterPro wanted a 4-CPU box with 2GB+ of RAM, versus CBA’s 1 CPU and 1GB of RAM). I really appreciate that V-Kernel ships a ready-made appliance that is easy to deploy and just as easy to upgrade.

In the near future I will start sharing some of the reports I am running and the value I get out of them. I am very interested to see what other people make of this data. We will also be implementing V-Kernel’s ChargeBack product within the next few weeks (pending the next release). At that point I will share some pictures of that as well.

Also – Check out Rob’s blog over at the V-Kernel main site to get an interesting take on a variety of challenges facing the virtualization industry. I promise it is worth at least a quick perusal.



VMware on a CX3-40
June 11, 2008, 5:31 pm
Filed under: Documentation, vmware

I have the good fortune to run VMware on a single CX3-40. Right now I have approximately 30TB of usable disk space. Lots of space is great, but with frequent snapshot usage and the constant resizing of disks in a development/testing/replication lab, I chose to go with smaller LUNs.

How small?

400GB per LUN. 20TB of allocated storage / 400GB LUNs = ~50 LUNs!

I am going to continue with the 400GB LUNs even as I expand out to two additional CX boxes (probably CX4-40c’s) and add another 20TB of storage in two more locations. My concern is that my naming convention is suboptimal.

SAN_1 through SAN_30, versus my local storage naming convention of LOCAL_(first letter of machine name)_(volume number), e.g. LOCAL_Q_1.

I think I will begin naming the LUNs at the child locations SAN_L_1, SAN_T_1, etc. Using the letter of the site in the name of the LUN keeps the friendly names presented in the Datastores view clear. There is nothing I hate more than going to see someone’s infrastructure and finding local(1) through local(26). The Datastores view has serious value if you utilize it correctly.
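For what it’s worth, generating and sanity-checking a scheme like this takes about ten lines of script. A quick Python sketch, using my own site letters and totals as the inputs:

```python
# Sketch of the site-aware LUN naming scheme described above.
# Site letters ("L", "T") are the examples from the post; adjust to taste.

LUN_SIZE_GB = 400
ALLOCATED_TB = 20

# Using 1000 GB/TB to match the post's arithmetic: 20TB / 400GB = 50 LUNs.
luns_per_site = (ALLOCATED_TB * 1000) // LUN_SIZE_GB
print("LUNs per site:", luns_per_site)

def lun_names(site_letter, count):
    """SAN_<site>_<n>, e.g. SAN_L_1 -- keeps the Datastores view readable."""
    return ["SAN_%s_%d" % (site_letter.upper(), n) for n in range(1, count + 1)]

for site in ("L", "T"):
    names = lun_names(site, luns_per_site)
    print(site, "->", names[0], "...", names[-1])
```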

This post over at VMEtc begins to detail one alternative to my design.



Case in point: why clustering (*) is not the best way to deploy eRoom (especially when it is virtualized)
June 9, 2008, 6:45 pm
Filed under: Documentation, vmware

Let’s assume the following default configuration as our existing infrastructure –

3 MSFT 2k3 clusters plus SQL backend clusters. For this discussion the SQL backend just needs to be available; scaling it is a whole different discussion.

This gives us SIX multi-CPU boxes (2 x 2-CPU and 4 x 4-CPU) where only half of them are ever doing anything. Compound that with the fact that these boxes have huge amounts of RAM (in excess of 10GB each), and yet the utility of having that much RAM has been called into question. Microsoft lists the limits here.

If we assume that MSFT Clustering Services are perfect (*cough*) and simple to configure, we still have to deal with the limiting factors of eRoom. eRoom is configured to fail over in response to exactly ONE condition: Deadlock Detection. If someone unplugs the other box the system should still fail over, but eRoom itself will only ever instigate the failover when it sees that one specific problem.

If IIS dies, will eRoom fail over? No.

If erScheduler dies, will we fail over? No.

Etc., etc., etc. (A simple external watchdog, sketched below, is one way to at least catch these gaps.)
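If you have to live with that limitation, an external watchdog is one way to know the moment it happens, or to drive your own failover script. Here is a bare-bones Python sketch; the URL is a placeholder and I am assuming the scheduler runs as a Windows service literally named “erScheduler”, so verify the real service name on your install:

```python
# Bare-bones watchdog: alert when IIS or the scheduler service stops responding.
# eRoom itself will NOT fail over for these, so we have to watch them ourselves.
import subprocess
import time
import urllib.request

EROOM_URL = "http://eroom-node1.example.com/"  # placeholder front-end URL
SERVICE = "erScheduler"                        # assumed name -- verify on your install

def iis_alive():
    """True if the web front end answers an HTTP request within 10 seconds."""
    try:
        urllib.request.urlopen(EROOM_URL, timeout=10)
        return True
    except Exception:
        return False

def service_running(name):
    """'sc query' is standard on Windows; its output contains RUNNING when healthy."""
    out = subprocess.run(["sc", "query", name], capture_output=True, text=True)
    return "RUNNING" in out.stdout

while True:
    if not iis_alive():
        print("ALERT: IIS is not answering -- eRoom will NOT fail over on its own")
    if not service_running(SERVICE):
        print("ALERT: %s is stopped -- eRoom will NOT fail over on its own" % SERVICE)
    time.sleep(60)
```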

When we go to an eRoom advanced configuration using multiple web servers, we take those 3 passive nodes and make them active. This gives us 6 active web servers sharing the load. With the default provisioning, that means that if one of those nodes catches on fire we should expect only ~17% of services to be interrupted (one server in six), compared with 33% in the cluster configuration. In theory the cluster should fail over and minimize the outage without intervention, and it works pretty well.

BUT

If you are willing to take a slightly more manual approach, you can reprovision the Facilities hosted on that server to any of the other servers on the fly. Nothing is stored on the application server, so losing one merely takes that server down without directly impacting the other systems. Reprovisioning can be done without any downtime or impact to the other Facilities.

Now to explain the asterisk. If you are going to run eRoom in a cluster here is how you should do it.

Take our original 6 boxes and install ESX on one of them. On that host, create the B (passive) nodes for the 3 primary clusters. Turn off the spare boxes to save electricity, and run your active/passive configuration with physical hardware for the active nodes while the passive nodes idle as VMs. Physical-to-virtual clusters are the only reasonable way to do an active/passive cluster configuration.

This type of cluster isn’t any “better” than a physical-to-physical model, but it is cheaper to set up and maintain in the long run.