Today, VMware made their official announcement of vSphere, the next generation of virtualization technology in their flagship ESX line. There has been a lot of coverage of vSphere online in the weeks leading up to today’s big announcement. To me, the most interesting and most sought-after information is the answer to one simple question: what will this upgrade cost me as a current VMware customer with an active support agreement?
Philip Sellers
Phil is a Solutions Architect at XenTegra, based in Charlotte, NC, with over 20 years of industry experience. He loves solving complex technology problems by breaking down the impossible into tangible achievements, and by thinking two or three steps ahead of where we are today. He has spent most of his career as an infrastructure technologist in hands-on roles, even while serving as a leader. His perspective is one of servant leadership, working with his team to build something greater collectively. Having been lucky to have many opportunities and doorways opened during his career, Phil has a diverse background as a programmer, a writer, and an analyst, all while staying grounded in infrastructure technology.
A co-worker, Jamie, loves watching presentations from TED.com. TED.com bills itself as the site with “Ideas worth spreading,” many of which were presented at the TED conference. I’ve seen some wonderful presentations there, and he’s shared many of those wonderful ones with me.
This week, he sent me a link to a presentation from an MIT lab led by Pattie Maes. This presentation shows off a computer with an interface much like that seen in the movie Minority Report. This is pretty exciting technology. I’d venture to say this may be as game changing as the iPhone has been to cell phones and the current generation of portable computers. Although I’m not sure about the “Sixth Sense” billing that she gives it, I think the practical application of this is very cool.
Our journey to VDI continues. This week, we had several more milestones on our road to delivering this solution into the far-reaching corners of our area. Among those accomplishments were implementing the DHCP options successfully, setting up several pools of linked-clone virtual desktops, and wider testing of the Pano devices and their capabilities. (This is only an incremental update – see my prior post.)
The bane of a sysadmin’s existence is documentation. Most of us hate doing the tedious paperwork, but doing so helps the group around you, and many times yourself, once you’ve moved on to new projects. I know it’s a struggle for me and my co-workers.
Part of the problem is that documentation tends to get outdated. Keeping your notes updated as changes are made is tough. Old documentation is sometimes worse than no documentation… It’s sometimes better to get inside and dig around to see how things are actually working and set up.
Last night, we undertook replacing the mid-plane in the blade enclosure that had problems last month. It was the first recommendation from HP support after several hours of working with various teams, but neither our internal team nor our HP field service guys felt it was the cause. Turns out, we may have been very wrong in our initial hesitance, but I’m getting ahead of myself.
After a month of continued support and working the case, we escalated the case to a level where an engineer reviewed all the steps, troubleshooting, and case information to ensure nothing had been missed and to help diagnose the issue. He came back to the original conclusion – that after all else was eliminated, the mid-plane must be the culprit. So, we scheduled the replacement.
The actual hardware replacement went smoothly and took less than an hour to complete. The mid-plane is a lot bulkier than I expected when I first saw it. It is a single piece of hardware with interconnects on both sides that connect blades to interconnect bays, power sources to power consumers, and the LCD display to the logic. But I guess I was surprised that it was a good 2 to 3 inches thick. In my mind, I expected a single piece of copper sitting in the middle – yes, I realize now that’s stupid.
Something interesting occurred after replacing the mid-plane. Apparently, Virtual Connect did not see the system serial number that it expected, and so it reverted to its default configuration. So, a word of advice to anyone replacing a mid-plane: leave your VC modules ejected so that you don’t lose your domain configuration. From talking with support, Virtual Connect needs constant communication with the OA to function (another dependency we were not aware of). The enclosure serial number reported by the OA is also very important to VC. It’s part of the configuration file for VC. It all makes logical sense, but it was not spelled out in the support document detailing the mid-plane replacement.
After unsuccessfully attempting to restore my backup for the Virtual Connect domain, I opted to build it from scratch by hand. It took about an hour and a half to do for my 5 blades, but I feel better about it. I am still worried about not being able to restore my VC domain configuration, but I attribute that problem to the hardware replacement.
I may have already mentioned that one of our projects for the year is to transition our corporate ESX cluster from 2U hardware onto blades. The move to the blade architecture does not come without some concerns and caveats. We feel that blades are a good fit in our case for this particular cluster (we run several ESX clusters). Our VMware View deployment is our first production ESX workload on blade hardware. We have learned a few things from this deployment that might be helpful.
Normally, a firmware release wouldn’t warrant a post on here, but this case is a little different. Earlier this week, Apple introduced newer models of the AirPort Extreme broadband router/switch and the backup appliance version, Time Capsule. Along with that announcement came several enhancements that most figured would be released for previous hardware. These features are enabled in firmware release 7.4.1.
The update does not bring guest networking or simultaneous dual-band networking to the older hardware, but the MobileMe integration (Back to My Mac) is included for older hardware (like mine). This gives you secure access to your shared disks from far, far away. I installed the firmware last night and set up MobileMe integration. Broke out the Mac here at work and voilà, it’s working to home… Now that’s cool. I knew technically that it wouldn’t be a stretch for them to extend the BtMM functionality.
For anyone in my local area, I wanted to make sure to let you know that the Wilmington, NC, area has a newly formed VMware Users Group. Their inaugural meeting will be March 25 at the North Carolina State Port Authority. The Port Authority will be presenting a case study of their VMware View (VDI) deployment. A senior sales engineer will also be presenting updates from VMworld Europe.
I’m very excited that a more local group is available to me and my co-workers and I look forward to participating as often as possible. For more information about this user group (or one in your area if you’re not in Myrtle Beach or Wilmington), go to: http://www.vmware.com/communities/content/vmug/ and click on events or local groups.
My boss, a co-worker and I attended the Carolina’s Summit in Charlotte, NC, last May and we were very impressed by the event. I received another announcement that this event will be held again this year on May 29. This is an all-day event with several sessions – a mini-VMworld, if you will. I found last year’s event to be very informative and helpful. Like last year, Mike Laverick with RTFM Education will be the guest speaker.
My co-worker, Jason, was tasked with our VMware View implementation, and I’m glad to report that it’s been largely successful and, more importantly, easy to deploy. I thought that now was a good time to reflect and share with you how we came to the decision to deploy virtual desktops, where we plan to use them, and what components we have implemented.
I can’t believe it, but it’s been almost a month since my last post. And what a month it’s been around my work. This has been one of the busiest and most difficult months that I can remember with the company. I have my hands in several different technologies; VMware and our blades are just two of my primary responsibilities. Over the past month, though, we’ve experienced a catastrophic failure of one of our blade enclosures. The failure has only occurred once, but the fall-out from this has taken almost a month to work out. And honestly, we’re still not through working out the kinks.
Of course, my story has to begin on Friday the 13th… Sometime around 9:00am, we started getting calls for both our SQL 2005 database cluster and our Exchange cluster. After investigation, we found that the active nodes were both in the same enclosure and a third ESX host in the same enclosure was experiencing problems, too. The problems were affecting both network and disk IO on the blades. All of our blades boot from SAN, so the disk IO problem had to be a Fibre Channel issue.
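Having the active nodes of two clusters in one enclosure is the kind of thing a quick script can flag ahead of time. Here is a minimal sketch, assuming you keep a simple inventory of which enclosure each active node lives in. The cluster, node, and enclosure names below are made up for illustration (including the third "file" cluster), and nothing here talks to HP or VMware tooling.

```python
from collections import defaultdict

# Hypothetical inventory: cluster name -> (active node, enclosure).
# All names are invented for this example, not real hosts.
ACTIVE_NODES = {
    "sql2005": ("sql-node1", "enc-a"),
    "exchange": ("exch-node1", "enc-a"),
    "file": ("file-node2", "enc-b"),
}


def shared_enclosures(active_nodes):
    """Return {enclosure: [clusters]} for enclosures hosting the active
    node of more than one cluster."""
    by_enclosure = defaultdict(list)
    for cluster, (_node, enclosure) in active_nodes.items():
        by_enclosure[enclosure].append(cluster)
    # Keep only enclosures where a single failure would hit several clusters.
    return {
        enclosure: sorted(clusters)
        for enclosure, clusters in by_enclosure.items()
        if len(clusters) > 1
    }


if __name__ == "__main__":
    for enclosure, clusters in shared_enclosures(ACTIVE_NODES).items():
        print(f"{enclosure}: active nodes for {', '.join(clusters)}")
```

A check like this would not have prevented the hardware problem, but it would have told us that a single enclosure failure could take down both services at once.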
Several hours later, we were finally able to get enough response out of the nodes to force a failover of services for Exchange, shortly followed by SQL 2005. As I worked with HP support, nothing improved on the affected servers. HP finally diagnosed the problem as a faulty mid-plane in the enclosure.
While waiting for the mid-plane to be dispatched to the field service folks, I requested that we go ahead and do a complete power-down on the enclosure and bring it up clean. This required physically removing power from the enclosure after powering down everything that I could from the onboard administrator.
After the reboot, everything looked much healthier. The blades came back to life and everything began operating as expected. After intense discussions on the HP side, we reseated our OAs and the sleeve that they plug into on the back side of the enclosure. Net outcome was the same – everything still operating well. Neither the OAs nor the sleeve were loose, so we doubted that was the cause.
One nugget I learned from HP support (please vet this information on your own) is that the Virtual Connect interconnect modules require communication with the onboard administrators (OAs). I’m still not sure I fully understand, but HP support did tell us that if VC lost communication with the OA, it’s possible that this caused our problems. If this is so, this smells like very, very bad engineering and design…
Continued investigation on HP’s part has pointed us back to the original diagnosis: a faulty mid-plane. Only by default did we return to that conclusion, however. This is the only piece of hardware common to the problems. Our only other conclusion was that this was a very bad “hiccup,” which obviously buys us no real peace of mind…
So, sometime soon, we will be replacing the mid-plane of our enclosure. I have, of course, lost some faith in the HP blade ecosystem. We have plans to migrate our corporate VMware cluster onto blades, as well as some Citrix and other servers. Losing an enclosure like this has put those plans in doubt. We were fortunate to have dragged our feet, so that only 3 blades were populated and serving anything at the time this happened. I will post updates as we move forward…