Our journey to VDI continues. This week, we hit several more milestones on our road to delivering this solution into the far-reaching corners of our area. Among those accomplishments were implementing the DHCP options successfully, setting up several pools of linked-clone virtual desktops, and wider testing of the Pano devices and their capabilities. (This is only an incremental update – see my prior post.)
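Since the DHCP options were one of the milestones, here is a rough sketch of how a scope option gets published on a Windows DHCP server with `netsh`. The server name, scope, option ID, and value below are all hypothetical placeholders; the actual option number and format for the Pano devices come from the vendor's documentation, not from this post.

```shell
# Hypothetical sketch: publishing a broker/manager address as a DHCP scope option.
# The option ID "043" and the string value are placeholders -- consult the Pano
# documentation for the real vendor class and option your devices expect.
netsh dhcp server \\dhcp01 scope 10.10.20.0 set optionvalue 043 STRING "pano-manager.example.local"

# Verify what the scope now hands out:
netsh dhcp server \\dhcp01 scope 10.10.20.0 show optionvalue
```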
Philip Sellers
Phil is a Solutions Architect at XenTegra, based in Charlotte, NC, with over 20 years of industry experience. He loves solving complex technology problems by breaking down the impossible into tangible achievements - and by thinking two or three steps ahead of where we are today. He has spent most of his career as an infrastructure technologist in hands-on roles, even while serving as a leader. His perspective is one of servant leadership, working with his team to build something greater collectively. Having been lucky to have many opportunities and doorways opened during his career, Phil has a diverse background as a programmer, a writer, and an analyst, all while staying grounded in infrastructure technology.
The bane of a sysadmin’s existence is documentation. Most of us hate doing the tedious paperwork, but doing so helps the group around you - and many times yourself, once you’ve moved on to new projects. I know it’s a struggle for me and my co-workers.
Part of the problem is that documentation tends to get outdated. Keeping your notes updated as changes are made is tough. Old documentation is sometimes worse than no documentation… It’s sometimes better to get inside and dig around to see how things are actually set up and working.
Last night, we undertook replacing the mid-plane in the blade enclosure that had problems last month. It was the first recommendation from HP support after several hours of working with various teams, but neither our internal team nor our HP field service guys felt it was the cause. Turns out, we may have been very wrong in our initial hesitance, but I’m getting ahead of myself.
After a month of continued support and working the case, we escalated to a level where an engineer reviewed all the steps, troubleshooting, and case information to ensure nothing had been missed and to help diagnose the issue. He came back to the original conclusion – that after all else was eliminated, the mid-plane must be the culprit. So, we scheduled the replacement.
The actual hardware replacement went smoothly and took less than an hour to complete. The mid-plane is a lot bulkier than I expected when I first saw it. It is a single piece of hardware with interconnects on both sides that connect blades to interconnect bays, power sources to power consumers, and the LCD display to the logic. But I guess I was surprised that it was a good 2 to 3 inches thick. In my mind, I expected a single piece of copper sitting in the middle – yes, I realize now that’s stupid.
Something interesting occurred after replacing the mid-plane. Apparently, Virtual Connect did not see the system serial number that it expected, and so it reverted to its default configuration. So, a word of advice to anyone replacing a mid-plane: leave your VC modules ejected so that you don’t lose your domain configuration. From talking with support, Virtual Connect needs constant communication with the OA to function (another dependency we were not aware of). The serial number reported by the OA from the enclosure is also very important to VC; it’s part of the configuration file for VC. It all makes logical sense, but it was not spelled out in the support document detailing the mid-plane replacement.
After unsuccessfully attempting to restore my backup of the Virtual Connect domain, I opted to build it from scratch by hand. It took about an hour and a half for my 5 blades, but I feel better about it. I am still worried about not being able to restore my VC domain configuration, but I attribute that problem to the hardware replacement.
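If you’re heading into similar maintenance, grab a fresh export of the Virtual Connect domain configuration before touching hardware. The VC Manager CLI syntax differs between firmware releases, so treat the commands below as an illustrative sketch only – the host names, addresses, and the exact backup command are assumptions to verify against your firmware’s CLI guide:

```shell
# Hypothetical sketch -- verify the exact commands against your VC firmware docs.
# SSH into the Virtual Connect Manager and export the domain configuration to a
# TFTP server before powering down the enclosure or ejecting modules.
ssh Administrator@vcm-enclosure1 <<'EOF'
show domain
save configbackup address=tftp://10.10.20.5/vc-domain.cfg
EOF
```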
I may have already mentioned that one of our projects for the year is to transition our corporate ESX cluster from 2U hardware onto blades. The process of transitioning does not come without some concern and some caveats moving to the blade architecture. We feel that blades are a good fit in our case for this particular cluster (we run several ESX clusters). Our VMware View deployment is our first production ESX workload on blade hardware. We have learned a few things from this deployment that might be helpful.
Normally, a firmware release wouldn’t warrant a post on here, but this case is a little different. Earlier this week, Apple introduced newer models of the Airport Extreme broadband router/switch and the backup appliance version, Time Capsule. Along with that announcement came several enhancements that most figured would be released for previous hardware. These features are enabled in firmware release 7.4.1.
While the update does not bring the guest networking or dual-band simultaneous networking features to the older hardware, the MobileMe integration (Back to My Mac) is included for older models (like mine). This allows you secure access to your shared disks from far, far away. I installed the firmware last night and set up the MobileMe integration. Broke out the Mac here at work and voilà, it’s working to home… Now that’s cool. I knew technically that it wouldn’t be a stretch for them to extend the BtMM functionality.
For anyone in my local area, I wanted to make sure to let you know that the Wilmington, NC, area has a newly formed VMware Users Group. Their inaugural meeting will be March 25 at the North Carolina State Port Authority. The Port Authority will be presenting a case study of their VMware View (VDI) deployment. A senior sales engineer will also be presenting updates from VMworld Europe.
I’m very excited that a more local group is available to me and my co-workers and I look forward to participating as often as possible. For more information about this user group (or one in your area if you’re not in Myrtle Beach or Wilmington), go to: http://www.vmware.com/communities/content/vmug/ and click on events or local groups.
My boss, a co-worker, and I attended the Carolina’s Summit in Charlotte, NC, last May, and we were very impressed by the event. I received another announcement that this event will be held again this year on May 29. This is an all-day event with several sessions – a mini-VMworld, if you will. I found last year to be very informative and helpful. Like last year, Mike Laverick of RTFM Education will be the guest speaker.
My co-worker, Jason, was tasked with our VMware View implementation, and I’m glad to report that it’s been largely successful and, more importantly, easy to deploy. I thought that now was a good time to reflect and share with you how we came to the decision to deploy virtual desktops, where we plan to use them, and what components we have implemented.
I can’t believe it, but it’s been almost a month since my last post. And what a month it’s been around my work. This has been one of the busiest and most difficult months that I can remember with the company. I have my hands in several different technologies; VMware and our blades are just two of my primary responsibilities. Over the past month, though, we’ve experienced a catastrophic failure of one of our blade enclosures. The failure has only occurred once, but the fall-out from it has taken almost a month to work out. And honestly, we’re still not through working out the kinks.
Of course, my story has to begin on Friday the 13th… Sometime around 9:00am, we started getting calls for both our SQL 2005 database cluster and our Exchange cluster. After investigation, we found that the active nodes were both in the same enclosure, and a third ESX host in the same enclosure was experiencing problems, too. The problems were affecting both network and disk IO on the blades. All of our blades boot from SAN, so the IO had to be a fiber-channel issue.
Several hours later, we were finally able to get enough response out of the nodes to force a failover of services for Exchange, shortly followed by SQL 2005. As I worked with HP support, nothing improved on the affected servers. The diagnosis finally came back as a problem mid-plane on the enclosure.
While waiting for the mid-plane to be dispatched to the field service folks, I requested that we go ahead and do a complete power-down on the enclosure and bring it up clean. This required physically removing power from the enclosure after powering down everything that I could from the onboard administrator.
After the reboot, everything looked much healthier. The blades came back to life and everything began operating as expected. After intense discussions on the HP side, we reseated our OA’s and the sleeve that they plug into on the back side of the enclosure. Net outcome was the same – everything still operating well. Neither the OA’s nor the sleeve was loose, so we doubted that was the cause.
One nugget I learned from HP support (please vet this information on your own) is that the Virtual Connect interconnect modules require communication with the onboard administrators (OA’s). I’m still not sure I fully understand, but HP support did tell us that if VC lost communication with the OA, it’s possible that it caused our problems. If this is so, this smells like very, very bad engineering and design…
Continued investigation on HP’s part has pointed us back to the original diagnosis – a faulty mid-plane. Only by default did we return to that conclusion, however. This is the only piece of hardware common to the problems. Our only other conclusion was that this was a very bad “hiccup” — which obviously buys us no real peace of mind…
So, sometime soon, we will be replacing the mid-plane of our enclosure. I have, of course, lost some faith in the HP blade ecosystem. We have plans to migrate our corporate VMware cluster onto blades, as well as some Citrix and other servers. Losing an enclosure like this has unnerved those plans. We were fortunate to have dragged our feet, so only 3 blades were populated and serving anything at the time this happened. I will post updates as we move forward…
I’m trying to go paperless at home… well, not really. I’m really trying to make sure some important documents don’t get destroyed if we ever had a fire or other disaster at home. I don’t know why, but that sort of thing concerns me now. Maybe it was Hurricane Katrina and memories of Hurricane Hugo blowing over my house years ago, but I digress. What I’ve found is a great little Mac app that does the trick for my document archive – it’s called Yep. It’s billed as iPhoto for your PDFs, and that’s a pretty accurate billing. It’s a great library application for your PDF files, wherever they happen to live on your filesystem.
This is going to be a short post… part of my New Year’s resolution to be more positive… I’ll warn you, this one is just a rant. It’s just one of those things that irks me.
We went to Circuit City this past weekend, again. It’s only the 3rd time we have been in the Myrtle Beach store since it announced its closing. We were searching for a deal, which has been hard to find, even in their liquidation. I finally bought something – a new multi-function printer for home. Our printer/scanner/copier at home is getting pretty long in the tooth, and its ink finally dried up last week. I’d heard that the printers at CC had finally been marked down 25%. The last time I went in, they were a pathetic 10% off – and that was off of some jacked-up price that someone pulled from their nether regions. Finally, at “25% off,” the price dropped to $10 below what NewEgg, Best Buy, Sam’s Club and everyone else has been selling this particular HP printer for…
I realize this has been widely reported by every news outlet in the country, but let me just say it. Those prices at the liquidation sales (Goody’s Family Clothing, Circuit City) — they aren’t deals, people – they are rip-offs. You could buy most of the merchandise cheaper the day before the liquidation was announced. I know it’s not really news, but it’s still a sad state of things.
But I’m not sure what I’m more sad about – that people fall into the one-million-gazillion-percent-off syndrome, where anything on sale is a great deal and I must buy 3 – or the corporate greed still exuding from corporate America even during a liquidation. Let me explain. Put anything on sale, no matter how much you mark it up, and there are people who will buy it – because it’s a “great deal” or “it was on sale.” Are we really that gullible, America? Secondly, you’re a business that is failing. You can’t make it on your own two feet. You weren’t selling it at your “everyday low low price,” so why come in and raise prices just to “mark it down” to a price above your everyday low price… It makes no sense. How much does it cost a company like Circuit City to keep the lights on and pay employees, rather than simply do a true going-out-of-business sale, sell the merchandise at a real discount, and wrap things up? It’s a dying cow; put it out of its misery… I’m sure they’d make more money that way, but what do I know, right?