Posts filed under 'Virtualization'
Virtualization on Mac OS X
I use virtualization extensively at work to run multiple virtual computers on one physical machine. We also use it to disconnect the operating system and application environment from the physical hardware for disaster recovery and hardware agnosticism. Our platforms of choice are VMware Server and VMware ESX server. The first is great because it’s free, and the second is great because it’s amazingly fast and reliable.
Almost all our virtualized workloads run fine in VMware Server in production, which is great because there are no license costs. The only workload we use that works like crap on VMware Server is SQL Server. It dies like a dog because of I/O latency or something, and the only thing we could do to get it working in a virtualized environment was to run it in ESX Server. When run in VMware Server or VMware Workstation, SQL Server is so flaky that it returns random query results (when it works) or one of several unrelated errors (when it fails).
Since I’ve become a switcher I’ve been looking to run virtual machines on my Mac at home. The likely choice for me is VMware Fusion, which is still in beta, even though Parallels is more mature on the Mac platform. The advantage of VMware is that my work virtual machines will run at home. In Beta 3, it seems that 64-bit VMs are not supported, even though my Mac is a Core 2 Duo. The website says you can run 64-bit VMs, but a Solaris VM I built at work won’t run in 64-bit mode in Beta 3. I haven’t updated to Beta 4 yet, but apparently it has a new feature called Unity, which allows you to sort-of disappear a Windows virtual machine desktop so that the application windows running inside the virtual machine just appear as windows on your Mac desktop. That’s kind-of cool, I guess. I’ll update to Beta 4 and see if my Solaris 64-bit VM works.
2007-06-07
Vision Rollout (Mostly) Complete
Our users started using Deltek Vision last week. Despite a long time planning, preparing, and porting, there were still many long days and late nights to get everything working on time. Other than some problems with one very large set of reports that we are still trying to troubleshoot, it seems like it’s working OK. It is quite slow compared to our old in-house custom web-based reporting that we did off of CFMS, but that’s not unexpected, because what we had before was wicked-fast.
There are a few little things to clean up, like making sure backups are coordinated with accounting’s large report runs, and stuff like that. There are also lots of other IT initiatives that I’m looking forward to working on now that this project is complete.
2 comments 2007-03-20
SQL Server on VMware Server
We are deploying Deltek Vision 4.1 as our new financial management system in March. We started work a while back on this project. I built the infrastructure in VMware Server running on top of SuSE Linux Enterprise Server 9. We are using a three-box Vision implementation, with a separate VM running Windows 2003 Server Standard Edition on dedicated VMware Server hosts for each of Vision Web, Vision Reporting and Vision SQL Server. The virtualization is to allow for disaster recovery and portability of hardware. The database analyst and programmer guys got started quite a while ago getting the reports that our project managers rely on in our old system working in Vision. We’ve also been testing and troubleshooting Vision and training the accounting staff during this time.
A problem started manifesting itself with Vision and SQL Server after some of the data was imported into SQL Server and we started doing queries on it. The problem would occur particularly often whenever nested select statements were used in a query. SQL Server would fail to execute the query and error out with one of four different errors: error 5243, 5242, 823, or 682. All of these errors have multiple meanings, but a common thread is I/O problems to do with physical disks or storage drivers, when SQL Server does lots of writes in TEMPDB. In our case because we are using SQL Server on a virtual machine, it implied some kind of problem with virtual disks or with the VMware virtual storage controller driver, or possibly with the underlying filesystem on the VMware host.
Several VMware Server knowledgebase and discussion posts mentioned similar problems regarding SQL Server 2000 and SQL Server 2005 on VMware Server.
To confirm that the problem was a VMware problem, which was just a suspicion initially, I built a physical Windows 2003 Server that was otherwise identically configured to the virtual one. On the physical server the queries never failed in our tests.
That made us fairly confident that we had a problem with VMware Server somewhere. When I initially created the virtual machine, I built a reiserfs partition on our IBM DS4300 SAN to store it. When I built the VM, I created a 100 GB virtual disk that was configured in 2 GB chunks, and I did not preallocate all the space at build time. I thought that perhaps the I/O problems were occurring when the VM writes the TEMPDB and new storage was allocated as the virtual disk expanded. I decided to convert the virtual disk to a fully preallocated disk using vmware-vdiskmanager, which is a command-line tool that comes with VMware Server. I did that conversion and then tested the new VM on non-production hardware that was very similar to the production blade server, except that it had locally attached storage instead of a SAN LUN. The problem almost went away. It went from erroring out more than half the time to erroring out about once in 15 or 20 runs. That indicated that I was on the right track.
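The disk conversion described above can be sketched roughly as follows. This is a minimal Python wrapper that only builds the vmware-vdiskmanager command line and prints it. It assumes the tool's -r (convert) and -t (disk type) options, where type 2 is a single preallocated disk and type 3 is preallocated split into 2 GB files. The file names are hypothetical, and the post doesn't say which of the two preallocated types was actually used.

```python
import shlex

# Disk type codes accepted by vmware-vdiskmanager's -t option.
DISK_TYPES = {
    "growable": 0,
    "growable_split_2gb": 1,
    "preallocated": 2,
    "preallocated_split_2gb": 3,
}


def build_convert_command(source_vmdk, target_vmdk, disk_type="preallocated"):
    """Return the argument list that converts source_vmdk into a new disk."""
    return [
        "vmware-vdiskmanager",
        "-r", source_vmdk,
        "-t", str(DISK_TYPES[disk_type]),
        target_vmdk,
    ]


# Hypothetical file names, for illustration only.
command = build_convert_command("vision-sql.vmdk", "vision-sql-prealloc.vmdk")
print(" ".join(shlex.quote(arg) for arg in command))
```

The conversion writes a new disk next to the old one, so the target volume needs room for the full preallocated size while the source still exists.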
We had a momentary lapse of reason and thought that the VM might run better on a Windows VMware Server host. I moved the 100 GB preallocated disk version of the VM to a Windows XP Pro workstation running VMware Server. The error occurred nearly every time, so we abandoned that ill-conceived path.
Next, I thought that either reiserfs couldn’t cut it, or VMware Server couldn’t cut it. Since I had just received a new workstation from the vendor, I configured it with OpenSUSE 10.2 and formatted the disk in ext3. I also built an ESX Server 3 evaluation server in Engineering, on an IBM x334 pizza box connected to a fibre-channel SAN. I copied the 100 GB VM to both my new workstation and the ESX Server.
On my workstation, the moderately demanding test query ran 50 times in a row without failing before I gave up on it. That suggests ext3 works better than reiserfs as an underlying filesystem for VMware Server, at least when the VM you are hosting is Windows 2003 Server with SQL Server 2005.
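A repeat-the-query test like this is easy to script. The sketch below is a hypothetical Python harness, not the one actually used here: run_query stands in for whatever executes the test query against SQL Server, and flaky_query is a simulated stand-in that fails on every 15th call so the harness has something to count.

```python
def count_failures(run_query, runs=50):
    """Call run_query() `runs` times; return (failure count, error messages)."""
    errors = []
    for attempt in range(1, runs + 1):
        try:
            run_query()
        except Exception as exc:  # any exception counts as a failed run
            errors.append(f"run {attempt}: {exc}")
    return len(errors), errors


# Simulated query (an assumption for illustration): fails on every 15th call.
calls = {"n": 0}


def flaky_query():
    calls["n"] += 1
    if calls["n"] % 15 == 0:
        raise RuntimeError("simulated I/O error")


failures, errors = count_failures(flaky_query, runs=50)
print(f"{failures} of 50 runs failed")
for message in errors:
    print(message)
```

Keeping the error text for each failed run makes it easier to see whether the same error number keeps coming back or whether they vary from run to run.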
On the ESX server, the query also worked every time, which was fully expected.
Finally, I decided that even if the ext3 and preallocated disk changes didn’t fix the problem 100% of the time, it was worth applying them to the production system, so that the problems would be reduced during training. It would also buy us time to decide whether or not to buy ESX Server for about $10,000 CDN including one year of support.
I shut down the production Vision database server after hours. I moved the existing non-preallocated VM off the SAN LUN that it used for production. Then I reformatted the reiserfs partition on the SAN LUN to ext3. After the format I was surprised to find that the available space was smaller than it was with reiserfs. I had to resize the SAN LUN up a few gigabytes to allow me to convert the non-preallocated disk to a fully preallocated one back on the SAN LUN. After the resize, I recreated the ext3 partition and used vmware-vdiskmanager to convert the non-preallocated disk to a preallocated one. The VM booted and ran fine after the disk conversion.
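Part of that difference is probably ext3's reserved blocks: mke2fs sets aside 5% of the filesystem for root by default, and the inode tables are allocated up front as well. A rough estimate of the reserved portion alone, assuming the default 5% was in effect (the post doesn't say which mkfs options were used, and the LUN size below is made up), might look like this:

```python
def ext3_reserved_gb(filesystem_gb, reserved_percent=5.0):
    """Space held back for root by the reserved-blocks setting, in GB."""
    return filesystem_gb * reserved_percent / 100.0


def usable_gb(filesystem_gb, reserved_percent=5.0):
    """Space left for ordinary users once the reserved blocks are set aside."""
    return filesystem_gb - ext3_reserved_gb(filesystem_gb, reserved_percent)


# Hypothetical LUN size: big enough on paper for a 100 GB virtual disk.
lun_gb = 102.0
print(f"reserved: {ext3_reserved_gb(lun_gb):.1f} GB")
print(f"usable:   {usable_gb(lun_gb):.1f} GB")
```

On a volume that holds nothing but virtual disks, the reserved percentage can be lowered with tune2fs -m instead of growing the LUN, though that was not tried here.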
On the converted production VM, all errors appear to have ceased and performance may have improved slightly as well. We have decided to proceed to deployment on VMware Server using this configuration.
Take Away Points
- The problem referenced in this article occurs on SQL Server 2000 and SQL Server 2005. We discovered this after the fact while working on something else.
- It is a good idea to run VMware Server on Linux, not on a Windows host.
- It is a good idea to use ext3 instead of reiserfs as the filesystem to store your virtual machines. Other Linux filesystems might be suitable as well, but were not tested.
- Filesystems formatted with ext3 use more space for overhead than reiserfs.
- VMware Server is similar in performance to VMware ESX server for Windows 2003 virtual machines running SQL Server under light to moderate loads.
- In the future I will try very hard to not have to move a 100 GB virtual machine all over my network. It takes a long time to repeatedly move 100 GB worth of files from system to system. (duh!)
- When troubleshooting large virtual machines, it is great to have lots and lots of fast storage nearby on the network. Speculative changes are much less hair-raising if you have lots of room to back up your virtual machines.
- You can do awesome stuff with virtual machines that you just can’t even consider unless you have lab hardware coming out your ears and an army of lab monkeys to help you.
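To put the point about moving a 100 GB virtual machine around into numbers, here is a back-of-the-envelope calculation in Python. The link speeds and the 70% efficiency factor are assumptions for illustration; the post doesn't say what network the copies went over.

```python
def transfer_hours(size_gb, link_mbit_per_s, efficiency=0.7):
    """Estimated hours to copy size_gb (decimal GB) over a network link.

    efficiency is an assumed fraction of the nominal link rate that a
    real file copy actually achieves.
    """
    size_megabits = size_gb * 1000 * 8
    seconds = size_megabits / (link_mbit_per_s * efficiency)
    return seconds / 3600.0


for name, mbit in [("100 Mbit/s", 100), ("1 Gbit/s", 1000)]:
    print(f"{name}: about {transfer_hours(100, mbit):.1f} hours per copy")
```

Multiply by the number of hosts the VM visited during troubleshooting and the copying alone adds up quickly.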
15 comments 2007-02-09
Vision Test Server Under Construction
I’m building a virtual machine to run Deltek Vision. I had a physical machine that worked, running Windows 2003 server on a little HP desktop machine. It took a long time to install and patch, so I wanted to reuse the work I put into setting it up. I figured I could use the new VMware Converter 3 to convert it to a VM. I tried three times to create a VM from the physical machine but it errored out each time, with an unspecified error. I also tried using Norton Ghost to image the drive into a new virtual machine, but the VM ended up unbootable, and trying a repair from the Windows 2003 CD didn’t fix it.
I gave up on the physical-to-virtual conversion and started building a new Windows 2003 Server VM. So far it’s taken over eight hours to get a scratch-installed VM set up with Windows 2003, its 51 security updates, SQL Server 2005, SQL Server 2005 SP1, and Deltek Vision 4.1 with all its service packs, and that’s with a multi-megabit fibre connection to the Internet. Online update sites are great, but man would it be nice to have been able to pull all that patch stuff locally so I could apply it from a local server.
Once we have this VM built, we should be able to isolate our problem with Deltek Vision and the TEMPDB in SQL Server down to either our production hardware, the production VM, or something to do with VMware Server. In all three of those cases, there will be a path forward for our production rollout.
2007-02-01
Showstopper Problem With Our Deltek Vision Rollout
We are entering the final phases before rolling out Deltek Vision, and we’ve hit a showstopper. We’re running the entire Deltek Vision system as four virtual machines on four dedicated hosts, in VMware Server on top of SuSE Linux. Each virtual machine is a W2K3 server, with one piece of Vision on it. One is a PDC, one is the web tier, one is the report server, and one is an SQL2005 server. All have dual Xeon processors and tons of RAM.
The issue we are having is that many queries against the SQL server, if they return more than trivial results, stop with an error that says TEMPDB is full. Other times the same queries will work as expected. The size of TEMPDB configured by the DBA doesn’t matter.
We initially thought that it was because we had upgraded from SQL Server 2000 to 2005, so we uninstalled and reinstalled SQL Server 2005. That didn’t fix the problem. Then, I built a standalone Deltek Vision system on a single workstation, our guys imported the data and tried their tests, and it didn’t experience the problem. Right now, I’m building a standalone Vision install on a brand new virtual machine, to try it again. If that works, I’ll move it to the production hardware and see what happens there.
If it turns out that it is a problem with running the VM on our production hardware, I may have to rebuild one of the production machines as a native W2K3 server rather than a VMware Server host. Then I lose the ability to fail the machine over to different hardware in the event of an emergency.
7 comments 2007-01-31
New Web Server Deployed
I was unexpectedly asked to provide a PHP and MySQL-enabled web server with phpMyAdmin for one of our subsidiary companies to host their brand new web content, which they contracted a marketing outfit to develop. The server was requested last week around Wednesday. I had no hardware, no host environment, no software selected, and was basically not ready for this request. We were planning on revamping our public web hosting infrastructure in the late spring after all the wrinkles get worked out of our Deltek Vision deployment, and after we’re done updating our GroupWise hardware and decommissioning the old GroupWise hardware. Right now the old and new GroupWise systems are running in tandem in our colo rack, so there’s not really any spare hardware capacity or even much rack space down there.
Anyways, I quickly dumped some lab services I had running in Engineering on an ML370, stuffed some RAM into it, and started building a virtual server host last week. I set it up in OpenSUSE 10.2, with VMware Server 1.0.1, and then built a web server VM with SLES9, Apache, MySQL, phpMyAdmin, an FTP server, a firewall, user IDs, and all the latest patches. I deployed it to the public internet, with the virtual server host’s interface hidden behind the firewall, and just the appropriate services exposed outside on the VM. I notified the web developers that it was ready yesterday mid-afternoon.
The web developers were barking at me about how long it was taking and how I might cause them a production delay. I got the server out as fast as I could, considering it was spur of the moment and I had to reshuffle a bunch of other work and hardware to deal with it. I assumed that because they were dancing around waiting for it that they were ready to upload content to it right away. Now it’s over a day since I notified them it was ready, and there’s still no content. They haven’t even tried to log in yet. It just goes to show you that a lack of planning on other people’s part shouldn’t constitute an emergency on your part. Despite the fact that I take that as a fundamental axiom, I don’t follow my own advice too well and allow other people to impose artificial urgency on too many things I do. I should learn my lesson, but in a service role in our company, it’s tough, and often the squeakiest (and most annoying) wheel gets the oil.
2 comments 2007-01-25
Still not done new web server
Last week I started building a VMware Server host for the new web server mentioned previously. The VM host is just running OpenSUSE 10.2, because it’s temporary. The VM for the web server will be SLES9, our current standard platform. I started that today. It’s mostly built, but it needs some TLC, patching and configuration before it’s ready.
A whole bunch of other crap is occupying my time now too. We’re getting ready for our switchover to Deltek Vision, and I have to build some VMs with w2k3 server and SQL Server 2000, that we will use to convert our data from CFMS to Vision. We need a few of those. Luckily I can build one and clone it.
I have a new workstation on the way, and according to our vendor, it should be here Friday. It’s a Sun Ultra 20 M2, and I can’t wait to get it.
I was having inexplicable problems with OpenSUSE 10.2 64-bit version on my present desktop, so I backed it up to the thumper box with rsync and rebuilt it in 32-bit mode today. So far so good.
8 comments 2007-01-22
Microsoft Could Make Windows Better With Virtualization
Caveat: I don’t use Windows on my desktop machines, and haven’t since 2003, and I work mostly with Linux and NetWare on the server side, with some Solaris and Windows servers thrown into the mix.
I am constantly perplexed with people who love Windows. It costs a lot of money. It is riddled with viruses and spyware. Normal people can’t maintain it and to keep it stable you have to reinstall it every six months. It doesn’t come with anything useful out of the box, and by the time you have everything you need (an office suite, photo manager, PDF reader and writer, proper web browser (with plugins), mp3 player, CD burner, personal organizer, email program, flash player, quicktime player, proper text editor, C compiler and other developer tools, etc. etc.), you’ve spent another $1,000, and downloaded a gigabyte or two of stuff (plus wasted hours of time). Don’t forget that you need to install 500 patches and reboot after each one. Also, don’t forget about the constant virus scanner updates, disk defragging, adware scanning, and all that nonsense. But, I digress (I guess I’d better assign this post to the “Rant” category).
Anyways, one of the big problems with Windows is that it is so insecure and vulnerable to security exploits. Many people think that this is because it is developed in a closed source model. While I think that closed development prevents a lot of opportunity for bug-finding and security-hole fixing, I think one of the other major reasons Windows is so vulnerable is that Microsoft is forced by the market to maintain backwards compatibility with ancient software. If Microsoft does something that breaks compatibility of existing applications, but increases the security of the platform, they get raked over the coals. They walk a fine line between keeping everything as secure as they can (which isn’t very) and preventing the applications of their customers from breaking. For example, the new Vista feature of using less privileged users without administrator privileges will fail, because many applications don’t work properly unless the users running them have administrator privileges, and users will rebel if they are continuously asked for permission by an application that needs administrator privileges. This causes all kinds of security issues. I won’t talk about how Microsoft got into that conundrum, as that isn’t the point of this post.
The point of this post is that I think that the commoditization of virtualization in modern hardware and software is an opportunity for Microsoft to drastically improve security in Windows version Vista + 1, without breaking compatibility with older applications that require older insecure APIs and features in the operating system. After the prolonged ranting above, the conclusion is fairly short. Microsoft could re-architect the version of Windows that comes after Vista to have a hardened secure core, with tightly secured APIs, with concepts like Least User Privilege, and all the modern thinking that has been done about secure operating systems. This core could drop all legacy compatibility completely. New Windows applications could be written around this new secure core, and Windows would be much better off going forward. At the same time, Microsoft could implement a sort-of sandboxed compatibility layer (or layers) for applications that were written for older versions of Windows, using virtualization. A Windows desktop could have its secure core running with non-legacy applications, and one or more virtual machines, that were logically isolated from the core, running the old less-secure Win32 APIs that would allow older applications to run. The applications could be isolated from the core and from each other, preventing a security compromise in an old application from compromising the whole system. This approach would give Microsoft’s customers time to migrate to the new more secure Windows architecture at their own pace, while still being able to maintain legacy applications, and have the benefits of a more secure environment.
Most of this isn’t a new idea. Apple produced a compatibility layer called Rosetta when they moved the Mac to Intel processors, to allow older PowerPC Macintosh applications to work. Unfortunately from everything I’ve read, that compatibility layer was very slow. The new part of this idea is to use virtualization to provide a fully functional virtual machine to run the compatibility layer in. This would have the effect of drastically improving the performance of the compatibility layer, as opposed to writing it as a dynamic translator, like Apple’s Rosetta. It would also simplify the isolation of the compatibility layer from the secure core. Also, if Microsoft uses virtual machines to host the compatibility layer, then the compatibility layer is already written. It’s called Windows Vista. They would just have to strip out unneeded parts, so that it just provided the facilities necessary to run legacy apps, and away they could go.
This is my million dollar idea of the day.
2006-10-25
Blackberry in VMware Server on OpenSUSE
I have a Blackberry 7250 from Telus as my phone and PIM device. I got it for free via a promo at Brainshare last year, but I haven’t been using it because I had a perfectly good Motorola phone and a Palm Zire 71, which I was happy with, so the Blackberry with its higher monthly data fees seemed unnecessary.
In the meantime, I got an iPod, which had me carrying three devices, and my old phone died, and then my Zire 71 died. I decided to activate the Blackberry on my account and use that as my phone and PIM.
I had that working fine, and got mail working on it and so forth, but since I’m a Linux user I haven’t been able to connect to my PC to back up the settings in the Blackberry. I use VMware Server for all kinds of stuff so I decided to use that to run a Windows 2000 virtual machine, and connect it to my Blackberry to enable me to upgrade the Blackberry firmware and back up the handheld.
I’m running OpenSUSE 10.1 as my Desktop OS, and I have a Windows 2000 workstation VM already built, so I downloaded the Blackberry desktop software into the VM and installed it. Then I connected the Blackberry to the host, and clicked the VM / Removable Devices / USB menu to tell VMware to connect the Blackberry to the VM. The menu showed Empty, instead of the expected Blackberry Device.
I then went searching and found this knowledgebase article in the VMware Technology Network Knowledgebase, which explains that you need to have the usbfs filesystem mounted, which SUSE Linux doesn’t do automatically. A quick su followed by mount -t usbfs none /proc/bus/usb got that mounted. I then rebooted my VM, and the Blackberry device appeared on the USB menu.
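A small check like the following can tell whether usbfs is already mounted before starting the VM. It's a Python sketch that parses the contents of /proc/mounts; the function names are made up for illustration, and the mount command it prints is the one quoted above.

```python
def usbfs_mount_points(mounts_text):
    """Return the mount point of every usbfs entry in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields are: device, mount point, filesystem type, options, ...
        if len(fields) >= 3 and fields[2] == "usbfs":
            points.append(fields[1])
    return points


def read_mounts(path="/proc/mounts"):
    """Read the mount table; return an empty string if it can't be read."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


if not usbfs_mount_points(read_mounts()):
    print("usbfs is not mounted; as root, run:")
    print("  mount -t usbfs none /proc/bus/usb")
```

The script only reports; it leaves the actual mount to be done by hand as root, as described above.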
I connected the Blackberry device to the VM, and the Blackberry desktop application started up and I was away. I updated its firmware, backed up its contents, and it all worked flawlessly.
2006-10-24
Finishing Up Bladecenter Re-Configuration
I re-built eight of the nine blades in the Bladecenter after Brainshare. I wasn’t quite done by the weekend, so I finished that off this week. I got the VMs from our new financial management system running on the newly-rebuilt blades, and configured one of our routers so that our developers could see the VMs from their regular desktops, even though the VMs are in the normally isolated lab network. Then I started working on other stuff.
We are trying to get ready for a major switchover from our existing VPN infrastructure to a new one from a new provider. The provider has some challenges getting things set up the way we need. We have also had problems getting fibre pulled into the various sites, with contractors shrugging the responsibility back and forth between themselves and the telco provider. I think we’re finally just about ready with that stuff.
Meanwhile, the blades started crashing overnight, so I spent a day or two figuring out what was causing that. It turns out cron would execute the ZENworks zmd management agent to do some maintenance function, and it would cause a kernel panic, locking up the machines. I found a beta patch of the zmd piece, and installed that on one server to see if it would work. I’m waiting to see if it stays up when the other blades crash.
2006-04-13