Multipathing Havoc on Bladecenter and DS4300 SAN
I have been having fun and games with a DS4300 SAN controller hooked up to an IBM Bladecenter. A bunch of the blades are configured with SLES9 and VMware Server, running virtual Windows 2003 servers. When I initially set everything up, it all seemed to work fine. Now that our programming team is working on our Deltek Vision implementation, though, they are putting real load on one of the VMs that runs MS SQL Server, and it has been falling over.
It looked as though anything CPU-intensive would cause the VM to drop its network connections. Every time the team ran a query against the MS SQL Server from the Deltek Vision report server, the SQL server would fall off the network. I thought it might be a VMware issue or a Linux kernel network driver problem, but patching and replacing drivers didn’t help.

Then I was doing some unrelated work on one of the other blades, which meant going into the SAN configuration tool to allocate some storage, and I noticed that some of the SAN partitions were not being hosted on their preferred SAN controllers. In the SAN, each partition can prefer to be hosted on one of the two redundant controllers. When one of the controllers, or the fibre path from it to one of the blades, is unavailable, the partitions flip over to the other path and continue to work after a small lag. Some of the blades were defaulting to the wrong channel when talking to some of the storage partitions, and this was causing those partitions to thrash back and forth between one controller and the other.

I spent yesterday trying to figure out how to configure each blade’s dual controller to prefer the same path as the SAN controller. Today I reconfigured most of the blades to do that, and the thrashing has stopped.
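To make the thrashing easier to picture, here is a rough Python sketch. It is a toy model, not anything the DS4300 or its tools actually expose: the `Lun` record, the partition names, and the rule that any I/O arriving on the non-owning controller moves ownership to that controller are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Lun:
    """Hypothetical record for one SAN partition (not a real tool's output)."""
    name: str
    preferred: str  # controller the SAN prefers for this partition: "A" or "B"
    current: str    # controller that currently owns it


def off_preferred(luns):
    """Return the names of partitions not hosted on their preferred controller."""
    return [lun.name for lun in luns if lun.current != lun.preferred]


def count_ownership_flips(start_owner, io_paths):
    """Count ownership changes for one partition.

    Assumption: each I/O that arrives via a controller other than the
    current owner moves the partition to that controller.
    """
    owner = start_owner
    flips = 0
    for path in io_paths:
        if path != owner:
            owner = path
            flips += 1
    return flips


# Made-up example data.
luns = [Lun("vm_sql", "A", "B"), Lun("vm_web", "B", "B")]
print(off_preferred(luns))

# Two blades disagreeing about the path: ownership bounces on every request.
print(count_ownership_flips("A", ["B", "A", "B", "A"]))

# All blades using the preferred path: ownership never moves.
print(count_ownership_flips("A", ["A", "A", "A", "A"]))
```

In this model the mismatched case flips ownership on every single request, and each flip costs the small lag mentioned above, while the aligned case never flips at all.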
I may do a detailed write-up next week about how to configure IBM DS4300 SAN storage for use with a redundant path to an IBM blade server, to share how I got this darn thing working. The change also cleared up the problems with the Windows 2003 MS SQL Server virtual machine, which was the whole point to begin with.