Posts filed under 'zfs'

Solaris iSCSI Target with ESX 3.02 Server

I have a nice big IBM server box with ESX serving as my entire Engineering lab (Engineering is my name for the IT lab we have here for Star Trek-related historical reasons). The box has a terabyte of RAID5 SATA disks on board for virtual machine storage, but that’s not quite enough for what we’re doing. To get more storage for virtual machines, I decided that I could take advantage of our Sun x4500 server with Solaris, to allocate another terabyte of storage. Solaris supports iSCSI targets in ZFS, so that seemed like the way to go. The x4500 has four gigabit network interface cards on board, three of which I haven’t been using so far, so I decided a good way to go would be to add a gigabit network interface to my IBM ESX server and use a crossover CAT6 cable to direct-connect them and provide a dedicated gigabit storage “network” for iSCSI.

Here are my lab notes from getting that setup.

Setting up iSCSI storage on Solaris and then getting it mounted on ESX Server

First, add a physical NIC to the ESX Server box if required. iSCSI is only supported on Gigabit Ethernet in VMware ESX 3.x. If you don’t know how to do this, get your hardware dude to take care of it, and you go take a remedial computer class.

Next, add a physical NIC to the Solaris box if required. This was not required in my case because my x4500 server had a spare port. Hook the Solaris box NIC to the ESX box NIC with a crossover cable or via a gigabit switch.

Next, configure the NIC in Solaris like this (assuming you know the device name). On the x4500 the built-in Ethernet interfaces are called e1000g0, e1000g1, e1000g2, and e1000g3. I was already using e1000g0 for the main interface to the network.

ifconfig e1000g1 plumb
ifconfig e1000g1 192.168.254.2 netmask 255.255.255.0 up

Check to see if it worked with this:

ifconfig -a

You should see that e1000g1 now has an IP address assigned to it.

Next, add the file for the new interface’s IP address under /etc

cat > /etc/hostname.e1000g1
192.168.254.2
CTRL-D

Next, edit /etc/inet/hosts and add a line for your new IP address. You might have to chmod +w hosts before editing and chmod -w hosts after. Follow the format in the file, and set a new hostname for your new interface.

For example,

192.168.254.2 ss003stor.corporate.ae.ca ss003stor

Make sure the file ends with a blank line.

Next, edit the /etc/inet/netmasks file. You might have to do the chmod thing again because the file is normally set to read only. Entries in this file are keyed by network number rather than by host address, so add this line:

192.168.254.0 255.255.255.0

Make sure the file ends with a blank line.
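
The hosts edit above (and the netmasks one, with its own line) can be made safe to re-run with a small helper that appends a line only when it is missing. This is a minimal sketch rather than part of the lab notes: append_once is a made-up name, and it assumes a POSIX shell (ksh or bash on Solaris 10, not the legacy /bin/sh).

```shell
#!/bin/sh
# append_once FILE LINE
# Appends LINE to FILE unless an identical line is already present.
# (append_once is a hypothetical helper name, not a Solaris command.)
append_once() {
    if [ -f "$1" ]; then
        # Compare whole lines exactly; no pattern matching involved.
        while IFS= read -r existing; do
            if [ "$existing" = "$2" ]; then
                return 0
            fi
        done < "$1"
    fi
    printf '%s\n' "$2" >> "$1"
}

# Example, as root on the Solaris box (left commented out on purpose):
# append_once /etc/inet/hosts "192.168.254.2 ss003stor.corporate.ae.ca ss003stor"
```

Running it twice with the same arguments leaves a single copy of the line in the file.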

Next, add a new zfs volume that we will use as an iSCSI target. This example assumes that the zfs filesystem data/vols already exists.

zfs create -V 1T data/vols/vs-vmfs01

This creates a 1 TB data volume. It is a volume, and not a zfs filesystem, so you can’t see it with the ls command, but you can see it with zfs list -r data/vols.

Next, make the volume shared via iSCSI.

zfs set shareiscsi=on data/vols/vs-vmfs01

Next, create an iSCSI target portal group (tpgt), and set it up to only listen for iSCSI connections on your new dedicated storage network card.

iscsitadm create tpgt 1
iscsitadm modify tpgt -i 192.168.254.2 1

Make sure the group got created:

iscsitadm list tpgt -v
TPGT: 1
      IP Address: 192.168.254.2

Add the target we already created to the group.

iscsitadm modify target -p 1 data/vols/vs-vmfs01

Make sure that worked.

iscsitadm list target -v

Target: data/vols/vs-vmfs01
    iSCSI Name: iqn.bla.bla.bla
    Alias: data/vols/vs-vmfs01
    Connections: 1
        Initiator:
            iSCSI Name: iqn.bla.bla
            Alias: vs200.bla.bla
    ACL list:
    TPGT list:
        TPGT: 1
    LUN information:
        LUN: 0
            GUID: blabla
            VID: SUN
            PID: SOLARIS
            Type: disk
            Size: 1.0T
            Backing store: /dev/zvol/rdsk/data/vols/vs-vmfs01
            Status: online

Now the target is ready to get mounted and formatted by the ESX Server.
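
If you want to script the "make sure that worked" step instead of eyeballing it, the listing above can be checked with a few lines of awk. This is only a sketch: target_ready is a hypothetical name, it does nothing but parse text in the format shown above, and on Solaris 10 you would need nawk in place of awk because the old /usr/bin/awk has no -v option.

```shell
#!/bin/sh
# target_ready TPGT
# Reads `iscsitadm list target -v` output on stdin and succeeds only if
# the listing shows the given portal group tag and a LUN that is online.
# (target_ready is a hypothetical helper; it only parses text.)
target_ready() {
    awk -v tpgt="$1" '
        $1 == "TPGT:"   && $2 == tpgt     { in_group = 1 }
        $1 == "Status:" && $2 == "online" { online = 1 }
        END { exit !(in_group && online) }
    '
}

# On the Solaris box:
# iscsitadm list target -v | target_ready 1 && echo "target is ready"
```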

The next step is to configure the ESX Server.

First, go to the Virtual Infrastructure Client and select the server’s Configuration tab. Click on Network Adapters and make sure your network card shows up in the list. Click on Networking, and click Add Networking.

In the dialog, select the VMkernel radio button and click Next. Select “Create a Virtual Switch” and make sure your new network interface is selected. Click Next. Under Network Label, give it a name like vmkernel storage, and select Next and then Finish.

On your new virtual switch, click Properties. In the Ports dialog, click Add, and select VMkernel. Call the network label VMkernel Storage, and give it an IP address on the same subnet as your Solaris dedicated storage server network card, with the same network mask. Click Next and Finish.

Next, go to the console command line via ssh or on the actual console. Log in as root. Type the following commands to enable the iSCSI software initiator, open a port in the firewall, and set up and scan the iSCSI target:

/usr/sbin/esxcfg-swiscsi -e
(enables iSCSI software initiator)
/usr/sbin/esxcfg-firewall -e swISCSIClient
(opens iSCSI client port in firewall)
/usr/sbin/vmkiscsi-tool -D -a 192.168.254.2 vmhba40
(tells the iSCSI software adapter to request iSCSI targets from the storage box)
/usr/sbin/esxcfg-rescan vmhba40
(tells the iSCSI initiator to scan for new targets)
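
The same four commands can be wrapped in a small script so the target address and adapter name live in one place. This is a sketch under assumptions: the run wrapper, the DRY_RUN switch and the variable names are mine, and it defaults to printing the commands so nothing happens until you set DRY_RUN=0 on the ESX service console.

```shell
#!/bin/sh
# Same four service-console steps as above, parameterised.
# TARGET_IP and HBA default to the values used in these notes.
# DRY_RUN=1 (the default here) only prints the commands.
TARGET_IP=${TARGET_IP:-192.168.254.2}
HBA=${HBA:-vmhba40}
DRY_RUN=${DRY_RUN:-1}

# run CMD [ARGS...]: print the command in dry-run mode, else execute it
# and stop at the first failure.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@" || { echo "failed: $*" >&2; exit 1; }
    fi
}

esx_iscsi_setup() {
    run /usr/sbin/esxcfg-swiscsi -e                        # enable the software initiator
    run /usr/sbin/esxcfg-firewall -e swISCSIClient         # open the iSCSI client port
    run /usr/sbin/vmkiscsi-tool -D -a "$TARGET_IP" "$HBA"  # add the discovery address
    run /usr/sbin/esxcfg-rescan "$HBA"                     # rescan for new targets
}

esx_iscsi_setup
```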

Next, go to the Storage tab and click Add Storage. Choose the Disk/LUN radio button and click Next. Select your new iSCSI target from the list and click Next. Follow the prompts, and ESX server will format the volume with vmfs and mount it. You should now be able to start using it right away.


3 comments 2008-03-12

No More USB Hard Disks

Since I got some Macs at home I’ve been wanting to use Time Machine systematically for backups. Previously I have had a FreeBSD server at home to store bulk and shared data and backups. Initially, I had anticipated getting more drives for the server, formatting them as ZFS, and using that with Time Machine. Then, when Mac OSX 10.5 Leopard was released, it turned out that Time Machine only supports external USB or FireWire disks. That meant my server idea wouldn’t work.

I borrowed a USB drive from work to test out with Time Machine, and it worked for a while, to the point that I even used it to restore a large chunk of my system because I had screwed it up by messing around with it. Then, something funny happened and the USB disk wouldn’t mount anymore. I did a bunch of troubleshooting and discovered that the journal had gotten buggered up on the HFS+ volume on the USB disk. I found a little utility that I ran to disable journaling, which then allowed me to use Disk Utility to repair the volume.

Recently I was using the USB disk to shuffle some data around, and at the one moment when it held the only copy of some digital video files from my video camera, it died again. I’ve been trying to repair the volume for four days with various tools. Nothing is working so far. It appears that there is an actual hardware failure somewhere in the device. It’s a Maxtor One Touch III with two hard drives configured in a RAID0, so I can’t recover the data from the disks one at a time by direct-connecting them to another machine.

I was considering going out and buying a big external USB disk for backups until this event occurred. On the other side of my desk, there’s a FreeBSD box with a nice set of mirrored ZFS drives where the data really should be. I’m taking this as a slap upside the head, and I’m not risking any of my data on external USB disks anymore. I’m going to get a couple more disks for the server, RAIDZ them, and use that instead.

I won’t be able to use Time Machine, but what the hell, I can set up different backup software for Jenn’s and my machines.


5 comments 2008-03-03

Converting VMware Server VMs to ESX Server 3.02

Our company is drinking the virtualization kool-aid more and more. We have three ESX Server licenses in our Bladecenter now, and tonight I migrated the second of four of our Deltek Vision servers (the Reporting tier) from VMware Server hosted on Linux to ESX Server. (The first was the Deltek Vision SQL Server tier.)

First I used my normal rsync backup script and zfs snapshot to create the last backup of the VMware Server version of the report server. Then I just mounted the snapshot via NFS on the ESX server, and used

vmkfstools -i nfsvol/source.vmdk vmfsvol/dest.vmdk

to clone the virtual disk onto one of my SAN vmfs3 datastores. Then I created a new virtual machine in one of the ESX server hosts, pointed it at the imported virtual disk, booted, re-installed vmware tools and rebooted again. It wasn’t too hard and everything works.

The longest part was the vmkfstools import operation, which took about 45 minutes for the 80 GB disk. The SAN was under heavy load at the time doing mail system backups, so I can’t complain.
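
For a rough sense of what that means as a transfer rate, here is the back-of-envelope arithmetic in shell. It assumes 1 GB = 1024 MB and uses integer arithmetic, so the result is rounded down.

```shell
#!/bin/sh
# Average throughput of the clone: 80 GB copied in 45 minutes.
# Assumes 1 GB = 1024 MB; $(( )) is integer-only, so this rounds down.
size_gb=80
minutes=45
mb_per_sec=$(( size_gb * 1024 / (minutes * 60) ))
echo "about ${mb_per_sec} MB/s"
```

That is roughly 30 MB/s averaged over the whole copy.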


6 comments 2007-10-17

iSCSI Problems (mostly) My Fault

I now have iSCSI working, and have managed to resolve the problems I was having in my previous post. The problem was caused by me trying to use the wrong syntax in iscsiadm, the command in open-iscsi that you use to manage iscsi on your Linux system.

If you try to login to an iscsi target using iscsiadm with the following syntax, it locks up your Linux system. While it is my fault I was using the incorrect syntax, a sane tool should not be able to hang an entire system just because the user uses the wrong syntax with the command, obviously. This command causes the crash:

iscsiadm -m node -p 10.0.0.1:3260 --targetname target --login

Note that the --targetname parameter is supposed to have an equals sign between the switch and the parameter, and I was forgetting the equals sign in favour of a space. The proper command would be this:

iscsiadm -m node -p 10.0.0.1:3260 --targetname=target --login

I have fixed my scripts and now iscsi is working for me on Linux initiators, and I have also submitted this as a bug to the open-iscsi mailing list, so that in the future, other dummies won’t crash their servers because of a typo. Check the link to see if any discussion develops on the bug report.


Add comment 2007-09-20

iSCSI Fun

Since VMworld, and all the talk I heard there about iSCSI, I’ve become interested in learning more about iSCSI in the various environments we use at work. To that end, I’ve set up an iSCSI target on our Solaris-based x4500 (because with zfs zvols, iSCSI targets are pretty trivial to set up). I’ve been trying to get various software initiators to talk to it. Here’s my track record so far:

  • A physical machine running OpenSUSE Linux 10.2 with open-iscsi: discovery works, but logging into the iscsi portal on the x4500 locks up the OpenSUSE box completely, forcing a power-off reboot.
  • A VMware Server VM running Solaris Express DE: just works.
  • A VMware Server VM running Windows XP with the Microsoft iSCSI initiator: just works.
  • A physical machine running Ubuntu 7.04 with open-iscsi: discovery works, but logging into the iscsi portal locks the box up completely forcing a power-off reboot.
  • Netware: just works.

Conclusion: Either find out what’s wrong with open-iscsi on my versions of the Linux kernel or the conflict between open-iscsi and Solaris’s iscsi target, or find another iscsi tool chain on Linux.


2 comments 2007-09-20

Restoring GroupWise Accounts - Getting it to a Science

Now that our GroupWise 7 deployment is completely moved onto Linux servers with Solaris back-ending the storage, I have finally got a consistent backup and restore solution that allows me to recover accidentally deleted users without pulling out all my hair.

Our GroupWise post office directories live on Solaris zfs partitions exported to the Linux post office servers via nfs. When I want to back them up, I stop the poa for five seconds and do a zfs snapshot. Then I use zfs send and zfs receive to replicate the snapshot to another Solaris box with oodles of SATA disks. I snapshot the replica every day and keep daily snapshots for 60 days. The snapshots are very space-conservative so it’s relatively cheap to do this.
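
The 60-day retention can be sketched as a filter that takes the list of snapshot names and prints the ones that have aged out. Everything named here is an assumption for illustration: the function name, the backup/po1 dataset, and a naming scheme where each snapshot ends in an ISO date so that a plain sort puts the oldest first. On Solaris 10 substitute nawk for awk.

```shell
#!/bin/sh
# snapshots_to_prune KEEP
# Reads snapshot names on stdin, one per line, and prints every name
# except the newest KEEP. Assumes names end in an ISO date
# (for example @2007-08-09) so that sort orders them oldest-first.
# (Function name and naming scheme are illustrative assumptions.)
snapshots_to_prune() {
    sort | awk -v keep="$1" '
        { name[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print name[i] }
    '
}

# On the backup server, with a hypothetical dataset backup/po1:
# zfs list -H -t snapshot -o name | grep '^backup/po1@' \
#     | snapshots_to_prune 60 | xargs -n 1 zfs destroy
```

Printing the candidates first and only then piping them to zfs destroy makes it easy to check the list before anything is deleted.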

When one of my administrators deletes a GroupWise account that he or she shouldn’t have, I can use this system to recover the deleted accounts quickly. On the Linux post office servers, I created /mnt/restore/po directories for each post office. Then using ConsoleOne, I defined restore areas pointing to those directories.

When I have to restore a user’s data that was deleted on a certain day, I do a zfs clone operation on the backup server to clone the zfs snapshot made the day before the user was deleted. Then I use zfs set sharenfs= to export the clone via nfs and make it mountable on the Linux post office server. Back on the Linux post office server, I mount the exported post office under /mnt/restore/po, and I now have a working restore area for that day. Choosing a different day is as easy as doing a new zfs clone, exporting it, and unmounting and remounting on the post office server.
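
Spelled out as commands, the clone, export and mount steps look something like the sketch below. It prints the commands rather than running them. The dataset name, the -restore suffix and the backuphost name are placeholders, sharenfs=on is just the simplest share setting, and the mount path assumes the clone keeps its default ZFS mount point.

```shell
#!/bin/sh
# restore_area_cmds DATASET DATE
# Prints (does not run) the commands for exposing one day's snapshot as a
# GroupWise restore area. The first two lines are for the Solaris backup
# server, the last one for the Linux post office server.
# (Dataset, clone suffix and backuphost are placeholder names.)
restore_area_cmds() {
    clone="$1-restore"
    echo "zfs clone $1@$2 $clone"
    echo "zfs set sharenfs=on $clone"
    echo "mount -t nfs backuphost:/$clone /mnt/restore/po"
}

restore_area_cmds backup/po1 2007-08-08
```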

If I have to restore a deleted user, instead of just deleted mail, I just have to do the same clone, export and import operation with the GroupWise primary domain directory backup, and then use ConsoleOne, select Restore Deleted User, browse to the mounted clone, and pick the user to restore. Then I follow the above procedure to get the data back for that user.

This is so much easier than dealing with our former backup service provider, where I would have to get a full tape restore done, see if the user was on it, wash, rinse and repeat. That process sometimes took days if the administrator wasn’t sure which day the deletion happened on.


3 comments 2007-08-09

Solaris Live Upgrade Testing

In the continuing saga to update Solaris, I’ve added the first successful chapter. I need to get to a Solaris build of about b55 or higher, so that the command zfs receive supports the rollback feature prior to the receive operation.

On some of my Solaris machines, the boot disks are small and there is not enough room to do an upgrade over top of the system. I probably have to rebuild those completely, and until I can do that, synchronizing between them will have to be via rsync rather than zfs send / zfs receive. However, our x4500 has a big boot disk, with unpartitioned space, which means I can use Solaris Live Upgrade to update it. Live Upgrade allows you to make a duplicate of your working boot environment on a new partition or disk, upgrade the duplicate to a new version (or even just apply patches to it) and then reboot into that duplicate. It also allows you to revert to the original system in the event you have problems with the upgraded one.

Before I do this on the x4500, I wanted to try it on a non-production machine. To do that, I built a VMware Server VM with Solaris Express DE build 55, and then yesterday, with the help of these instructions, I did a live upgrade to Solaris Express DE build 64. That worked fine, so I’m encouraged to try a live upgrade on the x4500. It’s running Solaris 11/06, however, not Solaris Express DE, so it’s a bigger step to get to build 64. I think I’m going to start my VM over at the same level as the x4500, and re-do the live upgrade to b64 directly, so I can see what all the steps will need to be on the x4500.


Add comment 2007-06-26

GroupWise Migration Complete

Thanks to a lot of hard work by Denys, and no thanks to Novell support, we’ve finished moving all our GroupWise users to our new GroupWise server architecture.

Our old system consisted of six post offices hosted on a two-node NetWare cluster using Novell Cluster Services. There were six post offices as part of a history that included corporate political influences that helped create a functional but sub-optimal technical design. We had an opportunity to upgrade to GroupWise 7, migrate to new hardware, and consolidate to an architecture that included one post office per server on three servers, all in one operation. We built new servers, consisting of three virtual machines running SLES9, hosting one post office each, but with the storage for those post offices held on a fourth server running Solaris, with the post office data on ZFS filesystems, and remote-mounted on the SLES boxes. The beauty of this is that we can do backups with about 10 seconds or less of downtime per post office per day, and we can do speculative changes with near-instant rollback if the changes do anything unexpected. ZFS is your friend.

Anyways, the thanks for Denys is for all the effort he put in to move more than 600 users, many resources and several distribution lists from the six old post offices to the three new ones, and dealing with the aftermath of broken passwords that ensued.

The no-thanks to Novell support is for the lack of help fixing a bug that causes mailbox caching passwords to break when you move a user from a NetWare-created GroupWise post office to a Linux-hosted GroupWise 7 post office. Despite us filing a Premium Support ticket, and Novell recognizing a bug (GroupWise defect 239947) and refunding our support ticket, they never fixed the bug, and after months of us waiting, they even stopped responding to our requests for a status update on the bug. GroupWise 7 SP2 came out without the bug fixed, and we had to proceed and fix 600 broken passwords manually. Hence the ensuing aftermath.

Anyways, we’re finally done with the migration, and our new architecture is much more scalable than the former one. I expect that by adding post office virtual machines on additional blades, plus adding storage management Solaris blades as required, we’ll be able to scale up to several thousand users, which should work for us for the foreseeable future.


5 comments 2007-06-19

ZFS on Mac OSX 10.5

I read today that Jonathan Schwartz “accidentally” leaked that ZFS would be the filesystem of OSX Leopard. This is very interesting to me, because we use ZFS for doing disk backup snapshots at work, and because I really want a ZFS-based home server too. When I first saw an announcement of Apple’s Time Machine feature for Leopard, it occurred to me that it would be fairly easy to implement that using ZFS as a file backing store. I can’t wait to get Leopard, and integrate it with my Solaris file server at home. However, if I can be patient enough, FreeBSD 7.0 with ZFS might end up being my server OS instead. I like FreeBSD and just have a lot more experience with it than Solaris, so for a home server it makes more sense for me.


Add comment 2007-06-06

Updating Solaris Releases

We’re using Solaris 10 with zfs as a target for disk-to-disk backups off of our production networks. Our old system used Linux and the filesystems we tried (ext3 and reiserfs) were awkward, slow and poorly able to handle millions of files on a single filesystem. We should probably have used XFS, but by the time we were strongly considering changing what we were using, zfs showed up on the scene. We needed more storage, and instead of buying more of the crappy HP MSA20 SATA enclosures we’d been using and having lots of problems with (weekly firmware updates and inexplicable occurrences of the storage just unmounting and refusing to remount without rebooting), we decided to buy our nice Sun x4500 server with 24 TB of disk space.

We’re using rsync to synchronize from our production servers to the primary backup box, the x4500 server, in zfs. Then, we make zfs snapshots and make those available to the network administrators in our offices over http. That way, they can get files back from any day pretty easily. It works well and is very reliable.

We also have a second Solaris 10 storage server in our colocated site, with about 6 TB of storage in SATA shelves in an IBM DS4300 SAN. We want to make replicas of backup data on our x4500 onto this Solaris box so that we can purge any vestiges of local backups from our system. The zfs filesystem has a neat feature for doing this, called zfs send and zfs receive. These commands allow you to stream a whole zfs filesystem out to another medium and then retrieve it back and recreate that zfs filesystem again. If you pipe the output of zfs send through ssh to another box and then to zfs receive, you end up with a duplicate zfs filesystem created on another server. Sweet! Then, you can use a differential zfs send to update a snapshot, so that instead of re-sending the entire data set again the next day, you can just send the differences between yesterday and today and get another full snapshot. Unfortunately, in the build of Solaris we are using, which is Solaris 10 Enterprise 11/06, zfs send and receive for differential snapshots over the network don’t work, because the receiving side always makes some insignificant change to the filesystem before the receive starts, and then the differential receive fails because the two endpoints don’t match.

Later builds of OpenSolaris have a new option on zfs receive that forces it to do a rollback immediately prior to accepting the sent data. Our build doesn’t have that option. I’m now faced with the prospect of either a) unmounting the destination filesystem during the send/receive, which makes my backup snapshots unavailable, or b) updating Solaris to an OpenSolaris build. I need to figure out which is more appropriate. I also need to figure out how to nondestructively upgrade from Solaris 10 11/06 to Solaris Express DE or some other newer build.
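
To make the send/receive pipeline concrete, here is a sketch that builds the command line for either a full or a differential send. It prints the pipeline instead of running it. The function name, dataset, snapshot names and colo-host are placeholders, and it assumes the dataset has the same name on both boxes. In current ZFS documentation the rollback-before-receive option is zfs receive -F, so the sketch only adds it when RECV_FLAGS is set.

```shell
#!/bin/sh
# replicate_cmd DATASET NEW_SNAP REMOTE_HOST [PREV_SNAP]
# Prints the send/receive pipeline instead of running it.
# With PREV_SNAP it is a differential send (zfs send -i); without, a full one.
# RECV_FLAGS is empty by default; on builds whose zfs receive supports the
# rollback option, set RECV_FLAGS=-F.
# (Function name, dataset and host names are placeholders.)
replicate_cmd() {
    if [ -n "${4:-}" ]; then
        send="zfs send -i $1@$4 $1@$2"
    else
        send="zfs send $1@$2"
    fi
    echo "$send | ssh $3 zfs receive ${RECV_FLAGS:+$RECV_FLAGS }$1"
}

# Full copy on day one, differential with rollback the next day:
replicate_cmd data/backup 2007-05-03 colo-host
RECV_FLAGS=-F
replicate_cmd data/backup 2007-05-04 colo-host 2007-05-03
```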


1 comment 2007-05-04

