I have a Sun x4500 at work with 48 × 500 GB disks, 46 of which are configured as one gigantic ZFS filesystem. A couple of weeks ago when I had to restart it, it wasn’t able to mount the ZFS filesystem. After bashing on it a bit to get it to boot without mounting ZFS, I was able to use Solaris’s format command to determine that two of the disks have bad blocks, which format was not able to repair. Fortunately the two disks are in different raidz groups, so the data is all still there.
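For the curious, spotting the sick disks doesn’t require anything exotic. Besides format’s surface analysis, something roughly like this shows the per-disk error counters and which devices ZFS itself is unhappy about (the pool name here is just a placeholder, not my actual pool):

    # list per-disk error counters (soft/hard/transport errors)
    iostat -En

    # ask ZFS which devices it thinks are having trouble
    zpool status -v bigpool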
I have been trying to disable the problem disks so that I can mount the ZFS filesystems in degraded mode and at least get at my data. I use cfgadm -c unconfigure device to turn off the SATA port for each of the two problem disks, and then zpool import pool to import the pool. The import takes forever, but while it’s running I can manually mount some of the ZFS filesystems and get at the data for a while, until the server locks up.
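Concretely, the dance looks roughly like this (the SATA port IDs, pool name, and filesystem name are stand-ins for my actual ones):

    # list the SATA attachment points to find the ports the bad disks hang off of
    cfgadm -al

    # turn off the port for each bad disk (port IDs are placeholders)
    cfgadm -c unconfigure sata1/0
    cfgadm -c unconfigure sata4/3

    # import the pool; the affected raidz groups come up degraded but readable
    zpool import bigpool

    # mount individual filesystems by hand while the box still responds
    zfs mount bigpool/home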
This is a real nuisance. I don’t know why disabling the bad disks doesn’t allow the system to work normally until my replacement disks arrive. Ok Internets, any ideas?