Update: The following information may still be useful to some; however, my issues actually turned out to be caused by Caviar Black drives shipping with TLER disabled. You have to pay Western Digital a premium for their “RAID” drives with TLER enabled. So for anyone reading this: avoid consumer Western Digital drives if you plan on using them for RAID.
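If you want to check whether a particular drive exposes TLER (SCT Error Recovery Control) before trusting it in an array, newer smartmontools builds can query and set the error-recovery timers. This is only a sketch: the device path is hypothetical, it requires a smartctl build with scterc support, and plenty of consumer drives simply reject the command.


smartctl -l scterc /dev/rdsk/c4t0d0
smartctl -l scterc,70,70 /dev/rdsk/c4t0d0

The first line reports the current read/write recovery timers; the second attempts to set both to 7 seconds (the value is in tenths of a second). On drives that accept it at all, the setting is volatile and resets on the next power cycle.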

zfs_vdev_max_pending

I can’t believe how long I have been tolerating horrible concurrent IO performance on OpenSolaris running ZFS. Whenever any IO-intensive writes are happening, the whole system slows to a crawl for any further IO. Running “ls” on an uncached directory is just painful.


victori@opensolaris:/opt# iostat -xnz 1

                    extended device statistics              
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0   87.0    0.0 2878.1  0.0  0.0    0.0    0.4   0 100 c4t0d0
    0.0   83.0    0.0 2878.1  0.0  0.1    0.2    0.7   1  50 c4t1d0
    1.0    0.0   28.0    0.0  0.0  0.0    0.0    5.4   0   1 c4t2d0

Notice that c4t0d0 is blocking at 100% busy (%b). If a disk is blocking at 100%, good luck getting it to do any other operations, such as reads.

SATA disks do Native Command Queuing (NCQ) while SAS disks do Tagged Command Queuing (TCQ), and this is an important distinction. It seems OpenSolaris/Solaris is optimized for the latter, with a 32-wide per-device command queue set by default. This completely saturates the SATA disks with IO commands, which in turn makes the system unusable for short periods of time.
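Before touching anything, you can read the current queue-depth setting straight out of the running kernel with mdb; the exact default you see may vary between builds.


echo zfs_vdev_max_pending/D | mdb -k

The /D format prints the variable as a 32-bit decimal, matching the W0t1 write used below.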

Dynamically set the ZFS per-vdev command queue depth to 1 to optimize for NCQ.


echo zfs_vdev_max_pending/W0t1 | mdb -kw

And add the following to /etc/system so the setting persists across reboots


set zfs:zfs_vdev_max_pending=1
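With the lower queue depth in place, rerunning iostat during a heavy write should look a lot healthier; 5-second averages are enough to see the difference.


victori@opensolaris:/opt# iostat -xnz 5

Watch the actv and %b columns while a large write is running: before the change a disk would sit at the queue limit and 100% busy, afterwards actv should hover around 1 and reads should actually get serviced instead of piling up behind writes.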

Enjoy your OpenSolaris server on cheap SATA disks!