Thursday, October 13, 2011

ZFS Performance issues

We have an Oracle/Fujitsu M4000 with 4 non-global sparse zones, connected to an EMC Clariion CX480. We started to notice I/O performance issues in one particular zone, which led to all sorts of troubleshooting and speculation. Long story short, we didn't see any memory or CPU constraints, but the disks showed as very busy while the r/s and w/s numbers stayed low, in the low 200s and high 100s.

The OS is Solaris 10 update 9 patched with CPU 07/2011. PowerPath is 5.3p1. I opened a ticket with Oracle and, lo and behold, there are some major performance issues with Solaris 10u9 and ZFS.

I never implemented any of the ZFS "evil tuning" settings; frankly, they are beyond me, and I get scared around production Oracle databases that run large companies.

We decided to run the following tests in the non-global zone, with the same commands run on a V1280 for comparison. /u05 is a ZFS filesystem in its own zpool, SAN-attached to the Clariion over a 2G fiber link.

M4Kngz> time mkfile 10g /u05/testfile
real    2m58.753s
user    0m0.690s
sys     0m43.812s

V1280# time mkfile 10g /fs1/testfile
real    1m13.893s
user    0m0.139s
sys     0m25.506s

M4Kngz> time cp testfile testfile1
real    6m32.563s
user    0m0.012s
sys     0m48.173s

V1280# time cp testfile testfile1
real    1m24.273s
user    0m0.019s
sys     1m8.497s

Create a 10g file with 8k blocksize via dd.
M4Kngz> time dd if=/dev/zero of=/u05/out bs=8k count=1280000
1280000+0 records in
1280000+0 records out 

real    1m26.714s
user    0m1.311s
sys     0m42.821s

V1280# time dd if=/dev/zero of=/fs1/blksz8k bs=8k count=1280000
1280000+0 records in
1280000+0 records out

real    1m20.721s
user    0m3.333s
sys     1m8.255s

Create a 10g file with 128k blocksize.
M4Kngz> time dd if=/dev/zero of=/u05/blksz128k bs=128k count=80000
80000+0 records in
80000+0 records out

real    1m50.177s
user    0m0.105s
sys     0m11.601s

V1280# time dd if=/dev/zero of=/fs1/testfile bs=128k count=80000
80000+0 records in
80000+0 records out

real    1m11.743s
user    0m0.213s
sys     0m17.242s
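As a quick sanity check on the sizes involved, the arithmetic below compares what mkfile 10g writes with what the two dd invocations write. This is only a sketch, and it assumes a shell with 64-bit $(( )) arithmetic such as bash; the variable names are mine, not from the test runs.

```shell
# Bytes written by each test (g, k as 1024-based units)
mkfile_bytes=$((10 * 1024 * 1024 * 1024))   # mkfile 10g
dd8k_bytes=$((8 * 1024 * 1280000))          # dd bs=8k count=1280000
dd128k_bytes=$((128 * 1024 * 80000))        # dd bs=128k count=80000

echo "mkfile: $mkfile_bytes"
echo "dd 8k:  $dd8k_bytes"
echo "dd 128k: $dd128k_bytes"

# Difference between the mkfile file and the dd files, in MiB
echo "diff MiB: $(( (mkfile_bytes - dd8k_bytes) / 1024 / 1024 ))"
```

Both dd runs write the same number of bytes, about 240 MiB (roughly 2%) less than the mkfile file, so the sizes are close enough for the timings to be compared.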

dd it back to the same filesystem.
M4Kngz> time dd if=/u05/blksz128k of=/u05/blksz128k.out bs=128k count=80000
80000+0 records in
80000+0 records out

real    5m12.868s
user    0m0.127s
sys     0m23.035s

V1280# time dd if=/fs1/testfile of=/fs1/blksz128k bs=128k count=80000
80000+0 records in
80000+0 records out

real    1m56.865s
user    0m0.300s
sys     0m32.155s

WOW! Quite a difference on the cp times: about 6 1/2 minutes to copy the 10g file on the same array on the M4000, compared to about 1 1/2 minutes on the V1280 we are going to scrap.

What is interesting is that dd shows comparable times across both systems, except for the dd of a file back to the same filesystem on the M4K (5m12s versus 1m56s on the V1280).
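To put the elapsed times in perspective, they can be turned into rough copy rates. The helper below is a hypothetical sketch (mib_per_sec is my own name, not a system command) that divides the file size in MiB by the "real" time; it assumes the cp test used the 10g mkfile output (10240 MiB) and that the dd copy moved 80000 x 128 KiB = 10000 MiB.

```shell
# Hypothetical helper: average rate in MiB/s.
# usage: mib_per_sec SIZE_MIB MINUTES SECONDS
mib_per_sec() {
    awk -v size="$1" -v min="$2" -v sec="$3" \
        'BEGIN { printf "%.1f\n", size / (min * 60 + sec) }'
}

# cp of the 10g file
mib_per_sec 10240 6 32.563    # M4000 zone -> 26.1
mib_per_sec 10240 1 24.273    # V1280      -> 121.5

# dd of the 128k-block file back to the same filesystem
mib_per_sec 10000 5 12.868    # M4000 zone -> 32.0
mib_per_sec 10000 1 56.865    # V1280      -> 85.6
```

These are averages over the whole run, and each copy both reads and writes on the same filesystem, so they are only good for comparing the two machines, not as a measure of what the array can deliver.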

Looks like Update 10 will be getting Live Upgraded into our Zone environment.
