iowait - Houston, we have a problem - or do we?


The iowait metric is about as misleading as "free memory" on Unix. From time to time a server stops coping with demand, the applications see high iowait, and I always stumble when I try to explain that the metric doesn't necessarily tell you the truth.

It seems like everyone knows that iowait is the time the CPU spends waiting for IO to complete, and that sounds bad. Back in the day it was bad: there was one CPU and it couldn't do anything until the IO completed. But nowadays we have multiple cores, so other processes can keep running on the cores that aren't blocked by that single IO wait.
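
If you want to see this per core, mpstat from the sysstat package (the same package that provides iostat) can break the numbers down per CPU; the interval and count below are just an example:
# Per-core statistics, one sample per second, five samples.
# Roughly speaking, a task blocked on IO shows up as %iowait on the core it
# last ran on, while the other cores keep reporting user/system/idle as usual.
mpstat -P ALL 1 5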

To add insult to injury, CPU performance has improved much faster than disk performance. SSDs are a gigantic leap forward, but they are still fairly expensive, at least in larger sizes.

This means we have two factors (CPU speed and multiple cores) that mitigate the issue of IO wait. One of them, CPU speed, actually inflates the perceived issue of IO wait. Consider a system with one slow CPU where the compute takes 60 ms and the IO takes 20 ms. Now replace the slow CPU with a fast one, so the compute takes 20 ms while the IO still takes 20 ms.


In this case application performance doubled - surely a good thing. The IO wait also doubled in crappiness - or did it? No, it didn't; it's just how we measure it and perceive the figures.
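
Putting numbers on the same 60/20 and 20/20 millisecond split:
Slow CPU: iowait = 20 / (60 + 20) = 25 %
Fast CPU: iowait = 20 / (20 + 20) = 50 %
The IO didn't get any slower; the busy time next to it just shrank.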

Unfortunately IO problems aren't a bleak memory from the past - they still exist and they need to be diagnosed. The trick is to look at the response times instead. If they increase above what is normal for your server/disk/driver combo - you do have these metrics from a healthy system, right? If not, it is worth investing an hour or two to collect them (a simple way is sketched after the table below). But if you don't have a baseline, a rule of thumb, although it depends on the workload, could be:
Typical HDD figures

HDD spindle [rpm]    Average rotational latency [ms]
 4,200               7.14
 5,400               5.56
 7,200               4.17
10,000               3.00
15,000               2.00
If you see significantly different figures - you might have an issue.
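
A minimal way to build that baseline is to log extended iostat output periodically while the system is healthy and keep the file for comparison; the path, interval and count here are just an example:
# Extended device statistics with timestamps, one sample per minute for a day.
iostat -dxt 60 1440 > /var/log/iostat-$(hostname)-$(date +%F).log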

Tools that I like to use when investigating are iostat, iotop and, if need be, strace. Note that iostat works with deltas, so the first report covers the time since boot and isn't worth anything for spotting a current problem:
iostat -x 1
Linux 3.11.0-15-generic (hive)  02/13/2014      _i686_  (1 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.13    0.05    0.12    0.22    0.00   99.48

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.12     0.13    0.26    0.19     3.86     3.40    32.53     0.02   38.60    8.96   79.37   4.51   0.20

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    0.99    2.97    0.00   96.04

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    1.98    0.00    11.88     0.00    12.00     0.02   12.00   12.00    0.00  12.00   2.38

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           0.00    0.00    1.00    0.00    0.00   99.00

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.00    0.00    1.00     0.00     4.00     8.00     0.00    0.00    0.00    0.00   0.00   0.00
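
iotop shows which processes are generating IO right now, and strace shows what a particular process is actually doing. The flags below are how I would typically run them; pid 1234 is of course just a placeholder:
# Only show processes actually doing IO, batch mode with timestamps,
# ten samples one second apart (needs root).
iotop -o -b -t -n 10

# Attach to a suspect process and show the time spent in each
# read/write/fsync call.
strace -T -e trace=read,write,fsync -p 1234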

If you do have an issue, this might help you figure out who is creating the iowaits. Syslog is stopped first so the block_dump messages don't themselves get written to disk and pollute the result; the output is a count per process(pid), busiest first. Depending on your distribution the init script is called rsyslog or syslog:
rsyslog:
/etc/init.d/rsyslog stop; echo 1 > /proc/sys/vm/block_dump ; sleep 10; dmesg | egrep "READ|WRITE|dirtied" | egrep -o '[a-z]+\([0-9]*\)' | sort | uniq -c | sort -rn | head ; echo 0 > /proc/sys/vm/block_dump ; /etc/init.d/rsyslog start

syslog:
/etc/init.d/syslog stop; echo 1 > /proc/sys/vm/block_dump ; sleep 10; dmesg | egrep "READ|WRITE|dirtied" | egrep -o '[a-z]+\([0-9]*\)' | sort | uniq -c | sort -rn | head ; echo 0 > /proc/sys/vm/block_dump ; /etc/init.d/syslog start
