# Some performance measures

The most surprising thing about ZIO for new users is the idea of a
"block" and a "control" (the 512-byte thing) attached to every data
burst. This is considered a serious overhead.

This is the block:

![](/uploads/0be03d422b12400f848c332b5c18daee/zio-block.png)

I've made some performance measurements on a recent PC-class computer.
In short, the overhead of bringing the block over the whole pipeline
(device, trigger, buffer, char device) is about 0.35 microseconds.
## Measuring the overhead

With `zio-zero.ko` and the transparent buffer (called `user`, and now
the default at ZIO initialization), we can read or write huge amounts
of data.

We might compare with `/dev/zero`, but that would be unfair: the
`/dev/zero` implementation uses `__clear_user()` rather than
`memset()` followed by `copy_to_user()`. This optimization is specific
to `/dev/zero`, so that device isn't a meaningful benchmark.

Channel 1 of cset 0 of `zio-zero` returns random numbers. It uses
`get_random_bytes()` like `/dev/urandom` does, so this is a fair
comparison.

Acquisition in ZIO is cset-wide, so we should disable the other
channels, to avoid the overhead of three blocks when we are only
interested in one of them:

    echo 0 > /sys/zio/devices/zzero/cset0/chan0/enable
    echo 0 > /sys/zio/devices/zzero/cset0/chan2/enable
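
If a cset has more channels, the same idea can be wrapped in a tiny
helper. This is only a sketch, assuming the sysfs layout shown above;
`disable_other_channels` is a hypothetical name, not part of ZIO:

```shell
# Hypothetical helper (not part of ZIO): disable every channel of a
# cset except the one under test, given the cset's sysfs directory.
disable_other_channels() {
    cset_dir=$1; keep=$2
    for enable in "$cset_dir"/chan*/enable; do
        case "$enable" in
        "$cset_dir/chan$keep/enable") echo 1 > "$enable" ;;
        *) [ -e "$enable" ] && echo 0 > "$enable" ;;
        esac
    done
}

# With zio-zero loaded, channel 1 would be kept enabled with:
#   disable_other_channels /sys/zio/devices/zzero/cset0 1
```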

The sample size in `zio-zero` is 1 byte, and the default block size is
16 samples (see `/sys/zio/devices/zzero/cset0/trigger/nsamples`). So
we can read 16 bytes one million times from `/dev/urandom`:

    spusa.root# dd bs=16 count=1000000 if=/dev/urandom > /dev/null
    1000000+0 records in
    1000000+0 records out
    16000000 bytes (16 MB) copied, 2.11017 s, 7.6 MB/s

We can then do the same with the ZIO device:

    spusa.root# dd bs=16 count=1000000 if=/dev/zzero-0-1-data > /dev/null
    1000000+0 records in
    1000000+0 records out
    16000000 bytes (16 MB) copied, 2.46607 s, 6.5 MB/s

The difference is 0.355 seconds over one million blocks, which means
0.355 microseconds per block. I repeated the test several times and
picked a run near the middle; the variation between runs, on an
unloaded machine, is within 1%.
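
The per-block figure is just the difference of the two timings spread
over the blocks transferred; a quick recomputation with `awk`, using
the numbers reported by `dd` above:

```shell
# Per-block overhead: difference of the two dd timings, divided by
# the one million blocks that were transferred.
awk 'BEGIN {
    urandom = 2.11017    # seconds, dd from /dev/urandom
    zio     = 2.46607    # seconds, dd from /dev/zzero-0-1-data
    blocks  = 1000000
    printf "%.4f us per block\n", (zio - urandom) / blocks * 1e6
}'
```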
## The vmalloc buffer

The new `vmalloc` buffer does even better: there is no need to
`read()` the data, just map a pointer to it. The difference is not
very big with `/dev/urandom` because generating the random data takes
most of the processing time in the test.

The suggested test here is the following:

    size=16; while [ $size -lt 64000 ]; do
        echo
        echo $size
        echo $size > /sys/zio/devices/zzero/cset0/trigger/nsamples
        n=$(expr 16 \* 1048576 / $size)
        dd bs=$size count=$n if=/dev/urandom of=/dev/null 2>&1 | grep copied
        /tmp/zio-cat-file /dev/zzero-0-1-data $n > /dev/null
        size=$(expr $size \* 2)
    done
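
To turn the loop's output into numbers that can be tabulated or
plotted, the elapsed seconds can be pulled out of dd's "copied" line.
A minimal sketch with `sed`; the sample line is the one obtained above
for `/dev/urandom`:

```shell
# Extract the elapsed seconds from a dd summary line.
line='16000000 bytes (16 MB) copied, 2.11017 s, 7.6 MB/s'
secs=$(printf '%s\n' "$line" | sed -n 's/.*copied, \([0-9.]*\) s.*/\1/p')
echo "$secs"
```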

This is done twice, using the two buffers available:

    echo vmalloc > /sys/zio/devices/zzero/cset0/current_buffer
    echo kmalloc > /sys/zio/devices/zzero/cset0/current_buffer

The result is plotted in the following figure. The first numbers
obtained, for 1048576 reads of 16 bytes, are:

    dd: 2.213750
    kmalloc: 2.763513
    vmalloc: 2.603158

This means that the overhead of reading the full block (both control
and data) is 0.52 microseconds per block, while reading the control
and accessing the data with `mmap` costs 0.37 microseconds per block
over a plain read.
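
Again the per-block figures come from dividing the extra time by the
1048576 reads performed; a quick recomputation:

```shell
# Per-block overheads for the 16-byte case: each total is compared
# with the plain dd baseline and divided by the 1048576 reads.
awk 'BEGIN {
    dd = 2.213750; kmalloc = 2.763513; vmalloc = 2.603158; n = 1048576
    printf "read (kmalloc): %.2f us per block\n", (kmalloc - dd) / n * 1e6
    printf "mmap (vmalloc): %.2f us per block\n", (vmalloc - dd) / n * 1e6
}'
```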

For large data sizes, the saving from accessing the data in place
instead of reading it will be larger than this per-block overhead,
because the copy itself is avoided. However, I have taken no
measurements so far.

### Files