#47 poor sequential fb perf

Status: pending
Owner: None
Priority: 6
Updated: 2014-07-31
Created: 2011-08-19
Private: No

Have you ever tried comparing the performance of fb vs. iozone or dd?

We aren't too sure about random workloads right now, but for both sequential read and write, performance is at least 30% below iozone, and as much as 70% below in some cases when the iosize is very small. Do you have any insight into this? At this point we cannot use fb until this is resolved.

Here is our sequential write .f file; is there a problem with it?

#define file name=bigfile1,path=/mnt/boar7,size=1024m,prealloc,reuse,cached=false
define file name=bigfile1,path=/gpfs/gpfsA,size=1024m,prealloc,reuse,cached=false

define process name=filewriter,instances=1
{
thread name=filewriterthread,memsize=10m,instances=1
{
flowop write name=writefile,filesetname=bigfile1,iosize=4k,iters=262144
flowop fsync name=fsync
flowop finishonbytes name=finish,value=1
}
}

create file
run 6000

Discussion

  • Vasily Tarasov - 2011-08-19
    • status: open --> open-accepted
     
  • Vasily Tarasov - 2011-08-19

    [root@hobbes wrkld]# filebench -f low-write-perf.f
    Filebench Version 1.4.9
    IMPORTANT: Virtual address space randomization is enabled on this machine!
    It is highly recommended to disable randomization to provide stable Filebench runs.
    Echo 0 to /proc/sys/kernel/randomize_va_space file to disable the randomization.
    5418: 0.000: Allocated 170MB of shared memory
    5418: 0.011: Creating/pre-allocating files and filesets
    5418: 0.030: File bigfile1: 1024.000MB
    5418: 0.033: Removed any existing file bigfile1 in 1 seconds
    5418: 0.033: making tree for filset /mnt//bigfile1
    5418: 0.063: Creating file bigfile1...
    5418: 30.319: Preallocated 1 of 1 of file bigfile1 in 31 seconds
    5418: 30.319: waiting for fileset pre-allocation to finish
    5422: 30.320: Starting 1 filewriter instances
    5423: 30.321: Starting 1 filewriterthread threads
    5418: 31.323: Running...
    5418: 91.392: Run took 60 seconds...
    5418: 91.403: Per-Operation Breakdown
    writefile 415781ops 6923ops/s 27.0mb/s 0.1ms/op 45us/op-cpu [0ms - 226ms]
    5418: 91.403: IO Summary: 415781 ops, 6922.828 ops/s, (0/6923 r/w), 27.0mb/s, 87us cpu/op, 0.1ms latency
    5418: 91.403: Shutting down processes
    [root@hobbes wrkld]# dd if=/dev/zero of=/mnt/
    bigfile1/ lost+found/
    [root@hobbes wrkld]# dd if=/dev/zero of=/mnt/bigfile1/00000001/00000001
    ^C800765+0 records in
    800765+0 records out
    409991680 bytes (410 MB) copied, 2.48309 seconds, 165 MB/s

    [root@hobbes wrkld]# dd bs=4096 if=/dev/zero of=/mnt/bigfile1/00000001/00000001 count=262144
    262144+0 records in
    262144+0 records out
    1073741824 bytes (1.1 GB) copied, 26.938 seconds, 39.9 MB/s
    [root@hobbes wrkld]#

     
  • Vasily Tarasov - 2011-08-19
    • status: open-accepted --> closed-invalid
     
  • Vasily Tarasov - 2011-08-19

    If I run regular dd, I get 39-40MB/sec throughput on my test machine:

    # dd bs=4096 if=/dev/zero of=/mnt/bigfile1/00000001/00000001 count=262144
    262144+0 records in
    262144+0 records out
    1073741824 bytes (1.1 GB) copied, 27.1796 seconds, 39.5 MB/s

    If I run the original Filebench workload file that you sent to me, I get 25-26MB/sec:

    # filebench -f low-write-perf.f
    IO Summary: 262145 ops, 6547.178 ops/s, (0/6547 r/w), 25.5mb/s

    Here is the reason for the difference. The workload file that you sent to me contains an fsync operation, so Filebench fsyncs the file after writing it. I removed this operation from the workload file and the throughput is comparable now (about 38MB/sec):
    # filebench -f low-write-perf.f
    196: 28.031: IO Summary: 262144 ops, 9699.496 ops/s, (0/9699 r/w), 37.9mb/s
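
    For reference, with the fsync flowop removed, the thread body of the workload quoted in the report reduces to the following (everything else unchanged):

    thread name=filewriterthread,memsize=10m,instances=1
    {
    flowop write name=writefile,filesetname=bigfile1,iosize=4k,iters=262144
    flowop finishonbytes name=finish,value=1
    }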

    On the other hand, if you run dd with the "conv=fsync" parameter, you get the same numbers as with the original Filebench file (26-27MB/sec):
    #dd bs=4096 if=/dev/zero of=/mnt/bigfile1/00000001/00000001 count=262144 conv=fsync
    262144+0 records in
    262144+0 records out
    1073741824 bytes (1.1 GB) copied, 40.384 seconds, 26.6 MB/s

    Notice that the remaining difference of about 2 MB/sec in the output is caused by a difference in units: dd defines MB as 1000 * 1000 bytes, while Filebench defines MB as 1024 * 1024 bytes.
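
    Converting the dd numbers above to Filebench's binary units makes the two line up (a quick check):

    39.5 MB/s * (1000*1000)/(1024*1024) ≈ 37.7 mb/s   (Filebench without fsync: 37.9 mb/s)
    26.6 MB/s * (1000*1000)/(1024*1024) ≈ 25.4 mb/s   (Filebench with fsync: 25.5 mb/s)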

    I tried iozone as well, and got the same write throughput numbers: 38.6MB/sec without flush, and 26.0MB/sec with flush (the "-e" flag):
    #./iozone -i 0 -s `expr 1024 \* 1024` -f /mnt/file
    KB reclen write rewrite
    1048576 4 38657 41179

    #./iozone -e -i 0 -s `expr 1024 \* 1024` -f /mnt/file
    KB reclen write rewrite
    1048576 4 26027 27525
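
    (iozone reports throughput in KB/s; assuming its usual binary KB, 38657/1024 ≈ 37.8 mb/s without flush and 26027/1024 ≈ 25.4 mb/s with -e, which lines up with the Filebench numbers above.)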

     
  • Vasily Tarasov - 2011-08-23
    • status: closed-invalid --> open-accepted
     
  • Vasily Tarasov - 2011-08-23

    I created a tmpfs file system to rule out the disk bottleneck, and here are the numbers I'm seeing. Note that a single raid5 4+1p device in our ds4700 can do ~200MB/s, so we see the problem there too, but I figure the tmpfs case is easier to reproduce:
    >mkdir /rd
    >mount -t tmpfs -o size=2G tmpfs /rd

    As you can see from the numbers below, iozone/dd gets 0.7-1.3 GB/s for writes and 1.1-2.3 GB/s for reads, whereas fb can do no better than 93MB/s in either case. That is a pretty stark difference, and it really renders fb useless in high-throughput environments.

    write 1GB file

    Filebench 1.4.9
    writefile 262145ops 23830ops/s 93.1mb/s 0.0ms/op 21us/op-cpu [0ms - 0ms]

    >iozone -aec -+n -i 0 -s 1024m -r 4k -f /rd/disk1 -w
    KB       reclen  write    rewrite
    1048576  4       1303224  0

    # dd if=/dev/zero of=/rd/disk1 bs=1024 count=1048576 conv=fsync
    1073741824 bytes (1.1 GB) copied, 1.53124 s, 701 MB/s

    read 1GB file

    Filebench 1.4.9
    readfile 262145ops 23830ops/s 93.1mb/s 0.0ms/op 20us/op-cpu [0ms - 0ms]

    >iozone -aec -+n -i 1 -s 1024m -r 4k -f /rd/disk1 -w
    KB       reclen  read     reread
    1048576  4       2304197  0

    # dd if=/rd/disk1 of=/dev/null bs=1024 count=1048576
    1073741824 bytes (1.1 GB) copied, 0.967837 s, 1.1 GB/s
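
    (The read-side .f file isn't quoted here; a minimal sequential-read counterpart to the write workload from the report would look roughly like this, with the path pointed at the tmpfs mount and the run length assumed:)

    define file name=bigfile1,path=/rd,size=1024m,prealloc,reuse,cached=false

    define process name=filereader,instances=1
    {
    thread name=filereaderthread,memsize=10m,instances=1
    {
    flowop read name=readfile,filesetname=bigfile1,iosize=4k,iters=262144
    }
    }

    run 60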

     
  • Vasily Tarasov - 2011-08-23

    Guys, I did some more profiling using tmpfs, and here are the numbers that I got:

    1) Run of non-modified Filebench: 45MB/sec
    2) Run without reading /proc/<pid>/stat and /proc/stat for every flowop: 340MB/sec
    3) Point 2 + not setting random buffer offset: 511MB/sec
    4) Point 3 + not calling gettimeofday for every operation (to collect latency): 1022MB/sec
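
    One way to confirm these overhead sources independently is to count system calls during a run, e.g. with strace (a sketch, assuming strace is installed and the workload file is the one from this report):

    # -c prints a per-syscall summary, -f follows the worker processes;
    # the per-flowop /proc reads show up as extra open/read/close calls relative
    # to the number of writes (gettimeofday may go through the vDSO and not appear)
    strace -c -f filebench -f low-write-perf.f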

    write with dd:

    [root@hobbes wrkld]# dd bs=4096 if=/dev/zero of=/mnt/tmpfs/bigfile1/00000001/00000001 count=262144

    262144+0 records in
    262144+0 records out
    1073741824 bytes (1.1 GB) copied, 1.47869 seconds, 726 MB/s

    OVERwrite with dd (notice "conv=notrunc" at the end of the invocation line):

    [root@hobbes wrkld]# dd bs=4096 if=/dev/zero of=/mnt/tmpfs/bigfile1/00000001/00000001 count=262144 conv=notrunc

    262144+0 records in
    262144+0 records out
    1073741824 bytes (1.1 GB) copied, 1.03275 seconds, 1.0 GB/s

    Notice that the Filebench workload file that you sent to me was doing an overwrite, because "prealloc" was specified in the file definition.

    I'm attaching a patch that makes Filebench run with low overhead. You need to apply it _after_ the ./configure step! This is a temporary solution; I'll code it properly later.
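
    In case it helps, the build sequence would then be something like this (a sketch; the patch file name and -p level depend on the attachment):

    ./configure
    patch -p1 < low-overhead.patch
    make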

     
  • Vasily Tarasov - 2014-07-31
    • Status: open-accepted --> pending