CLSmith

emibench: EMI testing for real-world OpenCL programs

We provide 10 OpenCL benchmarks drawn from the Parboil [0] and Rodinia [1] suites. Each benchmark has been modified to support EMI testing: dead-by-construction blocks can be injected into the OpenCL kernel at online-compilation time. Each benchmark has an expected output that can be used to detect possible miscompilations.

[0] J. A. Stratton, C. Rodrigues, I. J. Sung, N. Obeid, L. W. Chang, N. Anssari, G. D. Liu, and W. W. Hwu. Parboil: A revised benchmark suite for scientific and commercial throughput computing. Technical Report IMPACT-12-01, University of Illinois, at Urbana-Champaign, March 2012.

[1] S. Che, M. Boyer, J. Meng, D. Tarjan, J. W. Sheaffer, S. Lee, and K. Skadron. Rodinia: A benchmark suite for heterogeneous computing. In Proceedings of the 2009 IEEE International Symposium on Workload Characterization, IISWC 2009, October 4-6, 2009, Austin, TX, USA, pages 44–54. IEEE, 2009.

The results of these experiments are discussed in our paper "Many-Core Compiler Fuzzing" (PLDI'15), Section 7.2 and Table 3.

Structure

Each benchmark has a src/ directory containing the modified application. In particular, the directories src/emi0 and src/emi1 contain EMI blocks (randomly generated CLSmith code blocks) that can be injected at online compilation time. We provide 125 EMI block variants (as header files 0.h, 1.h, ..., 124.h) for testing, in addition to the empty substitution (999.h). We have provided wrapper Python scripts to perform this testing automatically.

The common/ directory contains the EMI library functions mentioned in the paper for (a) allocating and initialising the dead array (called emi_data in emibench), which ensures that the EMI blocks are dynamically unreachable; and (b) compiling OpenCL kernels with (1) a given EMI block, (2) substitutions enabled or disabled (see "Injecting into real-world kernels" in Sec. 5), and (3) compiler optimisations enabled or disabled. All of these options can be specified on the command line to our wrapper scripts.
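How these three choices turn into OpenCL build options can be illustrated with a short sketch. The variable names below are illustrative, not the wrapper scripts' actual code; the flag strings are taken from the transcript shown under "Running an EMI test":

```shell
# Sketch of how the OpenCL build options are assembled from the three
# command-line choices (illustrative only; see the wrapper scripts for
# the actual logic).
emi_block=0
optimisations=0
substitutions=0

flags="-I src"
[ "$substitutions" -eq 0 ] && flags="$flags -D NO_SUBSTITUTION"
[ "$optimisations" -eq 0 ] && flags="$flags -cl-opt-disable"
flags="$flags -D EMI_BLOCK=$emi_block"
echo "$flags"
# -> -I src -D NO_SUBSTITUTION -cl-opt-disable -D EMI_BLOCK=0
```

Here -D EMI_BLOCK=N selects which header (N.h) is included into the kernel at online compilation time, and -cl-opt-disable turns off the OpenCL compiler's optimisations.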

The results/ directory contains raw CSV files generated by running EMI testing over different configurations. The CSV files for some configurations have been anonymized.

The bugs/ directory contains bugs that we have reduced from wrong-code errors. Certain bugs have been anonymized.

Prerequisites

Linux requires Make and g++

Windows requires Visual Studio

Building under Linux

Assuming a bash shell, export the path to your OpenCL headers and libraries so that $CLDIR/include/CL/cl.h and $CLDIR/lib/libOpenCL.so are correct.

$ export CLDIR=/path/to/opencl-install

Now try building the first benchmark, parboil_bfs:

$ cd parboil_bfs
$ make

If this worked, ./build/bfs should now have been built. Other benchmarks can be built similarly. Assuming that you are in the root directory of emibench, run the following to compile all other benchmarks:

$ ./linux_compile_all.sh

Move on to "Running an EMI test", below.

Building under Windows

First set the environment variables %OCL_INCLUDE% and %OCL_LIB% so that %OCL_INCLUDE%\CL\cl.h exists and %OCL_LIB% points to your OpenCL.lib. For example, if using the AMD SDK:

set OCL_INCLUDE="C:\Program Files (x86)\AMD APP SDK\2.9-1\include"
set OCL_LIB="C:\Program Files (x86)\AMD APP SDK\2.9-1\lib\x86\OpenCL.lib"

You may find it useful to adapt the script "amd_set_windows_environment_vars.bat". Now try building the first benchmark, parboil_bfs:

cd parboil_bfs
windows_compile.bat

If this worked, build\bfs.exe should now have been built. Other benchmarks can be built similarly. Assuming that you are in the root directory of emibench, run the following to compile all other benchmarks:

windows_compile_all.bat

Move on to "Running an EMI test", below.

Running an EMI test

Change directory into a benchmark, say parboil_bfs:

$ cd parboil_bfs

Run an EMI test by using the wrapper script:

$ python ../scripts/runsingle.py --optimisations 0 --substitutions 0 --emi_block 0 --verbose

This will invoke the benchmark to generate the expected output (using the empty EMI block), then invoke the benchmark with the given EMI block (0), with substitutions off and optimisations disabled. You should expect output similar to:

./build/bfs -i data/1M/input/graph_input.dat -o bfs.out --optimisations 0 --substitution 0 --emi_block 999
# OpenCL compiler flags are [ -I src -D NO_SUBSTITUTION -cl-opt-disable -D EMI_BLOCK=999]
Starting GPU kernel
GPU kernel done
IO : 1.795649
Copy : 0.640985
Driver : 0.286905
Timer Wall Time: 2.723589
./build/bfs -i data/1M/input/graph_input.dat -o bfs.out --optimisations 0 --substitution 0 --emi_block 0
# OpenCL compiler flags are [ -I src -D NO_SUBSTITUTION -cl-opt-disable -D EMI_BLOCK=0]
Starting GPU kernel
GPU kernel done
IO : 1.856676
Copy : 0.739039
Driver : 0.332579
Timer Wall Time: 2.928346
OKAY

The OKAY flag means that the output matched the expected output (see Section "Exit flags" below).
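Conceptually, the verdict reduces to a byte-wise comparison of the reference output against the EMI-injected run's output. A self-contained sketch with dummy files (hypothetical file names; not the wrapper scripts' actual code):

```shell
# OKAY iff the EMI-injected run's output matches the reference run's output.
# Dummy files stand in for the real benchmark outputs here.
printf 'result\n' > expected.out   # output of the reference run (block 999)
printf 'result\n' > bfs.out        # output of the EMI-injected run
if cmp -s expected.out bfs.out; then echo OKAY; else echo BAD_DIFF; fi
# -> OKAY
```

Since the injected blocks are dynamically unreachable, any difference between the two outputs points at the compiler rather than the benchmark.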

By default the wrapper scripts will target platform 0, device 0. If you wish to target a different device then provide --platform X and --device Y flags to runsingle.py, like so:

$ python ../scripts/runsingle.py --optimisations 0 --substitutions 0 --emi_block 0 --verbose -- --platform X --device Y

NB: the "--" separator means that the --platform and --device flags are passed directly to the benchmark executable.

Running all EMI tests

Assuming that runsingle.py worked above, all other benchmarks can be compiled similarly and tested using the same wrapper script. The script root_runall.py in the root directory can be invoked as follows:

$ python -u root_runall.py | tee out.csv
# benchmark, opt, sub, emiblock, result
bfs,0,0,0,OKAY
bfs,0,0,1,OKAY
[...]

This writes CSV output to stdout and to out.csv, exhaustively testing all 500 possible EMI injections for each benchmark (125 EMI blocks x 2 optimisation settings x 2 substitution settings). As noted in the paper, we skip the spmv and myocyte benchmarks due to data races. For details about the various flags that you may observe, see the Section "Exit flags" below.
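Once out.csv has been produced, a quick tally of the result flags can be obtained with standard tools (shown here on a tiny inline sample in the same format, standing in for a real run):

```shell
# Count occurrences of each result flag in an out.csv-style file,
# skipping the header line. A small sample stands in for a real run.
printf '%s\n' '# benchmark, opt, sub, emiblock, result' \
  'bfs,0,0,0,OKAY' 'bfs,0,0,1,OKAY' 'bfs,0,1,0,BAD_DIFF' > sample.csv
tail -n +2 sample.csv | cut -d, -f5 | sort | uniq -c | sort -rn
# ->   2 OKAY
# ->   1 BAD_DIFF
```

Any non-OKAY flag in the tally is worth investigating; the flags are described under "Exit flags".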

To target a different platform use the same flags as for runsingle.py:

$ python -u root_runall.py -- --platform X --device Y

NB: the "--" separator means that the --platform and --device flags are passed directly to the benchmark executable.

Exit flags

The following flags are possible outputs from each EMI test. The expected result refers to the result of the test when using an empty EMI injection (i.e., the original program).

OKAY: The test using the EMI injection generated the expected result (no bug found). Corresponds to a \tick in Table 3.
BAD_DIFF: The test using the EMI injection *did not* generate the expected result (possible bug found). Corresponds to a \textbf{w} in Table 3.
BAD_EXIT_CODE: The test did not exit successfully (non-zero exit code). Corresponds to a \textbf{c} in Table 3.
BAD_TIMEOUT: The test timed out. Corresponds to a \textbf{to} in Table 3.

Additionally, the script can output the following flags but these should not usually be observed:

BAD_NO_OUTPUT: The test exited successfully (exit code 0) but failed to produce any output.
BAD_IDENTIFIER: The test was not able to compile the EMI injection because a proper substitution of variables was not provided, giving rise to an "identifier not found" error during compilation.

Finally, the wrapper script will try to fast-forward through repeating results so you may observe:

SKIP_REPEAT: The same non-OKAY result has appeared >10 times.
This feature can be disabled by passing "--no_skip" to the script.