Many-Core Compiler Fuzzing: The Artifact

Overview

We present here the artifact that accompanies our PLDI'15 paper. In principle it should be possible to reproduce the findings of our paper for each of the non-anonymous OpenCL configurations we tested. However, doing so requires access to specific hardware and drivers.

To ensure that reproducibility is possible for at least one configuration, we provide a virtual machine that is equipped with the Intel OpenCL SDK, and with the Oclgrind emulator (configuration 17 in the paper) as a device on which kernels can be executed.

Unfortunately it is not possible to expose other devices via a virtual machine (in particular, it does not even appear to be possible to execute kernels on the host CPU via the virtual machine). We thus provide instructions on how to build CLsmith so that the evaluators can run the tool natively to test other OpenCL implementations if they wish; this will require downloading and installing an SDK from one of the main vendors, e.g. Intel (Linux download) or AMD (download).

The rest of this guide is structured as follows:

Abstract and Paper

Here is the submitted paper in PDF form. The abstract now follows:

We address the compiler correctness problem for many-core systems through novel applications of fuzz testing to OpenCL compilers. Focusing on two methods from prior work, random differential testing and testing via equivalence modulo inputs (EMI), we present several strategies for random generation of deterministic, communicating OpenCL kernels, and an injection mechanism that allows EMI testing to be applied to kernels that otherwise exhibit little or no dynamically-dead code. We use these methods to conduct a large, controlled testing campaign with respect to 19 OpenCL (device, compiler) configurations, covering a range of CPU, GPU, accelerator, FPGA and emulator implementations. Our study provides independent validation of claims in prior work related to the effectiveness of random differential testing and EMI testing, proposes novel methods for lifting these techniques to the many-core setting, reveals a significant number of OpenCL compiler bugs in commercial implementations, and acts as a call to arms for higher quality OpenCL compilers from many-core device vendors.
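As an illustration of the random differential testing mentioned in the abstract, the following minimal sketch runs no kernels itself. It only shows how outputs collected from several configurations could be compared. It assumes a simple strict-majority vote. The function name `flag_outliers` and the configuration labels are hypothetical, and this is not necessarily the exact voting procedure used in the paper.

```python
from collections import Counter


def flag_outliers(results):
    """Return the configurations whose output disagrees with the majority.

    `results` maps a (hypothetical) configuration label to the output that
    configuration produced for one kernel. An empty list is returned when no
    output is shared by a strict majority, since no verdict can be reached.
    """
    if not results:
        return []
    counts = Counter(results.values())
    majority_output, votes = counts.most_common(1)[0]
    if votes * 2 <= len(results):
        # No strict majority: we cannot tell which output is the wrong one.
        return []
    return sorted(cfg for cfg, out in results.items() if out != majority_output)


# Hypothetical outputs of one kernel on three configurations.
outputs = {"config_a": "0x1f", "config_b": "0x1f", "config_c": "0x2a"}
print(flag_outliers(outputs))
```

A configuration flagged this way is only a candidate for a miscompilation; the disagreement still has to be investigated by hand.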

Virtual Machine

We have prepared a Virtual Machine which contains a built CLsmith, a built emibench and an OpenCL emulator, Oclgrind. Oclgrind is useful as it allows for OpenCL testing in a hardware-independent environment. We have also added our generated tests and obtained results, which can be used to reproduce the information available in the paper. The Virtual Machine is available for download here.

The Virtual Machine consists mainly of a .vdi file, which is a VirtualBox hard disk image. VirtualBox can be downloaded from here.

Creating a new Virtual Machine and attaching the .vdi:

CLsmith Tool

CLsmith is a tool built upon Csmith that generates random OpenCL kernels. It is available for download here. The only requirements are an OpenCL-compatible device and the OpenCL headers and libraries (which can be obtained from a proprietary OpenCL SDK).

Design of CLsmith

This PDF document provides usage details for CLsmith and includes a section on the design of the tool (Section 6), which explains where in the source code the novel features described in the paper are implemented.

Building CLsmith (for Linux):

A successful build will generate the binaries CLSmith and cl_launcher in {CLsmith-root}/build.

Running CLsmith:

Results

The results can be seen in Table 4 (from running 5000 * 6 randomly generated OpenCL programs) and Table 5 (from running 61 * 41 = 2501 randomly generated OpenCL programs with dead-by-construction EMI blocks, where statements inside these blocks are either deleted or, as a novel approach, lifted when they are inside a block).
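As a quick sanity check on the program counts quoted for the two tables, the arithmetic can be spelled out as follows. The variable names are purely illustrative, and the snippet makes no assumption about what the individual factors stand for.

```python
# Program counts quoted for Table 4 and Table 5.
table4_programs = 5000 * 6
table5_programs = 61 * 41

print(table4_programs)  # 30000
print(table5_programs)  # 2501
```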

To reproduce these results inside our VM, go into one of the two folders inside the Data folder and execute:

NOTE: Table 5 in our paper was generated by a buggy script that produced different results. The bug has been fixed, and a single, correct table is now generated. The table in our paper is therefore incorrect, and several programs in it were classified incorrectly. We are fixing this error in the camera-ready version.

emibench

emibench is a set of hand-picked real-world OpenCL benchmarks, modified to employ the EMI method. The modification consists of inserting randomly generated, syntactically correct OpenCL code into the existing kernels, guarded by a condition that is guaranteed to fail. We investigate how the program's execution is altered by the presence of this new dead code.
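The kind of modification described above can be sketched as a small source-to-source transformation. This is a minimal illustration, not the emibench implementation. The helper name `inject_dead_block`, the guard expression, and the `emi_flag` buffer are all hypothetical. The sketch assumes the host fills `emi_flag` with zeros, so the guard is false at run time even though the compiler cannot prove it.

```python
def inject_dead_block(kernel_src, dead_code, guard="emi_flag[0] != 0"):
    """Insert `dead_code` at the start of the kernel body, behind `guard`.

    Hypothetical helper: the guard reads a buffer that the host is assumed
    to zero-initialise, so the inserted block never executes.
    """
    head, brace, body = kernel_src.partition("{")
    if not brace:
        raise ValueError("no kernel body found")
    return f"{head}{{\n  if ({guard}) {{ {dead_code} }}{body}"


# A toy kernel standing in for a real benchmark kernel.
original = (
    "__kernel void k(__global int *out, __global int *emi_flag) {\n"
    "  out[get_global_id(0)] = 1;\n"
    "}"
)
variant = inject_dead_block(original, "out[0] = 42;")
print(variant)
```

Because the guarded block is never executed, the variant should produce the same output as the original kernel. Any difference between the two points to a possible compiler bug.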

Building the emibench benchmarks:

Running benchmarks:

Results

The results can be seen in Table 3 of our paper, which covers running all the hand-picked benchmarks on the devices we had at our disposal.

To reproduce these results inside our VM, go into the emibench/results folder and follow the same procedure as for Tables 4 and 5.