Correct and Efficient Accelerator Programming (Dagstuhl Seminar 13142)

Abstract

In recent years, massively parallel accelerator processors, primarily GPUs, have become widely available to end-users. Accelerators offer tremendous compute power at a low cost, and tasks such as media processing, simulation and eye-tracking can be accelerated to beat CPU performance by orders of magnitude. Performance is gained in energy efficiency and execution speed, allowing intensive media processing software to run in low-power consumer devices. Accelerators present a serious challenge for software developers. A system may contain one or more of the plethora of accelerators on the market, with many more products anticipated in the immediate future. Applications must exhibit portable correctness, operating correctly on any configuration of accelerators, and portable performance, exploiting processing power and energy efficiency offered by a wide range of devices.

The seminar focussed on the following areas:

  • Novel and attractive methods for constructing system-independent accelerator programs;
  • Advanced code generation techniques to produce highly optimised system-specific code from system-independent programs;
  • Scalable static techniques for analysing system-independent and system-specific accelerator programs both qualitatively and quantitatively.

The seminar featured five tutorials providing an overview of the landscape of accelerator programming:

  • Architecture — Anton Lokhmotov, ARM
  • Programming models — Lee Howes, AMD
  • Compilation techniques — Sebastian Hack, Saarland University
  • Verification - Ganesh Gopalakrishnan, University of Utah
  • Memory models — Jade Alglave, University College London

In addition, there were short presentations from 12 participants describing recent results or work-in-progress in these areas, and two discussion sessions:

  • Domain specific languages for accelerators;
  • Verification techniques for GPU-accelerated software.

Due to the “correctness” aspect of this seminar, there was significant overlap of interest with a full week seminar on Formal Verification of Distributed Algorithms running in parallel. To take advantage of this overlap a joint session was organised, featuring a talk on verification of GPU kernels by Alastair Donaldson, Imperial College London (on behalf of the Correct and Efficient Accelerator Programming seminar) and a talk on GPU-accelerated runtime verification by Borzoo Bonakdarpour, University of Waterloo, on behalf of the Formal Verification of Distributed Algorithms seminar.