Practical programming of FPGAs in the data center and on the edge

FPGAs are now essential components in the data center and on the edge, with millions deployed. FPGAs are found in a wide variety of system elements and provide such critical functions as software-defined networking (SDN), encryption/decryption, and compression. Yet for nearly all system providers, let alone system users, programming these FPGAs is impossible. Our overall goal is to enable FPGA application development by High Level Language (HLL) programmers, especially for the data center and the edge, and exclusively using existing open-source tools.

Obtaining Programmability / Performance / Portability (PPP) is one of the central problems in Computer Science and Engineering, with decades of work and multiple billion-dollar projects ongoing, but with only modest progress to date. This has especially been the case in compiling user applications to FPGAs: FPGAs have additional degrees of freedom, since not only the software but also the hardware can be programmed.

In recent work we have demonstrated that we may have reached a tipping point: PPP for FPGAs could be within reach. Our work (with an Intel High Level Synthesis, or HLS, environment) demonstrated performance comparable to expert code written directly in a Hardware Description Language (HDL). The critical insight was that much of the FPGA-specific information usually given to the compiler, e.g., about communication channels, was actually imposing unnecessary constraints and reducing performance. In our method, the code optimizations are mostly based on standard processes used in High Performance Computing (HPC); the figure shows the effectiveness of their systematic application through six rounds. We emphasize the significance of this result: previous optimization scripts (see “Ad hoc Code Tuning”) have rarely gotten performance within a factor of five of their HDL reference implementations.

Despite its promise, this approach still has drawbacks. One is that it requires substantial programmer intervention, although of a generic kind common in High Performance Computing. A more significant problem is that it depends on reverse-engineering a closed-source compiler, an inherently fragile approach.

In this project we are exploring a fully automatic and fully integrated approach based on the application of Machine Learning (ML) to automatically and dynamically discover the best optimizations and to apply them within an open-source compiler. We begin with our observation that modern compilers already have the innate capability to support PPP for FPGA applications, but do not know which optimization to apply when. The main idea is to have the compiler learn exactly that: which optimization passes to apply, in what order, for which application. We use GCC as our target.
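To make the pass-ordering problem concrete, the following is a minimal toy sketch in Python. It is not the project's code and does not use GCC: the ToyProgram model, the three pass functions, and the cost formula are all hypothetical, chosen only so that the same passes give different results in different orders. Exhaustive search stands in here for the ML component, which would be needed once the number of passes and applications makes enumeration infeasible.

```python
from dataclasses import dataclass, replace
from itertools import permutations


@dataclass(frozen=True)
class ToyProgram:
    """Hypothetical program summary; a real compiler IR is far richer."""
    calls: int     # call sites that have not been inlined
    foldable: int  # operations whose operands are known constants
    dead: int      # operations whose results are never used
    live: int      # operations that must remain

    def cost(self) -> int:
        # Assumed cost model: a call is three times as expensive as an operation.
        return 3 * self.calls + self.foldable + self.dead + self.live


def inline(p: ToyProgram) -> ToyProgram:
    # Inlining removes calls and exposes two constant-operand operations per call.
    return replace(p, calls=0, foldable=p.foldable + 2 * p.calls)


def const_fold(p: ToyProgram) -> ToyProgram:
    # Folding evaluates constant operations; half of them leave dead code behind.
    return replace(p, foldable=0, dead=p.dead + p.foldable // 2)


def dce(p: ToyProgram) -> ToyProgram:
    # Dead-code elimination removes every unused operation.
    return replace(p, dead=0)


PASSES = {"inline": inline, "const_fold": const_fold, "dce": dce}


def run_schedule(program: ToyProgram, schedule) -> ToyProgram:
    """Apply the named passes in the given order."""
    for name in schedule:
        program = PASSES[name](program)
    return program


def best_schedule(program: ToyProgram):
    """Brute-force stand-in for a learned policy: try every ordering."""
    return min(permutations(PASSES),
               key=lambda s: run_schedule(program, s).cost())


if __name__ == "__main__":
    start = ToyProgram(calls=4, foldable=2, dead=1, live=10)
    for schedule in permutations(PASSES):
        print(schedule, run_schedule(start, schedule).cost())
    print("best:", best_schedule(start))
```

In this toy model, inlining first exposes work for constant folding, which in turn exposes work for dead-code elimination, so the same three passes yield very different costs depending on their order. A learned scheduler would aim to predict such orderings per application rather than enumerate them.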

The project has multiple parts. First, we have modified GCC to enable scheduling of optimization passes. Next, we need to create an appropriate ML representation: one that encapsulates application and compiler information and is usable by an ML system. A “back end,” or output code generator, must be added that creates a synthesizable hardware specification, and there must be a way to quickly evaluate the performance of that synthesized hardware. Finally, since ML methods require vast amounts of data, an input code generator must be created.

Additional Project Resources:

For more information on this project and the unique partnership that produced it, please see the website of the Red Hat Collaboratory at Boston University.

Additional information on how to build an open-source toolchain for FPGAs can be found in the article by Ahmed Sanaullah in Red Hat Research Quarterly magazine, Volume 1, Issue 3.