FAB: An FPGA-based Accelerator for Bootstrappable Fully Homomorphic Encryption
Rashmi Agrawal, Boston University; Leo de Castro, MIT; Guowei Yang, Boston University; Chiraag Juvekar, Analog Devices; Rabia Yazicigil, Boston University; Anantha Chandrakasan, MIT; Vinod Vaikuntanathan, MIT; Ajay Joshi, Boston University
Fully Homomorphic Encryption (FHE) offers protection to private data on third-party cloud servers by allowing computations on the data in encrypted form. However, to support general-purpose encrypted computations, all existing FHE schemes require an expensive operation known as “bootstrapping”. Unfortunately, the computation cost and the memory bandwidth required for bootstrapping add significant overhead to FHE-based computations, limiting the practical use of FHE.
In this work, we propose FAB, an FPGA-based accelerator for bootstrappable FHE. Prior FPGA-based FHE accelerators have proposed hardware acceleration of basic FHE primitives for impractical parameter sets without support for bootstrapping. FAB, for the first time ever, accelerates bootstrapping (along with basic FHE primitives) on an FPGA for a secure and practical parameter set. Prior hardware implementations of FHE that included bootstrapping are heavily memory bound, leading to large execution times and wasted compute resources. The key contribution of our work is to architect a balanced FAB design, which is not memory bound. To this end, we leverage recent algorithms for bootstrapping while being cognizant of the compute and memory constraints of our FPGA. To architect a balanced FAB design, we use a minimal number of functional units for computing, operate at a low frequency, leverage high data rates to and from main memory, utilize the limited on-chip memory effectively, and perform operation scheduling carefully.
We evaluate FAB using a single Xilinx Alveo U280 FPGA and by scaling it to a multi-FPGA system consisting of eight such FPGAs. For bootstrapping a fully-packed ciphertext, while operating at 300 MHz, FAB outperforms existing state-of-the-art CPU and GPU implementations by 213× and 1.5×, respectively. Our target FHE application is training a logistic regression model over encrypted data. For logistic regression model training scaled to 8 FPGAs on the cloud, FAB outperforms a CPU and GPU by 456× and 6.5×, respectively, and provides competitive performance when compared to the state-of-the-art ASIC design at a fraction of the cost.