Pipelined streaming computation of histogram in FPGA OpenCL
Authors
Mohammad Hosseinabady
Jose Luis Nunez-Yanez
Contributors
Sanzio Bassini (Editor)
Marco Danelutto (Editor)
Patrizio Dazzi (Editor)
Gerhard R. Joubert (Editor)
Frans Peters (Editor)
Abstract
The emergence of High-Level Synthesis (HLS) techniques and tools, along with new features in high-end FPGAs such as multi-port memory interfaces, has enabled designers to use FPGAs not only for compute-bound but also for memory-bound tasks. This paper explains how to efficiently parallelise histogram computation, a memory-bound task, using the OpenCL framework running on an FPGA. We have run our implementation on three high-end FPGA platforms: Alpha Data 7v3, Alpha Data ADM-PCIE-KU3 and Xilinx KU115. A histogram with 256 fixed-width bins running on the 7v3, KU3 and KU115 platforms achieves 8.38, 15.29 and 38.57 giga bin updates per second (GUPS), respectively. The best result, 38.57 GUPS on the KU115 platform, outperforms the Nvidia GeForce 1060 GPU at 31.36 GUPS. It also outperforms the dual-socket 8-core Intel Xeon E5-2690 at 13 GUPS and the 60-core Intel Xeon Phi 5110P coprocessor at 18 GUPS. The proposed implementation is not sensitive to locally invariant (LI) data sets, whereas the performance of GPU and CPU implementations drops with LI data. When processing locally invariant data sets, our FPGA implementation can be up to 91.4% and 44.9% faster than the GeForce 1060 and 1080 GPUs, respectively. The source code of the designs is available at https://github.com/Hosseinabady/histogram-sdaccel.
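To make the GUPS metric concrete, the following is a minimal sequential sketch in C of a 256-bin histogram and of the throughput calculation. It is not the authors' pipelined OpenCL design. The function names `histogram256` and `gups` are hypothetical, and the sketch assumes 8-bit input samples so that each sample value indexes a bin directly (the abstract does not state the input width).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_BINS 256  /* 256 fixed-width bins, as in the abstract */

/* Baseline sequential histogram (illustrative only).
 * Assumption: samples are 8-bit, so each value maps to exactly one bin.
 * Every input sample causes one bin update. */
void histogram256(const uint8_t *data, size_t n, uint32_t bins[NUM_BINS])
{
    memset(bins, 0, NUM_BINS * sizeof bins[0]);
    for (size_t i = 0; i < n; ++i) {
        bins[data[i]]++;
    }
}

/* Throughput in giga bin updates per second (GUPS):
 * number of bin updates divided by elapsed seconds, scaled by 1e9. */
double gups(size_t updates, double seconds)
{
    return (double)updates / seconds / 1e9;
}
```

Under these assumptions, processing `n` samples performs `n` bin updates, so a run over `n` samples that takes `t` seconds has a throughput of `gups(n, t)`. Runs of identical consecutive values, as in locally invariant data, make successive iterations update the same bin, which is the kind of dependency that the abstract reports as slowing down GPU and CPU implementations.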
Presentation Conference Type | Conference Paper (published) |
---|---|
Publication Date | 2018 |
Deposit Date | Dec 11, 2023 |
Volume | 32 |
Pages | 632-641 |
Series Title | Advances in Parallel Computing |
Book Title | Parallel Computing is Everywhere |
ISBN | 9781614998426 |
DOI | https://doi.org/10.3233/978-1-61499-843-3-632 |
Public URL | https://uwe-repository.worktribe.com/output/11512172 |