Research Repository

All Outputs (4)

Sparse matrix-dense matrix multiplication on heterogeneous CPU+FPGA embedded system (2020)
Conference Proceeding
Hosseinabady, M., & Nunez-Yanez, J. (2020). Sparse matrix-dense matrix multiplication on heterogeneous CPU+FPGA embedded system. In Proceedings of the 11th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures / 9th Workshop on Design Tools and Architectures for Multicore Embedded Computing Platforms. https://doi.org/10.1145/3381427.3381428

Embedded intelligence is becoming the primary driver for new applications in industry, healthcare, and automotive, to name a few. The main characteristics of these applications are high computational demand, real-time interaction with the environment...

Run-time power modelling in embedded GPUs with dynamic voltage and frequency scaling (2020)
Conference Proceeding
Nunez-Yanez, J., Nikov, K., Eder, K., & Hosseinabady, M. (2020). Run-time power modelling in embedded GPUs with dynamic voltage and frequency scaling. https://doi.org/10.1145/3381427.3381429

This paper investigates the application of a robust CPU-based power modelling methodology that performs an automatic search of explanatory events derived from performance counters to embedded GPUs. A 64-bit Tegra TX1 SoC is configured with DVFS enabl...

Pipelined streaming computation of histogram in FPGA OpenCL (2018)
Conference Proceeding
Hosseinabady, M., & Nunez-Yanez, J. L. (2018). Pipelined streaming computation of histogram in FPGA OpenCL. In S. Bassini, M. Danelutto, P. Dazzi, G. R. Joubert, & F. Peters (Eds.), Parallel Computing is Everywhere (632-641). https://doi.org/10.3233/978-1-61499-843-3-632

The emergence of High-Level Synthesis (HLS) techniques and tools, along with new features in high-end FPGAs such as multi-port memory interfaces, has enabled designers to utilize FPGAs not only for compute-bound but also for memory-bound tasks. This...

Multi-precision convolutional neural networks on heterogeneous hardware (2018)
Conference Proceeding
Amiri, S., Hosseinabady, M., McIntosh-Smith, S., & Nunez-Yanez, J. (2018). Multi-precision convolutional neural networks on heterogeneous hardware. In 2018 Design, Automation & Test in Europe Conference & Exhibition (419-424). https://doi.org/10.23919/DATE.2018.8342046

Fully binarised convolutional neural networks (CNNs) deliver very high inference performance using single-bit weights and activations, together with XNOR type operators for the kernel convolutions. Current research shows that full binarisation result...