W05 OSSMPIC - Open Source Solutions for Massively Parallel Integrated Circuits

Start

Tue, 1 Apr 2025 14:00

End

Tue, 1 Apr 2025 18:00

Important Dates

Friday, 24 January 2025

Paper Submission Deadline

Monday, 10 February 2025

Author notification

Friday, 21 February 2025

Final Submission and Registration

Room

Rhône 1

General Chair

Kevin Martin, Université Bretagne Sud, France

General Chair

Adrian Evans, CEA/LIST, France

Programme Committee Member

Caroline Collange, INRIA, France

Programme Committee Member

David Defour, Université de Perpignan, France

Programme Committee Member

Hyesoon Kim, Georgia Institute of Technology, United States

Programme Committee Member

Leonidas Kosmidis, Barcelona Supercomputing Centre, Spain

Programme Committee Member

Christine Rochange, Université de Toulouse, France

Programme Committee Member

Blaise Tine, UCLA, United States

Programme Committee Member

Henk Corporaal, Eindhoven University of Technology, Netherlands

Programme Committee Member

Artur Podobas, KTH, Sweden

URL

Paper Submission Site

W05.1 Session 1: Invited Talks

Session Start

Tue, 14:10

Session End

Tue, 15:06

Keynote Speaker

Blaise Tine, UCLA, United States

Keynote Speaker

Henk Corporaal, Eindhoven University of Technology, Netherlands

This session includes two invited talks.

The Hidden Costs of Open-Source Hardware Research

Prof. Blaise Tine - UCLA

In this talk, Dr. Tine will delve into the challenges and trade-offs inherent in open-source hardware research. While open-source initiatives democratize hardware development and foster innovation, they also introduce hidden complexities that can impact adoption, collaboration, and long-term sustainability. Drawing from his extensive experience in open-source hardware and system design, Dr. Tine will shed light on these often-overlooked challenges and offer strategies for researchers and practitioners to navigate them effectively. Looking ahead, Dr. Tine will discuss emerging trends that could shape the future of open-source hardware research.

CGRAs for the Edge: Balancing Compute Efficiency and Flexibility

Prof. Henk Corporaal (Eindhoven University of Technology)

Driven by developments in AI and advanced signal processing, computational requirements are increasing enormously, not only in the cloud but even more so at the edge. Performing computation locally at the edge has substantial advantages: less data traffic, computation close to the sensor data, reliability, real-time feedback, and data privacy. This drives a strong demand for smart edge computing. Edge compute devices have limited resources and therefore require highly energy- and area-efficient computing, which naturally calls for highly specialized processors. However, high specialization typically means high development costs and lower volume. Much worse, it makes the processors inflexible; they cannot adapt to (late) application changes and code updates, which are very common in our fast-moving (software) world. Coarse-Grained Reconfigurable Architectures (CGRAs) may be the solution; they aim to strike a good balance between flexibility and compute efficiency. They can be easily tuned and scaled for application domains while staying flexible, especially when they are fully programmable. In this presentation, we give an overview of CGRAs and their recent developments. We define and characterize CGRAs more precisely, and we present a metric for flexibility. Designing CGRAs raises various challenges. We illustrate key concepts and challenges using the recent open-source R-Blocks CGRA as an example. Finally, we conclude with a glimpse into the future of CGRAs, exploring potential breakthroughs on the horizon.

W05.2 Session 2: Open Source GPU Applications

Session Start

Tue, 15:06

Session End

Tue, 15:30

Session chair

Kevin Martin, Université Bretagne Sud, France

resources, even when multiple tasks are sharing the GPU.

Presentations

A RISC-V Multicore and GPU SoC Platform with a Qualifiable Software Stack for Safety Critical Systems

Start

15:06

End

15:17

Speaker

Kevin Martin, Université Bretagne Sud, France

W05.2.1 GPGPUs on FPGAs: A Competitive Approach for Scientific Computing?

Start

15:18

End

15:30

Speaker

Eric Guthmuller, CEA/LIST, France

FPGA architectures include increasingly complex arithmetic operators and optimized hard IPs, such as memory subsystems and Networks-on-Chip (NoC). This evolution yields higher compute density coupled with high memory bandwidth. It represents an opportunity to tailor an architecture to niche application needs while remaining competitive with a costly ASIC implementation. More specifically, scientific computing requires high-precision (> 32 bits) floating-point computation. However, GPU vendors are progressively favoring low-precision performance for AI needs, and are even phasing out support for 64-bit floating-point compute. We present an analytical study motivating the need to investigate the implementation of an open-source 64-bit GPGPU architecture on a state-of-the-art FPGA, as an alternative to GPUs for scientific computing.

W05.3 Poster Session / Coffee Break

Session Start

Tue, 15:30

Session End

Tue, 16:30

Session chair

Kevin Martin, Université Bretagne Sud, France

Presentations

W05.3.1 Evaluation of CGRA Toolchains

Start

15:30

End

16:30

Author

Dominik Walter, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany

W05.3.2 From Concept to Silicon: Rapid GPGPU Core Design and Integration with Open-Source ASIC Tools

Start

15:30

End

16:30

Author

Wang Wang, University of California, Berkeley, United States

W05.3.3 Open-hardware GPUs as platforms for research: a feedback on the use of Vortex

Start

15:30

End

16:30

Author

Noïc Crouzet, Université de Toulouse, France

W05.3.4 Multiport Support for Vortex OpenGPU Memory Hierarchy

Start

15:30

End

16:30

Author

Injae Shin, University of California, Los Angeles, United States

W05.3.5 Benchmarking Floating Point Performance of Massively Parallel Dataflow Overlays on AMD Versal FPGA Compute Primitives

Start

15:30

End

16:30

Author

Mohamed Bouaziz, King Abdullah University of Science and Technology, Saudi Arabia

W05.4 Software and Tools

Session Start

Tue, 16:30

Session End

Tue, 16:54

Session chair

Adrian Evans, CEA/LIST, France

Presentations

W05.4.1 Hardware vs. Software Implementation of Warp-Level Features in Vortex RISC-V GPU

Start

16:30

End

16:42

Author

Huanzhi Pu, Georgia Institute of Technology, United States

RISC-V GPUs present a promising path for supporting GPU applications. Traditionally, GPUs achieve high efficiency through the SPMD (Single Program Multiple Data) programming model. However, modern GPU programming increasingly relies on warp-level features, which diverge from the conventional SPMD paradigm. In this paper, we explore how RISC-V GPUs can support these warp-level features both through hardware implementation and via software-only approaches. Our evaluation shows that a hardware implementation achieves up to 4 times geomean IPC speedup in microbenchmarks, while software-based solutions provide a viable alternative for area-constrained scenarios.

W05.4.2 Case Study on Combining Open-Source Tool Flows for Grids of Processing Cells

Start

16:42

End

16:54

Author

Lars Luchterhandt, Heinz Nixdorf Institute, Paderborn University, Paderborn, Germany

Massively parallel computer architectures based on identical microprocessor tiles are well known for their high scalability and performance. In this work, we introduce an open-source tool flow for scalable on-chip grids of RISC-V processor cells that seamlessly combines high-level SystemC modeling with the generation and simulation of hardware models at RTL, down to FPGA implementation using the Chipyard framework. Our experimental evaluation quantifies the speed-accuracy trade-offs at different abstraction levels and compares them with their physical implementation on an FPGA.

W05.5 Invited Talks

Session Start

Tue, 16:54

Session End

Tue, 17:50

Session chair

Kevin Martin, Université Bretagne Sud, France

Presentations

W05.5.1 X-HEEP + CGRAs + GPU work-in-progress activities

Start

16:54

End

17:22

Keynote Speaker

Davide Schiavone, ESL, EPFL, Switzerland

This talk presents an ongoing evaluation of a Very-Wide-Register Coarse-Grained Reconfigurable Array (CGRA) and a RISC-V GPU for edge computing nodes, carried out within ESL at EPFL. We are currently analyzing these architectures in TSMC 16nm technology, aiming to identify optimal solutions for diverse computational workloads. This work is still in progress, but we will discuss preliminary findings and insights, including HW and SW extensions for the open-source Vortex GPGPU. Furthermore, we will present three more CGRA designs, two of which have also been fabricated in TSMC 65nm LP. We will present initial results, highlighting their performance characteristics and potential applications. All of these accelerators are being integrated within the X-HEEP platform, a versatile RISC-V microcontroller system. X-HEEP leverages a rich ecosystem of open-source IPs, including CPUs from the OpenHW Group, uncore IPs from the PULP platform and OpenTitan project, and custom IPs. X-HEEP enables seamless integration and rapid prototyping. We will discuss the integration process and demonstrate how X-HEEP facilitates the evaluation and deployment of custom accelerators.

W05.5.2 ESP as an Open-Source Platform for Massively Parallel Integrated Circuits

Start

17:22

End

17:50

Keynote Speaker

Luca Carloni, Columbia University, United States

Open-source hardware can play a unique role in sparking interdisciplinary research across computer architecture, programming languages, operating systems and computer-aided design. Further, it can enable collaborative engineering among researchers in academic, industrial and government labs. ESP is an open-source research platform for SoC design that combines a scalable tile-based architecture and a flexible system-level design methodology. With ESP, designers can rapidly prototype an SoC architecture with multiple RISC-V processor cores and dozens of loosely coupled accelerators, all interconnected with a multiplane network-on-chip. Conceived as a heterogeneous system integration platform, ESP can scale to support the realization of massively parallel integrated circuits and chiplet-based systems.