GSoC Report: 64 bit global pointers in RV32 based GP-GPU

by Reshabh Sharma on September 3, 2019

This is a guest post by Reshabh Sharma, who worked this summer on a Google Summer of Code (GSoC) project under the umbrella of the FOSSi Foundation.

RISC-V will change the world. Prof. Taylor's Bespoke Silicon Group is contributing by developing a GP-GPU based on the 32-bit RISC-V ISA (RV32). This follows the huge success of their open-source RISC-V Tiered Accelerator Fabric SoC, Celerity, which holds the world record for RISC-V performance: 500B RISC-V instructions per second, beating prior records by 100x.

For compute, 32-bit cores are common where very high energy efficiency and density are required. Since GP-GPUs often require 4 GB or more of memory, and a 32-bit address can reach at most 4 GB, we need 64-bit addresses to access DRAM. This summer I worked under Prof. Taylor's guidance to begin adding support for custom instructions designed specifically for a RISC-V based GP-GPU. We started by supporting 64-bit pointers in address space 1, using custom load and store instructions in the RISC-V LLVM backend.

Getting Started

Our RISC-V LLVM backend fork is available here. Get started by building LLVM with our custom RISC-V backend fork following the steps given here.

Custom Instructions

The following new instructions have been added:

  • LDW rd, rs1, rs2: loads a value from the 64-bit address formed by concatenating the i32 values in two registers.
  • SDW rd, rs1, rs2: stores a value to the 64-bit address formed by concatenating the i32 values in two registers.
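
To make the addressing concrete, here is a minimal Python sketch of how two 32-bit register values could be joined into one 64-bit address. It is only a toy model, not the backend or hardware implementation. Several details are assumptions on my part, since the list above does not spell them out: that rs1 holds the upper half and rs2 the lower half, that rd holds the 32-bit value being loaded or stored, and the names `ToyMachine`, `concat_address`, `ldw` and `sdw`, which are purely illustrative.

```python
MASK32 = 0xFFFFFFFF


def concat_address(hi, lo):
    """Join two 32-bit values into one 64-bit address.

    Assumption: `hi` supplies bits 63..32 and `lo` supplies bits 31..0.
    """
    return ((hi & MASK32) << 32) | (lo & MASK32)


class ToyMachine:
    """Hypothetical model: 32 registers of 32 bits and a sparse 64-bit memory."""

    def __init__(self):
        self.regs = [0] * 32
        self.mem = {}  # maps a 64-bit address to a 32-bit value

    def ldw(self, rd, rs1, rs2):
        # Load from the address formed by the pair (rs1, rs2) into rd.
        addr = concat_address(self.regs[rs1], self.regs[rs2])
        self.regs[rd] = self.mem.get(addr, 0)

    def sdw(self, rd, rs1, rs2):
        # Store the value held in rd to the address formed by (rs1, rs2).
        addr = concat_address(self.regs[rs1], self.regs[rs2])
        self.mem[addr] = self.regs[rd] & MASK32


machine = ToyMachine()
machine.regs[5] = 0x1          # assumed upper half of the address
machine.regs[6] = 0x10         # assumed lower half of the address
machine.regs[7] = 0xDEADBEEF   # value to store
machine.sdw(7, 5, 6)           # writes to 0x1_0000_0010, beyond the 4 GB limit
machine.ldw(8, 5, 6)           # reads the same location back into register 8
```

The point of the sketch is that neither register alone can name an address above 4 GB, while the pair together can.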

Phase 1: Define new instructions

This phase was fairly simple and dealt with adding the new instructions to the RISC-V LLVM backend.

Phase 2: Update data layout string

The data layout string tells the front end the sizes and alignments of different entities such as integers and pointers. We updated the data layout string to support 64-bit pointers in address space 1.
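
As a rough illustration of what such a change looks like, the Python sketch below reads the pointer entries out of a data layout string. The string itself is an assumption: it resembles the upstream RV32 layout with a `p1:64:64` entry added for address space 1, and the exact string used in our fork may differ. The helper `pointer_specs` is a hypothetical name and handles only the pointer fields, not the full data layout grammar.

```python
# Assumed layout: RV32-style string plus a 64-bit pointer entry for address space 1.
ASSUMED_LAYOUT = "e-m:e-p:32:32-p1:64:64-i64:64-n32-S128"


def pointer_specs(layout):
    """Return {address_space: (size_bits, abi_align_bits)} for each pointer entry."""
    specs = {}
    for field in layout.split("-"):
        if not field.startswith("p"):
            continue  # skip endianness, mangling, integer and stack entries
        head, *rest = field.split(":")
        # A bare "p" means address space 0; "p1" means address space 1.
        addrspace = int(head[1:]) if len(head) > 1 else 0
        size_bits = int(rest[0])
        abi_align = int(rest[1]) if len(rest) > 1 else size_bits
        specs[addrspace] = (size_bits, abi_align)
    return specs


specs = pointer_specs(ASSUMED_LAYOUT)
print(specs)
```

With this assumed string, pointers in the default address space stay 32 bits wide, while pointers in address space 1 are 64 bits wide with 64-bit alignment.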

Phase 3: Lowering to custom load and store

Lowering to the custom store was a huge challenge, and I'm glad that we could complete it. More information about the challenges we faced during lowering can be found in this blog post.

All the code is hosted here. List of commits:


It was a wonderful experience working under the mentorship of Prof. Taylor. I also appreciate all the efforts of Neil Ryan and the awesome collaboration with the Bespoke Silicon Group. The task was complex and looked hard to complete in the given time frame, so I'm glad we did it. I got a lot of help from the llvm-dev mailing list and the riscv-llvm group, especially Alex Bradbury and Luís Marques, who are still helping us run perf benchmarks on Spike. A million thanks to everyone who supported us. Feel free to reach out with any feedback or suggestions.