Hunter McCoy

PhD Student in Computer Science

Concurrent GPU Hash Tables

Concurrent GPU Hash Tables

Adversarial test for concurrent safety

Authors: Hunter McCoy, Prashant Pandey

Status:Accepted to ALENEX`26

Abstract

GPU hash tables are increasingly used to accelerate data processing, but their limited functionality restricts adoption in large-scale data processing applications. Current limitations include incomplete concurrency support and missing compound operations such as upserts.

This paper presents WarpSpeed, a library of highperformance concurrent GPU hash tables with a unified benchmarking framework for performance analysis. WarpSpeed implements eight state-of-the-art Nvidia GPU hash table designs and provides a rich API designed for modern GPU applications. Our evaluation uses diverse benchmarks to assess both correctness and scalability, and we demonstrate real-world impact by integrating these hash tables into three downstream applications.

We propose several optimization techniques to reduce concurrency overhead, including fingerprint-based metadata to minimize cache line probes and specialized Nvidia GPU instructions for lock-free queries. Our findings provide new insights into concurrent GPU hash table design and offer practical guidance for developing efficient, scalable data structures on modern GPUs.