Concurrent Garbage Collection

To collect or not to collect, that is the garbage question.
—Unknown

All pointer-based nonblocking concurrent data structures should deal with the problem of safe memory reclamation: before reclaiming a memory block, a thread should make sure that no other threads are concurrently dereferencing the block. Various safe memory reclamation schemes have been proposed in the literature, but none of them is clearly better than the others in every aspect. The trade-offs indicate the complex nature of memory reclamation.

We aim to break these trade-offs by combining ideas from prior work with new ones of our own, producing an off-the-shelf solution for safe memory reclamation.

Publications

(PLDI 2025) Leveraging Immutability to Validate Hazard Pointers for Optimistic Traversals.
Janggun Lee, Jeonghyeon Kim, Jeehoon Kang.
ACM SIGPLAN conference on Programming Language Design and Implementation.
[paper: doi, local] [artifact: proofs, benchmark]

Abstract
Hazard pointers (HP) is one of the earliest manual memory reclamation algorithms for concurrent data structures. It is widely used for its robustness: memory overhead is bounded (e.g., by the number of threads). To access a node, threads first announce the protection of each to-be-accessed node, which prevents its reclamation. After announcement, they validate the node's reachability from the root to ensure that no threads have missed the announcement and reclaimed it. Traversal-based data structures typically take a marking-based validation strategy. This strategy uses a node's mark to indicate whether the node is to be detached. Unmarked nodes are considered safe to traverse as both the node and its successors are still reachable, while marked nodes are considered unsafe. However, this strategy is inapplicable to the efficient optimistic traversal strategy that skips over marked nodes.
We propose a new validation strategy for HP that supports lock-free data structures with optimistic traversal, such as lists, trees, and skip lists. The key idea is to exploit the immutability of marked nodes, and validate their reachability at once by checking the reachability of the most recent unmarked node. To ensure correctness, we prove the safety of Harris's list protected with the new strategy in Rocq using the Iris separation logic framework. We show that the new strategy's performance is competitive with state-of-the-art reclamation algorithms when applied to data structures with optimistic traversal, while remaining simple and robust.
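The announce-then-validate protocol described in this abstract can be illustrated with a minimal C++ sketch. It shows only the simplest form of validation, re-reading the source pointer after the announcement; neither marking-based validation nor the paper's new strategy is modeled. All names here (`Node`, `g_hazard`, `kMaxThreads`, `protect`, `clear`, `scan`) are hypothetical and chosen for illustration, and the sketch assumes one hazard slot per thread and that nodes are unlinked with a sequentially consistent store before being retired.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
    int value;
    std::atomic<Node*> next{nullptr};
    explicit Node(int v) : value(v) {}
};

// Assumption: a fixed number of threads, one hazard slot each.
constexpr std::size_t kMaxThreads = 4;
std::atomic<Node*> g_hazard[kMaxThreads];  // static storage, starts as nullptr

// Announce protection of the node `src` points to, then validate by
// re-reading `src`. If `src` changed in between, a reclaimer may have
// missed the announcement, so announce the new value and try again.
Node* protect(std::size_t tid, const std::atomic<Node*>& src) {
    Node* p = src.load(std::memory_order_acquire);
    for (;;) {
        g_hazard[tid].store(p, std::memory_order_seq_cst);
        Node* q = src.load(std::memory_order_seq_cst);
        if (q == p) return p;  // validated: still reachable through src
        p = q;
    }
}

// Withdraw the announcement made by thread `tid`.
void clear(std::size_t tid) {
    g_hazard[tid].store(nullptr, std::memory_order_release);
}

// Free every retired node that no thread currently announces.
// Nodes that are still protected stay in `retired`. Returns the number freed.
std::size_t scan(std::vector<Node*>& retired) {
    std::vector<Node*> keep;
    std::size_t freed = 0;
    for (Node* n : retired) {
        bool hazardous = false;
        for (auto& h : g_hazard) {
            if (h.load(std::memory_order_seq_cst) == n) hazardous = true;
        }
        if (hazardous) {
            keep.push_back(n);
        } else {
            delete n;
            ++freed;
        }
    }
    retired.swap(keep);
    return freed;
}
```

In a single-threaded walk-through, a node that is protected and then unlinked survives a call to `scan`; once the protecting thread calls `clear`, the next `scan` frees it. The bounded memory overhead mentioned above follows from the fact that at most `kMaxThreads` retired nodes can be held back at any time.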

(PLDI 2025) Verifying General-Purpose RCU for Reclamation in Relaxed Memory Separation Logic.
Jaehwang Jung, Sunho Park, Janggun Lee, Jeho Yeon, Jeehoon Kang.
ACM SIGPLAN conference on Programming Language Design and Implementation (Distinguished Paper Award).
[paper: doi, local]

Abstract
Read-Copy-Update (RCU) is a critical synchronization mechanism for concurrent data structures, enabling efficient deferred memory reclamation. However, implementing and using RCU correctly is challenging due to its inherent concurrency complexities. While previous work has verified RCU, it either relied on the unrealistic assumption of a sequentially consistent (SC) memory model or lacked three key features of general-purpose RCU libraries: modular specification, switchable critical sections, and concurrent writer support.
We present the first formal verification of a general-purpose RCU in realistic relaxed memory consistency (RMC), addressing the challenges posed by these features. To achieve modular specification that encompasses relaxed behaviors, we extend existing SC specifications to account for explicit synchronization. To support switchable critical sections, which require read-after-write (RAW) synchronization, we introduce a reasoning principle for RAW-synchronizing SC fences. Using this principle, we also present the first formal verification of Peterson's mutex in RMC. To support concurrent writers performing partially ordered writes, we avoid assuming a total order of links and instead formulate invariants based on per-node incoming link histories. Our proofs are mechanized in the iRC11 relaxed memory separation logic, built upon Iris, in Rocq.
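The read-after-write (RAW) synchronization mentioned in this abstract can be illustrated with the classic store-buffering litmus test. In the sketch below, each thread writes its own flag, issues a sequentially consistent fence, and then reads the other thread's flag. The C++ memory model guarantees that at least one of the two threads observes the other's write, which is the property a reader entering a critical section and a writer checking for active readers both rely on. This is a generic illustration, not the paper's reasoning principle or its RCU implementation; the function name `count_both_zero` is hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the store-buffering litmus test `rounds` times and returns the
// number of rounds in which BOTH threads read 0. With the SC fences in
// place, the C++ memory model forbids that outcome.
int count_both_zero(int rounds) {
    int violations = 0;
    for (int i = 0; i < rounds; ++i) {
        std::atomic<int> x{0}, y{0};
        int r0 = -1, r1 = -1;
        std::thread t0([&] {
            x.store(1, std::memory_order_relaxed);                // write
            std::atomic_thread_fence(std::memory_order_seq_cst);  // RAW fence
            r0 = y.load(std::memory_order_relaxed);               // read
        });
        std::thread t1([&] {
            y.store(1, std::memory_order_relaxed);
            std::atomic_thread_fence(std::memory_order_seq_cst);
            r1 = x.load(std::memory_order_relaxed);
        });
        t0.join();
        t1.join();
        if (r0 == 0 && r1 == 0) ++violations;
    }
    return violations;
}
```

Without the two fences, both relaxed loads may return 0 on hardware with store buffers, so the fences are what make the pattern safe to build on.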

(Ph.D. Dissertation 2025) Design and Verification of Concurrent Memory Reclamation Algorithms.
Jaehwang Jung.
School of Computing, KAIST (Outstanding PhD Thesis Award).
[paper: local]

Abstract
Memory reclamation for optimistic concurrency in unmanaged programming languages is a challenging problem that poses subtle trade-offs. For example, Hazard Pointers is memory-efficient but not applicable to a wide range of efficient data structures, while reference counting provides an easy-to-use automatic interface but is slower than manual algorithms that are prone to usage errors leading to critical bugs. This dissertation tackles such trade-offs among various performance characteristics and ease of use through novel algorithm designs and formal verification.
First, we introduce HP++, an extension of Hazard Pointers to improve its applicability. Specifically, HP++ supports optimistic traversal algorithms while retaining the memory efficiency of Hazard Pointers by under-approximating unreachability and patching up potentially unsafe accesses. Second, we present Concurrent Immediate Reference Counting (CIRC), which achieves both high throughput and ease of use by wrapping the intricacy of fast manual algorithms in a safe and automatic interface of reference counting. At the same time, CIRC features immediate recursive reclamation, which is essential for quickly reclaiming linked structures.
Finally, to enhance confidence in the correctness of performant but difficult manual reclamation methods, we formally verify their algorithm and usage in concurrent data structures. Our approach is centered around modular specifications of memory reclamation algorithms based on concurrent separation logic, which enables compositional verification of concurrent data structures with memory reclamation. Furthermore, we extend our verification to relaxed memory consistency models, addressing the complexity caused by instruction reordering in modern hardware architectures and compiler optimizations.

(SPAA 2024) Expediting Hazard Pointers with Bounded RCU Critical Sections.
Jeonghyeon Kim, Jaehwang Jung, Jeehoon Kang.
ACM Symposium on Parallelism in Algorithms and Architectures (Best Paper Award).
[paper: doi, local] [artifact: benchmark]

Abstract
Reclamation schemes for concurrent data structures tackle the challenge of synchronizing memory accesses and reclamation. Early schemes faced a tradeoff between robustness and efficiency: hazard pointers (HP) bounds the number of unreclaimed nodes, but it is inefficient due to per-node protection; and RCU sacrifices robustness for efficiency as a single thread may block the entire reclamation. Recent schemes attempt to break the tradeoff by sending signals to blocking threads to abort their operations. However, they are (1) inefficient due to starvation in long-running operations and frequent signals, and (2) inapplicable to a wide class of data structures.
We design a novel reclamation scheme that overcomes the above limitations. To address the long-running operations and applicability, we propose HP-RCU, integrating RCU-expedited traversal that alternates between HP and RCU phases. To additionally ensure robustness against stalled threads, we develop HP-BRCU by modularly replacing RCU with bounded RCU (BRCU) that efficiently bounds the duration of RCU phases by rarely sending signals. We show that HP-BRCU is robust, widely applicable, and as efficient as RCU, outperforming robust schemes across various workloads.

(PLDI 2024) Concurrent Immediate Reference Counting.
Jaehwang Jung, Jeonghyeon Kim, Matthew J. Parkinson, Jeehoon Kang.
ACM SIGPLAN conference on Programming Language Design and Implementation.
[paper: doi, local] [artifact: benchmark]

Abstract
Memory management for optimistic concurrency in unmanaged programming languages is challenging. Safe memory reclamation (SMR) algorithms help address this, but they are difficult to use correctly. Automatic reference counting provides a simpler interface, but it has been less efficient than SMR algorithms. Recently, there has been a push to apply the optimizations used in garbage collectors for managed languages to elide reference count updates from local references. Notably, Fast Reference Counter, OrcGC, and Concurrent Deferred Reference Counting use SMR algorithms to protect local references by deferring decrements or reclamation. While they show a significant performance improvement, their use of deferral may result in growing memory usage due to slow reclamation of linked structures, and suboptimal performance in update-heavy workloads.
We present Concurrent Immediate Reference Counting (CIRC), a new combination of SMR algorithms with reference counting. CIRC employs deferral like other modern methods, but it avoids their problems with novel algorithms for (1) immediately reclaiming linked structures recursively by tracking the reachability of each object, and (2) applying decrements immediately and deferring only the reclamation. Our experiments show that CIRC's memory usage does not grow over time and is only slightly higher than the underlying SMR. Moreover, CIRC further narrows the performance gap with the underlying SMR, positioning it as a promising solution to safe automatic memory management for highly concurrent data structures in unmanaged languages.
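To make "immediately reclaiming linked structures recursively" concrete, here is a minimal sketch of plain atomic reference counting, not of CIRC itself. When a node's count drops to zero it is freed at once and the reference it held to its successor is dropped in turn, so an entire unreferenced chain is reclaimed in one call. The names `RcNode`, `release`, and `make_chain` are hypothetical. The sketch deliberately omits protection of local references: a thread that loads a pointer from shared memory while another thread drops the last count would race, which is the problem the SMR-based schemes above address.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

struct RcNode {
    std::atomic<std::size_t> count{1};  // the creator holds one reference
    RcNode* next = nullptr;             // owning link: accounts for one count of *next
};

// Drops one reference to `n`. If the count reaches zero, the node is freed
// immediately and the reference it held to its successor is dropped too.
// A worklist replaces recursion so long chains do not overflow the stack.
// Returns the number of nodes freed.
std::size_t release(RcNode* n) {
    std::size_t freed = 0;
    std::vector<RcNode*> work{n};
    while (!work.empty()) {
        RcNode* cur = work.back();
        work.pop_back();
        if (cur == nullptr) continue;
        if (cur->count.fetch_sub(1, std::memory_order_acq_rel) == 1) {
            work.push_back(cur->next);  // cascade to the successor
            delete cur;
            ++freed;
        }
    }
    return freed;
}

// Builds a chain of `len` nodes. Each node is held only by its predecessor,
// and the head is held by the caller. Returns nullptr when `len` is 0.
RcNode* make_chain(std::size_t len) {
    RcNode* head = nullptr;
    for (std::size_t i = 0; i < len; ++i) {
        RcNode* n = new RcNode;
        n->next = head;
        head = n;
    }
    return head;
}
```

Releasing the only reference to the head of a 1000-node chain frees all 1000 nodes in that call. If the head has a second holder, the first `release` frees nothing and the second one frees the whole chain.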

(OOPSLA 2023) Modular Verification of Safe Memory Reclamation in Concurrent Separation Logic.
Jaehwang Jung, Janggun Lee, Jaemin Choi, Jaewoo Kim, Sunho Park, Jeehoon Kang.
Object-oriented Programming, Systems, Languages, and Applications.
[paper: doi, local] [artifact: proofs]

Abstract
Formal verification is an effective method to address the challenge of designing correct and efficient concurrent data structures. But verification efforts often ignore memory reclamation, which involves nontrivial synchronization between concurrent accesses and reclamation. When incorrectly implemented, it may lead to critical safety errors such as use-after-free and the ABA problem. Semi-automatic safe memory reclamation schemes such as hazard pointers and RCU encapsulate the complexity of manual memory management in modular interfaces. However, this modularity has not been carried over to formal verification.
We propose modular specifications of hazard pointers and RCU, and formally verify realistic implementations of them in concurrent separation logic. Specifically, we design abstract predicates for hazard pointers that capture the meaning of validating the protection of nodes, and those for RCU that support optimistic traversal to possibly retired nodes. We demonstrate that the specifications indeed facilitate modular verification in three criteria: compositional verification, general applicability, and easy integration. In doing so, we present the first formal verification of Harris's list, the Harris-Michael list, the Chase-Lev deque, and RDCSS with reclamation. We report the Coq mechanization of all our results in the Iris separation logic framework.

(SPAA 2023) Applying Hazard Pointers to More Concurrent Data Structures.
Jaehwang Jung, Janggun Lee, Jeonghyeon Kim, Jeehoon Kang.
ACM Symposium on Parallelism in Algorithms and Architectures.
[paper: doi, local] [artifact: development, benchmark]

Abstract
Hazard pointers is a popular semi-manual memory reclamation scheme for concurrent data structures, where each accessing thread announces the protection of each object it is about to access and validates that the object is not already freed. Validation is typically done by over-approximating unreachability: if an object seems to be unreachable from the root of the data structure, the protecting thread decides not to access the object as it might have been freed. However, many efficient data structures are incompatible with validation by over-approximation as their optimistic traversal strategy intentionally ignores the warning of unreachability to achieve better performance.
We design HP++, an extension to hazard pointers that supports optimistic traversal. The key idea is under-approximating unreachability during validation and patching up the potentially unsafe accesses arising from false-negatives. Thanks to optimistic traversal, data structures with HP++ outperform the same-purpose data structures with HP under contention while consuming a similar amount of memory.

Changelog/Errata
- Figure 8: Optimized HP++ SkipList.
- Figure 9: Fixed Y-axis labels.
- Algorithm 4: Fixed the ABA problem involving anchor_next.

(PLDI 2020) A Marriage of Pointer- and Epoch-Based Reclamation.
Jeehoon Kang, Jaehwang Jung.
ACM SIGPLAN conference on Programming Language Design and Implementation.
[paper: doi, local] [artifact: benchmark]

Abstract
All pointer-based nonblocking concurrent data structures should deal with the problem of safe memory reclamation: before reclaiming a memory block, a thread should ensure no other threads hold a local pointer to the block that may later be dereferenced. Various safe memory reclamation schemes have been proposed in the literature, but none of them satisfy the following desired properties at the same time: (i) robust: a non-cooperative thread does not prevent the other threads from reclaiming an unbounded number of blocks; (ii) fast: it does not incur significant time overhead; (iii) compact: it does not incur significant space overhead; (iv) self-contained: it neither relies on special hardware/OS supports nor intrusively affects execution environments; and (v) widely applicable: it supports many data structures.
We introduce PEBR, which we believe is the first scheme that satisfies all the properties above. PEBR is inspired by Snowflake's hybrid design of pointer- and epoch-based reclamation schemes (PBR and EBR, resp.) that is mostly robust, fast, and compact but neither self-contained nor widely applicable. To achieve self-containedness, we design algorithms using only the standard C/C++ concurrency features and a process-wide memory fence. To achieve wide applicability, we characterize PEBR's requirement for safe reclamation that is satisfied by a variety of data structures, including Harris's and Harris-Herlihy-Shavit's lists that are not supported by PBR. We experimentally evaluate whether PEBR is fast and robust using microbenchmarks, for which PEBR performs comparably to the state-of-the-art schemes.
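The robustness property (i) is easiest to appreciate by seeing how plain epoch-based reclamation lacks it. The following is a minimal sketch of a textbook-style EBR, not of PEBR: threads pin the current epoch while inside a critical section, the global epoch may advance only when every pinned thread has observed it, and a retired block is freed once the global epoch is at least two ahead of its retirement epoch. All names (`g_epoch`, `g_local`, `kThreads`, `kIdle`, `pin`, `unpin`, `try_advance`, `retire`, `collect`) are hypothetical, the thread count is fixed, and the memory orderings are an assumption modeled on common EBR implementations rather than a verified choice.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kThreads = 2;          // assumption: fixed thread count
constexpr std::uint64_t kIdle = UINT64_MAX;  // "not in a critical section"

struct Slot { std::atomic<std::uint64_t> e{kIdle}; };

std::atomic<std::uint64_t> g_epoch{0};
Slot g_local[kThreads];

// Enter a critical section: publish the observed epoch, then fence so the
// announcement is ordered before any later read of shared pointers.
void pin(std::size_t tid) {
    std::uint64_t g = g_epoch.load(std::memory_order_relaxed);
    g_local[tid].e.store(g, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Leave the critical section.
void unpin(std::size_t tid) {
    g_local[tid].e.store(kIdle, std::memory_order_release);
}

// Advance the global epoch only if every pinned thread has observed it.
bool try_advance() {
    std::uint64_t g = g_epoch.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);
    for (auto& s : g_local) {
        std::uint64_t l = s.e.load(std::memory_order_relaxed);
        if (l != kIdle && l != g) return false;  // a straggler blocks progress
    }
    std::atomic_thread_fence(std::memory_order_acquire);
    return g_epoch.compare_exchange_strong(g, g + 1);
}

struct Retired { int* ptr; std::uint64_t epoch; };

// Record an already-unlinked block together with the current epoch.
void retire(std::vector<Retired>& bag, int* p) {
    bag.push_back({p, g_epoch.load()});
}

// Free blocks retired at least two epochs ago. Returns the number freed.
std::size_t collect(std::vector<Retired>& bag) {
    std::uint64_t g = g_epoch.load();
    std::vector<Retired> keep;
    std::size_t freed = 0;
    for (auto& r : bag) {
        if (g >= r.epoch + 2) {
            delete r.ptr;
            ++freed;
        } else {
            keep.push_back(r);
        }
    }
    bag.swap(keep);
    return freed;
}
```

If one thread calls `pin` and then stalls, the epoch can advance at most once more, so `collect` frees nothing no matter how many blocks are retired in the meantime. Only after that thread calls `unpin` can the epoch advance again and the backlog be freed. This unbounded backlog is the non-robust behavior that property (i) rules out.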