Concurrent Garbage Collection

Garbage in, garbage out

To collect or not to collect, that is the garbage question.
—Unknown

All pointer-based nonblocking concurrent data structures should deal with the problem of safe memory reclamation: before reclaiming a memory block, a thread should make sure that no other threads are concurrently dereferencing the block. Various safe memory reclamation schemes have been proposed in the literature, but none of them is clearly better than the others in every aspect. The trade-offs indicate the complex nature of memory reclamation.

We aim to break the trade-offs by combining the great ideas of prior work and our new ideas in an interesting way, producing the off-the-shelf solution for safe memory reclamation.

Publications

  • (SPAA 2024) Expediting Hazard Pointers with Bounded RCU Critical Sections.
    Jeonghyeon Kim, Jaehwang Jung, Jeehoon Kang.
    ACM Symposium on Parallelism in Algorithms and Architectures (Best Paper Award, to appear).
    [paper: doi, local] ​ [artifact: benchmark] ​ ​

    Abstract

    Reclamation schemes for concurrent data structures tackle the challenge of synchronizing memory accesses and reclamation. Early schemes faced a tradeoff between robustness and efficiency: hazard pointers (HP) bounds the number of unreclaimed nodes, but it is inefficient due to per-node protection; and RCU sacrifices robustness for efficiency as a single thread may block the entire reclamation. Recent schemes attempt to break the tradeoff by sending signals to blocking threads to abort their operations. However, they are (1) inefficient due to starvation in long-running operations and frequent signals, and (2) inapplicable to a wide class of data structures.

    We design a novel reclamation scheme that overcomes the above limitations. To address the long-running operations and applicability, we propose HP-RCU, integrating RCU-expedited traversal that alternates between HP and RCU phases. To additionally ensure robustness against stalled threads, we develop HP-BRCU by modularly replacing RCU with bounded RCU (BRCU) that efficiently bounds the duration of RCU phases by rarely sending signals. We show that HP-BRCU is robust, widely applicable, and as efficient as RCU, outperforming robust schemes across various workloads.

  • (PLDI 2024) Concurrent Immediate Reference Counting.
    Jaehwang Jung, Jeonghyeon Kim, Matthew J. Parkinson, Jeehoon Kang.
    ACM SIGPLAN conference on Programming Languages Design and Implementation (to appear).
    [paper: doi, local] ​ [artifact: benchmark] ​ ​

    Abstract

    Memory management for optimistic concurrency in unmanaged programming languages is challenging. Safe memory reclamation (SMR) algorithms help address this, but they are difficult to use correctly. Automatic reference counting provides a simpler interface, but it has been less efficient than SMR algorithms. Recently, there has been a push to apply the optimizations used in garbage collectors for managed languages to elide reference count updates from local references. Notably, Fast Reference Counter, OrcGC, and Concurrent Deferred Reference Counting use SMR algorithms to protect local references by deferring decrements or reclamation. While they show a significant performance improvement, their use of deferral may result in growing memory usage due to slow reclamation of linked structures, and suboptimal performance in update-heavy workloads.

    We present Concurrent Immediate Reference Counting (CIRC), a new combination of SMR algorithms with reference counting. CIRC employs deferral like other modern methods, but it avoids their problems with novel algorithms for (1) immediately reclaiming linked structures recursively by tracking the reachability of each object, and (2) applying decrements immediately and deferring only the reclamation. Our experiments show that CIRC's memory usage does not grow over time and is only slightly higher than the underlying SMR. Moreover, CIRC further narrows the performance gap between the underlying SMR, positioning it as a promising solution to safe automatic memory management for highly concurrent data structures in unmanaged languages.

  • (OOPSLA 2023) Modular Verification of Safe Memory Reclamation in Concurrent Separation Logic.
    Jaehwang Jung, Janggun Lee, Jaemin Choi, Jaewoo Kim, Sunho Park, Jeehoon Kang.
    Object-oriented Programming, Systems, Languages, and Applications.
    [paper: doi, local] ​ [artifact: Coq proofs] ​ ​

    Abstract

    Formal verification is an effective method to address the challenge of designing correct and efficient concurrent data structures. But verification efforts often ignore memory reclamation, which involves nontrivial synchronization between concurrent accesses and reclamation. When incorrectly implemented, it may lead to critical safety errors such as use-after-free and the ABA problem. Semi-automatic safe memory reclamation schemes such as hazard pointers and RCU encapsulate the complexity of manual memory management in modular interfaces. However, this modularity has not been carried over to formal verification.

    We propose modular specifications of hazard pointers and RCU, and formally verify realistic implementations of them in concurrent separation logic. Specifically, we design abstract predicates for hazard pointers that capture the meaning of validating the protection of nodes, and those for RCU that support optimistic traversal to possibly retired nodes. We demonstrate that the specifications indeed facilitate modular verification in three criteria: compositional verification, general applicability, and easy integration. In doing so, we present the first formal verification of Harris's list, the Harris-Michael list, the Chase-Lev deque, and RDCSS with reclamation. We report the Coq mechanization of all our results in the Iris separation logic framework.

  • (SPAA 2023) Applying Hazard Pointers to More Concurrent Data Structures.
    Jaehwang Jung, Janggun Lee, Jeonghyeon Kim, Jeehoon Kang.
    ACM Symposium on Parallelism in Algorithms and Architectures.
    [paper: doi, local] ​ [artifact: development, benchmark] ​ ​

    Abstract

    Hazard pointers is a popular semi-manual memory reclamation scheme for concurrent data structures, where each accessing thread announces the protection of each object to access and validates that the pointer is not already freed. Validation is typically done by over-approximating unreachability: if an object seems to be unreachable from the root of the data structure, the protecting thread decides not to access the object as it might have been freed. However, many efficient data structures are incompatible with validation by over-approximation as their optimistic traversal strategy intentionally ignores the warning of unreachability to achieve better performance.

    We design HP++, an extension to hazard pointers that supports optimistic traversal. The key idea is under-approximating unreachability during validation and patching up the potentially unsafe accesses arising from false-negatives. Thanks to optimistic traversal, data structures with HP++ outperform the same-purpose data structures with HP under contention while consuming a similar amount of memory.

    Changelog/Errata
    • Figure 8: Optimized HP++ SkipList.
    • Figure 9: Fixed Y-axis labels.
    • Algorithm 4: Fixed the ABA problem involving anchor_next.
  • (PLDI 2020) A Marriage of Pointer- and Epoch-Based Reclamation.
    Jeehoon Kang, Jaehwang Jung.
    ACM SIGPLAN conference on Programming Languages Design and Implementation.
    [paper: doi, local] ​ [artifact: benchmark] ​ ​

    Abstract

    All pointer-based nonblocking concurrent data structures should deal with the problem of safe memory reclamation: before reclaiming a memory block, a thread should ensure no other threads hold a local pointer to the block that may later be dereferenced. Various safe memory reclamation schemes have been proposed in the literature, but none of them satisfy the following desired properties at the same time: (i) robust: a non-cooperative thread does not prevent the other threads from reclaiming an unbounded number of blocks; (ii) fast: it does not incur significant time overhead; (iii) compact: it does not incur significant space overhead; (iv) self-contained: it neither relies on special hardware/OS supports nor intrusively affects execution environments; and (v) widely applicable: it supports many data structures.

    We introduce PEBR, which we believe is the first scheme that satisfies all the properties above. PEBR is inspired by Snowflake's hybrid design of pointer- and epoch-based reclamation schemes (PBR and EBR, resp.) that is mostly robust, fast, and compact but neither self-contained nor widely applicable. To achieve self-containedness, we design algorithms using only the standard C/C++ concurrency features and process-wide memory fence. To achieve wide applicability, we characterize PEBR's requirement for safe reclamation that is satisfied by a variety of data structures, including Harris's and Harris-Herlihy-Shavit's lists that are not supported by PBR. We experimentally evaluate whether PEBR is fast and robust using microbenchmarks, for which PEBR performs comparably to the state-of-the-art schemes.