I've encountered an issue with memory reclamation in crossbeam. Say you're implementing a simple thread-safe lock-free container that holds a single value. Any thread can obtain a clone of the stored value, and the value can be updated at any point, after which readers begin observing clones of the new value.

Although the typical use case would be to specify something like Arc<X> as T, the implementation cannot rely on T being pointer-sized; for example, X could be a trait, making Arc<X> a fat pointer. Still, lock-free access to an arbitrary T seems like a good fit for epoch-based lock-free code. Based on the examples, I came up with this:
extern crate crossbeam;

use std::thread;
use std::sync::atomic::Ordering;
use crossbeam::epoch::{self, Atomic, Owned};

struct Container<T: Clone> {
    current: Atomic<T>,
}

impl<T: Clone> Container<T> {
    fn new(initial: T) -> Container<T> {
        Container { current: Atomic::new(initial) }
    }

    fn set_current(&self, new: T) {
        let guard = epoch::pin();
        let prev = self.current
            .swap(Some(Owned::new(new)), Ordering::AcqRel, &guard);
        if let Some(prev) = prev {
            unsafe {
                // once the swap has propagated, *prev will no longer
                // be observable
                //drop(::std::ptr::read(*prev));
                guard.unlinked(prev);
            }
        }
    }

    fn get_current(&self) -> T {
        let guard = epoch::pin();
        // clone the latest visible value
        (*self.current.load(Ordering::Acquire, &guard).unwrap()).clone()
    }
}
When used with a type that doesn't allocate, e.g. T = u64, it works great: set_current and get_current can be called millions of times without leaking. (The process monitor shows minor memory oscillations due to the epoch pseudo-GC, as would be expected, but no long-term growth.) However, when T is a type that allocates, e.g. Box<u64>, one can easily observe leaks. For example:
fn main() {
    use std::sync::Arc;
    let c = Arc::new(Container::new(Box::new(0)));
    const ITERS: u64 = 100_000_000;
    let producer = thread::spawn({
        let c = Arc::clone(&c);
        move || {
            for i in 0..ITERS {
                c.set_current(Box::new(i));
            }
        }
    });
    let consumers: Vec<_> = (0..16)
        .map(|_| {
            let c = Arc::clone(&c);
            thread::spawn(move || {
                let mut last = 0;
                loop {
                    let current = c.get_current();
                    if *current == ITERS - 1 {
                        break;
                    }
                    assert!(*current >= last);
                    last = *current;
                }
            })
        })
        .collect();
    producer.join().unwrap();
    for x in consumers {
        x.join().unwrap();
    }
}
Running this program shows a steady and significant increase in memory usage; by the end, the memory consumed is proportional to the number of iterations.
According to the blog post introducing it, crossbeam's epoch reclamation "does not run destructors, but merely deallocates memory". The try_pop in the Treiber stack example uses ptr::read(&(*head).data) to move the value contained in head.data out of the head object destined for deallocation. Ownership of the data is thereby transferred to the caller, which will either move it elsewhere or drop it when it goes out of scope.
How would that translate to the code above? Is the setter the proper place for guard.unlinked, or how else does one ensure that drop is run on the underlying object? Uncommenting the explicit drop(ptr::read(*prev)) results in a failed monotonicity assertion, possibly indicating premature deallocation; presumably a reader in get_current can be between the load and the clone when the setter drops the value out from under it.