Description
Note: I am not proposing any changes to #643. The algorithm works, and its style is consistent with other parts of mmtk-core. I am talking about a more general issue in mmtk-core.
TL;DR: We have Chunk(Address)
, Block(Address)
, Line(Address)
and Page(Address)
. They implement Copy
, and has pointer semantics, but does not own the region they point to. This approach is not idiomatic in Rust. It does not prevent two pointers to point to the same region mutably, and the programmers need to add hard-to-reason-about locks even when they have &mut
access. We may need to introduce a set of owning types that own those regions.
Background
Rust ownership and references
Rust has&T
, &mut T
and T
.
&mut T
is an exclusive (unique) borrowing reference, and it grants read-write access.
By enforcing the borrowing rules, Rust programmers are confident that if a variable holds a &mut T
, it is the only variable that refers to the T
, and it has read-write access to it. No other &mut T
, not even &T
, could coexist. For this reason, Rust often advertise as having "fearless concurrency". If you respect the borrowing rules, there can be no data race.
Region types
The Region
trait represents an aligned region of memory of a certain size. Instances include Chunk(Address)
, Block(Address)
, Line(Address)
and Page(Address)
. The PR #643 also introduces a new type Block(NonZeroUsize)
. All of them are wrappers of addresses.
Region
instances implement Copy
. They can be copied just like C pointers. And they can be constructed from addresses. In this way, given an address range, we can iterate through all aligned regions in that range. Each region is constructed from the address.
// In mmtk::policy::immix::chunk::Chunk::sweep
for block in self.blocks().filter(/* ... */) {
// A new Block instance is created in each iteration.
if !block.sweep(/* ... */) { // Accessing block mutably
// ...
}
}
As we can see, in each iteration, we create a Block
instance from an address, and access the block mutably (sweep
).
This means any thread can gain read-write access to a Block
without borrowing or getting ownership, and dangers await.
The problem
The mysterious lock
When reviewing the PR #643, I noticed an interesting piece of code.
impl Block {
// ... other methods here
#[inline(always)]
pub fn attempt_release<VM: VMBinding>(self, space: &MarkSweepSpace<VM>) -> bool {
// ...
let block_list: *mut BlockList = loop {
let list: *mut BlockList = self.load_block_list();
(*list).lock();
if list == self.load_block_list() {
break list;
}
(*list).unlock();
};
(*block_list).remove(self);
(*block_list).unlock();
// ...
}
// ... other methods here.
}
This code snippet tries to remove a Block
from a BlockList
(a doubly-linked list), but another thread may have moved the Block
to another BlockList
concurrently. So it has to try to lock the BlockList
in a loop until it successfully locked the BlockList
and the Block
is still in that BlockList
.
This piece of code puzzled me in two ways.
- If we already have a
*mut T
, why there is still a race? Why don't we have unique access to it? (OK. It is pointer, not reference. But it is still counter-intuitive to Rust programmers.) - What operation could the other thread be performing? If it is racing, it is racing against something.
To answer question (2), I found the two places that tries to acquire the same lock. One (the shown snippet) is sweeping chunks, and the other is releasing mutators.
During the Release
stage, the MarkSweep collector does two things concurrently.
a. For all chunks, find all blocks in it that are unmarked, and release them from their BlockList (the shown snippet).
b. For all mutators, transfer all blocks from the available_blocks
BlockList and the consumed_blocks
BlockList into the unswept_blocks
BlockList.
When doing both concurrently, each block may be reached from two different places, namely through chunks and through mutators. If not synchronised properly, it may leave the doubly-linked list in an inconsistent state. Therefore, a spinlock is added to the BlockList
struct, and used to prevent this race. The lock is obtained using the lock()
method, as shown in the snippet above.
The lock solved the racing problem here. But I am curious why this problem still occurs in Rust. Shouldn't Rust protect us from this kind of race?
Why does this happen?
The root cause is that Block
implements Copy
and can be created from addresses. It has pointer semantics. It points to a Block
, but does not represent ownership. So all the ownership-transfer and borrow-checking mechanisms stopped working in this use case. Any thread can gain mutable access by simply creating a Block
instance.
The SweepChunk work packet uses RegionIterator<Block>
over the Chunk
to gain access. It bypassed the borrow checker, allowing two threads to hold &mut BlockList
simultaneously.
But why we don't need locks in other places?
If the Chunk
and Block
types are inherently unsafe, I would expect the PR #643 has lock/unlock operations everywhere. But it's not the case. BlockList
is usually accessed without locking.
A BlockList
represents a list of Block
of certain size. It has several mutable methods: remove()
, push()
, pop()
, append()
and reset()
.
- On the fast path of
alloc()
, the mutator tries topop()
a block from theBlockList
in theavailable_blocks
array owned by the allocator which is owned by the current mutator. Since the mutator is callingpop()
on aBlockList
it already owns, it shall not cause data race. - If that fails, the mutator will try to recycle local blocks (if not doing eager sweeping). It will
pop()
aBlock
from its localunswept_blocks
list, sweep it, and retry allocating from that block. It is still accessing blocks and block lists it owns, so there is still no data race. - If that fails, too, the mutator will enter the slow path, and try to acquire blocks from the global pool. This time, it accesses
MarkSweepSpace::abandoned
which is aMutex<AbandonedBlockLists>
. This forces all mutators to acquire a lock before accessing the lists within, and the lock prevents data race. - If that still fails, it will ask the PageResource for fresh pages. PageResource uses Mutex internally to prevent data race.
Observed from this analysis, we see our MiMalloc implementation is actually a good example of using ownership to prevent data race. The fast path accesses owned data so there can't be data race; the slow path accesses global data protected by Mutex so it prevents data race; once the blocks are transferred from the global pool to the local BlockList, subsequent allocations from that block will be local, and can't have data race, either.
Unfortunately, the ownership model is only in the developers' head, but not reflected in the code. Block
implements Copy
, so once assigned, the original owner may still keep a Block
instance that represents the same block. In Rust, this shouldn't happen. Once the ownership is transferred, the original owner no longer has access to it. This ensures the global pool cannot give the same Block
to two different mutators.
Throughout the code, Block
is treated as a pointer. Therefore, whether to use &self
or &mut self
is only a matter of whether it mutates the block.0
field of the Block(NonZeroUsize)
instance. We can see that Block::sweep(&self)
actually takes a &self
as parameter, even though it writes into the block.
I would not disapprove the MiMalloc PR for this, because this is common in MMTk core. The immix::Block
is the same. All of its methods take &self
parameter, including immix::Block::sweep(&self)
.
So what kind of Block
type should we have?
We need something that represents the ownership of a block. For this, the Block
type should not implement Clone
or Copy
. MarkSweepSpace::acquire_block
should be the only place where Block
instances can be created, and it creates Block
from the address returned from PageResource when allocating "fresh" blocks. Block
instances are destroyed in Block::attempt_release(self)
(:thumbup: to the owning self
parameter), when its underlying pages are released back to the PageResource. When transferring a Block
from one BlockList
to another, or from the global pool to the mutator, Rust's ownership model will ensure no two mutators share the same Block
, and even no two BlockList
can hold the same Block
at the same time.
What about sweeping all blocks in a Chunk
? From Rust's point of view, the mutator owns the block, because the MarkSweepSpace
creates Block
from the address returned from PageResource, and transfers it to the mutator. Mutators, however, can abandon blocks so they are transferred to MarkSweepSpace
. If each Block
is owned by either a mutator or MarkSweepSpace
, I would argue that it is completely unnecessary to sweep blocks by iterating chunks. Instead, we can simply iterate through each block in each mutator and the MarkSweepSpace
, see if it is marked, and remove it from its BlockList
. That's all the blocks out there. This can be parallelised by creating one work packet for each mutator, and one more for the global MarkSweepSpace
. And we don't need any lock on the BlockList
, because each BlockList
is also owned by exactly one mutator (or the MarkSweepSpace
) and the race is impossible.
Known issues
Is this approach applicable to other GC algorithms, like Immix?