Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Mutator-related GCWork and Send/Sync traits #543

Open
wks opened this issue Feb 7, 2022 · 6 comments
Open

Mutator-related GCWork and Send/Sync traits #543

wks opened this issue Feb 7, 2022 · 6 comments
Labels
P-normal Priority: Normal.

Comments

@wks
Copy link
Collaborator

wks commented Feb 7, 2022

Although Mutator is supposed to be a thread-local data structure of a mutator thread, some GCWork instances modifies the Mutator instances on behalf of the mutators. These include:

  • PrepareMutator
  • ReleaseMutator
  • ScanStackRoot

Remember that GCWork implements Send but not Sync. (It is Send because it can be distributed among workers; it is not Sync because a GCWork is supposed to be visible to only one thread at a time.)

Currently, those structs contain &mut mutator. PrepareMutator is a GCWork, and it needs to implements Send. Because it contains a &mut Mutator, it requires Mutator to implement Send, too. Similar is true for ReleaseMutator and ScanStackRoot.

p.s. Changing &mut Mutator to &Mutator will still require Mutator to implement Send, because PrepareMutator implements Send.

Can we make Mutator implement Send?

Mutator currently implements Send, but we probably don't want Mutator to implement Send, because it is supposed to be local to a mutator thread. (Update: As we discussed later in #543 (comment), it probably should implement Send after all. The Send trait may make sense during initialization.)

If Mutator implements Send (the status quo), there will be other consequences:

  1. PrepareMutator and its friends contain &mut mutator.
    • It requires Mutator to be Send
    • Currently, MutatorContext is declared as pub trait MutatorContext<VM: VMBinding>: Send + 'static
  2. Mutator contains a Box<dyn Barrier> which can contain a ObjectRememberingBarrier.
    • It will require Barrier to implement Send.
    • Currently, Barrier is declared as pub trait Barrier: 'static + Send
  3. ObjectRememberingBarrier contains a &MMTK.
    • This requiers &MMTK to implement Send, which in turn requires MMTK to implement Sync. (In Rust, T implements Sync if and only if &T implements Send)
  4. MMTK contains an Arc<GCWorkScheduler>
  5. GCWorkScheduler contains many WorkBucket.
  6. WorkBucket contains many PrioritizedWork.
  7. PrioritizedWork contains a Box<dyn GCWork<VM>>. Because it is dyn, it can be any GCWork, including PrepareMutator and others.
    • Because MMTK implements Sync, GCWork needs to implement Sync. This contradicts with our design that GCWork does not implement Sync.

From a different point of view, having ObjectRememberingBarrier in Mutator closes the loop from one work to any other work. Any work that contains &mut mutator can potentially see any other GCWork instances. This allows one worker to peek into another worker's work. Remember that GCWork is not Sync. Not implementing Sync means it cannot be visible to two threads at the same time.

But why is it working so far?

In scheduler.rs, there is a line:

unsafe impl<VM: VMBinding> Sync for GCWorkScheduler<VM> {}

That line broke the dependency chain shown above.

Solution?

Let's think about what should implement Sync. Shared data structures need to implement Sync.

  • MMTK needs to implement Sync.
  • GCWorkScheduler needs to implement Sync.
  • GCWorkerShared needs to implement Sync.

But we shouldn't need to write anything because the Rust compiler can figure out if MMTK satisfies the need of Sync.

And what shouldn't implement Sync?

  • Mutator should not implement Sync, because it should be local to a mutator. But GC also needs to mutate it.
    • Mutator::barrier: This should probably be not Sync, because only the mutator thread ever execute it. GC never touches it. And to some degree, it shouldn't exist at all, because Mutator has a reference to Plan. A barrier is "what to do when reading/writing a reference field". It is the responsibility of the Plan which describes the GC algorithm.
    • Mutator::allocator: This probably belongs to the part shared between the mutator thread and GC threads.

Should we do the same refactoring as we did for GC workers, i.e. splitting Mutator into a part shared with GC threads, and the part local to the mutator thread? Mutator is #[repr(C)], and the structure is visible to the VM.

@qinsoon
Copy link
Member

qinsoon commented Feb 7, 2022

I think Mutator is Send because we move it to a new thread when we create a new mutator.

@wks
Copy link
Collaborator Author

wks commented Feb 7, 2022

Yes. That makes sense. Given that Mutator does not contain types like Rc, it should be OK if it implements Send.

@wks
Copy link
Collaborator Author

wks commented Feb 11, 2022

Here is a minimum program that can reproduce the problem.

use std::sync::{Arc, RwLock};
use std::thread;

trait GCWork: Send {
    fn do_work(&mut self);
}

struct MMTK {
    work_packets: RwLock<Vec<Box<dyn GCWork>>>,
}

fn main() {
    let mmtk = Arc::new(MMTK {
        work_packets: RwLock::new(Vec::new()),
    });

    thread::spawn(move || {   // ERROR
        println!("Work packets length: {}", mmtk.work_packets.read().unwrap().len());
    }).join().unwrap();
}

Since GCWork doesn't implement Sync, RwLock doesn't implement Sync, either. Hence Rust doesn't allow MMTK to be shared between threads.

A surprising (to me at least) solution is to replace RwLock with Mutex. The following is an excerpt from the standard library:

unsafe impl<T: ?Sized + Send> Send for Mutex<T> {}
unsafe impl<T: ?Sized + Send> Sync for Mutex<T> {}

unsafe impl<T: ?Sized + Send> Send for RwLock<T> {}
unsafe impl<T: ?Sized + Send + Sync> Sync for RwLock<T> {}

Since Mutex turns anything Send into Sync, hence Mutex<GCWork> still implements Sync even though GCWork does not implement Sync. Therefore, the following code works:

use std::sync::{Arc, Mutex};
use std::thread;

trait GCWork: Send {
    fn do_work(&mut self);
}

struct MMTK {
    work_packets: Mutex<Vec<Box<dyn GCWork>>>,
}

fn main() {
    let mmtk = Arc::new(MMTK {
        work_packets: Mutex::new(Vec::new()),
    });

    thread::spawn(move || { // OK
        println!("Work packets length: {}", mmtk.work_packets.lock().unwrap().len());
    }).join().unwrap();
}

One important difference between Mutex and RwLock, I think, is that RwLock allows multiple readers to look at the same GCWork at the same time. This will require GCWork to implement Sync.

@qinsoon
Copy link
Member

qinsoon commented Feb 11, 2022

That is interesting and makes sense. Probably we should use Mutex for GCWork. GCWork should be Send and not Sync.

@wks
Copy link
Collaborator Author

wks commented Feb 11, 2022

Another interesting observation is that Rust doesn't seem to be smart enough about interface abstraction. See:

use std::sync::{Arc, RwLock};
use std::thread;

trait GCWork: Send {
    fn do_work(&mut self);
}

struct MMTK {
    work_packets: RwLock<Vec<Box<dyn GCWork>>>,
}

trait PacketSource {
    fn len(&self) -> usize;
    fn get(&self) -> Option<Box<dyn GCWork>>;
}

impl PacketSource for MMTK {
    fn len(&self) -> usize {
        self.work_packets.read().unwrap().len()
    }

    fn get(&self) -> Option<Box<dyn GCWork>> {
        self.work_packets.write().unwrap().pop()
    }
}

fn main() {
    let mmtk = Arc::new(MMTK {
        work_packets: RwLock::new(Vec::new()),
    });

    let packet_source = mmtk.clone() as Arc<dyn PacketSource>;

    thread::spawn(move || { // ERROR: dyn PacketSource cannot be shared between threads
        println!("Work packets length: {}", packet_source.len());
        if let Some(mut w) = packet_source.get() {
            w.do_work();
        }
    }).join().unwrap();
}

I attempted to hide the implementation details of MMTK behind the PacketSource trait. By providing two methods: len and get, I make sure that no two GCWork instances are visible to two different threads at any time, not even read-only visible. But Rust insisted that dyn PacketSource cannot be shared between threads because it doesn't require the Sync trait. Changing it to trait PacketSource: Sync will require MMTK to implement Sync as well.

Obviously, Rust is being overly protective. At some point, someone (either us or some library implementers) must use unsafe impl Sync to persuade Rust that it is safe for multiple thread to share reference to that object. Mutex obviously does that already.

If a data structure is synchronised, and does not allow accessing the elements in place, it should implement Sync, too. crossbeam::queue::ArrayQueue and crossbeam::queue::SegQueue do this, too:

unsafe impl<T: Send> Sync for ArrayQueue<T> {}
unsafe impl<T: Send> Send for ArrayQueue<T> {}

unsafe impl<T: Send> Send for SegQueue<T> {}
unsafe impl<T: Send> Sync for SegQueue<T> {}

And I confirmed that replacing RwLock<Vec<...>> with either ArrayQueue<...> or SegQueue<...> can solve the problem. Therefore the following program works:

use std::thread;
use std::sync::Arc;
use crossbeam::queue::SegQueue;

trait GCWork: Send {
    fn do_work(&mut self);
}

struct MMTK {
    work_packets: SegQueue<Box<dyn GCWork>>,
}

fn main() {
    let mmtk = Arc::new(MMTK {
        work_packets: SegQueue::new(),
    });

    thread::spawn(move || {
        println!("Work packets length: {}", mmtk.work_packets.len());
        while let Some(mut w) = mmtk.work_packets.pop() {
            w.do_work();
        }
    }).join().unwrap();
}

@k-sareen k-sareen added the P-normal Priority: Normal. label Nov 22, 2023
@qinsoon
Copy link
Member

qinsoon commented Mar 24, 2025

A related topic on this: Currently Mutator is !Sync, but we require an iterator of &mut Mutator. See #mmtk-core > Soundness of `ActivePlan::mutators`.

We can check if Mutator is Sync:

        fn is_sync<T: Sync>() {}
        is_sync::<crate::Mutator<VM>>();

We got this with the current version (v0.30):

error[E0277]: `(dyn barriers::Barrier<VM> + 'static)` cannot be shared between threads safely
   --> src/mmtk.rs:211:19
    |
211 |         is_sync::<crate::Mutator<VM>>();
    |                   ^^^^^^^^^^^^^^^^^^ `(dyn barriers::Barrier<VM> + 'static)` cannot be shared between threads safely
    |
    = help: the trait `Sync` is not implemented for `(dyn barriers::Barrier<VM> + 'static)`, which is required by `mutator_context::Mutator<VM>: Sync`
    = note: required for `std::ptr::Unique<(dyn barriers::Barrier<VM> + 'static)>` to implement `Sync`
note: required because it appears within the type `Box<(dyn barriers::Barrier<VM> + 'static)>`
   --> /opt/rust/toolchains/1.83.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:234:12
    |
234 | pub struct Box<
    |            ^^^
note: required because it appears within the type `mutator_context::Mutator<VM>`
   --> src/plan/mutator_context.rs:89:12
    |
89  | pub struct Mutator<VM: VMBinding> {
    |            ^^^^^^^
note: required by a bound in `is_sync`
   --> src/mmtk.rs:210:23
    |
210 |         fn is_sync<T: Sync>() {}
    |                       ^^^^ required by this bound in `is_sync`

For more information about this error, try `rustc --explain E0277`.
error: could not compile `mmtk` (lib) due to 1 previous error

There are two different things for this issue:

  1. At mutator phase, a mutator thread owns Mutator, but at GC time, a GC thread needs to access Mutator.
  2. At GC time, multiple GC threads may access Mutator.

I think maybe we could avoid using &mut Mutator in our API, and instead use something like DuringGC<Mutator>. So Mutator is still not Sync, and is still owned by the mutator threads. But during GC, a different type DuringGC<Mutator> is created to allow GC threads to access. We can play around to make sure DuringGC works with Rust.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
P-normal Priority: Normal.
Projects
None yet
Development

No branches or pull requests

3 participants