Clone without unnecessary allocations #112

Closed
@hbgl

For my use case I have a worker thread that periodically updates a thread-local HashMap and then copies it to another thread. In most iterations no entries are added or removed; only the values change.

I would like to efficiently synchronize the two HashMaps by copying the arrays of control bytes and elements directly to the already allocated arrays of the other HashMap. Basically it would be the same thing that's done in RawTable's clone function without the extra allocations.

hashbrown/src/raw/mod.rs

Lines 978 to 1028 in bacb169

impl<T: Clone> Clone for RawTable<T> {
    fn clone(&self) -> Self {
        if self.is_empty_singleton() {
            Self::new()
        } else {
            unsafe {
                let mut new_table = ManuallyDrop::new(
                    Self::new_uninitialized(self.buckets(), Fallibility::Infallible)
                        .unwrap_or_else(|_| hint::unreachable_unchecked()),
                );

                // Copy the control bytes unchanged. We do this in a single pass
                self.ctrl(0)
                    .copy_to_nonoverlapping(new_table.ctrl(0), self.num_ctrl_bytes());

                {
                    // The cloning of elements may panic, in which case we need
                    // to make sure we drop only the elements that have been
                    // cloned so far.
                    let mut guard = guard((0, &mut new_table), |(index, new_table)| {
                        if mem::needs_drop::<T>() {
                            for i in 0..=*index {
                                if is_full(*new_table.ctrl(i)) {
                                    new_table.bucket(i).drop();
                                }
                            }
                        }
                        new_table.free_buckets();
                    });

                    for from in self.iter() {
                        let index = self.bucket_index(&from);
                        let to = guard.1.bucket(index);
                        to.write(from.as_ref().clone());

                        // Update the index in case we need to unwind.
                        guard.0 = index;
                    }

                    // Successfully cloned all items, no need to clean up.
                    mem::forget(guard);
                }

                // Return the newly created table.
                new_table.items = self.items;
                new_table.growth_left = self.growth_left;
                ManuallyDrop::into_inner(new_table)
            }
        }
    }
}

I saw that RawTable was recently exposed in #104, but unless I missed something, that alone is not enough to copy the contents of one map into another. Do you think it would be a good idea to implement clone_from on RawTable so that it reuses the existing arrays when possible?
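To make the request concrete, here is a minimal sketch of what this would look like from the caller's side. It uses std's `HashMap` and the standard `Clone::clone_from` method; `sync_maps` is a hypothetical helper name used only for illustration. Whether the destination's allocation is actually reused depends on how the map type implements `clone_from`. The trait's default implementation simply performs `*self = source.clone()`, which is the allocation this issue would like to avoid.

```rust
use std::collections::HashMap;

/// Hypothetical helper: overwrite `dst` with the contents of `src`.
fn sync_maps<K: Clone, V: Clone>(dst: &mut HashMap<K, V>, src: &HashMap<K, V>) {
    // `clone_from` gives the implementation a chance to reuse storage that
    // `dst` already owns. Assumption: the map type overrides the default,
    // which would otherwise just allocate a fresh clone.
    dst.clone_from(src);
}

fn main() {
    let mut worker: HashMap<&str, u32> = HashMap::new();
    worker.insert("a", 1);
    worker.insert("b", 2);

    // The copy that lives on the other thread in the real use case.
    let mut mirror = worker.clone();

    // Typical iteration: the key set is unchanged, only a value changes.
    *worker.get_mut("a").unwrap() = 10;

    sync_maps(&mut mirror, &worker);
    assert_eq!(mirror, worker);
}
```

With a `clone_from` on RawTable that reuses the control-byte and element arrays when the bucket counts match, a call like this would not need to allocate in the common case described above.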
