Processing a Vec in parallel: how to do it safely, or without using unstable features?

I have a massive vector that I want to load and act on in parallel, e.g. load the first hundred thousand indices in one thread, the next hundred thousand in another, and so on. As this is going to be a very hot part of the code, I have come up with the following proof-of-concept unsafe code to do this without Arcs and Mutexes:

use std::thread::spawn;

let mut data: Vec<u32> = vec![1u32, 2, 3];
let head = data.as_mut_ptr();
// One thread per element, each writing through its own raw pointer.
let guards: Vec<_> = (0..3).map(|i|
  unsafe {
    let mut target = std::ptr::Unique::new(head.offset(i));
    spawn(move || {
      std::ptr::write(target.get_mut(), 10 + i as u32);
    })
  })
  .collect(); // `map` is lazy, so collect to actually spawn the threads

Is there anything I have missed here that can make this potentially blow up?

This uses #![feature(unique)], so I don't see how to use it on stable. Is there a way to do this sort of thing on stable (ideally safely, without raw pointers and without the overhead of Arcs and Mutexes)?

Also, looking at the documentation for Unique, it says:

It also implies that the referent of the pointer should not be modified without a unique path to the Unique reference

I am not clear what "unique path" means.

Vtarj answered 27/7, 2015 at 2:33 Comment(5)
Could Vec.split_at_mut(..) work for you?Extensile
You can split it mutably, but you are still stuck with how to move ownership into the worker threads and then get it back. I don't see how to do this safely, since while the worker threads are computing, the parent thread still has mutable access to the original vec. There are lifetime issues too, as the closure may escape the block; unboxed closures may fix that, but I don't know much about them.Vtarj
Once scoped gets a proper replacement, split_at_mut is the correct solution. Until then I suggest simply creating multiple vectors, one for every thread.Catabolism
chunks_mut is a nicer version of split_at_mut for this purpose: for target in data.chunks_mut(100_000) { ... }.Ada
Maybe this approach is useful for you: https://mcmap.net/q/583609/-how-do-i-write-to-a-mutable-slice-from-multiple-threads-at-arbitrary-indexes-without-using-mutexesEdifice

Today the rayon crate is the de facto standard for this sort of thing:

use rayon::prelude::*;

fn main() {
    let mut data = vec![1, 2, 3];
    data.par_iter_mut()
        .enumerate()
        .for_each(|(i, x)| *x = 10 + i as u32);
    assert_eq!(vec![10, 11, 12], data);
}

Note that this differs from the single-threaded version using standard iterators by just one call: iter_mut becomes par_iter_mut.
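
For comparison, the sequential equivalent with standard iterators:

fn main() {
    let mut data = vec![1, 2, 3];
    data.iter_mut()
        .enumerate()
        .for_each(|(i, x)| *x = 10 + i as u32);
    assert_eq!(vec![10, 11, 12], data);
}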

See also Writing a small ray tracer in Rust and Zig.
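
To cap how many threads rayon uses for this (for example, to bound how much work runs at once), one option is to run the parallel iterator inside an explicitly sized pool; a minimal sketch using rayon's ThreadPoolBuilder:

use rayon::prelude::*;

fn main() {
    let mut data = vec![1, 2, 3];
    // At most 4 worker threads; work launched inside `install` runs on this pool.
    let pool = rayon::ThreadPoolBuilder::new()
        .num_threads(4)
        .build()
        .unwrap();
    pool.install(|| {
        data.par_iter_mut()
            .enumerate()
            .for_each(|(i, x)| *x = 10 + i as u32);
    });
    assert_eq!(vec![10, 11, 12], data);
}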

Polymer answered 1/7, 2019 at 18:7 Comment(2)
How can you limit parallelism with this (like do not make more than 4 API requests at a time while iterating over a list of requests to make)?Juback
@BrandonRos github.com/rayon-rs/rayon/blob/master/…Hett

One can use an external library for this, e.g. simple_parallel (disclaimer: I wrote it), which allows one to write:

extern crate simple_parallel;

fn main() {
    let mut data = vec![1u32, 2, 3, 4, 5];

    let mut pool = simple_parallel::Pool::new(4);

    pool.for_(data.chunks_mut(3), |target| {
        // do stuff with `target`
    });
}

The chunks and chunks_mut methods are the perfect way to split a vector/slice of Ts into fixed-size chunks (only the last chunk may be shorter): they return iterators yielding &[T] and &mut [T] respectively.
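
On stable Rust 1.63 and later, std::thread::scope makes the same chunking pattern work without an external crate; a minimal sketch:

fn main() {
    let mut data = vec![1u32, 2, 3, 4, 5];
    std::thread::scope(|s| {
        // Each spawned thread gets exclusive access to one `&mut [u32]` chunk.
        for target in data.chunks_mut(3) {
            s.spawn(move || {
                for x in target {
                    *x += 10;
                }
            });
        }
    }); // all scoped threads are joined here, so `data` is usable again
    assert_eq!(data, vec![11, 12, 13, 14, 15]);
}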

Ada answered 27/7, 2015 at 17:36 Comment(2)
Thanks! This is very helpful. Though I'd like to sort out all the details of this myself, so I can learn. I hadn't thought of doing this using channels. I will need to think about this some more.Vtarj
I rolled my own with github.com/rust-lang/threadpool . huonw.github.io/blog/2015/05/finding-closure-in-rust was very helpful.Vtarj
