Change elements in vector using multithreading in Rust

I'm new to Rust and I'm trying to distribute computational work across threads.

I have a vector of strings, and I would like to spawn one thread per string to process it. Here is a simple example:

use std::thread;

fn child_job(s: &mut String) {
    *s = s.to_uppercase();
}

fn main() {
    // initialize
    let mut thread_handles = vec![];
    let mut strings = vec![
        "hello".to_string(),
        "world".to_string(),
        "testing".to_string(),
        "good enough".to_string(),
    ];

    // create threads
    for s in &mut strings {
        thread_handles.push(thread::spawn(|| child_job(s)));
    }

    // wait for threads
    for handle in thread_handles {
        handle.join().unwrap();
    }

    // print result
    for s in strings {
        println!("{}", s);
    }
}

I get these errors when compiling:

error[E0597]: `strings` does not live long enough
  --> src/main.rs:18:14
   |
18 |     for s in &mut strings {
   |              ^^^^^^^^^^^^
   |              |
   |              borrowed value does not live long enough
   |              argument requires that `strings` is borrowed for `'static`
...
31 | }
   | - `strings` dropped here while still borrowed

error[E0505]: cannot move out of `strings` because it is borrowed
  --> src/main.rs:28:14
   |
18 |     for s in &mut strings {
   |              ------------
   |              |
   |              borrow of `strings` occurs here
   |              argument requires that `strings` is borrowed for `'static`
...
28 |     for s in strings {
   |              ^^^^^^^ move out of `strings` occurs here

I can't understand what's wrong with the lifetimes of the references or how I should fix it. To me it looks fine, because every thread gets only one mutable reference to a single string and doesn't affect the vector itself in any way.

Doubleness answered 19/2, 2022 at 4:14 Comment(2)
I don't know much about what you're planning, but if you have many more strings in strings than the typical machine has cores, you probably want to use rayon's par_iter_mut.Booboo
Actually, I should have probably marked this as a duplicate… 1 2 Meh.Booboo
R
6

Caesar's answer shows how to solve the problem using crossbeam's scoped threads. If you don't want to depend on crossbeam, then the approach of wrapping the values in Arc<Mutex<T>>, as shown in tedtanner's answer, is a reasonable general strategy.

However in this case the mutex is really unnecessary because the threads don't share the strings, either with each other or with the main thread. Locking is an artifact of using Arc, which is itself mandated by the static lifetime rather than a need for sharing. Although the locks are uncontended, they do add some overhead and are best avoided. In this case we can avoid both Arc and Mutex by moving each string to its respective thread, and retrieving the modified string once the threads finish.

This modification compiles and runs using just the standard library and safe code, and without requiring Arc or Mutex:

use std::thread;

// ... child_job defined as in the question ...

fn main() {
    let strings = vec![
        "hello".to_string(),
        "world".to_string(),
        "testing".to_string(),
        "good enough".to_string(),
    ];

    // start the threads, giving them the strings
    let mut thread_handles = vec![];
    for mut s in strings {
        thread_handles.push(thread::spawn(move || {
            child_job(&mut s);
            s
        }));
    }

    // wait for threads and re-populate `strings`
    let strings: Vec<String> = thread_handles
        .into_iter()
        .map(|h| h.join().unwrap())
        .collect();

    // print result
    for s in strings {
        println!("{}", s);
    }
}

Russi answered 19/2, 2022 at 11:15 Comment(0)
B
5

With thread::spawn and JoinHandles, the borrow checker isn't smart enough to know that your threads will finish before main exits (this is a bit unfair to the borrow checker, it really can't know), and thus it can't prove that strings will live long enough for your threads to work on it. You can either sidestep that problem by using Arcs like @tedtanner suggests (in a sense, that means you're doing the lifetime management at runtime), or you can use scoped threads.

Scoped threads are essentially a way of telling the borrow checker: Yeah, this thread will finish before that scope ends (gets dropped). And then, you can pass references to things on your current thread's stack to another thread:

crossbeam::thread::scope(|scope| {
    for s in &mut strings {
        scope.spawn(|_| child_job(s));
    }
}) // All spawned threads are auto-joined here, no need for join_handles
.unwrap();

When this answer was written, one needed a crate for scoped threads (crossbeam is used here), but scoped threads have been stable in std, as std::thread::scope, since Rust 1.63.
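
For readers on Rust 1.63 or later, here is a minimal sketch of the same idea using only std::thread::scope. The helper name uppercase_all is made up for this illustration and is not part of the question or of any library; child_job is the function from the question.

```rust
use std::thread;

fn child_job(s: &mut String) {
    *s = s.to_uppercase();
}

// Hypothetical helper (name chosen for this sketch): runs child_job on
// every string, spawning one scoped thread per string.
fn uppercase_all(strings: &mut [String]) {
    thread::scope(|scope| {
        for s in strings.iter_mut() {
            // `move` transfers the `&mut String` itself into the closure.
            scope.spawn(move || child_job(s));
        }
    }); // every thread spawned on `scope` has been joined at this point
}

fn main() {
    let mut strings = vec![
        "hello".to_string(),
        "world".to_string(),
        "testing".to_string(),
        "good enough".to_string(),
    ];

    uppercase_all(&mut strings);

    // `strings` is usable again because the scope has ended.
    for s in &strings {
        println!("{}", s);
    }
}
```

Unlike the crossbeam version, the std closure passed to spawn takes no argument, and thread::scope returns the closure's value directly, so there is no Result to unwrap. If one of the spawned threads panics, thread::scope itself panics once all threads have been joined.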

Booboo answered 19/2, 2022 at 4:43 Comment(6)
The threads will finish because the main function calls .join() on the thread handles. Even if .join() weren't called on them, the Rust compiler wouldn't care. It would just let those threads be cleaned up by the operating system when the program exits.Glyphography
@Glyphography That is true, but I fail to see how it is relevant…? (The borrow checker certainly doesn't know about it.)Booboo
I apologize, I was confused about what you were really saying. I missed the "thus it can't prove that strings will live long enough for your threads to work on it" line.Glyphography
In all fairness, I added that late. ;) Good to know it is necessary.Booboo
Haha Yeah, it helps. I also added my comment "late," meaning late at night so I am not firing on all cylinders :)Glyphography
"It would just let those threads be cleaned up by the operating system when the program exits" - but the program is not guaranteed to exit at the end of main, since it can be called as regular function, not just as an entry point. Furthermore, there's a little code to be run after the main, which will keep the unjoined threads alive a little more.Corrinnecorrival
G
2

Rust doesn't know that your strings will last as long as your threads, so it won't pass a reference to them to the threads. Imagine if you passed a reference to a string to another thread, then the original thread thought it was done with that string and freed its memory. That would cause undefined behavior. Rust prevents this by requiring that everything passed to thread::spawn be 'static: the closure must either own its data (for example through a reference-counted pointer, which ensures the memory doesn't get freed while it is still referenced somewhere) or borrow only data that lives for the whole program, such as string literals stored in the executable binary itself.

Also, Rust won't allow you to share a mutable reference across threads because it is unsafe (multiple threads could try to alter the referenced data at once). You want to use a std::sync::Arc in conjunction with a std::sync::Mutex. Your strings vector would become a Vec<Arc<Mutex<String>>>. Then, you could clone the Arc (using .clone()) and send the clone to another thread. Arc is a pointer that keeps a reference count that gets incremented atomically (read: in a thread-safe way). The mutex allows a thread to lock the string temporarily so other threads can't touch it, and to unlock it later (the thread can safely change the string while it is locked).

Your code would look something like this:

use std::thread;
use std::sync::{Arc, Mutex};

fn child_job(s: Arc<Mutex<String>>) {
    // Lock s so other threads can't touch it. It will get
    // unlocked when it goes out of scope of this function.
    let mut s = s.lock().unwrap();
    *s = s.to_uppercase();
}

fn main() {
    // initialize
    let mut thread_handles = Vec::new();
    let strings = vec![
        Arc::new(Mutex::new("hello".to_string())),
        Arc::new(Mutex::new("world".to_string())),
        Arc::new(Mutex::new("testing".to_string())),
        Arc::new(Mutex::new("good enough".to_string())),
    ];

    // create threads
    for s in &strings {
        // Cloning the Arc only bumps the reference count; the string is not copied.
        let s = Arc::clone(s);
        thread_handles.push(thread::spawn(move || child_job(s)));
    }

    // wait for threads
    for handle in thread_handles {
        handle.join().unwrap();
    }

    // print result
    for s in strings {
        let s = s.lock().unwrap();
        println!("{}", *s);
    }
}
Glyphography answered 19/2, 2022 at 4:31 Comment(3)
Humm? "the child_job function takes ownership"? What do you mean by this? In its type signature, it takes a mutable reference. Also: "Rust won't allow you to share a mutable reference across threads" is true, but not really relevant: If I understood the intricacies correctly, you can't have two mutable references, but you can pass ownership of a mutable reference between threads.Booboo
Oops, yeah that statement is incorrect. I'll change it. However, no you cannot pass a mutable reference to another thread. That would make the data mutable by two threads at least. Remember that the original thread still retains the ability to mutate the data.Glyphography
"you cannot pass a mutable reference to another thread" - with scoped threads, you can (both crossbeam::scope and crossbeam::Scope::spawn take FnOnce, so FnMuts are allowed too). The original thread will not be able to do anything with the referenced data, of course.Corrinnecorrival
