Multithreaded Rust

Memory safety is one of the most pressing concerns in writing multithreaded applications. Most programming languages provide the tools necessary to avoid data races, but also the freedom to misuse them. Consistent with Rust's theme of guarantees, safety across threads is a guarantee.

Rust is a very natural language in which to implement a multithreaded application: all shared data must be borrowed immutably, deep copied, or borrowed mutably by exactly one thread or scope at a time.

The main synchronization primitives in Rust are as follows:

  • An Arc (atomically reference-counted pointer) is a container which may be shared across threads and provides immutable access to a value of an arbitrary type. Mutation through an Arc requires a type with interior mutability, such as an atomic type or a Mutex.

  • A Mutex is a container which functions mostly like a traditional mutex, however it contains a mutable value of an arbitrary type, rather than requiring the user to enforce safety independently. The lock is automatically released when the guard returned by lock() goes out of scope, playing nicely into the existing borrow model.

  • A RwLock is a container that functions like a mutex, but at any given time it allows either any number of readers or a single writer.

  • Channels allow for message-passing between threads.
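As a minimal sketch of the first primitive, an Arc hands out shared, read-only access to a single allocation; the `Arc::clone` calls bump a reference count rather than deep-copying the data:

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    // One heap allocation, shared by reference count rather than deep copy.
    let data = Arc::new(vec![1, 2, 3]);

    let mut handles = vec![];
    for _ in 0..4 {
        // Each clone increments the reference count; the Vec is not copied.
        let data = Arc::clone(&data);
        handles.push(thread::spawn(move || {
            // Threads may read the shared value, but cannot mutate it.
            data.iter().sum::<i32>()
        }));
    }

    for handle in handles {
        assert_eq!(handle.join().unwrap(), 6);
    }
}
```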

Creating threads

Threads are created with thread::spawn()

thread::spawn(move || {
    // `foo` is moved out of the enclosing scope into the new thread
    do_something(foo)
});

thread::spawn() accepts a closure and executes its contents in a new thread. You will generally be passing a move closure, which tells Rust that the closure, and by extension the thread, should take ownership of any variables it captures. This is important, as a thread can exist for any arbitrary length of time, and there would otherwise be no way to guarantee that variables created in the enclosing scope continue to exist. This is an example of a common use-after-free scenario that Rust, by design, simply does not allow.

Design patterns

If you are familiar with threaded programming in other languages, Rust's interpretation will feel familiar.

Commonly you will want to create many threads at once, and then wait for all of them to complete.

This can be achieved by keeping a collection of thread handles:

use std::thread;

let mut threads = vec![];

for i in 0..10 {
    // The closure returns a value, which join() later hands back.
    threads.push(thread::spawn(move || {
        do_something(i)
    }));
}

for thread in threads {
    let ret = thread.join().unwrap();
    println!("thread return value: {}", ret);
}

Sharing data between threads requires that you choose a thread-safe container for it. A Mutex, Arc, or RwLock may be used depending on your needs.
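Mutable shared state typically combines two of these containers: an Arc for shared ownership and a Mutex for exclusive access. A minimal sketch, using a shared counter as a stand-in for real work:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Arc provides shared ownership; Mutex provides exclusive mutation.
    let counter = Arc::new(Mutex::new(0));

    let mut handles = vec![];
    for _ in 0..10 {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            // lock() blocks until the mutex is free; the guard releases
            // the lock when it goes out of scope at the end of the closure.
            let mut count = counter.lock().unwrap();
            *count += 1;
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    assert_eq!(*counter.lock().unwrap(), 10);
}
```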

You may also want to use channels for directional communication.

Channels are used for allowing threads to continuously and asynchronously push data to a consumer.

use std::thread;
use std::sync::mpsc::channel;

let (tx, rx) = channel();
for i in 0..10 {
    let tx = tx.clone();
    thread::spawn(move || {
        // pass the channel's tx end in as a sender to write results to
        do_something(i, tx);
    });
    });
}

for _ in 0..10 {
    let ret = rx.recv().unwrap();

    println!("something has been done!: {}", ret);
}

Barriers are an interesting synchronization primitive, allowing threads to rendezvous at a particular point in their work. This gives each thread more autonomy than spawning threads in multiple waves each time a stage of work is completed.

A barrier is initialized with a usize value determining how many calls to wait() are needed to release it.

use std::sync::Barrier;

let barrier = Barrier::new(8);

This barrier can then be wrapped in an Arc, cloned, and passed into 8 different threads. When a thread reaches the point where it must wait for the others, it calls wait() on the barrier, which blocks until wait() has been called by all 8 threads.
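A minimal sketch of that rendezvous, with an atomic counter standing in for the per-thread work so the post-barrier invariant is visible:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Barrier};
use std::thread;

fn main() {
    let barrier = Arc::new(Barrier::new(8));
    let arrived = Arc::new(AtomicUsize::new(0));

    let mut handles = vec![];
    for _ in 0..8 {
        let barrier = Arc::clone(&barrier);
        let arrived = Arc::clone(&arrived);
        handles.push(thread::spawn(move || {
            // Stand-in for this thread's share of the work.
            arrived.fetch_add(1, Ordering::SeqCst);
            // Blocks until all 8 threads have called wait().
            barrier.wait();
            // Past the barrier, every thread observes all 8 arrivals.
            assert_eq!(arrived.load(Ordering::SeqCst), 8);
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }
}
```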

Condvars are useful as a low-cost alternative to channels when you are looking for behavior more similar to Go's channel-synchronization pattern. Condvars allow threads to wait for a signal that may be sent to one (notify_one()) or all (notify_all()) waiters.
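A Condvar in std is always paired with a Mutex guarding the condition being waited on. A minimal sketch, with one thread signalling readiness to another:

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

fn main() {
    // The Mutex guards the condition (a bool); the Condvar signals changes to it.
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair2 = Arc::clone(&pair);

    thread::spawn(move || {
        let (lock, cvar) = &*pair2;
        let mut ready = lock.lock().unwrap();
        *ready = true;
        // Wake one thread blocked in wait().
        cvar.notify_one();
    });

    let (lock, cvar) = &*pair;
    let mut ready = lock.lock().unwrap();
    // Loop to guard against spurious wakeups: re-check the condition
    // each time wait() returns.
    while !*ready {
        ready = cvar.wait(ready).unwrap();
    }
    assert!(*ready);
}
```

wait() atomically releases the mutex while blocking and re-acquires it before returning, which is why the guard is passed in and handed back.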