Understanding a thread safe RwLock<Arc<T>> mechanism in Rust
Asked Answered
P

1

3

Background

I am completely new to Rust (started yesterday) and I'm trying to ensure I've understood correctly. I am looking to write a configuration system for a 'game', and want it to be fast access but occasionally mutable. To start, I wanted to investigate localization which seemed a reasonable use case for static configuration (as I appreciate such things are generally not 'Rusty' otherwise). I came up with the following (working) code, based in part on this blog post (found via this question). I've included here for reference, but feel free to skip it over for now...

#[macro_export]
macro_rules! localize {
    (@single $($x:tt)*) => (());
    (@count $($rest:expr),*) => (<[()]>::len(&[$(localize!(@single $rest)),*]));

    ($name:expr $(,)?) => { LOCALES.lookup(&Config::current().language, $name) };
    ($name:expr, $($key:expr => $value:expr,)+) => { localize!(&Config::current().language, $name, $($key => $value),+) };
    ($name:expr, $($key:expr => $value:expr),*) => ( localize!(&Config::current().language, $name, $($key => $value),+) );

    ($lang:expr, $name:expr $(,)?) => { LOCALES.lookup($lang, $name) };
    ($lang:expr, $name:expr, $($key:expr => $value:expr,)+) => { localize!($lang, $name, $($key => $value),+) };
    ($lang:expr, $name:expr, $($key:expr => $value:expr),*) => ({
        let _cap = localize!(@count $($key),*);
        let mut _map : ::std::collections::HashMap<String, _>  = ::std::collections::HashMap::with_capacity(_cap);
        $(
            let _ = _map.insert($key.into(), $value.into());
        )*
        LOCALES.lookup_with_args($lang, $name, &_map)
    });
}

use fluent_templates::{static_loader, Loader};
use std::sync::{Arc, RwLock};
use unic_langid::{langid, LanguageIdentifier};

static_loader! {
    static LOCALES = {
        locales: "./resources",
        fallback_language: "en-US",
        core_locales: "./resources/core.ftl",
        // Removes unicode isolating marks around arguments, you typically
        // should only set to false when testing.
        customise: |bundle| bundle.set_use_isolating(false)
    };
}
#[derive(Debug, Clone)]
struct Config {
    #[allow(dead_code)]
    debug_mode: bool,
    language: LanguageIdentifier,
}

#[allow(dead_code)]
impl Config {
    pub fn current() -> Arc<Config> {
        CURRENT_CONFIG.with(|c| c.read().unwrap().clone())
    }
    pub fn make_current(self) {
        CURRENT_CONFIG.with(|c| *c.write().unwrap() = Arc::new(self))
    }
    pub fn set_debug(debug_mode: bool) {
        CURRENT_CONFIG.with(|c| {
            let mut writer = c.write().unwrap();
            if writer.debug_mode != debug_mode {
                let mut config = (*Arc::clone(&writer)).clone();
                config.debug_mode = debug_mode;
                *writer = Arc::new(config);
            }
        })
    }
    pub fn set_language(language: &str) {
        CURRENT_CONFIG.with(|c| {
            let l: LanguageIdentifier = language.parse().expect("Could not set language.");
            let mut writer = c.write().unwrap();
            if writer.language != l {
                let mut config = (*Arc::clone(&writer)).clone();
                config.language = l;
                *writer = Arc::new(config);
            }
        })
    }
}

impl Default for Config {
    fn default() -> Self {
        Config {
            debug_mode: false,
            language: langid!("en-US"),
        }
    }
}

thread_local! {
    static CURRENT_CONFIG: RwLock<Arc<Config>> = RwLock::new(Default::default());
}

fn main() {
    Config::set_language("en-GB");
    println!("{}", localize!("apologize"));
}

I've not included the tests for brevity. I would welcome feedback on the localize macro too (as I'm not sure whether I've done that right).

Question

Understanding Arc cloning

However, my main question is on this bit of code in particular (there is a similar example in set_language too):

    pub fn set_debug(debug_mode: bool) {
        CURRENT_CONFIG.with(|c| {
            let mut writer = c.write().unwrap();
            if writer.debug_mode != debug_mode {
                let mut config = (*Arc::clone(&writer)).clone();
                config.debug_mode = debug_mode;
                *writer = Arc::new(config);
            }
        })
    }

Although this works, I want to ensure it is the right approach. From my understanding it

  1. Get's a write lock on the config Arc struct.
  2. Checks for changes, and, if changed:
  3. Calls Arc::clone() on the writer (which will automatically DeRefMut the parameter to an Arc before cloning). This doesn't actually 'clone' the struct but increments the reference counter (so should be fast)?
  4. Call Config::clone due to step 3 being wrapped in (*...) - is this the right approach? My understanding is this does now clone the Config, producing a mutable owned instance, which I can then modify.
  5. Mutates the new config setting the new debug_mode.
  6. Creates a new Arc<Config> from this owned Config.
  7. Updates the static CURRENT_CONFIG.
  8. Releases the reference counter to the old Arc<Config> (potentially freeing the memory if nothing else is currently using it).
  9. Releases the write lock.

If I understand this correctly, then only one memory alloc will occur in step 4. Is that right? Is step 4 the right way to go about this?

Understanding performance implications

Similarly, this code:

LOCALES.lookup(&Config::current().language, $name)

Should be quick under normal use as it uses this function:

    pub fn current() -> Arc<Config> {
        CURRENT_CONFIG.with(|c| c.read().unwrap().clone())
    }

Which gets a ref-counted pointer to the current config, without actually copying it (the clone() should call Arc::clone() as above), using a read lock (fast unless a write is occurring).

Understanding thread_local! macro use

If all that is good, then great! However, I'm then stuck on this last bit of code:

thread_local! {
    static CURRENT_CONFIG: RwLock<Arc<Config>> = RwLock::new(Default::default());
}

Surely this is wrong? Why are we creating the CURRENT_CONFIG as a thread_local. My understanding (admittedly from other languages, combined with the limited docs) means that there will be a unique version to the currently executing thread, which is pointless as a thread cannot interrupt itself? Normally I would expect a truly static RwLock shared across multiple thread? Am I misunderstanding something or is this a bug in the original blog post?

Indeed, the following test seems to confirm my suspicions:

    #[test]
    fn config_thread() {
        Config::set_language("en-GB");
        assert_eq!(langid!("en-GB"), Config::current().language);
        let tid = thread::current().id();
        let new_thread =thread::spawn(move || {
            assert_ne!(tid, thread::current().id());
            assert_eq!(langid!("en-GB"), Config::current().language);
        });

        new_thread.join().unwrap();
    }

Produces (demonstrating that the config is not shared across thread):

thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
  left: `LanguageIdentifier { language: Language(Some("en")), script: None, region: Some(Region("GB")), variants: None }`,
 right: `LanguageIdentifier { language: Language(Some("en")), script: None, region: Some(Region("US")), variants: None }`
Perionychium answered 16/8, 2021 at 17:35 Comment(5)
Removing the thread_local does appear to fix my tests, including ensuring Config state is shared across threads and updatable safely, full code below (makes use of latest SyncLazy from nightly builds though:Perionychium
full code is available herePerionychium
(*Arc::clone(&writer)).clone() looks like an unnecessary clone of the Arc - writer.as_ref().clone() should achieve the same purpose without the inner clone. While cloning an Arc is cheap compared to copying an allocated type, it is not free because it involves memory barriers when manipulating the atomic counter. (The counter is updated once when creating the temporary clone of the Arc and again when it is destroyed - and those cannot be optimized away because they can be visible to other threads, so the compiler must generate both adjustments.)Circumambient
Thanks @user4815162342, does Arc::_as_ref() increment the ref count correctly?Perionychium
as_ref() doesn't increment the refcount at all. It gives you a &T that is not allowed to outlive the Arc that handed it out. You can use that &T, in this case to call T::clone() without touching the reference count of the Arc. And the fact that the reference can't outlive the Arc guarantees that the object cannot be destructed while you're using the reference.Circumambient
L
3

The section of the blog post you are referring to is, in my opinion, not very good.

You are correct that RwLock here is bogus - it can be replaced with a RefCell as it is thread local.

The justification for the approach in the blog post is flimsy:

However, in the previous example we introduced interior mutability. Imagine we have multiple threads running, all referencing the same config, but one flips a flag. What happens to concurrently running code that now is not expecting the flag to randomly flip?

The entire point of a RwLock is that modifications cannot be made while the object is locked for reading (i.e. the RwLockReadGuard returned from RwLock::read() is alive). So an Arc<RwLock<Config>> won't have your flags "randomly flipped" while a read lock is taken out. (Granted, it can be an issue, if you release the lock and take it again, and assume the flag has not changed in the meantime.)

The section also doesn't specify how an update to a configuration would actually take place. You'd need a mechanism to signal the other threads that a config change had taken place (ex. a channel) and the threads themselves would have to update their own thread-local variable with the new configuration.

Ultimately, I'd just consider that section as bad advice, and certainly not tailored to a beginner.

Lamond answered 16/8, 2021 at 22:6 Comment(3)
I agree overall the blog is very clumsy and incomplete. I think the only additional useful thing to mention to OP is that, overall, this is just implementing a very basic RCU pattern. The arc_swap crate more or less handles this completely without the need for a heavy-weight lock.Woodall
Thanks, @GManNickG, that looks exactly like what I need!Perionychium
Thanks @[Colonel Thirty Two], that confirmed my suspicion.Perionychium

© 2022 - 2024 — McMap. All rights reserved.