How to pass options to Rust's serde that can be accessed in Deserialize::deserialize()?

For context: I'm writing a ray tracer in Rust, but I'm struggling to find a good way to load the scene in a filesystem-agnostic way. I'm using serde so that I don't have to invent my own file format (yet). The assets (image textures and mesh data) are stored separately from the scene file, which only stores their paths. Because the ray tracer is supposed to be a platform-agnostic library (I want to be able to compile it to WebAssembly for the browser), it has no idea about the file system. I intend to load the assets when deserializing the scene, but this is causing me real problems:

I need to pass an implementation of the file system interfacing code to serde that I can use in Deserialize::deserialize() but there doesn't seem to be any easy way to do that. I came up with a way to do it with generics, but I'm not happy about it.

Here's the way I'm doing it at the moment, stripped down as an MCVE (packages used are serde and serde_json):

The library code (lib.rs):

use std::marker::PhantomData;
use serde::{Serialize, Serializer, Deserialize, Deserializer};

pub struct Image {}

pub struct Texture<L: AssetLoader> {
    path: String,
    image: Image,
    phantom: PhantomData<L>,
}

impl<L: AssetLoader> Serialize for Texture<L> {
    fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
        self.path.serialize(serializer)
    }
}

impl<'de, L: AssetLoader> Deserialize<'de> for Texture<L> {
    fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Texture<L>, D::Error> {
        let path = String::deserialize(deserializer)?;

        // This is where I'd much rather have an instance of AssetLoader
        let image = L::load_image(&path);

        Ok(Texture {
            path,
            image,
            phantom: PhantomData,
        })
    }
}

pub trait AssetLoader {
    fn load_image(path: &str) -> Image;
    // load_mesh(), load_hdr(), ...
}

#[derive(Serialize, Deserialize)]
pub struct Scene<L: AssetLoader> {
    textures: Vec<Texture<L>>,
    // meshes, materials, lights, ...
}

The platform-specific code (main.rs):

use serde::{Serialize, Deserialize};
use assetloader_mcve::{AssetLoader, Image, Scene};

#[derive(Serialize, Deserialize)]
struct AssetLoaderImpl {}

impl AssetLoader for AssetLoaderImpl {
    fn load_image(path: &str) -> Image {
        println!("Loading image: {}", path);
        // Load the file from disk, the web, ...
        Image {}
    }
}

fn main() {
    let scene_str = r#"
    {
      "textures": [
        "texture1.jpg",
        "texture2.jpg"
      ]
    }
    "#;

    let scene: Scene<AssetLoaderImpl> = serde_json::from_str(scene_str).unwrap();

    // ...
}

What I don't like about this approach:

AssetLoaderImpl has to implement Serialize and Deserialize even though it's never (de-)serialized
I'm also using typetag which causes a compilation error because "deserialization of generic impls is not supported yet"
Caching assets will be very difficult because I don't have an instance of AssetLoaderImpl which could cache them in a member variable
Passing the AssetLoader type parameter around is getting unwieldy when Texture (or other assets) are nested deeper
It just doesn't feel right, mostly because of the PhantomData and the abuse of generics

This makes me think that I'm not going about this the right way, but I'm struggling to come up with a better solution. I thought about using a mutable global variable in the library holding an instance of AssetLoader (maybe with lazy_static), but that also doesn't seem right. Ideally I'd pass an instance of AssetLoader (probably a Box<dyn AssetLoader>) to serde when deserializing, so that I can access it in the impl Deserialize for Texture. I haven't found any way to do that and I'd really appreciate it if anybody could point me in the right direction.

For passing in state to deserialization, you should use the DeserializeSeed trait. The documentation for DeserializeSeed addresses this use case:

DeserializeSeed is the stateful form of the Deserialize trait. If you ever find yourself looking for a way to pass data into a Deserialize impl, this trait is the way to do it.

Stateful `AssetLoader`

Like you said, passing AssetLoader as a generic parameter means you aren't able to store a cache (or other things) within it. Using DeserializeSeed, we're able to pass an instance of our AssetLoader struct, so let's modify AssetLoader's functions to give access to self:

pub trait AssetLoader {
    // Adding `&mut self` allows implementers to store data in a cache or 
    // whatever else they want to do.
    fn load_image(&mut self, path: &str) -> Image;
}

Now we can modify the AssetLoaderImpl to use this new definition:

struct AssetLoaderImpl {
    // cache, etc.
}

impl AssetLoader for AssetLoaderImpl {
    fn load_image(&mut self, path: &str) -> Image {
        // Access cache here.
        println!("Loading image: {}", path);
        Image {}
    }
}
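
As a rough illustration of what the cache comment could turn into, here is a minimal sketch of a loader that memoizes images by path. The `CachingLoader` name, its `loads` counter, the `source` field on `Image` and the `Clone` derive are all assumptions made for this example; they are not part of the question's code.

```rust
use std::collections::HashMap;

// Stand-in for the real image type. `Clone` is an assumption of this sketch,
// needed so that cached copies can be handed out by value.
#[derive(Clone, Debug, PartialEq)]
pub struct Image {
    pub source: String,
}

pub trait AssetLoader {
    fn load_image(&mut self, path: &str) -> Image;
}

/// Hypothetical loader that remembers every image it has already loaded.
#[derive(Default)]
pub struct CachingLoader {
    cache: HashMap<String, Image>,
    /// Number of cache misses, i.e. how often a real load was performed.
    pub loads: usize,
}

impl AssetLoader for CachingLoader {
    fn load_image(&mut self, path: &str) -> Image {
        if let Some(image) = self.cache.get(path) {
            return image.clone();
        }
        // Cache miss: this is where the file system or network would be hit.
        self.loads += 1;
        let image = Image { source: path.to_owned() };
        self.cache.insert(path.to_owned(), image.clone());
        image
    }
}
```

For large images, storing and returning something like `Rc<Image>` instead of cloning would probably be preferable; that would mean changing the return type of `load_image`.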

Deserializing with the `AssetLoader`

Now we can use an AssetLoader during deserialization using the DeserializeSeed trait. Since we want this to work for any implementer of AssetLoader (allowing us to keep the filesystem logic separate from our deserialization logic), we still have to use a generic L: AssetLoader, but it no longer has to be attached to the Texture struct (or any structs containing Texture).

A good pattern is to introduce a separate TextureDeserializer type to handle the stateful deserialization, and implement DeserializeSeed on that struct. We can set the Value associated type to indicate that the deserialization should return a Texture.

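// Imports used by the snippets in this section.
use std::fmt;
use serde::de::{self, Deserialize, DeserializeSeed, Deserializer, MapAccess, SeqAccess, Visitor};

pub struct Texture {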
    path: String,
    image: Image,
}

struct TextureDeserializer<'a, L> {
    asset_loader: &'a mut L,
}

impl<'de, L> DeserializeSeed<'de> for TextureDeserializer<'_, L>
where
    L: AssetLoader,
{
    type Value = Texture;

    fn deserialize<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
    where
        D: Deserializer<'de>,
    {
        let path = String::deserialize(deserializer)?;

        let image = self.asset_loader.load_image(&path);

        Ok(Texture { path, image })
    }
}

Notice that the generic AssetLoader parameter is no longer part of the Texture struct itself; it only appears on the TextureDeserializer seed.

We now have to define DeserializeSeed all the way up the chain to Scene's deserialization logic, since the AssetLoader state has to be threaded through the whole process. This may seem very verbose, and it is unfortunate that we can't just derive it with serde-derive, but the advantage of not having deserialization state tied up in the structs we are deserializing far outweighs the extra verbosity.

To deserialize a Vec<Texture>, we define a TexturesDeserializer:

struct TexturesDeserializer<'a, L> {
    asset_loader: &'a mut L,
}

impl<'de, L> DeserializeSeed<'de> for TexturesDeserializer<'_, L>
where
    L: AssetLoader,
{
    type Value = Vec<Texture>;

    fn deserialize<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
    where
        D: Deserializer<'de>,
    {
        struct TexturesVisitor<'a, L> {
            asset_loader: &'a mut L,
        }

        impl<'de, L> Visitor<'de> for TexturesVisitor<'_, L>
        where
            L: AssetLoader,
        {
            type Value = Vec<Texture>;

            fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                formatter.write_str("a sequence of Textures")
            }

            fn visit_seq<A>(self, mut seq: A) -> Result<Self::Value, A::Error>
            where
                A: SeqAccess<'de>,
            {
                let mut textures = Vec::new();

                while let Some(texture) = seq.next_element_seed(TextureDeserializer {
                    asset_loader: self.asset_loader,
                })? {
                    textures.push(texture);
                }

                Ok(textures)
            }
        }

        deserializer.deserialize_seq(TexturesVisitor {
            asset_loader: self.asset_loader,
        })
    }
}

And a SceneDeserializer to deserialize the Scene itself:

pub struct Scene {
    textures: Vec<Texture>,
}

pub struct SceneDeserializer<'a, L> {
    pub asset_loader: &'a mut L,
}

impl<'de, L> DeserializeSeed<'de> for SceneDeserializer<'_, L>
where
    L: AssetLoader,
{
    type Value = Scene;

    fn deserialize<D>(self, deserializer: D) -> Result<Self::Value, D::Error>
    where
        D: Deserializer<'de>,
    {
        struct SceneVisitor<'a, L> {
            asset_loader: &'a mut L,
        }

        impl<'de, L> Visitor<'de> for SceneVisitor<'_, L>
        where
            L: AssetLoader,
        {
            type Value = Scene;

            fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                formatter.write_str("struct Scene")
            }

            fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
            where
                A: MapAccess<'de>,
            {
                if let Some(key) = map.next_key()? {
                    if key != "textures" {
                        return Err(de::Error::unknown_field(key, FIELDS));
                    }
                } else {
                    return Err(de::Error::missing_field("textures"));
                }

                let textures = map.next_value_seed(TexturesDeserializer {
                    asset_loader: self.asset_loader,
                })?;

                Ok(Scene { textures })
            }
        }

        const FIELDS: &[&str] = &["textures"];
        deserializer.deserialize_struct(
            "Scene",
            FIELDS,
            SceneVisitor {
                asset_loader: self.asset_loader,
            },
        )
    }
}

Note that the DeserializeSeed definitions above are very similar to what #[derive(Deserialize)] would generate (in the case of Scene) and to what serde already defines for Vec<T>. However, defining these custom implementations allows state to be passed through the whole process into the deserialization of Texture.

Putting it all together

Now we can use serde_json to deserialize from our JSON input. Note that serde_json does not provide any helper methods for deserializing with DeserializeSeed (there has been discussion on this in the past), so we have to drive a serde_json::Deserializer manually, with the DeserializeSeed trait in scope so that the seed's deserialize method can be called. Luckily, it's pretty simple to use:

fn main() {
    let mut asset_loader = AssetLoaderImpl {
        // cache, etc.
    };

    let scene_str = r#"
    {
      "textures": [
        "texture1.jpg",
        "texture2.jpg"
      ]
    }
    "#;

    // `DeserializeSeed` must be in scope for the `.deserialize(...)` call below.
    let mut deserializer = serde_json::Deserializer::from_str(scene_str);
    let scene = SceneDeserializer {
        asset_loader: &mut asset_loader,
    }
    .deserialize(&mut deserializer)
    .expect("failed to deserialize scene");

    // ...
}

Now we can deserialize a Scene with a stateful AssetLoader. This can easily be extended to give other members of Scene access to other resources during deserialization. And best of all, it keeps the deserialization state decoupled from the deserialized structs themselves, meaning that outside of deserialization you don't need to care which AssetLoader was used.
