Using serde for two (de)serialization formats

I have successfully used serde_json to deserialize and serialize JSON. My setup looks somewhat like this (very simplified):

use serde::{Deserialize, Serialize};
use serde_json;
use serde_with::skip_serializing_none;

#[skip_serializing_none]
#[derive(Deserialize, Serialize)]
#[serde(rename_all = "camelCase")]
struct Foo {
    #[serde(flatten)]
    bar: Option<Bar>,
    
    baz_quux: Option<u8>,
}

#[skip_serializing_none]
#[derive(Deserialize, Serialize)]
struct Bar {
    #[serde(rename = "plughXyzzySomeRandomStuff")]
    plugh_xyzzy: Option<u8>
}

And then I have implemented FromStr and Display on Foo, which in turn call serde_json::from_str and serde_json::to_string respectively, to easily (de)serialize the struct.

However, I'd now like to also use serde_ini to support (de)serializing INI files as well, to the same Rust data structure. But I can't really figure out how to do that.

The structure itself is simple enough, but my specific problems are with the attributes:

  • The keys are named differently in the JSON and INI formats (the JSON format uses the customary camelCase, while the INI doesn't), so I have to solve the #[serde(rename)] and #[serde(rename_all)] attributes some other way, but I'm not sure where or how.
  • #[serde(flatten)] doesn't seem to work with serde_ini's all-string values, which require a #[serde(deserialize_with = "from_str")] attribute for all non-string values, but this should obviously only apply to the INI values and not the JSON ones.

So all in all, I guess what I need to do is to re-implement these attributes, or use them conditionally based on what (De)Serializer is used, but I'm stumped on how to do that.

Gravante answered 23/10, 2021 at 19:29 Comment(3)
I understand that #[serde(alias)] could be used to deserialize from multiple formats, but not for serializing. - Gravante
That's not possible: the two format specifications are incompatible if they don't use the same keys. This should be implemented as two structures. - Rene
This is something that I was looking into last year; ultimately I ended up following @dtolnay's advice to pass context to the Serialize impl via a thread local. - Agglutinative
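The thread-local idea from the last comment could look roughly like the sketch below. It uses only std so that it compiles on its own; the `Format` enum, `with_format`, `baz_quux_key`, and the INI key name `baz_quux` are all hypothetical names chosen for illustration, not part of serde or of the question. A hand-written Serialize impl would call something like `baz_quux_key()` to decide which field name to emit.

```rust
use std::cell::Cell;

/// Hypothetical marker for the format currently being written.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Format {
    Json,
    Ini,
}

thread_local! {
    // Defaults to JSON; each thread has its own independent value.
    static CURRENT_FORMAT: Cell<Format> = Cell::new(Format::Json);
}

/// Runs `f` with `format` active, then restores the previous format.
/// Note: the previous value is not restored if `f` panics.
fn with_format<R>(format: Format, f: impl FnOnce() -> R) -> R {
    let previous = CURRENT_FORMAT.with(|c| c.replace(format));
    let result = f();
    CURRENT_FORMAT.with(|c| c.set(previous));
    result
}

/// What a manual Serialize impl could consult to pick a key name.
/// The INI spelling here is an assumption made for the example.
fn baz_quux_key() -> &'static str {
    match CURRENT_FORMAT.with(|c| c.get()) {
        Format::Json => "bazQuux",
        Format::Ini => "baz_quux",
    }
}
```

The trade-off is that the derive macros can no longer be used for the affected fields, since the key names are chosen at run time.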

This is a limitation of serde's design. The Deserialize and Serialize implementations are intentionally separated from the Serializer and Deserializer implementations, which gives great flexibility and convenience when choosing different formats and swapping them out. Unfortunately, it means it isn't possible to individually fine-tune your Deserialize and Serialize implementations for different formats.

The way I have done this before is to duplicate the data types so that I can configure them for each format, and then provide a zero-cost conversion between them.
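As a minimal sketch of that conversion, assuming two structs with the same fields: the serde derives and per-format attributes are left out so the example compiles with std alone, and the names `JsonFoo`, `IniFoo`, `name`, and `tags` are hypothetical. Each field is moved rather than cloned, so it also works for fields that are not Copy.

```rust
/// Shape that would carry the JSON-specific serde attributes (omitted here).
#[derive(Debug, PartialEq)]
struct JsonFoo {
    name: String,
    tags: Vec<String>,
    baz_quux: Option<u8>,
}

/// Same fields; this type would carry the INI-specific attributes instead.
#[derive(Debug, PartialEq)]
struct IniFoo {
    name: String,
    tags: Vec<String>,
    baz_quux: Option<u8>,
}

impl From<JsonFoo> for IniFoo {
    fn from(foo: JsonFoo) -> Self {
        // Fields are moved out of `foo`; the heap buffers behind the
        // String and the Vec are reused, not copied.
        IniFoo {
            name: foo.name,
            tags: foo.tags,
            baz_quux: foo.baz_quux,
        }
    }
}

impl From<IniFoo> for JsonFoo {
    fn from(foo: IniFoo) -> Self {
        JsonFoo {
            name: foo.name,
            tags: foo.tags,
            baz_quux: foo.baz_quux,
        }
    }
}
```

Whether the compiler removes the move entirely depends on optimization, but no allocation or deep copy takes place in either direction.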

Rocray answered 23/10, 2021 at 20:10 Comment(6)
Thanks! The idea had occurred to me; glad it wasn't completely crazy (I had just hoped that there was a more elegant way). But what's the idiomatic way to provide a zero-cost conversion between the two structs? - Gravante
@tobiasvl: Assuming the structs have identical layouts, which I think the compiler can only guarantee if they are repr(C), I'd suggest using a union. - Agglutinative
@Gravante Implement From and that's all; Rust will be smart enough. - Rene
@Rene Does that really end up as zero-cost? Interesting! That's what I ended up doing, so that's cool then. A lot of code duplication though (there's basically two of every struct and enum), but if that can't be helped then I'm fine with it. Works like a charm, at least! Thanks! - Gravante
@Agglutinative I'm running into an issue trying to follow this... am I correct in understanding this approach will not work if any field doesn't implement Copy? That is pretty restrictive (no Strings, Vecs...) - Internationale
@KyleCarow No, that's not correct. There is no need to copy or clone; you generally would write it in such a way that the data is moved instead. - Rocray

I found another approach (documented in https://github.com/serde-rs/serde/issues/2660). My problem was that I wanted to serialize a struct to csv and json, but csv doesn't know how to handle sequences.

The idea is to give my type a generic parameter that knows how to serialize the format-specific parts. I use a newtype around Vec<String> so that I can implement Serialize for it myself. The trait and the wrapper type look like this:

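use std::marker::PhantomData;

// SerializeSeq provides serialize_element and end, used by the JSON impl.
use serde::ser::SerializeSeq;
use serde::Serialize;

pub trait SerializationType {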
    fn serialize<'a, S>(
        items: impl Iterator<Item = &'a str>,
        serializer: S,
    ) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer;
}

pub struct StringSet<T: SerializationType>(Vec<String>, PhantomData<T>);

Next, I implement Serialize for my type StringSet<T>:

impl<T> Serialize for StringSet<T>
where
    T: SerializationType,
{
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        T::serialize(self.0.iter().map(|s| &s[..]), serializer)
    }
}

This could be more specific (not using Iterator), because I know that StringSet<T> holds a Vec<String>, but I wanted to make things more generic.

The next steps are implementations for some serialization specifics:

pub struct CsvSerialization;
impl SerializationType for CsvSerialization {
    fn serialize<'a, S>(
        items: impl Iterator<Item = &'a str>,
        serializer: S,
    ) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        let v = Vec::from_iter(items).join(",");
        serializer.serialize_str(&v)
    }
}

pub struct JsonSerialization;
impl SerializationType for JsonSerialization {
    fn serialize<'a, S>(
        items: impl Iterator<Item = &'a str>,
        serializer: S,
    ) -> Result<S::Ok, S::Error>
    where
        S: serde::Serializer,
    {
        let mut ser = serializer.serialize_seq(None)?;
        for item in items {
            ser.serialize_element(item)?;
        }
        ser.end()
    }
}

The usage of this is very simple:


#[cfg(test)]
mod tests {
    use serde::Serialize;

    use super::{CsvSerialization, JsonSerialization, SerializationType, StringSet};

    #[derive(Serialize)]
    #[serde(bound = "T: SerializationType")]
    struct SampleRecord<T: SerializationType> {
        data: StringSet<T>,
    }

    fn test_data<T>() -> SampleRecord<T>
    where
        T: SerializationType,
    {
        SampleRecord {
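            // Relies on a From impl for StringSet<T> that accepts an iterator
            // of &str; that impl is not shown in this answer.
            data: StringSet::<T>::from(vec!["a", "b", "c"].into_iter()),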
        }
    }

    #[test]
    fn test_serialize_csv() {
        let mut wtr = csv::Writer::from_writer(vec![]);
        wtr.serialize(&test_data::<CsvSerialization>()).unwrap();

        let result = String::from_utf8(wtr.into_inner().unwrap()).unwrap();

        assert_eq!(
            result,
            r#"data
"a,b,c"
"#
        );
    }

    #[test]
    fn test_serialize_json() {
        let result = serde_json::to_string(&test_data::<JsonSerialization>()).unwrap();
        assert_eq!(result, r#"{"data":["a","b","c"]}"#);
    }
}

As you can see, SampleRecord has a generic parameter which specifies how to serialize StringSet values. Using this parameter, I don't need to create multiple types, because SampleRecord<CsvSerialization> is already different from SampleRecord<JsonSerialization>. But converting one type into the other is pretty simple.
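One way that conversion could be written is sketched below, under some assumptions: the `SerializationType` bound and the serde impls are dropped so the example compiles with std alone, and `from_strs` and `retag` are hypothetical helper names, not something from the original code. Since the marker only lives in `PhantomData`, switching it just moves the inner Vec.

```rust
use std::marker::PhantomData;

// Marker types standing in for the format selectors defined above.
struct CsvSerialization;
struct JsonSerialization;

// Same shape as the wrapper above, minus the trait bound.
struct StringSet<T>(Vec<String>, PhantomData<T>);

impl<T> StringSet<T> {
    /// Hypothetical constructor that collects string slices into the set.
    fn from_strs<'a>(items: impl IntoIterator<Item = &'a str>) -> Self {
        StringSet(items.into_iter().map(String::from).collect(), PhantomData)
    }

    /// Hypothetical helper: switches the format marker from T to U.
    /// The Vec is moved, so nothing is cloned or reallocated.
    fn retag<U>(self) -> StringSet<U> {
        StringSet(self.0, PhantomData)
    }
}
```

A `SampleRecord<T>` could get the same kind of method by calling `retag` on each `StringSet` field.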

Sparkie answered 11/12, 2023 at 16:22 Comment(0)
