How can I deserialize a type where all the fields are default values as a None instead?

2

10

I have to deserialize JSON blobs where in some places the absence of an entire object is encoded as an object with the same structure but all of its fields set to default values (empty strings and zeroes).

extern crate serde_json; // 1.0.27
#[macro_use] extern crate serde_derive; // 1.0.78
extern crate serde; // 1.0.78

#[derive(Debug, Deserialize)]
struct Test<T> {
    text: T,
    number: i32,
}

#[derive(Debug, Deserialize)]
struct Outer {
    test: Option<Test<String>>,
}

#[derive(Debug, Deserialize)]
enum Foo { Bar, Baz }
#[derive(Debug, Deserialize)]
struct Outer2 {
    test: Option<Test<Foo>>,
}

fn main() {
    println!("{:?}", serde_json::from_str::<Outer>(r#"{ "test": { "text": "abc", "number": 42 } }"#).unwrap());
    // good: Outer { test: Some(Test { text: "abc", number: 42 }) }

    println!("{:?}", serde_json::from_str::<Outer>(r#"{ "test": null }"#).unwrap());
    // good: Outer { test: None }

    println!("{:?}", serde_json::from_str::<Outer>(r#"{ "test": { "text": "", "number": 0 } }"#).unwrap());
    // bad: Outer { test: Some(Test { text: "", number: 0 }) }
    // should be: Outer { test: None }

    println!("{:?}", serde_json::from_str::<Outer2>(r#"{ "test": { "text": "Bar", "number": 42 } }"#).unwrap());
    // good: Outer2 { test: Some(Test { text: Bar, number: 42 }) }

    println!("{:?}", serde_json::from_str::<Outer2>(r#"{ "test": { "text": "", "number": 0 } }"#).unwrap());
    // bad: error
    // should be: Outer2 { test: None }
}

I would handle this after deserialization, but as you can see, that approach is not possible for enum values: no variant matches the empty string, so the deserialization fails entirely.

How can I teach this to serde?

Err answered 2/10, 2018 at 15:10 Comment(0)
1

There are two things that need to be solved here: replacing Some(value) with None if value is all defaults, and handling the empty string case for Foo.

The first thing is easy. The Deserialize implementation for Option unconditionally deserializes it as Some if the input field isn't null, so you need a custom deserialization function that replaces Some(value) with None if the value is equal to some sentinel, like the default (this is the answer proposed by Issac, but implemented correctly here):

use serde::{Deserialize, Deserializer};

fn none_if_all_default<'de, T, D>(deserializer: D) -> Result<Option<T>, D::Error>
where
    T: Deserialize<'de> + Default + Eq,
    D: Deserializer<'de>,
{
    Option::deserialize(deserializer).map(|opt| match opt {
        Some(value) if value == T::default() => None,
        opt => opt,
    })
}

#[derive(Deserialize)]
struct Outer<T: Eq + Default> {
    #[serde(deserialize_with = "none_if_all_default")]
    #[serde(bound(deserialize = "T: Deserialize<'de>"))]
    test: Option<Test<T>>,
}

This solves the first half of your problem, with Option<Test<String>>. It works for any deserializable type that is Eq + Default, which means Test itself must also derive Default, PartialEq, and Eq.
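The rule inside the closure can also be looked at on its own. The following is a minimal sketch that leaves serde out entirely so that it only needs the standard library; the helper name none_if_default and the non-generic Test struct are made up for illustration and are not part of the code above.

```rust
/// Collapse `Some(value)` into `None` when `value` equals `T::default()`.
/// Hypothetical helper: the same guard as in `none_if_all_default`,
/// but applied to an already-built `Option` instead of a deserializer.
fn none_if_default<T: Default + PartialEq>(opt: Option<T>) -> Option<T> {
    opt.filter(|value| *value != T::default())
}

// Illustrative stand-in for the question's struct, with the derives
// that the comparison against the default value requires.
#[derive(Debug, Default, PartialEq)]
struct Test {
    text: String,
    number: i32,
}

fn main() {
    let empty = Test { text: String::new(), number: 0 };
    let real = Test { text: "abc".to_string(), number: 42 };

    // An all-default value is treated as absent.
    assert_eq!(none_if_default(Some(empty)), None);
    // Anything else is passed through untouched.
    assert_eq!(
        none_if_default(Some(real)),
        Some(Test { text: "abc".to_string(), number: 42 })
    );
    // `None` stays `None`.
    assert_eq!(none_if_default::<Test>(None), None);
}
```

Option::filter does the same job as the match with a guard; which one to use is a matter of taste.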

The enum case is trickier; the problem you're faced with is that Foo simply won't deserialize from a string other than "Bar" or "Baz". I don't really see a good solution for this other than adding a third "dead" variant to the enum:

#[derive(PartialEq, Eq, Deserialize)]
enum Foo {
    Bar,
    Baz,

    #[serde(rename = "")]
    Absent,
}

impl Default for Foo { fn default() -> Self { Self::Absent } }

The reason this problem exists from a data-modeling point of view is that the deserializer has to account for the possibility that you'll get JSON like this:

{ "test": { "text": "", "number": 42 } }

In this case, clearly Outer { test: None } is not the correct result, but it still needs a value to store in Foo, or else return a deserialization error.
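To make the consequence concrete, here is a small serde-free sketch of what the Absent approach leaves you with once the values exist. The normalize function is a hypothetical name, and the values are built by hand rather than parsed, so this only illustrates the comparison against the default, not the deserialization itself.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Foo {
    Bar,
    Baz,
    Absent,
}

impl Default for Foo {
    fn default() -> Self {
        Foo::Absent
    }
}

#[derive(Debug, Default, PartialEq, Eq)]
struct Test<T> {
    text: T,
    number: i32,
}

/// Hypothetical helper: treat a value as missing only when every field
/// is at its default, i.e. `text` is `Absent` and `number` is 0.
fn normalize(test: Test<Foo>) -> Option<Test<Foo>> {
    Some(test).filter(|t| *t != Test::default())
}

fn main() {
    // Corresponds to {"text": "", "number": 0}: collapsed to None.
    assert_eq!(normalize(Test { text: Foo::Absent, number: 0 }), None);

    // Corresponds to {"text": "", "number": 42}: kept, with `Absent` inside.
    assert_eq!(
        normalize(Test { text: Foo::Absent, number: 42 }),
        Some(Test { text: Foo::Absent, number: 42 })
    );

    // Ordinary values are unaffected, even when `number` happens to be 0.
    assert_eq!(
        normalize(Test { text: Foo::Bar, number: 42 }),
        Some(Test { text: Foo::Bar, number: 42 })
    );
    assert_eq!(
        normalize(Test { text: Foo::Baz, number: 0 }),
        Some(Test { text: Foo::Baz, number: 0 })
    );
}
```

The second assertion is the case to keep in mind: code that consumes Test<Foo> can still encounter Foo::Absent and has to decide what to do with it.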

If you want it to be the case that "" is valid text only if number is 0, you could do something significantly more elaborate and probably overkill for your needs, compared to just using Absent. You'd need to use an untagged enum, which can store either a "valid" Test or an "all empty" Test, and then create a version of your struct that only deserializes default values:

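use std::marker::PhantomData;

// `Error` must be in scope for `D::Error::custom`.
use serde::de::{Deserialize, Deserializer, Error};

struct MustBeDefault<T> {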
    marker: PhantomData<T>
}

impl<'de, T> Deserialize<'de> for MustBeDefault<T>
where
    T: Deserialize<'de> + Eq + Default
{
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>
    {
        if T::deserialize(deserializer)? == T::default() {
            Ok(MustBeDefault { marker: PhantomData })
        } else {
            Err(D::Error::custom("value must be default"))
        }
    }
}

// All fields need to be generic in order to use this solution.
// Like I said, this is radically overkill.
#[derive(Deserialize)]
struct Test<T, U> {
    text: T,
    number: U,
}

#[derive(Deserialize)]
#[serde(untagged)]
enum MaybeDefaultedTest<T> {
    AllDefault(Test<EmptyString, MustBeDefault<i32>>),
    Normal(Test<T, i32>),
}

// `EmptyString` is a type that only deserializes from empty strings;
// its implementation is left as an exercise to the reader.
// You'll also need to convert from MaybeDefaultedTest<T> to Option<Test<T, i32>>;
// this is also left as an exercise to the reader.

It is now possible to write MaybeDefaultedTest<Foo>, which will deserialize from things like {"text": "", "number": 0} or {"text": "Baz", "number": 10} or {"text": "Baz", "number": 0}, but will fail to deserialize from {"text": "", "number": 10}.
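For the conversion step mentioned in the comments above, one possible shape is sketched below. It is only a sketch: the types are re-declared without any serde implementations so that the snippet compiles with the standard library alone, EmptyString is reduced to a unit struct placeholder, the Normal variant is assumed to hold Test<T, i32>, and the method name into_option is invented for this example.

```rust
use std::marker::PhantomData;

// Stand-ins for the types above, without any deserialization logic.
#[derive(Debug, PartialEq)]
struct Test<T, U> {
    text: T,
    number: U,
}

// Placeholder: the real type would only deserialize from "".
#[derive(Debug, PartialEq)]
struct EmptyString;

#[derive(Debug, PartialEq)]
struct MustBeDefault<T> {
    marker: PhantomData<T>,
}

#[derive(Debug, PartialEq)]
enum MaybeDefaultedTest<T> {
    AllDefault(Test<EmptyString, MustBeDefault<i32>>),
    Normal(Test<T, i32>),
}

impl<T> MaybeDefaultedTest<T> {
    /// Hypothetical conversion: the all-default shape carries no
    /// information, so it becomes `None`; a normal value is kept.
    fn into_option(self) -> Option<Test<T, i32>> {
        match self {
            MaybeDefaultedTest::AllDefault(_) => None,
            MaybeDefaultedTest::Normal(test) => Some(test),
        }
    }
}

fn main() {
    let absent: MaybeDefaultedTest<String> = MaybeDefaultedTest::AllDefault(Test {
        text: EmptyString,
        number: MustBeDefault { marker: PhantomData },
    });
    let present = MaybeDefaultedTest::Normal(Test {
        text: "Baz".to_string(),
        number: 10,
    });

    assert_eq!(absent.into_option(), None);
    assert_eq!(
        present.into_option(),
        Some(Test { text: "Baz".to_string(), number: 10 })
    );
}
```

An inherent method is used here for simplicity; a From implementation would work just as well if you prefer to call .into().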

Again, for the third time, this solution is probably radically overkill (especially if your real-world use case involves more than 2 fields in the Test struct), and so unless you have very intense data modeling requirements, you should go with adding an Absent variant to Foo.

Ow answered 15/4, 2022 at 21:55 Comment(0)
-3

You can look at an example of custom field deserializing.

In particular, you might want to define something like

extern crate serde; // 1.0.78
#[macro_use]
extern crate serde_derive; // 1.0.78

use serde::{Deserialize, Deserializer, de::Visitor};

fn none_if_all_default<'de, T, D>(deserializer: D) -> Result<Option<T>, D::Error>
where
    T: Deserialize<'de>,
    D: Deserializer<'de> + Clone,
{
    struct AllDefault;

    impl<'de> Visitor<'de> for AllDefault {
        type Value = bool;

        // Implement the visitor functions here -
        // You can recurse over all values to check if they're
        // the empty string or 0, and return true
        //...
    }

    let all_default = deserializer.clone().deserialize_any(AllDefault)?;

    if all_default {
        Ok(None)
    } else {
        Ok(Some(T::deserialize(deserializer)?))
    }
}

And then do

#[derive(Deserialize)]
struct Outer2 {
    #[serde(deserialize_with = "none_if_all_default")]
    test: Option<Test<Foo>>,
}
Kiesha answered 14/10, 2018 at 12:5 Comment(4)
Please enhance your example to show how this could be used for an enum as OP asked. I wasn't able to get my own examples to work with an enum. This also fails to compile due to not all trait items implemented, missing: `expecting`. – Victor
It fails to compile because it's not meant to compile - it's a skeleton answer, and implementing the trait is manual work that can be done by referring to the docs. The point is to demonstrate the high-level idea that solves the OP's problem. This works for an enum field, because it works for any field that's an Option<T>. – Kiesha
deserialize_any consumes self. How do you call T::deserialize once it's gone? The return type of your function is inconsistent (Result::Ok vs. Option::Some). This answer does not work. – Victor
Deserializers cannot be cloned. Note that this back-and-forth is why it's strongly encouraged for you to create a working solution before answering. – Victor
