Convert two types into a single type with Serde

I'm writing a program that hooks into a web service which sends back JSON.

When a certain property isn't there, the service provides an empty object with all its fields set to empty strings instead of omitting the value. When the property exists, some of the fields are u64. How can I have Serde handle this case?

Rust Structs

#[derive(Clone, Debug, Deserialize)]
struct WebResponse {
    foo: Vec<Foo>,
}

#[derive(Clone, Debug, Deserialize)]
struct Foo {
    points: Points,
}

#[derive(Clone, Debug, Deserialize)]
struct Points {
    x: u64,
    y: u64,
    name: String,
}

Example JSON

{
    "foo":[
        {
            "points":{
                "x":"",
                "y":"",
                "name":""
            }
        },
        {
            "points":{
                "x":78,
                "y":92,
                "name":"bar"
            }
        }
    ]
}
Petr answered 16/6, 2016 at 22:31 Comment(2)
What should happen when you get an empty string instead of a u64? Do you want to set the field to 0? To None? Something else? (Statistical)
@FrancisGagné I think it would be 0. (Petr)

Serde supports an interesting selection of attributes that can be used to customize the serialization or deserialization for a type while still using the derived implementation for the most part.

In your case, you need to decode a field that can arrive as one of several types, and you don't need information from other fields to decide how to decode it. The #[serde(deserialize_with = "$path")] attribute is well suited to this problem.

We need to define a function that decodes either an empty string or an integer value into a u64. We can use the same function for both fields, since they need the same behavior. This function uses a custom Visitor so that it can handle both strings and integers. It's a bit long, but it makes you appreciate all the work that Serde is doing for you!

extern crate serde;
#[macro_use]
extern crate serde_derive;
extern crate serde_json;

use serde::Deserializer;
use serde::de::{self, Unexpected};
use std::fmt;

#[derive(Clone, Debug, Deserialize)]
struct WebResponse {
    foo: Vec<Foo>,
}

#[derive(Clone, Debug, Deserialize)]
struct Foo {
    points: Points,
}

#[derive(Clone, Debug, Deserialize)]
struct Points {
    #[serde(deserialize_with = "deserialize_u64_or_empty_string")]
    x: u64,
    #[serde(deserialize_with = "deserialize_u64_or_empty_string")]
    y: u64,
    name: String,
}

struct DeserializeU64OrEmptyStringVisitor;

impl<'de> de::Visitor<'de> for DeserializeU64OrEmptyStringVisitor {
    type Value = u64;

    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
        formatter.write_str("an integer or a string")
    }

    fn visit_u64<E>(self, v: u64) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        Ok(v)
    }

    fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
    where
        E: de::Error,
    {
        if v.is_empty() {
            Ok(0)
        } else {
            Err(E::invalid_value(Unexpected::Str(v), &self))
        }
    }
}

fn deserialize_u64_or_empty_string<'de, D>(deserializer: D) -> Result<u64, D::Error>
where
    D: Deserializer<'de>,
{
    deserializer.deserialize_any(DeserializeU64OrEmptyStringVisitor)
}

fn main() {
    let value = serde_json::from_str::<WebResponse>(
        r#"{
        "foo": [
            {
                "points": {
                    "x": "",
                    "y": "",
                    "name": ""
                }
            },
            {
                "points": {
                    "x": 78,
                    "y": 92,
                    "name": "bar"
                }
            }
        ]
    }"#,
    );
    println!("{:?}", value);
}

Cargo.toml:

[dependencies]
serde = "1.0.15"
serde_json = "1.0.4"
serde_derive = "1.0.15"
Bowles answered 17/6, 2016 at 0:33 Comment(0)

In str_or_u64, we use an untagged enum to represent either a string or a number. We can then deserialize the field into that enum and convert it to a number.

We annotate the two fields in Points using deserialize_with to tell it to use our special conversion:

use serde::{Deserialize, Deserializer}; // 1.0.124
use serde_json; // 1.0.64

#[derive(Debug, Deserialize)]
struct WebResponse {
    foo: Vec<Foo>,
}

#[derive(Debug, Deserialize)]
struct Foo {
    points: Points,
}

#[derive(Debug, Deserialize)]
struct Points {
    #[serde(deserialize_with = "str_or_u64")]
    x: u64,
    #[serde(deserialize_with = "str_or_u64")]
    y: u64,
    name: String,
}

fn str_or_u64<'de, D>(deserializer: D) -> Result<u64, D::Error>
where
    D: Deserializer<'de>,
{
    #[derive(Deserialize)]
    #[serde(untagged)]
    enum StrOrU64<'a> {
        Str(&'a str),
        U64(u64),
    }

    Ok(match StrOrU64::deserialize(deserializer)? {
        StrOrU64::Str(v) => v.parse().unwrap_or(0), // Ignoring parsing errors
        StrOrU64::U64(v) => v,
    })
}

fn main() {
    let input = r#"{
        "foo":[
            {
                "points":{
                    "x":"",
                    "y":"",
                    "name":""
                }
            },
            {
                "points":{
                    "x":78,
                    "y":92,
                    "name":"bar"
                }
            }
        ]
    }"#;

    dbg!(serde_json::from_str::<WebResponse>(input));
}

Sirreverence answered 6/4, 2021 at 0:32 Comment(0)

A note for future viewers, in case it is helpful: I have implemented the solution from the accepted answer and published it as a crate, serde-this-or-that.

I've added a Performance section to its documentation explaining that the custom Visitor approach suggested there should perform much better overall than a version with an untagged enum, although the untagged enum also works.

Here is a shortened implementation of the accepted solution above (it should give the same result):

use serde::Deserialize;
use serde_json::from_str;
use serde_this_or_that::as_u64;

#[derive(Clone, Debug, Deserialize)]
struct WebResponse {
    foo: Vec<Foo>,
}

#[derive(Clone, Debug, Deserialize)]
struct Foo {
    points: Points,
}

#[derive(Clone, Debug, Deserialize)]
struct Points {
    #[serde(deserialize_with = "as_u64")]
    x: u64,
    #[serde(deserialize_with = "as_u64")]
    y: u64,
    name: String,
}

fn main() {
    let value = from_str::<WebResponse>(
        r#"{
        "foo": [
            {
                "points": {
                    "x": "",
                    "y": "",
                    "name": ""
                }
            },
            {
                "points": {
                    "x": 78,
                    "y": 92,
                    "name": "bar"
                }
            }
        ]
    }"#,
    );
    println!("{:?}", value);
}

The Cargo.toml would look like:

[dependencies]
serde = { version = "1.0.136", features = ["derive"] }
serde_json = "1.0.79"
serde-this-or-that = "0.4"
Holocaine answered 17/4, 2022 at 17:36 Comment(0)
