Enforce strict ordering when deserializing JSON with serde

I want to deserialize a string of JSON data into a struct with multiple fields and return an error if the ordering of the serialized data does not match the order of the fields in the struct.

I have read through the serde documentation, including the section on custom serialization, but cannot find a solution. I imagine it might be possible to enforce strict ordering by implementing Deserializer with field name checks but I'm not entirely sure about this.

An example following the format of the serde_json docs:

#[derive(Serialize, Deserialize)]
struct Person {
    name: String,
    age: u8,
    phones: Vec<String>,
}

let correct_order = r#"
    {
        "name": "John Doe",
        "age": 43,
        "phones": [
            "+44 1234567",
            "+44 2345678"
        ]
    }"#;

// this deserializes correctly (no error)
let p: Person = serde_json::from_str(correct_order)?;

let incorrect_order = r#"
    {
        "age": 43,
        "phones": [
            "+44 1234567",
            "+44 2345678"
        ],
        "name": "John Doe"
    }"#;

// how to ensure this returns an error? (data fields out of order)
let p2: Person = serde_json::from_str(incorrect_order)?;
Penchant answered 3/5, 2021 at 8:50 Comment(4)
You can't; JSON is an unordered data format. Why do you want to do that? The only way is to parse every field you want ordered into an ordered hash map (lib.rs/crates/indexmap), but you will lose a lot of what makes serde nice. - Fluff
Your expected behaviour means it's not JSON anymore, only something similar. However, for serialization there's the preserve_order crate feature, which keeps the struct order while creating the string as one possible ordering of the many allowed. - Accidie
Thanks for your comments. You're right, it's not strictly JSON with this ordering requirement. The reason for requiring this is that the object is cryptographically signed, so order becomes important. I was hoping someone else may have solved this ;) I will take a look at indexmap and preserve_order. Thanks again. - Penchant
If you want to ensure the payload matches a signature, then do that. I'm not sure what benefit you hope to get from serde by trying to enforce strict deserialization formatting. You'll have to validate it separately anyway, since "age": 43 and "age": 45 would both be correctly formatted, but one of them would be wrong. - Ickes

You can do this by providing a custom Deserialize implementation.

For JSON objects, struct deserialization goes through the visitor method Visitor::visit_map(). Normally (for example, when you use #[derive(Deserialize)]), the visitor accepts struct fields in whatever order they appear in the input. We simply have to write the visitor so that it only accepts the fields in the strict order we expect.

use serde::{
    de,
    de::{Deserialize, Deserializer, MapAccess, Visitor},
};
use std::fmt;

#[derive(Debug)]
struct Person {
    name: String,
    age: u8,
    phones: Vec<String>,
}

impl<'de> Deserialize<'de> for Person {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        // Some boilerplate logic for deserializing the fields.
        enum Field {
            Name,
            Age,
            Phones,
        }

        impl<'de> Deserialize<'de> for Field {
            fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
            where
                D: Deserializer<'de>,
            {
                struct FieldVisitor;

                impl<'de> Visitor<'de> for FieldVisitor {
                    type Value = Field;

                    fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                        formatter.write_str("name, age, or phones")
                    }

                    fn visit_str<E>(self, v: &str) -> Result<Self::Value, E>
                    where
                        E: de::Error,
                    {
                        match v {
                            "name" => Ok(Field::Name),
                            "age" => Ok(Field::Age),
                            "phones" => Ok(Field::Phones),
                            _ => Err(E::unknown_field(v, FIELDS)),
                        }
                    }
                }

                deserializer.deserialize_identifier(FieldVisitor)
            }
        }

        // Logic for actually deserializing the struct itself.
        struct PersonVisitor;

        impl<'de> Visitor<'de> for PersonVisitor {
            type Value = Person;

            fn expecting(&self, formatter: &mut fmt::Formatter) -> fmt::Result {
                formatter.write_str("struct Person with fields in order of name, age, and phones")
            }

            fn visit_map<A>(self, mut map: A) -> Result<Self::Value, A::Error>
            where
                A: MapAccess<'de>,
            {
                // Deserialize name.
                let name = match map.next_key()? {
                    Some(Field::Name) => Ok(map.next_value()?),
                    Some(_) => Err(de::Error::missing_field("name")),
                    None => Err(de::Error::invalid_length(0, &self)),
                }?;

                // Deserialize age.
                let age = match map.next_key()? {
                    Some(Field::Age) => Ok(map.next_value()?),
                    Some(_) => Err(de::Error::missing_field("age")),
                    None => Err(de::Error::invalid_length(1, &self)),
                }?;

                // Deserialize phones.
                let phones = match map.next_key()? {
                    Some(Field::Phones) => Ok(map.next_value()?),
                    Some(_) => Err(de::Error::missing_field("phones")),
                    None => Err(de::Error::invalid_length(2, &self)),
                }?;

                Ok(Person { name, age, phones })
            }
        }

        const FIELDS: &[&str] = &["name", "age", "phones"];
        deserializer.deserialize_struct("Person", FIELDS, PersonVisitor)
    }
}

There's a lot of boilerplate here (that is normally hidden behind #[derive(Deserialize)]):

  • First we define an internal enum Field to deserialize the struct field names, with its own Deserialize implementation. This is the standard implementation; we just write it out by hand here.
  • Then we define a PersonVisitor to actually provide our Visitor trait implementation. This part is where we actually enforce the ordering of the fields.
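
Stripped of the serde machinery, the check in visit_map amounts to walking the incoming keys in lockstep with the expected field list. Here is a minimal standalone sketch of that logic only, using just the standard library. The names check_key_order and OrderError are hypothetical (they are not part of serde), the keys are assumed to have been extracted from the input already, and the check for leftover keys is an addition that the visitor above does not perform explicitly.

```rust
/// Hypothetical error type for this sketch (not a serde type).
#[derive(Debug, PartialEq)]
enum OrderError {
    /// The key at `position` was not the field expected there.
    Unexpected {
        position: usize,
        expected: &'static str,
        found: String,
    },
    /// The input ended after `seen` keys, before all fields were seen.
    TooShort { seen: usize },
    /// The input had a key left over after all expected fields.
    TooLong { extra: String },
}

/// Walks `keys` and `expected` in lockstep, failing at the first mismatch.
fn check_key_order<'a, I>(keys: I, expected: &[&'static str]) -> Result<(), OrderError>
where
    I: IntoIterator<Item = &'a str>,
{
    let mut keys = keys.into_iter();
    for (position, &want) in expected.iter().enumerate() {
        match keys.next() {
            // Key matches the field expected at this position.
            Some(k) if k == want => {}
            // Some other key showed up where `want` should be.
            Some(k) => {
                return Err(OrderError::Unexpected {
                    position,
                    expected: want,
                    found: k.to_string(),
                })
            }
            // Input ran out early.
            None => return Err(OrderError::TooShort { seen: position }),
        }
    }
    // Anything left over is an extra key.
    match keys.next() {
        Some(extra) => Err(OrderError::TooLong {
            extra: extra.to_string(),
        }),
        None => Ok(()),
    }
}

fn main() {
    const FIELDS: &[&str] = &["name", "age", "phones"];
    println!("{:?}", check_key_order(["name", "age", "phones"], FIELDS));
    println!("{:?}", check_key_order(["age", "phones", "name"], FIELDS));
}
```

The first call returns Ok(()). The second fails at position 0, because "age" arrives where "name" is expected, which mirrors the first match arm in visit_map rejecting the out-of-order input.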

You can see that this now works as expected. The following code:

fn main() {
    let correct_order = r#"
        {
            "name": "John Doe",
            "age": 43,
            "phones": [
                "+44 1234567",
                "+44 2345678"
            ]
        }"#;

    // this deserializes correctly (no error)
    let p: serde_json::Result<Person> = serde_json::from_str(correct_order);
    dbg!(p);

    let incorrect_order = r#"
        {
            "age": 43,
            "phones": [
                "+44 1234567",
                "+44 2345678"
            ],
            "name": "John Doe"
        }"#;

    // this now returns an error (data fields out of order)
    let p2: serde_json::Result<Person> = serde_json::from_str(incorrect_order);
    dbg!(p2);
}

prints this output:

[src/main.rs:114] p = Ok(
    Person {
        name: "John Doe",
        age: 43,
        phones: [
            "+44 1234567",
            "+44 2345678",
        ],
    },
)
[src/main.rs:128] p2 = Err(
    Error("missing field `name`", line: 3, column: 17),
)
Ligan answered 7/4, 2023 at 0:18 Comment(0)
