As I said in my comment, as far as the YAML deserializer is concerned, the text "5_000_000" is always a string, never an integer, so you need to tell serde that this field needs special treatment. Either create a FooSerialized struct as described in the comment (which would duplicate a lot of definitions), or use the deserialize_with attribute to customize the field's deserialization:

    use serde::{de, Deserialize};

    #[derive(Deserialize, Debug)]
    pub struct Foo {
        #[serde(deserialize_with = "deserialize_underscored_integer")]
        pub instrument: u64,
        pub other_field: String,
    }

    fn deserialize_underscored_integer<'de, D, T>(deserializer: D) -> Result<T, D::Error>
    where
        D: de::Deserializer<'de>,
        T: std::str::FromStr,
    {
        // First, deserialize the value as a string (which might fail...)
        let s: String = de::Deserialize::deserialize(deserializer)?;
        // Next, filter out the underscores (and reject invalid chars while we
        // are at it), collect the remaining chars into a new string, parse
        // that as an integer and return the result
        s.chars()
            .filter_map(|c| match c {
                c @ '0'..='9' => Some(Ok(c)),
                '_' => None,
                _ => Some(Err(de::Error::custom("invalid char in string"))),
            })
            .collect::<Result<String, _>>()?
            .parse()
            .map_err(|_: <T as std::str::FromStr>::Err| {
                de::Error::custom("string does not represent an integer")
            })
    }

    fn main() {
        // This will succeed
        let inp = r#"
            - instrument: 1_2_34_567_8_9____0
              other_field: this string contains an _
            - instrument: 5_000_000
              other_field: this string contains an _
        "#;
        println!("{:?}", serde_yaml::from_str::<Vec<Foo>>(inp).unwrap());

        // This will fail because it's not an integer
        let inp = r#"
            - instrument: 5000 abcdef
              other_field: this string contains some other stuff
        "#;
        println!("{:?}", serde_yaml::from_str::<Vec<Foo>>(inp).unwrap_err());

        // This looks like an integer but does not fit into a u64
        let inp = r#"
            - instrument: 5_000_000_000_000_000_000_000
              other_field: this string is too large to be a u64
        "#;
        println!("{:?}", serde_yaml::from_str::<Vec<Foo>>(inp).unwrap_err());
    }

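For comparison, the other option (a separate FooSerialized struct plus a TryFrom conversion) could look roughly like the sketch below. This is only an illustration under some assumptions: the field layout of FooSerialized is guessed from Foo, and the serde derive is left out so that the snippet compiles with the standard library alone. In a real program FooSerialized would derive Deserialize.

```rust
use std::convert::TryFrom;
use std::num::ParseIntError;

// Raw form, as it would come out of the YAML file. Assumption: in real use
// this struct carries #[derive(serde::Deserialize)]; the derive is omitted
// here to keep the sketch free of external crates.
#[derive(Debug)]
pub struct FooSerialized {
    pub instrument: String,
    pub other_field: String,
}

// Validated form, with `instrument` as a real integer.
#[derive(Debug, PartialEq)]
pub struct Foo {
    pub instrument: u64,
    pub other_field: String,
}

impl TryFrom<FooSerialized> for Foo {
    type Error = ParseIntError;

    fn try_from(raw: FooSerialized) -> Result<Self, Self::Error> {
        // Strip the underscores, then let u64's own parser validate the rest.
        let instrument = raw.instrument.replace('_', "").parse::<u64>()?;
        Ok(Foo {
            instrument,
            other_field: raw.other_field,
        })
    }
}

fn main() {
    let good = FooSerialized {
        instrument: "5_000_000".to_string(),
        other_field: "this string contains an _".to_string(),
    };
    println!("{:?}", Foo::try_from(good));

    // Not an integer, so the conversion returns a ParseIntError.
    let bad = FooSerialized {
        instrument: "5000 abcdef".to_string(),
        other_field: "this string contains some other stuff".to_string(),
    };
    println!("{:?}", Foo::try_from(bad));
}
```

With serde in the picture, the container attribute #[serde(try_from = "FooSerialized")] on Foo should let serde run this conversion automatically. The downside mentioned above still applies: every field has to be declared twice. Note also that this sketch is slightly more lenient than the deserialize_with version, because u64's parser accepts a leading "+".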
5_000_000 is and will always be a string, as it can't be an integer. Since you probably don't want to fork a custom YAML deserializer, the approach should probably be to have a FooSerialized struct with instrument defined as a String/&str, and an implementation of TryFrom into Foo (where instrument is u64), where conversion may result in a std::num::ParseIntError in case the string can't be converted to an integer after stripping the underscores. – Standee

[…] deserialize_with to specify a custom function for deserializing instrument and other similar fields. – Neolithic

[…] Deserializer. Something similar to here: github.com/serde-rs/json/issues/833#issuecomment-981989078 – Gareri