Parsing an integer with nom always results in Incomplete
Everything I try gives me Incomplete(Size(1)). My best guess right now is:

named!(my_u64(&str) -> u64,
    map_res!(recognize!(nom::digit), u64::from_str)
);

Test:

#[cfg(test)]
mod test {
    #[test]
    fn my_u64() {
        assert_eq!(Ok(("", 0)), super::my_u64("0"));
    }
}

Sometimes in my variations (e.g. adding complete!) I've been able to get it to parse if I add a character onto the end.

I'd like to get a working parser for this (ultimately, I hope it will let me create a parser for a u64 wrapper type), but in the bigger picture I'd like to understand how to build a parser properly myself.

Beedon answered 10/7, 2018 at 3:54 Comment(0)

Nom 4 made the handling of partial data much stricter than in previous versions, to better support streaming parsers and custom input types.

Effectively, if the parser runs out of input and it can't tell that it's meant to have run out of input, it'll always return Err::Incomplete. This may also contain information on exactly how much more input the parser was expecting (in your case, at least 1 more byte).

It determines whether there's potentially any more input using the AtEof trait. This always returns false for &str and &[u8], as they don't provide any information about whether they're complete or not!

The trick is to change the input type of your parsers to make it explicit that the input will always be complete - Nom provides the CompleteStr and CompleteByteSlice wrappers for this purpose, or you can implement your own input type.

So in order for your parser to work as expected, it'd need to look something like this:

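use nom::types::CompleteStr;
use std::str::FromStr;
named!(my_u64(CompleteStr) -> u64,
    // recognize! yields a CompleteStr, so unwrap it (.0) to get the &str that from_str expects
    map_res!(recognize!(nom::digit), |s: CompleteStr| u64::from_str(s.0))
);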

And your test would look something like this:

#[cfg(test)]
mod test {
    use nom::types::CompleteStr;
    #[test]
    fn my_u64() {
        assert_eq!(Ok((CompleteStr(""), 0)), super::my_u64(CompleteStr("0")));
    }
}

See the announcement post for Nom 4 for more details.
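
The behaviour described above can be sketched without nom at all. What follows is only a minimal illustration, not nom's actual implementation: the `ParserInput` trait, the `Complete` wrapper and the `Outcome` enum are hypothetical stand-ins for `AtEof`, `CompleteStr` and `IResult`. It tries to show why a plain `&str` containing only digits comes back as incomplete, while the same text wrapped in a "complete" type parses.

```rust
/// Simplified stand-in for nom's result type (hypothetical, for illustration only).
#[derive(Debug, PartialEq)]
enum Outcome<'a> {
    /// Parsing succeeded: `digits` were consumed, `rest` is what remains.
    Done { rest: &'a str, digits: &'a str },
    /// Input ended and the parser cannot tell whether more digits may follow.
    Incomplete(usize),
    /// The input does not start with a digit.
    Error,
}

/// Hypothetical trait playing the role of nom 4's `AtEof`.
trait ParserInput<'a> {
    fn text(&self) -> &'a str;
    fn at_eof(&self) -> bool;
}

/// A bare `&str` says nothing about whether more data is coming.
impl<'a> ParserInput<'a> for &'a str {
    fn text(&self) -> &'a str { *self }
    fn at_eof(&self) -> bool { false }
}

/// Wrapper that promises the input is complete (the idea behind `CompleteStr`).
struct Complete<'a>(&'a str);

impl<'a> ParserInput<'a> for Complete<'a> {
    fn text(&self) -> &'a str { self.0 }
    fn at_eof(&self) -> bool { true }
}

/// Consume one or more ASCII digits from the front of the input.
fn digits<'a, I: ParserInput<'a>>(input: I) -> Outcome<'a> {
    let text = input.text();
    match text.find(|c: char| !c.is_ascii_digit()) {
        // The very first character is not a digit.
        Some(0) => Outcome::Error,
        // A non-digit proves the number has ended.
        Some(end) => Outcome::Done { rest: &text[end..], digits: &text[..end] },
        // Ran out of input, and the input type cannot rule out more data.
        None if !input.at_eof() => Outcome::Incomplete(1),
        // Complete but empty input: there is nothing to parse.
        None if text.is_empty() => Outcome::Error,
        // Complete input made only of digits: the end of input ends the number.
        None => Outcome::Done { rest: "", digits: text },
    }
}
```

In this sketch `digits("0")` gives `Outcome::Incomplete(1)`, `digits("0;")` succeeds because the `;` shows the number has ended (which matches the observation that adding a character onto the end makes the parse work), and `digits(Complete("0"))` succeeds because the input type itself says nothing more is coming.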

Lacrimator answered 10/7, 2018 at 12:49 Comment(9)
An aside - I'm not on a machine with a Rust compiler right now, so if the code examples are wrong, let me know and I'll fix them later :p - Lacrimator
This is great, thank you. This is a really good explanation, and I feel like I've progressed from throwing stuff at the wall to see what sticks to actually having a handle on the behavior now. - Beedon
@spease: No problem, I think writing this answer was the first time it fully clicked for me too :D - Lacrimator
I get this error: expected &str, found struct nom::types::CompleteStr. Any ideas why? - Trierarch
@Trierarch a) Do your parsers have the correct signature? b) Are you wrapping your input strings in CompleteStr before trying to pass them in? - Lacrimator
@Trierarch Never mind, saw your separate question got an answer :) - Lacrimator
Just to be sure... Is recognize taking the digits from the CompleteStr and returning a &str directly? The example in the other question needed to retrieve the inner value from CompleteStr explicitly. - Thornberry
You might be right (I didn't have nom installed when I wrote this answer) - will try to verify later. - Lacrimator
I think those types are gone from the nom 5.1.1 documentation; the answer might require an update. - Kight

As of nom 5.1.1, the approach to combining parsers has changed from macro-based to function-based, which is discussed in more detail on the nom author's blog.

Another change came with it: streaming and complete parsers now live in different modules, and you need to choose explicitly which kind of parsing you want. In most cases the module name makes the distinction clear.

Old macros are preserved, but they work strictly in streaming mode. Types like CompleteStr or CompleteByteSlice are gone.

To write the code you asked for the new way, you could do something like the code below (note the explicit character::complete in the imports).

It took me some time to grasp this: combinators such as map_res return an impl Fn(I) -> IResult<I, O2, E>, which is why there is an additional pair of parentheses, to call that closure.

use std::str;
use nom::{
    IResult,
    character::complete::{
        digit1
    },
    combinator::{
        recognize,
        map_res
    }
};

fn my_u64(input: &str) -> IResult<&str, u64> {
    map_res(recognize(digit1), str::parse)(input)
}

#[cfg(test)]
mod test {
    use super::*;
    #[test]
    fn test_my_u64() {
        let input = "42";
        let num = my_u64(input);
        assert_eq!(Ok(("", 42u64)), num);
    }
}
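
To make the extra pair of parentheses less mysterious, here is a rough, standard-library-only sketch of what a combinator like map_res does. It is not nom's code: `PResult` and the simplified `digit1` and `map_res` below are hypothetical stand-ins that use plain `String` errors instead of nom's error types. The point is only that calling the combinator builds a closure, and calling that closure runs the parser.

```rust
/// Simplified stand-in for `IResult`: Ok((remaining input, output)) or an error message.
type PResult<'a, O> = Result<(&'a str, O), String>;

/// Hypothetical "complete" digit parser: the end of the input simply ends the number.
fn digit1<'a>(input: &'a str) -> PResult<'a, &'a str> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        Err(format!("expected a digit at {:?}", input))
    } else {
        Ok((&input[end..], &input[..end]))
    }
}

/// Simplified `map_res`: takes a parser and a fallible conversion,
/// and returns a new parser (a closure) that applies both in turn.
fn map_res<'a, O1, O2, E, P, F>(parser: P, f: F) -> impl Fn(&'a str) -> PResult<'a, O2>
where
    P: Fn(&'a str) -> PResult<'a, O1>,
    F: Fn(O1) -> Result<O2, E>,
    E: std::fmt::Debug,
{
    move |input: &'a str| {
        parser(input).and_then(|(rest, out)| {
            f(out)
                .map(|value| (rest, value))
                .map_err(|e| format!("conversion failed: {:?}", e))
        })
    }
}

fn my_u64(input: &str) -> PResult<'_, u64> {
    // Step 1: build the parser. Nothing has been parsed yet.
    let parser = map_res(digit1, str::parse);
    // Step 2: call it. This is the "additional pair of parentheses".
    parser(input)
}
```

Written on one line, the body of `my_u64` would be `map_res(digit1, str::parse)(input)`, which has the same shape as the nom version above.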
Kight answered 27/4, 2020 at 21:7 Comment(0)
