How does the type deduction work in this Docopt example?

Take a look at this code using the docopt library:

use docopt::Docopt;
use serde::Deserialize; // requires serde's "derive" feature
use std::{error, result};

const USAGE: &'static str = "...something...";

#[derive(Deserialize)]
struct Args {
    flag: bool,
}

type Result<T> = result::Result<T, Box<dyn error::Error + Send + Sync>>;

fn main() {
    let mut args: Args = Docopt::new(USAGE)
        .and_then(|d| d.deserialize())
        .unwrap_or_else(|e| e.exit());
}

If you look at the expression to the right of the equals sign, you'll see that it doesn't mention the Args struct anywhere. How does the compiler deduce the return type of this expression? Can type information flow in the opposite direction (from the initialization target to the initializer expression) in Rust?

Longrange answered 26/2, 2019 at 11:47 Comment(0)

"How does it work?" might be too big of a question for Stack Overflow but (along with other languages like Scala and Haskell) Rust's type system is based on the Hindley-Milner type system, albeit with many modifications and extensions.

Simplifying greatly, the idea is to treat each unknown type as a variable, and define the relationships between types as a series of constraints, which can then be solved by an algorithm. In some ways it's similar to simultaneous equations you may have solved in algebra at school.
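
As a small sketch of that idea (the function name solve_unknowns is made up for illustration), the snippet below has two unknown types that are only pinned down by constraints appearing on later lines:

```rust
// Illustrative only: the function name is invented for this sketch.
fn solve_unknowns() -> (u16, Vec<u16>) {
    // Unknown #1, call it ?X: nothing on this line says which type `x` has.
    let x = Default::default();
    // Unknown #2: `v` is a Vec<?Y> for some not-yet-known ?Y.
    let mut v = Vec::new();
    // Constraint: ?Y == ?X, because `x` is pushed into `v`.
    v.push(x);
    // Constraint: ?X == u16. This settles both unknowns at once.
    let n: u16 = x;
    (n, v)
}

fn main() {
    let (n, v) = solve_unknowns();
    println!("{} {:?}", n, v);
}
```

Removing the u16 annotations (including the one in the return type) should leave the compiler with nothing to solve ?X against, and it would ask for a type annotation instead.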


Type inference is a feature of Rust (and other languages in the extended Hindley-Milner family) that is exploited pervasively in idiomatic code to:

  • reduce the noise of type annotations
  • improve maintainability by not hard-coding types in multiple places (DRY)

Rust's type inference is powerful and, as you say, can flow both ways. To use Vec<T> as a simpler and more familiar example, any of these are valid:

let vec = vec![1_i32];
let vec = Vec::<i32>::new();
let vec: Vec<i32> = Vec::new();

The type can even be inferred just from how a value is used later:

let mut vec = Vec::new();
// later...
vec.push(1_i32);

Another nice example is picking the correct string parser, based on the expected type:

let num: f32 = "100".parse().unwrap();
let num: i128 = "100".parse().unwrap();
let address: SocketAddr = "127.0.0.1:8080".parse().unwrap();
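
The expected type doesn't have to come from a let annotation. Here is a minimal sketch (all function names are invented for illustration) where parse picks its target type from a function's return type and from a parameter type:

```rust
use std::num::ParseIntError;

// The target type comes from the declared return type of this function.
fn from_return_type(text: &str) -> Result<u16, ParseIntError> {
    text.parse()
}

// Helper with a concrete parameter type.
fn double(n: u32) -> u32 {
    n * 2
}

// The target type comes from the parameter type of `double`.
fn from_argument(text: &str) -> u32 {
    double(text.parse().unwrap())
}

fn main() {
    println!("{:?}", from_return_type("8080"));
    println!("{}", from_argument("21"));
}
```

In both functions the call to parse is written identically; only the surrounding context differs.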

So what about your original example?

  1. Docopt::new returns a Result<Docopt, Error>, which will be Err(Error) if the usage string can't be parsed. At this point nothing is known about whether the actual command-line arguments are valid, only that the usage string is well formed.
  2. Next, and_then has the following signature:
    pub fn and_then<U, F>(self, op: F) -> Result<U, E> 
    where
        F: FnOnce(T) -> Result<U, E>,
    
    The variable self has type Result<T, E> where T is Docopt and E is Error, deduced from step 1. U is still unknown, even after you supply the closure |d| d.deserialize().
  3. But we know that T is Docopt, so deserialize is Docopt::deserialize, which has the signature:
    fn deserialize<'a, 'de: 'a, D>(&'a self) -> Result<D, Error> 
    where
        D: Deserialize<'de>
    
    The variable self has type Docopt. D is still unknown, but we know it is the same type as U from step 2.
  4. Result::unwrap_or_else has the signature:
    fn unwrap_or_else<F>(self, op: F) -> T 
    where
        F: FnOnce(E) -> T
    
    The variable self has type Result<T, Error>, the value returned by and_then. So this T is the same type as U from step 2 and D from step 3.
  5. We then assign to a variable of type Args, so T from the previous step is Args, which means that the D in step 3 (and U from step 2) is also Args.
  6. The compiler can now deduce that when you wrote deserialize you meant Docopt::deserialize with D = Args, which uses the Deserialize implementation for Args that was derived automatically with the #[derive(Deserialize)] attribute.
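
To try the same chain without pulling in docopt, here is a rough, self-contained sketch. Parser is a hypothetical stand-in for Docopt, FromStr stands in for Deserialize, and parse_args is an invented helper; none of this is the real library's API.

```rust
use std::str::FromStr;

// Hypothetical stand-in for `Docopt`; not the real library type.
struct Parser {
    raw: String,
}

#[derive(Debug, PartialEq)]
struct Args {
    flag: bool,
}

// Stand-in for the derived `Deserialize` impl.
impl FromStr for Args {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "--flag" => Ok(Args { flag: true }),
            "" => Ok(Args { flag: false }),
            other => Err(format!("unexpected argument: {}", other)),
        }
    }
}

impl Parser {
    // Step 1: the Ok type is fixed as `Parser`. This stub never fails.
    fn new(raw: &str) -> Result<Parser, String> {
        Ok(Parser { raw: raw.to_owned() })
    }

    // Step 3: `D` appears only in the return type, so the caller's
    // context has to decide what it is.
    fn deserialize<D>(&self) -> Result<D, String>
    where
        D: FromStr<Err = String>,
    {
        self.raw.parse()
    }
}

fn parse_args(raw: &str) -> Args {
    // Step 5: the `Args` annotation is the only place the type is named.
    let args: Args = Parser::new(raw)
        .and_then(|p| p.deserialize())
        .unwrap_or_else(|e| panic!("{}", e));
    args
}

fn main() {
    println!("{:?}", parse_args("--flag"));
}
```

As in the original code, the right-hand side never mentions Args; dropping the annotation on args (and the return type of parse_args) would leave D unconstrained.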
Exhibitive answered 26/2, 2019 at 12:15 Comment(0)
