Can we get the source code location of the caller in a procedural macro attribute?
Asked Answered
C

2

4

I have requirement to get the source location of the caller of every method. I am trying to create a proc_macro_attribute to capture the location and print it.

#[proc_macro_attribute]
pub fn get_location(attr: TokenStream, item: TokenStream) -> TokenStream {
    // Get and print file!(), line!() of source
    // Should print line no. 11
    item
}
#[get_location]
fn add(x: u32, y: u32) -> u32 {
    x + y
}

fn main() {
    add(1, 5); // Line No. 11
}
Cumber answered 15/3, 2020 at 10:56 Comment(1)
C
2

Ready to use solutions are available (see @timotree 's comment). If you want to do this yourself, have more flexibility or learn, you can write a procedural macro that will parse a backtrace (obtained from inside the function that is called) and print the information that you need. Here is a procedural macro inside a lib.rs:

extern crate proc_macro;
use proc_macro::{TokenStream, TokenTree};

#[proc_macro_attribute]
pub fn get_location(_attr: TokenStream, item: TokenStream) -> TokenStream {

    // prefix code to be added to the function's body
    let mut prefix: TokenStream = "
        // find earliest symbol in source file using backtrace
        let ps = Backtrace::new().frames().iter()
            .flat_map(BacktraceFrame::symbols)
            .skip_while(|s| s.filename()
                .map(|p|!p.ends_with(file!())).unwrap_or(true))
            .nth(1 as usize).unwrap();

        println!(\"Called from {:?} at line {:?}\",
            ps.filename().unwrap(), ps.lineno().unwrap());
    ".parse().unwrap(); // parse string into TokenStream

    item.into_iter().map(|tt| { // edit input TokenStream
        match tt { 
            TokenTree::Group(ref g) // match the function's body
                if g.delimiter() == proc_macro::Delimiter::Brace => { 

                    prefix.extend(g.stream()); // add parsed string

                    TokenTree::Group(proc_macro::Group::new(
                        proc_macro::Delimiter::Brace, prefix.clone()))
            },
            other => other, // else just forward TokenTree
        }
    }).collect()
} 

The backtrace is parsed to find the earliest symbol inside the source file (retrieved using file!(), another macro). The code we need to add to the function is defined in a string, that is then parsed as a TokenStream and added at the beginning of the function's body. We could have added this logic at the end, but then returning a value without a semicolon wouldn't work anymore. You can then use the procedural macro in your main.rs as follow:

extern crate backtrace;
use backtrace::{Backtrace, BacktraceFrame};
use mylib::get_location;

#[get_location]
fn add(x: u32, y: u32) -> u32 { x + y }

fn main() { 
    add(1, 41);
    add(41, 1);
}

The output is:

> Called from "src/main.rs" at line 10
> Called from "src/main.rs" at line 11

Don't forget to specify that your lib crate is providing procedural macros by adding these two lines to your Cargo.toml:

[lib]
proc-macro = true
Cane answered 18/3, 2020 at 0:11 Comment(3)
Thanks Victor. I should have asked different question actually. I didn't find a way to modify function in tokenstream, which I got form your example. Thanks again.Cumber
My pleasure. Do you need more explanations as to how the function is modified ? You can also ask another question if neededCane
Will do. I got enough information from your example. Thanks again VictorCumber
L
6

TL;DR

Here is a procedural macro that uses syn and quote to do what you've described:

// print_caller_location/src/lib.rs

use proc_macro::TokenStream;
use quote::quote;
use syn::spanned::Spanned;

// Create a procedural attribute macro
//
// Notably, this must be placed alone in its own crate
#[proc_macro_attribute]
pub fn print_caller_location(_attr: TokenStream, item: TokenStream) -> TokenStream {
    // Parse the passed item as a function
    let func = syn::parse_macro_input!(item as syn::ItemFn);

    // Break the function down into its parts
    let syn::ItemFn {
        attrs,
        vis,
        sig,
        block,
    } = func;

    // Ensure that it isn't an `async fn`
    if let Some(async_token) = sig.asyncness {
        // Error out if so
        let error = syn::Error::new(
            async_token.span(),
            "async functions do not support caller tracking functionality
    help: consider returning `impl Future` instead",
        );

        return TokenStream::from(error.to_compile_error());
    }

    // Wrap body in a closure only if function doesn't already have #[track_caller]
    let block = if attrs.iter().any(|attr| attr.path.is_ident("track_caller")) {
        quote! { #block }
    } else {
        quote! {
            (move || #block)()
        }
    };

    // Extract function name for prettier output
    let name = format!("{}", sig.ident);

    // Generate the output, adding `#[track_caller]` as well as a `println!`
    let output = quote! {
        #[track_caller]
        #(#attrs)*
        #vis #sig {
            println!(
                "entering `fn {}`: called from `{}`",
                #name,
                ::core::panic::Location::caller()
            );
            #block
        }
    };

    // Convert the output from a `proc_macro2::TokenStream` to a `proc_macro::TokenStream`
    TokenStream::from(output)
}

Make sure to put it in its on crate and add these lines to its Cargo.toml:

# print_caller_location/Cargo.toml

[lib]
proc-macro = true

[dependencies]
syn = {version = "1.0.16", features = ["full"]}
quote = "1.0.3"
proc-macro2 = "1.0.9"

In-depth explanation

A macro can only expand to code that's possible to write by hand to begin with. Knowing this, I see two questions here:

  1. How can I write a function that tracks the location of its caller?
  2. How can I write a procedural macro that creates such functions?

Initial attempt

We want a procedural macro that

  • takes a function,
  • marks it #[track_caller],
  • and adds a line that prints Location::caller.

For example, it would transform a function like this:

fn foo() {
    // body of foo
}

into

#[track_caller]
fn foo() {
    println!("{}", std::panic::Location::caller());
    // body of foo
}

Below, I present a procedural macro that executes that transformation exactly — although, as you'll see in later versions, you probably want something different. To try this code, like before in the TL;DR section, put it into its own crate and add its dependencies to the Cargo.toml.

// print_caller_location/src/lib.rs

use proc_macro::TokenStream;
use quote::quote;

// Create a procedural attribute macro
//
// Notably, this must be placed alone in its own crate
#[proc_macro_attribute]
pub fn print_caller_location(_attr: TokenStream, item: TokenStream) -> TokenStream {
    // Parse the passed item as a function
    let func = syn::parse_macro_input!(item as syn::ItemFn);

    // Break the function down into its parts
    let syn::ItemFn {
        attrs,
        vis,
        sig,
        block,
    } = func;

    // Extract function name for prettier output
    let name = format!("{}", sig.ident);

    // Generate the output, adding `#[track_caller]` as well as a `println!`
    let output = quote! {
        #[track_caller]
        #(#attrs)*
        #vis #sig {
            println!(
                "entering `fn {}`: called from `{}`",
                #name,
                ::core::panic::Location::caller()
            );
            #block
        }
    };

    // Convert the output from a `proc_macro2::TokenStream` to a `proc_macro::TokenStream`
    TokenStream::from(output)
}

Example usage:

// example1/src/main.rs

#![feature(track_caller)]

#[print_caller_location::print_caller_location]
fn add(x: u32, y: u32) -> u32 {
    x + y
}

fn main() {
    add(1, 5); // entering `fn add`: called from `example1/src/main.rs:11:5`
    add(1, 5); // entering `fn add`: called from `example1/src/main.rs:12:5`
}

Unfortunately, we won't be able to get away with that simple version. There are at least two problems with that version:

  • How it composes with async fns:

    • Instead of printing the caller location, it prints the location in which our macro (#[print_caller_location]) is invoked. For example:

    // example2/src/main.rs
    
    #![feature(track_caller)]
    
    #[print_caller_location::print_caller_location]
    async fn foo() {}
    
    fn main() {
        let future = foo();
        // ^ oops! prints nothing
        futures::executor::block_on(future);
        // ^ oops! prints "entering `fn foo`: called from `example2/src/main.rs:5:1`"
        let future = foo();
        // ^ oops! prints nothing
        futures::executor::block_on(future);
        // ^ oops! prints "entering `fn foo`: called from `example2/src/main.rs:5:1`"
    }
    
  • How it works with other invocations of itself, or generally, of #[track_caller]:

    • Nested functions with #[print_caller_location] will print the location of the root caller, rather than the direct caller of a given function. For example:

    // example3/src/main.rs
    
    #![feature(track_caller)]
    
    #[print_caller_location::print_caller_location]
    fn add(x: u32, y: u32) -> u32 {
        x + y
    }
    
    #[print_caller_location::print_caller_location]
    fn add_outer(x: u32, y: u32) -> u32 {
        add(x, y)
        // ^ we would expect "entering `fn add`: called from `example3/src/main.rs:12:5`"
    }
    
    fn main() {
        add(1, 5);
        // ^ "entering `fn add`: called from `example3/src/main.rs:17:5`"
        add(1, 5);
        // ^ "entering `fn add`: called from `example3/src/main.rs:19:5`"
        add_outer(1, 5);
        // ^ "entering `fn add_outer`: called from `example3/src/main.rs:21:5`"
        // ^ oops! "entering `fn add`: called from `example3/src/main.rs:21:5`"
        //
        // In reality, `add` was called on line 12, from within the body of `add_outer`
        add_outer(1, 5);
        // ^ "entering `fn add_outer`: called from `example3/src/main.rs:26:5`"
        // oops! ^ entering `fn add`: called from `example3/src/main.rs:26:5`
        //
        // In reality, `add` was called on line 12, from within the body of `add_outer`
    }
    

Addressing async fns

It is possible to work around the problem with async fns using -> impl Future, for example, if we wanted our async fn counter-example to work correctly, we could instead write:

// example4/src/main.rs

#![feature(track_caller)]

use std::future::Future;

#[print_caller_location::print_caller_location]
fn foo() -> impl Future<Output = ()> {
    async move {
        // body of foo
    }
}

fn main() {
    let future = foo();
    // ^ prints "entering `fn foo`: called from `example4/src/main.rs:15:18`"
    futures::executor::block_on(future);
    // ^ prints nothing
    let future = foo();
    // ^ prints "entering `fn foo`: called from `example4/src/main.rs:19:18`"
    futures::executor::block_on(future);
    // ^ prints nothing
}

We could add a special case that applies this transformation to our macro. However, that transformation changes the public API of the function from async fn foo() to fn foo() -> impl Future<Output = ()> in addition to affecting the auto traits that the returned future can have.

Therefore I recommend that we allow users to use that workaround if they desire, and simply emit an error if our macro is used on an async fn. We can do this by adding these lines to our macro code:

// Ensure that it isn't an `async fn`
if let Some(async_token) = sig.asyncness {
    // Error out if so
    let error = syn::Error::new(
        async_token.span(),
        "async functions do not support caller tracking functionality
    help: consider returning `impl Future` instead",
    );

    return TokenStream::from(error.to_compile_error());
}

Fixing nested behavior of #[print_caller_location] functions

The problematic behaviour minimizes down to this fact: When a #[track_caller] function, foo, directly calls into another #[track_caller] function, bar, Location::caller will give both of them access to foo's caller. In other words, Location::caller gives access to the root caller in the case of nested #[track_caller] functions:

#![feature(track_caller)]

fn main() {
    foo(); // prints `src/main.rs:4:5` instead of the line number in `foo`
}

#[track_caller]
fn foo() {
   bar();
}

#[track_caller]
fn bar() {
    println!("{}", std::panic::Location::caller());
}

playground link

To remedy this, we need to break the chain of #[track_caller] calls. We can break the chain by hiding the nested call to bar within a closure:

#![feature(track_caller)]

fn main() {
    foo();
}

#[track_caller]
fn foo() {
    (move || {
        bar(); // prints `src/main.rs:10:9`
    })()
}

#[track_caller]
fn bar() {
    println!("{}", std::panic::Location::caller());
}

playground link

Now that we know how to break the chain of #[track_caller] functions, we can address this problem. We just need to make sure that if the user actually marks their function with #[track_caller] on purpose, we refrain from inserting the closure and breaking the chain.

We can add these lines to our solution:

// Wrap body in a closure only if function doesn't already have #[track_caller]
let block = if attrs.iter().any(|attr| attr.path.is_ident("track_caller")) {
    quote! { #block }
} else {
    quote! {
        (move || #block)()
    }
};

Final solution

After those two changes, we've ended up with this code:

// print_caller_location/src/lib.rs

use proc_macro::TokenStream;
use quote::quote;
use syn::spanned::Spanned;

// Create a procedural attribute macro
//
// Notably, this must be placed alone in its own crate
#[proc_macro_attribute]
pub fn print_caller_location(_attr: TokenStream, item: TokenStream) -> TokenStream {
    // Parse the passed item as a function
    let func = syn::parse_macro_input!(item as syn::ItemFn);

    // Break the function down into its parts
    let syn::ItemFn {
        attrs,
        vis,
        sig,
        block,
    } = func;

    // Ensure that it isn't an `async fn`
    if let Some(async_token) = sig.asyncness {
        // Error out if so
        let error = syn::Error::new(
            async_token.span(),
            "async functions do not support caller tracking functionality
    help: consider returning `impl Future` instead",
        );

        return TokenStream::from(error.to_compile_error());
    }

    // Wrap body in a closure only if function doesn't already have #[track_caller]
    let block = if attrs.iter().any(|attr| attr.path.is_ident("track_caller")) {
        quote! { #block }
    } else {
        quote! {
            (move || #block)()
        }
    };

    // Extract function name for prettier output
    let name = format!("{}", sig.ident);

    // Generate the output, adding `#[track_caller]` as well as a `println!`
    let output = quote! {
        #[track_caller]
        #(#attrs)*
        #vis #sig {
            println!(
                "entering `fn {}`: called from `{}`",
                #name,
                ::core::panic::Location::caller()
            );
            #block
        }
    };

    // Convert the output from a `proc_macro2::TokenStream` to a `proc_macro::TokenStream`
    TokenStream::from(output)
}
Lowbred answered 18/3, 2020 at 1:20 Comment(1)
This is superb. Thanks.Cumber
C
2

Ready to use solutions are available (see @timotree 's comment). If you want to do this yourself, have more flexibility or learn, you can write a procedural macro that will parse a backtrace (obtained from inside the function that is called) and print the information that you need. Here is a procedural macro inside a lib.rs:

extern crate proc_macro;
use proc_macro::{TokenStream, TokenTree};

#[proc_macro_attribute]
pub fn get_location(_attr: TokenStream, item: TokenStream) -> TokenStream {

    // prefix code to be added to the function's body
    let mut prefix: TokenStream = "
        // find earliest symbol in source file using backtrace
        let ps = Backtrace::new().frames().iter()
            .flat_map(BacktraceFrame::symbols)
            .skip_while(|s| s.filename()
                .map(|p|!p.ends_with(file!())).unwrap_or(true))
            .nth(1 as usize).unwrap();

        println!(\"Called from {:?} at line {:?}\",
            ps.filename().unwrap(), ps.lineno().unwrap());
    ".parse().unwrap(); // parse string into TokenStream

    item.into_iter().map(|tt| { // edit input TokenStream
        match tt { 
            TokenTree::Group(ref g) // match the function's body
                if g.delimiter() == proc_macro::Delimiter::Brace => { 

                    prefix.extend(g.stream()); // add parsed string

                    TokenTree::Group(proc_macro::Group::new(
                        proc_macro::Delimiter::Brace, prefix.clone()))
            },
            other => other, // else just forward TokenTree
        }
    }).collect()
} 

The backtrace is parsed to find the earliest symbol inside the source file (retrieved using file!(), another macro). The code we need to add to the function is defined in a string, that is then parsed as a TokenStream and added at the beginning of the function's body. We could have added this logic at the end, but then returning a value without a semicolon wouldn't work anymore. You can then use the procedural macro in your main.rs as follow:

extern crate backtrace;
use backtrace::{Backtrace, BacktraceFrame};
use mylib::get_location;

#[get_location]
fn add(x: u32, y: u32) -> u32 { x + y }

fn main() { 
    add(1, 41);
    add(41, 1);
}

The output is:

> Called from "src/main.rs" at line 10
> Called from "src/main.rs" at line 11

Don't forget to specify that your lib crate is providing procedural macros by adding these two lines to your Cargo.toml:

[lib]
proc-macro = true
Cane answered 18/3, 2020 at 0:11 Comment(3)
Thanks Victor. I should have asked different question actually. I didn't find a way to modify function in tokenstream, which I got form your example. Thanks again.Cumber
My pleasure. Do you need more explanations as to how the function is modified ? You can also ask another question if neededCane
Will do. I got enough information from your example. Thanks again VictorCumber

© 2022 - 2024 — McMap. All rights reserved.