Is there a way to create a hash of a function wrapped by `std::function<>`?

I have a C++ function that takes a std::function as an input argument.
Specifically, a std::function<void (const Message&, Error)>.

In my use-case, the caller may bind the std::function to either a free function or a member function.

(I'm not experienced with std::bind or std::function, so I found it noteworthy that the same object type, std::function<void (const Message&, Error)>, can hold a free function as well as a member function, the latter by way of std::bind. This seemed to abstract away the difference between a function pointer and a member function pointer, or at least it gave me that impression.)

For debugging, it would be useful to log a hash, or anything unique, associated with the std::function input argument.
Here's where I quickly realized I can't escape that fundamental difference between free function pointers and member function pointers.
I can get the underlying void (*)(const Message&, Error) free function pointer using std::function::target<void (*)(const Message&, Error)>(), which serves my needs as a unique hash.
But that doesn't work if the std::function<void (const Message&, Error)> is bound to a member function.
I reasoned that if the std::function<void (const Message&, Error)> were bound to a member function of class Foo, then std::function::target<void (Foo::*)(const Message&, Error)>() would return a pointer to the stored member function pointer, but it returns a null pointer instead.

Which leads to my question: is there any way to generically get a unique hash from a std::function instance, regardless of whether it's bound to a free function or a member function?

#include <functional>
#include <iostream>

using namespace std;

struct Message {
  int i_;
};

struct Error {
  char c_;
};

class Foo {
public:
  void print(const Message& m, Error e) {
    cout << "member func: " << m.i_ << " " << e.c_ << endl;
  }
};

void print(const Message& m, Error e) {
  cout << "free func: " << m.i_ << " " << e.c_ << endl;
}

void doWork(function<void (const Message&, Error)> f) {
  // I can invoke f regardless of whether it's been bound to a free function or
  // a member function...
  {
    Message m{42};
    Error e{'x'};

    f(m, e);
  }

  // ...but since I don't know whether f is bound to a free function or a member
  // function, I can't use std::function::target<>() to generically get a
  // function pointer, whose (void*) value would have served my need for a
  // hash...
  {
    typedef void (*Fptr)(const Message&, Error);
    typedef void (Foo::*Mfptr)(const Message&, Error);

    Fptr* fptr = f.target<Fptr>();
    Mfptr* mfptr = nullptr;

    cout << "free func target: " << (void*)fptr << endl;

    if (fptr) {
      cout << "free func hash: " << (void*)*fptr << endl;
    }
    else {
      // ...moreover, when f is bound to a Foo member function (using
      // std::bind), std::function::target<>() doesn't return a Foo member
      // function pointer either...I can't reason why not.
      // (this also isn't scalable because in future, f may be bound to a 
      // class Bar or class Baz member function)
      mfptr = f.target<Mfptr>();
      cout << "not a free function; checking for a Foo member function" << endl;
      cout << "member func target: " << (void*)mfptr << endl;

      if (mfptr) {
        cout << "member func hash: " << (void*)*mfptr << endl;
      }
    }
  }
}

int main()
{
  {
    function<void (const Message&, Error)> f = print;

    doWork(f);
  }

  cout << "---" << endl;

  {
    Foo foo;
    function<void (const Message&, Error)> f = bind(&Foo::print,
                                                    &foo,
                                                    placeholders::_1,
                                                    placeholders::_2);

    doWork(f);
  }

  return 0;
}

Compilation and output:

$ g++ --version && g++ -g ./main.cpp && ./a.out
g++ (Debian 8.3.0-6) 8.3.0
Copyright (C) 2018 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

free func: 42 x
free func target: 0x7ffda4547bf0
free func hash: 0x55db499c51e5
---
member func: 42 x
free func target: 0
not a free function; checking for a Foo member function
member func target: 0
Gobble answered 26/4, 2023 at 20:10 Comment(6)
How about std::function::target_type().name()? – Sotos
Possibly related: #73744746 – Underhill
Each function is just a pointer to an address in memory. Can't you just use that address? Ex: size_t addr = (size_t)my_func;. If two functions are equal, they have the same address. – Dwelt
@Sotos -- looks promising; I'll experiment with this some more, and if you are inclined to expand your comment into an answer, I'd be happy to upvote/accept. – Gobble
@GabrielStaples -- that's exactly what I figured, and was trying to do. The issue is getting that address. I am not allowed to change the signature of doWork(), so I can only work with whatever std::function provides. – Gobble
If you know exactly how the bind expression was created, you may be able to obtain its type via decltype(std::bind(...)), and target<...> will then point to the stored instance. The problem is that the instance is likely stored inside the std::function itself, so the address returned by target will differ for each std::function instance, and the bind object offers no suitable interface for inspecting its internals. – Penton

The following code:

#include <functional>
#include <iostream>
#include <vector>
#include <string>
#include <cstdint>

int f(int a) { return -a; }
int f2(int a) { return a; }

int main() {
    std::vector<std::function<int(int)>> fn{
        f,
        f,
        f2,
        f2,
        [](int a) {return -a;},
        [](int a) {return -a;},
        [](int a) {return -a;},
    };

    for (auto&& a : fn) {
        const auto t = a.target<int(*)(int)>();
        const auto hash = t ?
            (size_t)(uintptr_t)(void*)*t :
            a.target_type().hash_code();
        std::cout << hash << '\n';
    }
}

This initializes a vector with two copies of f, two copies of f2, and three lambdas. Because each lambda expression has its own distinct type, we expect two equal hashes for f, two equal hashes for f2, and three different hashes for the lambdas. The code outputs:

4198918
4198918
4198932
4198932
11513669940284151167
7180698749978361212
13008242069459866308
Sotos answered 26/4, 2023 at 20:33 Comment(8)
This results in the same hash if the lambda/functor comes from some factory, e.g. for auto factory(int x){ return [x](int y){ return x + y;}; } we have factory(10) and factory(20) sharing the type. – Luscious
It may not matter for the OP, nor does it invalidate the answer, but I thought it should be said as a footnote. – Luscious
Is there any advantage to hashing the target_type().name()s (name-mangled lambda function names) with std::hash<std::string_view>{}(a.target_type().name()) rather than just obtaining the hash_code directly via target_type().hash_code()? – Dwelt
Why cast the function address, *t, to void* and then to uintptr_t and then to size_t via (size_t)(uintptr_t)(void*)*t instead of just casting straight to size_t via (size_t)*t? I've always just cast pointers directly to size_t with no intermediate casts in-between. – Dwelt
Note: both hash_code and the return value of calling the std::hash<>::operator() callable operator are std::size_t, so I recommend removing the const auto hash = usage of auto and saying const std::size_t hash = instead. auto obfuscates the type of the hash variable in an undesirable way otherwise. – Dwelt
"Why cast the function address, *t": It is undefined behavior to convert a function pointer to an integer; converting to a pointer is conditionally supported: eel.is/c++draft/expr#reinterpret.cast-8 . Converting a pointer to an integer type that is not large enough is undefined behavior: port70.net/~nsz/c/c11/n1570.html#6.3.2.3p6 and eel.is/c++draft/expr#reinterpret.cast-4 . If size_t were not large enough to store the result of the pointer->integer conversion, the behavior would be undefined. The proper way is void *, then uintptr_t, which is guaranteed to be large enough. – Sotos
But, I will agree, this is all moot. Because the conversions themselves are implementation defined, we might as well expect compilers to just support (size_t)*t in an implementation defined manner instead of doing standard C++ shenanigans to avoid standard C++ undefined behavior. We all know only 3 C++ compilers matter. On the other hand, you never know when the writers of the optimizer part of the compiler will kick in, decide that this code is undefined behavior and optimize it all out. So it is what it is. – Sotos
Which 3 C++ compilers? Microsoft Visual C++, GNU gcc/g++, and LLVM clang? That's my guess. Then again, I think Apple/Mac has their own compiler too, and many microcontrollers have some other compilers too, frequently with only partial language implementations I think. – Dwelt