Detecting specific function calls in unit tests

I would like to be able to detect, in my unit tests, whether my function (or any other function it calls) ends up calling some specific functions (for instance, malloc and free). Some small portions of my software have hard real-time requirements, and I'd like to ensure that no one adds something that would accidentally trigger an allocation in these functions (and have my CI pipeline check this automatically).

I know that I can just set a breakpoint in gdb, but ideally I'd like to do something like:

void my_unit_test() {
    my_object obj; // perform some initialization that will allocate

    START_CHECKING_FUNCTION(malloc); // also, operator new or std::allocate would be nice
    obj.perform_realtime_stuff();
    STOP_CHECKING_FUNCTION(malloc);
}

Ideally, the test would fail in a not-too-dirty way (e.g. not std::abort) if malloc is called at any point between the two checks.

Ideally, this would run on any system, but I can live with something that only works on Linux for now. Is this possible in some way? Maybe through an LD_PRELOAD hack that would replace malloc, but I'd rather not have to do this for every function I'm interested in.

Psychotic answered 31/12, 2017 at 17:56 Comment(14)
Some malloc implementations keep track of how many times the function has been called, as an aid to debugging. Can you make use of that? - Vincenty
I just can't see any direct relation between hard real-time requirements and dynamic memory allocation. - Agnew
@Agnew The requirement is to not do any dynamic memory allocation. - Vincenty
seleciii44: basically, on common "desktop" kernels, if you call malloc (or any other kind of system function) there's a good chance your thread will get pre-empted by the OS; even if it isn't, there's a chance that it will be blocked if another thread calls malloc or free at the same time. - Brinkema
1201ProgramAlarm: I looked into it a bit, and indeed it seems that something such as panthema.net/2013/malloc_count would work. - Brinkema
Sorry, doesn't sound like "hard realtime" if you're testing in unit tests whether you're calling malloc or not. You don't make "hard realtime" by accident. - Amandaamandi
> "The hard real-time definition considers any missed deadline to be a system failure." That's really my case. We won't be able to ship as long as there may be a single missed deadline on the reference hardware. There won't be deaths, but quite a bit of money may be lost. - Brinkema
Rather than doing unit tests, wouldn't it be better to (a) have a policy of not doing dynamic memory allocation (except during program startup) and (b) do an analysis of all source to ensure there is no usage of dynamic memory allocation, as a means of enforcing the policy? It is easier to circumvent such "don't do that" rules in unit tests, either accidentally or deliberately, since unit tests rely on all relevant paths of execution being followed (easier said than achieved). It is harder to circumvent an analysis of all source that detects the usage of functions or operators. - Palmitin
About (a), I need to have dynamic allocations BUT I don't want them to occur in a specific thread. E.g. thread A allocates memory in response to a user action, and sends a pointer to the newly allocated memory to thread B (the real-time thread) through a lock-free queue. About (b), the thing is: I am using "standard" C++ structures (mostly vector, flat_set / flat_map) and ensure that they have enough memory before I need to enter real-time mode. What I want is to prevent someone else from, say, doing some vector::push_back at some point that would trigger a reallocation. - Brinkema
And doing an analysis of all source is what I'm currently doing, by hand and with constant checks in heaptrack & valgrind, but I would like to automate this process instead, since it's relatively time-consuming. - Brinkema
If your application runs on a multithreaded system with pre-emption (or even just interrupts), any pre-emption might blow your real-time schedule, not just malloc calls. So this is hardly a complete check. - Lalita
Ira Baxter: what else could cause a pre-emption? (assuming the thread uses real-time scheduling and has the highest priority) - Brinkema
How about simply looking in the linker map file and checking whether malloc is in there? You absolutely don't need to check this at run-time. - Laundes
tofro: the problem is that I'm not banning malloc altogether, I am banning malloc in a specific call tree of my program. - Brinkema

Unit tests call functions that they test. You want to know if a function F called by a unit test can eventually invoke malloc (or new or ...). Seems like what you really want to do is build a call graph for your entire system, and then ask for the critical functions F whether F can reach malloc etc. in the call graph. This is rather easy to compute once you have the call graph.
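
To illustrate the "easy once you have the call graph" part, here is a minimal sketch of the reachability query. It assumes the call graph has already been extracted by some tool and is available as a map from caller name to direct callees; the graph contents and the names CallGraph, can_reach, example_graph and run_example are made up for illustration and do not come from any particular tool.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Assumed representation: caller name -> names of the functions it calls directly.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Breadth-first search: true if `target` can be reached from `start`
// by following one or more call edges.
bool can_reach(const CallGraph& graph, const std::string& start,
               const std::string& target) {
    std::set<std::string> visited{start};
    std::queue<std::string> pending;
    pending.push(start);
    while (!pending.empty()) {
        const std::string current = pending.front();
        pending.pop();
        const auto it = graph.find(current);
        if (it == graph.end()) continue;  // no known callees (leaf or external)
        for (const std::string& callee : it->second) {
            if (callee == target) return true;
            if (visited.insert(callee).second) pending.push(callee);
        }
    }
    return false;
}

// Hypothetical graph, invented for this example only.
CallGraph example_graph() {
    return {
        {"perform_realtime_stuff", {"update_state", "log_event"}},
        {"update_state", {"clamp"}},
        {"log_event", {"format_message"}},
        {"format_message", {"malloc"}},
        {"initialize", {"malloc"}},
    };
}

void run_example() {
    const CallGraph graph = example_graph();
    // log_event -> format_message -> malloc makes the real-time entry point unsafe.
    assert(can_reach(graph, "perform_realtime_stuff", "malloc"));
    // update_state only calls clamp, so it cannot reach malloc.
    assert(!can_reach(graph, "update_state", "malloc"));
}
```

The query itself is only as good as the graph: a missing edge (for instance an unresolved function pointer) silently turns into a false "cannot reach malloc".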

Getting the call graph is not so easy. Discovering that module A calls module B directly is "technically easy" if you have a real language front end that does name resolution. Finding out what A calls indirectly isn't easy; you need a (function pointer) points-to analysis and those are hard. And, of course, you have to decide whether you are going to dive into library (e.g., std::) routines or not.

Your call graph needs to be conservative (so you don't miss potential calls) and reasonably precise (so you don't drown in false positives) in the face of function pointers and method calls.

This Doxygen support claims to build call graphs: http://clang.llvm.org/doxygen/CallGraph_8cpp.html I don't know if it handles indirect/method calls or how precise it is; I'm not very familiar with it and the documentation seems thin. Doxygen in the past did not have a reputation for handling indirection well or for being precise, but past versions weren't based on Clang. There is some further discussion of this, applied on a small scale, at https://mcmap.net/q/195479/-how-to-generate-a-call-graph-for-c-code

Your question is tagged c/c++ but seems to be about C++. For C, our DMS Software Reengineering Toolkit with its generic flow analysis and call graph generation support, coupled with DMS's C Front End, has been used to analyze C systems of some 16 million lines/50,000 functions with indirect calls to produce conservatively correct call graphs.

We have not specifically tried to build C++ call graphs for large systems, but the same DMS generic flow analysis and call graph generation would be "technically straightforward" used with DMS's C++ Front End. When building a static analysis that operates correctly and at scale, nothing is trivial to do.

Lalita answered 1/1, 2018 at 8:48 Comment(0)

If you're using libraries which invoke malloc, then you might want to take a look at the Joint Strike Fighter C++ Coding Standards. It's a coding style aimed towards mission critical software. One suggestion would be to write your own allocator(s). Another suggestion is to use something like jemalloc which has statistics, but is much more unpredictable since it is geared towards performance.


What you want is a mocking library with spy capabilities. How this works is going to vary from framework to framework, but here is an example using Google Mock:

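#include <cstddef>
#include <cstdlib>
#include <functional>
#include <gmock/gmock.h>
#include <gtest/gtest.h>

// Bridge that forwards intercepted malloc calls to the currently active mock.
static std::function<void*(std::size_t)> malloc_bridge;

struct malloc_mock {
    malloc_mock() {
        malloc_bridge = [this](std::size_t size) { return mock_(size); };
    }
    ~malloc_mock() { malloc_bridge = nullptr; }
    MOCK_METHOD1(mock_, void*(std::size_t));
};

void* malloc_cheat(std::size_t size) {
    return malloc_bridge(size);
}

// Only code compiled after this point sees the replacement.
#define malloc malloc_cheat

constexpr std::size_t X = 64; // placeholder size, chosen only for illustration

struct fixture {
    void f() { malloc(X); }
};

struct CustomTest : ::testing::Test {
    malloc_mock mock_;
};

TEST_F(CustomTest, ShouldMallocXBytes) {
    EXPECT_CALL(mock_, mock_(X))
      .WillOnce(::testing::Return(static_cast<void*>(nullptr)));
    fixture fix;
    fix.f();
}

#undef malloc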

WARNING: Code hasn't been touched by compiler hands. But you get the idea.

Seema answered 31/12, 2017 at 21:21 Comment(2)
The problem is: I'm not directly calling malloc. But I'm using C++ structures such as std::vector, boost::flat_map / flat_set, etc. which at some point may call into libc malloc (sometimes even without malloc appearing anywhere in the source; e.g. new int[100] will call malloc behind the scenes, at least on Linux, but you can't catch it with a macro). - Brinkema
You could use something like jemalloc, which I believe does have statistics. - Seema
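
For allocations that go through new rather than through a literal malloc call, one option that needs no macro is to replace the global operator new, which the C++ standard explicitly allows. The following is only a sketch of that idea: the alloc_check namespace, NoAllocScope and count_push_back_allocations are hypothetical names invented here, and the guard only sees allocations routed through the plain (non-aligned) global operator new on the current thread. Direct malloc calls and custom allocators bypass it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical helper state; nothing here comes from an existing library.
namespace alloc_check {
thread_local bool active = false;         // true while a guard is alive on this thread
thread_local std::size_t violations = 0;  // allocations observed while active
}

// Replacement for the global operator new. The default operator new[] forwards
// here, so array forms such as new int[100] are counted too.
void* operator new(std::size_t size) {
    if (alloc_check::active) ++alloc_check::violations;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// RAII scope: counts allocations made on this thread while it is alive.
class NoAllocScope {
public:
    NoAllocScope() {
        alloc_check::violations = 0;
        alloc_check::active = true;
    }
    ~NoAllocScope() { alloc_check::active = false; }
    std::size_t violations() const { return alloc_check::violations; }
};

// Example use: how many allocations does one push_back trigger on `v`?
std::size_t count_push_back_allocations(std::vector<int>& v) {
    NoAllocScope scope;
    v.push_back(42);
    return scope.violations();
}
```

A unit test can then assert that the count is zero after the real-time call, which fails the test through the normal assertion mechanism instead of aborting inside the allocator. A vector with reserved capacity should report no allocation, while an empty one should report at least one.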

This is not a full answer, but you can try using Valgrind to count allocs and frees. Valgrind's default tool, memcheck, counts the number of allocs and frees and prints the result in the HEAP SUMMARY section. Here is some sample output:

$ valgrind ./a.out
==2653== Memcheck, a memory error detector
==2653== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==2653== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==2653== Command: ./a.out
==2653== 
==2653== 
==2653== HEAP SUMMARY:
==2653==     in use at exit: 0 bytes in 0 blocks
==2653==   total heap usage: 2 allocs, 2 frees, 72,716 bytes allocated
==2653== 
==2653== All heap blocks were freed -- no leaks are possible
==2653== 
==2653== For counts of detected and suppressed errors, rerun with: -v
==2653== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

You can add one more test that does nothing beyond the setup, to get a baseline number of allocations:

void my_unit_test_baseline() {
    my_object obj; // perform some initialization that will allocate
}

Now you can run the real test and compare its number of allocations with that of the baseline test. If they are not equal, then some allocations took place in the code under test. You can log this fact or signal it in whatever other way you want.

Arjuna answered 5/1, 2018 at 19:1 Comment(0)

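If you are using the GNU C library, you can use __malloc_hook and the related hook variables (__free_hook, __realloc_hook, __memalign_hook) to have a user-defined function called whenever one of the functions of the malloc family is used. Be aware that these hooks were deprecated and have been removed from the glibc API as of version 2.34, so this only applies to older glibc releases.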

Such a hooked function could analyse the call trace (using backtrace()) in order to find whether malloc was allowed in this call chain or not and print messages about the culprit if not.

Laundes answered 13/1, 2018 at 20:49 Comment(0)
