How to determine maximum stack usage?

Asked 23/12, 2008 at 15:50 Answered 7/8, 2013 at 19:12

Solved memory embedded stack code-analysis

What methods are available for determining the optimum stack size for embedded/memory constrained system? If it's too big then memory is wasted that could be used elsewhere. However, if it is too small then we get this website's namesake...

To try to jump start things: Jack Ganssle states in The Art of Designing Embedded Systems that, "With experience, one learns the standard, scientific way to compute the proper size for a stack: Pick a size at random and hope." Can anyone do better than that?

A more specific example was requested. So, how about a C program targeting an MSP430 MCU with 2 kB of RAM using the IAR Embedded Workbench toolchain without an operating system? This IDE can display the stack contents and usage while using a JTAG debugger.

Slavic answered 23/12, 2008 at 15:50 Comment(5)

depends on the chipset / OS / programming language you use. – Levis 23/12, 2008 at 15:57

Glad to see this question has some answers, unlike #178016 – Nomadize 23/12, 2008 at 17:50

I saw that question when asking, but thought the embedded tilt separated them... – Slavic 23/12, 2008 at 18:1

Jack Ganssle says more than just that. I think that was just his intro one-liner. From his book, 2nd edition, p. 250: "Since few programmers have a reasonable way to determine maximum stack requirements, always assume your estimates will be incorrect. For each stack in the system, make sure the initialization code fills the entire amount of memory allocated to the stack with the value 0x55. Later, when debugging, you can view the stack and detect stack overflows by seeing no blocks of 0x55 in that region..." – Dickenson 18/11, 2009 at 23:51

I know that's not all he says, but I tried to used in the same way he did: to provoke interest in the topic. ;) – Slavic 19/11, 2009 at 14:39

The most common way to determine the deepest stack usage is to initialize the stack memory with some known but unusual value, then periodically (or at the end of a big test run) see where that pattern stops.

This is exactly how the IAR IDE determines the amount of stack used.

Amphictyon answered 23/12, 2008 at 16:21 Comment(6)

True, it fills the stack with 0xCD. – Slavic 23/12, 2008 at 16:23

I've used this technique before as well. The trick is to paint the stack in pre-main (or in main from a point just after your current stack frame) to your limit and then count backwards from the stack limit to the where painted memory is spoiled (not 0xCD) – Lively 23/12, 2008 at 18:56

How do you actually do that, though? I ask an equivalent question about a regular PC at #1741388. – Biogeography 16/11, 2009 at 15:26

@Biogeography - For a Windows program this isn't quite as workable; stack memory is managed by the OS so you don't really have a good opportunity to paint the stack. This solution is more tailored for embedded systems where the stack for a task is often managed by the application. I'd have to think about how to do this safely on Windows. – Amphictyon 16/11, 2009 at 16:5

@Michael - Interesting method followed by IAR IDE and in embedded systems to determine the amount of stack used. Good that you pointed out that in windows the stack memory is managed by OS and hence there is no good chance to paint/color the stack ! +1. – Matchmaker 26/6, 2010 at 10:14

@Michael Doesn't Visual C++ "paint the stack" for you as well? – Slavic 17/6, 2011 at 1:23

You tagged your question with static-analysis, but this is a problem that is difficult to solve through static-analysis. The stack usage depends on the program's runtime profile, especially, if you're using recursion or alloca. Given that this is an embedded platform I guess it's also difficult to run something like ps or top and see how much stack your application is using.

An interesting approach is to use the address of the current stack frame in order to determine how much stack is used. You can do this by taking the address of a function's argument or local variable. Do that for the main function and for functions you think are using the most stack. The difference will tell you the amount of stack your application requires. Here is an example (assuming customary high-to-low stack growth).

char *stack_top, stack_bottom;

int
main(int argc, char *argv[])
{
    stack_top = (char *)&argc;
    // ...
    printf("Stack usage: %d\n", stack_top - stack_bottom);
}

void
deeply_nested_function(void)
{
    int a;
    stack_bottom = (char *)&a;
    // ...
}

If your compiler allows you to specify a custom function prologue (many do it to allow graph-based program profiling), you can even arrange for all functions to call such measuring code. Then your measurement function mecomes something like

void
stack_measurement_function(void)
{
    int a;
    stack_bottom = min(stack_bottom, (char *)&a);
    // ...
}

I used an approach similar to what I've described to generate these charts.

Disintegrate answered 23/12, 2008 at 16:13 Comment(2)

So, runtime testing is the answer in a nut shell? Pick a large stack and then reduce it after determining a ceiling? – Slavic 23/12, 2008 at 16:49

Since you say in your answer "this is a problem that is difficult to solve through static-analysis" I am mentioning absint.com/stackanalyzer – Triceps 19/1, 2011 at 12:58

With a good source code static analysis tool, you can produce a call graph for your application. Given that, and estimates of the amount of locals/temporaries produced by your compiler, you can straightforwardly compute a conservative estimate of the stack demand.

What I mean by "good" analysis tool is one that can read all the compilation units involved, can determine direct function calls, can determine indirect pointers, in the compilaiton unit, can compute a conservative points-to analysis across the entire system, can construct a call graph taking into account the points-to analysis. This eliminates a lot of tools, which is why one sees ad hoc methods such as "fill the stack at runtime and see what happens". You also need estimates of the stack demands the compiler places on the stack; you can approximate a lot of this by simply knowing how big the storage demands of all the types are, which is generally fairly easy to determine for embedded systems C programs. Finally, you need to believe that you applicaiton doesn't have recursive calls, or that the tool has an idea of the deepest recursion (probably by you telling it).

The DMS Software Reengineering Toolkit meets all of these requirements for C programs. See http://www.semanticdesigns.com/Products/DMS/DMSToolkit.html You still have to configure it to compute the stack demand by crawling the call graph and using the various size estimates.

If you want a fast answer, use the stack-fill trick. If you want an answer that you can recompute after each source code change you'll need the static analysis approach.

Ticklish answered 14/7, 2009 at 4:41 Comment(0)

I am working on this problem right now - i.e. analytical computation of stacksize. It is clearly going to be a highly recursive piece of code, because a function call could have an indexed array as one or more of its arguments and one or more of the array indices could involve function calls!!

However, a couple of realizations allow some relief from the complexity:

(1) When using a high-level language compiler, the stackpointer at the end of execution of each statement/line-of-code should be at the same place as it was at the start. (At least this would be a good rule to observe otherwise you are going to have problems!)

(2)The stackpointer after returning from each function or subroutine call should be the same as it was pre-call. Therefore the maximum stack size is the maximum, over all statements in the program, of the peak stacksize reached in each statement. (At least this would be a good rule to observe otherwise you are going to have problems!)

Of course a statement can include the recursive issues I mentioned above, but at least the problem of finding the maximum stacksize requirement of a whole program then boils down to finding the maximum stacksize requirement of each statement, and then picking the maximum of those.

This cannot be completed until all functions called have also been compiled. So I generate a file for each module compiled that records a number of stacksizes for each statement (basically the peak value prior to each function call and the value immmediately prior to each function call (excluding any as yet unknown additions to stacksize caused by the function call), and the function names involved. Then I retrospectively process these files using a recursive routine, once all functions have been compiled, to determine peak stacksize.

The fortunate thing is that, recursive routines apart, the maximum possible stacksize requirement does not then depend on program flow, although in a typical flow (which is data dependent) this maximum possible stacksize may never be reached.

Example: Suppose function 1 calls function 2 and the program flow of both depends on a data value X. Suppose there is a range of X which causes function 1 to execute its worst statement, which includes a call to function 2, which does not execute its worst case statement for that same range of X. Since we computed the maximum possible stacksize by using the worst cases for both function 1 and function 2 simultaneously, we may have overestimated the stacksize. At least we erred on the safe side.

I like to give interrupt routines their own stackspace on an OS stack if they need any, so they do not add to program stack requirements, apart from the return-from-interrupt address

Mcrae answered 7/8, 2013 at 19:12 Comment(0)

-13

Don't use recursion or recursive algorithms ever. (beware regex libraries)
Don't use arrays, always use malloc().
Don't use alloca(), some compilers even have bugs in this function.

Then examine by hand the portion of code, looking for where stack usage is likely the highest (remember I said no arrays)

Check the stack usage at the suspected high point in the code itself, log to debugger interface.
Place a cap based upon estimated stack usage with that cap in-place. e.g. limit server connections.

Sepal answered 27/12, 2011 at 22:58 Comment(3)

This question is tagged embedded. So your advice on never to "use arrays, always use malloc()" is especially out of place, but on any hardware I would say that was bad advice as a general rule. Why on earth would you always only allocate from the heap? – Humanist 19/6, 2012 at 1:0

It's hard to imagine why you haven't removed this answer. – Argot 21/3, 2016 at 18:42

Don't give awful, prescriptive advice with zero rationale. – Larvicide 7/5, 2018 at 9:58

Recommended topics

Hot tags