Returning a pointer to a static buffer

Asked 1/7, 2022 at 15:6 Answered 4/7, 2022 at 6:13

In C, on a small embedded system, is there any reason not to do the following?

const char * filter_something(const char * original, const int max_length)
{
    static char buffer[BUFFER_SIZE];
    // Checking inputs for safety omitted

    // Copy input to buffer here with appropriate filtering, etc.
    return buffer;
}

This is essentially a utility function the source is flash memory which may be corrupted, we do a kind of "safe copy" to make sure we have a null terminated string. I chose to use a static buffer and make it available read only to the caller.

A colleague is telling me that I am somehow not respecting the scope of the buffer by doing this, to me it makes perfect sense for the use case we have.

Is there a reason not to do this?

Many thanks to all who responded. You have generally confirmed my ideas on this, which I am grateful for. I was looking for major reasons not to do this, I don't think that there are any. To clarify a few points:

rentrancy/thread safety is not a concern. It is a small (bare metal) embedded system with a single run loop. This code will not be called from ISRs, ever.
in this system we are not short on memory, but we do want very predictable behavior. For this reason I prefer declaring an object like this statically, even though it might be a little "wasteful". We have already had issues with large objects declared carelessly on the stack, which caused intermittent crashes (now fixed, but it took a while to diagnose). So in general, I am preferring static allocation, simply to have very predictability, reliability, and less potential issues downstream.

So basically it's a case of taking a certain approach for a specific system design.

Smilax answered 1/7, 2022 at 15:6 Comment(13)

Be careful: If your embedded system uses an OS with multiple threads/processes using shared memory, then calls to filter_something from different threads/processes will all access the one single buffer. It might be better to pass the buffer as an argument to the function, and let the caller handle the memory and buffering. – Splenetic 1/7, 2022 at 15:21

Umm... RAM memory is far more easily corrupted than flash (although flash is more certain to eventually get corrupted because of data retention). The proper way of dealing with that is hardware ECC or CRC. – Auraaural 4/7, 2022 at 6:2

@Auraaural ordinarily yes. But in this case the FLASH is being written to by an error handler. In some cases we see that the write of the error diagnostics info (a combination of stack dump and null terminated strings) is not completely written (probably due to some kind of power related issue). The block does have a CRC. – Smilax 4/7, 2022 at 7:1

@Someprogrammerdude - it's a simple embedded system with a single thread, and this code is never called from interrupts. We don't care about re-entrancy here. But indeed, this might be a concern in other situations. – Smilax 4/7, 2022 at 7:2

@Smilax Sounds like you should trouble-shoot that issue then, rather than implementing some work-around. A hardware solution that might be sufficient would be to provide a bulk cap for the MCU supply large enough to keep it supplied during the flash erase+program cycle. Software solutions might entail keeping duplicate mirror segments of the flash data, then verify both at start-up, optionally with a repair of the broken one from the non-broken. – Auraaural 4/7, 2022 at 8:2

@lundin of course we are doing so. But it is in the nature of a system that reports errors on failure that such a condition might arise. That's why we have CRC's and other measures. So we want to preserve whatever info we can if the write was incomplete. There are also severe hardware constraints that I cannot go into that mean a large decoupling cap or similar is not a possibility. We have to work with what we have. – Smilax 4/7, 2022 at 8:9

@lundin I don't think mirroring would help. If you didn't have time to write (say) 250 bytes, trying to write 500 is not going to make things better. – Smilax 4/7, 2022 at 8:9

@Smilax Mirroring means you'll always have one intact segment at all times. You don't write to both of them at the same time, but one at a time, and verify that it's correct. This is common practice in mission-critical applications. – Auraaural 4/7, 2022 at 9:31

@lundin if the current issue seems to be that writing a single block is not completed, how would it help? I'm aware of the idea of duplicating data (or otherwise avoiding a single point of failure) for reliability, but I don't see the benefit here. – Smilax 4/7, 2022 at 9:43

@Smilax Because if you do, then you can safely resume operation despite flash corruption. If you don't, then your program is toast and non-recoverable. – Auraaural 4/7, 2022 at 9:49

@lundin we don't resume anything. It's an error handler, we attempt to write diagnostics to FLASH on entry to the handler. If it didn't write completely it's most likely because the power went off or a hard reset occurred. Trying to do it twice will not help. – Smilax 4/7, 2022 at 10:5

Does this answer your question? Is returning a pointer to a static local variable safe? – Legrand 22/7, 2022 at 21:48

that wasn't quite the question - I know that it is safe, at least in a single threaded application< and so long as the user understands the way it works. My question was more whether there are other reasons not to do this - the answer is that there are pros and cons, and yes that was well explained. – Smilax 25/7, 2022 at 8:2

Pro

The behavior is well defined; the static buffer exists for the duration of the program and may be used by the program after filter_something returns.

Cons

Returning a static buffer is prone to error because people writing calls to the routines may neglect or be unaware that a static buffer is returned. This can lead to attempts to use multiple instances of the buffer from multiple calls to the function (in the same thread or different threads). Clear documentation is essential.
The static buffer exists for the duration of the program, so it occupies space at times when it may not be needed.

Pellikka answered 1/7, 2022 at 16:0 Comment(0)

It really depends on how filter_something is used. Take the following as an example

#include <stdio.h>
#include <string.h>

const char* filter(const char* original, const int max_length)
{
    static char buffer[1024];

    memset(buffer, 0, sizeof(buffer));
    memcpy(buffer, original, max_length);
    return buffer;
}

int main()
{
    const char *strone, *strtwo;
    char deepone[16], deeptwo[16];

    /* Case 1 */
    printf("%s\n", filter("everybody", 10));

    /* Case 2 */
    printf("%s %s %s\n", filter("nobody", 7), filter("somebody", 9), filter("anybody", 8));

    /* Case 2 */
    if (strcmp(filter("same",5), filter("different", 10)) == 0)
        printf("Strings same\n");
    else
        printf("Strings different\n");

    /* Case 3 - Both of these end up with the same pointer */
    strone = filter("same",5);
    strtwo = filter("different", 10);
    if (strcmp(strone, strtwo) == 0)
        printf("Strings same\n");
    else
        printf("Strings different\n");

    /* Case 4 - You need a deep copy if you wish to compare */
    strcpy(deepone, filter("same", 5));
    strcpy(deeptwo, filter("different", 10));
    if (strcmp(deepone, deeptwo) == 0)
        printf("Strings same\n");
    else
        printf("Strings different\n");
}

The output when gcc is used is

everybody
nobody nobody nobody
Strings same
Strings same
Strings different.

When filter is used by itself, it behaves quite well.
When it is used multiple times in an expression, ~~the behaviour is undefined~~ there is no telling what it will do. All instances will use the contents the last time the filter was executed. This depends on the order in which the execution was performed.
If an instance is taken, the contents of the instance will not stay the same as when the instance was taken. This is also a common problem when C++ coders switch to C# or Java.
If a deep copy of the instance is taken, then the contents of the instance when the instance was taken will be preserved.

In C++, this technique is often used when returning objects with the same consequences.

Bardo answered 1/7, 2022 at 15:37 Comment(2)

When it is used multiple times in an expression, the behaviour is undefined. In this case the order of evaluation of the arguments is merely unspecified. As currently written, there is no undefined behavior. – Torgerson 1/7, 2022 at 16:42

@AndrewHenle - I have reworded the comment. – Bardo 2/7, 2022 at 16:27

It is true that the identifier buffer only has scope local to the block in which it is declared. However, because it is declared static, its lifetime is that of the full program.

So returning a pointer to a static variable is valid. In fact, many standard functions do this such as strtok and ctime.

The one thing you need to watch for is that such a function is not reentrant. For example, if you do something like this:

printf("filter 1: %s, filter 2: %s\n", 
       filter_something("abc", 3), filter_something("xyz", 3));

The two function calls can occur in any order, and both return the same pointer, so you'll get the same result printed twice (i.e. the result of whatever call happens to occur last) instead of two different results.

Also, if such a function is called from two different threads, you end up with a race condition with the threads reading/writing the same place.

Theisen answered 1/7, 2022 at 15:15 Comment(4)

'returning a pointer to a static variable is valid and perfectly safe' no. For example, it is well-known that strtok() is not thread-safe:( – Ronn 1/7, 2022 at 15:56

@MartinJames It's not just not thread-safe, but also not re-entrant. However, I wouldn't still call it unsafe, any more than I'd call, say, memcpy unsafe (in the context of C language), even though it does not support overlapping buffers or handle NULL pointers or detect buffer overflows. – Piling 1/7, 2022 at 16:13

@MartinJames Good point, that wording was a bit sloppy. Removed and clarified the potential issues. – Theisen 1/7, 2022 at 16:23

Yeah, I thought about the recursion issue - that is more likely to be an issue with parsers and the like than interruption and reentrancy. Overall, I don't care if functions have such issues as long as they are clearly documented. I've seen first-hand what happens when they are not, and intermittent near-untraceable failures occur under real loads on customer sites. It's not a happy place:) – Ronn 1/7, 2022 at 18:2

Just to add to the previous answers, I think the problem, in a more abstract sense, is to make the filtering result broader in scope than it ought to be. You introduce a 'state' which seems useless, at least if the caller's intention is only to get a filtered string. In this case, it should be the caller who should create the array, likely on the stack, and pass it as a parameter to the filtering method. It is the introduction of this state that makes possible all the problems referred to in the preceding responses.

Gabionade answered 2/7, 2022 at 14:13 Comment(1)

this was suggested also by my colleague. However it is a small embedded system. We are not particularly short on memory, but we do want very predictable memory usage. For this reason I prefer not to declare such buffers on the stack (although this one is not particularly huge and likely would be OK). We've had previous issues with sloppy code which declared large buffers on stack, which causes intermittent crashes which were hard to fix. So (in this system) I prefer static allocation, because behaviour is so predictable. – Smilax 4/7, 2022 at 7:7

From a program design, it's frowned upon to return pointers to private data, in case that data was made private for a reason. That being said, it's less bad design to return a pointer to a local static then it is to use spaghetti programming with "globals" (external linkage). Particularly when the pointer returned is const qualified.

One general issue with staticvariables, that may or may not be a problem regardless of embedded or hosted system is re-entrancy. If the code needs to be interrupt/thread safe, then you need to implement means to achieve that.

The obvious alternative to it all is caller allocation and you've got to ask yourself why that's not an option:

void filter_something (size_t size, char dest[size], const char original[size]);

(Or if you will, [restrict size] on both pointers for a mini-optimization.)

Auraaural answered 4/7, 2022 at 6:13 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Pro

Cons

Recommended topics

Hot tags