Hidden features of C


56

141

I know there is a standard behind all C compiler implementations, so there should be no hidden features. Despite that, I am sure all C developers have hidden/secret tricks they use all the time.

Uncaused answered 25/9, 2008 at 9:2 Comment(1)

It'd be great if you/someone were to edit the “question” to indicate the pick of the best hidden features, such as in the C# and Perl versions of this question. – Papiamento 26/5, 2010 at 13:19


62

Function pointers. You can use a table of function pointers to implement, e.g., fast indirect-threaded code interpreters (FORTH) or byte-code dispatchers, or to simulate OO-like virtual methods.
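As a rough illustration of the byte-code dispatcher idea, here is a minimal sketch of a table of function pointers indexed by opcode. The opcode names, the `struct vm` layout, and `run()` are all hypothetical and invented for this example; stack bounds checks are omitted for brevity.

```c
#include <stddef.h>

/* Hypothetical opcodes for a tiny stack machine (names are illustrative). */
enum opcode { OP_PUSH1, OP_ADD, OP_DOUBLE, OP_COUNT };

struct vm {
    int stack[16];
    size_t sp;      /* number of values currently on the stack */
};

typedef void (*op_handler)(struct vm *);

/* Handlers assume the stack holds enough operands; no bounds checks here. */
static void op_push1(struct vm *m)  { m->stack[m->sp++] = 1; }
static void op_add(struct vm *m)    { m->sp--; m->stack[m->sp - 1] += m->stack[m->sp]; }
static void op_double(struct vm *m) { m->stack[m->sp - 1] *= 2; }

/* The dispatch table: the opcode value is the index of its handler. */
static const op_handler handlers[OP_COUNT] = {
    [OP_PUSH1]  = op_push1,
    [OP_ADD]    = op_add,
    [OP_DOUBLE] = op_double,
};

/* Runs n opcodes and returns the value left on top of the stack (0 if empty). */
int run(const enum opcode *code, size_t n)
{
    struct vm m = { .sp = 0 };
    for (size_t i = 0; i < n; i++)
        handlers[code[i]](&m);      /* one indirect call per opcode */
    return m.sp ? m.stack[m.sp - 1] : 0;
}
```

Running the program `{ OP_PUSH1, OP_PUSH1, OP_ADD, OP_DOUBLE }` through `run()` leaves 4 on the stack. The same table-of-pointers shape is what a struct of function pointers gives you when simulating virtual methods.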

Then there are hidden gems in the standard library, such as qsort(), bsearch(), strpbrk(), and strcspn() [the latter two being useful for implementing a strtok() replacement].
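One reason to want a strtok() replacement is that strtok() writes NUL bytes into its input and keeps hidden static state between calls. The sketch below is one possible non-destructive tokenizer; it uses strcspn() together with strspn() (its complement, not mentioned above), and the name `next_token` and its signature are made up for this example.

```c
#include <string.h>

/*
 * Finds the next token in *cursor without modifying the string.
 * Returns a pointer to the token start and stores its length in *len,
 * or returns NULL when no tokens remain. *cursor is advanced past the token.
 */
const char *next_token(const char **cursor, const char *delims, size_t *len)
{
    const char *s = *cursor;
    s += strspn(s, delims);          /* skip leading delimiters */
    if (*s == '\0') {
        *cursor = s;
        return NULL;
    }
    *len = strcspn(s, delims);       /* token runs until the next delimiter */
    *cursor = s + *len;
    return s;
}
```

Because all state lives in the caller's cursor, two strings can be tokenized in an interleaved fashion, and string literals can be used as input.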

A misfeature of C is that signed arithmetic overflow is undefined behavior (UB). So whenever you see an expression such as x+y, both being signed ints, it might potentially overflow and cause UB.
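A common way to stay clear of the UB is to test the operands against the limits before adding, so the overflowing addition is never evaluated. The helper name `checked_add` is hypothetical; this is a sketch of the pre-check technique, not a standard library routine.

```c
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *out and returns true, or returns false if the sum would overflow. */
bool checked_add(int a, int b, int *out)
{
    /* Both comparisons are evaluated without ever computing a + b. */
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}
```

The subtraction `INT_MAX - b` is only evaluated when `b > 0`, and `INT_MIN - b` only when `b < 0`, so the checks themselves cannot overflow.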

Interferometer answered 25/9, 2008 at 9:2 Comment(9)

But if they had specified behaviour on overflow, it would have made it very slow on architectures where that was not the normal behaviour. Very low runtime overhead has always been a design goal of C, and that has meant that a lot of things like this are undefined. – Ibiza 17/10, 2008 at 8:38

I'm very well aware of why overflow is UB. It is still a misfeature, because the standard should have at least provided library routines that can test for arithmetic overflow (of all basic operations) w/o causing UB. – Interferometer 20/1, 2009 at 20:51

@zvrba, "library routines that can test for arithmetic overflow (of all basic operations)": if you had added this, then you would have incurred a significant performance hit for any integer arithmetic operations. Case study: Matlab specifically ADDS the feature of controlling integer overflow behavior to wrap or saturate. And it also throws an exception whenever overflow occurs. Result: the performance of Matlab integer operations is VERY SLOW. My own conclusion: I think Matlab is a compelling case study that shows why you don't want integer overflow checking. – Philter 11/6, 2009 at 13:35

@zvrba, In my opinion, C was designed ASSUMING that whenever you are doing integer arithmetic you the programmer are doing rigorous analysis to ENSURE that you have bounded-input-bounded-output (fancy way of saying "make sure your input and output stay within a range")!! If you are not doing that rigorous analysis then it's not the language's fault it is the programmer's fault. – Philter 11/6, 2009 at 13:38

I said that the standard should have provided library support for checking for arithmetic overflow. Now, how can a library routine incur a performance hit if you never use it? – Interferometer 12/6, 2009 at 18:52

A big negative is that GCC does not have a flag to catch signed integer overflows and throw a runtime exception. While there are x86 flags for detecting such cases, GCC does not utilize them. Having such a flag would allow non-performance-critical (especially legacy) applications the benefit of security with minimal to no code review and refactoring. – Gyre 22/6, 2009 at 0:23

None of this is at all 'hidden'. The standard library for example is well advertised; if people choose not to read the documentation they are fools. Function pointers are merely 'advanced' not hidden in any way whatsoever - most texts on the language deal with them. – Katonah 11/11, 2009 at 13:52

Hidden is relative, documentation is relatively absolute. – Nathanaelnathanial 2/9, 2010 at 1:8

What is the need for a strtok replacement? – Altercation 26/11, 2010 at 13:57


115

This is more of a GCC trick than a C feature, but you can give the compiler branch prediction hints (common in the Linux kernel):

#define likely(x)       __builtin_expect((x),1)
#define unlikely(x)     __builtin_expect((x),0)

see: http://kerneltrap.org/node/4705

What I like about this is that it also adds some expressiveness to some functions.

void foo(int arg)
{
    if (unlikely(arg == 0)) {
        do_this();
        return;
    }
    do_that();
    ...
}
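Since `__builtin_expect` is a compiler extension, code that must also build elsewhere usually wraps the macros in a guard. The following is a sketch under that assumption; `checked_length` is a hypothetical function used only to show the macros in context. The `!!(x)` normalizes any nonzero condition to 1, because the builtin compares its first argument against the expected value.

```c
/* Use the GCC/Clang builtin when available; otherwise the hints expand to the bare condition. */
#if defined(__GNUC__)
#  define likely(x)   __builtin_expect(!!(x), 1)
#  define unlikely(x) __builtin_expect(!!(x), 0)
#else
#  define likely(x)   (x)
#  define unlikely(x) (x)
#endif

/* Hypothetical helper: returns -1 on the rare error path, otherwise the string length. */
int checked_length(const char *s)
{
    if (unlikely(s == 0))
        return -1;              /* marked as the rarely taken branch */
    int n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

The hints never change what the code computes; they only influence how the compiler lays out the branches.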

Instrumentalism answered 25/9, 2008 at 9:2 Comment(1)

This trick is cool... :) Especially with the macros you define. :) – Androsphinx 22/10, 2008 at 15:23


77

int8_t
int16_t
int32_t
uint8_t
uint16_t
uint32_t

These are an optional item in the standard, but it must be a hidden feature, because people are constantly redefining them. One code base I've worked on (and still do, for now) has multiple redefinitions, all with different identifiers. Most of the time it's with preprocessor macros:

#define INT16 short
#define INT32  long

And so on. It makes me want to pull my hair out. Just use the freaking standard integer typedefs!
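For comparison, here is a small sketch of what using the standard typedefs looks like, together with the matching printf macros from <inttypes.h>. The `record_header` layout and `format_magic` are hypothetical, chosen only to show fields whose widths are dictated by a file format rather than by the compiler.

```c
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical on-disk record header: field widths are fixed by the format. */
struct record_header {
    uint32_t magic;
    uint16_t version;
    uint16_t flags;
    int32_t  payload_len;
};

/* Formats the magic field as 8 hex digits; returns the number of characters written. */
int format_magic(char *buf, size_t size, const struct record_header *h)
{
    /* PRIx32 expands to the correct conversion specifier for uint32_t. */
    return snprintf(buf, size, "%08" PRIx32, h->magic);
}
```

With `#define INT32 long`, the same field would silently become 64 bits wide on platforms where `long` is 64 bits, which is exactly the problem the typedefs avoid.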

Webby answered 25/9, 2008 at 9:2 Comment(11)

I think they are C99 or so. I haven't found a portable way to ensure these would be around. – Selfassurance 25/9, 2008 at 19:25

They are an optional part of C99, but I know of no compiler vendors that don't implement this. – Webby 25/9, 2008 at 21:7

stdint.h isn't optional in C99, but following the C99 standard apparently is for some vendors (cough Microsoft). – Ithyphallic 22/10, 2008 at 17:54

Microsoft Visual C++ doesn't follow the Ada95 standard either. It's not a C99 compiler. It's a C++ 97 compiler. (It doesn't always follow that standard, but it's not fair to complain about it not being something it doesn't claim to be) – Kellum 1/3, 2009 at 11:33

@Pete, if you want to be anal: (1) This thread has nothing to do with any Microsoft product. (2) This thread never had anything to do with C++ at all. (3) There is no such thing as C++ 97. – Webby 1/3, 2009 at 21:20

Have a look at azillionmonkeys.com/qed/pstdint.h -- a close-to-portable stdint.h – Fennelly 16/4, 2009 at 14:16

@gnud: thanks for the tip, but my whole gripe is that it isn't necessary - most compilers implement the standard typedefs. The only compiler I've ever used that didn't was an old version of GCC adapted for embedded VxWorks development (old, like, GCC 2.7). – Webby 16/4, 2009 at 21:56

@Ben Collins: He's pointing out that it almost implements C++98, but falls short of several requirements. Furthermore, MSVC doesn't support C99, especially stdint.h which is a royal PITA. – Stryker 22/10, 2009 at 11:52

@Anacrolix: Yes. I understood what he was pointing out. You seem to miss my point though: it's apropos nothing. Whether or not a particular compiler "supports C99" really has nothing at all to do with whether or not you should use the standard integer typedefs. They are portable and easy to define even if your compiler sucks. If you need to specify a certain integer width, then the standard typedefs should always, always be used, regardless of where the definitions come from. – Webby 23/10, 2009 at 21:59

Thanks so much! It gets my back up when I use the Windows headers and you get typedef unsigned long ULONG. Like, seriously? Or, typedef float FLOAT. – Hypocaust 8/5, 2010 at 8:58

To my knowledge, Visual Studio 2010 now has stdint.h due to high demand for that specific feature of C99. It was a royal PITA that it wasn't included earlier than that. – Mothy 19/1, 2011 at 21:12


72

The comma operator isn't widely used. It can certainly be abused, but it can also be very useful. This use is the most common one:

for (int i=0; i<10; i++, doSomethingElse())
{
  /* whatever */
}

But you can use this operator anywhere. Observe:

int j = (printf("Assigning variable j\n"), getValueFromSomewhere());

Each operand is an expression, not a statement. The operands are evaluated left to right, and the value of the whole expression is that of the last (rightmost) operand.
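A small sketch of both uses, with hypothetical function names. Note that the comma in `int i = 0, j = len - 1` is part of a declaration, not the comma operator; the operator is the one in `i++, j--`. Declaring variables inside the for header requires C99 or later.

```c
/* Reverses buf in place; the comma operator lets both indices step in one loop header. */
void reverse(char *buf, int len)
{
    for (int i = 0, j = len - 1; i < j; i++, j--) {
        char tmp = buf[i];
        buf[i] = buf[j];
        buf[j] = tmp;
    }
}

/* Operands are evaluated left to right; the result is the value of the rightmost one. */
int last_of_three(int *counter)
{
    return ((*counter)++, (*counter)++, *counter);
}
```

Each comma operator introduces a sequence point, so the two increments in `last_of_three` are well defined: starting from 0, the function returns 2.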

Webby answered 25/9, 2008 at 9:2 Comment(4)

In 20 years of C I have NEVER seen that! – Partook 22/6, 2009 at 14:55

In C++ you can even overload it. – Marella 1/7, 2009 at 10:45

can != should, of course. The danger with overloading it is that the built in applies to everything already, including void, so will never fail to compile for lack of available overload. Ie, gives programmer much rope. – Urita 6/7, 2009 at 17:45

The int inside the loop will not work with C: it's a C++ improvement. Is the "," the same operation as for (i=0,j=10;i<j; j--, i++) ? – Altercation 26/11, 2010 at 14:0


63

Initializing a structure to zero:

struct mystruct a = {0};

This will zero all structure elements.
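To make the effect concrete, here is a sketch with a hypothetical definition of `struct mystruct` that mixes integer, floating-point, pointer, and array members. The `{0}` initializer sets the first member to 0, and every remaining member is initialized as if it had static storage duration: arithmetic members become zero and pointers become null.

```c
#include <stddef.h>

/* Hypothetical layout; the original answer does not show the struct's members. */
struct mystruct {
    int     count;
    double  ratio;
    char   *name;
    int     history[4];
};

/* Returns 1 if every member of a {0}-initialized struct compares equal to zero/NULL. */
int zero_init_is_logical_zero(void)
{
    struct mystruct a = {0};
    int all_zero = (a.count == 0) && (a.ratio == 0.0) && (a.name == NULL);
    for (size_t i = 0; i < sizeof a.history / sizeof a.history[0]; i++)
        all_zero = all_zero && (a.history[i] == 0);
    return all_zero;
}
```

The same initializer works for arrays, for example `int array[50] = {0};`.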

Llovera answered 25/9, 2008 at 9:2 Comment(14)

Does it zero out an entire array, as well? – Nagging 28/12, 2008 at 3:30

It doesn't zero the padding, if any, however. – Fraktur 1/3, 2009 at 13:49

@drhorrible: You can do int array[50] = {0}; for example. But only at declaration (I think.) – Naos 26/4, 2009 at 11:8

Doesn't this do something undefined if the structure contains non-integral types (e.g. floats and doubles)? – Sudan 11/6, 2009 at 11:20

@simonn, no it doesn't do undefined behavior if the structure contains non-integral types. memset with 0 on the memory of a float/double will still be zero when you interpret the float/double (float/double are designed like that on purpose). – Philter 11/6, 2009 at 13:59

@Trevor I thought that the effect this has on floats and such is "all bytes zero" which, in all sane cases, would give you a float equal to 0.0, but it's still implementation-defined. – Gyre 22/6, 2009 at 0:27

@Andrew: memset/calloc do "all bytes zero" (i.e. physical zeroes), which is indeed not defined for all types. { 0 } is guaranteed to initialize everything with proper logical zero values. Pointers, for example, are guaranteed to get their proper null values, even if the null-value on the given platform is 0xBAADF00D. – Cocci 28/10, 2009 at 10:12

@AndreyT: Could you please elaborate on diff between logical and physical zeroes? – Sigrid 26/2, 2010 at 9:38

@nvl: You get physical zero when you just forcefully set all memory occupied by the object to all-bits-zero state. This is what memset does (with 0 as second argument). You get logical zero when you initialize/assign 0 ( or { 0 }) to the object in the source code. These two kinds of zeros do not necessarily produce the same result. As in the example with pointer. When you do memset on a pointer, you get a 0x0000 pointer. But when you assign 0 to a pointer, you get null pointer value, which at the physical level might be 0xBAADF00D or anything else. – Cocci 26/2, 2010 at 16:29

In other words, physical value is the explicit bit-pattern that is stored in memory. Logical value is how that bit-pattern is interpreted at the program level. – Cocci 26/2, 2010 at 16:31

@AndreyT I am clear about the physical zero thing. But cannot digest this logical zeroes completely. Any other data structure, other than pointer, which has different logical/physical representations? – Sigrid 26/2, 2010 at 18:57

@nvl: Well, in practice the difference is often only conceptual. But in theory, virtually any type can have it. For example, double. Usually it is implemented in accordance with IEEE-754 standard, in which the logical zero and physical zero are the same. But IEEE-754 is not required by the language. So it might happen that when you do double d = 0; (logical zero), physically some bits in memory occupied by d will not be zero. – Cocci 26/2, 2010 at 19:17

Same with bool values, for another example. If you do bool b = false; (or, equivalently, bool b = 0;) it does not necessarily mean that in physical memory b will be zeroed out (even though it is usually the case in practice). – Cocci 26/2, 2010 at 19:18

Got it. Thanks. I remember the floating point representation now. Makes sense. – Sigrid 26/2, 2010 at 21:10