What belongs in an educational tool to demonstrate the unwarranted assumptions people make in C/C++?
I

23

125

I'd like to prepare a little educational tool for SO which should help beginner (and intermediate) programmers to recognize and challenge their unwarranted assumptions about C, C++ and their platforms.

Examples:

  • "integers wrap around"
  • "everyone has ASCII"
  • "I can store a function pointer in a void*"

I figured that a small test program could be run on various platforms. It would check the "plausible" assumptions that, in our experience on SO, many inexperienced or semi-experienced mainstream developers make, and record the ways they break on diverse machines.

The goal of this is not to prove that it is "safe" to do something (which would be impossible; the tests only prove anything when they fail), but instead to demonstrate to even the most uncomprehending individual how the most inconspicuous expression can break on a different machine if it has undefined or implementation-defined behavior.

To achieve this I would like to ask you:

  • How can this idea be improved?
  • Which tests would be good and what should they look like?
  • Would you run the tests on the platforms you can get your hands on and post the results, so that we end up with a database of platforms, how they differ and why this difference is allowed?

Here's the current version for the test toy:

#include <stdio.h>
#include <limits.h>
#include <stdlib.h>
#include <stddef.h>
#include <string.h> /* strcmp() in the disabled snprintf test */
int count=0;
int total=0;
void expect(const char *info, const char *expr)
{
    printf("..%s\n   but '%s' is false.\n",info,expr);
    fflush(stdout);
    count++;
}
#define EXPECT(INFO,EXPR) if (total++,!(EXPR)) expect(INFO,#EXPR)

/* stack check..How can I do this better? */
ptrdiff_t check_grow(int k, int *p)
{
    if (p==0) p=&k;
    if (k==0) return &k-p;
    else return check_grow(k-1,p);
}
#define BITS_PER_INT (sizeof(int)*CHAR_BIT)

int bits_per_int=BITS_PER_INT;
int int_max=INT_MAX;
int int_min=INT_MIN;

/* for 21 - left to right */
int ltr_result=0;
unsigned ltr_fun(int k)
{
    ltr_result=ltr_result*10+k;
    return 1;
}

int main()
{
    printf("We like to think that:\n");
    /* characters */
    EXPECT("00 we have ASCII",('A'==65));
    EXPECT("01 A-Z is in a block",('Z'-'A')+1==26);
    EXPECT("02 big letters come before small letters",('A'<'a'));
    EXPECT("03 a char is 8 bits",CHAR_BIT==8);
    EXPECT("04 a char is signed",CHAR_MIN==SCHAR_MIN);

    /* integers */
    EXPECT("05 int has the size of pointers",sizeof(int)==sizeof(void*));
    /* not true for Windows-64 */
    EXPECT("05a long has at least the size of pointers",sizeof(long)>=sizeof(void*));

    EXPECT("06 integers are 2-complement and wrap around",(int_max+1)==(int_min));
    EXPECT("07 integers are 2-complement and *always* wrap around",(INT_MAX+1)==(INT_MIN));
    EXPECT("08 overshifting is okay",(1<<bits_per_int)==0);
    EXPECT("09 overshifting is *always* okay",(1<<BITS_PER_INT)==0);
    {
        int t;
        EXPECT("09a minus shifts backwards",(t=-1,(15<<t)==7));
    }
    /* pointers */
    /* Suggested by jalf */
    EXPECT("10 void* can store function pointers",sizeof(void*)>=sizeof(void(*)()));
    /* execution */
    EXPECT("11 Detecting how the stack grows is easy",check_grow(5,0)!=0);
    EXPECT("12 the stack grows downwards",check_grow(5,0)<0);

    {
        int t;
        /* suggested by jk */
        EXPECT("13 The smallest bits always come first",(t=0x1234,0x34==*(char*)&t));
    }
    {
        /* Suggested by S.Lott */
        int a[2]={0,0};
        int i=0;
        EXPECT("14 i++ is strictly left to right",(i=0,a[i++]=i,a[0]==1));
    }
    {
        struct {
            char c;
            int i;
        } char_int;
        EXPECT("15 structs are packed",sizeof(char_int)==(sizeof(char)+sizeof(int)));
    }
    {
        EXPECT("16 malloc()=NULL means out of memory",(malloc(0)!=NULL));
    }

    /* suggested by David Thornley */
    EXPECT("17 size_t is unsigned int",sizeof(size_t)==sizeof(unsigned int));
    /* this is true for C99, but not for C90. */
    EXPECT("18 a%b has the same sign as a",((-10%3)==-1) && ((10%-3)==1));

    /* suggested by nos */
    EXPECT("19-1 char<short",sizeof(char)<sizeof(short));
    EXPECT("19-2 short<int",sizeof(short)<sizeof(int));
    EXPECT("19-3 int<long",sizeof(int)<sizeof(long));
    EXPECT("20 ptrdiff_t and size_t have the same size",(sizeof(ptrdiff_t)==sizeof(size_t)));
#if 0
    {
        /* suggested by R. */
        /* this crashed on TC 3.0++, compact. */
        char buf[10];
        EXPECT("21 You can use snprintf to append a string",
               (snprintf(buf,10,"OK"),snprintf(buf,10,"%s!!",buf),strcmp(buf,"OK!!")==0));
    }
#endif

    EXPECT("21 Evaluation is left to right",
           (ltr_fun(1)*ltr_fun(2)*ltr_fun(3)*ltr_fun(4),ltr_result==1234));

    {
    #ifdef __STDC_IEC_559__
    int STDC_IEC_559_is_defined=1;
    #else 
    /* This either means, there is no FP support
     *or* the compiler is not C99 enough to define  __STDC_IEC_559__
     *or* the FP support is not IEEE compliant. */
    int STDC_IEC_559_is_defined=0;
    #endif
    EXPECT("22 floating point is always IEEE",STDC_IEC_559_is_defined);
    }

    printf("From what I can say with my puny test cases, you are %d%% mainstream\n",100-(100*count)/total);
    return 0;
}

Oh, and I made this community wiki right from the start because I figured that people want to edit my blabber when they read this.

UPDATE Thanks for your input. I've added a few cases from your answers and will see if I can set up a github for this like Greg suggested.

UPDATE: I've created a github repo for this, the file is "gotcha.c":

Please answer here with patches or new ideas, so they can be discussed or clarified here. I will merge them into gotcha.c then.

Introrse answered 11/8, 2010 at 11:59 Comment(19)
You could take a look at gnu autotools. If I remember correctly, it was made to test for stuff like that during ./configureAnchylose
Would you be so kind as to point me to an example, an explanation, or a URL dealing with the fact that a function pointer cannot be stored in a void* pointer?Breaux
Consider the medium model in DOS. Functions can be stored in multiple segments, so a function pointer is 32 bits long. But your data is stored in a single segment only, therefore data pointers are only 16 bits long. Since void* is a data pointer, it's 16 bits wide, so you can't fit a function pointer in one. See c-jump.com/CIS77/ASM/Directives/D77_0030_models.htm.Toad
I've added TurboC++/DOS which shows that. Also I'll be running this on gcc/ATMEGA later which IMHO had the same thing (code and data memory are separate on ATMEGAs)Introrse
Perhaps you could throw this code up on github.com or something and then people could easily contribute patches.Grisby
+1: this is an awesome idea, and a possibly great future link to post on "yet another platform assumption" questions.Paperboard
@Stephane: Another example could be a pointer to a member function in C++.Tartarus
A lot of things here should help: stackoverflow.com/questions/367633/…Cajuput
This is awesome! I'm 79% mainstream on gcc 4.4.3, 32 bit.Edison
How about int x; for (x = 1; x > 0; x += x); . This yields an infinite loop when compiled with GCC using -O2 or higher. Using unsigned int instead of int makes it work, but I'm not sure that's trustworthy either.Splendent
POSIX requires that function pointers have the same representation as void * and can be converted (with a cast) without loss of information. One of the reasons for this is that dlsym() returns a void * but is intended for both data and function pointers. Therefore it may not be so bad to depend on this.Valenba
@Joey: With a signed type, an infinite loop is the mainstream behavior (2's complement, no overflow check): the program loops at MIN_INT. With an unsigned type, the behavior is defined: the loop stops after N iterations where N is the number of value bits in the type.Wraith
@Gilles: Note that if you say int x; for (x = 1; x > 0; x += x) printf("%d\n", x);, it will print 0s in an infinite loop, but only if the compiler optimizes here.Splendent
Your point 15 is imho backwards. Except if explicitly asked for or for 8 bit compilers the struct will never be packed. The default assumption is that it will not be packed. EDIT: as can be seen the only compiler that doesn't trigger it is on the Commodore PET.Reconstructive
@tristopia: Point 15 is here, because many beginners are often surprised to learn that data is not packed continuously but instead aligned to certain boundaries. They're puzzled when they change the member order and get different object sizes. Also, packing is the default mode with many contemporary micro controller or embedded devices. My AVR Atmega and TurboC/MSDOS output is packed too. MSDOS is still used in industrial applications.Introrse
I'm not sure how it could be illustrated reliably in a test program, but not all assumptions are violated by the CPU. Some may be broken by the compiler. GCC's strict aliasing comes to mind as an example of code that beginners intuitively expect to work, and which breaks not because of hardware quirks, but because the compiler optimizes the code.Sturm
Btw. the packed issue is important, because it follows from it, that subtracting two pointers that are not from the same array can not work anymore (it is undefined behavior anyway)Introrse
@tristopia, The compiler that I learned on also packed structures by default.Lathery
So far, the assumptions of ASCII (00, 01, 02), 8-bit char (03, 19-1), 2's-complement integers (06, 07), a downward-growing stack (11, 12), and truncating division (18) have held for all of the test results posted.Serai
G
93

The order of evaluation of subexpressions, including

  • the arguments of a function call and
  • operands of operators (e.g., +, -, =, * , /), with the exception of:
    • the binary logical operators (&& and ||),
    • the ternary conditional operator (?:), and
    • the comma operator (,)

is Unspecified

For example

  #include <stdio.h>

  int Hello()
  {
       return printf("Hello"); /* printf() returns the number of 
                                  characters successfully printed by it
                               */
  }

  int World()
  {
       return printf("World !");
  }

  int main()
  {
      /* might print "HelloWorld !" or "World !Hello" */
      int a = Hello() + World();
      /**             ^
                      | 
                Functions can be called in either order
      **/
      (void)a; /* only the side effects matter here */
      return 0;
  } 
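If a particular order is actually needed, one way out is to put each call in its own statement. This is only a minimal sketch: hello(), world() and hello_then_world() are hypothetical names made up for illustration, not part of the answer above.

```c
#include <stdio.h>

/* Hypothetical helpers mirroring the Hello()/World() functions above. */
static int hello(void) { return printf("Hello "); }
static int world(void) { return printf("World!\n"); }

/* Forces hello() to run before world() by using two full statements. */
int hello_then_world(void)
{
    int a = hello();   /* the end of this statement is a sequence point */
    int b = world();   /* so this call always runs second */
    return a + b;      /* total number of characters printed */
}
```

Calling hello_then_world() prints "Hello World!" on every conforming compiler, because the order no longer depends on how the operands of + are evaluated.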
Gambit answered 11/8, 2010 at 11:59 Comment(13)
I had always known that about function parameters, but I never thought of it in terms of operators ... ... and if I ever see you writing code like that in a production environment, I will slap you with a wet noodle.Caballero
Hmm... this isn't always true. Example: ||, &&. Logical operators are required to short circuit.Treacy
@Billy: But only for the primitive versions of the operators.Novel
@Dennis @Billy Which is one of the more confusing parts of reading C++ code, and the reason I never overload those operatorsNinetta
@Dennis: That is true. (Which is why it's an item in Effective/More Effective C++ to never overload those, unless you're writing boost::spirit.)Treacy
@Billy ONeal - item 30 in Sutter/Alexandrescu's Coding Standards book. But I'm not sure who it should be aimed at. If the problem is that users expect short-circuiting to work consistently, then the advice should be: don't use libraries that overload &&, etc. Overloading them yourself isn't hard; it's your caller who has to be careful. (Ultimately I think it's hollow advice anyway: internal DSLs have different semantics, get used to it.)Marchland
@Daniel: I'm not sure what you're trying to say. It sounds like you are suggesting it's okay to overload the operators because it's only the users of your class that might get it wrong, and if you aren't writing in straight C++ it doesn't matter. Neither of which makes any sense at all.Novel
Neither of those make sense to me either; they sound almost opposite to what I said. If you want to assume consistent short-circuiting semantics, then not only must you never overload && but you must also review the external interface code of any 3rd party libraries you consume, to ensure they don't do it either. It's advice for consumers, as well as for implementers who want to serve users who have this concern. But...Marchland
... then an exception is made in the rule for some libraries; how are those exceptional libraries identified? Do they need to have smarter users, or do they take some other steps to ensure the changed semantics are not harmful?Marchland
@Prasoon The example that you quote has multiple side effects on stdout. So it has undefined behavior. It can print Hello World! or World! Hello. Or you know fry your computer.Roentgenograph
@user420536 : The behavior is just unspecified but not undefined. Yes the example can print either Hello World! or World! Hello but that's just unspecified because the order of evaluation of operands of + operator is unspecified (Compiler writers need not document the behaviour). It doesn't violate any sequence point rule as such.Gambit
@PrasoonSaurav; Please give an example for = operator.Brow
Quote from standard, just in case: Except where noted, evaluations of operands of individual operators and of subexpressions of individual expressions are unsequenced. [ Note: In an expression that is evaluated more than once during the execution of a program, unsequenced and indeterminately sequenced evaluations of its subexpressions need not be performed consistently in different evaluations. — end note ]Benedix
I
38

sdcc 29.7/ucSim/Z80

We like to think that:
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..19-2 short<int
   but 'sizeof(short)<sizeof(int)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
..25 pointer arithmetic works outside arrays
   but '(diff=&var.int2-&var.int1, &var.int1+diff==&var.int2)' is false.
From what I can say with my puny test cases, you are Stop at 0x0013f3: (106) Invalid instruction 0x00dd

printf crashes. "O_O"


gcc 4.4@x86_64-suse-linux

We like to think that:
..05 int has the size of pointers
but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
but '(t=-1,(15<<t)==7)' is false.
..14 i++ is strictly left to right
but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
but 'sizeof(size_t)==sizeof(unsigned int)' is false.
..26 sizeof() does not evaluate its arguments
but '(i=10,sizeof(char[((i=20),10)]),i==10)' is false.
From what I can say with my puny test cases, you are 79% mainstream

gcc 4.4@x86_64-suse-linux(-O2)

We like to think that:
..05 int has the size of pointers
but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
but '(1<<bits_per_int)==0' is false.
..14 i++ is strictly left to right
but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
but 'sizeof(size_t)==sizeof(unsigned int)' is false.
..26 sizeof() does not evaluate its arguments
but '(i=10,sizeof(char[((i=20),10)]),i==10)' is false.
From what I can say with my puny test cases, you are 82% mainstream

clang 2.7@x86_64-suse-linux

We like to think that:
..05 int has the size of pointers
but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
but '(t=-1,(15<<t)==7)' is false.
..14 i++ is strictly left to right
but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
but 'sizeof(size_t)==sizeof(unsigned int)' is false.
..21a Function Arguments are evaluated right to left
but '(gobble_args(0,ltr_fun(1),ltr_fun(2),ltr_fun(3),ltr_fun(4)),ltr_result==4321)' is false.
ltr_result is 1234 in this case
..25a pointer arithmetic works outside arrays
but '(diff=&p1-&p2, &p2+diff==&p1)' is false.
..26 sizeof() does not evaluate its arguments
but '(i=10,sizeof(char[((i=20),10)]),i==10)' is false.
From what I can say with my puny test cases, you are 72% mainstream

open64 4.2.3@x86_64-suse-linux

We like to think that:
..05 int has the size of pointers
but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
but '(t=-1,(15<<t)==7)' is false.
..15 structs are packed
but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
but 'sizeof(size_t)==sizeof(unsigned int)' is false.
..21a Function Arguments are evaluated right to left
but '(gobble_args(0,ltr_fun(1),ltr_fun(2),ltr_fun(3),ltr_fun(4)),ltr_result==4321)' is false.
ltr_result is 1234 in this case
..25a pointer arithmetic works outside arrays
but '(diff=&p1-&p2, &p2+diff==&p1)' is false.
..26 sizeof() does not evaluate its arguments
but '(i=10,sizeof(char[((i=20),10)]),i==10)' is false.
From what I can say with my puny test cases, you are 75% mainstream

intel 11.1@x86_64-suse-linux

We like to think that:
..05 int has the size of pointers
but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
but '(t=-1,(15<<t)==7)' is false.
..14 i++ is strictly left to right
but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
but 'sizeof(size_t)==sizeof(unsigned int)' is false.
..21a Function Arguments are evaluated right to left
but '(gobble_args(0,ltr_fun(1),ltr_fun(2),ltr_fun(3),ltr_fun(4)),ltr_result==4321)' is false.
ltr_result is 1234 in this case
..26 sizeof() does not evaluate its arguments
but '(i=10,sizeof(char[((i=20),10)]),i==10)' is false.
From what I can say with my puny test cases, you are 75% mainstream

Turbo C++/DOS/Small Memory

We like to think that:
..09a minus shifts backwards
but '(t=-1,(15<<t)==7)' is false.
..16 malloc()=NULL means out of memory
but '(malloc(0)!=NULL)' is false.
..19-2 short<int
but 'sizeof(short)<sizeof(int)' is false.
..22 floating point is always IEEE
but 'STDC_IEC_559_is_defined' is false.
..25 pointer arithmetic works outside arrays
but '(diff=&var.int2-&var.int1, &var.int1+diff==&var.int2)' is false.
..25a pointer arithmetic works outside arrays
but '(diff=&p1-&p2, &p2+diff==&p1)' is false.
From what I can say with my puny test cases, you are 81% mainstream

Turbo C++/DOS/Medium Memory

We like to think that:
..09a minus shifts backwards
but '(t=-1,(15<<t)==7)' is false.
..10 void* can store function pointers
but 'sizeof(void*)>=sizeof(void(*)())' is false.
..16 malloc()=NULL means out of memory
but '(malloc(0)!=NULL)' is false.
..19-2 short<int
but 'sizeof(short)<sizeof(int)' is false.
..22 floating point is always IEEE
but 'STDC_IEC_559_is_defined' is false.
..25 pointer arithmetic works outside arrays
but '(diff=&var.int2-&var.int1, &var.int1+diff==&var.int2)' is false.
..25a pointer arithmetic works outside arrays
but '(diff=&p1-&p2, &p2+diff==&p1)' is false.
From what I can say with my puny test cases, you are 78% mainstream

Turbo C++/DOS/Compact Memory

We like to think that:
..05 int has the size of pointers
but 'sizeof(int)==sizeof(void*)' is false.
..09a minus shifts backwards
but '(t=-1,(15<<t)==7)' is false.
..16 malloc()=NULL means out of memory
but '(malloc(0)!=NULL)' is false.
..19-2 short<int
but 'sizeof(short)<sizeof(int)' is false.
..20 ptrdiff_t and size_t have the same size
but '(sizeof(ptrdiff_t)==sizeof(size_t))' is false.
..22 floating point is always IEEE
but 'STDC_IEC_559_is_defined' is false.
..25 pointer arithmetic works outside arrays
but '(diff=&var.int2-&var.int1, &var.int1+diff==&var.int2)' is false.
..25a pointer arithmetic works outside arrays
but '(diff=&p1-&p2, &p2+diff==&p1)' is false.
From what I can say with my puny test cases, you are 75% mainstream

cl65@Commodore PET (vice emulator)

(screenshot of the test output; image not preserved)


I'll be updating these later:


Borland C++ Builder 6.0 on Windows XP

..04 a char is signed
   but 'CHAR_MIN==SCHAR_MIN' is false.
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09 overshifting is *always* okay
   but '(1<<BITS_PER_INT)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..16 malloc()=NULL means out of memory
   but '(malloc(0)!=NULL)' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 71% mainstream

Visual Studio Express 2010 C++ CLR, Windows 7 64bit

(must be compiled as C++ because the CLR compiler does not support pure C)

We like to think that:
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..14 i++ is structly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 78% mainstream

MINGW64 (gcc-4.5.2 prerelease)

-- http://mingw-w64.sourceforge.net/

We like to think that:
..05 int has the size of pointers
   but 'sizeof(int)==sizeof(void*)' is false.
..05a long has at least the size of pointers
   but 'sizeof(long)>=sizeof(void*)' is false.
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..14 i++ is structly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
   but 'sizeof(size_t)==sizeof(unsigned int)' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 67% mainstream

64 bit Windows uses the LLP64 model: Both int and long are defined as 32-bit, which means that neither is long enough for a pointer.
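When a pointer really has to travel through an integer, a type meant for that job is safer than int or long. The following is a sketch only: roundtrip_ok() is a hypothetical helper, and it assumes the implementation provides uintptr_t, which C99 marks as optional.

```c
#include <stdint.h>

/* Round-trips a pointer through an integer type wide enough to hold it.
   Assumption: the platform provides uintptr_t (optional in C99). */
int roundtrip_ok(void *p)
{
    uintptr_t bits = (uintptr_t)p;   /* not long: long is 32 bits under LLP64 */
    void *back = (void *)bits;       /* convert back to a pointer */
    return back == p;                /* 1 if nothing was lost on the way */
}
```

The point of the design is that the width question is delegated to the implementation instead of being guessed from sizeof(int) or sizeof(long).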


avr-gcc 4.3.2 / ATmega168 (Arduino Diecimila)

The failed assumptions are:

..14 i++ is structly left to right
..16 malloc()=NULL means out of memory
..19-2 short<int
..21 Evaluation is left to right
..22 floating point is always IEEE

The ATmega168 has a 16-bit PC, but code and data are in separate address spaces. Larger ATmegas have a 22-bit PC!


gcc 4.2.1 on MacOSX 10.6, compiled with -arch ppc

We like to think that:
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..13 The smallest bits come always first
   but '(t=0x1234,0x34==*(char*)&t)' is false.
..14 i++ is structly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 78% mainstream

Introrse answered 11/8, 2010 at 11:59 Comment(3)
And you've identified another assumption: that you can fit 80 characters on a terminal line.Galatia
sizeof(void*)>=sizeof(void(*)()) would be more relevant than ==. All we care about is "can we store a function pointer in a void pointer", so the assumption you need to test is whether a void* is at least as big as a function pointer.Sturm
If your environment is POSIX-compliant, you should be okay with sizeof(void*)>=sizeof(void(*)()) - see opengroup.org/onlinepubs/009695399/functions/dlsym.htmlMarchland
P
26

A long time ago, I was teaching C from a textbook that had

printf("sizeof(int)=%d\n", sizeof(int));

as a sample question. It failed for a student because sizeof yields values of type size_t, not int. On this implementation int was 16 bits, size_t was 32 bits, and the machine was big-endian, so %d presumably picked up the high-order half of the value, which was zero. (The platform was Lightspeed C on 680x0-based Macintoshes. I said it was a long time ago.)
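One conservative way to print a size_t is to cast it to unsigned long and use %lu, which also works on C90 compilers. This is a sketch under stated assumptions: format_size() is a hypothetical helper, and the cast can truncate on platforms where size_t is wider than unsigned long (Win64, for example). Where the C runtime supports C99, "%zu" avoids the cast.

```c
#include <stdio.h>
#include <stddef.h>

/* Formats a size_t into buf using only C90 features.
   Assumption: buf is large enough for the digits and the terminator.
   Returns the number of characters written, as sprintf() does. */
int format_size(char *buf, size_t value)
{
    /* The cast makes the argument type match the %lu conversion exactly. */
    return sprintf(buf, "%lu", (unsigned long)value);
}
```

The textbook line would then be written as printf("sizeof(int)=%lu\n", (unsigned long)sizeof(int)); so the argument and the conversion specifier always agree.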

Prouty answered 11/8, 2010 at 11:59 Comment(3)
+1 for pointing out one of the most common, and commonly-overlooked, errors of this sort.Spurling
This also happens on 64-bit systems, where size_t is 64 bit and ints are almost always shorter. Win64 is still weirder, because size_t is an unsigned long long there. Added as Test 17.Introrse
Unfortunately, Microsoft's C runtime doesn't support the z modifier for size_t sized integers, and long long isn't supported on some platforms as well. So there's no safe portable way to format or cast the printed size of an object.Spineless
F
16

You need to include the ++ and -- assumptions people make.

a[i++]= i;

This, for example, is syntactically legal, but it produces varying results depending on too many things to reason out.

Any statement that has ++ (or --) and a variable which occurs more than once is a problem.
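A sketch of the usual fix: decide which value of i is meant and move the increment into its own statement. store_then_advance() is a hypothetical name, and it assumes the intent was to store the old value of i.

```c
/* Stores the current index into a[i], then advances i, with each step in
   its own statement so the result no longer depends on evaluation order. */
int store_then_advance(int *a, int i)
{
    a[i] = i;   /* uses the same, unmodified i on both sides */
    i++;        /* the increment happens in a separate statement */
    return i;   /* the advanced index */
}
```

If the new value of i was meant instead, the two statements simply swap roles; either way the reader no longer has to guess.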

Flaky answered 11/8, 2010 at 11:59 Comment(1)
And it's just such a common question too!Sylph
T
8

Very interesting!

Other things I can think of it might be useful to check for:

  • do function pointers and data pointers exist in the same address space? (Breaks in Harvard architecture machines like DOS small mode. Don't know how you'd test for it, though.)

  • if you take a NULL data pointer and cast it to the appropriate integer type, does it have the numeric value 0? (Breaks on some really ancient machines --- see http://c-faq.com/null/machexamp.html.) Ditto with function pointer. Also, they may be different values.

  • does incrementing a pointer past the end of its corresponding storage object, and then back again, cause sensible results? (I don't know of any machines this actually breaks on, but I believe the C spec does not allow you to even think about pointers that don't point to either (a) the contents of an array or (b) the element immediately after the array or (c) NULL. See http://c-faq.com/aryptr/non0based.html.)

  • does comparing two pointers to different storage objects with < and > produce consistent results? (I can imagine this breaking on exotic segment-based machines; the spec forbids such comparisons, so the compiler would be entitled to compare the offset part of the pointer only, and not the segment part.)
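For contrast with the cases in the list, here is a sketch of pointer arithmetic that stays inside what the standard permits: the pointer may reach one element past the end of the array, but it is never dereferenced there and never compared against a pointer into another object. sum_ints() is a hypothetical helper.

```c
#include <stddef.h>

/* Sums n ints by walking a pointer from first up to one past the last
   element. The one-past-the-end pointer is only compared, never read. */
long sum_ints(const int *first, size_t n)
{
    const int *end = first + n;   /* one past the end: valid to form */
    const int *p;
    long total = 0;

    for (p = first; p != end; ++p)
        total += *p;              /* p is always inside the array here */
    return total;
}
```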

Hmm. I'll try and think of some more.

Edit: Added some clarifying links to the excellent C FAQ.

Toad answered 11/8, 2010 at 11:59 Comment(3)
Incidentally, a while back I did an experimental project called Clue (cluecc.sourceforge.net) which allowed you to compile C into Lua, Javascript, Perl, LISP, etc. It ruthlessly exploited the undefined behaviour in the C standard to make pointers work. It may be interesting to try this test on it.Toad
IIRC C allows you to increment a pointer by 1 beyond the end of an object, but not any further. Decrementing it to a position before the beginning of an object is not allowed, however.Spurling
@R. Same in C++. And incrementing further might break if incrementing the pointer causes an overflow, on CPU's which don't just treat pointers as integers.Sturm
S
5
EXPECT("## pow() gives exact results for integer arguments", pow(2, 4) == 16);
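Where an exact integer power is needed, one option is to avoid floating point altogether. This is a minimal sketch: ipow() is a hypothetical helper that uses repeated squaring and assumes the result fits in an unsigned long (there is no overflow check).

```c
/* Integer power by repeated squaring.
   Assumption: the mathematical result fits in unsigned long; if it does
   not, the value silently wraps modulo 2^N as unsigned arithmetic does. */
unsigned long ipow(unsigned long base, unsigned exp)
{
    unsigned long result = 1;

    while (exp > 0) {
        if (exp & 1u)       /* lowest exponent bit set: multiply it in */
            result *= base;
        base *= base;       /* square the base for the next bit */
        exp >>= 1;
    }
    return result;
}
```

For powers of two specifically, 1UL << n does the same job as long as n is smaller than the number of bits in unsigned long.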

Another one is about text mode in fopen. Most programmers assume that either text and binary are the same (Unix) or that text mode adds \r characters (Windows). But C has been ported to systems that use fixed-width records, on which fputc('\n', file) on a text file means to add spaces or something until the file size is a multiple of the record length.

And here are my results:

gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3 on x86-64

We like to think that:
..05 int has the size of pointers
   but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..14 i++ is strictly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
   but 'sizeof(size_t)==sizeof(unsigned int)' is false.
From what I can say with my puny test cases, you are 78% mainstream
Serai answered 11/8, 2010 at 11:59 Comment(1)
I've actually seen code that combined pow(2, n) with bit operations.Serai
V
5

Here's a fun one: What's wrong with this function?

float sum(unsigned int n, ...)
{
    float v = 0;
    va_list ap;
    va_start(ap, n);
    while (n--)
        v += va_arg(ap, float);
    va_end(ap);
    return v;
}

[Answer (rot13): Inevnqvp nethzragf borl gur byq X&E cebzbgvba ehyrf, juvpu zrnaf lbh pnaabg hfr 'sybng' (be 'pune' be 'fubeg') va in_net! Naq gur pbzcvyre vf erdhverq abg gb gerng guvf nf n pbzcvyr-gvzr reebe. (TPP qbrf rzvg n jneavat, gubhtu.)]

Vicar answered 11/8, 2010 at 11:59 Comment(7)
Oh, that's a good one. clang 2.7 eats this and produces complete nonsense without a warning.Introrse
va_arg expands if it's a macro and the while loop only executes the first statement, of perhaps many?Profiteer
Nope (if that happened it would be a bug in the implementation).Vicar
I'm curious — how do you infer that the standard specifies (words to the effect that) "the compiler is required not to treat this as a compile-time error"? (See §7.16.1.1 The va_arg macro for the C11 specification — I realize that C11 was not current when this answer was written, but C99 says substantially the same thing in §7.15.1.1 The va_arg macro).Humphreys
@JonathanLeffler That phrasing specifies runtime UB, i.e. the program as a whole only has UB if the function is actually called.Vicar
@JonathanLeffler (I cannot remember exactly why this is so. It came up on the GCC development list many years ago, and thinking back, it could have been more like "we're going to assume that all UB described in clause 7 is runtime-only unless it's utterly unambiguous that a compile error is OK" than "the standard definitely says we must complete translation of this program despite the UB".)Vicar
Thanks for responding — I, too, often find it hard to remember why I did or said something thirteen years ago. I plan to leave the "I'm curious" comment in place but there's no particular need for you to update the answer without anything better than fuzzy memory to guide you. If you do decide to make an update, lemme know, and I'll reconsider. (And just in case other readers are confused, the comment under discussion is in the ROT13-encoded text.)Humphreys
S
5

I think you should make an effort to distinguish between two very different classes of "incorrect" assumptions. A good half (right shift and sign extension, ASCII-compatible encoding, memory is linear, data and function pointers are compatible, etc.) are pretty reasonable assumptions for most C coders to make, and might even be included as part of the standard if C were being designed today and if we didn't have legacy IBM junk grandfathered-in. The other half (things related to memory aliasing, behavior of library functions when input and output memory overlap, 32-bit assumptions like that pointers fit in int or that you can use malloc without a prototype, that calling convention is identical for variadic and non-variadic functions, ...) either conflict with optimizations modern compilers want to perform or with migration to 64-bit machines or other new technology.

Spurling answered 11/8, 2010 at 11:59 Comment(3)
it's not just "IBM junk" (though I agree the IBM stuff is junk). Many embedded systems today have similar problems.Joni
To clarify, using malloc without a prototype means not including <stdlib.h>, which causes malloc to default to int malloc(int), a no-no if you want to support 64-bit.Splendent
Technically you're free not to include <stdlib.h> as long as you include another header that defines size_t and you then declare malloc with a correct prototype yourself.Spurling
C
4

Include a check for integer sizes. Most people assume that an int is bigger than a short is bigger than a char. However, these might all be false: sizeof(char) < sizeof(int); sizeof(short) < sizeof(int); sizeof(char) < sizeof(short)

This code might fail (it can crash due to unaligned access):

unsigned char buf[64];

int i = 234;
int *p = &buf[1];
*p = i;
i = *p;
Ciera answered 11/8, 2010 at 11:59 Comment(7)
would this code fail in C++? IIRC, it is illegal to cast pointers between unrelated types, EXCEPT for char*, which can be cast to any type (or is it the other way around?).Joni
You could just do int *p = (int*)&buf[1]; in c++, people expect that to work too.Ciera
@nos, yeah that can fail but the fail is crash so his program can't test for that one. :(Lathery
sizeof(char) < sizeof(int) is required. For example, fgetc() returns the value of the character as an unsigned char converted to int, or EOF which is a negative value. unsigned char may not have padding bits, so the only way this can be done is by making int larger than char. Also, (most versions of) the C spec require that any value from the range -32767..32767 can be stored in an int.Valenba
@jilles: still, there are DSPs with 32-bit chars and 32-bit ints.Ciera
@jilles: there is no requirement that char is only 8 bits though. And fgetc's return type doesn't by itself prove anything.Sturm
@nos: I believe those DSP implementations are freestanding implementations. A hosted implementation, unlike a freestanding one, is required to have fgetc, and thus jilles' point about the impossibility of implementing fgetc when sizeof(int)==1 is a valid point and it precludes the existence of any such hosted implementation.Spurling
M
4

Well, the classic portability assumptions not mentioned yet are

  • assumptions about size of integral types
  • endianness
Martens answered 11/8, 2010 at 11:59 Comment(6)
"Endianness", including "There is an endianness": there are middle-endian machines, and the standard allows weird things like storing a short value fedcba9876543210 (that's 16 binary digits) as the two bytes 02468ace and fdb97531.Wraith
Yes, endianness for sure includes mixed/middle endian as well as big and little. If you go to custom hardware you could have any endianness you like on any bus.Martens
Middle endian is known as PDP endian. Gilles describes something even weirder though that would cause headaches for implementing TCP/IP.Lathery
@Gilles: middle-endian... I am very glad I'm not developing on that one. (but now I'll get asked to do a middle-endian networking project, I'm sure)...Bandore
ARM FPE used middle-endian doubles, where they were stored as a <high quad> <low quad> pair but the ordering of the bits inside each quad were the wrong way round. (Thankfully, ARM VFP doesn't do this any more.)Toad
What about the pure binary representation in C99 §6.2.6.2? Bytes inside a word may have any ordering, but value bits inside a byte should follow a pure binary representation.Vague
R
4

Some of them can't easily be tested from inside C because the program is likely to crash on the implementations where the assumption doesn't hold.


"It's ok to do anything with a pointer-valued variable. It only needs to contain a valid pointer value if you dereference it."

#include <stdlib.h> /* malloc, free */

void noop(void *p); /* A no-op function, defined elsewhere, that the compiler doesn't know to optimize away */
int main () {
    char *p = malloc(1);
    free(p);
    noop(p); /* may crash in implementations that verify pointer accesses */
    noop(p - 42000); /* and if not the previous instruction, maybe this one */
}

Same with integral and floating point types (other than unsigned char), which are allowed to have trap representations.


"Integer calculations wrap around. So this program prints a large negative integer."

#include <limits.h> /* INT_MAX */
#include <stdio.h>
int main () {
    printf("%d\n", INT_MAX+1); /* may crash due to signed integer overflow */
    return 0;
}

(C89 only.) "It's ok to fall off the end of main."

#include <stdio.h>
int main () {
    puts("Hello.");
} /* The status code is 7 on many implementations. */
Rodrich answered 11/8, 2010 at 11:59 Comment(7)
As a concrete example: When compiled with gcc -ftrapv -O, the output is We like to think that: followed by AbortedSethrida
@caf: "This option generates traps for signed overflow on addition, subtraction, multiplication operations." Nice to know, thanks.Wraith
The last one is ok in C++ (98, 03 and 0x) as well, and implicitly returns 0.Sturm
Which is nasty because pre-ANSI C allowed this and C99 does as well.Lathery
@Joshua: AFAIK there is no difference between pre-ANSI C and C89 on return from main with no value: the program is correct but returns an undefined termination status (C89 §2.1.2.2). With many implementations (such as gcc, and older unix compilers) you get whatever was in a certain register at that point. The program typically works until it's used in a makefile or other environment that checks the termination status.Wraith
This could be a real issue with the DOS protected-mode architecture. Trying to load an invalid pointer into a segment register:register pair would blow up if the segment register wasn't a valid segment. I had a long debug session with this once when my pointer was to a real-mode address. I handled it safely, the library didn't. (And the debugger was heisenbugged also--single-stepping through assembly code worked despite the invalid segment.)Indole
Gilles, you're right about the random return. The only guarantee in pre-ANSI C was that it wouldn't crash the way that -ftrapv compilation did.Lathery
T
4
  • Discretization errors due to floating point representation. For example, if you use the standard formula to solve quadratic equations, or finite differences to approximate derivatives, or the standard formula to calculate variances, precision will be lost due to the calculation of differences between similar numbers. The Gauß algorithm to solve linear systems is bad because rounding errors accumulate, thus one uses QR or LU decomposition, Cholesky decomposition, SVD, etc. Addition of floating point numbers is not associative. There are denormal, infinite and NaN values. a + b − a ≠ b.

  • Strings: Difference between characters, code points, and code units. How Unicode is implemented on the various operating systems; Unicode encodings. Opening a file with an arbitrary Unicode file name is not possible with C++ in a portable way.

  • Race conditions, even without threading: if you test whether a file exists, the result could become invalid at any time.

  • ERROR_SUCCESS = 0

Tyus answered 11/8, 2010 at 11:59 Comment(0)
O
3

A couple of things about built-in data types:

  • char and signed char are actually two distinct types (unlike int and signed int which refer to the same signed integer type).
  • Signed integers are not required to use two's complement. Ones' complement and sign+magnitude are also valid representations of negative numbers. This makes bit operations involving negative numbers implementation-defined.
  • If you assign an out-of-range integer to a signed integer variable, the behaviour is implementation-defined.
  • In C90, -3/5 could return 0 or -1. Rounding towards zero in case one operand was negative is only guaranteed in C99 upwards and C++0x upwards.
  • There are no exact size guarantees for the built-in types. The standard only covers minimal requirements such as an int has at least 16 bits, a long has at least 32 bits, a long long has at least 64 bits. A float can at least represent 6 most significant decimal digits correctly. A double can at least represent 10 most significant decimal digits correctly.
  • IEEE 754 is not mandatory for representing floating point numbers.

Admittedly, on most machines we'll have two's complement and IEEE 754 floats.

Ocam answered 11/8, 2010 at 11:59 Comment(1)
I wonder what value there is in having out-of-range integer assignments be implementation-defined rather than Undefined Behavior? On some platforms, such a requirement would force the compiler to generate extra code for int mult(int a,int b) { return (long)a*b;} [e.g. if int is 32 bits, but registers and long are 64]. Without such a requirement, the "natural" behavior of the fastest implementation of long l=mult(1000000,1000000); would set l equal to 1000000000000, even though that's an "impossible" value for an int.Leandro
L
3

How about this one:

No data pointer can ever be the same as a valid function pointer.

This is TRUE for all flat models, MS-DOS TINY, LARGE, and HUGE models, false for MS-DOS SMALL model, and almost always false for MEDIUM and COMPACT models (depends on load address, you will need a really old DOS to make it true).

I can't write a test for this.

And worse: the assumption that pointers cast to ptrdiff_t may be compared. This is not true for the MS-DOS LARGE model (the only difference between LARGE and HUGE is that HUGE adds compiler code to normalize pointers).

I can't write a test because the environment where this bombs hard won't allocate a buffer greater than 64K so the code that demonstrates it would crash on other platforms.

This particular test would pass on one now-defunct system (notice it depends on the internals of malloc):

  char *ptr1 = malloc(16);
  char *ptr2 = malloc(16);
  if ((ptrdiff_t)ptr2 - 0x20000 == (ptrdiff_t)ptr1)
      printf("We like to think that unrelated pointers are equality comparable when cast to the appropriate integer, but they're not.");
Lathery answered 11/8, 2010 at 11:59 Comment(0)
B
3

EDIT: Updated to the last version of the program

Solaris-SPARC

gcc 3.4.6 in 32 bit

We like to think that:
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09 overshifting is *always* okay
   but '(1<<BITS_PER_INT)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..13 The smallest bits always come first
   but '(t=0x1234,0x34==*(char*)&t)' is false.
..14 i++ is strictly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 72% mainstream

gcc 3.4.6 in 64 bit

We like to think that:
..05 int has the size of pointers
   but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09 overshifting is *always* okay
   but '(1<<BITS_PER_INT)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..13 The smallest bits always come first
   but '(t=0x1234,0x34==*(char*)&t)' is false.
..14 i++ is strictly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
   but 'sizeof(size_t)==sizeof(unsigned int)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 68% mainstream

and with SUNStudio 11 32 bit

We like to think that:
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..13 The smallest bits always come first
   but '(t=0x1234,0x34==*(char*)&t)' is false.
..14 i++ is strictly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
From what I can say with my puny test cases, you are 79% mainstream

and with SUNStudio 11 64 bit

We like to think that:
..05 int has the size of pointers
   but 'sizeof(int)==sizeof(void*)' is false.
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..13 The smallest bits always come first
   but '(t=0x1234,0x34==*(char*)&t)' is false.
..14 i++ is strictly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..17 size_t is unsigned int
   but 'sizeof(size_t)==sizeof(unsigned int)' is false.
From what I can say with my puny test cases, you are 75% mainstream
Balough answered 11/8, 2010 at 11:59 Comment(0)
W
2

You can use text-mode (fopen("filename", "r")) to read any sort of text file.

While this should in theory work just fine, if you also use ftell() in your code, and your text file has UNIX-style line-endings, in some versions of the Windows standard library, ftell() will often return invalid values. The solution is to use binary mode instead (fopen("filename", "rb")).

Woodhouse answered 11/8, 2010 at 11:59 Comment(0)
L
1

How about right-shifting by excessive amounts--is that allowed by the standard, or worth testing?

Does Standard C specify the behavior of the following program:

void print_string(char *st)
{
  char ch;
  while((ch = *st++) != 0)
    putch(ch);  /* Assume this is defined */
}
int main(void)
{
  print_string("Hello");
  return 0;
}

On at least one compiler I use, that code will fail unless the argument to print_string is a "char const *". Does the standard permit such a restriction?

Some systems allow one to produce pointers to unaligned 'int's and others don't. Might be worth testing.

Leandro answered 11/8, 2010 at 11:59 Comment(5)
C89 §3.3.7: “If the value of the right operand is negative or is greater than or equal to the width in bits of the promoted left operand, the behavior is undefined.” (applies to both << and >>). C99 has identical language in §6.5.7-3.Wraith
Apart from putch (why didn't you use the standard putchar?), I can't see any undefined behavior in your program. C89 §3.1.4 specifies that “a character string literal has […] type ‘array of char’” (note: no const), and that “if the program attempts to modify a string literal […], the behavior is undefined”. What compiler is that, and how does it translate this program?Wraith
In C++ string constants are not char[], they're const char[]. However... there used to be a specific hole in the type system to allow you to use a string constant in a context where a char* was expected and not get a type error. This led to situations where print_string("foo") would work but print_string("foo"+0) would not. This was deeply confusing, particularly in environments where C files are compiled using a C++ compiler by default. The hole has been removed in new compilers but there are still plenty of old ones around. AFAIK C99 still defines string constants to be char[].Toad
On the HiTech compilers for the Microchip PIC series of controllers, a pointer without a storage qualifier can only point to RAM. A const-qualified pointer may point to either RAM or ROM. Non-const-qualified pointers are dereferenced directly in the code; const-qualified pointers are dereferenced via library routine. Depending upon the particular type of PIC, non-const-qualified pointers are 1 or 2 bytes; const-qualified ones are 2 or 3. Since ROM is much more plentiful than RAM, having constants in ROM is generally a good thing.Leandro
@David Given: Note my previous comment too. I prefer compilers which use qualifiers other than "const" to denote hardware storage class; the HiTech compiler has some rather annoying quirks with its storage class allocation (e.g. data items whose "component size" is a byte, or data items which are over 256 bytes, go in a "big" segment. Other data items go in the "bss" segment for the module they're defined in; all the "bss" items in a module must fit within 256 bytes. Arrays that are slightly short of 256 bytes can be a real nuisance).Leandro
G
1

Via Codepad.org (C++: g++ 4.1.2 flags: -O -std=c++98 -pedantic-errors -Wfatal-errors -Werror -Wall -Wextra -Wno-missing-field-initializers -Wwrite-strings -Wno-deprecated -Wno-unused -Wno-non-virtual-dtor -Wno-variadic-macros -fmessage-length=0 -ftemplate-depth-128 -fno-merge-constants -fno-nonansi-builtins -fno-gnu-keywords -fno-elide-constructors -fstrict-aliasing -fstack-protector-all -Winvalid-pch) .

Note that Codepad did not have stddef.h. I removed test 9 due to codepad using warnings as errors. I also renamed the count variable since it was already defined for some reason.

We like to think that:
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..14 i++ is structly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
From what I can say with my puny test cases, you are 84% mainstream
Gant answered 11/8, 2010 at 11:59 Comment(0)
B
1

Visual Studio Express 2010 on 32-bit x86.

Z:\sandbox>cl testtoy.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.30319.01 for 80x86
Copyright (C) Microsoft Corporation.  All rights reserved.

testtoy.c
testtoy.c(54) : warning C4293: '<<' : shift count negative or too big, undefined
 behavior
Microsoft (R) Incremental Linker Version 10.00.30319.01
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:testtoy.exe
testtoy.obj

Z:\sandbox>testtoy.exe
We like to think that:
..08 overshifting is okay
   but '(1<<bits_per_int)==0' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..14 i++ is structly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 78% mainstream
Bandore answered 11/8, 2010 at 11:59 Comment(0)
R
1

Standard math functions on different systems don't give identical results.

Rooster answered 11/8, 2010 at 11:59 Comment(0)
M
1

An assumption that some may make in C++ is that a struct is limited to what it can do in C. The fact is that, in C++, a struct is like a class except that its members are public by default.

C++ struct:

struct Foo
{
  int number1_;  //this is public by default


//this is valid in C++:    
private: 
  void Testing1();
  int number2_;

protected:
  void Testing2();
};
Millsap answered 11/8, 2010 at 11:59 Comment(0)
M
1

gcc 3.3.2 on AIX 5.3 (yeah, we need to update gcc)

We like to think that:
..04 a char is signed
   but 'CHAR_MIN==SCHAR_MIN' is false.
..09a minus shifts backwards
   but '(t=-1,(15<<t)==7)' is false.
..13 The smallest bits come always first
   but '(t=0x1234,0x34==*(char*)&t)' is false.
..14 i++ is structly left to right
   but '(i=0,a[i++]=i,a[0]==1)' is false.
..15 structs are packed
   but 'sizeof(char_int)==(sizeof(char)+sizeof(int))' is false.
..16 malloc()=NULL means out of memory
   but '(malloc(0)!=NULL)' is false.
..19-3 int<long
   but 'sizeof(int)<sizeof(long)' is false.
..22 floating point is always IEEE
   but 'STDC_IEC_559_is_defined' is false.
From what I can say with my puny test cases, you are 71% mainstream
Musicology answered 11/8, 2010 at 11:59 Comment(0)
S
0

FYI, For those who have to translate their C skills to Java, here are a few gotchas.

EXPECT("03 a char is 8 bits",CHAR_BIT==8);
EXPECT("04 a char is signed",CHAR_MIN==SCHAR_MIN);

In Java, char is 16-bit and unsigned. byte is 8-bit and signed.

/* not true for Windows-64 */
EXPECT("05a long has at least the size of pointers",sizeof(long)>=sizeof(void*));

long is always 64-bit. References can be 32-bit or 64-bit (64-bit if your app has a heap of more than 32 GB); 64-bit JVMs typically use 32-bit references.

EXPECT("08 overshifting is okay",(1<<bits_per_int)==0);
EXPECT("09 overshifting is *always* okay",(1<<BITS_PER_INT)==0);

The shift is masked so that i << 64 == i == i << -64, i << 63 == i << -1

EXPECT("13 The smallest bits always come first",(t=0x1234,0x34==*(char*)&t));

ByteOrder.nativeOrder() can be BIG_ENDIAN or LITTLE_ENDIAN

EXPECT("14 i++ is strictly left to right",(i=0,a[i++]=i,a[0]==1));

i = i++ never changes i

/* suggested by David Thornley */
EXPECT("17 size_t is unsigned int",sizeof(size_t)==sizeof(unsigned int));

The size of collections and arrays is always 32-bit regardless of whether the JVM is 32-bit or 64-bit.

EXPECT("19-1 char<short",sizeof(char)<sizeof(short));
EXPECT("19-2 short<int",sizeof(short)<sizeof(int));
EXPECT("19-3 int<long",sizeof(int)<sizeof(long));

char is 16-bit, short is 16-bit, int is 32-bit and long is 64-bit.

Shrimp answered 11/8, 2010 at 11:59 Comment(0)
