Portability of C code for different memory addressing schemes

Asked 11/4, 2012 at 14:39 Answered 11/4, 2012 at 20:8

If I understand correctly, the DCPU-16 specification for 0x10c describes a 16-bit address space where each offset addresses a 16-bit word, instead of a byte as in most other memory architectures. This has some curious consequences, e.g. I imagine that sizeof(char) and sizeof(short) would both return 1.

Is it feasible to keep C code portable between such different memory addressing schemes? What would be the gotchas to keep in mind?

edit: perhaps I should have given a more specific example. Let's say you have some networking code that deals with byte streams. Do you throw away half of your memory by putting only one byte at each address so that the code can stay the same, or do you generalize everything with bitshifts to deal with N bytes per offset?

edit2: The answers seem to focus on the issue of data type sizes, which wasn't the point - I shouldn't even have mentioned it. The question is about how to cope with losing the ability to address any byte in memory with a pointer. Is it reasonable to expect code to be agnostic about this?

Oratory answered 11/4, 2012 at 14:39 Comment(2)

So long as you don't do things like assume CHAR_BIT is always 8 then there isn't a huge issue. – Hamby 11/4, 2012 at 14:42

So long as you follow the C standard and do not make assumptions about things that the standard says are variable or result in undefined or unspecified or implementation-defined behavior or results, you should be fine. – Tonsorial 11/4, 2012 at 15:29

It's entirely feasible. Roughly speaking, the sizes of C's basic integer types satisfy:

sizeof (char) <= sizeof (short) <= sizeof (int) <= sizeof (long)

The above is not exactly what the standard says (it specifies minimum value ranges for each type rather than an ordering of sizes), but it's close.

As pointed out by awoodland in a comment, you'd also expect a C compiler for the DCPU-16 to have CHAR_BIT == 16.

Bonus points for not assuming that the DCPU-16 would have sizeof (char) == 2; that's a common fallacy, since sizeof (char) is 1 by definition.
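To make the point concrete, here is a minimal sketch of how portable code can state what it relies on instead of assuming 8-bit chars. The BITS_OF macro and the chars_for_octets helper are hypothetical names made up for illustration; only CHAR_BIT, sizeof and _Static_assert (C11) come from the language itself.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Both of these hold on every conforming implementation, DCPU-16 included. */
_Static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
_Static_assert(CHAR_BIT >= 8, "a char has at least 8 bits");

/* Storage width of a type in bits rather than in chars (padding bits included). */
#define BITS_OF(type) (sizeof(type) * CHAR_BIT)

/* Hypothetical helper: how many chars are needed to store n_octets
   8-bit values if they are packed as tightly as CHAR_BIT allows. */
static size_t chars_for_octets(size_t n_octets)
{
    size_t per_char = CHAR_BIT / 8;  /* 1 when CHAR_BIT == 8, 2 when CHAR_BIT == 16 */
    return (n_octets + per_char - 1) / per_char;
}
```

On a compiler with CHAR_BIT == 16, BITS_OF(char) would be 16 while sizeof(char) stays 1, which is exactly the distinction the fallacy misses.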

Menhir answered 11/4, 2012 at 14:43 Comment(1)

It should be mentioned that sizeof(char) is always 1. – Peaceful 11/4, 2012 at 15:13

When you say, 'losing the ability to address a byte', I assume you mean 'bit-octet', rather than 'char'. Portable code should only assume CHAR_BIT >= 8. In practice, architectures that don't have byte addressing often define CHAR_BIT == 8, and let the compiler generate instructions for accessing the byte.

I actually disagree with the answers suggesting CHAR_BIT == 16 as a good choice. I'd prefer CHAR_BIT == 8, with sizeof(short) == 2. The compiler can handle the shifting and masking needed for byte access in this case, just as it does for many RISC architectures.

I imagine Notch will revise and clarify the DCPU-16 spec further; there are already requests for an interrupt mechanism, and further instructions. It's an aesthetic backdrop for a game, so I doubt there will be an official ABI spec any time soon. That said, someone will be working on it!

Edit:

Consider an array of char in C. The compiler packs 2 bytes into each native 16-bit word of DCPU memory. So to access, say, the 10th element (index 9), we fetch word number 9 / 2 = 4 and extract byte number 9 % 2 = 1 from it.

Let 'X' be the start address of the array, and 'I' be the index:

SET J, I
SHR J, 1    ; J = I / 2
ADD J, X    ; J holds word address
SET A, [J]  ; A holds word
AND I, 0x1  ; I = I % 2 {0 or 1}
MUL I, 8    ; I = {0 or 8} ; could use: SHL I, 3
SHR A, I    ; right shift by I bits for hi or lo byte.

The register A now holds the 'byte' in its low half. It's a 16-bit register, and when the low byte was selected (shift of 0) the top half still contains the neighbouring byte, so the top half must either be ignored or zeroed:

AND A, 0xff ; mask lo byte.

This is not optimized, but it conveys the idea.
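For readers more comfortable with C than DCPU-16 assembly, the same idea can be sketched as follows. This is only an illustration running on an ordinary host: the function names get_octet and set_octet are made up, and the layout (even index in the low half of the word, odd index in the high half) is an assumption chosen to match the shift amounts used above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the 8-bit value at index i from an array of 16-bit words,
   two values per word: even i in the low half, odd i in the high half. */
static unsigned get_octet(const uint_least16_t *words, size_t i)
{
    unsigned word = words[i >> 1];             /* word index = i / 2 */
    unsigned shift = (unsigned)(i & 1u) * 8u;  /* 0 or 8 */
    return (word >> shift) & 0xFFu;            /* shift, then mask the low byte */
}

/* Write the 8-bit value at index i, leaving the other half of the word intact. */
static void set_octet(uint_least16_t *words, size_t i, unsigned value)
{
    unsigned word = words[i >> 1];
    unsigned shift = (unsigned)(i & 1u) * 8u;
    unsigned mask = 0xFFu << shift;

    word = (word & ~mask & 0xFFFFu) | ((value & 0xFFu) << shift);
    words[i >> 1] = (uint_least16_t)word;
}
```

A compiler that chose CHAR_BIT == 8 on a word-addressed machine would have to emit something equivalent to this for every char load and store.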

Nerine answered 11/4, 2012 at 19:58 Comment(4)

So it is possible for C compilers to emulate byte-addressable memory? Interesting! – Oratory 12/4, 2012 at 8:29

...but wouldn't that halve the available memory if you stick to a 16-bit address space? Or would the compiler add bits to the addresses - in this case, one extra bit to address the low or high byte? – Oratory 12/4, 2012 at 8:43

I think I already understood that. I meant that the largest possible pointer 0xFFFF in the emulated byte-addressable memory would be translated to the word address 0x7FFF (followed by a bitshift). That leaves half of the memory non-addressable. Unless you use 17-bit pointers. – Oratory 12/4, 2012 at 14:39

@WimCoenen - yes, you're right. And not being able to address all 8-bit bytes in memory makes DCPU-16 a poor candidate for a 'reasonable' C environment. A 16-bit 'char' makes simple byte processing code waste already limited memory. OTOH, if it's just a matter of porting C code, then hacking 0x10c sort of loses its appeal. – Nerine 12/4, 2012 at 18:14

The relation actually goes like this:

1 == sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)

sizeof(short) can be 1, and as a matter of fact you may even want sizeof(int) to be 1 as well (I didn't read the spec, but I'm assuming the natural data type is 16 bits wide). This is all defined by the compiler.

For practicality, the compiler may want to make long larger than int, even if that requires the compiler to do some extra work (such as implementing addition, multiplication, etc. in software).

This isn't a memory addressing issue, but rather a granularity question.
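As a rough idea of the "extra work" mentioned above, here is a sketch of a 32-bit addition built from 16-bit halves with an explicit carry, which is roughly what a compiler would have to generate for long on a 16-bit machine. The struct u32_pair and the function add32 are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical representation of a 32-bit value as two 16-bit words. */
struct u32_pair {
    uint_least16_t lo;
    uint_least16_t hi;
};

/* Add two 32-bit values using only 16-bit results and a carry bit.
   The result wraps modulo 2^32, like unsigned arithmetic in C. */
static struct u32_pair add32(struct u32_pair a, struct u32_pair b)
{
    struct u32_pair r;
    unsigned lo = ((unsigned)a.lo + (unsigned)b.lo) & 0xFFFFu;
    unsigned carry = (lo < a.lo) ? 1u : 0u;  /* the low half wrapped around */
    unsigned hi = ((unsigned)a.hi + (unsigned)b.hi + carry) & 0xFFFFu;

    r.lo = (uint_least16_t)lo;
    r.hi = (uint_least16_t)hi;
    return r;
}
```

For example, adding 1 to 0x0001FFFF wraps the low word to 0 and carries into the high word, giving 0x00020000.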

Cloddish answered 11/4, 2012 at 18:0 Comment(0)

Yes, it is entirely possible to port C code.

In terms of data transfer, it would be advisable either to pack the bits (or use compression) or to send the data as 16-bit bytes.
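One way the packing option might look, as a minimal sketch: the function name pack_octets is hypothetical, and the choice to put the first octet of each pair in the low half of the word is an assumption (any real protocol would have to fix that order on both ends).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack n 8-bit values (one per element of octets) into 16-bit words,
   two per word, first value in the low half. If n is odd, the high half
   of the last word is zero. Returns the number of words written;
   words must have room for (n + 1) / 2 elements. */
static size_t pack_octets(const unsigned char *octets, size_t n,
                          uint_least16_t *words)
{
    size_t n_words = (n + 1) / 2;
    size_t i;

    for (i = 0; i < n_words; i++) {
        unsigned lo = octets[2 * i] & 0xFFu;
        unsigned hi = (2 * i + 1 < n) ? (octets[2 * i + 1] & 0xFFu) : 0u;
        words[i] = (uint_least16_t)(lo | (hi << 8));
    }
    return n_words;
}
```

The trade-off is the one raised in the question: packing halves the memory needed for byte data, at the cost of a shift and a mask on every access.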

Because the CPU will communicate almost exclusively with (in-game) internal devices that are likely to be 16-bit as well, this should not be a real problem.

By the way, I agree that CHAR_BIT should be 16: as far as I recall, each char must be addressable, so making CHAR_BIT == 8 would require sizeof(char*) == 2, which would make everything else overcomplicated.

Gomez answered 11/4, 2012 at 20:8 Comment(0)
