I’m having trouble understanding network byte ordering and the order in which data is sent and received over UDP.
Data is sent and received over UDP in exactly the byte order given to the socket. You only send or receive arrays of bytes and UDP doesn't reorder the bytes within a datagram at all.
So then the question is, what to do about endianness?
Well, in many cases, the answer is "nothing". For one, most computers you'll run into today use the x86 architecture, which is always little-endian. And in many scenarios, you have control over both ends anyway and so can always stick with one or the other. If your API has a way to convert things to and from streams of bytes, you can just do that and send and receive those bytes directly with your UDP socket.
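As a rough illustration of the "do nothing" case, here is a minimal Python sketch that packs and unpacks a record using whatever byte order the host uses. The layout (a 16-bit value followed by a 32-bit value) and the names `encode` and `decode` are assumptions made up for the example, not taken from your code.

```python
import struct
import sys

# "=" means native byte order with standard sizes and no padding.
# Hypothetical layout: a 16-bit value followed by a 32-bit value.
LAYOUT = "=HI"

def encode(start_id, message_id):
    """Turn the two fields into the bytes that would be handed to the socket."""
    return struct.pack(LAYOUT, start_id, message_id)

def decode(payload):
    """Recover the two fields from received bytes."""
    return struct.unpack(LAYOUT, payload)

payload = encode(0x1234, 0xAABBCCDD)
print(sys.byteorder, payload.hex())
print(decode(payload))
```

This only works as long as both ends really do share the same byte order; the output of `payload.hex()` differs between little-endian and big-endian hosts.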
But yes, sometimes you need to be able to deal with transmitting data in an endianness that is different from that natively supported on the architecture on which your program is running.
In your particular example, you seem to have chosen to use big-endian as the byte order for your protocol (I'm inferring this from the little bit of code…it's not really possible to know for sure without a good Minimal, Complete, and Verifiable example). Which is fine; the original BSD socket API includes the concept of "network byte order", which is big-endian. The socket library includes functions to convert, e.g. 16-bit integers from "host order" to "network order" and back.
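For reference, Python exposes the same host/network conversion functions in its `socket` module, and the `struct` module can produce network order directly. This is only a sketch of the idea, with an arbitrary sample value:

```python
import socket
import struct

value = 0x1234

# htons/ntohs convert a 16-bit integer between host and network order.
# On a little-endian host they swap the two bytes; on a big-endian host
# they return the value unchanged. Either way, the round trip is lossless.
wire_value = socket.htons(value)
round_trip = socket.ntohs(wire_value)

# struct's "!" prefix always produces network (big-endian) byte order,
# regardless of the host architecture.
packed16 = struct.pack("!H", value)
packed32 = struct.pack("!I", 0x01020304)

print(round_trip == value)  # True
print(packed16.hex())       # 1234
print(packed32.hex())       # 01020304
```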
But it's important to understand that endianness affects your data at the level of each individual primitive in your data. You can't reverse the entire array of bytes all at once, because if you do, that will change not just the endianness of each individual primitive, but also the order of the primitives themselves. And sure enough, you see this in your Wireshark trace: the two fields in your data structure have had their order swapped.
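The effect is easy to reproduce. In the sketch below, the field names `start_id` and `message_id` come from this thread, but their sizes (16-bit and 32-bit) and values are assumptions chosen to make the bytes easy to follow:

```python
import struct

# Assumed layout: 16-bit start_id followed by 32-bit message_id.
start_id, message_id = 0x0102, 0x0A0B0C0D

# Little-endian encoding, as a little-endian host would produce it.
little = struct.pack("<HI", start_id, message_id)

# Reversing the whole buffer flips every field's bytes,
# but it also flips the order of the fields.
reversed_all = little[::-1]

# What a big-endian protocol actually expects: same field order,
# each field's bytes reversed individually.
correct_big = struct.pack(">HI", start_id, message_id)

print(little.hex())        # 02010d0c0b0a
print(reversed_all.hex())  # 0a0b0c0d0102
print(correct_big.hex())   # 01020a0b0c0d
```

The reversed buffer is a valid big-endian encoding, just of the wrong structure: it decodes as `message_id` followed by `start_id`.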
To handle endianness properly, you have to go through each field of your data structure and swap the bytes individually. You'll leave bytes alone, short values (16-bit) will have their pair of bytes swapped, int values (32-bit) will have their four-byte sequence reversed, long values (64-bit) will have eight bytes reversed, and so on.
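A field-by-field swap along those lines could be sketched like this. The record layout in `FIELDS` and the helper name `swap_fields` are hypothetical; a real protocol would substitute its own field list.

```python
import struct

# Hypothetical record layout: (field name, size in bytes), in wire order.
FIELDS = [("flags", 1), ("start_id", 2), ("message_id", 4), ("timestamp", 8)]

def swap_fields(buf):
    """Reverse the bytes of each field separately, keeping the field order."""
    out = bytearray()
    offset = 0
    for _name, size in FIELDS:
        # A 1-byte field is unchanged; larger fields have their bytes reversed.
        out += buf[offset:offset + size][::-1]
        offset += size
    return bytes(out)

little = struct.pack("<BHIQ", 0x01, 0x0203, 0x04050607, 0x08090A0B0C0D0E0F)
big = swap_fields(little)

print(little.hex())  # 01030207060504 0f0e0d0c0b0a0908 without the space
print(big.hex())     # 0102030405060708090a0b0c0d0e0f
```

Applying `swap_fields` a second time converts back, since reversing each field twice restores the original bytes. In practice, a format string such as `">BHIQ"` lets `struct` do the same per-field work for you.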
To make matters more complicated, some data structures aren't affected by endianness at all (e.g. UTF-8-encoded text), while others have more complex rules (e.g. a Windows GUID/UUID, which is a 128-bit value that is actually defined as a complex data structure, having multiple fields of varying size).
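Both cases can be seen with Python's standard library. This is just an illustration using an arbitrary sample string and UUID: `uuid.UUID.bytes` gives the all-big-endian form, while `bytes_le` gives the form in which the first three fields (4, 2, and 2 bytes) are little-endian and the remaining 8 bytes are left in order.

```python
import uuid

# UTF-8 is defined as a sequence of bytes, so there is nothing to swap.
encoded = "héllo".encode("utf-8")
print(encoded.hex())  # 68c3a96c6c6f

u = uuid.UUID("00112233-4455-6677-8899-aabbccddeeff")

# Big-endian form: the bytes appear in the same order as the text form.
print(u.bytes.hex())     # 00112233445566778899aabbccddeeff

# Mixed form: the first three fields are byte-swapped individually,
# the last 8 bytes are untouched.
print(u.bytes_le.hex())  # 33221100554477668899aabbccddeeff
```

Note that the second form is not a simple reversal of the 16 bytes, which is exactly why a GUID has to be treated as a structure rather than as one large integer.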
The important thing to remember is that the endianness is always applied at the level of each individual primitive data value, taking into account the actual number of bytes that primitive data value uses.
start_id and message_id), and it is at the byte level (so in a 4-byte int, the 1st with the 4th byte are exchanged, and the 2nd with the 3rd) – Jeaniejeanine