Why is it said that HTTP2 is a binary protocol?

Asked 22/10, 2019 at 6:19 Answered 7/10, 2023 at 13:50

http https http2 web-development-server http-1.1

I've just read an article about the differences between HTTP/1 and HTTP/2. My main question is about the claim that HTTP/2 is a binary protocol while HTTP/1 is a textual protocol.

Maybe I'm wrong, but as far as I know any data, text or otherwise, has a binary representation in memory, and when it is transferred over a TCP/IP network it is split into units according to the layers of the OSI or TCP/IP model. That would mean that, technically, a textual format doesn't exist in the context of data transfer over a network.

I cannot really understand this difference between http2 and http1, can you help me please with a better explanation?

Abridgment answered 22/10, 2019 at 6:19 Comment(1)

Imagine a simple binary protocol (not the real HTTP/2) in which the first bit of the packet is reserved for specifying the request method, e.g. 0 for GET and 1 for POST. It's always the first one bit, by convention. On the other hand, in a textual protocol, you would need either 3 bytes or 4 bytes for 'G', 'E', 'T' or 'P', 'O', 'S', 'T' respectively. Essentially, an HTTP/2 frame is a special data structure that is compressed and serialized before it's sent over the network. It's not just plain text that is converted to bytes using ASCII table... – Phalansterian 8/4, 2024 at 6:12

Binary is probably a confusing term - everything is ultimately binary at some point in computers!

HTTP/2 has a highly structured format where HTTP messages are formatted into packets (called frames) and where each frame is assigned to a stream. HTTP/2 frames have a specific format, including a length which is declared at the beginning of each frame and various other fields in the frame header. In many ways it’s like a TCP packet. Reading an HTTP/2 frame can follow a defined process (the first 24 bits are the length of this packet, followed by 8 bits which define the frame type... etc.). After the frame header comes the payload (e.g. HTTP Headers, or the Body payload) and these will also be in a specific format that is known in advance. An HTTP/2 message can be sent in one or more frames.
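As a rough sketch of the "defined process" described above, the fixed 9-byte frame header can be decoded with a few lines of Python. The function name `parse_frame_header` and the sample bytes are made up for illustration; only the field layout (24-bit length, 8-bit type, 8-bit flags, one reserved bit plus a 31-bit stream identifier) comes from the HTTP/2 specification.

```python
import struct


def parse_frame_header(data: bytes) -> dict:
    """Decode the fixed 9-byte header that starts every HTTP/2 frame."""
    if len(data) < 9:
        raise ValueError("an HTTP/2 frame header is 9 bytes long")
    # The 24-bit length is read as a 1-byte high part and a 2-byte low part.
    length_hi, length_lo, frame_type, flags, stream = struct.unpack(
        "!BHBBI", data[:9]
    )
    return {
        "length": (length_hi << 16) | length_lo,  # payload size in bytes
        "type": frame_type,                       # 0 = DATA, 1 = HEADERS, ...
        "flags": flags,
        "stream_id": stream & 0x7FFFFFFF,         # top bit is reserved
    }


# Hypothetical header: 8-byte DATA frame, END_STREAM flag, stream 3.
sample = bytes.fromhex("00 00 08 00 01 00 00 00 03")
print(parse_frame_header(sample))
```

The parser never has to search for a delimiter: every field sits at a known offset, and the length field says exactly how many payload bytes follow.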

By contrast HTTP/1.1 is an unstructured format made up of lines of text in ASCII encoding - so yes, this is ultimately transmitted as binary, but it’s basically a stream of characters rather than being specifically broken into separate pieces/frames (other than lines). HTTP/1.1 messages (or at least the first HTTP Request/Response line and HTTP Headers) are parsed by reading in characters one at a time, until a new line character is reached. This is kind of messy as you don’t know in advance how long each line is, so you must process it character by character. In HTTP/1.1 the HTTP Body’s length is handled slightly differently, as it is typically known in advance because a content-length HTTP header defines it. An HTTP/1.1 message must be sent in its entirety as one continuous stream of data, and the connection cannot be used for anything else until that message has been completely transmitted.
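For comparison, here is a minimal sketch of delimiter-based HTTP/1.1 parsing. It is a toy under stated assumptions: `parse_http1_message` is a hypothetical helper, the whole message is already in memory, and only a Content-Length body is handled (no chunked encoding, no validation). `partition` and `split` hide the character-by-character scan, but they still have to search for every `\r\n`.

```python
def parse_http1_message(raw: bytes):
    """Split an HTTP/1.1 request into request line, headers and body."""
    # The head ends at the first blank line; its position is not known
    # in advance, so the bytes have to be searched for the delimiter.
    head, _, rest = raw.partition(b"\r\n\r\n")
    lines = head.split(b"\r\n")
    method, target, version = lines[0].decode("ascii").split(" ")

    headers = {}
    for line in lines[1:]:
        name, _, value = line.decode("ascii").partition(":")
        headers[name.strip().lower()] = value.strip()

    # Only the body has a declared length, via the Content-Length header.
    body = rest[: int(headers.get("content-length", 0))]
    return method, target, version, headers, body


raw = (
    b"POST /foo HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
print(parse_http1_message(raw))
```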

The advantage that HTTP/2 brings is that, by packaging messages into specific frames we can intermingle the messages: here’s a bit of request 1, here’s a bit of request 2, here’s some more of request 1... etc. In HTTP/1.1 this is not possible as the HTTP message is not wrapped into packets/frames tagged with an id as to which request this belongs to.
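A simplified model of that intermingling, assuming each frame is reduced to a `(stream_id, payload, end_stream)` tuple (real frames carry more fields, and `reassemble` is a name invented for this sketch):

```python
from collections import defaultdict


def reassemble(frames):
    """Rebuild complete messages from frames that arrive interleaved."""
    buffers = defaultdict(bytearray)
    finished = {}
    for stream_id, payload, end_stream in frames:
        # The stream id tells the receiver which message a chunk belongs to.
        buffers[stream_id] += payload
        if end_stream:
            finished[stream_id] = bytes(buffers.pop(stream_id))
    return finished


# A bit of stream 1, a bit of stream 3, more of stream 1, and so on.
frames = [
    (1, b"<html>", False),
    (3, b"body{", False),
    (1, b"</html>", True),
    (3, b"}", True),
]
print(reassemble(frames))
```

Without the stream id on every chunk, as in HTTP/1.1, the receiver would have no way to tell which message an arriving piece belongs to.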

I’ve a diagram here and an animated version here that help conceptualise this better.

Photocomposition answered 22/10, 2019 at 22:42 Comment(5)

Then how about the JSON body payload? Is it also in binary? In addition, for HTTP/1.1 with content-type: image/gif (octet-stream), is the body also encoded as textual or binary? – Mucilaginous 12/4, 2022 at 12:39

Bodies are sent in DATA frames, each of which includes a header giving details like frame type, stream… etc. The payload within that can be text for text formats, but not for binary formats like GIF. However, text resources are typically compressed with gzip or Brotli, so they will be binary data. – Photocomposition 12/4, 2022 at 13:52

So basically text resources stay text, which has to be translated to bytes character by character. Integers are still not supported in the text payload. The good thing is that we can use gzip (Huffman coding and LZ77) to shorten the binary length. – Mucilaginous 13/4, 2022 at 1:19

What is special about intermingling frames compared to, say, all frames of request 1 going first and then immediately all frames of request 2 going down the same TCP connection in HTTP/2? To me, intermingling of frames just seems like reordering different parts of different requests. Why do frames have to intermingle, and why can't all the frames of request 1 be sent together sequentially, followed by all frames of request 2? Why do gaps have to be present between frames of a single request (and how do they occur)? – Mas 30/5, 2023 at 21:32

Intermingling helps when request 1 isn't available (e.g. it's a request to a CDN and has to go all the way back to the origin). Still being able to send back the CSS and JS while that is blocked avoids waste. It's true for certain requests (e.g. CSS or JS) the full payload needs to be delivered so actual intermingling doesn't help much but for others (e.g. progressive JPEG or HTML), intermingling allows part of the resources to be sent first and the browser to start processing them. See: blog.cloudflare.com/… – Photocomposition 31/5, 2023 at 4:5

HTTP basically encodes all relevant instructions as ASCII code points, e.g.:

GET /foo HTTP/1.1

Yes, this is represented as bytes on the actual transport layer, but the commands are based on ASCII bytes, and are hence readable as text.
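A small Python sketch makes this concrete: every byte of the request line is an ASCII code point, so decoding the raw bytes gives back readable text.

```python
request_line = b"GET /foo HTTP/1.1\r\n"

# Hex dump of the bytes on the wire; it starts with 47 45 54 ("G", "E", "T").
print(request_line.hex(" "))

# Each byte is an ASCII code point, so the bytes decode straight to text.
print(repr(request_line.decode("ascii")))
```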

HTTP/2 uses actual binary commands, i.e. individual bits and bytes which have no representation other than the bits and bytes that they are, and hence have no readable representation. (Note that HTTP/2 essentially wraps HTTP/1 in such a binary protocol, there's still "GET /foo" to be found somewhere in there.)

Beer answered 22/10, 2019 at 6:30 Comment(2)

There's a method GET and a path /foo, yes, but not actually a GET /foo ... – Transude 22/10, 2019 at 6:34

But the headers like the next ones are encoded in HTTP2 in binary format (non ASCII) or just the command header with the verbs GET, POST, PUT, DELETE?: text/html,application/xhtml+xml,application/xml;q=0.9,/;q=0.8 Accept-Language: en-us,en;q=0.5 Accept-Encoding: gzip,deflate Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7 Keep-Alive: 300 Connection: keep-alive Cookie: PHPSESSID=r2t5uvjq435r4q7ib3vtdjq120 Pragma: no-cache Cache-Control: no-cache – Floriated 5/2, 2024 at 20:14

Conclusion

HTTP/2 is a binary protocol: it transmits messages as binary frames, in contrast to HTTP/1.x, which transmits them as lines of text.

Binary formats are more efficient to process because fields have fixed positions and declared lengths, so the receiver does not have to scan for delimiters and parse text.

Example

For instance, in HTTP/1, request header information is sent as text, which the receiver must scan and split into fields before using it. In HTTP/2, request header information arrives in binary frames whose fields sit at fixed positions. For example (the bytes are illustrative):

00 00 09                   ; Frame length: 9 (payload only)
01                         ; Frame type: HEADERS
04                         ; Flags: END_HEADERS
00 00 00 01                ; Stream Identifier: 1
82                         ; HPACK indexed field: ":method: GET"
87 01 84 8D 4E 3D 6F C8    ; Remaining HPACK-encoded header data

In this example:

The first three bytes 00 00 09 give the payload length of 9 bytes (the 9-byte frame header itself is not counted).
The fourth byte 01 gives the frame type, HEADERS.
The fifth byte 04 is the flags field; END_HEADERS indicates that this is the last frame carrying this request's header information.
The next four bytes 00 00 00 01 hold one reserved bit and the 31-bit stream identifier, here 1, the first stream a client opens.
The first payload byte 82 is an HPACK indexed header field: index 2 in the static table, which stands for ":method: GET".
The final eight bytes 87 01 84 8D 4E 3D 6F C8 represent the rest of the HPACK-compressed request header information.

Therefore, HTTP/2's binary framing is cheaper to parse, and HPACK header compression results in smaller header sizes.
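To illustrate where the size saving comes from, here is a hedged sketch of HPACK's simplest case, the one-byte indexed header field. A byte with its top bit set carries a 7-bit index into a table of common headers. Only the first seven entries of the static table (RFC 7541, Appendix A) are listed, `decode_indexed` is a name invented for this sketch, and the dynamic table, literals and Huffman coding are all ignored.

```python
# First seven entries of the HPACK static table (RFC 7541, Appendix A).
STATIC_TABLE = {
    1: (":authority", ""),
    2: (":method", "GET"),
    3: (":method", "POST"),
    4: (":path", "/"),
    5: (":path", "/index.html"),
    6: (":scheme", "http"),
    7: (":scheme", "https"),
}


def decode_indexed(byte: int):
    """Decode a one-byte indexed header field (top bit set, 7-bit index)."""
    if not byte & 0x80:
        raise ValueError("top bit not set: not an indexed header field")
    index = byte & 0x7F
    if index not in STATIC_TABLE:
        raise KeyError(f"index {index} is outside this partial table")
    return STATIC_TABLE[index]


print(decode_indexed(0x82))  # (':method', 'GET')
print(decode_indexed(0x87))  # (':scheme', 'https')
```

One byte replaces a whole header name and value, which is where much of the saving on common headers comes from.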

Copestone answered 13/6, 2023 at 4:43 Comment(0)

Textual data means data that can be read as characters. Of course, everything stored on a computer is bits; each character in the text is encoded with a standard, globally accepted encoding such as UTF-8 or ASCII. Such an encoding assigns each character (symbol) a unique code point, which is represented by a byte or a group of bytes, and the browser uses the same mapping to decode the bytes back into characters when we read it.

Binary data means data is represented as groups of bytes whose meaning is defined by the protocol's own encoding/decoding rules or schema.

For example, the JSON message format is textual but the protobuf message format is binary. Since JSON is textual, any program can serialize its objects into a string, and no predefined schema is required. With protobuf, however, the schema needs to be present on both client and server so that data can be encoded and decoded.

protobuf example -> schema -> message Employee { int32 age = 1; int32 salary = 2; }. Here the schema says that an Employee has a 32-bit integer field age with field number 1 and a second 32-bit integer field salary with field number 2.
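To make the contrast concrete, here is a hand-rolled sketch of how the two integer fields of the employee message above would look in the protobuf wire format, next to the same data as JSON. It does not use the protobuf library: `encode_varint` and `encode_employee` are hypothetical helpers that only handle non-negative values, and the sample values 30 and 5000 are made up.

```python
import json


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    if n < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def encode_employee(age: int, salary: int) -> bytes:
    """Encode fields 1 (age) and 2 (salary), both with wire type 0 (varint)."""
    fields = ((1, age), (2, salary))
    return b"".join(
        encode_varint((number << 3) | 0) + encode_varint(value)
        for number, value in fields
    )


binary = encode_employee(age=30, salary=5000)
text = json.dumps({"age": 30, "salary": 5000}, separators=(",", ":")).encode()

print(binary.hex(" "), len(binary))  # 08 1e 10 88 27 5
print(text, len(text))
```

The binary form carries no field names, only field numbers, so it is meaningless without the schema, whereas the JSON text describes itself.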

HTTP/2 is a binary protocol because it introduces a binary framing layer at layer 7, between the socket interface and the HTTP APIs above it. This framing comes with a fixed schema (how the fields are arranged), which is used to convert the plain-text message into binary data (called frames in HTTP/2 terminology). Both sender and receiver must support HTTP/2, i.e. they must both know the binary framing schema, so that data can be encoded and decoded much faster than unstructured data. This is why an HTTP/1.1-only client cannot talk to a server that only speaks HTTP/2.

In HTTP/1.1 the data is plain text that has to be read one character at a time, up to the line ending (\r\n), before it makes any sense. Also, the complete message is passed as one unit to the lower layers, where it is segmented.

Smallpox answered 7/10, 2023 at 13:50 Comment(0)

-3

I believe the primary reason HTTP/2 uses binary encoding is to pack the payload into fixed-size frames. Plain text cannot fit exactly into a frame, so binary encoding the data and splitting it into multiple frames makes a lot more sense.

Endeavor answered 27/9, 2020 at 17:48 Comment(0)
