Maximum size of email X-Headers

We are looking at sticking some metadata into the X-Headers of email messages. These emails are for consumption by internal systems, and will be hosted on an Exchange server.

Is there a maximum size for the amount of data that we can store in an X-Header?

Are there any limitations, such as special characters, that I should know about?

Gwendolin answered 27/4, 2010 at 13:32 Comment(0)

US ASCII characters only.

This is defined by RFC 822.
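As a rough illustration of the character restriction, here is a minimal Python sketch. The function names are hypothetical, and the ranges assume the RFC 5322 rule that a field name is printable US-ASCII (codes 33 to 126) without a colon, while an unencoded body is printable US-ASCII plus space and tab. Non-ASCII data would need an encoding step first (for example the RFC 2047 encoded words mentioned in the comments).

```python
def is_valid_field_name(name: str) -> bool:
    """Check a header name: printable US-ASCII (33-126), no colon, not empty."""
    return len(name) > 0 and all(
        33 <= ord(ch) <= 126 and ch != ":" for ch in name
    )


def is_ascii_field_body(body: str) -> bool:
    """Check an unencoded header body: printable US-ASCII plus space and tab."""
    return all(ch in " \t" or 33 <= ord(ch) <= 126 for ch in body)


if __name__ == "__main__":
    # "X-Metadata" is an example header name, not one taken from the question.
    print(is_valid_field_name("X-Metadata"))
    print(is_ascii_field_body("id=42; tag=alpha"))
```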

There is no limit on the length of a header body in the standard, though there is a line length limit, which imposes a limit on the length of the header name.

There are two limits that this specification places on the number of characters in a line. Each line of characters MUST be no more than 998 characters, and SHOULD be no more than 78 characters, excluding the CRLF.

You can, however, extend the header body beyond a single line with "folding". The receiver then "unfolds" the lines.

An unfolded header field has no length restriction and therefore may be indeterminately long.

The header name cannot be folded, so the header name cannot be longer than the line limit.
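To make folding concrete, here is a minimal Python sketch, not a full RFC 5322 implementation. The helper names `fold_header` and `unfold_header` are hypothetical, the sketch only folds at single spaces, and it assumes the 78 and 998 character limits quoted above (both excluding the CRLF).

```python
CRLF = "\r\n"
SOFT_LINE = 78   # SHOULD limit per line, excluding CRLF
MAX_LINE = 998   # MUST limit per line, excluding CRLF


def fold_header(name: str, value: str, width: int = SOFT_LINE) -> str:
    """Fold 'name: value' at spaces so that lines stay within `width` where possible."""
    lines = []
    current = name + ":"
    for word in value.split(" "):
        if len(current) + 1 + len(word) > width:
            # Start a continuation line; it must begin with whitespace.
            lines.append(current)
            current = " " + word
        else:
            current = current + " " + word
    lines.append(current)
    for line in lines:
        # A single unbreakable token (or a very long name) can still exceed the hard limit.
        if len(line) > MAX_LINE:
            raise ValueError("line exceeds 998 characters and cannot be folded")
    return CRLF.join(lines)


def unfold_header(folded: str) -> str:
    """Unfold by removing each CRLF that is immediately followed by a space or tab."""
    return folded.replace(CRLF + " ", " ").replace(CRLF + "\t", "\t")


if __name__ == "__main__":
    # Example metadata; the header name and value are made up for illustration.
    value = " ".join(["key=value"] * 50)
    folded = fold_header("X-Metadata", value)
    print(len(folded.split(CRLF)), "physical lines")
    print(unfold_header(folded) == "X-Metadata: " + value)
```

The unfolded value is as long as the original, which is the point: the limit applies to each physical line, not to the logical header field.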

Note that even though the standards have no limitation on the total length of the header body, actual implementations may have imposed artificial limitations.

Carboloy answered 27/4, 2010 at 14:5 Comment(6)
Just a note: RFC 5322 is the latest RFC defining the standard email format. It supersedes RFC 822 and 2822. - Baudelaire
RFC 5322 specifies line length SHOULD be less than 78 characters and MUST be 998 or less, including CRLF. (This results in a limit on the length of the header name, which cannot be folded, to 74 characters.) There is no limit to the number of folds in a header field body, although there are probably practical limitations. - Esurient
RFC 2047 adds further constraints to header line length. In particular, it says lines containing encoded words MUST be no longer than 76 octets. These constraints only apply to header lines where RFC 2047 encoding is used. - Elstan
@danorton, I updated my answer based on RFC 5322, and it explicitly states that header bodies can be folded. While there are line length limits, there is no (total) length limit. - Carboloy
@Esurient is incorrect: 74 is not the limit on names. Header names MUST be no bigger than 997, but they SHOULD be no bigger than 75 (before the colon). - Elstan
A certain consumer fluidics company likes to push this boundary. - Elstan

The answer provided by Marcus Adams is correct in the numbers, but I wanted to provide a bit more curated history and (hopefully) clarity.

TL;DR

I believe the 1000 character limit originally comes from RFC #821, not RFC #822. And it only enters the specification of Internet Message Formats in RFC #2822.

Network Message Formats vs Simple Mail Transfer Protocol

Just to make sure we are on the same page with terminology and stacks, I wanted to call out the differences here.

Network Message Formats

RFC #822 has a long history, starting at RFC #733 (1977) -> RFC #822 (1982) -> RFC #2822 (2001) -> RFC #5322 (2008) -> RFC #6854 (2013), with updates and extensions along the way, such as RFC #4021, RFC #5335 and RFC #5336.

All of these refer to the "message format" for "network messages" (aka email), with titles referring to "ARPA NETWORK TEXT MESSAGES", "ARPA INTERNET TEXT MESSAGES", "Internet Message Format", "Mail and MIME Header Fields", and more.

Simple Mail Transfer Protocol

However, there is also the Simple Mail Transfer Protocol (SMTP) itself, which describes how to move (relay, transfer, etc.) email around the internet. It is defined in RFC #821 (1982). I bring up SMTP because this is where we first see the size limits:

SMTP Size Limits
4.5.3.  SIZES
There are several objects that have required minimum maximum sizes.
* Note, the following are summaries, not a direct quote
user - 64 characters
domain - 64 characters
path - 256 characters
command line - 512 characters (including <CRLF>)
reply line - 512 characters (including <CRLF>)
text line - 1000 characters (including <CRLF>)
recipients buffer - 100 recipients

With an interesting note for the curious:

TO THE MAXIMUM EXTENT POSSIBLE, IMPLEMENTATION TECHNIQUES WHICH IMPOSE NO LIMITS ON THE LENGTH OF THESE OBJECTS SHOULD BE USED.

However, no limits in "Message Format" RFCs

First, RFC #822 does not specify any constraints on the length of a line, as far as I can see.

It does offer some guidelines for readability, but goes so far as to say,

3.4.8. FOLDING LONG HEADER FIELDS
however, the limit is not imposed by this standard.

So I also looked at RFC #733 (which RFC #822 obsoletes) and this too recommends, but does not impose:

[Section III B.g] 
The former length is recommended as a limit, but it is not imposed by this standard

Although the referenced answer and comments point to RFC #5322 as imposing the limits, it was actually RFC #2822 that first introduced them. (Again, the answer isn't wrong; I am just providing the path to it here.)

So what does SMTP have to do with message format size limits?

According to RFC #822, in

1.2. COMMUNICATION FRAMEWORK
Messages consist of lines of text

And further, in

3.4.8. FOLDING LONG HEADER FIELDS
Each header field may be represented on exactly one line  consisting 
of the name of the field and its body, and terminated by a CRLF; this
is what the parser sees.

We can start connecting the line length limitations on headers to SMTP.

Ignoring folding, which does not change what a line means (just how to display it), RFC #822 explains that messages are just lines of text, and a header must be on exactly one line. But we need to go to RFC #821 to see the maximum line length is 1000 characters.
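If this reading is right, the numbers line up with simple arithmetic: the 1000 character SMTP text line includes the two-character CRLF, which leaves 998 characters of content, the same figure quoted from the message format RFCs. A minimal Python sketch of that check follows; the constant and function names are mine, not from any RFC or library.

```python
CRLF = b"\r\n"
SMTP_TEXT_LINE_MAX = 1000  # RFC #821 section 4.5.3, including <CRLF>

# Content left once the line terminator is accounted for.
MESSAGE_LINE_MAX = SMTP_TEXT_LINE_MAX - len(CRLF)


def line_fits_smtp(line: bytes) -> bool:
    """Return True if `line` (given without its CRLF) fits in an SMTP text line."""
    return len(line) + len(CRLF) <= SMTP_TEXT_LINE_MAX


if __name__ == "__main__":
    print(MESSAGE_LINE_MAX)
    print(line_fits_smtp(b"x" * 998), line_fits_smtp(b"x" * 999))
```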

I hope, first, that I am correct on this and, second, that it provides some clarity.

Insect answered 27/2, 2022 at 20:8 Comment(1)
This is a fantastic summary! Thanks for the clarity.Brick
