How to encode the filename parameter of Content-Disposition header in HTTP?
I

22

630

Web applications that want to force a resource to be downloaded rather than directly rendered in a Web browser issue a Content-Disposition header in the HTTP response of the form:

Content-Disposition: attachment; filename=FILENAME

The filename parameter can be used to suggest a name for the file into which the resource is downloaded by the browser. RFC 2183 (Content-Disposition), however, states in section 2.3 (The Filename Parameter) that the file name can only use US-ASCII characters:

Current [RFC 2045] grammar restricts parameter values (and hence Content-Disposition filenames) to US-ASCII. We recognize the great desirability of allowing arbitrary character sets in filenames, but it is beyond the scope of this document to define the necessary mechanisms.

There is empirical evidence, nevertheless, that most popular Web browsers today seem to permit non-US-ASCII characters yet (for lack of a standard) disagree on the encoding scheme and character set specification of the file name. The question, then, is: what are the various schemes and encodings employed by the popular browsers if the file name “naïvefile” (without quotes and where the third letter is U+00EF) needs to be encoded into the Content-Disposition header?

For the purpose of this question, popular browsers being:

  • Google Chrome
  • Safari
  • Internet Explorer or Edge
  • Firefox
  • Opera
Inefficiency answered 18/9, 2008 at 15:25 Comment(4)
Got it working for Mobile Safari (raw utf-8 as suggested by @Martin Ørding-Thomsen), but that does not work for GoodReader from the same device. Any ideas?Threepiece
Also see this similar questionPachton
Kornel's answer proved to be the path of least resistance, if you can set the last segment of the path; couple this with Content-Disposition: attachment.Hendley
The latest RFC specification for this is RFC 8187, which obsoletes RFC 5987.Lollop
D
103

There is discussion of this, including links to browser testing and backwards compatibility, in the proposed RFC 5987, "Character Set and Language Encoding for Hypertext Transfer Protocol (HTTP) Header Field Parameters."

RFC 2183 indicates that such headers should be encoded according to RFC 2184, which was obsoleted by RFC 2231, covered by the draft RFC above.

Diet answered 18/9, 2008 at 15:39 Comment(11)
In a quick test, this is implemented by Firefox and horribly broken in IE: it just doesn't recognize "filename*" as a filename and tries to deduce the filename from the MIME type and the last part of the URL.Oakum
This was partly fixed in IE9.Fanfaronade
Also note that the internet draft (not "draft RFC") has been finished, and the final document is RFC 5987 (greenbytes.de/tech/webdav/rfc5987.html)Fanfaronade
Related to this, I discovered that Firefox (versions 4-9 inclusive) breaks if there is a comma (,) in the filename, e.g. Content-Disposition: filename="foo, bar.pdf". The result is that Firefox downloads the file correctly but keeps the .part extension (e.g. foo,bar.pdf-1.part). Then, of course, the file won't open correctly because the application is not associated with .part. Other ASCII chars seem to work OK.Hypsometer
It is well known that the RFC is "not implemented correctly" by different browsers. IE, Chrome, Firefox and especially the Android stock browser handle Unicode/non-ASCII characters very differently.Motivate
@DennisCheung Do you have a reference for these inconsistencies in browser handling? (I want to provide evidence for an issue).Chanachance
@MatthewSchinckel e.g. kbyanc.blogspot.hk/2010/07/… and digiblog.de/2011/04/android-and-the-download-file-headersMotivate
For more about IE behavior, see blogs.msdn.com/b/ieinternals/archive/2010/06/07/…Miyamoto
@catchdave: You forgot the "attachment;" part.Somme
In Node.js with hapi for instance, you can reply(something).header('Content-Disposition', 'attachment; filename="' + encodeURI(fileName) + '"')Outer
Does this actually work in modern browsers, including Safari?Metalwork
L
433

I know this is an old post but it is still very relevant. I have found that modern browsers support RFC 5987, which allows UTF-8 encoding with percent-encoding (URL encoding). Naïve file.txt then becomes:

Content-Disposition: attachment; filename*=UTF-8''Na%C3%AFve%20file.txt

Safari (5) does not support this. Instead you should use the Safari standard of writing the file name directly in your utf-8 encoded header:

Content-Disposition: attachment; filename=Naïve file.txt

IE8 and older don't support it either and you need to use the IE standard of utf-8 encoding, percentage encoded:

Content-Disposition: attachment; filename=Na%C3%AFve%20file.txt
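
As a rough illustration, the three header forms above could be generated like this in JavaScript. This is a sketch, not the author's code: `encodeExtValue` and `headerVariants` are hypothetical names, and the block only builds strings. Whether a given server API accepts raw non-ASCII header values is not covered here.

```javascript
// Sketch only: builds the three header values discussed above for one file name.
// The function names are hypothetical, not part of any library.

// RFC 5987 style ext-value: UTF-8 bytes, percent-encoded.
// encodeURIComponent leaves ' ( ) * unescaped, so escape those as well.
function encodeExtValue(fileName) {
  return encodeURIComponent(fileName).replace(
    /['()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
}

function headerVariants(fileName) {
  return {
    // filename* form, as described for modern browsers.
    rfc5987: "attachment; filename*=UTF-8''" + encodeExtValue(fileName),
    // Raw UTF-8 in the plain parameter, as described for Safari 5.
    rawUtf8: "attachment; filename=" + fileName,
    // Percent-encoded value in the plain parameter, as described for IE8 and older.
    legacyIe: "attachment; filename=" + encodeURIComponent(fileName),
  };
}

// "\u00EF" is the precomposed letter ï.
const variants = headerVariants("Na\u00EFve file.txt");
console.log(variants.rfc5987);
// prints: attachment; filename*=UTF-8''Na%C3%AFve%20file.txt
```

Note that encodeURIComponent emits a space as %20 rather than +, which is what the filename* form needs.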

In ASP.Net I use the following code:

string contentDisposition;
if (Request.Browser.Browser == "IE" && (Request.Browser.Version == "7.0" || Request.Browser.Version == "8.0"))
    contentDisposition = "attachment; filename=" + Uri.EscapeDataString(fileName);
else if (Request.Browser.Browser == "Safari")
    contentDisposition = "attachment; filename=" + fileName;
else
    contentDisposition = "attachment; filename*=UTF-8''" + Uri.EscapeDataString(fileName);
Response.AddHeader("Content-Disposition", contentDisposition);

I tested the above using IE7, IE8, IE9, Chrome 13, Opera 11, FF5, Safari 5.

Update November 2013:

Here is the code I currently use. I still have to support IE8, so I cannot get rid of the first part. It turns out that browsers on Android use the built in Android download manager and it cannot reliably parse file names in the standard way.

string contentDisposition;
if (Request.Browser.Browser == "IE" && (Request.Browser.Version == "7.0" || Request.Browser.Version == "8.0"))
    contentDisposition = "attachment; filename=" + Uri.EscapeDataString(fileName);
else if (Request.UserAgent != null && Request.UserAgent.ToLowerInvariant().Contains("android")) // android built-in download manager (all browsers on android)
    contentDisposition = "attachment; filename=\"" + MakeAndroidSafeFileName(fileName) + "\"";
else
    contentDisposition = "attachment; filename=\"" + fileName + "\"; filename*=UTF-8''" + Uri.EscapeDataString(fileName);
Response.AddHeader("Content-Disposition", contentDisposition);

The above now tested in IE7-11, Chrome 32, Opera 12, FF25, Safari 6, using this filename for download: 你好abcABCæøåÆØÅäöüïëêîâéíáóúýñ½§!#¤%&()=`@£$€{[]}+´¨^~'-_,;.txt

On IE7 it works for some characters but not all. But who cares about IE7 nowadays?

This is the function I use to generate safe file names for Android. Note that I don't know which characters are supported on Android but that I have tested that these work for sure:

private static readonly Dictionary<char, char> AndroidAllowedChars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ._-+,@£$€!½§~'=()[]{}0123456789".ToDictionary(c => c);
private string MakeAndroidSafeFileName(string fileName)
{
    char[] newFileName = fileName.ToCharArray();
    for (int i = 0; i < newFileName.Length; i++)
    {
        if (!AndroidAllowedChars.ContainsKey(newFileName[i]))
            newFileName[i] = '_';
    }
    return new string(newFileName);
}

@TomZ: I tested in IE7 and IE8 and it turned out that I did not need to escape apostrophe ('). Do you have an example where it fails?

@Dave Van den Eynde: Combining the two file names on one line, as described in RFC 6266, works except for Android and IE7+8, and I have updated the code to reflect this. Thank you for the suggestion.

@Thilo: No idea about GoodReader or any other non-browser. You might have some luck using the Android approach.

@Alex Zhukovskiy: I don't know why but as discussed on Connect it doesn't seem to work terribly well.

Laveta answered 19/7, 2011 at 10:34 Comment(18)
I tested the above code using FF 8.0.1 on Windows 7. RFC5987 is chosen and the file name (Naïve file.txt) is shown correctly.Phenacite
Got it working for Mobile Safari (raw utf-8 as suggested above), but that does not work for GoodReader from the same device. Any ideas?Threepiece
IE7 and 8 also need apostrophes escaped: .Replace("'", Uri.HexEscape('\''))Gouda
Directly writing UTF-8 characters seems to work for current versions of Firefox, Chrome, and Opera. Didn't test Safari & IE.Haemolysin
My Chrome '26.0.1410.64 m' doesn't recognize the RFC 5987 format. It accepts the old IE percent-encoding.Stiletto
Why not combine them, as Content-Disposition: attachment; filename*=UTF-8''Na%C3%AFve%20file.txt; filename=Na%C3%AFve%20file.txt and skip the browser sniffing? Would that work?Overtime
Another addition: In IE9, the %20 in a filename* argument does not result in a space, but a literal %20 in the file name.Cornish
@Rutix In my case, removing the filename part altogether worked, because the path of the URL already included the file name. All the tested browsers then used that as the file name (which worked with spaces).Cornish
Tested on IE11 Mobile on "Windows Phone 8.1 Update" did not work :-(Cygnet
@DaveVandenEynde This is not working in modern Chromium-based browsers. The browser shows a warning about security problems instead (I'm not sure which security problems can be caused by multiple filename specifications, though).Wadley
@Wadley I've since learned to rely on Web.Api's ContentDispositionHeaderValue class to deal with this for me.Overtime
Some recent testing of this (using inline;filename=) suggests that filenames containing spaces must be surrounded by double-quote (") characters for Firefox 42 to use anything other than the part of the filename before the first space. Using URL-encoded filenames does not work; the filename in the "Save As" dialog becomes my%20file.txt. Same with Safari 9: quotes must be used and %-encoding is a mess. Google Chrome 46 seems to ignore the header altogether, or possibly it doesn't like something specific about the formatting.Biestings
I should have mentioned that URL-encoding does not work with filename specifically. Use of filename* is another matter. Also note that you can't use + for space in the filename when using filename*: you must use %20.Biestings
In Node.js with hapi for instance, you can reply(something).header('Content-Disposition', 'attachment; filename="' + encodeURI(fileName) + '"')Outer
The kind folks at fastmail found another workaround: blog.fastmail.com/2011/06/24/download-non-english-filenames Content-Disposition: attachment; filename="foo-%c3%a4.html"; filename*=UTF-8''foo-%c3%a4.html Specifying the fileName twice (one time without the UTF-8 prefix and one time with) makes it work in IE8-11, Edge, Chrome, Firefox and Safari (seems like apple fixed safari, so it works there as well now)Guanase
@MartinØrding-Thomsen Do you know why standard System.Net.Mime.ContentDisposition generates invalid name which cannot be interpreted by any browser (even chrome cannot)?Snailfish
@DaveVandenEynde One must distinguish the Content-Disposition in the request headers from the one in a multipart/form-data body. In the latter case, using filename* is explicitly disallowed for Content-Disposition; see tools.ietf.org/html/rfc7578#section-4.2.Scab
RFC 8187 obsoleted RFC 5987.Lollop
S
187

There is a simple and very robust alternative: use a URL that contains the filename you want.

When the name after the last slash is the one you want, you don't need any extra headers!

This trick works:

/real_script.php/fake_filename.doc

And if your server supports URL rewriting (e.g. mod_rewrite in Apache) then you can fully hide the script part.

Characters in URLs should be in UTF-8, urlencoded byte-by-byte:

/mot%C3%B6rhead   # motörhead
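
A minimal JavaScript sketch of building such a URL, assuming the server ignores or routes the extra path segment as described above. `buildDownloadPath` is a hypothetical helper, not part of any framework.

```javascript
// Sketch only: puts the desired file name into the last URL path segment.
// buildDownloadPath and the script path are hypothetical names for illustration.
function buildDownloadPath(scriptPath, fileName) {
  // encodeURIComponent emits the UTF-8 bytes as %XX and also escapes "/",
  // so the name cannot introduce extra path segments.
  return scriptPath.replace(/\/+$/, "") + "/" + encodeURIComponent(fileName);
}

// "\u00F6" is the letter ö.
console.log(buildDownloadPath("/real_script.php", "mot\u00F6rhead"));
// prints: /real_script.php/mot%C3%B6rhead
```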
Scotfree answered 19/10, 2008 at 18:26 Comment(18)
Anyone know how to do this in ASP.NET? Would it be possible to do something like GetAttachment.aspx?id=34/fake_filename.doc without a lot of trouble?Sirrah
Try GetAttachment.aspx/fake_filename.doc?id=34 (although it might be Apache-only quirk)Scotfree
You can handle this kind of path in IIS by using either a custom .Net HttpModule or maybe the UrlRewrite option in IIS7.Argentum
@SeanHanley - also check out URL rewrite for IIS and MVC frameworkNorthbound
I went down the rabbit trail and tried some of the other solutions; trying to sniff out the correct browser and version to set the headers correctly is too much of a nightmare. Chrome was incorrectly identifying as Safari which does not behave the same at all (breaks on commas if not encoded correctly). Save yourself the trouble, use this solution and alias the URL as needed.Tiv
I did this in ASP.NET 4.0 web forms using ASP.NET Routing. I registered the route: routes.MapPageRoute("Download", "download/{id}/{filename}", "~/download.aspx"); In download.aspx I only use the id: Page.RouteData.Values["id"] and do not write the extra header "Content-Disposition". Works nicely and is easier than HttpModule I suppose.Cheston
Please do not encode . as %2e; IE7 on Windows XP will fail to show the correct file name.Orthopedic
The /:id/:filename method is really simple and works, thank you!Neuropath
How should I implement the URL approach with Laravel's Response::download()? If I don't define the file name in this method, it selects a file name itself and doesn't consider the URL.Areola
A thousand times "Yes". You will seriously win time with this. More even - some Android browsers will flat out ignore the Content-Disposition and create very interesting filenames instead (they will be generated from your path). So the only solution for keeping one's sanity is just setting Content-Disposition: attachment and passing the desired filename as the last path component:Morra
this is a great solution (and made me feel kinda stupid) on a related note, remember if the filename comes from a user variable you still have to make sure it's ready for the filesystem. If you don't, and the file has something like /, you get really weird browser errors. With this answer as a reference I used s.replace(/[\000-\031\\\/:*?"<>\|]/g, '_')Segmentation
But in this case we need to know the file name in advance, right? That makes two requests, one for the file name, one for the file itself.Sisley
@GuneyOzsan On the HTTP level there's absolutely no difference, and it never causes any extra requests. You don't need to know the filename, you need to include the filename in the URL which you have to know anyway.Scotfree
@Scotfree in my current project I don't know file names in advance and request files by ID and try to get the file (as stream) and name (preferably in header) in a single request. On the other hand the C# version that Unity uses does not support that weird syntax in Content-Disposition. Eventually I resolved it by encoding it in php with filename="' . rawurlencode($file_name_with_extension) . '", and decoding it in C# using headerValue = ContentDispositionHeaderValue.Parse(contentDisposition), and fileName = Uri.UnescapeDataString(headerValue.FileName.Replace("\"", "")).Sisley
@Scotfree I was actually wondering how or why fake_filename.doc is interpreted as if it is the file name in header.Sisley
@GuneyOzsan the filename for saving is deduced by the web browser, and browsers have no understanding of what is happening server side, so they don't understand and don't care how the server interprets the URL. Browsers just take whatever is after the last slash in the URL path, sometimes additionally trying to correct filename extensions based on Content-Type.Scotfree
@Scotfree Oops, sorry, I was tired after working for hours to resolve a bug and confused 'slash' with 'underscore' while trying to understand the black magic behind why the browser strips the fake_ part. Thank you for your time.Sisley
The approach with the file name in the path does not work on Mobile Safari when the file name is long...Junejuneau
T
92

RFC 6266 describes the “Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP)”. Quoting from that:

6. Internationalization Considerations

The “filename*” parameter (Section 4.3), using the encoding defined in [RFC5987], allows the server to transmit characters outside the ISO-8859-1 character set, and also to optionally specify the language in use.

And in their examples section:

This example is the same as the one above, but adding the "filename" parameter for compatibility with user agents not implementing RFC 5987:

Content-Disposition: attachment;
                     filename="EURO rates";
                     filename*=utf-8''%e2%82%ac%20rates

Note: Those user agents that do not support the RFC 5987 encoding ignore “filename*” when it occurs after “filename”.
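
The two-parameter header from this example could be assembled as sketched below in JavaScript. The helper names `toAsciiFallback` and `makeDisposition` are made up for illustration, and replacing unsupported characters with `_` is just one possible fallback policy, not something the RFC prescribes.

```javascript
// Sketch only: combines an ASCII fallback with the RFC 5987 form, fallback first.
// makeDisposition and toAsciiFallback are hypothetical helpers.

function toAsciiFallback(fileName) {
  // Replace anything outside printable ASCII, plus characters that are
  // awkward inside a quoted-string or for legacy parsers (" \ %), with "_".
  return fileName.replace(/[^\x20-\x7E]|["\\%]/g, "_");
}

function makeDisposition(fileName, fallback = toAsciiFallback(fileName)) {
  // Percent-encode the UTF-8 bytes; also escape ' ( ) *, which
  // encodeURIComponent leaves alone.
  const extValue = encodeURIComponent(fileName).replace(
    /['()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
  // "filename" comes first so that agents without filename* support use it.
  return `attachment; filename="${fallback}"; filename*=UTF-8''${extValue}`;
}

// "\u20AC" is the euro sign.
console.log(makeDisposition("\u20AC rates", "EURO rates"));
// prints: attachment; filename="EURO rates"; filename*=UTF-8''%E2%82%AC%20rates
```

The result is a single line; the RFC example folds it across lines only for readability.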

In Appendix D there is also a long list of suggestions to increase interoperability. It also points at a site which compares implementations. Current all-pass tests suitable for common file names include:

  • attwithisofnplain: plain ISO-8859-1 file name with double quotes and without encoding. This requires a file name which is all ISO-8859-1 and does not contain percent signs, at least not in front of hex digits.
  • attfnboth: two parameters in the order described above. Should work for most file names on most browsers, although IE8 will use the “filename” parameter.

That RFC 5987 in turn references RFC 2231, which describes the actual format. 2231 is primarily for mail, and 5987 tells us what parts may be used for HTTP headers as well. Don't confuse this with MIME headers used inside a multipart/form-data HTTP body, which is governed by RFC 2388 (section 4.4 in particular) and the HTML 5 draft.
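
To make the format concrete, here is a hedged JavaScript sketch that splits an ext-value of the form charset'language'value and decodes it. `parseExtValue` is a hypothetical helper; it handles only UTF-8 and is not a full RFC 5987 parser.

```javascript
// Sketch only: parses an RFC 5987 style ext-value such as
// UTF-8''Na%C3%AFve%20file.txt. parseExtValue is a hypothetical helper.
function parseExtValue(extValue) {
  // Layout: charset ' [language] ' percent-encoded value
  const match = /^([^']+)'([^']*)'(.*)$/.exec(extValue);
  if (!match) {
    throw new Error("not an ext-value: " + extValue);
  }
  const [, charset, language, encoded] = match;
  if (charset.toLowerCase() !== "utf-8") {
    throw new Error("unsupported charset in this sketch: " + charset);
  }
  // decodeURIComponent interprets the %XX bytes as UTF-8.
  return { charset, language, value: decodeURIComponent(encoded) };
}

console.log(parseExtValue("utf-8''%e2%82%ac%20rates").value);
// prints: € rates
```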

Transient answered 5/1, 2014 at 12:48 Comment(1)
I had trouble in Safari. When downloading files with Russian names, I received erroneous, unreadable characters. This solution helped. But the header needs to be sent on a single line (!!!).Horn
I
15

The following document, linked from the draft RFC mentioned by Jim in his answer, further addresses the question and is definitely worth a direct note here:

Test Cases for HTTP Content-Disposition header and RFC 2231/2047 Encoding

Inefficiency answered 18/9, 2008 at 16:8 Comment(1)
Note that one can supply both ways of encoding the filename parameter, and that they appear to work correctly with old browsers and new browsers (old being MSIE8 and Safari in this case). Check attfnboth in the report mentioned by @AtifAziz.Vu
S
14

Put the file name in double quotes. Solved the problem for me. Like this:

Content-Disposition: attachment; filename="My Report.doc"

http://kb.mozillazine.org/Filenames_with_spaces_are_truncated_upon_download

I've tested multiple options. Browsers do not support the specs and act differently; I believe double quotes are the best option.
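
A small JavaScript sketch of the quoting, including backslash escaping for names that themselves contain a double quote or a backslash. `quoteFilename` is a hypothetical helper; it assumes an ASCII name and only follows the quoted-string grammar, so it does not guarantee that every browser honors the escape.

```javascript
// Sketch only: wraps a name in double quotes using quoted-string escaping,
// where backslash and double quote are each prefixed with a backslash.
// quoteFilename is a hypothetical helper name.
function quoteFilename(fileName) {
  return '"' + fileName.replace(/(["\\])/g, "\\$1") + '"';
}

console.log(
  "Content-Disposition: attachment; filename=" + quoteFilename("My Report.doc")
);
// prints: Content-Disposition: attachment; filename="My Report.doc"
```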

Starkey answered 10/7, 2015 at 15:1 Comment(7)
This sadly doesn't solve all problems explained in the answers above.Neuropath
This will allow you to return a file name with spaces, &, %, # etc. So it solves that.Unflinching
What if the filename contains double quotes (yes, this can happen)? As specified in RFC 6266, the filename is a "quoted-string", and as specified in RFC 2616, double quotes within a quoted-string should be escaped with a backslash.Cotter
@ChristopheRoussy Is there a way to allow for a double quote in the file name? I tried a bunch of combinations of wrapping in single quotes, escaping the double quotes (\"), etc. but it never worked. Eventually I had to use a gsub to remove the double quotes. So if the filename is My 2" Report.doc it would end up being My 2 Report.doc. Not ideal but at least it worked. Thoughts?Ferous
@JoshuaPinter look into escaping or escape, sometimes you have to double the char. It must be defined in the standard. Close: #18634837Cotter
@ChristopheRoussy Thanks. I looked at this again and tried a bunch of methods, including double escaping and encoding using %22 (for double quotes). I even used the content_disposition gem (github.com/shrinerb/content_disposition) and it all just replaced the double quotes with an underscore character (_). Now, this may be happening automatically either on macOS or in Chrome to avoid issues with filenames. But my updated solution is to just replace double quotes with two single quotes in the filename. Works. Is (relatively) safe. And looks close enough to not lose meaning.Ferous
Did you test with non-ASCII characters? In Safari?Metalwork
T
12

I use the following code snippets for encoding (assuming fileName contains the filename and extension of the file, i.e.: test.txt):


PHP:

if ( strpos ( $_SERVER [ 'HTTP_USER_AGENT' ], "MSIE" ) !== false ) // strpos returns false when not found, and 0 for a match at the start
{
     header ( 'Content-Disposition: attachment; filename="' . rawurlencode ( $fileName ) . '"' );
}
else
{
     header( 'Content-Disposition: attachment; filename*=UTF-8\'\'' . rawurlencode ( $fileName ) );
}

Java:

fileName = request.getHeader ( "user-agent" ).contains ( "MSIE" ) ? URLEncoder.encode ( fileName, "utf-8") : MimeUtility.encodeWord ( fileName );
response.setHeader ( "Content-disposition", "attachment; filename=\"" + fileName + "\"");
Thomey answered 19/4, 2013 at 11:29 Comment(1)
Right, it should be rawurlencode in PHP at least for the filename*= disposition header since value-chars used in ext-value of RFC 6266->RFC 5987 (see tools.ietf.org/html/rfc6266#section-4.1 & tools.ietf.org/html/rfc5987#section-3.2.1 ) doesn't allow space without percent escaping (filename=, on the other hand, seems that it could allow a space without escaping at all though only ASCII should be present here). It isn't necessary to encode with the full strictness of rawurlencode, so a few characters can be unescaped: gist.github.com/brettz9/8752120Vltava
C
10

In ASP.NET MVC 2 I use something like this:

return File(
    tempFile
    , "application/octet-stream"
    , HttpUtility.UrlPathEncode(fileName)
    );

I guess if you don't use mvc(2) you could just encode the filename using

HttpUtility.UrlPathEncode(fileName)
Chaddy answered 15/7, 2010 at 15:8 Comment(3)
URL encoding is not valid for encoding the file name; browsers ought not to URL-decode it.Nadaha
IE 11 definitely does not decode URL encoding in this field.Scrope
But it needed to be UrlEncoded when the browser is Chrome or IE; others such as FF, Safari and Opera work fine without encoding.Phylloxera
F
10

In ASP.NET Web API, I url encode the filename:

public static class HttpRequestMessageExtensions
{
    public static HttpResponseMessage CreateFileResponse(this HttpRequestMessage request, byte[] data, string filename, string mediaType)
    {
        HttpResponseMessage response = new HttpResponseMessage(HttpStatusCode.OK);
        var stream = new MemoryStream(data);
        stream.Position = 0;

        response.Content = new StreamContent(stream);

        response.Content.Headers.ContentType = 
            new MediaTypeHeaderValue(mediaType);

        // URL-Encode filename
        // Fixes behavior in IE, that filenames with non US-ASCII characters
        // stay correct (not "_utf-8_.......=_=").
        var encodedFilename = HttpUtility.UrlEncode(filename, Encoding.UTF8);

        response.Content.Headers.ContentDisposition =
            new ContentDispositionHeaderValue("attachment") { FileName = encodedFilename };
        return response;
    }
}

[Screenshot: IE 9, not fixed]
[Screenshot: IE 9, fixed]

Flanagan answered 25/6, 2015 at 8:10 Comment(0)
S
9

In PHP this did it for me (assuming the filename is UTF8 encoded):

header('Content-Disposition: attachment;'
    . 'filename="' . addslashes(utf8_decode($filename)) . '";'
    . 'filename*=utf-8\'\'' . rawurlencode($filename));

Tested against IE8-11, Firefox and Chrome.
If the browser can interpret filename*=utf-8 it will use the UTF8 version of the filename, else it will use the decoded filename. If your filename contains characters that can't be represented in ISO-8859-1 you might want to consider using iconv instead.

Sonia answered 20/5, 2016 at 12:47 Comment(6)
Although this code may answer the question, providing additional context regarding why and/or how it answers the question would significantly improve its long-term value. Please edit your answer to add some explanation.Widescreen
Whoa, none of the above code-only answers got downvoted or criticized like that. Also, I found the why was answered well enough already: IE does not interpret filename*=utf-8 but needs the ISO-8859-1 version of the filename, which this script does offer. I only wanted to give the lazy some simple working code for PHP.Sonia
I think this got downvoted because the question isn't language-specific but about which RFCs to stick to when implementing the header encoding. Thanks for this answer, however; for PHP, this code made my woes go away.Pessimist
Thank you. This answer may not have strictly answered the question, but it was exactly what I was looking for and helped me resolve the issue in Python.Caroche
I am pretty sure this code can be used as an attack vector if the user can control the name of the file.Hendley
Does this work in Safari?Metalwork
U
7

From .NET 4.5 (and Core 1.0) you can use ContentDispositionHeaderValue to do the formatting for you.

var fileName = "Naïve file.txt";
var h = new System.Net.Http.Headers.ContentDispositionHeaderValue("attachment");
h.FileNameStar = fileName;
h.FileName = "fallback-ascii-name.txt";

Response.Headers.Add("Content-Disposition", h.ToString());

h.ToString() will result in:

attachment; filename*=utf-8''Na%C3%AFve%20file.txt; filename=fallback-ascii-name.txt
Unprecedented answered 30/7, 2021 at 7:34 Comment(1)
I combined this with "ASCII Folding" from https://mcmap.net/q/48754/-how-do-i-remove-diacritics-accents-from-a-string-in-net to generate h.FileName. Note: h.FileName must not contain a quote character (from the ContentDispositionHeaderValue source: "Only bounding quotes are allowed").Ternary
T
6

Just an update since I was trying all this stuff today in response to a customer issue

  • With the exception of Safari configured for Japanese, all browsers our customer tested worked best with filename=text.pdf - where text is a customer value serialized by ASP.Net/IIS in utf-8 without url encoding. For some reason, Safari configured for English would accept and properly save a file with utf-8 Japanese name but that same browser configured for Japanese would save the file with the utf-8 chars uninterpreted. All other browsers tested seemed to work best/fine (regardless of language configuration) with the filename utf-8 encoded without url encoding.
  • I could not find a single browser implementing Rfc5987/8187 at all. I tested with the latest Chrome, Firefox builds plus IE 11 and Edge. I tried setting the header with just filename*=utf-8''texturlencoded.pdf, setting it with both filename=text.pdf; filename*=utf-8''texturlencoded.pdf. Not one feature of Rfc5987/8187 appeared to be getting processed correctly in any of the above.
Tunic answered 13/3, 2019 at 19:18 Comment(1)
This is a good update. Can you elaborate on the specific tests you tried?Jeanajeanbaptiste
E
5

If you are using a nodejs backend you can use the following code I found here

var fileName = 'my file(2).txt';
var header = "Content-Disposition: attachment; filename*=UTF-8''" 
             + encodeRFC5987ValueChars(fileName);

function encodeRFC5987ValueChars (str) {
    return encodeURIComponent(str).
        // Note that although RFC3986 reserves "!", RFC5987 does not,
        // so we do not need to escape it
        replace(/['()]/g, escape). // i.e., %27 %28 %29
        replace(/\*/g, '%2A').
            // The following are not required for percent-encoding per RFC5987, 
            // so we can allow for a little better readability over the wire: |`^
            replace(/%(?:7C|60|5E)/g, unescape);
}
Evocative answered 25/9, 2015 at 12:45 Comment(2)
Better to use encodeURI(str). As example with dates in the file name: encodeURIComponent('"Kornél Kovács 1/1/2016') => "Kornél Kovács 1%2F1%2F2016" vs. encodeURI('"Kornél Kovács 1/1/2016') => "Kornél Kovács 1/1/2016"Outer
Does this work in Safari?Metalwork
A
4

I ended up with the following code in my "download.php" script (based on this blogpost and these test cases).

$il1_filename = utf8_decode($filename);
$to_underscore = "\"\\#*;:|<>/?";
$safe_filename = strtr($il1_filename, $to_underscore, str_repeat("_", strlen($to_underscore)));

header("Content-Disposition: attachment; filename=\"$safe_filename\""
.( $safe_filename === $filename ? "" : "; filename*=UTF-8''".rawurlencode($filename) ));

This uses the standard way of filename="..." as long as there are only iso-latin1 and "safe" characters used; if not, it adds the filename*=UTF-8'' url-encoded way. According to this specific test case, it should work from MSIE9 up, and on recent FF, Chrome, Safari; on lower MSIE version, it should offer filename containing the ISO8859-1 version of the filename, with underscores on characters not in this encoding.

Final note: the max. size for each header field is 8190 bytes on apache. UTF-8 can be up to four bytes per character; after rawurlencode, it is x3 = 12 bytes per one character. Pretty inefficient, but it should still be theoretically possible to have more than 600 "smiles" %F0%9F%98%81 in the filename.
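
The estimate can be checked with a few lines of JavaScript. The 8190-byte limit is taken from the text above; that the limit also covers the field name and the `filename*=UTF-8''` prefix is an assumption of this sketch.

```javascript
// Sketch only: reproduces the size estimate above.
// 8190 is the per-field limit quoted in the text, not a value read from a server.
const HEADER_FIELD_LIMIT = 8190;
const smile = "\u{1F601}"; // the "smile" emoji, 4 bytes in UTF-8

const encodedSmile = encodeURIComponent(smile); // 4 bytes, each written as %XX
const bytesPerSmile = encodedSmile.length; // 4 * 3 = 12
const prefix = "Content-Disposition: attachment; filename*=UTF-8''";

// Assumption: the limit counts the field name and prefix as well.
const maxSmiles = Math.floor(
  (HEADER_FIELD_LIMIT - prefix.length) / bytesPerSmile
);
console.log(encodedSmile, bytesPerSmile, maxSmiles);
```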

Admix answered 5/4, 2015 at 15:45 Comment(1)
...but the max transferrable filename length also depends on the client. Just found out that at most [89 smiles😁].pdf filename gets through MSIE11. In Firefox37, it is at most [111x 😁].pdf. Chrome41 truncates the filename at 110th smile. Interestingly, the suffix is transferred ok.Admix
S
3

For those who need a JavaScript way of encoding the header, I found that this function works well:

function createContentDispositionHeader(filename:string) {
    const encoded = encodeURIComponent(filename);
    return `attachment; filename*=UTF-8''${encoded}; filename="${encoded}"`;
}

This is based on what Nextcloud seems to be doing when downloading a file. The filename appears first as UTF-8 encoded, and possibly for compatibility with some browsers, the filename also appears without the UTF-8 prefix.

Stegosaur answered 17/8, 2021 at 22:30 Comment(0)
P
2

PHP framework Symfony 4 has $filenameFallback in HeaderUtils::makeDisposition. You can look into this function for details - it is similar to the answers above.

Usage example:

$filenameFallback = preg_replace('#^.*\.#', md5($filename) . '.', $filename);
$disposition = $response->headers->makeDisposition(ResponseHeaderBag::DISPOSITION_ATTACHMENT, $filename, $filenameFallback);
$response->headers->set('Content-Disposition', $disposition);
Pertinent answered 22/7, 2019 at 13:58 Comment(0)
H
0

Classic ASP Solution

Most modern browsers now support passing the filename as UTF-8. However, the file upload solution I use, which was based on FreeASPUpload.Net (site no longer exists, link points to archive.org), did not work with it: its parsing of the binary data relied on reading single-byte ASCII-encoded strings, which worked fine with UTF-8 encoded data until it reached characters that ASCII doesn't support.

I was able to find a solution that gets the code to read and parse the binary as UTF-8.

Public Function BytesToString(bytes)    'UTF-8..
  Dim bslen
  Dim i, k , N 
  Dim b , count 
  Dim str

  bslen = LenB(bytes)
  str=""

  i = 0
  Do While i < bslen
    b = AscB(MidB(bytes,i+1,1))

    If (b And &HFC) = &HFC Then
      count = 6
      N = b And &H1
    ElseIf (b And &HF8) = &HF8 Then
      count = 5
      N = b And &H3
    ElseIf (b And &HF0) = &HF0 Then
      count = 4
      N = b And &H7
    ElseIf (b And &HE0) = &HE0 Then
      count = 3
      N = b And &HF
    ElseIf (b And &HC0) = &HC0 Then
      count = 2
      N = b And &H1F
    Else
      count = 1
      str = str & Chr(b)
    End If

    If i + count > bslen Then    ' sequence would run past the end of the data
      str = str&"?"
      Exit Do
    End If

    If count>1 then
      For k = 1 To count - 1
        b = AscB(MidB(bytes,i+k+1,1))
        N = N * &H40 + (b And &H3F)
      Next
      str = str & ChrW(N)
    End If
    i = i + count
  Loop

  BytesToString = str
End Function

Credit goes to Pure ASP File Upload: by implementing the BytesToString() function from include_aspuploader.asp in my own code, I was able to get UTF-8 filenames working.
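For comparison, the same decoding loop can be sketched in TypeScript. This is an illustration only, assuming well-formed input: continuation bytes are not validated, only sequences of up to four bytes are recognised, and the function name bytesToString is chosen here. Where the runtime provides TextDecoder, that would normally be the better choice.

```typescript
// Minimal UTF-8 decoder sketch; not a validating decoder.
function bytesToString(bytes: number[]): string {
  let out = "";
  let i = 0;
  while (i < bytes.length) {
    const b = bytes[i];
    let count: number; // total bytes in this sequence
    let cp: number; // code point accumulated so far

    // The lead byte tells how many bytes the sequence has.
    if ((b & 0xf8) === 0xf0) {
      count = 4;
      cp = b & 0x07;
    } else if ((b & 0xf0) === 0xe0) {
      count = 3;
      cp = b & 0x0f;
    } else if ((b & 0xe0) === 0xc0) {
      count = 2;
      cp = b & 0x1f;
    } else {
      count = 1;
      cp = b;
    }

    // Truncated sequence at the end of the input: emit "?" and stop.
    if (i + count > bytes.length) {
      out += "?";
      break;
    }

    // Each continuation byte contributes its low six bits.
    for (let k = 1; k < count; k++) {
      cp = cp * 0x40 + (bytes[i + k] & 0x3f);
    }
    out += String.fromCodePoint(cp);
    i += count;
  }
  return out;
}
```

Unlike ChrW in VBScript, String.fromCodePoint also accepts code points above U+FFFF, so four-byte sequences such as emoji decode to a single character.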



Horrify answered 23/5, 2016 at 12:17 Comment(0)
H
0

In PHP, simply use the standard function mb_encode_mimeheader().

Hagler answered 8/3, 2023 at 18:38 Comment(0)
O
0

This in PHP works for me for all browsers (Chrome, Safari, Firefox, IE11)...

header('Content-Disposition: attachment; filename="' . $fileName . '"; filename*=utf-8\'\'' . rawurlencode($fileName) . ';');

Ondrej answered 10/8, 2023 at 16:33 Comment(0)
H
-1

The method mimeHeaderEncode($string) from the library class Unicode does the job.

$file_name= Unicode::mimeHeaderEncode($file_name);

Example in Drupal/PHP:

https://github.com/drupal/core-utility/blob/8.8.x/Unicode.php

/**
   * Encodes MIME/HTTP headers that contain incorrectly encoded characters.
   *
   * For example, Unicode::mimeHeaderEncode('tést.txt') returns
   * "=?UTF-8?B?dMOpc3QudHh0?=".
   *
   * See http://www.rfc-editor.org/rfc/rfc2047.txt for more information.
   *
   * Notes:
   * - Only encode strings that contain non-ASCII characters.
   * - We progressively cut-off a chunk with self::truncateBytes(). This ensures
   *   each chunk starts and ends on a character boundary.
   * - Using \n as the chunk separator may cause problems on some systems and
   *   may have to be changed to \r\n or \r.
   *
   * @param string $string
   *   The header to encode.
   * @param bool $shorten
   *   If TRUE, only return the first chunk of a multi-chunk encoded string.
   *
   * @return string
   *   The mime-encoded header.
   */
  public static function mimeHeaderEncode($string, $shorten = FALSE) {
    if (preg_match('/[^\x20-\x7E]/', $string)) {
      // floor((75 - strlen("=?UTF-8?B??=")) * 0.75);
      $chunk_size = 47;
      $len = strlen($string);
      $output = '';
      while ($len > 0) {
        $chunk = static::truncateBytes($string, $chunk_size);
        $output .= ' =?UTF-8?B?' . base64_encode($chunk) . "?=\n";
        if ($shorten) {
          break;
        }
        $c = strlen($chunk);
        $string = substr($string, $c);
        $len -= $c;
      }
      return trim($output);
    }
    return $string;
  }
Hustle answered 21/12, 2021 at 10:49 Comment(0)
C
-2

We had a similar problem in a web application, and ended up reading the filename from the HTML <input type="file"> and setting it, in URL-encoded form, in a new HTML <input type="hidden">. Of course we had to remove the path like "C:\fakepath\" that is returned by some browsers.

This does not directly answer the OP's question, but it may be a solution for others.

Countercurrent answered 27/1, 2015 at 11:54 Comment(1)
Completely different issue. The question is about downloading, your reply is about uploading.Mechanistic
W
-3

I normally URL-encode (with %xx) the filenames, and it seems to work in all browsers. You might want to do some tests anyway.

Wisniewski answered 18/9, 2008 at 15:28 Comment(1)
I did test in a few and it does not work that way in all the browsers, thus the question. :)Inefficiency
