Use character # in URL inside file name

I need to create a link with href="file://attachments/aaaa_#_aaaa.msg". Obviously this does not work as written, because the hash character # is used for anchors.

So I tried changing it to href="file://attachments/aaaa_%23_aaaa.msg", but when I open the URL in IE, the browser tries to open href="file://attachments/aaaa_%2523_aaaa.msg" instead: IE is encoding the % character as %25.

How can I write the file name in the URL so that the hash character # is encoded and read correctly in all browsers and the file can be downloaded?

I can't change the file name to remove this character, so I need a way to deal with this problem.

Numismatist asked 3/2, 2014 at 11:34 Comment(7)
Using aaaa_%23_aaaa.msg worked for me on IE8. – Hindenburg
Just tested: %23 works in Firefox/26, Chrome/32, Opera/12.16 and Explorer/11 (all running on Windows 7). What target browser is it failing for? – Conjoint
If I put "aaaa_%23_aaaa.msg" directly into the address bar it works, but when I use an anchor like <a href="file://attachments/aaaa_%23_aaa.msg">aaaa_#_aaa.msg</a>, IE11 tries to open file://attachments/aaaa_%2523_aaa.msg – Numismatist
I still cannot reproduce, though I admit I had to rewrite the URL prefix to make it absolute (file:///C:/attachments/); otherwise, it didn't work in any browser no matter the file name. – Conjoint
@user2244596 your problem is that "%" is being URL-encoded to "%25", and the "23" is then handled as ordinary characters. Are you using some encoding function? BTW, my IE11 works even with your anchor tag example :) – Quadragesimal
No, I'm not using any encoding function; I just put the anchor tag in my ASP file with the URL retrieved from the DB. Only IE11 encodes the '%' this way. In the end I couldn't solve the problem directly, but I was able to change the source process so that special characters in the file names are replaced. – Numismatist
Have you tried using single quotes instead of double quotes? I know some languages treat them differently, interpreting the contents of double quotes but leaving single-quoted text as is. I don't know whether that trick works in pure HTML, though. – Warchaw

You will avoid lots and lots and lots of pain if you are able to rename your files so they don't contain a "#" character. As long as they do, you will probably have current and future cross-browser issues, confusion on behalf of future developers working on your code (or confusion on your behalf in the future, when you've forgotten the ins and outs of the encoding), etc. Also, some Unix/Linux systems don't allow "#" in filenames. Not sure what OS you're using, but your filenames should be as portable as possible across OSs, even if you're "sure" right now that you'll never be running on one of those systems.

Leavening answered 9/2, 2014 at 1:46 Comment(3)
Do you have any reference on that? As far as I know, # is a plain US-ASCII character that doesn't have any special meaning in any Unix file system I'm aware of. And encoding it in a URL is straightforward (unlike other Unicode characters that have different encodings in Latin1 and UTF-8). – Conjoint
@Álvaro G. Vicario, I don't have a reference; I have a memory of being on some UNIX flavor and not being able to use "#". And given it has a special meaning in URLs as well as being a comment identifier in some programming languages, I avoid it and it's worked well for me. E.g., I haven't run into any issues such as the OP has :) – Leavening
I just made a file named "PO# tralala.pdf" in Fedora 23 with an ext4 filesystem. So yeah. – Biagi

The problem is that in a URL, # introduces the fragment, and ? introduces the query. Path components on Unix / Linux / MacOS / iOS fortunately cannot contain "/" characters (they are separated by "/" but cannot contain a slash) because that would be another pain.
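
To see the split in practice, here is a minimal sketch in Python (not the asker's ASP setup; Python is only used for illustration). It parses the absolute form of the URL mentioned in the comments, with the # left unencoded. The drive letter and directory are taken from that comment and are otherwise arbitrary.

```python
from urllib.parse import urlsplit

# URL with a raw '#' in the file name (absolute form, as tried in the comments).
raw_url = "file:///C:/attachments/aaaa_#_aaaa.msg"

parts = urlsplit(raw_url)

# Everything from '#' onward is treated as the fragment, not as part of the path.
print(parts.path)      # /C:/attachments/aaaa_
print(parts.fragment)  # _aaaa.msg
```

The file name is cut in two: the part after # never reaches whatever resolves the path.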

Almost all characters in the file path either become part of the URL unchanged, or they are percent-encoded, and a URL parser knows that, for example, %20 in the percent-encoded URL is really a space character in the path.
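
As a rough illustration of that round trip, the sketch below uses Python's standard `urllib.parse.quote` and `unquote`. The file names are made up for the example (the second one is the name from the question).

```python
from urllib.parse import quote, unquote

# Hypothetical file names: one with a space, one with a hash.
with_space = "my file.msg"
with_hash = "aaaa_#_aaaa.msg"

# Percent-encode each name once when building the URL.
encoded_space = quote(with_space)
encoded_hash = quote(with_hash)
print(encoded_space)  # my%20file.msg
print(encoded_hash)   # aaaa_%23_aaaa.msg

# Decoding once gives back the original names.
print(unquote(encoded_space) == with_space)  # True
print(unquote(encoded_hash) == with_hash)    # True
```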

The two exceptions are # and ?. Both are perfectly legal in a path, but as soon as a path containing one of them is placed in a URL unencoded, the URL parser treats "?" as the start of the query and "#" as the start of the fragment, so that character and everything after it are no longer part of the path but are turned into "query" and "fragment". And you cannot work around this by escaping them in the original path before it is converted: %23 is itself a perfectly legal three-character piece of a path (percent, two, three), so the conversion percent-escapes it to %2523, and when you or the OS recreates the path from the URL, that decodes back to the literal text %23 rather than to #.
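
The double escaping described here is easy to reproduce. This is only a sketch using Python's `urllib.parse`, and it assumes that whatever builds the link percent-encodes its input the way `quote` does; it is not a claim about what IE does internally.

```python
from urllib.parse import quote, unquote

# A name that has already been escaped by hand.
pre_escaped = "aaaa_%23_aaaa.msg"

# Encoding it again escapes the '%' itself as '%25'.
double_encoded = quote(pre_escaped)
print(double_encoded)  # aaaa_%2523_aaaa.msg

# Decoding once yields the literal text '%23', not the '#' that was intended.
decoded_once = unquote(double_encoded)
print(decoded_once)    # aaaa_%23_aaaa.msg
```

This matches the %2523 seen in the question: the name went through one more round of encoding than of decoding.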

(The "?" might cause trouble in shell scripting, but inside a path it is just as legal as any other character, at least in MacOS X and iOS. Even a nul byte is legal in a path except that C code trying to handle a path as a C string will again run into trouble).

Cleodell answered 21/2 at 11:29 Comment(0)
