Passing base64 encoded strings in URL
C

12

331

Is it safe to pass raw base64 encoded strings via GET parameters?

Cloth answered 3/9, 2009 at 17:12 Comment(0)
D
267

No. You need to URL-encode it, because base64 strings can contain the "+", "=" and "/" characters, which have their own meaning in a URL and can alter your data; a "/", for example, can look like a sub-folder.

Valid base64 characters are below.

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=
Dreda answered 3/9, 2009 at 17:19 Comment(9)
URL-encoding is a waste of space, especially as base64 itself leaves many characters unused. - Softhearted
I am not sure I understand what you are saying. URL encoding won't alter any of the characters except the last three in the list above, and that is to prevent them from being interpreted incorrectly, since they have other meanings in URLs. The same goes for base64: the original data could be binary or anything, but it is encoded in a form that can be transmitted easily using simple protocols. - Dreda
Firstly, you should escape '+' too, as it may be converted into a space. Secondly, there are at least a few characters which are safe for use in URLs and aren't used in the ‘standard’ charset. Your method can even triple the size of the transferred data in certain situations, while replacing those characters with others will do the trick and preserve the length. And it's quite a standard solution too. - Softhearted
en.wikipedia.org/wiki/Base64#URL_applications says clearly that escaping ‘makes the string unnecessarily longer’ and mentions the alternate charset variant. - Softhearted
Because of this answer, I diagnosed my problem as being exactly what it mentioned. Some of the base64 characters (+, /, =) were being altered by URL processing. When I URL-encoded the base64 string, the problem was resolved. - Elfredaelfrida
@MichałGórny If you're using JSON as a GET parameter, Base64 encoding will (depending on your data) likely reduce the size of the request string. (And before you say this is a silly idea, we're using JSON in query strings to facilitate deep linking into our app.) For our app, this approach achieved a reduction of about 30%. (To be fair, an even greater reduction could be achieved by avoiding Base64 entirely and instead writing our own JSON (de)serializers that use URL-encoding-friendly characters (e.g. ([' instead of {[").) - Icelander
I'm assuming you mean that reduction was from doing base64 before url-encoding, which makes perfect sense to me, because url-encoding is really inefficient. - Evidentiary
You forgot spaces... If it's a base64 of a binary file, it can include spaces (which are sometimes ignored, but not always)... - Prolongate
@Prolongate - the 64-character encoding set "of a binary file" you are referring to must be different from the standard "Base64", which does NOT include the space character. It uses only PRINTABLE characters (the table is in the wiki link in the comments above). - Mantelletta
I
344

There are additional base64 specs (see the table here for specifics). Essentially, you need 65 characters to encode: 26 lowercase + 26 uppercase + 10 digits = 62.

You need two more, ['+', '/'], plus a padding character, '='. None of these three is URL-friendly, so substitute different characters for them and you're set. The standard replacements from the chart above are ['-', '_'], but you could use others as long as you decode them the same way and don't need to share the data with anyone else.

I'd recommend just writing your own helpers. Like these from the comments on the php manual page for base64_encode:

function base64_url_encode($input) {
 return strtr(base64_encode($input), '+/=', '-_.');
}

function base64_url_decode($input) {
 return base64_decode(strtr($input, '-_.', '+/='));
}
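
As a rough illustration of why character substitution keeps the string short, here is a minimal sketch comparing it with percent-encoding. The sample bytes are arbitrary and chosen only so that '+', '/' and '=' all appear in the base64 output; they are not from the original answer.

```php
<?php
// Arbitrary sample bytes, picked so that '+', '/' and '=' all show up.
$raw = "\xfb\xff\xfe\xff";
$b64 = base64_encode($raw);                // "+//+/w=="

// Same substitution as the helpers above: one character in, one character out.
$translated = strtr($b64, '+/=', '-_.');   // "-__-_w.."

// Percent-encoding turns each of the three special characters into three.
$percent = urlencode($b64);                // "%2B%2F%2F%2B%2Fw%3D%3D"

printf("%d %d %d\n", strlen($b64), strlen($translated), strlen($percent)); // 8 8 22

// Both forms decode back to the original bytes.
$fromTranslated = base64_decode(strtr($translated, '-_.', '+/='));
$fromPercent    = base64_decode(urldecode($percent));
var_dump($fromTranslated === $raw, $fromPercent === $raw);
```

This input is a worst case for percent-encoding; typical data contains far fewer of the three special characters.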
Imidazole answered 29/4, 2011 at 17:35 Comment(9)
Great solution, except comma is not unreserved in URLs. I recommend using '~' (tilde) or '.' (dot) instead. - Toilette
@kralyk: I recommend just using urlencode, as suggested by rodrigo-silveira's answer. Creating two new functions to save a few chars in URL length is like entering your house through the window instead of just using the door. - Pry
@MarcoDemaio, without knowing how it will be used, it's impossible to say that it's just a few characters. Every encoded character will have triple the length, and why wouldn't "+++..." be a valid base64 string? URLs have browser limits, and tripling a URL might make you hit those limits. - Neill
Ironically, @Toilette suggests tilde, and yet tilde is not a URL-safe character! Lots of misinformation floating around. :) - Topee
@RandalSchwartz tilde is URL-safe. From RFC3986: unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~" - Toilette
Ahh, that's a relatively recent change. I was using legacy information. - Topee
Since , should be urlencoded to %2C, I suggest using ._- instead of -_, like the only variant in en.wikipedia.org/wiki/Base64#Variants_summary_table that keeps the trailing =. - Laos
Note that this is not entirely safe: the last character of your URL can end up being a '.', which some mail clients then do not consider part of the URL. I still recommend the replacement proposed here, though, because some mail clients collapse // to / in URLs and also do not accept trailing = signs as part of a URL. - Pinery
To clarify: Don't invent your own url-safe 64. Use base64url. That uses minus and underline (with equals for pad), as Joe Flynn's answer describes. - Mantelletta
S
97

@joeshmo Or instead of writing a helper function, you could just urlencode the base64 encoded string. This would do the exact same thing as your helper function, but without the need of two extra functions.

$str = 'Some String';

$encoded = urlencode( base64_encode( $str ) );
$decoded = base64_decode( urldecode( $encoded ) );
Sundae answered 21/12, 2011 at 18:45 Comment(10)
The result is not exactly the same. urlencode uses 3 characters to encode non-valid characters and joeshmo's solution uses 1. It's not a big difference, but it's still a waste. - Solidus
@JosefBorkovec Really? Then this would also mean that the same number of bytes, base64-encoded and then URL-encoded, could come out at a variety of lengths, while the other solution gives a predictable length, right? - Sunset
@Sunset Yes, urlencode is a shitty solution because it triples the size of certain base64 strings. You also can't reuse the buffer since the output is larger than the input. - Belong
Expansion from 1 to 3 chars occurs on 3 out of 64 characters on average, so it is a 9% overhead (2*3/64). - Laos
Be careful with the / character if you pass it not as a GET parameter but as a path in the URL. It will change your path if you don't replace / with something else on both sides. - Harlie
base64 should not be decoded until after URL parsing has already taken place, so mistaking a + or / as being part of the URL after decoding the base64 should not be an issue. Parse the path of the URL first, then if there are base64 segments in the path, decode those individually. Same goes for URL parameters. - Nisan
Shouldn't the second line be urldecode(base64_decode($encoded)); It seems backwards to me. - Salinometer
URL-encoding the contents of the query params could be problematic with some REST clients. We faced that exact issue when encoding our Base64 query params before sending them: with programmatic clients like RestAssured or a Groovy HTTP connection it worked well, but with Postman or Curl the contents of the query param were different. The reason seems to be that some clients perform an extra encoding of the URL, so the query params ended up reaching the server double-encoded. - Vizor
My choice. In Java, encode with url = URLEncoder.encode(Base64.getEncoder().encodeToString(value), StandardCharsets.UTF_8) and decode with Base64.getDecoder().decode(URLDecoder.decode(code, StandardCharsets.UTF_8)). - Emperor
Unfortunately, the "decode" part of this answer is wrong. "urldecode" is done for you automatically when PHP populates $_GET, and doing "urldecode" again will be wrong for +. See Jeffory Beckers' answer for the details. - Mantelletta
A
49

Introductory note: I'm inclined to post a few clarifications since some of the answers here were a little misleading (if not incorrect).

The answer is NO, you cannot simply pass a base64 encoded parameter within a URL query string since plus signs are converted to a SPACE inside the $_GET global array. In other words, if you sent test.php?myVar=stringwith+sign to

//test.php
print $_GET['myVar'];

the result would be:
stringwith sign

The easy way to solve this is to simply urlencode() your base64 string before adding it to the query string to escape the +, =, and / characters to %## codes. For instance, urlencode("stringwith+sign") returns stringwith%2Bsign

When you process the action, PHP takes care of decoding the query string automatically when it populates the $_GET global. For example, if I sent test.php?myVar=stringwith%2Bsign to

//test.php
print $_GET['myVar'];

the result is:
stringwith+sign

You do not want to urldecode() the returned $_GET string as +'s will be converted to spaces.
In other words if I sent the same test.php?myVar=stringwith%2Bsign to

//test.php
$string = urldecode($_GET['myVar']);
print $string;

the result is an unexpected:
stringwith sign

It would be safe to rawurldecode() the input; however, it would be redundant and therefore unnecessary.
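
The behaviour described above can be reproduced without a web server. This is a minimal sketch that uses parse_str(), which parses a string as if it were a query string passed via a URL, as a stand-in for the way PHP fills $_GET; the variable names are only for illustration.

```php
<?php
// parse_str() parses its argument as if it were the query string of a request.
parse_str('myVar=stringwith+sign', $plain);
parse_str('myVar=' . urlencode('stringwith+sign'), $escaped);

$unescapedResult = $plain['myVar'];                  // '+' was read as a space
$escapedResult   = $escaped['myVar'];                // %2B was decoded back to '+'
$decodedTwice    = urldecode($escaped['myVar']);     // second decode eats the '+'
$rawDecoded      = rawurldecode($escaped['myVar']);  // leaves '+' alone

echo $unescapedResult, "\n";   // stringwith sign
echo $escapedResult, "\n";     // stringwith+sign
echo $decodedTwice, "\n";      // stringwith sign
echo $rawDecoded, "\n";        // stringwith+sign
```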

Armistice answered 25/9, 2012 at 22:19 Comment(3)
Nice answer. You can use PHP code without the starting and ending tags on this site if the question is tagged php (also most often it's clear from the context of the question). If you add two spaces at the end of a line you will see the <br>, so no need to type much HTML. I hope this helps; I edited your answer a little to improve it even more. - Trondheim
Thank you for mentioning that PHP decodes the URL for you. That saved me from falling down a rabbit hole. - Gabe
Great Answer -> You do not want to urldecode() the returned $_GET string as +'s will be converted to spaces. It would be safe to rawurldecode() the input, however, - Dielu
D
19

Yes and no.

The basic charset of base64 may in some cases collide with traditional conventions used in URLs. But many base64 implementations let you change the charset to suit URLs better, or even ship with a URL-safe variant (like Python's urlsafe_b64encode()).

Another issue you may face is URL length, or rather the lack of a defined limit. Because the standards do not specify a maximum length, browsers, servers, libraries and other software working with HTTP may each define their own limits.

Dinadinah answered 3/9, 2009 at 17:20 Comment(0)
K
13

Here is a base64url encoder you can try out; it's just an extension of joeshmo's code above.

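function base64url_encode($data) {
    // Swap the URL-unsafe characters and drop the trailing '=' padding.
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

function base64url_decode($data) {
    // Restore the padding up to the next multiple of 4 before decoding.
    $padded = str_pad($data, strlen($data) + (4 - strlen($data) % 4) % 4, '=', STR_PAD_RIGHT);
    return base64_decode(strtr($padded, '-_', '+/'));
}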
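
The padding step in the decoder can be checked in isolation. This is a minimal sketch, assuming a hypothetical helper base64url_missing_padding() that is not part of the answer: an unpadded string of length n needs (4 - n % 4) % 4 '=' characters to reach a multiple of 4.

```php
<?php
// Hypothetical helper (not from the answer): number of '=' characters that a
// base64url string of the given length lost when its padding was stripped.
function base64url_missing_padding(int $length): int
{
    return (4 - $length % 4) % 4;
}

$roundTrips = [];
foreach (['', 'f', 'fo', 'foo', 'foob'] as $raw) {
    // Encode and strip the padding, as base64url_encode() does.
    $stripped = rtrim(strtr(base64_encode($raw), '+/', '-_'), '=');

    // Re-pad to a multiple of 4, then decode in strict mode.
    $padded = $stripped . str_repeat('=', base64url_missing_padding(strlen($stripped)));
    $roundTrips[$raw] = base64_decode(strtr($padded, '-_', '+/'), true);

    printf("%-4s %-6s %-8s\n", $raw, $stripped, $padded);
}
```

Each input should come back unchanged in $roundTrips. A stripped length with remainder 1 never results from valid base64, so that case does not arise for strings produced by the encoder.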
Knelt answered 21/7, 2015 at 8:31 Comment(2)
This works for data encoded with Java's Base64.getUrlEncoder().withoutPadding().encodeToString() - Michikomichon
This version of base64url_decode() was breaking my JSON. - Charlinecharlock
H
5

I don't think that this is safe because e.g. the "=" character is used in raw base 64 and is also used in differentiating the parameters from the values in an HTTP GET.

Habana answered 3/9, 2009 at 17:18 Comment(0)
L
3

If you have the sodium extension installed and need to encode binary data, you can use the sodium_bin2base64 function, which lets you select a URL-safe variant.

For example, encoding can be done like this:

$string = sodium_bin2base64($binData, SODIUM_BASE64_VARIANT_URLSAFE);

and decoding:

$result = sodium_base642bin($base64String, SODIUM_BASE64_VARIANT_URLSAFE);

For more info about usage, check out php docs:

https://www.php.net/manual/en/function.sodium-bin2base64.php
https://www.php.net/manual/en/function.sodium-base642bin.php

Luminosity answered 8/12, 2021 at 14:50 Comment(0)
B
1

For URL-safe encoding, like base64.urlsafe_b64encode(...) in Python, the code below works for me 100%.

function base64UrlSafeEncode(string $input)
{
   return str_replace(['+', '/'], ['-', '_'], base64_encode($input));
}
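
The answer only gives the encoding side. A matching decoder might look like the sketch below; base64UrlSafeDecode() is a hypothetical name, not part of the original answer, and the encoder is repeated only so the example runs on its own. Like Python's urlsafe_b64encode, this encoder keeps the '=' padding, so the decoder has nothing to restore.

```php
<?php
// Encoder as given in the answer, repeated so the sketch is self-contained.
function base64UrlSafeEncode(string $input)
{
    return str_replace(['+', '/'], ['-', '_'], base64_encode($input));
}

// Hypothetical counterpart: undo the substitution, then decode in strict
// mode so that characters outside the alphabet yield false.
function base64UrlSafeDecode(string $input)
{
    return base64_decode(str_replace(['-', '_'], ['+', '/'], $input), true);
}

// Arbitrary sample bytes that produce '+', '/' and '=' in plain base64.
$raw     = "\xfb\xff\xfe\xff";
$encoded = base64UrlSafeEncode($raw);   // "-__-_w=="
$decoded = base64UrlSafeDecode($encoded);

var_dump($decoded === $raw);
```

Because the trailing '=' is kept, the value still contains a character with meaning in query strings.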
Bagdad answered 9/5, 2020 at 14:48 Comment(0)
A
1

I know I'm 14 years too late, but one way to absolutely positively guarantee that you are getting alpha-numeric text (no punctuation, symbols, etc) would be to use these functions:

function hex_encode($input) {
  return bin2hex($input);
}

function hex_decode($input) {
  return pack("H*", $input);
}

You can pass your base64_encode()'d string through the hex_encode and get back something like this:

2821583e34f623d682f26684e5c27

I know it inflates the size of the string, but you're guaranteed not to have to worry about +, /, =, etc.
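
To put a number on the size increase, here is a minimal sketch using arbitrary sample bytes (not from the answer). It hex-encodes the base64 string as suggested above. For comparison it also hex-encodes the raw bytes directly, a variation the answer does not propose.

```php
<?php
// Arbitrary sample bytes for illustration.
$raw = "\xfb\xff\xfe\xff";
$b64 = base64_encode($raw);        // 8 characters, including '+', '/' and '='

$hexOfBase64 = bin2hex($b64);      // two hex digits per base64 character
$hexOfRaw    = bin2hex($raw);      // two hex digits per original byte

printf("raw=%d base64=%d hex(base64)=%d hex(raw)=%d\n",
    strlen($raw), strlen($b64), strlen($hexOfBase64), strlen($hexOfRaw));
// raw=4 base64=8 hex(base64)=16 hex(raw)=8

// Both outputs contain only the characters 0-9 and a-f.
var_dump(ctype_xdigit($hexOfBase64), ctype_xdigit($hexOfRaw));

// pack("H*", ...) reverses bin2hex(), as in hex_decode() above.
$restored = base64_decode(pack("H*", $hexOfBase64));
var_dump($restored === $raw);
```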

Aerobatics answered 27/2 at 1:20 Comment(0)
A
0

In theory, yes, as long as you don't exceed the maximum URL and/or query string length for the client or server.

In practice, things can get a bit trickier. For example, it can trigger an HttpRequestValidationException on ASP.NET if the value happens to contain an "on" and you leave in the trailing "==".

Arondell answered 3/9, 2009 at 17:22 Comment(1)
You make no mention of the +, /, or = characters, which make URLs invalid in certain cases. - Zippel
M
0

Those using .NET can use the Encode and Decode methods of the Base64UrlEncoder class, found in the Microsoft.IdentityModel.Tokens package (v6.31.0).

Montano answered 2/7, 2023 at 6:58 Comment(0)
