Why were JavaScript's `atob()` and `btoa()` named like that?

5

433

In JavaScript, the window.atob() method decodes a base64 string and the window.btoa() method encodes a string into base64.

Then why weren't they named something like base64Decode() and base64Encode()? The names atob() and btoa() don't make sense to me because they say nothing about what the functions do.

I want to know the reason.

Boneset answered 22/11, 2015 at 11:11 Comment(4)
I too was convinced that atob and btoa were named backwards, with A being the original string and B the encoded string; it is an unfortunate coincidence that Base64 shares its initial with the encoded string B. Adding to the confusion, I started using Linux only during the last decade, and Linux provides the base64 program, so I never had to know that btoa did the same. I hardly ever question naming choices, but after many years I just had to know.Ishii
You could go function abes46neoced(a){return swab(swab(atob(a)));} but you would need to write your own swab function.Claudelle
It's because the 'b' in atob stands for binary, not base64. ASCII is base64 encoded, and strings are binary.Jellyfish
@janac ASCII is NOT base64, if anything, one could maybe argue base64 is a subset of ASCII. But that's not really true either, it just gets represented by alphanumeric symbols we all recognize. ASCII was originally designed with 7 bits aka base128 and the newer utf-8 format and now utf-16 are the most common. Base256 and base65536. But no one calls them that. Base64 was "created" and used to ensure older incompatible devices could talk, since some used 7bits and some used 6bits for network communication. Base64 is 6 bits so it worked on both.Bibliopegy
254

I asked Brendan Eich (the creator of JavaScript) on Twitter if he picked those names, and he responded:

Old Unix names, hard to find man pages rn but see https://www.unix.com/man-page/minix/1/btoa/ …. The names carried over from Unix into the Netscape codebase. I reflected them into JS in a big hurry in 1995 (after the ten days in May but soon).

In case the Minix link breaks, here's the man page content:

BTOA(1)                                           BTOA(1)

NAME
       btoa - binary to ascii conversion

SYNOPSIS
       btoa [-adhor] [infile] [outfile]

OPTIONS
       -a     Decode, rather than encode, the file

       -d     Extracts repair file from diagnosis file

       -h     Help menu is displayed giving the options

       -o     The obsolete algorithm is used for backward compatibility

       -r     Repair a damaged file

EXAMPLES
       btoa <a.out >a.btoa # Convert a.out to ASCII

       btoa -a <a.btoa >a.out
               # Reverse the above

DESCRIPTION
       Btoa  is  a  filter that converts a binary file to ascii for transmission over a telephone
       line.  If two file names are provided, the first in used for input and the second for out-
       put.   If  only one is provided, it is used as the input file.  The program is a function-
       ally similar alternative to uue/uud, but the encoding is completely different.  Since both
       of  these are widely used, both have been provided with MINIX.  The file is expanded about
       25 percent in the process.

SEE ALSO
       uue(1), uud(1).
Anastigmat answered 21/5, 2018 at 17:44 Comment(2)
Well, this is the actual answer to OP's question.Octopus
In my head, I always expanded the functions to asciiToBase64 and base64ToAscii, which always confuses me, since they actually do the exact opposite of that. That answer finally provides a good explanation that even makes some sense. Hope my brain will be able to pick that up :)Elderberry
219

The atob() and btoa() methods allow authors to transform content to and from the base64 encoding.

In these APIs, for mnemonic purposes, the "b" can be considered to stand for "binary", and the "a" for "ASCII". In practice, though, for primarily historical reasons, both the input and output of these functions are Unicode strings.

From: http://www.w3.org/TR/html/webappapis.html#atob
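
As a rough illustration of the "Unicode strings" caveat, here is a minimal sketch. It assumes a runtime with global btoa/atob and TextEncoder/TextDecoder (modern browsers, recent Node.js). The helper names utf8ToBase64 and base64ToUtf8 are made up for this example and are not part of any standard API. btoa() only accepts characters in the range 0x00 to 0xFF, so arbitrary text has to be turned into bytes first:

```javascript
// Hypothetical helpers for illustration; not part of any standard API.
// Assumes global btoa/atob and TextEncoder/TextDecoder are available.

// Encode arbitrary text as UTF-8 bytes, then base64-encode those bytes.
function utf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const byte of bytes) {
    // Build a "binary string": one character (0x00-0xFF) per byte.
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Decode base64 into a binary string, then interpret the bytes as UTF-8.
function base64ToUtf8(base64) {
  const binary = atob(base64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

// Calling btoa directly on a character above 0xFF is rejected.
let directCallThrew = false;
try {
  btoa("\u2713"); // U+2713 CHECK MARK, outside the 0x00-0xFF range
} catch (err) {
  directCallThrew = true;
}

const encoded = utf8ToBase64("\u2713 done");
const decoded = base64ToUtf8(encoded);
```

Going through UTF-8 is only one possible choice of byte encoding; the point is that the "binary" side of btoa/atob is a string in which every character stands for a single byte.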

Radcliff answered 22/11, 2015 at 11:14 Comment(14)
This, and following the C tradition like atoi() etc.Orthognathous
But it's backward. atob() converts binary to ASCII, and btoa() converts ASCII to binary.Magical
ascii is base64, and atob is ascii to binary. they kind of left this out of both answers. so it isn't reversedCynth
So the String is Binary?! And I thought all the time, binary was something like 0 and 1. This is SO CONFUSING!Eb
@StefanRein I agree with you. window.btoa reads its argument as binary data and splits it into 6-bit chunks in order to encode it, so from that point of view the naming makes sense. However, window.btoa only takes a string as its argument! :(Brian
@K._ > "However, also, window.btoa only takes a string as its argument!" < That's true but the string here is only a representation of the data. Like if you try to open an image in a notepad it'll display as a string but it's still binary data. btoa's main advantage is that it doesn't care what format the string is in, it just treats it as binary. It's only incidental that in most cases that string happens to be a regular string.Colonel
This is a typical Javascript approach. Even mnemonics are broken in this language.Cannelloni
No it's a browser / web standard API, not JavaScript. Node.js has better APIs and first-class binary support via Buffer, e.g. buffer.toString('base64') and Buffer.from(base64string, 'base64')Paleontology
@StefanRein @Cynth Not only String is Binary, but also jpg,mp4,avi,gif……any binary file can encode into base64 ASCII array.Davison
I think we can all agree these are the worst named functions in the standard. To my brain, "b" must be short for base64 and "a" must be short for ASCII, yet it's the complete opposite of that. I just… why?Microphone
strange mnemonic choice :)Talanta
The "Binary" isn't supposed to mean normal text strings, but actual blob data - think images or document files. I guess its use for text strings comes from the need to safely store web form data without the threat of code injection, or some similar reason.Calhoun
@jtheletter It's not backward, though I agree it's confusing. The "binary" input refers to a string representation of binary data, in which each character is interpreted as one byte (hence why codepoints above 0xff aren't allowed). Not all of this range is ASCII compliant (for example, btoa('ð\x9f\x92©') doesn't throw, even though none of these characters are ASCII). The "ASCII", meanwhile, refers to Base64, which is ASCII compliant. I agree it's a horrible naming system though.Transmarine
ASCII is not the same as base64, it is a superset.Natika
188

To sum up the already given answers:

  • atob stands for ASCII to binary
    • e.g.: atob("ZXhhbXBsZSELCg==") == "example!\x0b\n"
  • btoa stands for binary to ASCII
    • e.g.: btoa("\x01\x02\xfe\xff") == "AQL+/w=="

Why ASCII and binary:

  • ASCII (the a) is the result of base64 encoding: safe text composed only of a subset of ASCII characters(*) that can be correctly represented and transported (e.g. in an email body),
  • binary (the b) is any stream of 0s and 1s (in JavaScript it must be represented with a string type).

(*) in base64 these are limited to: A-Z, a-z, 0-9, +, / and = (padding, only at the end) https://en.wikipedia.org/wiki/Base64
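
A small sketch of the two directions described above, assuming a runtime where btoa and atob are globals (browsers, recent Node.js). The variable names are only for illustration:

```javascript
// "Binary" side: a string in which each character stands for one byte.
const binary = "\x01\x02\xfe\xff";

// btoa: binary to ASCII (base64 text).
const ascii = btoa(binary);

// atob: ASCII (base64 text) back to binary.
const roundTrip = atob(ascii);

// The encoded form uses only A-Z, a-z, 0-9, +, / and trailing = padding.
const isBase64Alphabet = /^[A-Za-z0-9+/]*={0,2}$/.test(ascii);

console.log(ascii);               // AQL+/w==
console.log(roundTrip === binary); // true
console.log(isBase64Alphabet);     // true
```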

P.S. I must admit I myself was initially confused by the naming and thought the names were swapped. I thought that b stood for "base64 encoded string" and a for "any string" :D.

Palestra answered 16/5, 2017 at 15:12 Comment(2)
I think you basically just proved everyone's point: base64 is a subset of ASCII, therefore while you might argue that the output of btoa is still technically ASCII, there's no justification for the name atob which only accepts base64 as input.Microphone
It helps to think and remember 'a'(ascii) as base64 output and 'b'(binary) as stream of 0 and 1 which is string.Deferment
88

The names come from a Unix utility with similar functionality, but you can already read that in other answers here.


Here is my mnemonic to remember which one to use. This doesn't really answer the question itself, but it might help people figure out which of the functions to use without keeping a tab open on this Stack Overflow question all day long.

Beautiful to Awful btoa

Take something Beautiful (aka, beautiful content that would make sense to your application: json, xml, text, binary data) and transform it to something Awful, that cannot be understood as is (aka: encoded).

Awful to Beautiful atob

The exact opposite of btoa

Note

Some may say that binary is not beautiful, but hey, this is only a trick to help you.

Hyozo answered 8/12, 2020 at 21:22 Comment(1)
This mnemonic seems far more confusing than just remembering what the name is actually meant to stand for, which is "Binary to ASCII". It's pretty unintuitive that in your mnemonic, binary content that may literally not even contain printable characters is meant to be "beautiful" while the ASCII content is "awful".Soot
9

I can't locate a source at the moment, but it is common knowledge that in this case, the b stands for 'binary', and the a for 'ASCII'.

Therefore, the functions are actually named:

ASCII to Binary for atob(), and Binary to ASCII for btoa().

Also, note that this is a browser API that was kept for legacy / backward compatibility purposes. In Node.js, you would use:

Buffer.from("Hello World").toString('base64')
Buffer.from("SGVsbG8gV29ybGQ=", 'base64').toString('ascii')
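
A slightly fuller sketch of the Node.js route, assuming Node.js where Buffer is a global. Using "utf8" instead of "ascii" when decoding is my own suggestion for text that may contain non-ASCII characters, and the variable names are only for illustration:

```javascript
// Assumes Node.js, where Buffer is available as a global.

// Text to base64 and back, treating the text as UTF-8.
const encoded = Buffer.from("Hello World", "utf8").toString("base64");
const decoded = Buffer.from(encoded, "base64").toString("utf8");

// Raw bytes to base64, with no string involved on the binary side.
const rawEncoded = Buffer.from([0x01, 0x02, 0xfe, 0xff]).toString("base64");

// "latin1" maps each character 0x00-0xFF to one byte, which mirrors how
// btoa interprets its string argument.
const viaLatin1 = Buffer.from("\x01\x02\xfe\xff", "latin1").toString("base64");

console.log(encoded);    // SGVsbG8gV29ybGQ=
console.log(decoded);    // Hello World
console.log(rawEncoded); // AQL+/w==
```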
Rosauraroscius answered 22/11, 2015 at 11:17 Comment(1)
In Node you use Buffer.from("Hello World").toString('base64') & Buffer.from("SGVsbG8gV29ybGQ=", 'base64').toString('ascii')Vanhook
