How do I base64 encode (decode) in C?

I have binary data in an unsigned char buffer and need to convert it to PEM base64 in C. I looked through the OpenSSL library but could not find a function for this. Does anybody have an idea?

Charry answered 4/12, 2008 at 23:8 Comment(5)
I have a github repository with tested base64 and unbase64 functions. The only header you need is base64.hImmortality
Unfortunately most of the answers here are completely off-topic. C++ is not C.Botts
@JoeCoder See comment on libb64 below.Incredible
@JonathanBen-Avraham Since libb64 is itself implemented in c++ I suspect the answer can be considered also off-topic.Protrusive
@Protrusive The OP mentions that he looked into openssl library functions, which indicates that he doesn't care what the library language is, libb64 is definitely relevant as a maintained, tested solution preferable to home brew solutions. The OP did not indicate any platform restrictions such as bare metal, FreeRTOS, MS WIndows.Incredible
G
130

Here's the one I'm using:

#include <stdint.h>
#include <stdlib.h>


static char encoding_table[] = {'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H',
                                'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P',
                                'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
                                'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f',
                                'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
                                'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
                                'w', 'x', 'y', 'z', '0', '1', '2', '3',
                                '4', '5', '6', '7', '8', '9', '+', '/'};
static char *decoding_table = NULL;
static int mod_table[] = {0, 2, 1};

void build_decoding_table(void);   /* defined below; declared here so base64_decode can call it */


char *base64_encode(const unsigned char *data,
                    size_t input_length,
                    size_t *output_length) {

    *output_length = 4 * ((input_length + 2) / 3);

    char *encoded_data = malloc(*output_length);
    if (encoded_data == NULL) return NULL;

    for (size_t i = 0, j = 0; i < input_length;) {

        uint32_t octet_a = i < input_length ? (unsigned char)data[i++] : 0;
        uint32_t octet_b = i < input_length ? (unsigned char)data[i++] : 0;
        uint32_t octet_c = i < input_length ? (unsigned char)data[i++] : 0;

        uint32_t triple = (octet_a << 0x10) + (octet_b << 0x08) + octet_c;

        encoded_data[j++] = encoding_table[(triple >> 3 * 6) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 2 * 6) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 1 * 6) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 0 * 6) & 0x3F];
    }

    for (int i = 0; i < mod_table[input_length % 3]; i++)
        encoded_data[*output_length - 1 - i] = '=';

    return encoded_data;
}


unsigned char *base64_decode(const char *data,
                             size_t input_length,
                             size_t *output_length) {

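    if (decoding_table == NULL) build_decoding_table();
    if (decoding_table == NULL) return NULL; /* table allocation failed */

    /* Reject empty or unpadded input; the padding checks below read data[input_length - 1]. */
    if (input_length == 0 || input_length % 4 != 0) return NULL;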

    *output_length = input_length / 4 * 3;
    if (data[input_length - 1] == '=') (*output_length)--;
    if (data[input_length - 2] == '=') (*output_length)--;

    unsigned char *decoded_data = malloc(*output_length);
    if (decoded_data == NULL) return NULL;

    for (size_t i = 0, j = 0; i < input_length;) {

        /* Cast to unsigned char so bytes >= 0x80 cannot produce a negative table index. */
        uint32_t sextet_a = data[i] == '=' ? 0 & i++ : decoding_table[(unsigned char)data[i++]];
        uint32_t sextet_b = data[i] == '=' ? 0 & i++ : decoding_table[(unsigned char)data[i++]];
        uint32_t sextet_c = data[i] == '=' ? 0 & i++ : decoding_table[(unsigned char)data[i++]];
        uint32_t sextet_d = data[i] == '=' ? 0 & i++ : decoding_table[(unsigned char)data[i++]];

        uint32_t triple = (sextet_a << 3 * 6)
        + (sextet_b << 2 * 6)
        + (sextet_c << 1 * 6)
        + (sextet_d << 0 * 6);

        if (j < *output_length) decoded_data[j++] = (triple >> 2 * 8) & 0xFF;
        if (j < *output_length) decoded_data[j++] = (triple >> 1 * 8) & 0xFF;
        if (j < *output_length) decoded_data[j++] = (triple >> 0 * 8) & 0xFF;
    }

    return decoded_data;
}


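void build_decoding_table(void) {

    /* calloc zero-fills, so bytes outside the alphabet decode as 0
       instead of reading uninitialized memory. */
    decoding_table = calloc(256, 1);
    if (decoding_table == NULL) return;

    for (int i = 0; i < 64; i++)
        decoding_table[(unsigned char) encoding_table[i]] = i;
}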


void base64_cleanup(void) {
    free(decoding_table);
    decoding_table = NULL;   /* lets base64_decode rebuild the table if it is called again */
}

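Keep in mind that this doesn't do any error checking while decoding: input that is not valid base64 will still be processed. Also note that the buffer returned by base64_encode is not null-terminated, so use *output_length rather than treating the result as a C string.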
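
Because the decoder above accepts any input, a caller may want to check a string before decoding it. The following is a minimal sketch of such a check, assuming padded input with the standard alphabet; base64_validate and b64_alphabet are hypothetical names that are not part of the answer's code.

```c
#include <stddef.h>
#include <string.h>

/* Standard base64 alphabet, in the same order as encoding_table above. */
static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/*
 * Hypothetical helper (an assumption for this example).
 * Returns 1 if data[0..len) is well-formed padded base64 and stores the
 * exact decoded size in *decoded_len; returns 0 otherwise.
 */
int base64_validate(const char *data, size_t len, size_t *decoded_len)
{
    size_t pad = 0;

    if (len % 4 != 0)
        return 0;                 /* padded base64 comes in groups of four */

    /* At most two '=' are allowed, and only at the very end. */
    if (len >= 1 && data[len - 1] == '=') {
        pad = 1;
        if (len >= 2 && data[len - 2] == '=')
            pad = 2;
    }

    for (size_t i = 0; i < len - pad; i++) {
        /* strchr also matches the terminating '\0', so reject it explicitly. */
        if (data[i] == '\0' || strchr(b64_alphabet, data[i]) == NULL)
            return 0;
    }

    *decoded_len = len / 4 * 3 - pad;
    return 1;
}
```

This sketch does not check that the unused low bits of the last sextet are zero, and it does not skip line breaks, so PEM-style wrapped input would have to be unwrapped first.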

Grantland answered 21/7, 2011 at 20:40 Comment(13)
It doesn't make any sense to use this if there is a library.Montreal
You can skip the libm and math.h "dependency" as well the need for floating point operations (which are slow on some hardware), by using *output_length = ((input_length - 1) / 3) * 4 + 4; in the beginning of base64_encode.Highgrade
I realize it is "no error checking", but especially notice that although the decoding table in the decoder is an array of 256, since char is signed on most architectures, you are really indexing from -128 to 127. Any character with the high bit set will cause you to read outside the allocated memory. Forcing the data lookup to be an unsigned char clears that up. You still get garbage out for garbage in, but you won't segfault.Reine
You have an array out-of-bounds problem in build_decoding_table. encoding_table[64] to encoding_table[255] do not exist.Immortality
why are you malloc'ing 256 and then are you fitting the malloc'ed area with only 64 byte? ThxKioto
Wanted to share an optimized version for encoding, you can apply similar logic to the decoding method.Bilow
This code has a sign bug: uint32_t octet_a = i < input_length ? data[i++] : 0; should be uint32_t octet_a = i < str.length() ? (unsigned char)str[i++] : 0; (others similar)Oeo
uint32_t octet_a = i < str.length() ? (unsigned char)data[i++] : 0;Oeo
@Kioto there the i is being used as an index for the indexHoover
Decoding also does not handle the situation where the padding "=" are missing. Together with all other errors a pretty bad implementation.Nonobedience
Could you please add a short usage example for encoding?Subjective
Thanks for this answer, I think it offers a really nice approach of which I implemented most (only talking about the encoding). I noticed that the output_length is not correct at all times, though. Using abcde as input, I get an output with the length of 9, which should not be possible. The ouput length should always be a multiple of 4.Warchaw
The for (int i = 0; i < mod_table[input_length % 3]; i++) encoded_data[*output_length - 1 - i] = '='; code part is causing me segfault when using printf with floats.Uncourtly
S
114

I know this question is quite old, but I was getting confused by the number of solutions provided, each one claiming to be faster and better. I put together a project on GitHub to compare the base64 encoders and decoders: https://github.com/gaspardpetit/base64/

At this point, I have not limited myself to C algorithms - if one implementation performs well in C++, it can easily be backported to C. Also tests were conducted using Visual Studio 2015. If somebody wants to update this answer with results from clang/gcc, be my guest.

FASTEST ENCODERS: The two fastest encoder implementations I found were Jouni Malinen's at http://web.mit.edu/freebsd/head/contrib/wpa/src/utils/base64.c and the Apache at https://opensource.apple.com/source/QuickTimeStreamingServer/QuickTimeStreamingServer-452/CommonUtilitiesLib/base64.c.

Here is the time (in microseconds) to encode 32K of data using the different algorithms I have tested up to now:

jounimalinen                25.1544
apache                      25.5309
NibbleAndAHalf              38.4165
internetsoftwareconsortium  48.2879
polfosol                    48.7955
wikibooks_org_c             51.9659
gnome                       74.8188
elegantdice                 118.899
libb64                      120.601
manuelmartinez              120.801
arduino                     126.262
daedalusalpha               126.473
CppCodec                    151.866
wikibooks_org_cpp           343.2
adp_gmbh                    381.523
LihO                        406.693
libcurl                     3246.39
user152949                  4828.21

(René Nyffenegger's solution, credited in another answer to this question, is listed here as adp_gmbh).

Here is the one from Jouni Malinen that I slightly modified to return a std::string:

/*
* Base64 encoding/decoding (RFC1341)
* Copyright (c) 2005-2011, Jouni Malinen <[email protected]>
*
* This software may be distributed under the terms of the BSD license.
* See README for more details.
*/

// 2016-12-12 - Gaspard Petit : Slightly modified to return a std::string 
// instead of a buffer allocated with malloc.

#include <string>

static const unsigned char base64_table[65] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/**
* base64_encode - Base64 encode
* @src: Data to be encoded
* @len: Length of the data to be encoded
* Returns: std::string holding the encoded data,
* or empty string on failure
*/
std::string base64_encode(const unsigned char *src, size_t len)
{
    unsigned char *out, *pos;
    const unsigned char *end, *in;

    size_t olen;

    olen = 4*((len + 2) / 3); /* 3-byte blocks to 4-byte */

    if (olen < len)
        return std::string(); /* integer overflow */

    std::string outStr;
    outStr.resize(olen);
    out = (unsigned char*)&outStr[0];

    end = src + len;
    in = src;
    pos = out;
    while (end - in >= 3) {
        *pos++ = base64_table[in[0] >> 2];
        *pos++ = base64_table[((in[0] & 0x03) << 4) | (in[1] >> 4)];
        *pos++ = base64_table[((in[1] & 0x0f) << 2) | (in[2] >> 6)];
        *pos++ = base64_table[in[2] & 0x3f];
        in += 3;
    }

    if (end - in) {
        *pos++ = base64_table[in[0] >> 2];
        if (end - in == 1) {
            *pos++ = base64_table[(in[0] & 0x03) << 4];
            *pos++ = '=';
        }
        else {
            *pos++ = base64_table[((in[0] & 0x03) << 4) |
                (in[1] >> 4)];
            *pos++ = base64_table[(in[1] & 0x0f) << 2];
        }
        *pos++ = '=';
    }

    return outStr;
}
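
Since the question is tagged C, here is a minimal sketch of the same encoding loop without std::string. It is not a copy of the original wpa_supplicant source: the name base64_encode_c, the b64_table name, and the optional out_len parameter are assumptions made for this example, and the result is a malloc'd, NUL-terminated buffer that the caller must free.

```c
#include <stddef.h>
#include <stdlib.h>

static const unsigned char b64_table[65] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Sketch only: returns a malloc'd, NUL-terminated string, or NULL on failure.
 * out_len may be NULL if the caller does not need the length. */
char *base64_encode_c(const unsigned char *src, size_t len, size_t *out_len)
{
    size_t olen = 4 * ((len + 2) / 3);   /* 3-byte blocks to 4-byte */
    if (olen < len)
        return NULL;                     /* integer overflow */

    char *out = malloc(olen + 1);        /* +1 for the terminating NUL */
    if (out == NULL)
        return NULL;

    const unsigned char *in = src;
    const unsigned char *end = src + len;
    char *pos = out;

    /* Full 3-byte groups. */
    while (end - in >= 3) {
        *pos++ = b64_table[in[0] >> 2];
        *pos++ = b64_table[((in[0] & 0x03) << 4) | (in[1] >> 4)];
        *pos++ = b64_table[((in[1] & 0x0f) << 2) | (in[2] >> 6)];
        *pos++ = b64_table[in[2] & 0x3f];
        in += 3;
    }

    /* One or two leftover bytes, padded with '='. */
    if (end - in) {
        *pos++ = b64_table[in[0] >> 2];
        if (end - in == 1) {
            *pos++ = b64_table[(in[0] & 0x03) << 4];
            *pos++ = '=';
        } else {
            *pos++ = b64_table[((in[0] & 0x03) << 4) | (in[1] >> 4)];
            *pos++ = b64_table[(in[1] & 0x0f) << 2];
        }
        *pos++ = '=';
    }

    *pos = '\0';
    if (out_len != NULL)
        *out_len = (size_t)(pos - out);
    return out;
}
```

A call such as base64_encode_c((const unsigned char *)"Man", 3, &n) yields "TWFu" with n set to 4; the buffer is released with free().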

FASTEST DECODERS: Here are the decoding results and I must admit that I am a bit surprised:

polfosol                    45.2335
wikibooks_org_c             74.7347
apache                      77.1438
libb64                      100.332
gnome                       114.511
manuelmartinez              126.579
elegantdice                 138.514
daedalusalpha               151.561
jounimalinen                206.163
arduino                     335.95
wikibooks_org_cpp           350.437
CppCodec                    526.187
internetsoftwareconsortium  862.833
libcurl                     1280.27
LihO                        1852.4
adp_gmbh                    1934.43
user152949                  5332.87

Polfosol's snippet from "base64 decode snippet in c++" is the fastest, about 1.65x faster than the runner-up (45.2 vs. 74.7 microseconds).

Here is the code for the sake of completeness:

static const int B64index[256] = { 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 62, 63, 62, 62, 63, 52, 53, 54, 55,
56, 57, 58, 59, 60, 61,  0,  0,  0,  0,  0,  0,  0,  0,  1,  2,  3,  4,  5,  6,
7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,  0,
0,  0,  0, 63,  0, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51 };

std::string b64decode(const void* data, const size_t len)
{
    unsigned char* p = (unsigned char*)data;
    int pad = len > 0 && (len % 4 || p[len - 1] == '=');
    const size_t L = ((len + 3) / 4 - pad) * 4;
    std::string str(L / 4 * 3 + pad, '\0');

    for (size_t i = 0, j = 0; i < L; i += 4)
    {
        int n = B64index[p[i]] << 18 | B64index[p[i + 1]] << 12 | B64index[p[i + 2]] << 6 | B64index[p[i + 3]];
        str[j++] = n >> 16;
        str[j++] = n >> 8 & 0xFF;
        str[j++] = n & 0xFF;
    }
    if (pad)
    {
        int n = B64index[p[L]] << 18 | B64index[p[L + 1]] << 12;
        str[str.size() - 1] = n >> 16;

        if (len > L + 2 && p[L + 2] != '=')
        {
            n |= B64index[p[L + 2]] << 6;
            str.push_back(n >> 8 & 0xFF);
        }
    }
    return str;
}
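
For readers who need plain C, the decoder above can be ported roughly as follows. This is a sketch under assumptions, not polfosol's own code: the name b64decode_c, the out_len parameter, the malloc'd return buffer, and the early rejection of lengths with len % 4 == 1 (which would otherwise read past the input) are all additions for this example. The lookup table is copied unchanged.

```c
#include <stddef.h>
#include <stdlib.h>

/* Same lookup table as above; entries past 'z' are implicitly zero. */
static const int B64index[256] = { 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0, 62, 63, 62, 62, 63, 52, 53, 54, 55,
56, 57, 58, 59, 60, 61,  0,  0,  0,  0,  0,  0,  0,  0,  1,  2,  3,  4,  5,  6,
7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,  0,
0,  0,  0, 63,  0, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51 };

/* Sketch of a C port. Returns a malloc'd buffer (caller frees) with one extra
 * NUL byte appended for convenience, or NULL on failure. out_len may be NULL. */
unsigned char *b64decode_c(const void *data, size_t len, size_t *out_len)
{
    const unsigned char *p = (const unsigned char *)data;

    if (len % 4 == 1)
        return NULL;   /* a single leftover character cannot encode a byte */

    int pad = len > 0 && (len % 4 || p[len - 1] == '=');
    const size_t L = ((len + 3) / 4 - pad) * 4;   /* length of the unpadded part */
    size_t size = L / 4 * 3 + pad;

    /* +1 for the optional second byte of the last group, +1 for the NUL. */
    unsigned char *out = malloc(size + 2);
    if (out == NULL)
        return NULL;

    size_t j = 0;
    for (size_t i = 0; i < L; i += 4) {
        int n = B64index[p[i]] << 18 | B64index[p[i + 1]] << 12 |
                B64index[p[i + 2]] << 6 | B64index[p[i + 3]];
        out[j++] = n >> 16;
        out[j++] = n >> 8 & 0xFF;
        out[j++] = n & 0xFF;
    }
    if (pad) {
        int n = B64index[p[L]] << 18 | B64index[p[L + 1]] << 12;
        out[size - 1] = n >> 16;

        if (len > L + 2 && p[L + 2] != '=') {
            n |= B64index[p[L + 2]] << 6;
            out[size++] = n >> 8 & 0xFF;
        }
    }

    out[size] = '\0';
    if (out_len != NULL)
        *out_len = size;
    return out;
}
```

Like the C++ version, this port does no validation: characters outside the table decode as zero bits, so untrusted input should be checked separately.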
Slickenside answered 12/12, 2016 at 5:35 Comment(9)
I really don't think std::string and the rest of functions you used are parts of ANSI C. The question asking for C code, and tagged C, gets most upvoted answer in C++.Whitleather
If one wants a solution that works well for both decoding and encoding without having to take code from two places I would choose the apache version for C and polfosol's solution for C++Floro
@Slickenside Can Polfosol’s decoding be used on Jouni’s encoding?Catchpole
@SamThomas yes, all methods tested provide the same output and are inter-changeable.Slickenside
@Slickenside would you like update a bit of your repository and tests? :) Here is rewritten source with jounimalinen for encoding and polfosol for decoding. It is rewritten to be automotive/avionic compliant following MISRA C:2012 coding guidance. github.com/IMProject/IMUtility/blob/main/Src/base64.c btw. we are not expecting to be the fastest but the fastest most secure/safe one :)Norbertonorbie
The str[j++] is not optimal as there is some overhead to access the string here, and also j++ is not good if you want to vectorize the loop. If you want extra speed, create a pointer to str.c_str(), and then work on the pointer.Blob
Another speed boost is to minimize the shifts, by having static const uint32_t b64_1[256], then b64_6[256], b64_12[256] and b64_18[256]. Overall with those 2 optimizations I'm getting a 1.785x faster code than Polfosol. For strings longer than 4 bytes, we can also use vectorization (SIMD) to gain a lot of speed. I can try later.Blob
If you are looking for SIMD base 64, I'd recommend github.com/powturbo/Turbo-Base64 - it is by far the fastest implementation I have seen.Slickenside
I used one from Apple, but a different one than the one mentioned this one instead: opensource.apple.com/source/ChatServer/ChatServer-37.1/…. It worked well, I described it more here: https://mcmap.net/q/151855/-convert-base64-decoded-string-to-unsigned-char-32Furthermore
P
35

You can also do it with OpenSSL (the openssl enc command does it): look at the BIO_f_base64() function.

Primogeniture answered 4/12, 2008 at 23:28 Comment(1)
It seems like the OP is already using OpenSSL for some other reason, so this is probably the best way to go about it.Waterlogged
C
19

libb64 has both C and C++ APIs. It is lightweight and perhaps the fastest publicly available implementation. It's also a dedicated stand-alone base64 encoding library, which can be nice if you don't need all the other stuff that comes from using a larger library such as OpenSSL or glib.

Counterweigh answered 21/12, 2011 at 16:22 Comment(6)
Note on libb64: BUFFERSIZE is defined in a make file, so if you don't use make/cmake, you'll need to manually define it in the header files in order for it to compile. Works/briefly tested VS2012Esker
As Tom said: #define BUFFERSIZE 16777216 you can replace to 65536 if you need a smaller buffer.Imco
Beware! After an hour of debugging I figured out that libb64 assumes that char is signed on the target system... This is a problem since base64_decode_value could return a negative number which is then casted to char.Homy
Note the sourceforge implementation adds newlines which are not universally supported. A fork by BuLogics on github removes them, and I've generated a pull request based on your extremely useful finding, @Noir.Mandamus
While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes.Daumier
libb64 is supported in Buldroot since 2017.11Incredible
M
19

Here's my solution using OpenSSL.

/* A BASE-64 ENCODER AND DECODER USING OPENSSL */
#include <openssl/pem.h>
#include <stdio.h>  //Needed for printf().
#include <stdlib.h> //Needed for calloc() and free().
#include <string.h> //Only needed for strlen().

char *base64encode (const void *b64_encode_this, int encode_this_many_bytes){
    BIO *b64_bio, *mem_bio;      //Declares two OpenSSL BIOs: a base64 filter and a memory BIO.
    BUF_MEM *mem_bio_mem_ptr;    //Pointer to a "memory BIO" structure holding our base64 data.
    b64_bio = BIO_new(BIO_f_base64());                      //Initialize our base64 filter BIO.
    mem_bio = BIO_new(BIO_s_mem());                           //Initialize our memory sink BIO.
    BIO_push(b64_bio, mem_bio);            //Link the BIOs by creating a filter-sink BIO chain.
    BIO_set_flags(b64_bio, BIO_FLAGS_BASE64_NO_NL);  //No newlines every 64 characters or less.
    BIO_write(b64_bio, b64_encode_this, encode_this_many_bytes); //Records base64 encoded data.
    BIO_flush(b64_bio);   //Flush data.  Necessary for b64 encoding, because of pad characters.
    BIO_get_mem_ptr(mem_bio, &mem_bio_mem_ptr);  //Store address of mem_bio's memory structure.
    char *base64_encoded = malloc(mem_bio_mem_ptr->length + 1);  //Our own copy, +1 for null.
    if (base64_encoded != NULL) {                   //Copy the data out before freeing the BIOs.
        memcpy(base64_encoded, mem_bio_mem_ptr->data, mem_bio_mem_ptr->length);
        base64_encoded[mem_bio_mem_ptr->length] = '\0';        //Adds null-terminator to tail.
    }
    BIO_free_all(b64_bio);  //Destroys all BIOs in chain, including the memory BIO's buffer.
    return base64_encoded;  //Returns base-64 encoded data (NULL if malloc failed); caller frees.
}

char *base64decode (const void *b64_decode_this, int decode_this_many_bytes){
    BIO *b64_bio, *mem_bio;      //Declares two OpenSSL BIOs: a base64 filter and a memory BIO.
    char *base64_decoded = calloc( (decode_this_many_bytes*3)/4+1, sizeof(char) ); //+1 = null.
    b64_bio = BIO_new(BIO_f_base64());                      //Initialize our base64 filter BIO.
    mem_bio = BIO_new(BIO_s_mem());                         //Initialize our memory source BIO.
    BIO_write(mem_bio, b64_decode_this, decode_this_many_bytes); //Base64 data saved in source.
    BIO_push(b64_bio, mem_bio);          //Link the BIOs by creating a filter-source BIO chain.
    BIO_set_flags(b64_bio, BIO_FLAGS_BASE64_NO_NL);          //Don't require trailing newlines.
    int decoded_byte_index = 0;   //Index where the next base64_decoded byte should be written.
    while ( 0 < BIO_read(b64_bio, base64_decoded+decoded_byte_index, 1) ){ //Read byte-by-byte.
        decoded_byte_index++; //Increment the index until read of BIO decoded data is complete.
    } //Once we're done reading decoded data, BIO_read returns -1 even though there's no error.
    BIO_free_all(b64_bio);  //Destroys all BIOs in chain, starting with b64 (i.e. the 1st one).
    return base64_decoded;        //Returns base-64 decoded data with trailing null terminator.
}

/*Here's one way to base64 encode/decode using the base64encode() and base64decode functions.*/
int main(void){
    char data_to_encode[] = "Base64 encode this string!";  //The string we will base-64 encode.

    int bytes_to_encode = strlen(data_to_encode); //Number of bytes in string to base64 encode.
    char *base64_encoded = base64encode(data_to_encode, bytes_to_encode);   //Base-64 encoding.

    int bytes_to_decode = strlen(base64_encoded); //Number of bytes in string to base64 decode.
    char *base64_decoded = base64decode(base64_encoded, bytes_to_decode);   //Base-64 decoding.

    printf("Original character string is: %s\n", data_to_encode);  //Prints our initial string.
    printf("Base-64 encoded string is: %s\n", base64_encoded);  //Prints base64 encoded string.
    printf("Base-64 decoded string is: %s\n", base64_decoded);  //Prints base64 decoded string.

    free(base64_encoded);                //Frees up the memory holding our base64 encoded data.
    free(base64_decoded);                //Frees up the memory holding our base64 decoded data.
}
Melanism answered 12/5, 2013 at 19:14 Comment(8)
On the "Adds a null-terminator" line I get an AddressSanitizer error that the write overflows the heap by 1 byte.Philander
Thanks, I have corrected the error, in addition to doing extensive testing with randomly-sized strings of random bytes to ensure that the code works as advertised. :)Melanism
NICE! I compiled it with cc -o base base.c -lssl -lcrypto. No errors. It produced this output: Original character string is: Base64 encode this string! Base-64 encoded string is: QmFzZTY0IGVuY29kZSB0aGlzIHN0cmluZyE= Base-64 decoded string is: Base64 encode this string!Monocot
@Melanism I have a file that is encoded as a string using python, but when i decode the string using your function and try to write the decoded result to a file(in C) i don't get the same file back. The encoded string is correct. ``` const unsigned char *jarFile = "<encoded file>"; int main() { print_version(); FILE *fp; char *out = base64decode(jarFile, strlen(jarFile)); fp = fopen("file.jar","wb") ; if (fp==NULL){ printf("File open failed"); return 1; } fwrite(out,sizeof(out),1,fp); fclose(fp); free(out); return 0; }```Catchpole
@SamThomas Using strlen works in my example because I made a string where only one null terminator exists (and it's at the end of the string). See: tutorialspoint.com/cprogramming/c_strings.htm Reading in jarFile with strlen will fail, because a null terminator likely exists in the middle of your binary file, messing up the bytes_to_decode value. See: #24596689 Find the size of your file a different way: #239103Melanism
This has a memory leak, at least in openssl1.1 version on ubuntu 20.04Chlores
@Chlores How do you know there's a memory leak and do you know where it is in the code?Melanism
@Melanism compile and run this program via valgrindChlores
T
17

glib has functions for base64 encoding: https://developer.gnome.org/glib/stable/glib-Base64-Encoding.html

Trahern answered 4/12, 2008 at 23:11 Comment(2)
While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes.Daumier
@UyghurLivesMatter ...which has now happened. :|Uria
D
15

GNU coreutils has it in lib/base64. It's a little bloated but deals with stuff like EBCDIC. You can also play around on your own, e.g.,

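#include <assert.h>
#include <ctype.h>

/* Note: this toy alphabet (0-9, a-z, A-Z, '=', '.') is NOT standard base64. */
char base64_digit(unsigned n) {
  if (n < 10) return n + '0';
  else if (n < 10 + 26) return n - 10 + 'a';
  else if (n < 10 + 26 + 26) return n - 10 - 26 + 'A';
  else if (n == 62) return '=';
  else if (n == 63) return '.';
  else assert(0);
  return 0;
}

unsigned char base64_decode_digit(char c) {
  switch (c) {
    case '=' : return 62;
    case '.' : return 63;
    default  :
      if (isdigit((unsigned char)c)) return c - '0';
      else if (islower((unsigned char)c)) return c - 'a' + 10;
      else if (isupper((unsigned char)c)) return c - 'A' + 10 + 26;
      else assert(0);
  }
  return 0xff;
}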

unsigned base64_decode(char *s) {
  char *p;
  unsigned n = 0;

  for (p = s; *p; p++)
    n = 64 * n + base64_decode_digit(*p);

  return n;
}

Know ye all persons by these presents that you should not confuse "playing around on your own" with "implementing a standard." Yeesh.
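
For comparison with the toy alphabet above, here is a minimal sketch of the same two digit mappings using the standard ordering (A-Z, a-z, 0-9, then '+' as 62 and '/' as 63), which is what PEM expects. The names b64_std_digit and b64_std_value are hypothetical. Looking characters up in a string rather than doing range arithmetic keeps the sketch independent of the execution character set.

```c
#include <string.h>

static const char b64_std_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Map a 6-bit value to its base64 character; only the low 6 bits of n are used. */
char b64_std_digit(unsigned n)
{
    return b64_std_alphabet[n & 0x3F];
}

/* Map a base64 character to its 6-bit value, or -1 if it is not in the
 * alphabet (this includes the padding character '=' and '\0'). */
int b64_std_value(char c)
{
    const char *p;

    if (c == '\0')
        return -1;            /* strchr would match the string terminator */
    p = strchr(b64_std_alphabet, c);
    return p != NULL ? (int)(p - b64_std_alphabet) : -1;
}
```

The linear strchr search is slower than a 256-entry lookup table, so this form is meant for clarity rather than speed.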

Daffy answered 5/12, 2008 at 2:37 Comment(3)
Also, '+' is 62 and '/' is 63 in PEM base64 as asked for by OP. Here is a list of base64 encoding variants. I do not see a base64 encoding variant with the ordering of characters you use. But the math behind the algorithm is correct.Pemmican
As already said : be careful this algorithm is not compatible with common base64Whiff
What about encoding?Deledda
M
15

I needed a C++ implementation working on std::string. None of the answers satisfied my needs: I wanted a simple two-function solution for encoding and decoding, but I was too lazy to write my own code, so I found this:

http://www.adp-gmbh.ch/cpp/common/base64.html

Credits for code go to René Nyffenegger.

Putting the code below in case the site goes down:

base64.cpp

/* 
   base64.cpp and base64.h

   Copyright (C) 2004-2008 René Nyffenegger

   This source code is provided 'as-is', without any express or implied
   warranty. In no event will the author be held liable for any damages
   arising from the use of this software.

   Permission is granted to anyone to use this software for any purpose,
   including commercial applications, and to alter it and redistribute it
   freely, subject to the following restrictions:

   1. The origin of this source code must not be misrepresented; you must not
      claim that you wrote the original source code. If you use this source code
      in a product, an acknowledgment in the product documentation would be
      appreciated but is not required.

   2. Altered source versions must be plainly marked as such, and must not be
      misrepresented as being the original source code.

   3. This notice may not be removed or altered from any source distribution.

   René Nyffenegger [email protected]

*/

#include "base64.h"
#include <cctype>   // isalnum

static const std::string base64_chars = 
             "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
             "abcdefghijklmnopqrstuvwxyz"
             "0123456789+/";


static inline bool is_base64(unsigned char c) {
  return (isalnum(c) || (c == '+') || (c == '/'));
}

std::string base64_encode(unsigned char const* bytes_to_encode, unsigned int in_len) {
  std::string ret;
  int i = 0;
  int j = 0;
  unsigned char char_array_3[3];
  unsigned char char_array_4[4];

  while (in_len--) {
    char_array_3[i++] = *(bytes_to_encode++);
    if (i == 3) {
      char_array_4[0] = (char_array_3[0] & 0xfc) >> 2;
      char_array_4[1] = ((char_array_3[0] & 0x03) << 4) + ((char_array_3[1] & 0xf0) >> 4);
      char_array_4[2] = ((char_array_3[1] & 0x0f) << 2) + ((char_array_3[2] & 0xc0) >> 6);
      char_array_4[3] = char_array_3[2] & 0x3f;

      for(i = 0; (i <4) ; i++)
        ret += base64_chars[char_array_4[i]];
      i = 0;
    }
  }

  if (i)
  {
    for(j = i; j < 3; j++)
      char_array_3[j] = '\0';

    char_array_4[0] = (char_array_3[0] & 0xfc) >> 2;
    char_array_4[1] = ((char_array_3[0] & 0x03) << 4) + ((char_array_3[1] & 0xf0) >> 4);
    char_array_4[2] = ((char_array_3[1] & 0x0f) << 2) + ((char_array_3[2] & 0xc0) >> 6);
    char_array_4[3] = char_array_3[2] & 0x3f;

    for (j = 0; (j < i + 1); j++)
      ret += base64_chars[char_array_4[j]];

    while((i++ < 3))
      ret += '=';

  }

  return ret;

}

std::string base64_decode(std::string const& encoded_string) {
  int in_len = encoded_string.size();
  int i = 0;
  int j = 0;
  int in_ = 0;
  unsigned char char_array_4[4], char_array_3[3];
  std::string ret;

  while (in_len-- && ( encoded_string[in_] != '=') && is_base64(encoded_string[in_])) {
    char_array_4[i++] = encoded_string[in_]; in_++;
    if (i ==4) {
      for (i = 0; i <4; i++)
        char_array_4[i] = base64_chars.find(char_array_4[i]);

      char_array_3[0] = (char_array_4[0] << 2) + ((char_array_4[1] & 0x30) >> 4);
      char_array_3[1] = ((char_array_4[1] & 0xf) << 4) + ((char_array_4[2] & 0x3c) >> 2);
      char_array_3[2] = ((char_array_4[2] & 0x3) << 6) + char_array_4[3];

      for (i = 0; (i < 3); i++)
        ret += char_array_3[i];
      i = 0;
    }
  }

  if (i) {
    for (j = i; j <4; j++)
      char_array_4[j] = 0;

    for (j = 0; j <4; j++)
      char_array_4[j] = base64_chars.find(char_array_4[j]);

    char_array_3[0] = (char_array_4[0] << 2) + ((char_array_4[1] & 0x30) >> 4);
    char_array_3[1] = ((char_array_4[1] & 0xf) << 4) + ((char_array_4[2] & 0x3c) >> 2);
    char_array_3[2] = ((char_array_4[2] & 0x3) << 6) + char_array_4[3];

    for (j = 0; (j < i - 1); j++) ret += char_array_3[j];
  }

  return ret;
}

base64.h

#include <string>

std::string base64_encode(unsigned char const* , unsigned int len);
std::string base64_decode(std::string const& s);

Usage

const std::string s = "test";
std::string encoded = base64_encode(reinterpret_cast<const unsigned char*>(s.c_str()), s.length());
std::string decoded = base64_decode(encoded);
Machicolate answered 8/4, 2013 at 13:14 Comment(0)
P
8

Here's the decoder I've been using for years...

    static const char  table[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    static const int   BASE64_INPUT_SIZE = 57;

    int isbase64(char c)   // int instead of BOOL, which is a Windows typedef; needs <string.h>
    {
       return c && strchr(table, c) != NULL;
    }

    inline char value(char c)
    {
       const char *p = strchr(table, c);
       if(p) {
          return p-table;
       } else {
          return 0;
       }
    }

    int UnBase64(unsigned char *dest, const unsigned char *src, int srclen)
    {
       *dest = 0;
       if(*src == 0) 
       {
          return 0;
       }
       unsigned char *p = dest;
       do
       {

          // Write only the bytes this group of four actually encodes, so nothing
          // lands past the decoded data (dest still needs room for the final 0).
          if(!isbase64(src[0]) || !isbase64(src[1]))
          {
             break;
          }
          char a = value(src[0]);
          char b = value(src[1]);
          *p++ = (a << 2) | (b >> 4);
          if(!isbase64(src[2]))
          {
             break;
          }
          char c = value(src[2]);
          *p++ = (b << 4) | (c >> 2);
          if(!isbase64(src[3]))
          {
             break;
          }
          char d = value(src[3]);
          *p++ = (c << 6) | d;
          src += 4;
          while(*src && (*src == 13 || *src == 10)) src++;
       }
       while(srclen-= 4);
       *p = 0;
       return p-dest;
    }
Primeval answered 8/1, 2009 at 23:31 Comment(6)
what is the *dest = 0; at the start for?Edveh
It's just a very simple operation that makes sure the dest buffer is set to NULL in case the caller did not do that before the call, and if perhaps the decode failed, the returned buffer would be zero length. I didn't say I debugged, traced, and profiled this routine, it's just one I've been using for years. :) When I look at it now, it really doesn’t need to be there, so, why don't we call it an "exercise for the reader?" hehe.. Maybe I'll just edit it out. Thanks for pointing it out!Primeval
your UnBase64 function may compromise the memory after the dest buffer, if that buffer is the exact size required to decode the base 64 encoded string. Take for instance the simple case where you try to decode the following base 64 encoded string "BQ==", into a single BYTE i.e. unsigned char Result = 0; UnBase64(&Result, "BQ==", 4); It will corrupt the stack!Blaspheme
Yeah, caused nasty bug in our app. Do not recommend.Lordosis
Hi Larry, thanks for sharing you code. It's very usefull!Divertissement
By "I've been using for years..." you mean the code has been running for years without returning, because of the many slow strchr calls, right? ;) ... I mean not using a reverse lookup table at least saves some memoryJeffiejeffrey
S
8

The EVP_EncodeBlock and EVP_DecodeBlock functions make it very easy:

#include <stdio.h>
#include <stdlib.h>
#include <openssl/evp.h>

char *base64(const unsigned char *input, int length) {
  const int pl = 4*((length+2)/3);
  char *output = calloc(pl+1, 1); //+1 for the terminating null that EVP_EncodeBlock adds on
  const int ol = EVP_EncodeBlock((unsigned char *)output, input, length); //EVP_EncodeBlock writes to unsigned char *
  if (ol != pl) { fprintf(stderr, "Whoops, encode predicted %d but we got %d\n", pl, ol); }
  return output;
}

unsigned char *decode64(const char *input, int length) {
  const int pl = 3*length/4;
  unsigned char *output = calloc(pl+1, 1);
  const int ol = EVP_DecodeBlock(output, (const unsigned char *)input, length); //EVP_DecodeBlock reads from const unsigned char *
  if (pl != ol) { fprintf(stderr, "Whoops, decode predicted %d but we got %d\n", pl, ol); }
  return output;
}
Spillman answered 7/3, 2020 at 18:49 Comment(0)
S
6

A small improvement to the code from ryyst (who got the most votes) is to replace the dynamically allocated decoding table with a precomputed static const table. This eliminates the pointer and the table initialization, and also avoids a memory leak if one forgets to clean up the decoding table with base64_cleanup(). (By the way, base64_cleanup() should set decoding_table = NULL after calling free(decoding_table); otherwise, accidentally calling base64_decode after base64_cleanup() will crash or cause undefined behavior.) Another solution could be to use std::unique_ptr, but I'm satisfied with just having a static const unsigned char[256] table, which lives in static storage rather than on the heap, and avoiding pointers altogether; the code looks cleaner and shorter this way.

The decoding table is computed as follows:

const char encoding_table[] = { 
    'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H',
    'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P',
    'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
    'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f',
    'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
    'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
    'w', 'x', 'y', 'z', '0', '1', '2', '3',
    '4', '5', '6', '7', '8', '9', '+', '/' };

unsigned char decoding_table[256];

for (int i = 0; i < 256; i++)
    decoding_table[i] = '\0';

for (int i = 0; i < 64; i++)
    decoding_table[(unsigned char)encoding_table[i]] = i;

for (int i = 0; i < 256; i++)
    cout << "0x" << (int(decoding_table[i]) < 16 ? "0" : "") << hex << int(decoding_table[i]) << (i != 255 ? "," : "") << ((i+1) % 16 == 0 ? '\n' : '\0');

cin.ignore();
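
Since the question is about C and the generator above uses C++ streams, here is a rough C sketch of the same idea. The names b64_alphabet, build_decoding_table and print_decoding_table are my own for this illustration; like the generator above, every byte outside the alphabet maps to 0.

```c
#include <stdio.h>
#include <string.h>

static const char b64_alphabet[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Fill table[256] so that table[c] is the 6-bit value of character c.
 * Bytes that are not in the alphabet map to 0. */
void build_decoding_table(unsigned char table[256])
{
    memset(table, 0, 256);
    for (int i = 0; i < 64; i++)
        table[(unsigned char)b64_alphabet[i]] = (unsigned char)i;
}

/* Print the table as a C initializer list, 16 entries per row,
 * ready to paste into a static const array. */
void print_decoding_table(const unsigned char table[256])
{
    for (int i = 0; i < 256; i++)
        printf("0x%02x%s%s", (unsigned)table[i],
               i != 255 ? "," : "",
               (i + 1) % 16 == 0 ? "\n" : " ");
}
```

Calling build_decoding_table on a 256-byte array and then print_decoding_table on it should print rows in the same format as the pasted table.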

and the modified code I am using is:

        static const char encoding_table[] = { 
            'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H',
            'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P',
            'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
            'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f',
            'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
            'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
            'w', 'x', 'y', 'z', '0', '1', '2', '3',
            '4', '5', '6', '7', '8', '9', '+', '/' };

        static const unsigned char decoding_table[256] = {
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x3e, 0x00, 0x00, 0x00, 0x3f,
            0x34, 0x35, 0x36, 0x37, 0x38, 0x39, 0x3a, 0x3b, 0x3c, 0x3d, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e,
            0x0f, 0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x1a, 0x1b, 0x1c, 0x1d, 0x1e, 0x1f, 0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, 0x28,
            0x29, 0x2a, 0x2b, 0x2c, 0x2d, 0x2e, 0x2f, 0x30, 0x31, 0x32, 0x33, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
            0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 };

        char* base64_encode(const unsigned char *data, size_t input_length, size_t &output_length) {

            const int mod_table[] = { 0, 2, 1 };

            output_length = 4 * ((input_length + 2) / 3);

            char *encoded_data = (char*)malloc(output_length);

            if (encoded_data == nullptr)
                return nullptr;

            for (int i = 0, j = 0; i < input_length;) {

                uint32_t octet_a = i < input_length ? (unsigned char)data[i++] : 0;
                uint32_t octet_b = i < input_length ? (unsigned char)data[i++] : 0;
                uint32_t octet_c = i < input_length ? (unsigned char)data[i++] : 0;

                uint32_t triple = (octet_a << 0x10) + (octet_b << 0x08) + octet_c;

                encoded_data[j++] = encoding_table[(triple >> 3 * 6) & 0x3F];
                encoded_data[j++] = encoding_table[(triple >> 2 * 6) & 0x3F];
                encoded_data[j++] = encoding_table[(triple >> 1 * 6) & 0x3F];
                encoded_data[j++] = encoding_table[(triple >> 0 * 6) & 0x3F];

            }

            for (int i = 0; i < mod_table[input_length % 3]; i++)
                encoded_data[output_length - 1 - i] = '=';

            return encoded_data;

        };

        unsigned char* base64_decode(const char *data, size_t input_length, size_t &output_length) {        

            if (input_length % 4 != 0)
                return nullptr;

            output_length = input_length / 4 * 3;

            if (data[input_length - 1] == '=') (output_length)--;
            if (data[input_length - 2] == '=') (output_length)--;

            unsigned char* decoded_data = (unsigned char*)malloc(output_length);

            if (decoded_data == nullptr)
                return nullptr;

            for (int i = 0, j = 0; i < input_length;) {

                /* cast to unsigned char: a plain char index can be negative for bytes >= 0x80 */
                uint32_t sextet_a = data[i] == '=' ? 0 & i++ : decoding_table[(unsigned char)data[i++]];
                uint32_t sextet_b = data[i] == '=' ? 0 & i++ : decoding_table[(unsigned char)data[i++]];
                uint32_t sextet_c = data[i] == '=' ? 0 & i++ : decoding_table[(unsigned char)data[i++]];
                uint32_t sextet_d = data[i] == '=' ? 0 & i++ : decoding_table[(unsigned char)data[i++]];

                uint32_t triple = (sextet_a << 3 * 6)
                    + (sextet_b << 2 * 6)
                    + (sextet_c << 1 * 6)
                    + (sextet_d << 0 * 6);

                if (j < output_length) decoded_data[j++] = (triple >> 2 * 8) & 0xFF;
                if (j < output_length) decoded_data[j++] = (triple >> 1 * 8) & 0xFF;
                if (j < output_length) decoded_data[j++] = (triple >> 0 * 8) & 0xFF;

            }

            return decoded_data;

        };
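
The code above is C++ (reference parameter, nullptr), and the comments below point out that the result is not NUL-terminated. A plain C sketch of the encoder with both points addressed might look like this. The names base64_encode_c and b64_table are made up for this example, and the padding is written inside the loop instead of through mod_table, which is my own variation.

```c
#include <stdint.h>
#include <stdlib.h>

static const char b64_table[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Returns a malloc'd, NUL-terminated base64 string, or NULL if the
 * allocation fails. *output_length excludes the terminator.
 * The caller frees the result. */
char *base64_encode_c(const unsigned char *data, size_t input_length,
                      size_t *output_length)
{
    size_t out_len = 4 * ((input_length + 2) / 3);
    char *out = malloc(out_len + 1);            /* +1 for '\0' */
    if (out == NULL)
        return NULL;

    size_t i = 0, j = 0;
    while (i < input_length) {
        size_t remaining = input_length - i;    /* 1, 2, or >= 3 */
        uint32_t a = data[i++];
        uint32_t b = remaining > 1 ? data[i++] : 0;
        uint32_t c = remaining > 2 ? data[i++] : 0;
        uint32_t triple = (a << 16) | (b << 8) | c;

        out[j++] = b64_table[(triple >> 18) & 0x3F];
        out[j++] = b64_table[(triple >> 12) & 0x3F];
        /* a short final group gets '=' instead of the missing sextets */
        out[j++] = remaining > 1 ? b64_table[(triple >> 6) & 0x3F] : '=';
        out[j++] = remaining > 2 ? b64_table[triple & 0x3F] : '=';
    }

    out[j] = '\0';
    *output_length = out_len;
    return out;
}
```

With the two-byte input "Ma" this yields the four characters TWE= followed by the terminator.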
Stuck answered 16/2, 2018 at 0:44 Comment(7)
Like the idea of the included decoding table. Just curious, do you see any benefit in adding in any of the bug fixes from this version of Ryyst's code?Cembalo
I don't know. I am more concerned that he uses if (decoding_table == NULL) build_decoding_table(); in the main function but forgets to set the pointer to null in void base64_cleanup() { free(decoding_table); }. Didn't know there is an improvement to make it "URL-safe" but you can modify my code accordingly.Stuck
Actually I'm now a bit concerned, since apparently I took a bad code as an example (despite most votes on stackoverflow).Stuck
Yeah, IDK, maybe there are other factors to consider in @GaspardP's solution, for example. Compilers change over time- the basic algorithm looks solid enough.Cembalo
In base64_decode and base64_encode, the string termination char is missing before returning the data: decoded_data[output_length] = 0; and encoded_data[output_length] = 0;Uncourtly
Since original question is for C, need to change size_t &output to size_t *output_length and prepend * to all occurrences of output_length in the body of each function.Immoralist
joshbodily, you are correct.Stuck
C
4

I wrote one for use with C++, it's very fast, works with streams, free, and open source:

https://tmplusplus.svn.sourceforge.net/svnroot/tmplusplus/trunk/src/

Feel free to use it if it fits your purpose.

Edit: Added code inline by request.

The performance boost is achieved by using a lookup table for encoding and decoding. _UINT8 is an unsigned char on most platforms.

/** Static Base64 character encoding lookup table */
const char CBase64::encodeCharacterTable[65] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/** Static Base64 character decoding lookup table */
const char CBase64::decodeCharacterTable[256] = {
    -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1
    ,-1,62,-1,-1,-1,63,52,53,54,55,56,57,58,59,60,61,-1,-1,-1,-1,-1,-1,-1,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21
    ,22,23,24,25,-1,-1,-1,-1,-1,-1,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,-1,-1,-1,-1,-1,
    -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
    -1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1
    ,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,
    -1,-1,-1}; 

/*!
\brief Encodes binary data to base 64 character data
\param in The data to encode
\param out The encoded data as characters
*/
void CBase64::Encode(std::istream &in, std::ostringstream &out)
{
    char buff1[3];
    char buff2[4];
    _UINT8 i=0, j;
    while(in.readsome(&buff1[i++], 1))
        if (i==3)
        {
            out << encodeCharacterTable[(buff1[0] & 0xfc) >> 2];
            out << encodeCharacterTable[((buff1[0] & 0x03) << 4) + ((buff1[1] & 0xf0) >> 4)];
            out << encodeCharacterTable[((buff1[1] & 0x0f) << 2) + ((buff1[2] & 0xc0) >> 6)];
            out << encodeCharacterTable[buff1[2] & 0x3f];
            i=0;
        }

    if (--i)
    {
        for(j=i;j<3;j++) buff1[j] = '\0';

        buff2[0] = (buff1[0] & 0xfc) >> 2;
        buff2[1] = ((buff1[0] & 0x03) << 4) + ((buff1[1] & 0xf0) >> 4);
        buff2[2] = ((buff1[1] & 0x0f) << 2) + ((buff1[2] & 0xc0) >> 6);
        buff2[3] = buff1[2] & 0x3f;

        for (j=0;j<(i+1);j++) out << encodeCharacterTable[buff2[j]];

        while(i++<3) out << '=';
    }

}

/*!
\brief Decodes base 64 character data to binary data
\param in The character data to decode
\param out The decoded data
*/
void CBase64::Decode(std::istringstream &in, std::ostream &out)
{
    char buff1[4];
    char buff2[4];
    _UINT8 i=0, j;

    while(in.readsome(&buff2[i], 1) && buff2[i] != '=')
    {
        if (++i==4)
        {
            for (i=0;i!=4;i++)
                buff2[i] = decodeCharacterTable[buff2[i]];

            out << (char)((buff2[0] << 2) + ((buff2[1] & 0x30) >> 4));
            out << (char)(((buff2[1] & 0xf) << 4) + ((buff2[2] & 0x3c) >> 2));
            out << (char)(((buff2[2] & 0x3) << 6) + buff2[3]);

            i=0;
        }
    }

    if (i) 
    {
        for (j=i;j<4;j++) buff2[j] = '\0';
        for (j=0;j<4;j++) buff2[j] = decodeCharacterTable[buff2[j]];

        buff1[0] = (buff2[0] << 2) + ((buff2[1] & 0x30) >> 4);
        buff1[1] = ((buff2[1] & 0xf) << 4) + ((buff2[2] & 0x3c) >> 2);
        buff1[2] = ((buff2[2] & 0x3) << 6) + buff2[3];

        for (j=0;j<(i-1); j++) out << (char)buff1[j];
    }
}
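
For readers who want plain C: the -1 entries in a decode table like the one above can also be used to reject malformed input before decoding. The following is only a sketch of that idea, not part of the library; b64_is_valid, dec and init_dec are hypothetical names, and the table is built at first use rather than written out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static signed char dec[256];   /* 6-bit value per byte, or -1 if invalid */
static bool dec_ready = false;

static void init_dec(void)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    memset(dec, -1, sizeof dec);               /* every byte becomes -1 */
    for (int i = 0; i < 64; i++)
        dec[(unsigned char)alphabet[i]] = (signed char)i;
    dec_ready = true;
}

/* True if s[0..len) is padded base64 in the standard alphabet:
 * length divisible by 4, at most two trailing '=', nothing else. */
bool b64_is_valid(const char *s, size_t len)
{
    if (!dec_ready)
        init_dec();
    if (len % 4 != 0)
        return false;

    size_t pad = 0;
    if (len > 0 && s[len - 1] == '=')
        pad = (s[len - 2] == '=') ? 2 : 1;

    for (size_t i = 0; i < len - pad; i++)
        if (dec[(unsigned char)s[i]] < 0)
            return false;
    return true;
}
```

The check costs one table lookup per character, so it keeps the performance benefit of the lookup-table approach.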
Cleary answered 21/1, 2011 at 20:46 Comment(3)
The linked blog no longer seems to exist at that URL.Hackneyed
@Hackneyed It's still available here tmplusplus.svn.sourceforge.net/svnroot/tmplusplus/trunk/srcCleary
@cpburnz I've added an inline example now and a comment on why it's fast, thanks.Cleary
C
4

In case people need a c++ solution, I put this OpenSSL solution together (for both encode and decode). You'll need to link with the "crypto" library (which is OpenSSL). This has been checked for leaks with valgrind (although you could add some additional error checking code to make it a bit better - I know at least the write function should check for return value).

#include <openssl/bio.h>
#include <openssl/evp.h>
#include <stdlib.h>

string base64_encode( const string &str ){

    BIO *base64_filter = BIO_new( BIO_f_base64() );
    BIO_set_flags( base64_filter, BIO_FLAGS_BASE64_NO_NL );

    BIO *bio = BIO_new( BIO_s_mem() );
    BIO_set_flags( bio, BIO_FLAGS_BASE64_NO_NL );

    bio = BIO_push( base64_filter, bio );

    BIO_write( bio, str.c_str(), str.length() );

    BIO_flush( bio );

    char *new_data;

    long bytes_written = BIO_get_mem_data( bio, &new_data );

    string result( new_data, bytes_written );
    BIO_free_all( bio );

    return result;

}



string base64_decode( const string &str ){

    BIO *bio, *base64_filter, *bio_out;
    char inbuf[512];
    int inlen;
    base64_filter = BIO_new( BIO_f_base64() );
    BIO_set_flags( base64_filter, BIO_FLAGS_BASE64_NO_NL );

    bio = BIO_new_mem_buf( (void*)str.c_str(), str.length() );

    bio = BIO_push( base64_filter, bio );

    bio_out = BIO_new( BIO_s_mem() );

    while( (inlen = BIO_read(bio, inbuf, 512)) > 0 ){
        BIO_write( bio_out, inbuf, inlen );
    }

    BIO_flush( bio_out );

    char *new_data;
    long bytes_written = BIO_get_mem_data( bio_out, &new_data );

    string result( new_data, bytes_written );

    BIO_free_all( bio );
    BIO_free_all( bio_out );

    return result;

}
Cockleshell answered 16/1, 2014 at 9:48 Comment(2)
BIO_free_all needs to specify the head, not the tail, of your bio chain (i.e. the base64_filter). Your current implementation has a memory leak.Melanism
@Melanism Which line has the leak? Bio_free_all frees the entire chain.Cockleshell
A
2

I fixed the bug in @ryyst's answer; this is a URL-safe version:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
 
static char encoding_table[] = {'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H',
                                'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P',
                                'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
                                'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f',
                                'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
                                'o', 'p', 'q', 'r', 's', 't', 'u', 'v',
                                'w', 'x', 'y', 'z', '0', '1', '2', '3',
                                '4', '5', '6', '7', '8', '9', '-', '_'};
static char *decoding_table = NULL;
static int mod_table[] = {0, 2, 1};
 
void build_decoding_table() {
 
    decoding_table = malloc(256);
 
    for (int i = 0; i < 64; i++)
        decoding_table[(unsigned char) encoding_table[i]] = i;
}
 
 
void base64_cleanup() {
    free(decoding_table);
    decoding_table = NULL; /* lets base64_decode rebuild the table on the next call */
}
 
char *base64_encode(const char *data,
                    size_t input_length,
                    size_t *output_length) {
 
    *output_length = 4 * ((input_length + 2) / 3);
 
    char *encoded_data = malloc(*output_length + 1); /* +1 for the terminating '\0' */
    if (encoded_data == NULL) return NULL;

    for (int i = 0, j = 0; i < input_length;) {

        uint32_t octet_a = i < input_length ? (unsigned char)data[i++] : 0;
        uint32_t octet_b = i < input_length ? (unsigned char)data[i++] : 0;
        uint32_t octet_c = i < input_length ? (unsigned char)data[i++] : 0;

        uint32_t triple = (octet_a << 0x10) + (octet_b << 0x08) + octet_c;

        encoded_data[j++] = encoding_table[(triple >> 3 * 6) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 2 * 6) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 1 * 6) & 0x3F];
        encoded_data[j++] = encoding_table[(triple >> 0 * 6) & 0x3F];
    }

    for (int i = 0; i < mod_table[input_length % 3]; i++)
        encoded_data[*output_length - 1 - i] = '=';

    /* Keep the padding: base64_decode below rejects input whose length is not a multiple of 4 */
    encoded_data[*output_length] = '\0';

    return encoded_data;
}
 
 
unsigned char *base64_decode(const char *data,
                             size_t input_length,
                             size_t *output_length) {
 
    if (decoding_table == NULL) build_decoding_table();
 
    if (input_length % 4 != 0) return NULL;
 
    *output_length = input_length / 4 * 3;
    if (data[input_length - 1] == '=') (*output_length)--;
    if (data[input_length - 2] == '=') (*output_length)--;
 
    unsigned char *decoded_data = malloc(*output_length + 1); /* +1 so the result can be printed as a string */
    if (decoded_data == NULL) return NULL;

    for (int i = 0, j = 0; i < input_length;) {

        /* cast to unsigned char: a plain char index can be negative for bytes >= 0x80 */
        uint32_t sextet_a = data[i] == '=' ? 0 & i++ : decoding_table[(unsigned char)data[i++]];
        uint32_t sextet_b = data[i] == '=' ? 0 & i++ : decoding_table[(unsigned char)data[i++]];
        uint32_t sextet_c = data[i] == '=' ? 0 & i++ : decoding_table[(unsigned char)data[i++]];
        uint32_t sextet_d = data[i] == '=' ? 0 & i++ : decoding_table[(unsigned char)data[i++]];

        uint32_t triple = (sextet_a << 3 * 6)
        + (sextet_b << 2 * 6)
        + (sextet_c << 1 * 6)
        + (sextet_d << 0 * 6);

        if (j < *output_length) decoded_data[j++] = (triple >> 2 * 8) & 0xFF;
        if (j < *output_length) decoded_data[j++] = (triple >> 1 * 8) & 0xFF;
        if (j < *output_length) decoded_data[j++] = (triple >> 0 * 8) & 0xFF;
    }

    decoded_data[*output_length] = '\0';
    return decoded_data;
}
 
int main(){
    
    const char * data = "Hello World! 您好!世界!";
    size_t input_size = strlen(data);
    printf("Input size: %zu \n",input_size);
    char * encoded_data = base64_encode(data, input_size, &input_size);
    printf("After size: %zu \n",input_size);
    printf("Encoded Data is: %s \n",encoded_data);
    
    size_t decode_size = strlen(encoded_data);
    printf("Output size: %zu \n",decode_size);
    unsigned char * decoded_data = base64_decode(encoded_data, decode_size, &decode_size);
    printf("After size: %zu \n",decode_size);
    printf("Decoded Data is: %s \n",decoded_data);
    return 0;
}
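
If you already have standard base64 from some other encoder, the URL-safe form described in RFC 4648 can be derived from it by swapping two characters and, optionally, dropping the padding. A small sketch, assuming a NUL-terminated, writable string; b64_to_urlsafe is a hypothetical helper name.

```c
#include <stddef.h>

/* Convert a NUL-terminated standard base64 string to the URL-safe
 * alphabet in place ('+' becomes '-', '/' becomes '_') and strip
 * trailing '=' padding. Returns the new length. */
size_t b64_to_urlsafe(char *s)
{
    size_t len = 0;
    for (; s[len] != '\0'; len++) {
        if (s[len] == '+')
            s[len] = '-';
        else if (s[len] == '/')
            s[len] = '_';
    }
    while (len > 0 && s[len - 1] == '=')
        s[--len] = '\0';
    return len;
}
```

Note that a decoder which insists on an input length divisible by 4, like base64_decode above, needs the padding restored before it will accept such a string.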
Ansell answered 16/11, 2020 at 10:40 Comment(0)
P
2

If you want to find a workable C solution, I believe you need this.
https://github.com/littlstar/b64.c

#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include "b64.h"

int
main (void) {
  unsigned char *str = "brian the monkey and bradley the kinkajou are friends";
  char *enc = b64_encode(str, strlen(str));

  printf("%s\n", enc); // YnJpYW4gdGhlIG1vbmtleSBhbmQgYnJhZGxleSB0aGUga2lua2Fqb3UgYXJlIGZyaWVuZHM=

  char *dec = b64_decode(enc, strlen(enc));

  printf("%s\n", dec); // brian the monkey and bradley the kinkajou are friends
  free(enc);
  free(dec);
  return 0;
}
Profitsharing answered 7/12, 2020 at 1:58 Comment(0)
D
1

This is an encoder that is specifically written to avoid the need for an output buffer, by writing directly to a putchar-style function. It is based on the Wikibooks implementation: https://en.wikibooks.org/wiki/Algorithm_Implementation/Miscellaneous/Base64#C

This is not as easy to use as other options above. However, it can be of use in embedded systems, where you want to dump a large file without allocating another large buffer to store the resultant base64 datauri string. (It's a pity that datauri does not let you specify the filename).

void datauriBase64EncodeBufferless(int (*putchar_fcptr)(int), const char* type_strptr, const void* data_buf, const size_t dataLength)
{
  const char base64chars[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  const uint8_t *data = (const uint8_t *)data_buf;
  size_t x = 0;
  uint32_t n = 0;
  int padCount = dataLength % 3;
  uint8_t n0, n1, n2, n3;

  size_t outcount = 0;
  size_t line = 0;

  putchar_fcptr((int)'d');
  putchar_fcptr((int)'a');
  putchar_fcptr((int)'t');
  putchar_fcptr((int)'a');
  putchar_fcptr((int)':');
  outcount += 5;

  while (*type_strptr != '\0')
  {
    putchar_fcptr((int)*type_strptr);
    type_strptr++;
    outcount++;
  }

  putchar_fcptr((int)';');
  putchar_fcptr((int)'b');
  putchar_fcptr((int)'a');
  putchar_fcptr((int)'s');
  putchar_fcptr((int)'e');
  putchar_fcptr((int)'6');
  putchar_fcptr((int)'4');
  putchar_fcptr((int)',');
  outcount += 8;

  /* increment over the length of the string, three characters at a time */
  for (x = 0; x < dataLength; x += 3)
  {
    /* these three 8-bit (ASCII) characters become one 24-bit number */
    n = ((uint32_t)data[x]) << 16; //parenthesis needed, compiler depending on flags can do the shifting before conversion to uint32_t, resulting to 0

    if((x+1) < dataLength)
       n += ((uint32_t)data[x+1]) << 8;//parenthesis needed, compiler depending on flags can do the shifting before conversion to uint32_t, resulting to 0

    if((x+2) < dataLength)
       n += data[x+2];

    /* this 24-bit number gets separated into four 6-bit numbers */
    n0 = (uint8_t)(n >> 18) & 63;
    n1 = (uint8_t)(n >> 12) & 63;
    n2 = (uint8_t)(n >> 6) & 63;
    n3 = (uint8_t)n & 63;

    /*
     * if we have one byte available, then its encoding is spread
     * out over two characters
     */

    putchar_fcptr((int)base64chars[n0]);
    putchar_fcptr((int)base64chars[n1]);
    outcount += 2;

    /*
     * if we have only two bytes available, then their encoding is
     * spread out over three chars
     */
    if((x+1) < dataLength)
    {
      putchar_fcptr((int)base64chars[n2]);
      outcount += 1;
    }

    /*
     * if we have all three bytes available, then their encoding is spread
     * out over four characters
     */
    if((x+2) < dataLength)
    {
      putchar_fcptr((int)base64chars[n3]);
      outcount += 1;
    }

    /* Breaking up the line so it's easier to copy and paste */
    int curr_line = (outcount/80);
    if( curr_line != line )
    {
      line = curr_line;
      putchar_fcptr((int)'\r');
      putchar_fcptr((int)'\n');
    }
  }

  /*
  * create and add padding that is required if we did not have a multiple of 3
  * number of characters available
  */
  if (padCount > 0)
  {
    for (; padCount < 3; padCount++)
    {
      putchar_fcptr((int)'=');
    }
  }

  putchar_fcptr((int)'\r');
  putchar_fcptr((int)'\n');
}

Here is the test

#include <stdio.h>
#include <stdint.h>
#include <string.h>

int main(void)
{
  char str[] = "test";
  datauriBase64EncodeBufferless(putchar, "text/plain;charset=utf-8", str, strlen(str));
  return 0;
}

Expected Output: data:text/plain;charset=utf-8;base64,dGVzdA==
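
Because the output goes through a function pointer, the sink does not have to be putchar. As an illustration only, here is a reduced sketch with a sink that collects characters in a small fixed buffer, which is handy for unit tests. The names buffer_putchar, sink_buf, sink_len and b64_stream are assumptions made for this example; b64_stream is a stripped-down variant without the data: prefix and line breaks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sink: stores characters in a fixed buffer instead of
 * sending them to stdout or a UART. Extra characters are dropped. */
static char   sink_buf[64];
static size_t sink_len = 0;

int buffer_putchar(int c)
{
    if (sink_len + 1 < sizeof sink_buf) {
        sink_buf[sink_len++] = (char)c;
        sink_buf[sink_len] = '\0';
    }
    return c;
}

/* Stream len bytes as padded base64 through put(), one character at
 * a time, without allocating an output buffer. */
void b64_stream(int (*put)(int), const void *buf, size_t len)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    const uint8_t *d = (const uint8_t *)buf;

    for (size_t x = 0; x < len; x += 3) {
        /* pack up to three bytes into one 24-bit number */
        uint32_t n = (uint32_t)d[x] << 16;
        if (x + 1 < len) n |= (uint32_t)d[x + 1] << 8;
        if (x + 2 < len) n |= d[x + 2];

        put(tbl[(n >> 18) & 63]);
        put(tbl[(n >> 12) & 63]);
        put(x + 1 < len ? tbl[(n >> 6) & 63] : '=');
        put(x + 2 < len ? tbl[n & 63] : '=');
    }
}
```

Calling b64_stream(buffer_putchar, "test", 4) leaves dGVzdA== in sink_buf; passing putchar instead prints it directly.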

Deafen answered 1/9, 2019 at 6:48 Comment(0)
C
0

This solution is based on schulwitz's answer (encoding/decoding using OpenSSL), but it is for C++ (the original question was about C, but there are already other C++ answers here) and it adds error checking, so it's safer to use:

#include <openssl/bio.h>
#include <openssl/evp.h>
#include <stdexcept>
#include <string>
#include <vector>

std::string base64_encode(const std::string &input)
{
    BIO *p_bio_b64 = nullptr;
    BIO *p_bio_mem = nullptr;

    try
    {
        // make chain: p_bio_b64 <--> p_bio_mem
        p_bio_b64 = BIO_new(BIO_f_base64());
        if (!p_bio_b64) { throw std::runtime_error("BIO_new failed"); }
        BIO_set_flags(p_bio_b64, BIO_FLAGS_BASE64_NO_NL); //No newlines every 64 characters or less

        p_bio_mem = BIO_new(BIO_s_mem());
        if (!p_bio_mem) { throw std::runtime_error("BIO_new failed"); }
        BIO_push(p_bio_b64, p_bio_mem);

        // write input to chain
        // write sequence: input -->> p_bio_b64 -->> p_bio_mem
        if (BIO_write(p_bio_b64, input.c_str(), input.size()) <= 0)
            { throw std::runtime_error("BIO_write failed"); }

        if (BIO_flush(p_bio_b64) <= 0)
            { throw std::runtime_error("BIO_flush failed"); }

        // get result
        char *p_encoded_data = nullptr;
        auto  encoded_len    = BIO_get_mem_data(p_bio_mem, &p_encoded_data);
        if (!p_encoded_data) { throw std::runtime_error("BIO_get_mem_data failed"); }

        std::string result(p_encoded_data, encoded_len);

        // clean
        BIO_free_all(p_bio_b64);

        return result;
    }
    catch (...)
    {
        if (p_bio_b64) { BIO_free_all(p_bio_b64); }
        throw;
    }
}

std::string base64_decode(const std::string &input)
{
    BIO *p_bio_mem = nullptr;
    BIO *p_bio_b64 = nullptr;

    try
    {
        // make chain: p_bio_b64 <--> p_bio_mem
        p_bio_b64 = BIO_new(BIO_f_base64());
        if (!p_bio_b64) { throw std::runtime_error("BIO_new failed"); }
        BIO_set_flags(p_bio_b64, BIO_FLAGS_BASE64_NO_NL); //Don't require trailing newlines

        p_bio_mem = BIO_new_mem_buf((void*)input.c_str(), input.length());
        if (!p_bio_mem) { throw std::runtime_error("BIO_new failed"); }
        BIO_push(p_bio_b64, p_bio_mem);

        // read result from chain
        // read sequence (reverse to write): buf <<-- p_bio_b64 <<-- p_bio_mem
        std::vector<char> buf((input.size()*3/4)+1);
        std::string result;
        for (;;)
        {
            auto nread = BIO_read(p_bio_b64, buf.data(), buf.size());
            if (nread  < 0) { throw std::runtime_error("BIO_read failed"); }
            if (nread == 0) { break; } // eof

            result.append(buf.data(), nread);
        }

        // clean
        BIO_free_all(p_bio_b64);

        return result;
    }
    catch (...)
    {
        if (p_bio_b64) { BIO_free_all(p_bio_b64); }
        throw;
    }
}

Note that base64_decode returns an empty string if the input is not a valid base64 sequence (this is how OpenSSL behaves).

Clava answered 10/12, 2015 at 11:51 Comment(1)
hm... using openssl library for decoding/encoding base64 takes more lines of code than direct implementation (best answer in this question)...Clava
S
0

Base64 encoding and decoding in C using OpenSSL is actually quite easy if you use the EVP code.

Here is the OpenSSL 3.1 documentation for the required encode and decode functions: https://www.openssl.org/docs/man3.1/man3/EVP_EncodeUpdate.html

The EVP interface may look complex; but once you understand it you can do encoding, encryption, etc, very easily without the use of examples. Don't worry though, I'll provide you with an example that works for me.

The following was compiled using g++ and std::string. Use the -lcrypto flag when you compile.

Feel free to clean up the code to meet whatever standards that you may require.

#include <openssl/evp.h>
#include <string>
#include <stdio.h>

using std::string; // the functions below use the unqualified name

size_t resultLen = 0;
bool failed = false;

string encode(string in, size_t inSize){
    // Error handling stuff
    string ret = "";
    failed = false;

    //Required variables, and proper size estimates
    unsigned char *inbuf = new unsigned char[inSize];
    size_t outBufCalculated = ((inSize/48) * 66) + 66;
    unsigned char *outbuf = new unsigned char[outBufCalculated];
    int outCount = 0;

    // OpenSSL uses unsigned char buffers
    for(int i=0; i<inSize; i++)
        inbuf[i] = in[i];

    EVP_ENCODE_CTX *encodeCtx = EVP_ENCODE_CTX_new();
    if(encodeCtx == NULL){
        failed = true;
        delete[] inbuf;
        delete[] outbuf;
        return "";
    }
    // Initialize only after the NULL check
    EVP_EncodeInit(encodeCtx);

    // Encode the "complete" blocks
    if(EVP_EncodeUpdate(encodeCtx, 
                        outbuf, &outCount, 
                        (const unsigned char *)inbuf, inSize) != 1)
    {
        failed = true;
        delete[] inbuf;
        delete[] outbuf;
        EVP_ENCODE_CTX_free(encodeCtx);
        return "";
    }

    // You need to track the number of encoded bytes.
    resultLen = outCount;

    // Handle the encoding of incomplete blocks
    EVP_EncodeFinal(encodeCtx, outbuf+resultLen, &outCount);

    // Don't forget to track the number of remaining bytes encoded :)
    resultLen += outCount;

    // Get the results into that sweet, sweet string
    for(int i=0; i<resultLen; i++)
        ret += outbuf[i];

    // Barny clean-up song
    delete[] inbuf;
    delete[] outbuf;
    EVP_ENCODE_CTX_free(encodeCtx);

    return ret;
}

string decode(string in, size_t inSize){
    // Error handling stuff
    string ret = "";
    failed = false;

    unsigned char *inbuf = new unsigned char[inSize];
    size_t outBufCalculated = ((inSize/48) * 66) + 66;
    unsigned char *outbuf = new unsigned char[outBufCalculated];
    int outCount = 0;

    for(int i=0; i<inSize; i++)
        inbuf[i] = in[i];

    // There's only one CTX function for both encode and decode.
    EVP_ENCODE_CTX *encodeCtx = EVP_ENCODE_CTX_new();
    if(encodeCtx == NULL){
        failed = true;
        delete[] inbuf;
        delete[] outbuf;
        return "";
    }
    // Initialize only after the NULL check
    EVP_DecodeInit(encodeCtx);

    // Decode returns -1 on error, encode returns 1 on success.
    if(EVP_DecodeUpdate(encodeCtx, 
                        outbuf, &outCount, 
                        (const unsigned char *)inbuf, inSize) == -1)
    {
        failed = true;
        delete[] inbuf;
        delete[] outbuf;
        EVP_ENCODE_CTX_free(encodeCtx);
        return "";
    }
    resultLen = outCount;

    EVP_DecodeFinal(encodeCtx, outbuf+resultLen, &outCount);

    resultLen += outCount;

    for(int i=0; i<resultLen; i++)
        ret += outbuf[i];

    delete[] inbuf;
    delete[] outbuf;
    EVP_ENCODE_CTX_free(encodeCtx);

    return ret;
}

int main(void){
    string Message = "Z";
    for(int i=0; i<100; i++){
        Message = "a" + Message;
    }

    Message = encode(Message, Message.length());

    // OpenSSL null terminates base64.
    printf("Encoded Message : \n%s\n", Message.c_str());

    Message = decode(Message, Message.length());
    printf("Decoded Message : \n%s\n", Message.c_str());
    return 0;
}

Compilation and output, hope this helps! :

g++ foo.cc -lcrypto
./a.out
Encoded Message : 
YWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFh
YWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFh
YWFhYVo=

Decoded Message : 
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaZ
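
A note on the buffer estimate ((inSize/48) * 66) + 66 used above: as I read the OpenSSL documentation, EVP_EncodeUpdate emits 64 characters plus a newline for every 48 input bytes, and EVP_EncodeFinal emits the padded remainder plus a newline and a NUL. Under that assumption the exact size can be computed as in the sketch below; evp_encoded_len and evp_encode_bound are hypothetical helper names, not OpenSSL functions.

```c
#include <stddef.h>

/* Bytes written by EVP_EncodeUpdate + EVP_EncodeFinal for n input
 * bytes, excluding the terminating NUL. Assumes 48 input bytes per
 * output line, each line ending in a single '\n'. */
size_t evp_encoded_len(size_t n)
{
    size_t full = n / 48;                   /* complete lines: 64 chars + '\n' */
    size_t rem  = n % 48;                   /* bytes in the final short line   */
    size_t len  = full * 65;
    if (rem > 0)
        len += 4 * ((rem + 2) / 3) + 1;     /* padded groups + '\n' */
    return len;
}

/* The conservative allocation used in the answer above. */
size_t evp_encode_bound(size_t n)
{
    return (n / 48) * 66 + 66;
}
```

For the 101-byte message in the example this gives 139 bytes (two 65-byte lines plus YWFhYVo= and a newline), which matches the output shown, while the bound allocates 198.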
Skiing answered 11/6, 2023 at 2:21 Comment(0)
D
-1

Based on GaspardP's answer, here is a simplified version of Jouni Malinen's encoder in C that I made for a project I am contributing to:

/* Character list for url-safe base64 encoding */
//char b64_cl[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/* Character list for url-unsafe base64 encoding */
char b64_cl[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/**
 * @brief Encodes s_in into base64 and writes it to s_out.
 *    @param s_in  Pointer to input buffer.
 *    @param s_out Pointer to output buffer.
 * @return Pointer to end of output buffer
 * @usage b64e("ABC",buf);
 */
char *b64e(char *s_in, char *s_out){
   int i=0;
   if (s_in[i]==0) return s_out;
   if (s_in[i+1]==0 || s_in[i+2]==0) {
      *s_out++= b64_cl[ s_in[i] >> 2 ];
      if (s_in[i+1]==0) {
         *s_out++ = b64_cl[ ( ( s_in[i]   & 0b000011 ) << 4 ) ];
      } else
      if (s_in[i+2]==0) {
         *s_out++ = b64_cl[ ( ( s_in[i]   & 0b000011 ) << 4 ) + ( ( s_in[i+1] >> 4 ) & 0b001111 ) ];
         *s_out++ = b64_cl[ ( ( s_in[i+1] & 0b001111 ) << 2 ) ];
      }
      return s_out;
   }
   *s_out++ = b64_cl[    s_in[i] >> 2 ];
   *s_out++ = b64_cl[ ( (s_in[i]   & 0b000011 ) << 4 ) + ( (s_in[i+1] >> 4) & 0b001111 ) ];
   *s_out++ = b64_cl[ ( (s_in[i+1] & 0b001111 ) << 2 ) + ( (s_in[i+2] >> 6) & 0b000011 ) ];
   *s_out++ = b64_cl[ (  s_in[i+2] & 0b111111 ) ];
   return b64e( s_in+3, s_out );
}
Doubleganger answered 26/1, 2022 at 1:30 Comment(0)
B
-2

Here is an optimized version of the encoder from the accepted answer that also supports line-breaking for MIME and other protocols (a similar optimization can be applied to the decoder):

#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

// encoding_table is the 64-character table from the accepted answer
char *base64_encode(const unsigned char *data,
                    size_t input_length,
                    size_t *output_length,
                    bool addLineBreaks)
{
    *output_length = 4 * ((input_length + 2) / 3);
    if (addLineBreaks) *output_length += *output_length / 38; //  CRLF after each 76 chars

    char *encoded_data = malloc(*output_length);
    if (encoded_data == NULL) return NULL;

    uint32_t octet_a;
    uint32_t octet_b;
    uint32_t octet_c;
    uint32_t triple;
    size_t offset = 0;     // read position in data
    size_t mBufferPos = 0; // write position in encoded_data
    int lineCount = 0;
    size_t sizeMod = input_length - (input_length % 3); // check if there is a partial triplet
    // adding all octet triplets, before partial last triplet
    for (; offset < sizeMod; )
    {
        octet_a = data[offset++];
        octet_b = data[offset++];
        octet_c = data[offset++];

        triple = (octet_a << 0x10) + (octet_b << 0x08) + octet_c;

        encoded_data[mBufferPos++] = encoding_table[(triple >> 3 * 6) & 0x3F];
        encoded_data[mBufferPos++] = encoding_table[(triple >> 2 * 6) & 0x3F];
        encoded_data[mBufferPos++] = encoding_table[(triple >> 1 * 6) & 0x3F];
        encoded_data[mBufferPos++] = encoding_table[(triple >> 0 * 6) & 0x3F];
        if (addLineBreaks)
        {
            if (++lineCount == 19)
            {
                encoded_data[mBufferPos++] = 13;
                encoded_data[mBufferPos++] = 10;
                lineCount = 0;
            }
        }
    }

    // last bytes
    if (sizeMod < input_length)
    {
        octet_a = data[offset++]; // first octet always added
        octet_b = offset < input_length ? data[offset++] : 0; // conditional 2nd octet
        octet_c = 0; // last character is definitely padded

        triple = (octet_a << 0x10) + (octet_b << 0x08) + octet_c;

        encoded_data[mBufferPos++] = encoding_table[(triple >> 3 * 6) & 0x3F];
        encoded_data[mBufferPos++] = encoding_table[(triple >> 2 * 6) & 0x3F];
        encoded_data[mBufferPos++] = encoding_table[(triple >> 1 * 6) & 0x3F];
        encoded_data[mBufferPos++] = encoding_table[(triple >> 0 * 6) & 0x3F];

        // add padding '='
        // last character is definitely padded
        encoded_data[mBufferPos - 1] = '=';
        if (input_length % 3 == 1) encoded_data[mBufferPos - 2] = '=';
    }

    *output_length = mBufferPos; // the estimate above can overshoot slightly when line breaks are on
    return encoded_data;
}
Bilow answered 29/7, 2013 at 11:42 Comment(1)
your snippet cannot even compile: no size variable and I hope that it is not a global variable in your code.Strow
