Encode a FileStream to Base64 with C#
D

8

63

I know how to encode / decode a simple string to / from base64.

But how would I do that if the data has already been written to a FileStream object? Let's say I only have access to the FileStream object, not to the original data previously stored in it. How would I encode a FileStream to base64 before I flush the FileStream to a file?

Of course I could just open my file and encode / decode it after I have written the FileStream to the file, but I would like to do this in one single step, without two file operations one after another. The file could be large, and it would take twice as long to load, encode and save it again right after it was saved.

Maybe someone knows a better solution? Could I, for example, convert the FileStream to a string, encode the string and then convert the string back to a FileStream? What would such code look like?

Datcha answered 2/10, 2013 at 9:41 Comment(5)
I'm not sure I totally understand your question, but it's possible to use built-in classes to provide a stream which will transform binary data to or from base 64 data. You could then interpose such a stream between your writes and a file output stream (such as is commonly done with compressing streams and encrypting streams). An example is here: netpl.blogspot.co.uk/2011/05/builtin-base64-streaming.html - Derision
Possible duplicate of How to convert an Stream into a byte[] in C#? - Telpherage
Possible duplicate of Is there a Base64Stream for .NET? where? - Desirae
Isn't this the answer? - Pavlish
Do not forget: stream.Seek(0, SeekOrigin.Begin); at the beginning of method... ;-) - Reifel
P
10

You can also encode bytes to Base64. To get the bytes from a stream, see: How to convert an Stream into a byte[] in C#?

Note that calling .ToString() on the stream and encoding the result will not work: Stream.ToString() returns the type name, not the content.

Paraphrast answered 2/10, 2013 at 9:51 Comment(1)
Since the input is a stream, a much more useful answer is the one below that transforms the stream to another (B64 encoded) stream. - Ibbison
K
63

A simple approach, written as an extension method:

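using System;
using System.IO;
using System.Text;

public static class Extensions
{
    public static Stream ConvertToBase64(this Stream stream)
    {
        byte[] bytes;
        using (var memoryStream = new MemoryStream())
        {
            // Copies from the stream's current position to its end, buffering everything in memory.
            stream.CopyTo(memoryStream);
            bytes = memoryStream.ToArray();
        }

        // Base64 text is plain ASCII, so the UTF-8 encoding is one byte per character.
        string base64 = Convert.ToBase64String(bytes);
        return new MemoryStream(Encoding.UTF8.GetBytes(base64));
    }
}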
Ketcham answered 14/9, 2017 at 16:16 Comment(6)
This results in the stream being buffered entirely into memory (with multiple copies too, as you don't set an initial capacity). This is not a practical solution for files larger than a few megabytes - and will certainly break for files larger than 2GB (as MemoryStream uses a single Byte[] internally). People also report MemoryStream breaking for sizes over 256MB: #15595561 - Shouse
@Shouse I think that if you are trying to base64 encode a stream that is large, maybe there is a better alternative than base64 encoding? I'm doing it to supply a file within a JSON web request; if the file is large (MBs+), then it doesn't make sense for me to do that. - Ketcham
This solution wastes a lot of memory (several copies of the data) and CPU time. It can be done far more efficiently! - Wonderstricken
@VasilPopov Please post your optimal solution. - Janessajanet
@PhillipCopley I did, a few posts below. - Wonderstricken
The value of using a stream is not having to load all the data into memory. There is no point in using a MemoryStream. - Biodynamics
S
56

When dealing with large streams, like a file over 4GB, you don't want to load the file into memory (as a Byte[]): not only is it very slow, it may also crash, because even in 64-bit processes a Byte[] cannot hold more than about 2GB (gcAllowVeryLargeObjects raises the total size limit for arrays, but a byte array is still capped at roughly 2^31 elements).

Fortunately there's a neat helper in .NET called ToBase64Transform which processes a stream in chunks. For some reason Microsoft put it in System.Security.Cryptography and it implements ICryptoTransform (for use with CryptoStream), but don't let that put you off ("a rose by any other name..."): it works fine even though you aren't performing any cryptographic tasks.

You use it with CryptoStream like so:

using System.Security.Cryptography;
using System.IO;

//

using( FileStream   inputFile    = new FileStream( @"C:\VeryLargeFile.bin", FileMode.Open, FileAccess.Read, FileShare.None, bufferSize: 1024 * 1024, useAsync: true ) ) // When using `useAsync: true` you get better performance with buffers much larger than the default 4096 bytes.
using( CryptoStream base64Stream = new CryptoStream( inputFile, new ToBase64Transform(), CryptoStreamMode.Read ) )
using( FileStream   outputFile   = new FileStream( @"C:\VeryLargeBase64File.txt", FileMode.CreateNew, FileAccess.Write, FileShare.None, bufferSize: 1024 * 1024, useAsync: true ) )
{
    await base64Stream.CopyToAsync( outputFile ).ConfigureAwait(false);
}
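The same transform can also sit on the write side, which is closer to what the question asks for: the code writes raw bytes and the file receives Base64. The following is a minimal sketch, not a definitive implementation; the class and method names (Base64FileWriter, OpenBase64Writer) are hypothetical and chosen only for illustration.

```csharp
using System.IO;
using System.Security.Cryptography;

public static class Base64FileWriter
{
    // Hypothetical helper: every byte written to the returned stream
    // ends up in the file at 'path' as Base64 text.
    public static Stream OpenBase64Writer(string path)
    {
        var file = new FileStream(path, FileMode.Create, FileAccess.Write, FileShare.None);

        // In Write mode the CryptoStream encodes whatever is written to it and
        // forwards the result to 'file'. Disposing it flushes the final (padded)
        // block and closes the underlying file stream.
        return new CryptoStream(file, new ToBase64Transform(), CryptoStreamMode.Write);
    }
}
```

Write the raw bytes to the returned stream and dispose it when done; the trailing padding is only emitted on dispose (or on an explicit FlushFinalBlock call).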
Shouse answered 6/9, 2019 at 10:43 Comment(5)
Note that the leaveOpen parameter is not available in netstandard2.0, but is accepted in .NET Framework 4.7.2 - Appendicle
This is almost perfect, except the resultant stream doesn't support Seek, which I need. I swear every day I work with C# System lib I find something I need to re-implement properly. - Crib
@YarekT If you're using streams then you should never need to seek (the only exception being FileStream). If you find yourself needing to seek in a non-disk stream then your system is likely designed wrong. - Shouse
@Shouse Interesting. I'm using FluentFtp's UploadAsync method which takes a stream. I assumed it wouldn't seek, but it does. I'm guessing they designed it for regular file streams. - Crib
@YarekT According to this issue, it's because FluentFTP needs to know the length of a stream beforehand (which is reasonable), however it does that by requiring streams be seekable (which is wrong - but a consequence of the design of System.IO.Stream which doesn't let it expose a Length unless CanSeek == true, grrr): github.com/robinrodricks/FluentFTP/issues/668 - Shouse
W
16

A simple Stream extension method would do the job:

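using System;
using System.IO;

public static class StreamExtensions
{
    public static string ConvertToBase64(this Stream stream)
    {
        if (stream is MemoryStream memoryStream)
        {
            return Convert.ToBase64String(memoryStream.ToArray());
        }

        // Requires a seekable stream (Length and Seek) that fits in a single array.
        var bytes = new Byte[(int)stream.Length];

        stream.Seek(0, SeekOrigin.Begin);

        // Read may return fewer bytes than requested, so keep reading until the array is full.
        int total = 0, read;
        while (total < bytes.Length && (read = stream.Read(bytes, total, bytes.Length - total)) > 0)
        {
            total += read;
        }

        return Convert.ToBase64String(bytes, 0, total);
    }
}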

The Read (and also Write) methods are optimized for the respective class (file stream, memory stream, etc.) and will do the work for you. For a simple task like this, there is no need for readers and the like.

The only drawback is that the stream is copied into byte array, but that is how the conversion to base64 via Convert.ToBase64String works unfortunately.
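For streams that do not support Length or Seek, the buffer-by-buffer variant can be sketched as follows. This is only an illustration under stated assumptions: the names ChunkedBase64 and ReadToBase64 are hypothetical, and the result is still one in-memory string, so it only avoids the intermediate byte array. The key detail is that the chunk size is a multiple of 3, so only the last chunk can carry padding and the pieces concatenate into valid Base64.

```csharp
using System;
using System.IO;
using System.Text;

public static class ChunkedBase64
{
    // Hypothetical helper: encodes from the current position to the end
    // of the stream without using Length or Seek.
    public static string ReadToBase64(Stream stream)
    {
        // A multiple of 3, so every chunk except the last encodes without padding.
        var buffer = new byte[3 * 4096];
        var result = new StringBuilder();

        while (true)
        {
            // Fill the buffer completely unless the stream ends first,
            // because Read may return fewer bytes than requested.
            int filled = 0, read;
            while (filled < buffer.Length &&
                   (read = stream.Read(buffer, filled, buffer.Length - filled)) > 0)
            {
                filled += read;
            }

            if (filled == 0) break;

            result.Append(Convert.ToBase64String(buffer, 0, filled));

            // A partially filled buffer means the end of the stream was reached.
            if (filled < buffer.Length) break;
        }

        return result.ToString();
    }
}
```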

Wonderstricken answered 7/2, 2020 at 8:51 Comment(5)
This is not a generic solution, as many stream types will not support Length or Seek() - Meiosis
Rhys Bevilaqua, usually you would need to either seek to the beginning of the stream to read it all, or to "know" that you are at the beginning (which is against SOLID principles). Only pure streaming sources do not implement these 2 members - almost all the rest (memory, file, etc.) have them. You can always have a second implementation where you read buffer by buffer until the "end" of the stream, but this is not as efficient or straightforward. - Wonderstricken
I'm sick of buffering everything to byte[] in .NET. It's tremendously wasteful. It's high time for stream APIs end-to-end. - Spanos
Updated a bit to support the ToArray() method of the MemoryStream class. - Wonderstricken
If this is called a lot, ArrayPool<T> is helpful instead of creating lots of byte[] arrays: using System.Buffers; using System.Linq; var bytes = ArrayPool<byte>.Shared.Rent((int)stream.Length); ... var str = Convert.ToBase64String(bytes.Take((int)stream.Length).ToArray()); ArrayPool<byte>.Shared.Return(bytes); return str; - Spier
T
13

You could try something like this:

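    public Stream ConvertToBase64(Stream stream)
    {
        // Assumes a seekable stream positioned at its start and well under 2GB.
        Byte[] inArray = new Byte[(int)stream.Length];
        // Exact size: 4 chars per 3 input bytes, rounded up to a whole group.
        // A rough factor such as 1.34 leaves unused trailing '\0' chars in the output.
        Char[] outArray = new Char[((inArray.Length + 2) / 3) * 4];
        // Read may return fewer bytes than requested, so loop until the array is full.
        int total = 0, read;
        while (total < inArray.Length && (read = stream.Read(inArray, total, inArray.Length - total)) > 0)
            total += read;
        int charCount = Convert.ToBase64CharArray(inArray, 0, total, outArray, 0);
        return new MemoryStream(Encoding.UTF8.GetBytes(outArray, 0, charCount));
    }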
Torres answered 2/10, 2013 at 11:38 Comment(5)
Where does 1.34 come from? - Ratite
A byte holds 8 bits. Base64 doesn't use bytes but chars. Not any chars, but specific chars that each represent 6 bits. So the in-array is smaller than the out-array by a factor of 6/8. 8 divided by 6 is 1.33333, so if you take 1.34 the out array will always be just big enough. - Vitascope
You need to get the new size from Convert.ToBase64CharArray and then do Array.Resize<Char>(ref base64Chars, newSize);. Otherwise, you have extra bytes in the final output. - Internship
The 1.34 is wrong! The extra 0.0333 gives you some space which is too small for small lengths and unnecessarily big for big arrays. Instead of flooring (cast to int) you should do a ceiling (int)Math.Ceiling(stream.Length * 8.0 / 6.0) so you get the exact length. - Wirehaired
Note that if stream happens to be a MemoryStream, you can just use stream.ToArray() and avoid the 8/6 calculations. - Cai
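The sizing debated in these comments can be pinned down exactly. Base64 output (without line breaks) always consists of whole groups of 4 characters, one group per 3 input bytes, with the last group padded by '='. So a single input byte already produces 4 characters, which a plain 8/6 ratio underestimates. A minimal sketch, with hypothetical names (Base64Size, EncodedLength):

```csharp
public static class Base64Size
{
    // Number of Base64 characters (padding included, no line breaks)
    // produced for 'byteCount' input bytes: 4 * ceil(byteCount / 3).
    public static long EncodedLength(long byteCount)
    {
        return ((byteCount + 2) / 3) * 4;
    }
}
```

Using long avoids integer overflow for inputs close to the 2GB array limit.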
P
7

The answers recommending the use of ToBase64Transform are valid, but there is a big catch. Not sure if this should be an answer, but had I known this it would have saved me a lot of time.

The problem I ran into with ToBase64Transform is that it is hard-coded to read 3 bytes at a time. If the stream passed to the CryptoStream constructor is something like a websocket, or anything else where each write has non-trivial overhead or latency, this can be a huge problem.

Bottom line - if you are doing something like this:

using var cryptoStream = new CryptoStream(httpRequestBodyStream, new ToBase64Transform(), CryptoStreamMode.Write);

It may be worthwhile to fork the class ToBase64Transform and change the hard-coded 3/4 byte values to something substantially larger so that it incurs fewer writes. In my case, with the default 3/4 values, the transmission rate was about 100 KB/s. Changing to 768/1024 (same ratio) worked, and the transmission rate rose to about 50-100 MB/s because of far fewer writes.

    public class BiggerBlockSizeToBase64Transform : ICryptoTransform
    {
        // Base64 turns every 3 input bytes into 4 output bytes; 768/1024 keeps that ratio
        public int InputBlockSize => 768;
        public int OutputBlockSize => 1024;
        public bool CanTransformMultipleBlocks => false;
        public virtual bool CanReuseTransform => true;

        public int TransformBlock(byte[] inputBuffer, int inputOffset, int inputCount, byte[] outputBuffer, int outputOffset)
        {
            ValidateTransformBlock(inputBuffer, inputOffset, inputCount);

            // Convert one full block: 768 input bytes become 1024 Base64 bytes
            byte[] tempBytes = ConvertToBase64(inputBuffer, inputOffset, 768);

            Buffer.BlockCopy(tempBytes, 0, outputBuffer, outputOffset, tempBytes.Length);
            return tempBytes.Length;
        }

        public byte[] TransformFinalBlock(byte[] inputBuffer, int inputOffset, int inputCount)
        {
            ValidateTransformBlock(inputBuffer, inputOffset, inputCount);

            // Convert.ToBase64CharArray already does padding, so all we have to check is that
            // the inputCount wasn't 0
            if (inputCount == 0)
            {
                return Array.Empty<byte>();
            }

            // Again, for now only a block at a time
            return ConvertToBase64(inputBuffer, inputOffset, inputCount);
        }

        private byte[] ConvertToBase64(byte[] inputBuffer, int inputOffset, int inputCount)
        {
            char[] temp = new char[1024];
            // Use the returned char count: the final block is usually shorter than 1024 chars,
            // and encoding the whole buffer would append NUL bytes to the output.
            int charCount = Convert.ToBase64CharArray(inputBuffer, inputOffset, inputCount, temp, 0);
            return Encoding.ASCII.GetBytes(temp, 0, charCount);
        }

        private static void ValidateTransformBlock(byte[] inputBuffer, int inputOffset, int inputCount)
        {
            if (inputBuffer == null) throw new ArgumentNullException(nameof(inputBuffer));
        }

        // Must implement IDisposable, but in this case there's nothing to do.

        public void Dispose()
        {
            Clear();
        }

        public void Clear()
        {
            Dispose(true);
            GC.SuppressFinalize(this);
        }

        protected virtual void Dispose(bool disposing) { }

        ~BiggerBlockSizeToBase64Transform()
        {
            // A finalizer is not necessary here, however since we shipped a finalizer that called
            // Dispose(false) in desktop v2.0, we need to keep it in case any existing code had subclassed
            // this transform and expects to have a base class finalizer call its dispose method.
            Dispose(false);
        }
    }
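A lighter-weight option that may be worth measuring before forking the transform is to put a BufferedStream between the CryptoStream and the slow target, so that the many small writes are collected and forwarded in larger chunks. This is a sketch under assumptions, not a benchmarked fix: the names BufferedBase64 and WrapForBase64Writes and the 64 KB default are hypothetical, and whether it matches the throughput reported above depends on the runtime and the target stream.

```csharp
using System.IO;
using System.Security.Cryptography;

public static class BufferedBase64
{
    // Hypothetical helper: wraps 'target' so that Base64 output reaches it
    // in chunks of up to 'bufferSize' bytes instead of a few bytes at a time.
    public static Stream WrapForBase64Writes(Stream target, int bufferSize = 64 * 1024)
    {
        // BufferedStream collects small writes and forwards them in bulk.
        var buffered = new BufferedStream(target, bufferSize);

        // Disposing the CryptoStream writes the final padded block, then
        // disposes the BufferedStream, which flushes and closes 'target'.
        return new CryptoStream(buffered, new ToBase64Transform(), CryptoStreamMode.Write);
    }
}
```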
Pensionary answered 5/5, 2022 at 0:55 Comment(0)
S
5

Since the file may be large, you don't have much choice in how to do this. You cannot process the file in place, since that would destroy the information you still need to read. You have two options that I can see:

  1. Read in the entire file, base64 encode, re-write the encoded data.
  2. Read the file in smaller pieces, encoding as you go along. Encode to a temporary file in the same directory. When you are finished, delete the original file, and rename the temporary file.

Of course, the whole point of streams is to avoid this sort of scenario. Instead of creating the content and stuffing it into a file stream, stuff it into a memory stream. Then encode that and only then save to disk.
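Option 2 above could be sketched roughly as follows. This is an illustration under assumptions: the names InPlaceBase64 and EncodeFileInPlace and the ".b64tmp" suffix are hypothetical, the chunked encoding is delegated to the built-in ToBase64Transform, and error handling (for example, cleaning up the temporary file after a failure) is left out.

```csharp
using System.IO;
using System.Security.Cryptography;

public static class InPlaceBase64
{
    // Hypothetical helper: replaces the file at 'path' with its Base64 encoding.
    public static void EncodeFileInPlace(string path)
    {
        // Temporary file next to the original, so the final move stays on one volume.
        string tempPath = path + ".b64tmp";

        using (var input = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (var encoded = new CryptoStream(input, new ToBase64Transform(), CryptoStreamMode.Read))
        using (var output = new FileStream(tempPath, FileMode.Create, FileAccess.Write))
        {
            // CopyTo pulls the data through the transform chunk by chunk,
            // so the whole file is never held in memory.
            encoded.CopyTo(output);
        }

        // Swap the encoded copy in for the original.
        File.Delete(path);
        File.Move(tempPath, path);
    }
}
```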

Stereopticon answered 2/10, 2013 at 9:46 Comment(0)
N
0

Necromancing.
Apparently, nobody wants to write it themselves.
It's not that hard to do this yourself.
See my code below, which writes files in the same format that certutil uses to base64-encode files.
(I use the decode part to transfer executables over clipboard as text - there is a size limit in certutil)

You can just set the parameter insertLineBreaks to false for ConvertToBase64Array in ToCertUtil.
Then remove the lines

sw.Write("-----BEGIN CERTIFICATE-----");
sw.Write(System.Environment.NewLine);

and

  sw.Write(System.Environment.NewLine);
  sw.Write("-----END CERTIFICATE-----");
  sw.Write(System.Environment.NewLine);

in the ToCertUtil procedure and you've successfully copied-together a base64-streamwriter.

namespace TestAuthentication
{


    public class Base64Transformer 
    {
        private const int base64LineBreakPosition = 64;

        internal static readonly char[] base64Table = {'A','B','C','D','E','F','G','H','I','J','K','L','M','N','O',
                                                       'P','Q','R','S','T','U','V','W','X','Y','Z','a','b','c','d',
                                                       'e','f','g','h','i','j','k','l','m','n','o','p','q','r','s',
                                                       't','u','v','w','x','y','z','0','1','2','3','4','5','6','7',
                                                       '8','9','+','/','=' };



        // How to convert MyFile.zip into a base64 text-file with cmd.exe: 
        // certutil -encode MyFile.zip lol.txt
        // How to convert the base64 text in lol.txt back to MyFile.zip
        // certutil -decode lol.txt MyCrap.zip

        // Twist: The -f means "force overwrite".
        // certutil -f -encode raw.txt encoded.txt

        public static void Test()
        {
            string inputFile = @"D:\username\Desktop\Transfer\MyFile.zip";
            string outputFile = @"D:\username\Desktop\Transfer\MyFile.txt";
            ToCertUtil(inputFile, outputFile);
        } // End Sub Test 


        public static void ToCertUtil(string inputFile, string outputFile)
        {
            // const int BUFFER_SIZE = 4032; // 4032%3=0 && 4032%64=0
            // const int BUFFER_SIZE = 4080; // 4080%3=0 && 4080%8=0
            const int BUFFER_SIZE = 4095; // 4095%3=0 && 4095~=4096 (pageSize) 
            byte[] buffer = new byte[BUFFER_SIZE];
            int lastCharCount = 0;

            // using (System.IO.FileStream outputStream = System.IO.File.OpenWrite(outputFile,))
            using (System.IO.FileStream outputStream = new System.IO.FileStream(outputFile, System.IO.FileMode.Create, System.IO.FileAccess.Write, System.IO.FileShare.Read))
            {

                using (System.IO.StreamWriter sw = new System.IO.StreamWriter(outputStream, System.Text.Encoding.ASCII))
                {
                    sw.Write("-----BEGIN CERTIFICATE-----");
                    sw.Write(System.Environment.NewLine);

                    using (System.IO.FileStream inputStream = System.IO.File.OpenRead(inputFile))
                    {
                        using (System.IO.BinaryReader br = new System.IO.BinaryReader(inputStream))
                        {
                            br.BaseStream.Seek(0, System.IO.SeekOrigin.Begin);
                            long totalLength = inputStream.Length;

                            long totalRead = 0;
                            int bytesRead;
                            while ((bytesRead = br.Read(buffer, 0, BUFFER_SIZE)) > 0)
                            {
                                totalRead += bytesRead;

                                bool isFinal = (bytesRead < BUFFER_SIZE || totalRead == totalLength);
                                lastCharCount = ConvertToBase64Array(sw, buffer, 0, bytesRead, true, isFinal, lastCharCount);
                            } // Whend 

                            br.Close();
                        } // End Using br 

                    } // End Using inputStream 

                    sw.Write(System.Environment.NewLine);
                    sw.Write("-----END CERTIFICATE-----");
                    sw.Write(System.Environment.NewLine);
                    sw.Flush();
                    outputStream.Flush();
                } // End Using sw 

            } // End Using outputStream 

        } // End Sub ToCertUtil 


        // public static int ConvertToBase64Array(System.IO.StreamWriter sw, byte[] inData, int offset, int length, int totalLength, bool insertLineBreaks)
        // https://www.codeproject.com/Articles/5483/Base-Encoder-Decoder-in-C
        // https://github.com/mono/mono/blob/master/mcs/class/referencesource/mscorlib/system/convert.cs
        public static int ConvertToBase64Array(System.IO.StreamWriter sw, byte[] inData, int offset, int length, bool insertLineBreaks, bool isFinal, int lastCharCount)
        {
            int lengthmod3 = length % 3;
            int calcLength = offset + (length - lengthmod3);
            int j = 0;
            int charCount = lastCharCount; // if one block is done, we need to reset the charCount to the previous position 
            int i;

            // Convert three bytes at a time to base64 notation.
            // This will consume 4 chars.
            // get a pointer to the base64Table to avoid unnecessary range checking
            {
                for (i = offset; i < calcLength; i += 3)
                {
                    if (insertLineBreaks)
                    {
                        if (charCount == base64LineBreakPosition)
                        {
                            sw.Write("\r\n");
#if DEBUGFLUSH
                            sw.Flush();
#endif

                            j += 2;
                            charCount = 0;
                        } // End if (charCount == base64LineBreakPosition) 

                        charCount += 4; // We only need the charCount if we do line-breaks  
                    } // End if (insertLineBreaks) 

                    sw.Write(base64Table[(inData[i] & 0xfc) >> 2]);
                    sw.Write(base64Table[((inData[i] & 0x03) << 4) | ((inData[i + 1] & 0xf0) >> 4)]);
                    sw.Write(base64Table[((inData[i + 1] & 0x0f) << 2) | ((inData[i + 2] & 0xc0) >> 6)]);
                    sw.Write(base64Table[(inData[i + 2] & 0x3f)]);

                    j += 4;
                } // Next i 

                //Where we left off before
                i = calcLength;

                if (insertLineBreaks && (lengthmod3 != 0 || !isFinal) && (charCount == base64LineBreakPosition))
                {
                    sw.Write("\r\n");
#if DEBUGFLUSH
                    sw.Flush();
#endif
                    j += 2;
                } // End if (insertLineBreaks && (lengthmod3 != 0 || !isFinal) && (charCount == base64LineBreakPosition)) 

                switch (lengthmod3)
                {
                    // One character padding needed
                    case 2: 
                        // char char1 = base64Table[(inData[i] & 0xfc) >> 2];
                        // char char2 = base64Table[((inData[i] & 0x03) << 4) | ((inData[i + 1] & 0xf0) >> 4)];
                        // char char3 = base64Table[(inData[i + 1] & 0x0f) << 2];
                        // char char4 = base64Table[64]; //Pad

                        // string e = new string(new char[] { char1, char2, char3, char4 });
                        // System.Console.WriteLine(e);


                        sw.Write(base64Table[(inData[i] & 0xfc) >> 2]);
                        sw.Write(base64Table[((inData[i] & 0x03) << 4) | ((inData[i + 1] & 0xf0) >> 4)]);
                        sw.Write(base64Table[(inData[i + 1] & 0x0f) << 2]);
                        sw.Write(base64Table[64]); //Pad
                        j += 4;
                        break;

                    // Two character padding needed
                    case 1: 
                        // char char11 = base64Table[(inData[i] & 0xfc) >> 2];
                        // char char12 = base64Table[(inData[i] & 0x03) << 4];
                        // char char13 = base64Table[64]; //Pad
                        // char char14 = base64Table[64]; //Pad

                        // string ee = new string(new char[] { char11, char12, char13, char14 });
                        // System.Console.WriteLine(ee);

                        sw.Write(base64Table[(inData[i] & 0xfc) >> 2]);
                        sw.Write(base64Table[(inData[i] & 0x03) << 4]);
                        sw.Write(base64Table[64]); //Pad
                        sw.Write(base64Table[64]); //Pad
                        j += 4;
                        break;
                } // End Switch 

            } // End empty scope block 

            // return j;
            return charCount;
        } // End Function ConvertToBase64Array 


    } // End Class Base64Transformer 


} // End Namespace 
Nichy answered 16/5, 2024 at 11:42 Comment(0)
