How many bytes does a single character in a Base64 string hold: 1, 2, or more?

In my Android application, the requirement is to fetch images from the server and cache them in heap memory.

On receiving a request, the server first encodes the byte[] as a Base64 string and returns that string. When rendering into an ImageView, the Android application decodes the Base64 string back to a byte[], creates a Bitmap, and sets it on the ImageView.

Because everything is cached, the application may run out of memory at some point and crash.

To prevent running out of memory, I have defined a safety quantum (e.g. 5 MB) in my application. If the available memory ever falls below this safety quantum, the user must mark some of the images as candidates for deletion. The application also shows an estimate of the memory that will be released once the selected items are cleared.

The Bitmap is recycled once the user moves away from the image, so the Bitmap effectively holds no memory while the image is not displayed.

In one particular test, I downloaded 55 images and my heap grew from 16 MB to 42 MB. That means the 55 images occupy 26 MB. After I clear all of them, the heap shrinks back to 16 MB.

But when I sum the lengths of all the Base64 strings, the total comes to 11983840. If I count one character as 1 byte, 11983840 bytes is 11.4 MB.

The problem is that the cumulative length of the Base64 strings is the only measure available to me for telling the user how much memory their selection can release.

I have also read the following question, which says that for every 3 bytes of original data the Base64 string has 4 characters:

Base64 length calculation?

The question is: how many bytes does a single character in a Base64 string hold? 1, 2, or more?

If 1 character is 1 byte, and in my test the heap grows and shrinks by 26 MB, why is the cumulative length of the Base64 strings only 11.4 MB?

Updated

[Screenshot not available]

This means 1 byte per character.

[Screenshot not available]

The default charset here is UTF-8.

Mcnew answered 7/8, 2015 at 10:20 Comment(2)
Refer to this link: #13379315 (Fatma)
I have already read that link, which is why I included it in my question. The problem in that link and this one are quite different. (Mcnew)

Let's start from the beginning...

Base64 is an encoding format based on a set of 64 characters, so each character carries 6 bits of data (2^6 = 64). When converting, each 8-bit byte of input therefore needs (8 / 6) = 1.333333... characters.

Since a fraction of a character cannot be emitted, it's easier to say that for every 3 bytes (3*8 = 24 bits) of input, 4 Base64 characters are produced (4*6 = 24 bits). Input whose length is not a multiple of 3 is padded up to the next group of 4 characters.
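As a rough illustration of the 3-bytes-to-4-characters rule, here is a minimal sketch. It assumes the standard `java.util.Base64` encoder (Java 8+) with padding rather than Android's `android.util.Base64`; the class name `Base64LengthDemo` and the helper `encodedLength` are made up for this example.

```java
import java.util.Base64;

class Base64LengthDemo {
    // Hypothetical helper: length of padded Base64 output for a given number
    // of input bytes, i.e. 4 characters per 3-byte group, rounded up.
    static int encodedLength(int inputBytes) {
        return 4 * ((inputBytes + 2) / 3);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 6; n++) {
            byte[] data = new byte[n]; // content does not matter, only the length
            String encoded = Base64.getEncoder().encodeToString(data);
            System.out.println(n + " bytes -> " + encoded.length()
                    + " chars (formula: " + encodedLength(n) + ")");
        }
    }
}
```

The actual encoded length and the formula should agree for every input size, with 1, 2, and 3 input bytes all producing 4 characters.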

Now, you say the total length of your Base64 strings is about 11 million characters. In Java, strings are stored in UTF-16, where each char is 16 bits: 1 character = 2 bytes.

So 11,983,840 characters use about 24 million bytes (roughly 22.9 MB) for the raw character data alone. That, combined with the objects, variables and other state needed to store the information, means that 26 MB is a reasonable value for the amount of memory you are using.

So a general rule you might use to estimate memory usage is (((input_data_in_bytes * 4/3) * 2) + a few MB)
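Since the asker only has the Base64 string lengths available, the same rule can be applied starting from the character count. The following is a sketch only: the class and method names are invented for illustration, it assumes 2 bytes per character as described above, and it ignores per-object overhead and padding characters.

```java
class Base64MemoryEstimate {
    // Bytes of raw character data for a string held as UTF-16 (2 bytes per char).
    static long utf16Bytes(long base64Chars) {
        return base64Chars * 2L;
    }

    // Approximate size of the decoded binary data: 3 bytes per 4 Base64
    // characters (padding is ignored, so this can be off by a byte or two).
    static long decodedBytes(long base64Chars) {
        return base64Chars * 3L / 4L;
    }

    public static void main(String[] args) {
        long totalChars = 11_983_840L; // cumulative length reported in the question
        double bytesPerMb = 1024.0 * 1024.0;
        System.out.printf("String character data: %.1f MB%n",
                utf16Bytes(totalChars) / bytesPerMb);
        System.out.printf("Decoded image bytes:   %.1f MB%n",
                decodedBytes(totalChars) / bytesPerMb);
    }
}
```

For the selection shown to the user, summing `utf16Bytes(length)` over the selected strings gives a lower bound on what clearing them could release.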

Phosphorescence answered 7/8, 2015 at 11:53 Comment(5)
Thanks for the reply. It's really a nice explanation. But the question still remains open, because in my latest finding I saw that the default charset is UTF-8. Please check the "Updated" section in the question. (Mcnew)
@Wayofhope I think you're confused. The defaultCharset() value represents the charset used when your Java program interacts with your operating system: it concerns encoding data when reading and writing from streams and other I/O. It has nothing to do with how the characters are stored within the Java String type, which is always 16-bit Unicode. As documented in docs.oracle.com/javase/7/docs/api/java/lang/String.html: "A String represents a string in the UTF-16 format..." (Phosphorescence)
Hi adelphus, thanks for the explanation. I might be wrong, but I just want to get to the bottom of it. For the String "Hello", why does "Hello".getBytes().length return 5? (Mcnew)
@Wayofhope You're calling getBytes() which, according to the Java docs, "Encodes this String into a sequence of bytes using the platform's default charset". You've already demonstrated that your platform's charset is UTF-8, so getBytes() will convert from the Java String internal format (UTF-16) and return "Hello" as 5 UTF-8 bytes. (Phosphorescence)
Now this resolves my confusion. It takes a lot of patience and competence to explain in depth, especially over an online exchange medium. Kudos for that. I will seek more knowledge from you in the future :) (Mcnew)
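The distinction discussed in these comments can be seen in a small sketch. The class and helper names are made up for this example; it passes an explicit charset to `getBytes` instead of relying on the platform default, and uses `UTF_16BE` so that no byte-order mark is added to the output.

```java
import java.nio.charset.StandardCharsets;

class StringEncodingDemo {
    // Number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the string occupies once encoded as UTF-16 (big-endian, no BOM).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16BE).length;
    }

    public static void main(String[] args) {
        String s = "Hello";
        System.out.println("chars:        " + s.length());     // 5 characters
        System.out.println("UTF-8 bytes:  " + utf8Length(s));  // 5 bytes
        System.out.println("UTF-16 bytes: " + utf16Length(s)); // 10 bytes
    }
}
```

The byte count returned by `getBytes` describes the encoded output, not the memory the String object itself uses.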
