Where is Blob binary data stored?
L

4

9

Given

var data = new Array(1000000);
for (var i = 0; i < data.length; i++) {
  data[i] = 1;
}
var blob = new Blob([data]);

where is binary data representation of array stored?
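One detail worth checking before asking where the bytes live is what bytes the Blob actually holds. The sketch below reuses the snippet above and assumes a runtime with a global Blob (a current browser, or Node.js 18+). A plain Array passed as a blob part is not treated as binary data: it is converted to the string "1,1,1,..." and encoded as UTF-8. The names expectedTextSize, bytes, and binaryBlob are illustrative only.

```javascript
// Same construction as in the question.
var data = new Array(1000000);
for (var i = 0; i < data.length; i++) {
  data[i] = 1;
}
var blob = new Blob([data]);

// A plain Array is not an ArrayBuffer, typed array, or Blob, so it is
// stringified ("1,1,1,...") and the text is stored as UTF-8.
var expectedTextSize = data.length + (data.length - 1); // digits plus commas

// A typed array, by contrast, is copied byte for byte.
var bytes = new Uint8Array(data.length).fill(1);
var binaryBlob = new Blob([bytes]);

console.log(blob.size === expectedTextSize); // text representation
console.log(binaryBlob.size === bytes.byteLength); // raw bytes
```

So the "binary representation of the array" in the question is really the UTF-8 text of the joined array; passing a typed array is what yields one byte per element.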

Laoag answered 7/7, 2016 at 7:3 Comment(16)
Do you have something in mind beyond "in memory"?Collogue
@duskwuff Yes. Is Blob binary data stored at the object itself, or within browser internals? How to access raw binary data directly? Related #38196355. Attempting to determine where the actual raw binary data is stored in browser; IndexDB at browser configuration folder? Other? Can you provide details of "in memory"?Laoag
@duskwuff Not presently versed in language written in, C++?. Does source at chromium indicate Blob is stored in memory cs.chromium.org/chromium/src/third_party/WebKit/Source/core/… , cs.chromium.org/chromium/src/third_party/WebKit/Source/core/… ? In particular reference to #include <memory> at cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/… ?Laoag
@duskwuff Is this what you are referring to static void populateBlobData(BlobData*, const HeapVector<ArrayBufferOrArrayBufferViewOrBlobOrUSVString>& parts, bool normalizeLineEndingsToNative); cs.chromium.org/chromium/src/third_party/WebKit/Source/core/… , cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/… ?Laoag
@guest271314: If you want to know how you could access the data stored in the blob through javascript, then you should ask that. And the answer is No, btw.Netti
@Netti "If you want to know how you could access the data stored in the blob through javascript, then you should ask that." Did ask that "where is binary data representation of array stored?" ? If the answer is "no", can you post an Answer including technical details of why?Laoag
@guest271314: It could be in some structure in RAM, or on a harddisk, I don't know and it doesn't matter - it's no structure accessible through JavaScript directly.Netti
@Netti Really? How can you at one comment state "And the answer is No, btw" , then at next comment "It could be in some structure in RAM, or on a harddisk, I don't know and it doesn't matter" ? Of course it matters; everything "matters". If you do not know for certain, how can you be certain that there are not approaches which could be used to access the data? Raw binary data stored at Blob can be echoed from php; there apparently is some form of data attached to object which is accessible? Or, at least attached intrinsically at some low-level?Laoag
Simply because the DOM interface of Blob does not contain the data? Also from MDN: "Blobs represent data that isn't necessarily in a JavaScript-native format.". You need a FileReader to get the bytes from a Blob, that's how it works.Netti
@Laoag "Or, at least attached intrinsically at some low-level" - yes, that's exactly what I mean - a low level not accessible through JavaScript.Netti
@Netti Tried to gather what static void populateBlobData(BlobData*, const HeapVector<ArrayBufferOrArrayBufferViewOrBlobOrUSVString>& parts, bool normalizeLineEndingsToNative); see link at previous comment; does at chromium source, though not well-versed at all in C++? Also tried to look for initial posts discussing FileReader, though was not able to locate older posts or mailing lists exchanges concerning how FileReader actually accesses the Blob data? Another way to make the inquiry could be how to create a shim or replicate functionality of FileReader from scratch?Laoag
@Netti Did find shims of Blob, and it should be possible to create alternative versions of Blob and FileReader, though curious where the actual raw data is actually stored? Why is it possible to POST only Blob to php, and receive binary representation of Blob data in response? Can file_get_contents and php://input be replicated in javascript?Laoag
@guest271314: Those are serverside functions, they won't help you. No, it's not possible to intercept the data that the browsers reads from the file on the disk and sends to the server.Netti
@Netti What about the POST data at XMLHttpRequest? Given limitation of not using .responseType and an environment of safari 5.1.4? Would it be beyond the scope of this Question to inquire to determine the memory slot where Blob is stored? Or, there could be a reference to Blob data in memory at browser profile or configuration folder? When Blob is POSTed, how does javascript send the raw data? Why cannot this data be accessed? fwiw, as a note to this inquiry, found that could create a File object from a Blob with FormData.append() without using new File() constructor.Laoag
@guest271314: Same there. Yes, of course if you debug your browser process you will be able to find the data (and possibly even in some temp folder that the browser is using), but that's still not accessible from JavaScript.Netti
@Netti A workaround for reading Blob data without using FileReader stackoverflow.com/a/38295759; though note, uses technologies Response and ReadableStream.getReader(), which do not appear to have been available when safari 5.1.4 was released; which, in part, motivated this QuestionLaoag
H
16

All variables that are not explicitly represented in any other storage are stored in memory (RAM) and live there until the end of your program or until you unset them (clear them from memory).

TLDR; In RAM

Height answered 7/7, 2016 at 7:18 Comment(3)
Can you provide details of "TLDR; In RAM" specific to Blob implementation in browser?Laoag
@Laoag It's too broad and too specific for StackOverflow for such specific question (I have taken 1year Computer Architecture course in university to learn how information is saved in memory and later accessed). Try some off-site resources. Like this link: computer.howstuffworks.com/ram.htmHeight
@Laoag there is no such thing as "implementation in browser"; there are multiple JS engines times multiple versions in use throughout the different browsers, and each one may have a (slightly) different approach to this. The only thing that is defined in the standards is the API of the Blob class: how it should behave in JS. The implementation details are completely up to the browser devs.Subinfeudate
A
21

Blobs represent a bunch of data that could live anywhere. The File API specification intentionally does not offer any synchronous way of reading a Blob's contents.

Here are some concrete possibilities.

  1. When you create a Blob via the constructor and pass it in-memory data, like a Uint8Array, the Blob's contents live in memory, at least for a while.
  2. When you get a Blob from <input type="file">, the Blob's contents live on disk, in the file selected by the user. The spec mentions snapshotting, but no implementation does it, because it'd add a lot of lag to user operations.
  3. When you get a Blob from another client-side storage API like IndexedDB or the Cache Storage API, the Blob's contents live in the API's backing store on disk.
  4. Some APIs may return a Blob whose data streams from the network. The XMLHttpRequest spec makes this impossible, and I think the fetch spec also requires retrieving the entire response before creating the Blob. However, there could be a future spec that streams an HTTP response.
  5. Blobs created via the Blob constructor from an array of pieces may have their contents scattered across all the places mentioned above.
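As a small illustration of the last point and of the asynchronous-only access mentioned at the top, here is a minimal sketch. It assumes a runtime that provides Blob.prototype.arrayBuffer() and TextDecoder (current browsers, Node.js 18+); older environments would use FileReader instead. The names inner, composite, and readBytes are made up for this example.

```javascript
// Build a Blob from mixed parts: a string, a typed array, and another Blob.
const inner = new Blob(["abc"]);
const composite = new Blob(["hi ", new Uint8Array([0x41, 0x42]), inner]);

// The size is available synchronously: 3 + 2 + 3 bytes.
console.log(composite.size);

// The bytes themselves are only handed out through an asynchronous call,
// which lets the browser fetch them from wherever they currently live.
async function readBytes(blob) {
  const buffer = await blob.arrayBuffer();
  return new Uint8Array(buffer);
}

readBytes(composite).then((bytes) => {
  console.log(new TextDecoder().decode(bytes));
});
```

Nothing in this code reveals, or depends on, whether the pieces sit in RAM, in a temporary file, or in another process; that is the point of the design.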

In Chrome, we use a multi-process architecture where the browser process has a central registry of all live Blobs, and serves as the source of truth for blob contents. When a Blob is created in a renderer (by JavaScript), its contents are moved to the browser process via IPC, shared memory, or temporary files, depending on the size of the Blob. The browser process may also evict in-memory Blob contents to temporary files. The 500 MB limit mentioned in a previous answer was lifted around 2016. More implementation details are in the README for Chrome's Blobs subsystem.

Almaalmaata answered 2/6, 2019 at 21:48 Comment(1)
thanks for the 500mb limit note and great overall breakdown!Toner
R
13

This will not answer your question fully.

So what happens when a new Blob() is declared?

From the official File API specification:

The Blob() constructor can be invoked with zero or more parameters. When the Blob() constructor is invoked, user agents must run the following Blob constructor steps:
[1] If invoked with zero parameters, return a new Blob object with its readability state set to OPENED, consisting of 0 bytes, with size set to 0, and with type set to the empty string.
[2] Otherwise, the constructor is invoked with a blobParts sequence. Let a be that sequence.
[3] Let bytes be an empty sequence of bytes.
[4] Let length be a's length. For 0 ≤ i < length, repeat the following steps:
    1. Let element be the ith element of a.
    2. If element is a DOMString, run the following substeps:
        Let s be the result of converting element to a sequence of Unicode characters [Unicode] using the algorithm for doing so in WebIDL.
        Encode s as UTF-8 and append the resulting bytes to bytes.
    Note:
        The algorithm from WebIDL [WebIDL] replaces unmatched surrogates in an invalid UTF-16 string with U+FFFD replacement characters. Scenarios exist when the Blob constructor may result in some data loss due to lost or scrambled character sequences.  

    3. If element is an ArrayBufferView [TypedArrays], convert it to a sequence of byteLength bytes from the underlying ArrayBuffer, starting at the byteOffset of the ArrayBufferView [TypedArrays], and append those bytes to bytes.
    4. If element is an ArrayBuffer [TypedArrays], convert it to a sequence of byteLength bytes, and append those bytes to bytes.
    5. If element is a Blob, append the bytes it represents to bytes. The type of the Blob array element is ignored.  
[5] If the type member of the optional options argument is provided and is not the empty string, run the following sub-steps:
    1. Let t be the type dictionary member. If t contains any characters outside the range U+0020 to U+007E, then set t to the empty string and return from these substeps.
    2. Convert every character in t to lowercase using the "converting a string to ASCII lowercase" algorithm.
[6] Return a Blob object with its readability state set to OPENED, referring to bytes as its associated byte sequence, with its size set to the length of bytes, and its type set to the value of t from the substeps above. 
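
The byte-collecting part of those steps can be sketched in plain JavaScript. This is only an illustration of the quoted algorithm, not how any browser implements it: collectBlobBytes is a hypothetical helper, Blob parts are deliberately left out because their bytes cannot be read synchronously, and the type-normalisation step [5] is omitted. It assumes TextEncoder and a global Blob are available.

```javascript
// Hypothetical helper that mirrors steps [3] and [4] of the quoted algorithm.
function collectBlobBytes(blobParts) {
  const encoder = new TextEncoder(); // UTF-8; lone surrogates become U+FFFD
  const chunks = [];
  for (const element of blobParts) {
    if (typeof element === "string") {
      // Step 4.2: encode the string as UTF-8.
      chunks.push(encoder.encode(element));
    } else if (ArrayBuffer.isView(element)) {
      // Step 4.3: byteLength bytes starting at byteOffset of the view.
      chunks.push(new Uint8Array(element.buffer, element.byteOffset, element.byteLength));
    } else if (element instanceof ArrayBuffer) {
      // Step 4.4: all bytes of the buffer.
      chunks.push(new Uint8Array(element));
    } else {
      throw new TypeError("This sketch only handles strings, views, and ArrayBuffers");
    }
  }
  // Concatenate the chunks into one byte sequence.
  const total = chunks.reduce((sum, chunk) => sum + chunk.byteLength, 0);
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    bytes.set(chunk, offset);
    offset += chunk.byteLength;
  }
  return bytes;
}

const parts = ["é", new Uint8Array([1, 2, 3]).subarray(1), new ArrayBuffer(4)];
const sketchBytes = collectBlobBytes(parts);

// The sketch and the real constructor should agree on the total size.
console.log(sketchBytes.length === new Blob(parts).size);
```

Note that the algorithm only says which bytes the Blob refers to; it says nothing about where a user agent keeps them.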

A Blob is stored in memory, much like any other ArrayBuffer. It's stored in RAM, just like the other objects declared in the window.

Looking at chrome://blob-internals, we can see how it's physically stored in the RAM. Here is an example blob.

c7828dad-dd4f-44e6-b374-9239dbe35e35
    Refcount: 1
    Status: BlobStatus::DONE: Blob built with no errors.
    Content Type: application/javascript
    Type: file
    Path: /Users/Chetan/Library/Application Support/Google/Chrome/Default/blob_storage/c7828dad-dd4f-44e6-b374-9239dbe35e35/0
    Modification Time: Monday, June 5, 2017 at 4:29:53 PM
    Offset: 4,917,846
    Length: 224,733

On printing the actual contents of the blob, we get a normal js file.

$ cat c7828dad-dd4f-44e6-b374-9239dbe35e35/0

...
html {
   font-family: sans-serif;
   /* 1 */
   -ms-text-size-adjust: 100%;
   /* 2 */
   -webkit-text-size-adjust: 100%;
   /* 2 */ }

/**
 * Remove default margin.
 */
body {
    margin: 0; }
...
Rollick answered 24/5, 2017 at 5:15 Comment(2)
how could you print that content with cat?Lucretialucretius
You say it is "physically stored in the ram," yet it has a file system path. So when does it get copied from RAM to disk?Vennieveno
A
6

A Blob is stored in memory, in the browser's blob storage. If you create a Blob object, you can inspect it in the Firefox memory profiler (about:memory). Below is an example of Firefox output, in which we can see the selected files. There is a difference between Blob and File: a Blob is stored in memory, while a File is stored on the filesystem.

651.04 MB (100.0%) -- explicit
├──430.49 MB (66.12%) -- dom
│  ├──428.99 MB (65.89%) -- memory-file-data
│  │  ├──428.93 MB (65.88%) -- large
│  │  │  ├────4.00 MB (00.61%) ── file(length=2111596, sha1=b95ccd8d05cb3e7a4038ec5db1a96d206639b740)
│  │  │  ├────4.00 MB (00.61%) ── file(length=2126739, sha1=15edd5bb2a17675ae3f314538b2ec16f647e75d7)

There is a bug in Google Chrome: Chrome has a blob limit. Once the total size of the blobs you create exceeds 500 MB, the browser stops creating blobs, because blob storage has reached its 500 MB limit. The only way to avoid this is to write the blob to IndexedDB and then remove it from IndexedDB. When a blob is written to IndexedDB, the blob object is automatically saved to the file system (the blob is converted to a file). Blobs are cleaned from memory by the garbage collector after you stop using them or set blob = null. However, the GC removes a blob after some time, not instantaneously.

Aldoaldol answered 24/5, 2017 at 5:21 Comment(0)
