How to use JavaScript to read a local text file line by line?
M

4

101

I have a demo web page built with HTML and JavaScript. How can I read a local CSV file line by line so that I can extract data from it?

Messere answered 28/4, 2014 at 2:22 Comment(10)
Check this out: html5rocks.com/en/tutorials/file/dndfiles - Asafoetida
Do you have any browser compatibility requirements? Specifically, do you need to support IE9 or earlier? - Wilen
developer.mozilla.org/en-US/docs/Web/API/FileReader - Ablation
@HunterLarco Thank you. The problem is that I don't know how to get each line from the result: reader.readAsText() gives me all the data at once rather than letting me read line by line. - Messere
@LukeMcGregor No requirements; supporting the current browser versions is enough. - Messere
@Messere No problem. You could try splitting the result on '\n', the newline character. - Asafoetida
@Derek朕會功夫 Still, how do I get the contents line by line? It seems that readAsText returns all the data. - Messere
@HunterLarco OK, just like C-style strings. - Messere
@Messere Here's a related post that might answer your question: #9917746 - Ablation
A better-tested, more production-quality solution is at #24648063. @Derek: no, the answer you mentioned doesn't help. - Dorado
B
138

Without jQuery:

const $output = document.getElementById('output');

document.getElementById('file').onchange = function () {
  const file = this.files[0];
  if (!file) return; // no file was selected

  const reader = new FileReader();
  reader.onload = function () {
    // Entire file
    const text = reader.result;
    $output.innerText = text;

    // By lines (the regex handles both \n and \r\n line endings)
    const lines = text.split(/\r?\n/);
    for (const line of lines) {
      console.log(line);
    }
  };
  reader.readAsText(file);
};

<!-- HTML -->
<input type="file" name="file" id="file">
<div id="output">
  ...
</div>

Remember to place the JavaScript after the file input in the page, so that the element already exists when the script runs.
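
If building a second array with split() is a concern for large inputs, a generator can hand out lines lazily from the string that FileReader already produced. This is only a sketch: iterateLines is a made-up name, and the whole file still has to fit in memory as one string, so it avoids the array of lines but not the initial read.

```javascript
// Hypothetical helper: yields the lines of an in-memory string one at a
// time, without first building an array of all lines.
function* iterateLines(text) {
  let start = 0;
  while (start <= text.length) {
    let end = text.indexOf("\n", start);
    if (end === -1) end = text.length; // last line has no trailing \n
    // Drop a carriage return so that \r\n endings are handled as well.
    const stop = end > start && text[end - 1] === "\r" ? end - 1 : end;
    yield text.slice(start, stop);
    start = end + 1;
  }
}

// Possible use inside reader.onload:
//   for (const line of iterateLines(reader.result)) { console.log(line); }
```

Like split(), this yields a final empty string when the text ends with a line break.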

Bethesda answered 28/4, 2014 at 3:40 Comment(10)
I have 200000 lines (not kidding, it's a log file). I don't think your solution covers that, nice try though. - Lanate
Also, this solution doesn't handle a line feed that sits inside a quoted field. As for Tomas, if you have a more advanced browser, you could use a generator to read line by line without doing a "split". - Laddy
Where do I specify the path of the external file that the lines are read from? - Iorio
@TomášZato Mine is 100m lines. I haven't tested that answer yet, though. How did you approach that? A link to an example would be super-appreciated! huanPastas, +1 for the answer! - Syrup
@Syrup I don't remember exactly, but I used a stream that reads data piece by piece and emitted an event every time it encountered \n. But with 100m lines, you're going to run into trouble displaying them in HTML. - Lanate
I've been looking everywhere for a way to do this and this is exactly what I needed. God bless your soul. - Phail
Can we read a file when the script loads, without an input tag, purely in JavaScript, if we provide the file path? - Fiorenze
@TomášZato-ReinstateMonica In fact, I just ran this script on a 60.000.000+ line file, and it was perfectly smooth :) The longest part was the upload. - Ortiz
@Ortiz I actually once wrote a generator function that does not split the original string and serves lines one after another, but I cannot find it. - Lanate
How would this look if I already know the filename in my app's file system? I don't want to trigger the file picker. - Sandberg
A
46

Using ES6, the JavaScript becomes a little cleaner:

// `input` is the change event fired by an <input type="file"> element.
function handleFiles(input) {

    const file = input.target.files[0];
    const reader = new FileReader();

    reader.onload = (event) => {
        const file = event.target.result;
        const allLines = file.split(/\r\n|\n/);
        // Reading line by line
        allLines.forEach((line) => {
            console.log(line);
        });
    };

    reader.onerror = (event) => {
        alert(event.target.error.name);
    };

    reader.readAsText(file);
}
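
In current browsers the same idea can be written without FileReader, because File inherits a promise-based text() method from Blob. A minimal sketch, where readLines is a hypothetical name and the element id 'file' is borrowed from the other answer; like the code above, it loads the entire file into memory before splitting.

```javascript
// Hypothetical helper: resolves to an array with every line of the file.
// Blob.prototype.text() decodes the contents as UTF-8.
async function readLines(file) {
  const text = await file.text();
  return text.split(/\r?\n/); // accepts both \n and \r\n line endings
}

// Possible use with a file input (assumes an element with id "file"):
//   document.getElementById('file').addEventListener('change', async (e) => {
//     const lines = await readLines(e.target.files[0]);
//     lines.forEach((line) => console.log(line));
//   });
```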
Alum answered 18/2, 2017 at 15:16 Comment(6)
Upvoted for splitting lines with a regex, which is the right way to do it. - Desdamona
Simpler regexp: \r?\n - Elayneelazaro
Excellent example, and I love that Windows and Unix-style line endings are handled. Thank you. - Liam
const allLines = file.split(/\r\n|\n/); - This is not really "read line by line". This is gulping the whole multi-gig file and choking on it. - Clansman
@Clansman Indeed, so how do you solve that? - Magog
@Magog Please see this answer: https://mcmap.net/q/212459/-how-to-read-a-text-file-line-by-line-in-javascript - Clansman
Z
3

Here's a function from the MDN docs that shows you how to use a ReadableStream to read a File line-by-line. This example uses fetch, but if you already have a File, you can call stream() and getReader() instead.

async function* makeTextFileLineIterator(fileURL) {
  const utf8Decoder = new TextDecoder("utf-8");
  let response = await fetch(fileURL);
  let reader = response.body.getReader();
  let { value: chunk, done: readerDone } = await reader.read();
  chunk = chunk ? utf8Decoder.decode(chunk, { stream: true }) : "";

  let re = /\r\n|\n|\r/gm;
  let startIndex = 0;

  for (;;) {
    let result = re.exec(chunk);
    if (!result) {
      if (readerDone) {
        break;
      }
      let remainder = chunk.substr(startIndex);
      ({ value: chunk, done: readerDone } = await reader.read());
      chunk =
        remainder + (chunk ? utf8Decoder.decode(chunk, { stream: true }) : "");
      startIndex = re.lastIndex = 0;
      continue;
    }
    yield chunk.substring(startIndex, result.index);
    startIndex = re.lastIndex;
  }
  if (startIndex < chunk.length) {
    // last line didn't end in a newline char
    yield chunk.substr(startIndex);
  }
}

for await (let line of makeTextFileLineIterator(urlOfFile)) {
  processLine(line);
}
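
For the File case mentioned above, here is a rough sketch of what the stream() and getReader() variant could look like. It is not the MDN function: linesFromBlob is a name I made up, it assumes UTF-8 text, and it only recognises \n and \r\n (not a lone \r) as line endings.

```javascript
// Hypothetical helper: yields the lines of a Blob or File one by one,
// reading chunk by chunk instead of loading the whole file first.
async function* linesFromBlob(blob) {
  const decoder = new TextDecoder("utf-8");
  const reader = blob.stream().getReader();
  let buffered = ""; // text after the last line break seen so far

  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    // stream: true keeps an incomplete multi-byte character for the next chunk
    buffered += decoder.decode(value, { stream: true });
    const parts = buffered.split(/\r?\n/);
    buffered = parts.pop(); // the last part may be an unfinished line
    yield* parts;
  }

  buffered += decoder.decode(); // flush any bytes still held by the decoder
  if (buffered.length > 0) {
    yield buffered; // last line did not end in a line break
  }
}

// Possible use, given a File from an <input type="file">:
//   for await (const line of linesFromBlob(file)) { processLine(line); }
```

Unlike split() on the whole text, this does not emit a final empty line when the file ends with a line break.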
Ziwot answered 2/11, 2023 at 16:32 Comment(0)
R
0

You can use the following code as a reference for reading the first n lines of a file. Note some caveats and observations:

  • Why search for the position of the line break? You might be tempted to read 512KB (or any other chunk size) directly as text. But unless you read the entire file at once, you risk splitting a multi-byte character at the 512KB boundary: the last few bytes of the chunk might be an incomplete character. When you slice a Blob (File) object, you are slicing a byte array, not a character array. However, if we locate the line break and read only up to that position, we know that everything before it consists of whole characters (in UTF-8, the line feed byte 0x0A never occurs inside a multi-byte character).

  • Does this guarantee that the whole file is not read? I do not know, but from the consumer's point of view, I never touch anything beyond the chunks I request. If the browser's underlying mechanism decides to load the entire file anyway, that is outside my control, and this is the best I can do.

Example code:

/*
    This function is used to scan the first few lines of a file to determine the position of the nth line break. 
    This is useful for large files where we want to avoid reading the entire file into memory.
    Read is done in chunks of 512KB.
*/
async function scanLinePosition(file: File, lines: number): Promise<number> {
    return await new Promise((resolve, reject) => {
        const reader = new FileReader();
        let rowsRead = 0;
        let chunkSize = 512 * 1024; // 512KB
        let totalRead = 0;

        reader.onload = () => {
            const bytes = new Uint8Array(reader.result as ArrayBuffer);
            for (let i = 0; i < bytes.length; i++) {
                if (bytes[i] === 10 && ++rowsRead >= lines) {
                    // Byte offset just past the nth line feed (0x0A),
                    // not the end of the chunk that contains it.
                    resolve(totalRead + i + 1);
                    return;
                }
            }

            totalRead += bytes.length;

            if (bytes.length === chunkSize) {
                // A full chunk without enough line breaks: scan the next one.
                reader.readAsArrayBuffer(file.slice(totalRead, totalRead + chunkSize));
            } else {
                // End of file reached before n line breaks were found.
                resolve(totalRead);
            }
        };

        reader.onerror = () => {
            reject(reader.error);
        };

        reader.readAsArrayBuffer(file.slice(0, chunkSize));
    });
}

async function readFileContent(file: File, lines: number) {
    const readLimit = await scanLinePosition(file, lines);

    return await new Promise((resolve, reject) => {
        const reader = new FileReader();
        reader.onload = () => {
            const rows = (reader.result as string).split(/\r?\n/);
            if (rows.length >= lines) {
                resolve(rows.slice(0, lines));
            } else {
                reject(new Error("File is too short"));
            }
        };

        reader.onerror = () => {
            reject(reader.error);
        };

        reader.readAsText(file.slice(0, readLimit));
    });
}
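
As an aside on the chunk-boundary concern above: TextDecoder has a stream option that holds back an incomplete multi-byte sequence until the next chunk arrives, which is another way to decode fixed-size slices safely. A small sketch, assuming UTF-8 input; decodeChunks is a hypothetical helper and not part of the code above.

```javascript
// Hypothetical helper: decodes a sequence of Uint8Array chunks as UTF-8,
// even when a multi-byte character is split across two chunks.
function decodeChunks(chunks) {
  const decoder = new TextDecoder("utf-8");
  let text = "";
  for (const chunk of chunks) {
    // With stream: true, trailing bytes of an unfinished character are
    // buffered inside the decoder instead of becoming U+FFFD.
    text += decoder.decode(chunk, { stream: true });
  }
  return text + decoder.decode(); // final call flushes the decoder
}

// "€" is encoded as the three bytes E2 82 AC. Split after the second byte:
//   decodeChunks([Uint8Array.of(0xe2, 0x82), Uint8Array.of(0xac)])  // "€"
```

Decoding each slice on its own, without the stream option, would turn the cut-off bytes into replacement characters.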
Rhythmist answered 19/3 at 17:42 Comment(0)
