How to read from stdin line by line in Node
I'm looking to process a text file with node using a command line call like:

node app.js < input.txt

Each line of the file needs to be processed individually, but once processed the input line can be forgotten.

Using stdin's 'data' listener, I get the input stream chunked by byte size rather than by line, so I set this up:

process.stdin.resume();
process.stdin.setEncoding('utf8');

var lingeringLine = "";

process.stdin.on('data', function(chunk) {
    var lines = chunk.split("\n");

    lines[0] = lingeringLine + lines[0];
    lingeringLine = lines.pop();

    lines.forEach(processLine);
});

process.stdin.on('end', function() {
    processLine(lingeringLine);
});

But this seems so sloppy, having to massage the first and last items of the lines array. Is there not a more elegant way to do this?

Memnon answered 20/11, 2013 at 3:39 Comment(0)

You can use the readline module to read from stdin line by line:

const readline = require('readline');

const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
  terminal: false
});

rl.on('line', (line) => {
  console.log(line);
});

rl.once('close', () => {
  // end of input
});
Alter answered 20/11, 2013 at 4:2 Comment(9)
That seems to work well for entering input by hand in the console, however, when I pass a file into the command the file is sent to stdout. A bug? readline is considered unstable at this point.Memnon
I think you can just change process.stdout to a different writable stream — it could be as simple as output: new require('stream').Writable()Unspoiled
Unfortunately, I need the stdout. I left it out of my question, but I'm trying to get the app to be usable as node app.js < input.txt > output.txt.Memnon
Apparently this is 'by design' github.com/joyent/node/issues/4243#issuecomment-10133900. So I ended up doing as you said and provided the output option a dummy writable stream, then wrote directly to the stdout stream. I don't like it, but it works.Memnon
Looks like if you pass the argument terminal: false to createInterface, it fixes this problem.Meaningless
I have used output: null with no ill effects.Ido
It's worth noting process.stdin is broken in electron for windows and the electron team does not believe it is fixable, so this code and every other example will not work.Wnw
reading from stdin is common practice for utilities and cli programs... it should definitely not require an external module/dependencyHying
readline is a builtin module to node. Also it makes sense because javascript was designed for browsers which don't have consoles (in the traditional sense).Acth
// Works on POSIX and Windows
var fs = require("fs");
var stdinBuffer = fs.readFileSync(0); // STDIN_FILENO = 0
console.log(stdinBuffer.toString());
Marten answered 3/8, 2017 at 14:9 Comment(15)
Could you include some details? There is already a highly rated accepted answerDecalogue
This doesn't work for me (node v9.2.0, Windows). Error: EISDIR: illegal operation on a directory, fstat at tryStatSync (fs.js:534:13)Hypesthesia
Worked for me on node v6.11.2, OSX.Maritime
@AlexChaffee: There appears to be a bug on Windows (still present as of v9.10.1) if there's no stdin input or if stdin is closed - see this GitHub issue. Apart from this, however, the solution does work on Windows.Clog
This is what I was looking for, but it doesn't actually answer the question, so I guess it's kind of a wash.Khaddar
for those who got the fs not defined error: w3schools.com/nodejs/nodejs_filesystem.aspUranography
works very well and is the shortest by far, could make it shorter by doing fs.readFileSync(0).toString()Crompton
Great answer! Really helps for js shell scripts and quickly/easily/synchronously getting values piped into the script as input.Fortissimo
Note that the "magic number" 0 can be replaced with the clearer process.stdin.fd (which is just hard-coded to 0 but makes it more obvious what you're doing)Khaddar
my note: echo hoge | node -e 'console.log(require("fs").readFileSync(0).toString())'Dashpot
Nice, and does not require external dependencies and setting up npm package. Hence, can be used in inline mode from anywhere.Rhombus
This did not work for me on Windows 11 when stdin is long. It truncated the input.Gaga
@Khaddar Please note, however, that using process.stdin.fd can cause issues, as described here: #40362869 . Something like const STDIN_FILE_DESCRIPTOR = 0; fs.readFileSync(STDIN_FILE_DESCRIPTOR);, will work in all cases, while keeping your intention explicit.Slope
This doesn't seem to read line by line, rather it waits for stdin to close before processing everything at once.Bibliogony
I've had cases that fails with "EAGAIN: resource temporarily unavailable" errors as noted here, especially when trying to run the code in middle of piping calls. Turns out it may not be safe to read stdin in a non-blocked way.Lecturer

readline is specifically designed to work with a terminal (that is, when process.stdin.isTTY === true). There are a lot of modules that provide split functionality for generic streams, such as split. It makes things super easy:

process.stdin.pipe(require('split')()).on('data', processLine)

function processLine (line) {
  console.log(line + '!')
}
Goody answered 20/11, 2013 at 9:14 Comment(2)
no it's not. If you don't want to read line-by-line you don't need it at allGoody
Tip: if you want to run some code after processing all the lines, add .on('end', doMoreStuff) after the first .on(). Remember that if you just write the code normally after the statement with .on(), that code will run before any input is read, because JavaScript isn’t synchronous.Brendis
#!/usr/bin/env node

const EventEmitter = require('events');

function stdinLineByLine() {
  const stdin = new EventEmitter();
  let buff = '';

  process.stdin
    .on('data', data => {
      buff += data;
      const lines = buff.split(/\r\n|\n/);
      buff = lines.pop();
      lines.forEach(line => stdin.emit('line', line));
    })
    .on('end', () => {
      if (buff.length > 0) stdin.emit('line', buff);
    });

  return stdin;
}

const stdin = stdinLineByLine();
stdin.on('line', console.log);
Bul answered 21/4, 2018 at 16:24 Comment(0)

Node.js has changed a lot since the accepted answer was posted, so here is a modern example using readline to split the stream into lines, for await to read from the stream, and ES modules (top-level await requires the file to be an ES module, e.g. with a .mjs extension):

import { createInterface } from "node:readline"

for await (const line of createInterface({ input: process.stdin })) {
  // Do something with `line` here.
  console.log(line)
}
True answered 22/7, 2023 at 9:19 Comment(0)

New answer to old question.

Since Node 10 (April 2018) ReadableStreams such as process.stdin support for-await-of loops thanks to the addition of a Symbol.asyncIterator method (ReadableStream documentation, Symbol.asyncIterator documentation).

Using this we can create an adaptor that goes from iterating through chunks of data, to iterating through lines. The logic for doing this was adapted from this answer.

function streamByLines(stream) {
  stream.setEncoding('utf8');
  return {
    async *[Symbol.asyncIterator]() {
      let buffer = '';

      for await (const chunk of stream) {
        buffer += chunk;
        const lines = buffer.split(/\r?\n/);
        buffer = lines.pop();
        for (const line of lines) {
          yield line;
        }
      }
      if (buffer.length > 0) yield buffer;
    },
  };
}

You can use it like this (in a context where await is allowed)

for await (const line of streamByLines(process.stdin)) {
  console.log('Current line:', line)
}
Raincoat answered 4/3, 2023 at 3:58 Comment(0)

Just translating what https://mcmap.net/q/108268/-how-to-read-from-stdin-line-by-line-in-node said to plain JavaScript with require, and wrapping it in an async IIFE so that await is allowed and we can test it:

readlines.js

const readline = require('readline');

(async function () {
  for await (const line of readline.createInterface({ input: process.stdin })) {
    console.log(line)
  }
})()

We can then test this with:

(echo asdf; sleep 1; echo qwer; sleep 1; echo zxcv) | node readlines.js

and it outputs:

asdf
qwer
zxcv

where each line is printed immediately after it is read from stdin, spaced 1 second apart. This confirms that lines are being read one by one, as soon as they become available.

Tested on Node.js v16.14.2, Ubuntu 23.04.

Bibliogony answered 2/11, 2023 at 3:43 Comment(0)

In my case the program (elinks) returned lines that looked empty but in fact contained special terminal characters, color control codes, and backspaces, so the grep options suggested elsewhere did not work for me. So I wrote this small script in Node.js. I called the file tight, but that's just a random name.

#!/usr/bin/env node

function visible(a) {
    var R = ''
    for (var i = 0; i < a.length; i++) {
        if (a[i] == '\b') { R = R.slice(0, -1); continue; }
        if (a[i] == '\u001b') {
            while (a[i] != 'm' && i < a.length) i++
            if (a[i] == undefined) break
        }
        else R += a[i]
    }
    return R
}

function empty(a) {
    a = visible(a)
    for (var i = 0; i < a.length; i++) {
        if (a[i] != ' ') return false
    }
    return  true
}

var readline = require('readline')
var rl = readline.createInterface({ input: process.stdin, output: process.stdout, terminal: false })

rl.on('line', function(line) {
    if (!empty(line)) console.log(line) 
})
Vesuvius answered 13/6, 2016 at 6:54 Comment(0)

I have revisited the code years later; here is the updated version.

It should be good for large files piped into stdin: it supports UTF-8 and pause/resume, and it can read a stream line by line and process the input without choking.

StreamLinesReader Class Description:

The StreamLinesReader class is designed to process streams of text data, specifically handling each line of input efficiently and correctly. Key features of this class include:

  • The class sets the stream's encoding to UTF-8, ensuring that multibyte UTF-8 characters are not split in the middle.

  • Pause-resume mechanism: the class pauses the input stream upon receiving data. This is essential for controlling the flow of data, especially when the line-processing function is slower than the rate at which data is received. Pausing the stream prevents buffer overflow and ensures that each line is processed sequentially without losing any data.

  • Accumulation of incomplete lines: when a received chunk does not end with a newline character, the class accumulates the partial line, holding it in a buffer until the line is completed. This ensures that lines are processed only when they are complete, preserving the integrity of the data.

  • Handling of split lines: when a chunk of data is received, the class splits it into lines and processes each complete line individually. If the last part of the chunk does not end with a newline, that part is buffered rather than emitted. Once all complete lines have been processed, the stream is resumed to receive more data, allowing the buffered partial line to be completed by the next chunk.

  • Stream end processing: when the end of the stream is reached, the class checks whether any data remains in the buffer (an incomplete line) and processes it. Additionally, the class provides a way to be notified when stream processing is complete: the waitEnd function returns a promise that resolves once the stream has finished.

class code:

class StreamLinesReader {
    constructor(stream, onLineFunction, onEnd = undefined) {
        stream.pause();

        this.stream = stream;
        this.onLine = onLineFunction;
        this.onEnd = onEnd;
        this.buffer = [];
        this.line = 0;

        this.stream.setEncoding('utf8');

        this.stream.on('data', (chunk) => {
            stream.pause();
            this.processChunk(chunk).then(() => this.stream.resume());
        });

        this.stream.on('end', async () => {
            if (this.buffer.length) {
                const str = this.buffer.join('');
                await this.onLine(str, this.line++);
            }
            if (this.onEnd) await this.onEnd();
            if (this.resolveWait) this.resolveWait();
        });

        this.stream.resume();
    }

    async processChunk(chunk) {
        const newlines = /\r\n|\n/;
        const lines = chunk.split(newlines);

        if (lines.length === 1) {
            this.buffer.push(lines[0]);
            return;
        }

        // Join buffer and first line
        this.buffer.push(lines[0]);
        const str = this.buffer.join('');
        this.buffer.length = 0;
        await this.onLine(str, this.line++);

        // Process lines in the chunk
        for (let i = 1; i < lines.length - 1; i++) {
            await this.onLine(lines[i], this.line++);
        }

        // Buffer the last line (might be the beginning of the next line)
        this.buffer.push(lines[lines.length - 1]);
    }

    // optional:
    waitEnd() {
        // Return a new promise and save the resolve function
        return new Promise((resolve) => {
            this.resolveWait = resolve;
        });
    }
}

example usage:

session.on('pty', (accept, reject, info) => {
    accept();
    session.on('shell', (accept, reject) => {
        const stream = accept();

        const onLineFunction = async (line, lineNumber) => {
            console.log(lineNumber, "line ", line);
            if (line === 'exit') {
                stream.end();
                // Assuming conn is a connection variable defined elsewhere
                conn.end();
            }
        };

        const onEndFunction = async () => {
            console.log("Stream has ended");
        };

        new StreamLinesReader(stream, onLineFunction, onEndFunction);
        
        const OUTPUT = 'shell output!\n';
        stream.write(OUTPUT);
    });
});

My old callback-based code was (it reads the stream line by line; should be good for large files piped into stdin):

var n=0;
function on_line(line,cb)
{
    ////one each line
    console.log(n++,"line ",line);
    return cb();
    ////end of one each line
}

var fs = require('fs');
var readStream = fs.createReadStream('all_titles.txt');
//var readStream = process.stdin;
readStream.pause();
readStream.setEncoding('utf8');

var buffer=[];
readStream.on('data', (chunk) => {
    const newlines=/[\r\n]+/;
    var lines=chunk.split(newlines)
    if(lines.length==1)
    {
        buffer.push(lines[0]);
        return;
    }   
    
    buffer.push(lines[0]);
    var str=buffer.join('');
    buffer.length=0;
    readStream.pause();

    on_line(str,()=>{
        var i=1,l=lines.length-1;
        i--;
        function while_next()
        {
            i++;
            if(i<l)
            {
                return on_line(lines[i],while_next);
            }
            else
            {
                buffer.push(lines.pop());
                lines.length=0;
                return readStream.resume();
            }
        }
        while_next();
    });
  }).on('end', ()=>{
      if(buffer.length)
      {
          var str=buffer.join('');
          buffer.length=0;
          on_line(str, after_end);
      }
      else after_end();
      function after_end()
      {
          ////after end
          console.error('done')
          ////end after end
      }
  });
readStream.resume();
Cheffetz answered 14/8, 2016 at 16:42 Comment(2)
what is happening in this answer?Featherston
@Featherston Added an explanation and an es6 codeCheffetz

If you wish to await the input entry you can do:

const readline = require('readline');
const rl = readline.createInterface({ input: process.stdin, terminal: false });

function getUserInputOnEnter() {
    return new Promise(resolve => {
        rl.once('line', line => resolve(line))
    });
}

and then use it as :

let cnt = await getUserInputOnEnter();
Bolinger answered 10/4, 2024 at 5:29 Comment(1)
This will only resolve the first line.Memnon

If you want to ask the user for the number of lines first:

    //array to save line by line 
    let xInputs = [];

    const getInput = async (resolve)=>{
            const readline = require('readline').createInterface({
                input: process.stdin,
                output: process.stdout,
            });
            readline.on('line',(line)=>{
            readline.close();
            xInputs.push(line);
            resolve(line);
            })
    }

    const getMultiInput = (numberOfInputLines,callback)=>{
        let i = 0;
        let p = Promise.resolve(); 
        for (; i < numberOfInputLines; i++) {
            p = p.then(_ => new Promise(resolve => getInput(resolve)));
        }
        p.then(()=>{
            callback();
        });
    }

    //get number of lines 
    const readline = require('readline').createInterface({
        input: process.stdin,
        output: process.stdout,
        terminal: false
    });
    readline.on('line',(line)=>{
        getMultiInput(line,()=>{
           //get here the inputs from xinputs array 
        });
        readline.close();
    })
Carillo answered 3/10, 2020 at 6:5 Comment(0)
process.stdin.pipe(process.stdout);
Anglim answered 9/11, 2020 at 6:30 Comment(2)
Please add some explanation tooDugout
Hi Ayush. Thanks for your answer. Usually answers with an explanation are more welcome here. Would you like to add an explanation to your answer? You may improve the formatting of your answer as well.Tandi

© 2022 - 2025 — McMap. All rights reserved.