Custom implementation of InputStream

To send data to a file on my FTP server, I need to create a custom InputStream implementation that reads database data row by row, converts it to CSV and publishes it via its read() methods. From the database I get a List<Application> object with the data; for each Application object I want to create one line in the CSV file.

My idea is to load all the data in the constructor and then override the read method. Do I need to override all of InputStream's methods? I tried googling for some examples but didn't succeed - could you possibly give me a link to one?

Styracaceous answered 26/1, 2011 at 13:40 Comment(1)
It might be easier to write bytes to a PipedOutputStream which would be read from a corresponding PipedInputStream: https://mcmap.net/q/80862/-how-do-i-convert-an-outputstream-to-an-inputstream Schnorkle
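
For illustration, a minimal sketch of that piped-stream idea (the generic row type and the toCsvLine function stand in for the question's Application objects; none of these names come from the original post). Piped streams require the producer and consumer to run on different threads, so the CSV rows are written from a background thread:

import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Function;

public final class PipedCsvSource {

    // Returns a stream the FTP client can consume on the current thread while
    // a background thread produces one CSV line per row.
    public static <T> InputStream openCsvStream(List<T> rows, Function<T, String> toCsvLine)
            throws IOException {
        PipedInputStream in = new PipedInputStream();
        PipedOutputStream out = new PipedOutputStream(in);

        Thread producer = new Thread(() -> {
            try {
                for (T row : rows) {
                    out.write((toCsvLine.apply(row) + "\r\n").getBytes(StandardCharsets.UTF_8));
                }
            } catch (IOException e) {
                e.printStackTrace(); // real code should surface this to the consumer side
            } finally {
                try {
                    out.close(); // closing the pipe signals end-of-stream to the reader
                } catch (IOException ignored) {
                }
            }
        }, "csv-producer");
        producer.start();

        return in;
    }
}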

You only need to implement the read() method without parameters. All other methods are implemented as calls to that method. For performance reasons (and even for ease of implementation) it might be better to implement the three-argument read() method instead and re-implement the no-args read() in terms of it, as sketched below.
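
A rough sketch of that suggestion (the CsvRowInputStream name and the nextChunk() hook are invented for illustration; they are not from the question or any library):

import java.io.IOException;
import java.io.InputStream;

// Implements the bulk three-argument read() and builds the no-args read() on top of it.
// nextChunk() is a hypothetical hook that returns the next CSV row as encoded bytes,
// or null when the rows are exhausted.
public abstract class CsvRowInputStream extends InputStream {

    private byte[] current = new byte[0];
    private int pos = 0;

    protected abstract byte[] nextChunk() throws IOException;

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        while (pos >= current.length) {   // current chunk used up: fetch the next row
            byte[] next = nextChunk();
            if (next == null) {
                return -1;                // no more rows: end of stream
            }
            current = next;
            pos = 0;
        }
        int n = Math.min(len, current.length - pos);
        System.arraycopy(current, pos, b, off, n);
        pos += n;
        return n;
    }

    @Override
    public int read() throws IOException {
        byte[] one = new byte[1];
        int n = read(one, 0, 1);          // single-byte read delegates to the bulk read
        return n == -1 ? -1 : one[0] & 0xff;
    }
}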

Belda answered 26/1, 2011 at 13:44 Comment(6)
According to the documentation, you need to implement the available() method also.Lumpfish
Mr. Ed: it clearly says "should". And since you should not rely on available() to know how many bytes to read anyway, I'd say that the stream will work just fine with the default implementation.Belda
The default implementation of available() returns zero. If some function relied on it to know whether it could read and get something or not, that function won't work unless available() is overridden.Lumpfish
@Mr Ed: yes, that's true. But such a function would be broken: available() is only defined to be an estimate. And relying on an estimate to be accurate is a mistake, in my opinion.Belda
The latest documentation says: "...The available method for class InputStream always returns 0. This method should be overridden by subclasses. Returns: an estimate of the number of bytes that can be read (or skipped over) from this input stream without blocking or 0 when it reaches the end of the input stream." Anyway, the only reason I mentioned it in the first place was because my program didn't work because I didn't define that method.Lumpfish
My actual experience here: if you only implement read(), BufferedInputStream will return short reads if you try to read past 256 characters; when I implemented available() it started working correctly.Smithson

Some very important points I ran into when implementing my own InputStream (a minimal sketch follows the list).

  1. Override available(). As the Javadoc says:

    The available method for class InputStream always returns 0. This method should be overridden by subclasses.

    Not overriding this method will cause any attempt to test whether the stream is readable to return false. For example, if you feed your InputStream to an InputStreamReader, the reader will always return false when you invoke reader.ready().

  2. Return -1 from read() at end of stream. The doc doesn't emphasize this:

    If no byte is available because the end of the stream has been reached, the value -1 is returned. This method blocks until input data is available, the end of the stream is detected, or an exception is thrown.

    If you choose to make read() block when no data is available, you have to remember to return -1 once the stream is exhausted. Not doing so can make the bulk read(byte[] b, int off, int len) block as well, because of the following loop in its default implementation:

    for (; i < len; i++) { // len is typically a relatively large number (8192 - readPosition)
        c = read();
        if (c == -1) {
            break;
        }
        b[off + i] = (byte)c;
    }
    

    And this causes some (if not all) higher-level reads to block, such as a reader's readLine(), read(), etc.
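
Putting both points together, a minimal sketch (the stream here just reads a precomputed in-memory CSV string; the class name is made up for illustration):

import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class CsvBytesInputStream extends InputStream {

    private final byte[] data;
    private int pos = 0;

    public CsvBytesInputStream(String csv) {
        this.data = csv.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public int read() throws IOException {
        // Point 2: report end of stream with -1 instead of blocking forever.
        if (pos >= data.length) {
            return -1;
        }
        return data[pos++] & 0xff; // mask so bytes >= 0x80 are not returned as negative ints
    }

    @Override
    public int available() throws IOException {
        // Point 1: report a real estimate so wrappers like InputStreamReader.ready() work.
        return data.length - pos;
    }
}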

Furnishing answered 29/12, 2015 at 8:57 Comment(0)

For potentially large data you can use com.google.common.io.FileBackedOutputStream from Guava.

Javadoc: An OutputStream that starts buffering to a byte array, but switches to file buffering once the data reaches a configurable size.

Using out.getSupplier().getInput() you get your InputStream.
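
Note that in current Guava versions getSupplier() has been replaced by asByteSource(). A rough usage sketch under that assumption (the threshold and the sample data are placeholders):

import com.google.common.io.FileBackedOutputStream;

import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class FileBackedCsvExample {

    public static void main(String[] args) throws IOException {
        // Buffer in memory up to ~1 MB, then spill to a temporary file.
        FileBackedOutputStream out = new FileBackedOutputStream(1024 * 1024);
        try {
            out.write("id,name\r\n42,example\r\n".getBytes(StandardCharsets.UTF_8));
        } finally {
            out.close();
        }

        // Read everything back; this is the stream you would hand to the FTP client.
        try (InputStream in = out.asByteSource().openStream()) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        }

        out.reset(); // discard the buffer / delete the temp file when done
    }
}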

Precaution answered 26/1, 2011 at 14:30 Comment(2)
so could I first use it as an output stream, submitting all my data, and then get an input stream providing all that data to my FTP client?Styracaceous
Yes and no, it should be easier than doing it on the fly. If there's any error when reading, nothing gets sent. You can display the whole content while debugging if you want to. It's up to you.Precaution

There's absolutely no need to create a custom InputStream. Use ByteArrayInputStream, something like this:

public static InputStream createStream() {
    final String csv = createCsvFromDataBaseValues();
    // Specify the charset explicitly so the bytes don't depend on the platform default.
    return new ByteArrayInputStream(csv.getBytes(StandardCharsets.UTF_8));
}

Especially given this quote:

My idea is to load all the data in the constructor and then override the read method.

If you do it like this, you gain absolutely nothing by implementing a custom InputStream. It's pretty much equivalent to the approach I outlined above.

Katherinkatherina answered 26/1, 2011 at 13:47 Comment(3)
It's probably tens of thousands of Application objects; each produces a CSV line of roughly 100 characters. Would it be a good idea to build such a long string in memory, or would it be better to write a temporary file and transfer it when it's done?Styracaceous
@John either way, I'd create a common interface that I'd pass to the Application objects, and I'd experiment with both StringBuilder-backed and File-backed versions of this interface.Katherinkatherina
My understanding is that this would fail if the data is larger than 2 GB. Correct me if I am wrong.Laudable

Why do you need a custom InputStream? Why not just write the CSV data, as you generate it, to the OutputStream being written to the FTP server?
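
A rough sketch of that idea, assuming Apache Commons Net's FTPClient (not mentioned in the question) as the FTP library; the host, credentials, remote file name and the rowsAsCsv input are placeholders:

import org.apache.commons.net.ftp.FTP;
import org.apache.commons.net.ftp.FTPClient;

import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class DirectFtpCsvUpload {

    // Writes one CSV line per row straight to the server, no intermediate stream needed.
    public static void upload(List<String> rowsAsCsv) throws IOException {
        FTPClient ftp = new FTPClient();
        ftp.connect("ftp.example.com");
        try {
            ftp.login("user", "password");
            ftp.setFileType(FTP.BINARY_FILE_TYPE);
            ftp.enterLocalPassiveMode();

            OutputStream out = ftp.storeFileStream("applications.csv"); // check for null in real code
            try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
                for (String row : rowsAsCsv) {
                    writer.write(row);
                    writer.write("\r\n");
                }
            }
            ftp.completePendingCommand(); // finalize the transfer after closing the stream

            ftp.logout();
        } finally {
            ftp.disconnect();
        }
    }
}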

Nerva answered 26/1, 2011 at 16:9 Comment(2)
It's a nice approach if you can partition the reads and writes into separate threads.Rebirth
@nobar, that's rarely useful. about the only time that's a useful thing to do is if the producer and consumer can both arbitrarily block on some other resources.Nerva

If the data is not too large, you could:

  • Read it all
  • Convert to CSV (text)
  • Get the text bytes (via String.getBytes(encoding))
  • Put the byte array in a ByteArrayInputStream
Altorelievo answered 26/1, 2011 at 13:47 Comment(0)
