DataInputStream vs InputStreamReader, trying to conceptually understand the two
F

6

7

As I tentatively understand it at the moment:

DataInputStream is an InputStream subclass, hence it reads and writes bytes. If you are reading bytes and you know they are all going to be ints or some other primitive data type, then you can read those bytes directly into the primitive using DataInputStream.

  • Question: Would you need to know the type (int, string, etc.) of the content being read before it is read? And would the whole file need to consist of that one primitive type?

The question I am having is: Why not use an InputStreamReader wrapped around the InputStream's byte data? With this approach you are still reading the bytes, then converting them to integers that represent characters. Which integers represent which characters depends on the character set specified, e.g., "UTF-8".

  • Question: In what case would an InputStreamReader fail to work where a DataInputStream would work?

My guess at an answer: If speed is really important, and you can do it, then converting the InputStream's byte data directly to the primitive via DataInputStream would be the way to go? This avoids the Reader having to "cast" the byte data to an int first; and it wouldn't rely on providing a character set to interpret which character is being represented by the returned integer. I suppose this is what people mean when they say DataInputStream allows for a machine-independent read of the underlying data.

  • Simplification: DataInputStream can convert bytes directly to primitives.

Question that spurred the whole thing: I was reading the following tutorial code:

    FileInputStream fis = openFileInput("myFileText");

    BufferedReader reader = new BufferedReader( new InputStreamReader( new DataInputStream(fis)));

    EditText editText = (EditText)findViewById(R.id.edit_text);

    String line;

    while(  (line = reader.readLine()) != null){

        editText.append(line);
        editText.append("\n");
    }

...I do not understand why the instructor chose to use new DataInputStream(fis) because it doesn't look like any of the ability to directly convert from bytes to primitives is being leveraged?

  • Am I missing something?

Thanks for your insights.

Flammable answered 26/8, 2014 at 17:14 Comment(1)
I don't think you are missing anything. InputStreamReader is going to call DataInputStream.read to read bytes from the file. - Partheniaparthenocarpy
A
13

InputStreamReader and DataInputStream are completely different.

DataInputStream is an InputStream subclass, hence it reads and writes bytes.

This is incorrect: an InputStream only reads bytes, and DataInputStream extends it so you can read Java primitives as well. Neither of them is able to write any data.

Question: would you need to know the type (int, string, etc.) of the content being read before it is read? And would the whole file need to consist of that one primitive type?

A DataInputStream should only be used to read data that was previously written by a DataOutputStream (or by something that produces exactly the same binary layout). If that's not the case, your DataInputStream will not "understand" the data you are reading and will return meaningless values. Therefore, you need to know exactly which types the corresponding DataOutputStream wrote, and in which order.

For example, if you want to save your application's state (let's say it consists of a few numbers):

public void exit() throws IOException {
    //...
    // try-with-resources closes (and flushes) the stream
    try (DataOutputStream dout = new DataOutputStream(new FileOutputStream(STATE_FILE))) {
        dout.writeFloat(someFloat);
        dout.writeInt(someInt);
        dout.writeDouble(someDouble);
    }
}

public void startup() throws IOException {
    try (DataInputStream din = new DataInputStream(new FileInputStream(STATE_FILE))) {
        // exactly the same types in the same order, otherwise it's going to return weird data
        someFloat = din.readFloat();
        someInt = din.readInt();
        someDouble = din.readDouble();
    }
}

That's basically the whole story of DataInputStream and DataOutputStream: write your primitive variables to a stream and read them.
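
To make the "same types, same order" rule concrete, here is a minimal sketch that writes to an in-memory buffer instead of a file. The class and method names (DataOrderDemo, write, readInOrder, readSwapped) are made up for this illustration. Note that reading in the wrong order does not throw here; the bytes are just interpreted as the wrong types.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of any library.
class DataOrderDemo {

    // Writes an int followed by a double: 4 + 8 = 12 bytes in total.
    static byte[] write(int i, double d) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(i);
            out.writeDouble(d);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the values back in the order they were written.
    static String readInOrder(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int i = in.readInt();
            double d = in.readDouble();
            return i + " " + d;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the same 12 bytes in the wrong order. The first 8 bytes
    // (the int plus half of the double) are treated as a double.
    static String readSwapped(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            double d = in.readDouble();
            int i = in.readInt();
            return d + " " + i;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = write(42, 3.5);
        System.out.println(data.length);        // 12
        System.out.println(readInOrder(data));  // 42 3.5
        System.out.println(readSwapped(data));  // meaningless values, no exception
    }
}
```

The stream carries no type information at all, which is why the reader has to know the layout in advance.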

Now, the InputStreamReader is something entirely different. An InputStreamReader "translates" encoded text to Java characters. You can basically use any text stream (knowing its encoding) and read Java Characters from that source using an InputStreamReader.

With this approach you are still reading the bytes, then converting them to integers that represent characters. Which integers represent which characters depends on the character set specified, e.g., "UTF-8".

A character encoding is more than a simple mapping between code points and characters. It also specifies how each code point is represented as bytes. For example, UTF-8 and UTF-16 share the same character mapping (Unicode), but an InputStreamReader would fail dramatically if you tried to read a UTF-8 stream as UTF-16. The string aabb, which is represented by four bytes in UTF-8 ('a', 'a', 'b', 'b'), would be converted to two characters: the bytes of the two a's would be regarded as one character, and the bytes of the two b's as another. Read as big-endian UTF-16 these are U+6161 and U+6262, two CJK ideographs that have nothing to do with the original text.

An InputStreamReader handles all that stuff and is therefore able to read text from any source (unlike DataInputStream) if you know the encoding.
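
The aabb case can be tried out with a small sketch like the following. CharsetMismatchDemo and decode are names invented for this example; the sketch relies on Java's UTF-16 charset assuming big-endian when the input has no byte-order mark.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class CharsetMismatchDemo {

    // Decodes the bytes through an InputStreamReader using the given charset.
    static String decode(byte[] data, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Four bytes: 0x61 0x61 0x62 0x62
        byte[] utf8 = "aabb".getBytes(StandardCharsets.UTF_8);

        String right = decode(utf8, StandardCharsets.UTF_8);
        String wrong = decode(utf8, StandardCharsets.UTF_16);

        System.out.println(right + " (" + right.length() + " chars)");
        System.out.println(wrong.length() + " chars when misread as UTF-16:");
        for (char ch : wrong.toCharArray()) {
            System.out.printf("U+%04X%n", (int) ch);
        }
    }
}
```

No error is reported for the wrong charset; the reader simply produces different characters from the same bytes.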

Question: In what case would an InputStreamReader fail to work where a DataInputStream would work?

This should be quite clear by now. Since both classes have completely different purposes, you shouldn't ask this question. An InputStreamReader does not convert bytes to integers like a DataInputStream and is not designed for that purpose.

In the tutorial code, I am quite sure that you could omit the DataInputStream:

BufferedReader reader = new BufferedReader( new InputStreamReader(fis));

However, DataInputStream provides the same methods as InputStream, which is why it's not wrong to wrap the FileInputStream inside it (although it's unnecessary).

Alphanumeric answered 26/8, 2014 at 17:54 Comment(1)
Thank you for your time and energy. Good stuff, helped a lot. - Flammable
M
2

Your professor's example is very misleading.

The API documentation says that a DataInputStream lets an application "read primitive Java data types from an underlying input stream in a machine-independent way". Back in Java 1.0 this was much more important, as text-based data formats like HTML, XML, and JSON didn't exist or were in their infancy. The big problem was the big-endian / little-endian issue with ints, longs, floats, etc. Communicating between a big-endian computer and a little-endian computer in Java using raw sockets WOULD need something like this.
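
As a rough illustration of the byte-order point: DataOutputStream always writes multi-byte values high byte first (big-endian), whatever the hardware does natively. The sketch below compares that with a little-endian layout produced via ByteBuffer. BigEndianDemo, writeInt and hex are names made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of any library.
class BigEndianDemo {

    // Serializes one int with DataOutputStream (always big-endian).
    static byte[] writeInt(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Formats bytes as space-separated two-digit hex values.
    static String hex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] viaStream = writeInt(1);
        // The same value laid out little-endian, for comparison only.
        byte[] little = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN).putInt(1).array();

        System.out.println(hex(viaStream)); // 00 00 00 01
        System.out.println(hex(little));    // 01 00 00 00
    }
}
```

A DataInputStream on the receiving side expects the first layout, so both ends agree regardless of their native byte order.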

But this has nothing to do with reading text files. I'd advise using FileReader to connect to the BufferedReader, rather than what your professor has done.

Maisiemaison answered 26/8, 2014 at 17:36 Comment(0)
P
0

DataInputStream is for reading simple data from a binary file, e.g. ints, doubles, Strings written with writeUTF, etc. (It does not read arbitrary objects; that is what ObjectInputStream is for. Formats like ASN.1, audio, or images could be implemented on top of a DataInputStream, but that's off topic.) InputStreamReader is for converting from a byte stream to a character stream.

As far as the example code goes, this:

FileInputStream fis = ...;
BufferedReader reader = new BufferedReader(
    new InputStreamReader( 
        new DataInputStream(fis)
    )
);

and this:

FileInputStream fis = ...;
BufferedReader reader = new BufferedReader(
    new InputStreamReader(fis)
);

do exactly the same thing, except that the first one is slightly less efficient.
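
If you want to convince yourself, a small sketch along these lines reads the same bytes through both chains and compares the results. It uses an in-memory ByteArrayInputStream in place of the FileInputStream, and WrapperEquivalenceDemo and readAll are names invented for this example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class WrapperEquivalenceDemo {

    // Reads every line from the stream as UTF-8 text.
    static String readAll(InputStream source) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(source, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] text = "first line\nsecond line\n".getBytes(StandardCharsets.UTF_8);

        String plain = readAll(new ByteArrayInputStream(text));
        String wrapped = readAll(new DataInputStream(new ByteArrayInputStream(text)));

        System.out.println(plain.equals(wrapped)); // true
    }
}
```

The InputStreamReader only ever calls the plain read methods, which DataInputStream passes straight through to the wrapped stream.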

Pedraza answered 26/8, 2014 at 17:38 Comment(0)
S
0

DataInputStream has additional methods to read binary data, for example to read four bytes as a 32-bit integer. It's still an InputStream.

An InputStreamReader takes an InputStream, which is a stream of bytes, and translates it into a character stream (Reader) using an encoding. It can be used for text files.

Combining both does not have any effect because DataInputStream does not change the behaviour of an InputStream.

Supen answered 26/8, 2014 at 17:55 Comment(0)
F
0

After having read these wonderful responses I wanted to take that knowledge and cement it. Here is a mini program demonstrating how DataInputStream & DataOutputStream work. One of the biggest takeaways for me was: with these streams you have to know the type and order of the data that is being read. So here's a complete piece of code that uses a DataOutputStream to save object instance fields to a file, then those fields are read from the file, the objects are remade, then printed.

package StackoverflowQuestion25511536;

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
* Earthlings writes a collection of Earthling Peeps to
* a file using DataOutputStream, then reads the collection
* back using DataInputStream.
* 
* The intent of this class is to demonstrate that
* DataInputStream & DataOutputStream require knowledge
* of the data (types and order) that is written to a file so 
* that it can be meaningfully interpreted when read.
* 
* Using object serialization and ObjectInputStream
* and ObjectOutputStream should be considered.
* 
* Detection of end of file (EOF)
* is determined using the technique taught in "Introduction to Java 
* Programming", 9th ed, by Y. Daniel Liang.  
* 
* @author Ross Studtman
*/
public class Earthlings {   

List<Peeps> peepCollection = new ArrayList<Peeps>();

public static void main(String[] args){
    new Earthlings().run(); 
}

public void run(){
    makePeeps();
    writeAndRead();
}

public void makePeeps(){    
    peepCollection.add(new Peeps("Ross", 45, 6.9));
    peepCollection.add(new Peeps("Lebowski", 42, 7.8));
    peepCollection.add(new Peeps("Whedon", 50, 8.8));       
}

// When end of file reached an EOFException is thrown.
public void writeAndRead(){

    DataOutputStream output = null;
    DataInputStream input = null;

    try{
        // Create DataOutputStream
        output = new DataOutputStream(new BufferedOutputStream(new FileOutputStream("peeps.oddPeople")));

        // Iterate over collection
        for(Peeps peep : peepCollection){

            // Assign instance fields to output types.
            output.writeUTF(peep.getName());
            output.writeInt(peep.getAge());
            output.writeDouble(peep.getOddness());
        }

        // flush buffer to ensure everything is written.
        output.flush();

        // Close output
        output.close();

        // Create DataInputStream
        File theSavedFile = new File("peeps.oddPeople");            
        input = new DataInputStream(new BufferedInputStream(new FileInputStream(theSavedFile)));

        // How many bytes are in this file? Used in for-loop as upper iteration limit.
        long bytes = theSavedFile.length();

        // Reconstitute objects & print             
        for(long counter = 0 ; counter < bytes; counter++ ){  // EOFException thrown before 'counter' ever equals 'bytes'.

            String name = input.readUTF();
            int age = input.readInt();
            double oddity = input.readDouble();

            // Create and print new Peep object.
            System.out.println(new Peeps(name, age, oddity));
        }


    }catch(EOFException e){
        System.out.println("All data read from file.");         
    }catch(IOException e){
        e.printStackTrace();
    }finally{
        if(input != null){
            try {
                input.close();
            } catch (IOException e) {
                // TODO Auto-generated catch block
                e.printStackTrace();
            }
        }           
    }
}   

/**
 * Simple class to demonstrate with.
 */
class Peeps{        
    // Simple Peep info
    private String name;
    private int age;
    private double oddness;

    // Constructor
    public Peeps(String name, int age, double oddness) {
        super();
        this.name = name;
        this.age = age;
        this.oddness = oddness;
    }

    // Getters
    public String getName() { return name;}
    public int getAge() { return age;}
    public double getOddness() { return oddness;}

    @Override
    public String toString() {
        return "Peeps [name=" + name + ", age=" + age + ", oddness=" + oddness + "]";
    }           
}

}
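
A possible alternative to catching EOFException, which is not what the program above does: write the number of records first, so the reader knows how many to expect. The sketch below only shows the idea with names; CountPrefixDemo, writeNames and readNames are hypothetical, and it uses an in-memory buffer rather than a file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of any library.
class CountPrefixDemo {

    // Writes the record count, then each record.
    static byte[] writeNames(List<String> names) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(names.size());
            for (String name : names) {
                out.writeUTF(name);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the count first, then exactly that many records.
    static List<String> readNames(byte[] data) {
        List<String> names = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                names.add(in.readUTF());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        byte[] data = writeNames(Arrays.asList("Ross", "Lebowski", "Whedon"));
        System.out.println(readNames(data)); // [Ross, Lebowski, Whedon]
    }
}
```

With a count prefix, an EOFException only shows up when the file really is truncated, so it can be treated as an error instead of as the normal end of the loop.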

Flammable answered 27/8, 2014 at 17:5 Comment(0)
J
-1

InputStreamReader is used if you want to read character-based streams, such as standard input or a properties file. DataInputStream is for reading binary streams, such as from a socket, in a machine-independent way. Prefer DataInputStream over InputStreamReader if you want speed and small storage size over readability: storing data in binary form is usually quicker and takes less space than storing it in a human-readable format. InputStreamReader should be used if you are parsing a human-readable format, such as XML or HTTP.
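
A quick way to see the size difference, as a sketch with made-up names (BinaryVsTextSize, binarySize, textSize): an int always takes 4 bytes through writeInt, while its decimal text form takes one byte per digit in UTF-8. The "less space" claim therefore depends on the values; small numbers are actually shorter as text.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
class BinaryVsTextSize {

    // Number of bytes DataOutputStream.writeInt produces for the value.
    static int binarySize(int value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Number of bytes the decimal text form takes in UTF-8.
    static int textSize(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(binarySize(1234567890)); // 4
        System.out.println(textSize(1234567890));   // 10
        System.out.println(binarySize(7));          // 4
        System.out.println(textSize(7));            // 1
    }
}
```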

Josuejosy answered 26/8, 2014 at 17:37 Comment(0)
