Why doesn't JTextComponent.setText(String) normalize line endings?

It has recently come to my attention that Java text components use line feed characters (LF, \n, 0x0A) to represent and interpret line breaks internally. This came as quite a surprise to me and calls into question my assumption that using System.getProperty("line.separator") everywhere is good practice.

It would appear that whenever you are dealing with a text component you should be very careful when using that property, since if you use JTextComponent.setText(String) you might end up with a component that contains invisible line-break characters (CRs, for example). This might not seem that important, unless the content of the text component can be saved to a file. If you save the text to a file and then re-open it using the methods that all text components provide, the hidden characters suddenly materialize as visible line breaks in the component. The reason seems to be that the JTextComponent.read(...) method does the normalization.

So why doesn't JTextComponent.setText(String) normalize line endings? Or any other method that allows text to be modified within a text component, for that matter? Is using System.getProperty("line.separator") a good practice when dealing with text components? Is it a good practice at all?

Some code to put this question into perspective:

import java.awt.GridBagConstraints;
import java.awt.GridBagLayout;
import java.awt.Insets;
import java.awt.event.ActionEvent;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UnsupportedEncodingException;
import java.io.Writer;
import javax.swing.AbstractAction;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JOptionPane;
import javax.swing.JScrollPane;
import javax.swing.JTextArea;
import javax.swing.SwingUtilities;

public class TextAreaTest extends JFrame {

    private JTextArea jtaInput;
    private JScrollPane jscpInput;
    private JButton jbSaveAndReopen;

    public TextAreaTest() {
        super();
        setDefaultCloseOperation(EXIT_ON_CLOSE);
        setTitle("Text Area Test");
        GridBagLayout layout = new GridBagLayout();
        setLayout(layout);        

        jtaInput = new JTextArea();
        jtaInput.setText("Some text followed by a windows newline\r\n"
                + "and some more text.");
        jscpInput = new JScrollPane(jtaInput);
        GridBagConstraints constraints = new GridBagConstraints();
        constraints.gridx = 0; constraints.gridy = 0;
        constraints.gridwidth = 2;
        constraints.weightx = 1.0; constraints.weighty = 1.0;
        constraints.fill = GridBagConstraints.BOTH;
        add(jscpInput, constraints);

        jbSaveAndReopen = new JButton(new SaveAndReopenAction());
        constraints = new GridBagConstraints();
        constraints.gridx = 1; constraints.gridy = 1;
        constraints.anchor = GridBagConstraints.EAST;
        constraints.insets = new Insets(5, 0, 2, 2);
        add(jbSaveAndReopen, constraints);

        pack();
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(new Runnable() {

            public void run() {
                TextAreaTest tat = new TextAreaTest();
                tat.setVisible(true);
            }
        });
    }

    private class SaveAndReopenAction extends AbstractAction {

        private File file = new File("text-area-test.txt");

        public SaveAndReopenAction() {
            super("Save and Re-open");
        }

        private void saveToFile() 
                throws UnsupportedEncodingException, FileNotFoundException,
                IOException {

            Writer writer = null;
            try {
                writer = new OutputStreamWriter(
                        new FileOutputStream(file), "UTF-8");
                TextAreaTest.this.jtaInput.write(writer);
            } finally {
                if (writer != null) {
                    try {
                        writer.close();
                    } catch (IOException ex) {
                    }
                }
            }
        }

        private void openFile() 
                throws UnsupportedEncodingException, IOException {
            Reader reader = null;
            try {
                reader = new InputStreamReader(
                        new FileInputStream(file), "UTF-8");
                TextAreaTest.this.jtaInput.read(reader, file);
            } finally {
                if (reader != null) {
                    try {
                        reader.close();
                    } catch (IOException ex) {
                    }
                }
            }
        }

        public void actionPerformed(ActionEvent e) {
            Throwable exc = null;
            try {
                saveToFile();
                openFile();
            } catch (UnsupportedEncodingException ex) {
                exc = ex;
            } catch (FileNotFoundException ex) {
                exc = ex;
            } catch (IOException ex) {
                exc = ex;
            }
            if (exc != null) {
                JOptionPane.showConfirmDialog(
                        TextAreaTest.this, exc.getMessage(), "An error occurred",
                        JOptionPane.DEFAULT_OPTION, JOptionPane.ERROR_MESSAGE);
            }
        }        
    }
}

An example of what this program saves on my Windows machine after adding a new line of text (why the single CR? o_O):

[Image: screenshot of the saved file]

Edit01

I ran/debugged this from within the NetBeans IDE, which uses JDK 1.7u15 64-bit (C:\Program Files\Java\jdk1.7.0_15) on Windows 7.

Kramer answered 26/7, 2013 at 13:26 Comment(7)
Excellent question, with code and a telling & small image. :) Sorry, don't know the answer. :( - Mechanist
This is what I got on Windows 8: i.sstatic.net/ePJxJ.png - Commutate
Hmm.. Perhaps this is JDK/JRE version specific bug? - Kramer
I run Java on JDK 7u25 64-bit. - Commutate
@Eng.Fouad, Ah.. What about if you add a new line of text (Enter + "Some new text") and then save? - Kramer
@Kramer i.sstatic.net/QMHU0.png - Commutate
There was another answer posted by camickr, but it got deleted. It contained a link to some interesting reading on this matter (his blog?). Unfortunately I cannot find this webpage anymore. If anyone remembers it, please post a link to it here in the comments. - Kramer

First of all, the real answer is that this is how the designers thought the design should work. You'd really need to ask them to get the real reason(s).

Having said that:

So why doesn't JTextComponent.setText(String) normalize line endings?

I think that the most likely reasons are:

  • It would be unexpected behaviour. Most programmers would expect (see note 1) a 'get' on a text field to return the same string value that was 'set' ... or that the user entered.

  • If text fields did normalize, then the programmer would have great difficulty preserving the original text's line endings in cases where this was desirable.

  • The designers might have wanted to change their minds at some point (cf. the reported behaviour of the read and write methods) but were unable to for reasons of compatibility.

Anyway, if you need normalization, there's nothing stopping your code from doing this on the string before passing it to the setter, or on the value retrieved by the getter.
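
A minimal sketch of normalizing in your own code before calling setText. The class LineEndings and the helper toLf are hypothetical names chosen for illustration; they are not part of Swing.

```java
public class LineEndings {

    /**
     * Hypothetical helper: converts CRLF and lone CR line breaks to LF.
     */
    public static String toLf(String text) {
        // Replace CRLF first so that it does not turn into two line feeds.
        return text.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String raw = "Some text followed by a windows newline\r\n"
                + "and some more text.";
        String normalized = toLf(raw);
        System.out.println(normalized.contains("\r")); // prints false
        // In the GUI code one would then call: jtaInput.setText(normalized);
    }
}
```

The same helper could be applied to the result of getText() if the text may have been set elsewhere without normalization.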

Or any other method that allows text to be modified within a text component for that matter?

It is reported (see comments) that read and/or write do normalization.
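
One way to check this without a GUI is to drive the editor kit directly. The sketch below assumes that JTextArea.read and JTextArea.write delegate to DefaultEditorKit, which is my understanding of the default setup rather than something this thread confirms. The class and method names are made up for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.swing.text.DefaultEditorKit;
import javax.swing.text.PlainDocument;

public class EditorKitEolDemo {

    /** Reads text through DefaultEditorKit and returns the in-memory content. */
    public static String readIntoDocument(PlainDocument doc, String text)
            throws Exception {
        new DefaultEditorKit().read(new StringReader(text), doc, 0);
        return doc.getText(0, doc.getLength());
    }

    /** Writes the whole document back out through DefaultEditorKit. */
    public static String writeFromDocument(PlainDocument doc) throws Exception {
        StringWriter out = new StringWriter();
        new DefaultEditorKit().write(out, doc, 0, doc.getLength());
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Case 1: text loaded through read(); inspect what ends up in memory,
        // which end-of-line string the kit recorded, and what write() emits.
        PlainDocument loaded = new PlainDocument();
        String inMemory = readIntoDocument(loaded, "first\r\nsecond");
        System.out.println("CR in memory after read: " + inMemory.contains("\r"));
        System.out.println("recorded CRLF: " + "\r\n".equals(
                loaded.getProperty(DefaultEditorKit.EndOfLineStringProperty)));
        System.out.println("round trip equal: "
                + writeFromDocument(loaded).equals("first\r\nsecond"));

        // Case 2: text inserted directly, roughly what setText does.
        // Nothing is normalized here, so the CR stays in the document.
        PlainDocument direct = new PlainDocument();
        direct.insertString(0, "first\r\nsecond", null);
        System.out.println("CR in memory after insert: "
                + direct.getText(0, direct.getLength()).contains("\r"));
    }
}
```

If the second case is written out on a platform whose line separator is CRLF, I would expect each \n to be expanded while the stray \r is left alone, which could account for the extra CR in the saved file.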

Is using System.getProperty("line.separator") a good practice when dealing with text components? Is it a good practice at all?

It depends on the context. If you know you are reading and writing files to be processed on "this" platform, it's probably a good idea. If the file is intended to be read on a different platform (with a different line separator), then normalizing to match the current machine's convention may be a bad idea.
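
As a rough illustration of that choice, the sketch below keeps text as LF in memory and picks the separator only at the point of writing. PlatformEol and withSeparator are hypothetical names, not an existing API.

```java
public class PlatformEol {

    /** Hypothetical helper: converts LF-only text to the given separator. */
    public static String withSeparator(String lfText, String separator) {
        return lfText.replace("\n", separator);
    }

    public static void main(String[] args) {
        String text = "line one\nline two";

        // For files consumed on this machine: use the platform separator.
        // System.lineSeparator() (Java 7+) returns the initial value of the
        // "line.separator" system property.
        String forThisPlatform = withSeparator(text, System.lineSeparator());
        System.out.println(forThisPlatform.length() >= text.length());

        // For files destined for another platform: name the separator explicitly.
        String forWindows = withSeparator(text, "\r\n");
        System.out.println(forWindows.equals("line one\r\nline two")); // prints true
    }
}
```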


1 - The fact that other methods like read and write may behave differently doesn't affect this. They are not "getters" and "setters". I'm talking about how people expect "getters" and "setters" to behave ... not anything else. Besides, people shouldn't expect everything to behave the same way, unless it is specified that they do. But obviously, part of the problem here is that the spec ... the javadocs ... is silent on these issues.

The other possibility is that the normalization behaviour that @predi reports is actually happening in the Reader / Writer objects ...

Slily answered 26/7, 2013 at 14:13 Comment(5)
But it doesn't seem consistent with what JTextComponent.read(...) does. Why does that method appear to normalize newlines? Should not that also be considered "unexpected behavior"? - Kramer
Hey, these are only my theories. Like I said, if you want the real explanation, ask the designers! - Slily
Theories should be challenged. They serve no purpose otherwise... :P - Kramer
Thank you for giving purpose to my theories :-) - Slily
I believe this is the most appropriate answer. Thanks for taking the time to write this up and later even responding to my nagging... - Kramer

Using the system line separator is questionable. I would only use it to write text files in a platform-specific format.

When reading, I always simply throw away any '\r' (CR), effectively converting Windows/Mac/Unix line endings down to Unix-style linefeeds. Internally I would never use anything other than plain '\n' (LF) to indicate linefeeds - anything else is a waste of memory and only makes processing text more painful.
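
A small sketch of that reading strategy, with made-up names (CrStrippingRead, readWithoutCr). It drops every '\r' as the text is read:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

public class CrStrippingRead {

    /** Reads everything from the reader, discarding every carriage return. */
    public static String readWithoutCr(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[4096];
        int n;
        while ((n = reader.read(buffer)) != -1) {
            for (int i = 0; i < n; i++) {
                if (buffer[i] != '\r') {
                    sb.append(buffer[i]);
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = readWithoutCr(new StringReader("one\r\ntwo\nthree"));
        System.out.println(text.equals("one\ntwo\nthree")); // prints true
    }
}
```

Note that dropping CRs outright removes the line breaks from text that uses a lone CR as its separator (classic Mac OS files); mapping a lone CR to LF would be needed to keep those.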

Melentha answered 26/7, 2013 at 14:27 Comment(2)
I disagree. Java text components have their set of methods for reading/writing text where newlines are obviously handled in a specific way. Bypassing that and rolling on your own can't be a good idea. - Kramer
Counterargument: You won't make use of Swing components for server based/headless text processing - implementing the conversion into a limited scope API like Swing presents a violation of responsibility, it clearly doesn't belong there - a GUI component shouldn't concern itself with platform specific file encoding. If I need to roll a different implementation for some cases anyway, I prefer to use the implementation that works everywhere the same. Not everything in the JDK is well designed - flagship examples are Date, Vector, StringBuffer... just because it's there doesn't mean it's good. - Melentha
