Illegal Character when trying to compile java code

10

34

I have a program that allows a user to type java code into a rich text box and then compile it using the java compiler. Whenever I try to compile the code that I have written I get an error that says that I have an illegal character at the beginning of my code that is not there. This is the error the compiler is giving me:

C:\Users\Travis Michael>"\Program Files\Java\jdk1.6.0_17\bin\javac" Test.java
Test.java:1: illegal character: \187
public class Test
 ^
Test.java:1: illegal character: \191
public class Test
  ^
2 errors
Harilda answered 2/1, 2010 at 21:33 Comment(8)
Can you post the character codes for those characters? Maybe it's a Unicode BOM that got mangled?Schutt
How would you get the character codes?Harilda
I'm sorry but what does BOM mean?Harilda
byte order mark en.wikipedia.org/wiki/Byte_order_mark "The UTF-8 representation of the BOM is the byte sequence EF BB BF"Quagmire
Dear Microsoft, when will you stop making your tools default to dropping bogus pseudo-BOMs into the start of UTF-8 files? It's getting beyond a joke now.Oread
A UTF8 encoded text file requires a BOM. There's no way for the reader to know it is UTF8 otherwise.Marrufo
@Hans Passant: No it doesn't. See the specification, on page 30.Onomastics
Yup, that's the opt-out clause that companies use to make sure the files their software generates can only reliably be read by their own software. Unicode was created by software vendors.Marrufo
M
24

The BOM is generated on the .NET side when the text is written with an encoding that emits one, such as System.Text.Encoding.UTF8 passed to File.WriteAllText() or StreamWriter. You can tell the Java compiler the source encoding with its -encoding command line option, although that option alone may not make javac skip a UTF-8 BOM.

The path of least resistance is to avoid generating the BOM. Do so by specifying System.Text.Encoding.Default, which writes the file using the default code page of your operating system and doesn't write a BOM. Use the File.WriteAllText(String, String, Encoding) overload or the StreamWriter(String, Boolean, Encoding) constructor.

Just make sure that the file you create doesn't get compiled by a machine in another corner of the world. It will produce mojibake.
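
To see where the \187 and \191 in the error message come from, here is a minimal sketch that decodes the three UTF-8 BOM bytes (EF BB BF) with a single-byte code page. ISO-8859-1 is used as a stand-in for the Windows default code page, which is an assumption; both map these three bytes to the same characters. The class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Sketch: decode the UTF-8 BOM bytes as if they were single-byte text.
// ISO-8859-1 stands in (assumption) for the platform default code page.
class BomAsLatin1 {
    // Returns the code points produced by misreading EF BB BF byte by byte.
    static int[] decodeBom() {
        byte[] bom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};
        String decoded = new String(bom, StandardCharsets.ISO_8859_1);
        return decoded.chars().toArray();
    }

    public static void main(String[] args) {
        // Prints one decimal code point per line.
        for (int codePoint : decodeBom()) {
            System.out.println(codePoint);
        }
    }
}
```

The decoded values are 239, 187 and 191. The last two match the codes javac reports; 239 is a letter (ï), which is presumably why it is not flagged as illegal.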

Marrufo answered 2/1, 2010 at 22:11 Comment(2)
Thank you so much. It finally worked! Maybe one day Microsoft will get rid of BOM and all the other bugs that they have!Harilda
Be careful throwing that bug bomb. That a relatively new chunk of software like the Java compiler cannot auto detect UTF8 is pretty stunning. This appears to be a problem in Vietnam too: vietunicode.sourceforge.net/howto/java/encoding.htmlMarrufo
S
20

That's a byte order mark, as everyone says.

javac does not understand the BOM, not even when you try something like

javac -encoding UTF8 Test.java

You need to strip the BOM or convert your source file to another encoding. Notepad++ can convert a single file's encoding; I'm not aware of a batch utility on the Windows platform for this.

The Java compiler assumes the file is in your platform default encoding, so if you save the file in that encoding, you don't have to specify -encoding at all.
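
If no ready-made batch tool is at hand, stripping the BOM takes only a few lines. The following is a rough sketch, not a polished utility: the class name StripUtf8Bom is hypothetical, and it assumes each command line argument is the path of a file that may be rewritten in place.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Sketch: remove a leading UTF-8 BOM (EF BB BF) from files, in place.
class StripUtf8Bom {
    // Returns the content without the BOM, or the same array if there is none.
    static byte[] strip(byte[] content) {
        boolean hasBom = content.length >= 3
                && (content[0] & 0xFF) == 0xEF
                && (content[1] & 0xFF) == 0xBB
                && (content[2] & 0xFF) == 0xBF;
        return hasBom ? Arrays.copyOfRange(content, 3, content.length) : content;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: every argument is a file path that is safe to overwrite.
        for (String arg : args) {
            Path path = Paths.get(arg);
            byte[] original = Files.readAllBytes(path);
            byte[] stripped = strip(original);
            if (stripped != original) {
                Files.write(path, stripped); // only rewrite files that had a BOM
            }
        }
    }
}
```

Files without a BOM are left untouched, so the tool can be run over a whole set of source files repeatedly.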

Stacy answered 2/1, 2010 at 22:30 Comment(1)
For me, javac -encoding UTF16 Test.java works.Algicide
B
7
  1. If using an IDE, specify the Java file encoding (via the properties panel).
  2. If NOT using an IDE, use an advanced text editor (I can recommend Notepad++) and set the encoding to "UTF-8 without BOM", or "ANSI", if that suits you.
Blamed answered 2/1, 2010 at 21:43 Comment(0)
G
5

In this case, do the following steps 1-7.

In Android Studio

1. Menu -> Edit -> Select All
2. Menu -> Edit -> Cut
3. Open a new Notepad.exe window

In Notepad

4. Menu -> Edit -> Paste
5. Menu -> Edit -> Select All
6. Menu -> Edit -> Copy 

Back In Android Studio

7. Menu -> Edit -> Paste
Greaten answered 21/1, 2018 at 17:16 Comment(0)
J
3

http://en.wikipedia.org/wiki/Byte_order_mark

The byte order mark (BOM) is a Unicode character used to signal the endianness (byte order) of a text file or stream. Its code point is U+FEFF. BOM use is optional, and, if used, should appear at the start of the text stream. Beyond its specific use as a byte-order indicator, the BOM character may also indicate which of the several Unicode representations the text is encoded in.

The BOM is a funky-looking character that you sometimes find at the start of Unicode streams, giving a clue what the encoding is. It's usually handled invisibly by the string-handling stuff in Java, so you must have confused it somehow, but without seeing your code, it's hard to see where.

You might be able to fix it trivially by manually stripping the BOM from the string before feeding it to javac. It probably qualifies as whitespace, so try calling trim() on the input String, and feeding the output of that to javac.
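
One caveat for anyone doing this in Java: String.trim() only removes characters at or below U+0020, so it leaves U+FEFF in place. A small sketch of an explicit check instead (the helper name stripLeadingBom is made up for illustration):

```java
// Sketch: remove a leading U+FEFF from source text before compiling it.
class StripBomChar {
    // Drops the first character only if it is the BOM character.
    static String stripLeadingBom(String source) {
        return source.startsWith("\uFEFF") ? source.substring(1) : source;
    }

    public static void main(String[] args) {
        String source = "\uFEFFpublic class Test {}";
        // trim() does not treat U+FEFF as whitespace, so the length is unchanged.
        System.out.println(source.trim().length() == source.length());
        // The explicit check removes exactly one character.
        System.out.println(source.length() - stripLeadingBom(source).length());
    }
}
```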

Jitter answered 2/1, 2010 at 21:42 Comment(8)
I tried to trim it and it did not work. BTW, I am using VB.NETHarilda
Regarding "giving a clue what the encoding is" I just want to point out: Although the BOM can give a hint as to the encoding it is not intended to be used for this purpose. As the name suggests it tells you only the byte order. In fact in UTF-16 and UTF-32 (little endian) there is an ambiguity that means that the BOM cannot be used to tell them apart reliably. The BOM is not a replacement for correctly handling character encoding issues.Schutt
Would saving the file with a different encoding help?Harilda
@Mark: Good point, well made - I oversimplified in haste. @muckdog: Sorry, can't help you there, vb.net isn't my thing.Jitter
How might I be able to get rid of the BOM?Harilda
muckdog12: it's just a character like any other character. You can remove it using any of the string operations that you would normally use to remove characters.Schutt
I can't seem to find the encoding.defaultHarilda
Thanks, it helped me with the same issue I was having.Durham
M
2

That's a problem related to the BOM (Byte Order Mark) character. The byte order mark is a Unicode character that defines a text file's byte order and comes at the start of the file. Eclipse doesn't allow this character at the start of your file, so you must delete it. For this purpose, use a text editor like Notepad++ and save the file with the encoding "UTF-8 without BOM". That should remove the problem.

I copied and pasted some content from a website into the Notepad++ editor,
and it showed "LS" on a black background. I deleted the "LS" markers and
copied the same content from Notepad++ into the Java file, and it works fine.
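
Invisible characters like these can also be located programmatically. The sketch below simply lists every non-ASCII character in a piece of source text with its index and code point; it assumes the text is already in memory as a String, and the class name NonAsciiFinder is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: report every non-ASCII character in source text.
class NonAsciiFinder {
    // Returns one entry per character above U+007F, e.g. "index 1: U+201C".
    static List<String> find(String source) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < source.length(); ) {
            int codePoint = source.codePointAt(i);
            if (codePoint > 0x7F) {
                hits.add(String.format("index %d: U+%04X", i, codePoint));
            }
            i += Character.charCount(codePoint); // step over surrogate pairs
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample text with curly quotes and a line separator (U+2028).
        String pasted = "String s = \u201Chi\u201D;\u2028";
        find(pasted).forEach(System.out::println);
    }
}
```

Legitimate non-ASCII characters inside string literals or comments will show up too, so the output is a list of candidates to inspect rather than a list of errors.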
Misanthrope answered 15/3, 2016 at 14:10 Comment(0)
K
2

I solved this by right-clicking in my TextEdit program file, selecting [Substitutions], and unchecking Smart Quotes.

Keelin answered 11/11, 2016 at 18:53 Comment(0)
D
1

Instead of getting Notepad++, you can simply open the file with WordPad and then Save As - Plain Text Document.

Dowager answered 6/9, 2016 at 15:29 Comment(0)
W
1

I had the same problem with a file I generated using the command echo "" > Main.java in Windows PowerShell. I searched for the problem and it seemed to have something to do with encoding. I checked the encoding of the file using file -i Main.java and the result was text/plain; charset=utf-16le.

Later I deleted the file and recreated it in Git Bash using touch Main.java, and this time the file compiled successfully. I checked the file encoding using the file -i command and the result was Main.java: text/x-c; charset=us-ascii.

Next I searched the internet and found that to create an empty file in PowerShell we can use the cmdlet New-Item. I created the file using New-Item Main.java, and this time the encoding was Main.java: text/x-c; charset=us-ascii and the file compiled successfully.
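
Where the file command is not available, the BOM can be checked by looking at the first bytes of the file. This is a rough sketch under the assumption that a BOM is present at all; a file without one reports "none" and says nothing about its actual encoding. The class name BomSniffer is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Sketch: name the BOM (if any) at the start of a file.
class BomSniffer {
    // Returns a label for the BOM at the start of the data, or "none".
    static String detect(byte[] data) {
        // UTF-32 is checked first because its LE BOM starts with the UTF-16LE BOM.
        // FF FE 00 00 could also be UTF-16LE followed by U+0000, so this is a guess.
        if (startsWith(data, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(data, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";
        return "none";
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Each argument is treated as a file path to inspect.
        for (String arg : args) {
            System.out.println(arg + ": " + detect(Files.readAllBytes(Paths.get(arg))));
        }
    }
}
```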

Walther answered 10/4, 2021 at 6:1 Comment(0)
B
0

I faced this issue too, as I use Notepad++ to code. It is very convenient to type the code in Notepad++. However, when compiling I got the error "error: illegal character: '\u00bb'". Solution: start writing the code in the older Notepad (which is on your PC by default) and save it. Later modifications can be done using Notepad++. It works!

Burnoose answered 3/7, 2016 at 5:15 Comment(0)
