For layout work we have the famous "Lorem ipsum" text to test how things look.
What I am looking for is a set of files containing text in several different encodings that I can use in my JUnit tests for methods that deal with character encoding when reading text files.
Example: having an ISO 8859-1 encoded test-file and a Windows-1252 encoded test-file. The Windows-1252 file has to trigger the differences in the range 0x80 to 0x9F (hexadecimal). In other words, it must contain at least one character from this range to distinguish it from ISO 8859-1.
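To make the 0x80 to 0x9F requirement concrete, here is a minimal Java sketch. It builds a Windows-1252 byte sequence containing the euro sign (encoded as byte 0x80 in Windows-1252) and checks for a byte in that range. The class and method names are hypothetical, and it assumes the JDK in use ships the `windows-1252` charset, which is common but not guaranteed by the Java specification.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSample {
    // Assumption: this JDK provides windows-1252 (only a few charsets are guaranteed).
    static final Charset CP1252 = Charset.forName("windows-1252");

    /** Sample text ending in the euro sign, which Windows-1252 encodes as byte 0x80. */
    static byte[] windows1252Sample() {
        return "Price: 5 \u20AC".getBytes(CP1252);
    }

    /** True if the data contains at least one byte in the range 0x80 to 0x9F. */
    static boolean hasByteIn80To9F(byte[] data) {
        for (byte b : data) {
            int v = b & 0xFF; // treat the byte as unsigned
            if (v >= 0x80 && v <= 0x9F) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        byte[] bytes = windows1252Sample();
        System.out.println("has byte in 0x80..0x9F: " + hasByteIn80To9F(bytes));

        // Decoding the same bytes as ISO 8859-1 turns 0x80 into the control
        // character U+0080 instead of the euro sign U+20AC.
        String wrong = new String(bytes, StandardCharsets.ISO_8859_1);
        String right = new String(bytes, CP1252);
        System.out.printf("ISO 8859-1 last char: U+%04X%n", (int) wrong.charAt(wrong.length() - 1));
        System.out.printf("Windows-1252 last char: U+%04X%n", (int) right.charAt(right.length() - 1));
    }
}
```

A JUnit test could feed `windows1252Sample()` to the method under test and assert that the euro sign survives the round trip.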
Maybe the best set of test-files is one where the test-file for each encoding contains each of its characters once. But maybe I am missing something - we all like this encoding stuff, right? :-)
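In case no ready-made set turns up, the "every character once" idea could be generated instead. This is only a rough sketch for single-byte encodings: it tries each byte value 0 to 255 and keeps the ones the charset's decoder accepts. The class name, method name, and file names are made up for illustration, and multi-byte encodings such as UTF-8 would need a different approach.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class AllBytesFixture {
    /** Returns every single byte value that the given charset can decode on its own. */
    static byte[] allDecodableBytes(Charset cs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int b = 0; b < 256; b++) {
            byte[] one = { (byte) b };
            try {
                // REPORT makes the decoder throw instead of substituting U+FFFD.
                cs.newDecoder()
                  .onMalformedInput(CodingErrorAction.REPORT)
                  .onUnmappableCharacter(CodingErrorAction.REPORT)
                  .decode(ByteBuffer.wrap(one));
                out.write(b);
            } catch (CharacterCodingException e) {
                // This byte has no character in the charset; leave it out.
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: windows-1252 is available in this JDK.
        Charset[] charsets = { StandardCharsets.ISO_8859_1, Charset.forName("windows-1252") };
        Path dir = Files.createTempDirectory("encoding-fixtures");
        for (Charset cs : charsets) {
            Path file = dir.resolve(cs.name() + ".txt");
            Files.write(file, allDecodableBytes(cs));
            System.out.println(file + ": " + Files.size(file) + " bytes");
        }
    }
}
```

Note that such files contain control characters too, so a test that reads them line by line may want to filter those out first.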
Is there such a set of test-files for character-encoding issues out there?