I am looking for a sample Unicode text file (UTF-8) that can be used to test various problems related to text encoding and decoding, including:
- low ASCII characters, such as the first 32 codes (the control characters U+0000 to U+001F)
- characters outside the BMP
- NFC-related (normalization) issues
- XML encoding/decoding issues
Mainly, I want to copy the text to the clipboard, paste it into an HTML textarea in the application, and then retrieve it from a page afterwards to check that it is unchanged.
This would make it possible to identify Unicode-related problems that could occur at the decoding, encoding, or even database level.
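
To make the request more concrete, here is a rough sketch in Python of the kind of content I have in mind. The function names (`build_sample`, `nfc_changes`) and the specific characters chosen are only my own assumptions for illustration, not an established test set, and a real sample would probably need many more cases.

```python
import unicodedata


def build_sample() -> str:
    """Build a small test string covering the categories listed above."""
    parts = [
        # Low ASCII: the 32 control characters U+0000..U+001F.
        "".join(chr(code) for code in range(0x20)),
        # Characters outside the BMP (code points above U+FFFF):
        # an emoji, a Gothic letter, and a CJK Extension B ideograph.
        "\U0001F600\U00010348\U0002000B",
        # NFC: precomposed U+00E9 followed by the decomposed
        # sequence "e" + U+0301 (combining acute accent).
        "\u00e9" + "e\u0301",
        # XML: markup characters and a CDATA terminator.
        "<tag attr=\"v\">& ' ]]></tag>",
    ]
    return "\n".join(parts)


def nfc_changes(text: str) -> bool:
    """Return True if NFC normalization would alter the text."""
    return unicodedata.normalize("NFC", text) != text
```

Writing `build_sample().encode("utf-8")` to a file opened in binary mode should give a UTF-8 sample file. If the text retrieved from the page differs from the original, comparing the two code point by code point should show which category was altered. For example, if `nfc_changes` returns `False` for the retrieved text, something along the way has normalized it.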