Details
-
Sub-task
-
Resolution: Fixed
-
Minor
-
4.4
-
None
-
None
-
5.0 Milestone1
Description
One of the more subtle sources of error is how non-ASCII characters are stored in files and interpreted by programs. A typical example is storing the character "ø" in a file on a Linux machine, where it is represented by two bytes in the default UTF-8 encoding, and then loading the same file on a Windows machine, where the two bytes are decoded as the two characters "Ã¸" instead of the original single "ø".
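To make the failure concrete, here is a minimal sketch that reproduces it in memory rather than across two machines. The class and method names are hypothetical, and ISO-8859-1 is used as a stand-in for a Windows default code page (it maps the two bytes in question to the same characters as windows-1252). The non-ASCII character is written as a Unicode escape so that the example does not depend on the encoding of its own source file.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Hypothetical helper: encode text with one charset and decode the
    // resulting bytes with another, as happens when a file is written on
    // one platform and read with a different default encoding on another.
    static String writeThenRead(String text, Charset writer, Charset reader) {
        byte[] bytes = text.getBytes(writer);
        return new String(bytes, reader);
    }

    public static void main(String[] args) {
        String original = "\u00f8"; // LATIN SMALL LETTER O WITH STROKE

        // UTF-8 needs two bytes (0xC3 0xB8) for this character.
        int byteCount = original.getBytes(StandardCharsets.UTF_8).length;

        // Decoding those bytes with a single-byte charset yields two characters.
        String misread = writeThenRead(original,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 bytes: " + byteCount);
        System.out.println("characters after misreading: " + misread.length());
        System.out.println("round trip with matching charsets intact: "
                + original.equals(writeThenRead(original,
                        StandardCharsets.UTF_8, StandardCharsets.UTF_8)));
    }
}
```

The mismatch only appears when the reading side falls back to a platform default, which is why it tends to go unnoticed until the same files are used on a second operating system.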
The only robust solution I have found so far is to explicitly enforce in the pom.xml that all files are encoded in ASCII, and then to appropriately encode all non-ASCII characters found in sources and test data.
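The description does not say which pom.xml setting or which escaping scheme is meant. Maven's `project.build.sourceEncoding` property is one plausible place for the former, and for Java sources and properties files the usual ASCII-safe representation is the backslash-u Unicode escape. Under those assumptions, a sketch of a helper (hypothetical names) that checks whether text is already pure ASCII and rewrites it if not could look like this:

```java
import java.nio.charset.StandardCharsets;

public class AsciiEscaper {

    // True if every character in the text can be encoded as US-ASCII.
    static boolean isPureAscii(String text) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(text);
    }

    // Replace each non-ASCII UTF-16 code unit with a backslash-u escape
    // followed by four hex digits. Characters outside the Basic
    // Multilingual Plane become two escapes (one per surrogate), which is
    // the form Java sources and properties files expect.
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 128) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "bl\u00f8d"; // contains one non-ASCII character
        System.out.println(isPureAscii(input));
        System.out.println(escapeNonAscii(input));
    }
}
```

A check like `isPureAscii` could be run over sources and test data as part of the build, so that a stray non-ASCII character fails early instead of surfacing later as a platform-specific test failure.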