Unicode characters in MATLAB source files
D

5

26

I'd like to use Unicode characters in comments in a MATLAB source file. This seems to work when I write the text; however, if I close the file and reload it, "unusual" characters have been turned into question marks. I guess MATLAB is saving the file as ASCII.

Is there any way to tell MATLAB to use UTF-8 instead?

Doubleedged answered 13/2, 2011 at 13:59 Comment(0)
T
7

How the MATLAB Process Uses Locale Settings shows how to set the encoding for different platforms. Use

feature('DefaultCharacterSet')

You can read more about this undocumented function here. See also this Matlab Central thread for other options.
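As a minimal sketch, calling the function with a single argument appears to return the character set currently in use, which is a quick way to confirm the "saved as ASCII" suspicion. Since feature is undocumented, treat the return value as an assumption that may vary between releases; the variable name enc is only illustrative.

```matlab
% Query the character set MATLAB currently uses by default.
% feature() is undocumented, so its behavior may differ between releases.
enc = feature('DefaultCharacterSet');

% Show the result, e.g. an ISO-8859-1 or windows-1252 style name on older setups.
fprintf('Current default character set: %s\n', enc);
```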

Tautog answered 13/2, 2011 at 16:36 Comment(4)
Thanks. The page you linked says that, on Mac, Matlab decides on an encoding based on the system's language settings, and ignores the LANG environment variable. I tried creating a startup.m file containing the command slCharacterEncoding('UTF-8'), but that didn't seem to help, apart from making Matlab hang in the "initializing" phase at startup. At any rate, even with slCharacterEncoding set to 'UTF-8', script files still seem to be encoded using ISO-8859-1. Any other ideas?Doubleedged
@LaC: Unfortunately I don't have any idea about how to set the encoding on startup. There seems to be room for improvement.Tautog
I'm accepting this answer, even though the problem remains unsolved, because it doesn't look like there's any solution.Doubleedged
slCharacterEncoding('UTF-8') is a Simulink (hence the prefix sl) function. That's why it hangs the system.Urease
E
20

According to http://www.mathworks.de/matlabcentral/newsreader/view_thread/238995

feature('DefaultCharacterSet', 'UTF8')

will change the encoding to UTF-8. You can put the line above in your startup.m file.
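A startup.m along these lines might look as follows. This is only a sketch: it assumes the undocumented feature call behaves as described above, and the variable names previous and current are illustrative.

```matlab
% startup.m (sketch): switch the default character set to UTF-8 at startup.
% feature() is undocumented; this assumes it accepts 'UTF8' as shown above.

% Remember what MATLAB was using before the change.
previous = feature('DefaultCharacterSet');

% Request UTF-8 as the default character set.
feature('DefaultCharacterSet', 'UTF8');

% Read the setting back to see whether the request took effect.
current = feature('DefaultCharacterSet');
fprintf('Default character set: %s -> %s\n', previous, current);
```

Reading the value back is worthwhile because, as other answers here report, some platforms seem to ignore this setting.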

Ecesis answered 4/10, 2011 at 9:9 Comment(1)
Note that slCharacterEncoding, suggested in the comment above, is a function that requires Simulink.Ecesis
G
2

Mac OSX only!

I found a solution that worked in my case, so I want to share it.

MathWorks advises here to use slCharacterEncoding(encoding) to change the encoding as desired, but on OS X this does not solve the issue, and neither does the feature('DefaultCharacterSet') call from the accepted answer. What got UTF-8 working for opening and saving .m files was the following thread on MATLAB Answers: https://www.mathworks.com/matlabcentral/answers/12422-macosx-encoding-problem

Matlab seems to ignore any value set with slCharacterEncoding(encoding) or feature('DefaultCharacterSet') and instead uses the region set in System Preferences -> Language & Region. Once you have checked which region is selected, you can define the actual encoding in the hidden configuration file at

 $matlabroot/bin/lcdata.xml

To open this directory, go to Applications, right-click Matlab, and select Show Package Contents, as in the screenshot (here in German):

[Screenshot: Package Contents]

For example, the default for German is ISO-8859-1. To adjust it, change the respective line in lcdata.xml from:

 <locale name="de_DE" encoding="ISO-8859-1" xpg_name="de_DE.ISO8859-1">

to:

 <locale name="de_DE" encoding="UTF-8" xpg_name="de_DE.UTF-8">

If the selected region is not present in lcdata.xml, this will not work.
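To check which encoding a locale is currently mapped to without opening the package by hand, something like the following read-only sketch could be run from the command window. It assumes lcdata.xml sits at the path given above and that entries use the attribute order shown in the example line; the variable names are illustrative. The pattern matches text anywhere in the file, so an entry inside an XML comment would match as well.

```matlab
% Locale name to look up; 'de_DE' is the example used above.
localeName = 'de_DE';

% Path given above; matlabroot returns the installation folder.
lcdataPath = fullfile(matlabroot, 'bin', 'lcdata.xml');

% Match <locale name="de_DE" encoding="..." and capture the encoding value.
pattern = ['<locale\s+name="' localeName '"\s+encoding="([^"]+)"'];

if exist(lcdataPath, 'file') == 2
    % Read the whole file as text (nothing is modified).
    xmlText = fileread(lcdataPath);
    tokens = regexp(xmlText, pattern, 'tokens', 'once');
    if isempty(tokens)
        fprintf('No <locale> entry for %s in %s\n', localeName, lcdataPath);
    else
        fprintf('%s is mapped to encoding %s\n', localeName, tokens{1});
    end
else
    fprintf('lcdata.xml not found at %s\n', lcdataPath);
end
```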

Hope this helps!

Geum answered 31/5, 2017 at 14:1 Comment(0)
O
2

The solution provided here worked for me on Windows with R2018a.

In case the link doesn't work: the idea is to use the file matlabroot/bin/lcdata.xml to configure an alias for the encoding name (some explanation can be found in the comments in that very file):

<codeset>
  <encoding name="UTF-8">
   <encoding_alias name="windows-1252" />
  </encoding>
</codeset>

Use your own value instead of windows-1252. The encoding currently in use can be obtained by running feature('locale').
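As a rough sketch, the alias block can be generated from the current setting instead of typing the name by hand. This assumes feature('locale') returns a struct with an encoding field, which is an undocumented detail that may not hold in every release; if the field is missing, the sketch falls back to the windows-1252 example value from above as a placeholder.

```matlab
% Ask MATLAB for its locale information (undocumented call).
loc = feature('locale');

% Assumption: the result is a struct with an 'encoding' field.
if isstruct(loc) && isfield(loc, 'encoding')
    currentEncoding = loc.encoding;
else
    % Placeholder only: the example value used above, not a detected one.
    currentEncoding = 'windows-1252';
end

% Build the <codeset> block that aliases the current encoding to UTF-8.
snippet = sprintf(['<codeset>\n' ...
    '  <encoding name="UTF-8">\n' ...
    '   <encoding_alias name="%s" />\n' ...
    '  </encoding>\n' ...
    '</codeset>\n'], currentEncoding);

% Print it so it can be pasted into lcdata.xml by hand.
fprintf('%s', snippet);
```

The sketch only prints the block; editing lcdata.xml itself is left as a manual step because the file usually needs administrator rights.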

Note, however, that if you use Unicode characters in help comments, neither the help browser nor the console window output recognizes them.

Override answered 12/4, 2018 at 15:57 Comment(0)
D
0

For Mac OS users, Jendker's solution really helps! Thanks a lot.

Here is a recap:

  1. Check the default language in Matlab by typing getenv('LANG') in the command window. Mine returned en_US.ISO8859-1.

  2. In the Applications directory, find Matlab and show its package contents. Go to bin, open lcdata.xml as an administrator, and locate the line with the corresponding xpg_name (in my case en_US.ISO8859-1). Change encoding on that line to UTF-8 and save the file.

  3. Restart Matlab, and it's all done!
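To confirm that a saved file really is UTF-8 after the restart, its raw bytes can be inspected. The sketch below is one possible check, not part of the original steps: it treats a file as UTF-8 if its bytes survive a decode and re-encode round trip unchanged. A temporary file written as explicit UTF-8 stands in for a real .m file here; point filename at a script saved from the editor instead to test your setup. All variable names are illustrative.

```matlab
% File to check; a temporary file stands in for a real .m file here.
filename = [tempname '.m'];

% Write a comment with non-ASCII text (u-umlaut, sharp s) as UTF-8 bytes.
sample = ['% gr' char(252) char(223) 'e' char(10)];
fid = fopen(filename, 'w');
fwrite(fid, unicode2native(sample, 'UTF-8'), 'uint8');
fclose(fid);

% Read the file back as raw bytes, without any character conversion.
fid = fopen(filename, 'r');
bytes = fread(fid, Inf, 'uint8=>uint8')';
fclose(fid);

% Valid UTF-8 should decode and re-encode to exactly the same bytes.
decoded = native2unicode(bytes, 'UTF-8');
isUtf8 = isequal(unicode2native(decoded, 'UTF-8'), bytes);

% If no byte is above 127, the special characters were probably lost on save.
hasNonAscii = any(bytes > 127);

fprintf('Round trip as UTF-8 succeeded: %d\n', isUtf8);
fprintf('File contains non-ASCII bytes: %d\n', hasNonAscii);

% Remove the temporary file.
delete(filename);
```

A file saved as ISO-8859-1 would be expected to fail the round trip, while a file whose characters were turned into question marks would contain no non-ASCII bytes at all.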

Deborahdeborath answered 6/4, 2018 at 18:5 Comment(0)
