How to display the 192 character symbol ( └ ) in Perl?
What you want is to be able to print Unicode, and the answer is in perldoc perluniintro.
You can use \x{nnnn}, where nnnn is the hexadecimal code point, or you can use \N{...} with the character name:
perl -E 'say "\x{2514}"; use charnames; say "\N{BOX DRAWINGS LIGHT UP AND RIGHT}"'
To use exactly these codes, your terminal must support Code Page 437, which contains the box-drawing characters. Alternatively, you can use the derived CP850, which has fewer box-drawing characters.
Such box-drawing characters also exist in Unicode, in the Box Drawing block (U+2500 to U+257F). The character you want is written in Perl as \N{U+2514}. More details are in perlunicode.
"\x{2514}" does it, too. This syntax is explained in perlop. – Rife
That looks like the Code page 437 encoding. Perl is probably just outputting the bytes that you give it, and your terminal is probably expecting UTF-8.
So you need to decode it to Unicode, then re-encode it in UTF-8.
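The decode-then-re-encode step can be sketched with the core Encode module. This is a minimal sketch, assuming the value 192 really is a CP437 byte and the terminal expects UTF-8; the helper name `cp437_to_utf8` is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: treat $bytes as CP437 and return the same text as UTF-8 bytes.
sub cp437_to_utf8 {
    my ($bytes) = @_;
    my $chars = decode('cp437', $bytes);   # bytes -> Perl characters
    return encode('UTF-8', $chars);        # characters -> UTF-8 bytes
}

# Assumption: 192 is a CP437 byte, as this answer guesses.
print cp437_to_utf8(chr(192)), "\n";
```

Because the result is already a UTF-8 byte string, it is printed to a handle without any encoding layer; adding one would encode it twice.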
EDIT: Correct encoding.
As usual, Jon Skeet nails it: the 192 code is in the "extended ASCII" range. I suggest you follow @Douglas Leeder's advice, but I'm not sure which encoding www.LookupTables.com is giving you; ISO-8859-1 thinks 192 maps to "À", and Mac OS Roman thinks it's "¿".
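To see how much the meaning of byte 192 depends on the assumed encoding, here is a small sketch that decodes the same byte three ways. The helper `code_point_for` is a hypothetical name, and the encoding names are the ones the Encode module is assumed to ship with.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: code point that byte value $byte maps to under $encoding.
sub code_point_for {
    my ($encoding, $byte) = @_;
    return ord(decode($encoding, chr($byte)));
}

# Emit characters as UTF-8 so the decoded symbols display on a UTF-8 terminal.
binmode(STDOUT, ':encoding(UTF-8)');

for my $encoding ('cp437', 'iso-8859-1', 'MacRoman') {
    my $cp = code_point_for($encoding, 192);
    printf "%-10s U+%04X %s\n", $encoding, $cp, chr($cp);
}
```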
Is there a solution that works on ALL characters?
The user says they wanted to use a Latin-1 extended charset character, so let's try an example from this block. If they wanted the Æ character, they would run:
print "\x{00C6}";
Output: �
TL;DR: Character Encoding Modes in Perl
So, wait, what just happened there? You'll notice that other ways of producing the character, such as chr(...), \N{U+...}, and even unpack(...), have the same issue. That's right: the problem isn't with any of these functions, but with an underlying character abstraction layer, the encoding applied to the output handle. In this case, you'll want to declare this layer early in your code:
use open qw( :std :encoding(UTF-8) );
print "\x{00C6}";
Output: Æ
Now I can spell 'Ælf' correctly!
Why did that happen?
There is a note in the perldoc for the chr() function:
Note that characters from 128 to 255 (inclusive) are by default internally not encoded as UTF-8 for backward compatibility reasons.
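For this reason, a character in the 128 to 255 range is written as a single raw byte unless the output handle has an encoding layer, and a lone byte such as 0xC6 is not valid UTF-8, so the terminal shows the replacement character. That is why this block needs use open to declare the encoding of the standard streams.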
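The difference can be made visible by capturing the bytes Perl actually writes, with and without an encoding layer. This is a sketch using in-memory filehandles in place of STDOUT; `bytes_written` is a hypothetical helper introduced only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: print $text to an in-memory handle opened with the
# given layer string, then return the bytes written as hex pairs.
sub bytes_written {
    my ($text, $layer) = @_;
    my $buffer = '';
    open(my $fh, ">$layer", \$buffer) or die "open failed: $!";
    print {$fh} $text;
    close($fh);
    return join ' ', map { sprintf '%02X', ord($_) } split //, $buffer;
}

my $char = "\x{00C6}";   # Æ

print bytes_written($char, ''), "\n";                   # no layer: C6
print bytes_written($char, ':encoding(UTF-8)'), "\n";   # UTF-8 layer: C3 86
```

The single byte C6 is what a UTF-8 terminal cannot interpret; the two-byte sequence C3 86 is the UTF-8 encoding of Æ.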