How can I display Extended ASCII Codes characters in Perl?

How do I display the character with code 192 ( └ ) in Perl?

Pisci answered 20/9, 2010 at 15:40 Comment(4)
192 isn't actually ASCII. ASCII defines only 128 characters, with codes 0 to 127. – Dallis
What would be a better title for this question? – Pisci
If you want to print the character with a value of 192, then you need to tell us which character encoding you're using. It isn't ASCII, as ASCII only defines 128 characters. Are you using one of the extended character sets? Perhaps cp1252 or ISO-8859? – Overskirt
There is no such thing as "Extended ASCII"; what you have there is called Code Page 437: en.wikipedia.org/wiki/Code_page_437 – Quilmes

What you want is to be able to print Unicode, and the answer is in perldoc perluniintro.

You can use \x{nnnn}, where nnnn is the hexadecimal code point, or \N{...} with the character's name:

perl -CS -E 'use charnames ":full"; say "\x{2514}"; say "\N{BOX DRAWINGS LIGHT UP AND RIGHT}"'
Nippy answered 20/9, 2010 at 15:47 Comment(1)
I have edited the code example to be relevant to the question. If you do not agree, you can easily undo this. – Rife

To use exactly these codes, your terminal must support Code Page 437, which contains the frame (box-drawing) characters. Alternatively, you can use the derived CP850, which has fewer box-drawing characters. These characters also exist in Unicode, in the Box Drawing block. The character you want is written in Perl as \N{U+2514}. More details are in perlunicode.
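
As a minimal sketch of the \N{U+...} notation, assuming your terminal expects UTF-8: the helper name small_box and the particular frame layout are made up for illustration and are not part of any module.

```perl
use strict;
use warnings;

# Assumption: the terminal expects UTF-8, so encode output accordingly.
binmode(STDOUT, ':encoding(UTF-8)');

# Hypothetical helper: build a tiny frame from Box Drawing code points.
sub small_box {
    my $top    = "\N{U+250C}\N{U+2500}\N{U+2510}";   # upper-left, horizontal, upper-right
    my $bottom = "\N{U+2514}\N{U+2500}\N{U+2518}";   # lower-left, horizontal, lower-right
    return "$top\n$bottom\n";
}

print small_box();
```

The first character of the second line is U+2514, the corner from the question.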

Clericalism answered 20/9, 2010 at 15:52 Comment(1)
"\x{2514}" does it, too. This syntax is explained in perlop. – Rife

That looks like the Code Page 437 encoding. Perl is probably just outputting the bytes you give it, while your terminal is probably expecting UTF-8.

So you need to decode it to Unicode, then re-encode it in UTF-8.
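
One way to sketch that decode-then-encode step uses the core Encode module. The helper name cp437_to_utf8 is hypothetical, and the example assumes the input really is CP437 and the terminal really expects UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: turn CP437 bytes into UTF-8 bytes.
sub cp437_to_utf8 {
    my ($bytes) = @_;
    my $chars = decode('cp437', $bytes);   # bytes -> Perl characters
    return encode('UTF-8', $chars);        # characters -> UTF-8 bytes
}

# Byte 192 (0xC0) is the lower-left corner character in Code Page 437.
print cp437_to_utf8(chr(192)), "\n";
```

Because the result is already a byte string, it is printed without any encoding layer on STDOUT.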

EDIT: Correct encoding.

Leitmotiv answered 20/9, 2010 at 15:48 Comment(2)
Or, change your terminal settings. :) – Improvised
No, it's IBM437. See IANA, RFC 1345, en.Wikipedia. – Rife

As usual, Jon Skeet nails it: the 192 code is in the "extended ASCII" range. I suggest you follow @Douglas Leeder's advice, but I'm not sure which encoding www.LookupTables.com is giving you; ISO-8859-1 thinks 192 maps to "À", and Mac OS Roman thinks it's "¿".
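
To see how much the answer depends on the encoding, here is a small sketch that decodes the same byte under three encodings with the core Encode module. The helper name decode_byte_192 is hypothetical, and the output is assumed to go to a UTF-8 terminal.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumption: the terminal expects UTF-8.
binmode(STDOUT, ':encoding(UTF-8)');

# Hypothetical helper: interpret byte 192 (0xC0) in the given encoding.
sub decode_byte_192 {
    my ($encoding) = @_;
    my $byte = "\xC0";
    return decode($encoding, $byte);
}

# Print the resulting code point and character for each encoding.
for my $encoding (qw(cp437 iso-8859-1 MacRoman)) {
    my $char = decode_byte_192($encoding);
    printf "%-10s U+%04X %s\n", $encoding, ord($char), $char;
}
```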

Barna answered 20/9, 2010 at 16:2 Comment(1)
"Extended ASCII" is a family of encodings. The one in the question is IBM437. See IANA, RFC 1345, en.Wikipedia. – Rife

Is there a solution that works on ALL characters?

The user says they wanted to use a Latin-1 extended charset character, so let's try an example from this block. If they wanted the Æ character, they would run:

print "\x{00C6}";

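Output (on a UTF-8 terminal): a garbled or replacement character, because Perl emits the single byte 0xC6 rather than the UTF-8 encoding of Æ.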

TL;DR: Character Encoding Modes in Perl

So, wait, what just happened there? You'll notice that other ways of producing the character, such as chr(...), \N{U+...}, and even unpack(...), have the same issue. The problem isn't with any of these functions but with the underlying I/O encoding layer. In this case, you'll want to declare that layer early in your code:

use open qw( :std :encoding(UTF-8) );
print "\x{00C6}";

Output: Æ

Now I can spell 'Ælf' correctly!


Why did that happen?

There is a note in perldoc about the chr() function:

Note that characters from 128 to 255 (inclusive) are by default internally not encoded as UTF-8 for backward compatibility reasons.

For this reason, characters in the 128 to 255 range need use open qw( :std :encoding(UTF-8) ), or an equivalent output layer, so that the standard streams encode them as UTF-8.
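
The effect of the encoding layer can be checked without a terminal by printing to an in-memory filehandle and inspecting the bytes. This is only a sketch: the helper names bytes_written and hex_bytes are hypothetical, and an explicit :encoding(UTF-8) open mode stands in for the use open pragma.

```perl
use strict;
use warnings;

# Hypothetical helper: print $text through the given open mode and
# return the raw bytes that ended up in the in-memory file.
sub bytes_written {
    my ($mode, $text) = @_;
    my $buffer = '';
    open(my $fh, $mode, \$buffer) or die "Cannot open in-memory file: $!";
    print {$fh} $text;
    close($fh) or die "Cannot close in-memory file: $!";
    return $buffer;
}

# Hypothetical helper: show a byte string as space-separated hex.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

my $raw  = bytes_written('>', "\x{00C6}");                  # no encoding layer
my $utf8 = bytes_written('>:encoding(UTF-8)', "\x{00C6}");  # UTF-8 layer

print "no layer:    ", hex_bytes($raw),  "\n";   # C6
print "UTF-8 layer: ", hex_bytes($utf8), "\n";   # C3 86
```

Without a layer the character goes out as the single byte C6, which is not valid UTF-8 on its own; with the layer it becomes the two-byte sequence C3 86.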

Weigh answered 28/11, 2021 at 21:53 Comment(0)
