Extended ASCII in Linux
R

5

10

How would I print these characters in Linux?

│ (ascii 179)

├ (ascii 195)

└ (ascii 192)

─ (ascii 196)

I cannot find any octal values that would work with echo -e "\0xxx", any ideas?

Rotator answered 7/4, 2011 at 17:1 Comment(0)
A
9

After much poring over man printf and info printf, I think I've gotten this to work.

The basic issue seems to be that bash has a built-in printf that doesn't handle these escapes. And, despite what the man/info pages say, \U doesn't work. \u still does, though.

env printf '\u2502'

gets me a vertical box character.
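If \u is not available, one portable alternative is to spell out the UTF-8 bytes of each character as octal escapes, which any POSIX printf accepts, builtin or external. A minimal sketch, assuming the terminal displays UTF-8; box_char and its argument names are made up for illustration:

```shell
# Print one box-drawing character from its UTF-8 bytes (octal escapes).
# box_char is a hypothetical helper, not part of any standard tool.
box_char() {
  case "$1" in
    vertical)   printf '\342\224\202' ;;  # U+2502
    tee)        printf '\342\224\234' ;;  # U+251C
    corner)     printf '\342\224\224' ;;  # U+2514
    horizontal) printf '\342\224\200' ;;  # U+2500
    *)          return 1 ;;               # unknown name
  esac
}

# Draw a small branch followed by a label.
box_char tee; box_char horizontal; printf ' file\n'
```

In bash, the same bytes can be written for echo with a leading zero on each escape, e.g. echo -e '\0342\0224\0202' for the vertical bar.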

Avunculate answered 7/4, 2011 at 17:21 Comment(4)
env printf '\u2502' gets me "â". Maybe there are some settings I would need to change? I was hoping to find a solution that would work in any environment. - Rotator
The full list of Unicode box-drawing characters that can be used with this is at en.wikipedia.org/wiki/Box-drawing_character - Disembarrass
Rather than using "env printf" you can just run "/bin/printf" so that Bash runs the external printf program, not the builtin one (which does not support this Unicode syntax). BTW, this will only work with printf from GNU coreutils; it may not work on other Unix versions. - Disembarrass
env printf is not the whole story. You can also set a locale in front of env to influence the result, e.g. LC_ALL=en_GB.UTF-8 env printf '\u2502'. I think that's the problem that Benjamin has faced. To find the source of these issues, it's always advisable to type locale in the console to see what the current settings are. (Particular ones will also influence the language of some X11 apps.) - Burner
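The locale check mentioned in the comments can also be scripted, so that a script falls back to plain ASCII when the session is not UTF-8. A rough sketch, assuming the locale command is installed; branch_glyph is a hypothetical helper name and the set of accepted charmap spellings is a guess:

```shell
# Print a branch glyph suited to the given character map name.
# branch_glyph is a hypothetical helper for illustration only.
branch_glyph() {
  case "$1" in
    UTF-8|UTF8|utf8) printf '\342\224\234' ;;  # U+251C as UTF-8 bytes
    *)               printf '+' ;;             # plain ASCII fallback
  esac
}

# Use the charmap of the current locale; errors from locale are ignored,
# in which case the ASCII fallback is printed.
branch_glyph "$(locale charmap 2>/dev/null)"
echo
```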
T
6

You can use the exact codes you provided from the extended ASCII character set (e.g. 195 for ├) if your terminal uses the right encoding to display the characters.

Linux does not support the non-standard extended ASCII character set out of the box, which is why the characters are not displayed. However, I found another character set that is available on Linux and is very similar to it: IBM855.

All you have to do is change the character encoding of your terminal application to IBM855. All the popular box-drawing characters have the same codes there as in the extended ASCII character set, which is what matters most.

You may compare the sets by this image and this image.
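The same comparison can be tried on the command line by decoding the four codes from the question under both code pages. A sketch only, assuming an iconv that knows the names CP437 and IBM855 (glibc's does) and a UTF-8 terminal; decode_codepage is a hypothetical helper:

```shell
# Read raw bytes on stdin, interpret them in the given code page,
# and write UTF-8. decode_codepage is a hypothetical helper name.
decode_codepage() {
  iconv -f "$1" -t UTF-8
}

# Codes 179, 195, 192 and 196 written as octal escapes.
box_bytes='\263\303\300\304'

for cp in CP437 IBM855; do
  printf '%s: ' "$cp"
  printf "$box_bytes" | decode_codepage "$cp" || printf '(code page not available)'
  echo
done
```

If both lines show the same four glyphs, the two code pages agree on these positions.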

PS: If you're using gnome-terminal, you can add IBM855 charset by clicking the "Terminal" menu from the menu bar -> "set character encoding" -> "Add or Remove". Look for IBM855, and add it. Now just choose the encoding from "terminal"->"set character encoding"->"Cyrillic (IBM855)".

The boxes were enough for my homework. Hope this helps. :)

Topsyturvy answered 25/10, 2011 at 17:4 Comment(1)
How to do this in bash on Mac OS X? - Younger
C
3

Because some people may still want to know this...

See the lines that use iconv for the translation.

To print all ASCII/extended ASCII (CP437) codes in a Linux bash script:

# heading index with div line
printf "\n      "; # indent

for x in {0..15}; do printf "%-3x" $x; done;
printf "\n%46s\n" | sed 's/ /-/g;s/^/      /';

# two lines with dots to represent control chars
c=$(echo "fa" | xxd -p -r | iconv -f 'CP437//' -t 'UTF-8')
printf "%32s" | sed 's/../'"$c"'  /g;s/^/  0   /;s/$/\n\n/'
printf "%32s" | sed 's/../'"$c"'  /g;s/^/  1   /'

# convert dec to codepage 437 in a table
for x in {32..255};
do

  # newline every 16 translated code values
  (( x % 16 == 0 )) && printf "\n\n"

  # left index: the high hex digit of x, printed at the start of each row
  (( x % 16 == 0 )) && printf "  %-4x" $(( x / 16 ))

  # conversion of x integer value to symbol
  printf "%02x" $x | xxd -p -r | iconv -f 'CP437//' -t 'UTF-8' | sed 's/.*/&  /'

  # div line
  (( x == 127 )) && printf "%46s" | sed 's/ /-/g;s/^/      /;i\ '

done
printf "%46s" | sed 's/ /-/g;s/^/\n      /;s/$/\n      /'; # div line
for x in {0..15}; do printf "%-3x" $x; done;
echo
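The per-character step of the table can be pulled out into a small helper that turns one decimal code into its CP437 glyph, without needing xxd. A sketch only; cp437_char is a hypothetical name, and it assumes an iconv that knows CP437 and a UTF-8 terminal:

```shell
# Convert one decimal CP437 code (1..255) to its UTF-8 glyph.
# cp437_char is a hypothetical helper for illustration.
cp437_char() {
  # Build a one-byte octal escape such as \263 (for 179) and decode it.
  printf "\\$(printf '%03o' "$1")" | iconv -f CP437 -t UTF-8
}

# The four characters from the question.
for code in 179 195 192 196; do
  cp437_char "$code"
done
echo
```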

Compline answered 22/2, 2020 at 18:25 Comment(1)
Thank you @shmatt! It works perfectly. - Confiture
B
2

Either switch the font to one that is in PC-8/CP437 encoding, or use the Unicode values for those characters instead, encoded into the current charset.

Balthazar answered 7/4, 2011 at 17:7 Comment(0)
L
0

ASCII (first published as a standard in 1963) is actually a /seven/-bit encoding standard, so only the characters in the range {0..127} are defined.

Characters {128..255} are (commonly) referred to as the "alt page" and can be mapped any way you wish. There are MANY "standard" mappings, of which "CodePage 437" is/was a popular choice (thanks to DOS support, and "ANSI artists" of the 80s and 90s).

Under Linux, you can use luit to intercept stdin and stdout and convert data on the fly.

Create a suitable demo file

(for i in `seq 128 255` ; do printf "\x$(printf "%02x" $i)" ; done; echo) >demo.txt

Hexdump the file (sanity check)

hexdump -C demo.txt

Display the file normally

cat demo.txt

Display the file as cp437 (as per OP):

luit -encoding cp437 cat demo.txt

Display the file as cp850 (as an additional example):

luit -encoding cp850 cat demo.txt

You can also just run luit -encoding cp437 to open a sub-shell with cp437 encoding (use ^D to exit luit), at which point your [OP] echo statements should work as desired.

Limnology answered 19/6 at 16:45 Comment(0)
