How can I get Perl to respect the locale encoding for STDIN/STDOUT/STDERR, without affecting file IO?

What is the best way to ensure Perl uses the locale encoding (as in LANG=en_US.UTF-8) for STDIN/STDOUT/STDERR, without affecting file IO?

If I use

use utf8;                # the source file itself contains non-ASCII literals
use feature 'say';
use open ':locale';
say "mañana";
open(my $f, '>', 'test.txt') or die $!; say $f "mañana";

then the locale encoding is used for STDIN/STDOUT/STDERR, but also in test.txt, which is not very well-behaved: you don't want the encoding of a file to depend on the way you logged in.

Rubicon answered 18/1, 2013 at 13:6 Comment(2)
What encoding do you expect in the output file? - Comparable
export PERL_UNICODE=SAL - Huldahuldah

To add the encoding layers to STDIN, STDOUT and STDERR, you need to use

use open ':std', ':locale';

instead of

use open ':locale';

But that doesn't just add an encoding layer to STDIN, STDOUT and STDERR; it causes the same layer to be added to file handles opened in scope by default. So we need to override that default with

open(my $fh, '>:encoding(UTF-8)', $qfn)

or

use open ':encoding(UTF-8)';
open(my $fh, '>', $qfn)

All together:

use open ':std', ':locale';
use open ':encoding(UTF-8)';
open(my $fh_txt, '>',     $qfn);   # Text
open(my $fh_bin, '>:raw', $qfn);   # Binary

or

use open ':std', ':locale';
open(my $fh_txt, '>:encoding(UTF-8)', $qfn);   # Text
open(my $fh_bin, '>:raw',             $qfn);   # Binary

Result:

my $s = chr(0xE9);

say         $s;      # U+E9 encoded as per locale
say $fh_txt $s;      # U+E9 encoded using UTF-8
say $fh_bin $s;      # Byte E9

(You can use binmode($fh); instead of :raw for binary files, if you prefer.)
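
To check the byte-level result independently of the terminal, here is a minimal sketch of the first "all together" variant. It writes U+E9 to scratch files and prints the bytes in hex. The helper name write_and_dump and the scratch file name are made up for illustration, and the sketch assumes a standard Perl installation with the core File::Temp module.

```perl
use strict;
use warnings;
use feature 'say';
use File::Temp qw(tempdir);

use open ':std', ':locale';        # STDIN/STDOUT/STDERR follow the locale
use open ':encoding(UTF-8)';       # default layer for files opened in this scope

my $dir = tempdir(CLEANUP => 1);   # scratch directory, removed at exit

# Hypothetical helper: write $string using mode ">$layer" (an empty $layer
# means the default layer applies) and return the file's bytes in hex.
sub write_and_dump {
    my ($layer, $string) = @_;
    my $qfn = "$dir/demo.out";
    open(my $out, ">$layer", $qfn) or die "Can't create $qfn: $!";
    print $out $string;
    close($out) or die "Can't close $qfn: $!";

    open(my $in, '<:raw', $qfn) or die "Can't read $qfn: $!";
    local $/;                      # slurp the whole file
    my $bytes = <$in>;
    close($in);
    return join ' ', map { sprintf '%02X', ord } split //, $bytes;
}

my $s = chr(0xE9);
say 'text:   ', write_and_dump('',     $s);   # prints "text:   C3 A9"
say 'binary: ', write_and_dump(':raw', $s);   # prints "binary: E9"
```

Only ASCII is sent to STDOUT here, so the printed lines are the same under any locale; the file contents depend only on the layers given to open.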

Juno answered 18/1, 2013 at 19:12 Comment(0)
{ use open IO => ':locale'; }   # the IO => is optional

... does what the OP asked for.

The effect on subsequent opens is lexically scoped, so putting the use open ... in its own little block prevents it from having any effect on file handles opened later. :locale implies :std, so it immediately modifies STDIN, STDOUT and STDERR.
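
One way to see the scoping is to list the PerlIO layers on each handle. This is only a sketch: the helper name layers_of_new_file and the scratch file name are invented for illustration, and the layers reported for STDOUT depend on the locale the script runs under.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# The import runs at compile time and applies the locale layer to the
# standard handles. The default layer for open() ends with this block.
{ use open ':locale'; }

my $dir = tempdir(CLEANUP => 1);   # scratch directory, removed at exit

# Hypothetical helper: open a new file with a plain '>' and list its layers.
sub layers_of_new_file {
    my $qfn = "$dir/plain.out";
    open(my $fh, '>', $qfn) or die "Can't create $qfn: $!";
    my @layers = PerlIO::get_layers($fh);
    close($fh);
    return @layers;
}

print "STDOUT: @{[ PerlIO::get_layers(\*STDOUT) ]}\n";
print "file:   @{[ layers_of_new_file() ]}\n";
```

Under a UTF-8 locale the STDOUT line should include an encoding layer, while the file line should not, since the file was opened outside the block.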

Dispersion answered 2/4, 2023 at 5:43 Comment(0)
