In Pharo/Smalltalk: How to read a file with a specific encoding?

I am currently reading a file like this:

dir := FileSystem disk workingDirectory.
stream := (dir / 'test.txt' ) readStream.
line := stream nextLine.

This works when the file is UTF-8 encoded, but I could not find out what to do when the file uses another encoding.

Rsfsr answered 26/3, 2018 at 13:34 Comment(1)
Hint: Look for implementors (and senders) of #encoding:. (Eke)

For Pharo 7 there's this guide for file streams, which proposes:

('test.txt' asFileReference)
    readStreamEncoded: 'cp-1250' do: [ :stream |
        stream upToEnd ].
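
The write side can probably be handled the same way. This is a minimal sketch, assuming Pharo 7 or later and that writeStreamEncoded:do: is available next to readStreamEncoded:do:. The file name and the sample text are only illustrative.

file := 'test.txt' asFileReference.
"Remove any old file first, since a write stream may not truncate existing content."
file ensureDelete.
"Write the text as cp-1250 bytes instead of the default UTF-8."
file writeStreamEncoded: 'cp-1250' do: [ :stream |
    stream nextPutAll: 'Žluťoučký kůň' ].
"Read it back with the same encoding; content should then hold the original string."
content := file readStreamEncoded: 'cp-1250' do: [ :stream |
    stream upToEnd ].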
Progestin answered 24/1, 2019 at 12:32 Comment(0)

The classes ZnCharacterReadStream and ZnCharacterWriteStream let you work with character streams in encodings other than UTF-8 (which is the default). First, the file stream needs to be converted into a binary stream. After that, it can be wrapped in a ZnCharacter*Stream with the desired encoding. Here is a full example for writing and reading a file:

dir := FileSystem disk workingDirectory.

(dir / 'test.txt') writeStreamDo: [ :out |
  encoded := ZnCharacterWriteStream on: (out binary) encoding: 'cp1252'.
  encoded nextPutAll: 'Über?'.
].

content := '?'.
(dir / 'test.txt') readStreamDo: [ :in |
  decoded := ZnCharacterReadStream on: (in binary) encoding: 'cp1252'.
  content := decoded nextLine.
].
content. " -> should evaluate to 'Über?'"
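
In newer Pharo versions (7 and later, as far as I can tell) the streams handed to readStreamDo: and writeStreamDo: are already character streams and may no longer respond to binary. A sketch of the same round trip under that assumption, using binaryWriteStreamDo: and binaryReadStreamDo: to get the raw byte streams directly:

file := FileSystem disk workingDirectory / 'test.txt'.
"Start from an empty file so no stale bytes remain after writing."
file ensureDelete.

"Wrap the binary stream so characters are encoded as cp1252 bytes."
file binaryWriteStreamDo: [ :out |
    (ZnCharacterWriteStream on: out encoding: 'cp1252')
        nextPutAll: 'Über?' ].

"Wrap the binary stream again so the bytes are decoded as cp1252."
content := file binaryReadStreamDo: [ :in |
    (ZnCharacterReadStream on: in encoding: 'cp1252') upToEnd ].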

For more details, the book "Enterprise Pharo: A Web Perspective" has a chapter about character encoding.

Rsfsr answered 26/3, 2018 at 14:30 Comment(0)