Haskell IO: Russian symbols
E

3

6

I am trying to process a file that contains Russian symbols. After reading it and writing some text to a file, I get something like:

\160\192\231\229\240\225\224\233\228\230\224\237

How can I get normal symbols?

Experimentalism answered 15/5, 2010 at 12:41 Comment(1)
I am trying to parse the web page www.trade.su/search?ext=1 - Experimentalism
E
2

I got it working:

{-# LANGUAGE ImplicitParams #-}

import Network.HTTP
import Text.HTML.TagSoup
import Data.Encoding
import Data.Encoding.CP1251
import Data.Encoding.UTF8

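-- Fetch the page and decode its body from Windows-1251 into a Unicode String.
openURL :: String -> IO String
openURL url = do
        response <- simpleHTTP (getRequest url)
        fmap (decodeString CP1251) (getResponseBody response)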

main :: IO ()
main = do
    tags <- fmap parseTags $ openURL "http://www.trade.su/search?ext=1"
    let TagText r  = partitions (~== "<input type=checkbox>") tags !! 1 !! 4
    appendFile "out" r
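
To illustrate what the decoding step does, here is a minimal sketch that uses only base and not the encoding package. It assumes the undecoded string holds one Char per raw byte, and it only remaps the range 0xC0..0xFF, which Windows-1251 assigns to the letters А..я (U+0410..U+044F). The name decodeCyrillic is hypothetical, and most bytes in 0x80..0xBF would need a real lookup table, so treat this as an illustration rather than a replacement for decodeString CP1251.

```haskell
import Data.Char (chr, ord)
import System.IO (hSetEncoding, stdout, utf8)

-- Hypothetical helper: a partial Windows-1251 decoder.
-- Only bytes 0xC0..0xFF (the letters А..я) are remapped; every other
-- value is passed through unchanged, which is not correct for most of
-- 0x80..0xBF in real Windows-1251.
decodeCyrillic :: Char -> Char
decodeCyrillic c
  | n >= 0xC0 && n <= 0xFF = chr (n - 0xC0 + 0x0410)
  | otherwise              = c
  where
    n = ord c

main :: IO ()
main = do
  -- Assumption: the terminal accepts UTF-8 output.
  hSetEncoding stdout utf8
  -- The escaped string from the question; it prints a non-breaking
  -- space followed by "Азербайджан".
  putStrLn (map decodeCyrillic "\160\192\231\229\240\225\224\233\228\230\224\237")
```

So the escaped output in the question is Windows-1251 text that was never decoded, then rendered through show.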
Experimentalism answered 16/5, 2010 at 15:54 Comment(0)
S
8

If you are getting strings with backslashes and numbers in, then it sounds like you might be calling "print" when you want to call "putStr".
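
A small sketch of the difference, using only base. The sample string is a hypothetical example, and forcing UTF-8 on stdout is an assumption made so that the result does not depend on the locale (a Windows console may still need a matching code page).

```haskell
import System.IO (hSetEncoding, stdout, utf8)

-- Hypothetical sample text; any non-ASCII string behaves the same way.
sample :: String
sample = "Вася"

main :: IO ()
main = do
  -- Assumption: the terminal accepts UTF-8 output.
  hSetEncoding stdout utf8
  -- print goes through show, which escapes non-ASCII characters:
  -- this line prints "\1042\1072\1089\1103" (including the quotes).
  print sample
  -- putStrLn writes the characters themselves: Вася
  putStrLn sample
```

Since print is putStrLn composed with show, the same escaping appears whenever show is applied to a String before it is written, as in appendFile "out" (show r).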

Scholasticate answered 15/5, 2010 at 16:23 Comment(0)
B
2

If you deal with Unicode, you might try the utf8-string package:

import System.IO hiding (hPutStr, hPutStrLn, hGetLine, hGetContents, putStrLn)
import System.IO.UTF8
import Codec.Binary.UTF8.String (utf8Encode)
main = System.IO.UTF8.putStrLn "Вася Пупкин"

However, it didn't work well in my Windows command line: the output was garbled because of the console code page. I expect it to work fine on Unix-like systems if your locale is set correctly. Writing to a file, however, should be successful on all systems.
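
As an alternative to utf8-string, a minimal sketch that sets the encoding on the file handle with hSetEncoding from System.IO in base. This is a swapped-in technique, not what the answer above uses. The names writeUtf8 and readUtf8 and the file name out.txt are hypothetical.

```haskell
import System.IO

-- Hypothetical helper: write text as UTF-8 regardless of the locale.
writeUtf8 :: FilePath -> String -> IO ()
writeUtf8 path text = withFile path WriteMode $ \h -> do
  hSetEncoding h utf8
  hPutStr h text

-- Hypothetical helper: read the file back with the same explicit encoding.
readUtf8 :: FilePath -> IO String
readUtf8 path = withFile path ReadMode $ \h -> do
  hSetEncoding h utf8
  contents <- hGetContents h
  -- hGetContents is lazy: force the whole string before withFile
  -- closes the handle.
  length contents `seq` return contents

main :: IO ()
main = do
  writeUtf8 "out.txt" "Вася Пупкин\n"
  text <- readUtf8 "out.txt"
  -- Assumption: the terminal accepts UTF-8 output.
  hSetEncoding stdout utf8
  putStr text
```

Because the encoding is attached to the file handle, the file contents do not depend on the console code page. Only the final putStr to the terminal does.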

UPDATE:

An example of encoding package usage.

Builtup answered 15/5, 2010 at 13:19 Comment(12)
He's not dealing with Unicode. According to Firefox, the page he linked is encoded in Windows-1251. - Wretch
Then the encoding package may be useful; it has Data.Encoding.CP1251. - Builtup
I have some problems installing this package on Windows; it cannot find a library. I tried this: cd c:\Users\test_8\Desktop\encoding-0.6.3 runhaskell Setup.hs configure --extra-include-dirs="c:\Users\test_8\Desktop\encoding-0.6.3" --extra-lib-dirs="c:\Users\test_8\Desktop\encoding-0.6.3" but got this: Setup.hs: Missing dependency on a foreign library: * Missing header file: system_encoding.h - Experimentalism
It requires localinfo.h, which I cannot find. - Experimentalism
@Anton: Please paste your sources somewhere, for example here if they aren't too large. - Builtup
import Text.HTML.TagSoup import Text.HTML.Download main :: IO () main = do tags <- fmap parseTags $ openURL "trade.su/search?ext=1" let r = partitions (~== "<input type=checkbox>") tags !! 1 appendFile "out" (show r) - Experimentalism
This does not work: {-# LANGUAGE ImplicitParams #-} import Text.HTML.TagSoup import Text.HTML.Download import Prelude hiding (appendFile) import System.IO.Encoding import Data.Encoding.CP1251 main :: IO () main = do tags <- fmap parseTags $ openURL "trade.su/search?ext=1" let r = partitions (~== "<input type=checkbox>") tags !! 1 let ?enc = CP1251 appendFile "out" (show r) - Experimentalism
I've just given up on installing the encoding package on Windows, and I don't have GHC for Unix on hand. It would be interesting to know how you managed to install it. - Builtup
I could not install it on Windows either. - Experimentalism
@Anton: Mate, I can no longer help you, since I don't have GHC for *nix, sorry. I hope Google will help you. And it would be nice if you answered your own question here. ;) - Builtup
Thanks. I cannot understand the reason for my problem, so I do not know what to search for on Google. - Experimentalism
@Anton: Also make sure that the appendFile you use is imported from System.IO.Encoding rather than System.IO. Try calling System.IO.Encoding.appendFile explicitly. - Builtup
