How to convert ANSI text to UTF-8 in Go? I am trying to convert an ANSI string to a UTF-8 string.
Go only has UTF-8 strings. You can convert a []byte to a UTF-8 string using the conversion described here:
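A minimal sketch of that conversion, using only the standard library. Note that string(b) just copies the bytes and does not transcode, so it gives readable text only when b already holds valid UTF-8. The helper name bytesToString is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// bytesToString is a hypothetical helper: it converts b to a string only
// when b is already valid UTF-8. string(b) never changes the encoding.
func bytesToString(b []byte) (string, bool) {
	if !utf8.Valid(b) {
		return "", false
	}
	return string(b), true
}

func main() {
	utf8Bytes := []byte{0xE2, 0x82, 0xAC} // the euro sign encoded as UTF-8
	ansiBytes := []byte{0x80}             // the euro sign in Windows-1252; not valid UTF-8

	s, ok := bytesToString(utf8Bytes)
	fmt.Println(s, ok)

	_, ok = bytesToString(ansiBytes)
	fmt.Println(ok)
}
```

If the check fails, the bytes are in some other encoding and need a real decoder for that code page, as in the answers that follow.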
Here is a newer method.
package main

import (
	"bytes"
	"fmt"
	"io"

	"golang.org/x/text/encoding/traditionalchinese"
	"golang.org/x/text/transform"
)

// Decode converts Big5-encoded bytes to UTF-8.
func Decode(s []byte) ([]byte, error) {
	in := bytes.NewReader(s)
	out := transform.NewReader(in, traditionalchinese.Big5.NewDecoder())
	// io.ReadAll replaces the deprecated ioutil.ReadAll (Go 1.16+).
	d, err := io.ReadAll(out)
	if err != nil {
		return nil, err
	}
	return d, nil
}

func main() {
	s := []byte{0xB0, 0xAA} // one Big5-encoded character
	b, err := Decode(s)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(b))
}
I used iconv-go for this conversion. You must know which ANSI code page your text uses; in my case it is Big5.
package main

import (
	"fmt"
	"log"

	//iconv "github.com/djimenez/iconv-go"
	iconv "github.com/andelf/iconv-go"
)

func main() {
	ibuf := []byte{170, 76, 80, 67}
	var obuf [256]byte

	// Method 1: use Convert directly
	nR, nW, err := iconv.Convert(ibuf, obuf[:], "big5", "utf-8")
	if err != nil {
		log.Fatalln(err)
	}
	log.Println(nR, ibuf)
	log.Println(obuf[:nW])
	fmt.Println(string(obuf[:nW]))

	// Method 2: build a converter first
	cv, err := iconv.NewConverter("big5", "utf-8")
	if err != nil {
		log.Fatalln(err)
	}
	nR, nW, err = cv.Convert(ibuf, obuf[:])
	if err != nil {
		log.Fatalln(err)
	}
	log.Println(string(obuf[:nW]))
}
I've written a function that was useful for me; maybe someone else can use it. It converts from Windows-1252 to UTF-8. It remaps the code points that Windows-1252 treats as characters but Unicode considers to be control characters (http://en.wikipedia.org/wiki/Windows-1252).
// fromWindows1252 needs: import "bytes"
func fromWindows1252(str string) string {
	var arr = []byte(str)
	var buf bytes.Buffer
	var r rune
	for _, b := range arr {
		switch b {
		case 0x80:
			r = 0x20AC
		case 0x82:
			r = 0x201A
		case 0x83:
			r = 0x0192
		case 0x84:
			r = 0x201E
		case 0x85:
			r = 0x2026
		case 0x86:
			r = 0x2020
		case 0x87:
			r = 0x2021
		case 0x88:
			r = 0x02C6
		case 0x89:
			r = 0x2030
		case 0x8A:
			r = 0x0160
		case 0x8B:
			r = 0x2039
		case 0x8C:
			r = 0x0152
		case 0x8E:
			r = 0x017D
		case 0x91:
			r = 0x2018
		case 0x92:
			r = 0x2019
		case 0x93:
			r = 0x201C
		case 0x94:
			r = 0x201D
		case 0x95:
			r = 0x2022
		case 0x96:
			r = 0x2013
		case 0x97:
			r = 0x2014
		case 0x98:
			r = 0x02DC
		case 0x99:
			r = 0x2122
		case 0x9A:
			r = 0x0161
		case 0x9B:
			r = 0x203A
		case 0x9C:
			r = 0x0153
		case 0x9E:
			r = 0x017E
		case 0x9F:
			r = 0x0178
		default:
			// All other bytes share their code point with Unicode.
			r = rune(b)
		}
		buf.WriteRune(r)
	}
	return buf.String()
}
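The same mapping can also be written as a lookup table, which may be easier to check against the Windows-1252 chart. This is only a sketch using the standard library; the names cp1252High and decodeCP1252 are assumptions made for illustration. As in the function above, the five bytes Windows-1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D) pass through unchanged.

```go
package main

import (
	"fmt"
	"strings"
)

// cp1252High maps bytes 0x80..0x9F to Unicode code points.
// A zero entry marks a byte that Windows-1252 leaves undefined.
var cp1252High = [32]rune{
	0x20AC, 0, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021, // 0x80..0x87
	0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0, 0x017D, 0, // 0x88..0x8F
	0, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014, // 0x90..0x97
	0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0, 0x017E, 0x0178, // 0x98..0x9F
}

// decodeCP1252 converts Windows-1252 bytes to a UTF-8 string.
func decodeCP1252(in []byte) string {
	var sb strings.Builder
	sb.Grow(len(in))
	for _, b := range in {
		r := rune(b) // bytes outside 0x80..0x9F share their code point with Unicode
		if b >= 0x80 && b <= 0x9F {
			if m := cp1252High[b-0x80]; m != 0 {
				r = m
			}
		}
		sb.WriteRune(r)
	}
	return sb.String()
}

func main() {
	// 0x80 is the euro sign, 0x93 and 0x94 are curly quotes, 0xE9 is e-acute.
	in := []byte{0x80, '1', '0', ' ', 0x93, 'o', 'k', 0x94, ' ', 0xE9}
	fmt.Println(decodeCP1252(in))
}
```

Taking a []byte instead of a string makes it explicit that the input is not yet valid UTF-8 text.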
There is no way to do it without writing the conversion yourself or using a third-party package. You could try using this: http://code.google.com/p/go-charset
The golang.org/x/text/encoding/charmap package has functions for exactly this problem:
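import "golang.org/x/text/encoding/charmap"

// DecodeWindows1250 converts Windows-1250 bytes to a UTF-8 string.
// The conversion error is ignored here for brevity.
func DecodeWindows1250(enc []byte) string {
	dec := charmap.Windows1250.NewDecoder()
	out, _ := dec.Bytes(enc)
	return string(out)
}

// EncodeWindows1250 converts a UTF-8 string to Windows-1250 bytes.
// The conversion error is ignored here for brevity.
func EncodeWindows1250(inp string) []byte {
	enc := charmap.Windows1250.NewEncoder()
	out, _ := enc.String(inp) // String returns (string, error)
	return []byte(out)
}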
Edit: fixed the "undefined: ba" error by replacing ba with enc.
out, _ := charmap.Windows1250.NewDecoder().String(input)
– Sadness