DBF - encoding cp1250
Asked Answered
M

3

6

I have dbf database encoded in cp1250 and I am reading this database using folowing code:

import csv
from dbfpy import dbf
import os
import sys

filename = sys.argv[1]
if filename.endswith('.dbf'):
    print "Converting %s to csv" % filename
    csv_fn = filename[:-4]+ ".csv"
    with open(csv_fn,'wb') as csvfile:
        in_db = dbf.Dbf(filename)
        out_csv = csv.writer(csvfile)
        names = []
        for field in in_db.header.fields:
            names.append(field.name)
        #out_csv.writerow(names)
        for rec in in_db:
            out_csv.writerow(rec.fieldData)
        in_db.close()
        print "Done..."
else:
  print "Filename does not end with .dbf"

Problem is, that final csv file is wrong. Encoding of the file is ANSI and some characters are corrupted. I would like to ask you, if you can help me how to read dbf file correctly.

EDIT 1

I tried different code from https://pypi.python.org/pypi/simpledbf/0.2.4, there is some error.

Source 2:

from simpledbf import Dbf5
import os
import sys

dbf = Dbf5('test.dbf', codec='cp1250');
dbf.to_csv('junk.csv');

Output:

python program2.py
Traceback (most recent call last):
  File "program2.py", line 5, in <module>
    dbf = Dbf5('test.dbf', codec='cp1250');
  File "D:\ProgramFiles\Anaconda\lib\site-packages\simpledbf\simpledbf.py",      line 557, in __init__
    assert terminator == b'\r'

AssertionError

I really don't know how to solve this problem.

Maturity answered 7/7, 2015 at 13:48 Comment(0)
U
3

Try using my dbf library:

import dbf
with dbf.Table('test.dbf') as table:
    dbf.export(table, 'junk.csv')
Ubangi answered 7/7, 2015 at 15:56 Comment(0)
B
3

I wrote simpledbf. The line that is causing you problems was from some testing I was doing when developing the module. First of all, you might want to update your installation, as 0.2.6 is the most recent. Then you can try removing that particular line (#557) from the file "D:\ProgramFiles\Anaconda\lib\site-packages\simpledbf\simpledbf.py". If that doesn't work, you can ping me at the GitHub repo for simpledbf, or you could try Ethan's suggestion for the dbf module.

Babylonia answered 8/7, 2015 at 1:17 Comment(1)
It works too. Thanks for answer. :) There is just error on empty field."ValueError: Column type "" not yet supported."Maturity
G
0

You can decode and encode as necessary. dbfpy assumes strings are utf8 encoded, so you can decode as it isn't that encoding and then encode again with the right encoding.

import csv
from dbfpy import dbf
import os
import sys

filename = sys.argv[1]
if filename.endswith('.dbf'):
    print "Converting %s to csv" % filename
    csv_fn = filename[:-4]+ ".csv"
    with open(csv_fn,'wb') as csvfile:
        in_db = dbf.Dbf(filename)
        out_csv = csv.writer(csvfile)
        names = []
        for field in in_db.header.fields:
            names.append(field.name)
        #out_csv.writerow(names)
        for rec in in_db:
            row = [i.decode('utf8').encode('cp1250') if isinstance(i, str) else i for i in rec.fieldData]
            out_csv.writerow(rec.fieldData)
        in_db.close()
        print "Done..."
else:
  print "Filename does not end with .dbf"
Gummy answered 25/1, 2017 at 16:33 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.