Reading DBF files with pyodbc
Asked Answered
E

3

7

In a project, I need to extract data from a Visual FoxPro database, which is stored in dbf files, y have a data directory with 539 files I need to take into account, each file represents a database table, so I've been doing some testing and my code goes like this:

import pyodbc

connection = pyodbc.connect("Driver={Microsoft Visual FoxPro Driver};SourceType=DBF;SourceDB=P:\\Data;Exclusive=No;Collate=Machine;NULL=No;DELETED=Yes")
tables = connection.cursor().tables()
for _ in tables:
    print _

this prints only 15 tables, with no obvious pattern, always the same 15 tables, I thought this was because the rest of the tables were empty but I checked and it some of the tables (dbf files) on the list are empty too, then, I thought it was a permission issue, but all the files have the same permission structure, so, I don't know what's happening here.

Any light??

EDIT: It is not truccating the output, the tables it list are not the 15 first or anything like that

Encratis answered 21/12, 2012 at 17:28 Comment(2)
Thank you for your amazing question and answer! For anyone curious, the code in the original question does work with older versions of FoxPro on a 32-bit system.Assyria
See here: github.com/olemb/dbfreadZeebrugge
E
10

I DID IT!!!!

There where several problems with what I was doing so, here I come with what I did to solve it (after implementing it the first time with Ethan Furman's solution)

The first thing was a driver problem, it turns out that the Windows' DBF drivers are 32 bits programs and runs on a 64 bits operating system, so, I had installed Python-amd64 and that was the first problem, so I installed a 32bit Python.

The second issue was a library/file issue, according to this, dbf files in VFP > 7 are diferent, so my pyodbc library won't read them correctly, so I tried some OLE-DB libraries with no success and I decided to to it from scratch.

Googling for a while took me to this post which finally gave me a light on this

Basically, what I did was the following:

import win32com.client

conn = win32com.client.Dispatch('ADODB.Connection')
db = 'C:\\Profit\\profit_a\\ARMM'
dsn = 'Provider=VFPOLEDB.1;Data Source=%s' % db
conn.Open(dsn)

cmd = win32com.client.Dispatch('ADODB.Command')
cmd.ActiveConnection = conn
cmd.CommandText = "Select * from factura, reng_fac where factura.fact_num = reng_fac.fact_num AND factura.fact_num = 6099;"

rs, total = cmd.Execute() # This returns a tuple: (<RecordSet>, number_of_records)

while total:
    for x in xrange(rs.Fields.Count):
        print '%s --> %s' % (rs.Fields.item(x).Name, rs.Fields.item(x).Value)
    rs.MoveNext()        #<- Extra indent
    total = total - 1

And it gave me 20 records which I checked with DBFCommander and were OK

First, you need to install pywin32 extensions (32bits) and the Visual FoxPro OLE-DB Provider (only available for 32bits), in my case for VFP 9.0

Also, it's good to read de ADO Documentation at the w3c website

This worked for me. Thank you very much to those who replied

Encratis answered 26/12, 2012 at 20:57 Comment(3)
Congratulations! Looks like a good example on using win32 as well.Tenantry
Thank you very much!, yes, and this way I can take advantage of the VFP SQL capacity so I don't have to implement all the relational algebra, I was even thinking of implementing a context free grammar with your dbf package and PLY to solve this in a "reusable way"Encratis
This is great! I tried the other dbf libraries but it seems they were not able to get the correct output possibly because of VFP version being different. I think this is the preferred solution if working with VFP dbfs. The only drawback is having to install everything for win32.Obstreperous
T
3

I would use my own dbf package and the code would go something like this:

import dbf
from glob import glob

for dbf_file in glob(r'p:\data\*.dbf'):
    with dbf.Table(dbf_file) as table:
        for record in table:
            do_something_with(record)

A table is list-like, and iteration through it returns records. A record is list-, dict-, and obj-like, and iteration returns the values; besides iteration through the record, individual fields can be accessed either by offset (record[0] for the first field), by field-name using dict-like access (record['some_field']), or by field-name using obj.attr-like access (record.some_field).

If you just wanted to dump the contents of each dbf file into a csv file you could do:

for dbf_file in glob(r'p:\data\*.dbf'):
    with dbf.Table(dbf_file) as table:
        dbf.export(table, dbf_file)
Tenantry answered 21/12, 2012 at 23:33 Comment(1)
Very interesting and useful, thank you very much, I like the csv exporting feature, I was coding it using the other dbf tool. Thank you very much! this saved some time for me.Encratis
T
2

I know this doesn't directly answer your question, but might still help. I've had lots of issues using ODBC with VFP databases and I've found it's often much easier treating the VFP tables as free tables when possible.

Using Yusdi Santoso's dbf.py and glob, here's some code to open each table in a directory and run through each record.

import glob
import os
import dbf

os.chdir("P:\\data")
for file in glob.glob("*.dbf"):
    table = dbf.readDbf(file)
    for row in table:
         #do stuff
Tala answered 21/12, 2012 at 17:51 Comment(3)
This kinda could work, but how do you work with the data like this, how do you select some columns or find relations through foreign keys?, Thank you very much for your answerEncratis
The data comes out in lists of dicts if I remember correctly. for row in table: data = row['FIELDNAME'] For sorting, you can use operator to do things like table.sort(key=operator.itemgetter('FIELDNAME')). As for foreign keys, it's all manual in this case. You're working on a table level instead of at the database level with this method.Tala
Thank you very much, I tried it and it works. If I can't find a way to do it at the database level, I'll do it this way.Encratis

© 2022 - 2024 — McMap. All rights reserved.