Convert JSON to SQLite table
Asked Answered
D

2

50

I want to convert a JSON file I created to a SQLite database.

My json file is like this (containing traffic data from some crossroads in my city):

{
"2011-12-17 16:00": {
    "local": "Av. Protásio Alves; esquina Ramiro Barcelos",
    "coord": "-30.036916,-51.208093",
    "sentido": "bairro-centro",
    "veiculos": "automotores",
    "modalidade": "semaforo 50-15",
    "regime": "típico",
    "pistas": "2+c"
    },
"2011-12-19 08:38": {
    "local": "R. Fernandes Vieira; esquina Protásio Alves",
    "coord": "-30.035535,-51.211079",
    "sentido": "único",
    "veiculos": "automotores",
    "modalidade": "semáforo 30-70",
    "regime": "típico",
    "pistas": "3"
    }
}

And I have created nice database with a one-to-many relation with these lines of Python code:

import sqlite3

db = sqlite3.connect("fluxos.sqlite")
c = db.cursor()

c.execute('''create table medicoes
         (timestamp text primary key,
          local text,
          coord text,
          sentido text,
          veiculos text,
          modalidade text,
          pistas text)''')

c.execute('''create table valores
         (id integer primary key,
          quantidade integer,
          tempo integer,
          foreign key (id) references medicoes(timestamp))''')

How can I programmatically read the keys from each "block" in the JSON file (in this case, "local", "coord", "sentido", "veiculos", "modalidade", "regime", "pistas" e "medicoes"), create the database with the columns in that same order, and then insert the rows with the proper values?

Dorella answered 10/1, 2012 at 22:59 Comment(0)
S
55

You have this python code:

c.execute("insert into medicoes values(?,?,?,?,?,?,?)" % keys)

which I think should be

c.execute("insert into medicoes values (?,?,?,?,?,?,?)", keys)

since the % operator expects the string to its left to contain formatting codes.

Now all you need to make this work is for keys to be a tuple (or list) containing the values for the new row of the medicoes table in the correct order. Consider the following python code:

import json

traffic = json.load(open('xxx.json'))

columns = ['local', 'coord', 'sentido', 'veiculos', 'modalidade', 'pistas']
for timestamp, data in traffic.iteritems():
    keys = (timestamp,) + tuple(data[c] for c in columns)
    print str(keys)

When I run this with your sample data, I get:

(u'2011-12-19 08:38', u'R. Fernandes Vieira; esquina Prot\xe1sio Alves', u'-30.035535,-51.211079', u'\xfanico', u'automotores', u'sem\xe1foro 30-70', u'3')
(u'2011-12-17 16:00', u'Av. Prot\xe1sio Alves; esquina Ramiro Barcelos', u'-30.036916,-51.208093', u'bairro-centro', u'automotores', u'semaforo 50-15', u'2+c')

which would seem to be the tuples you require.

You could add the necessary sqlite code with something like this:

import json
import sqlite3

traffic = json.load(open('xxx.json'))
db = sqlite3.connect("fluxos.sqlite")

query = "insert into medicoes values (?,?,?,?,?,?,?)"
columns = ['local', 'coord', 'sentido', 'veiculos', 'modalidade', 'pistas']
for timestamp, data in traffic.iteritems():
    keys = (timestamp,) + tuple(data[c] for c in columns)
    c = db.cursor()
    c.execute(query, keys)
    c.close()

Edit: if you don't want to hard-code the list of columns, you could do something like this:

import json

traffic = json.load(open('xxx.json'))

someitem = traffic.itervalues().next()
columns = list(someitem.keys())
print columns

When I run this it prints:

[u'medicoes', u'veiculos', u'coord', u'modalidade', u'sentido', u'local', u'pistas', u'regime']

You could use it with something like this:

import json
import sqlite3

db = sqlite3.connect('fluxos.sqlite')
traffic = json.load(open('xxx.json'))

someitem = traffic.itervalues().next()
columns = list(someitem.keys())
columns.remove('medicoes')
columns.remove('regime')

query = "insert into medicoes (timestamp,{0}) values (?{1})"
query = query.format(",".join(columns), ",?" * len(columns))
print query

for timestamp, data in traffic.iteritems():
    keys = (timestamp,) + tuple(data[c] for c in columns)
    c = db.cursor()
    c.execute(query)
    c.close()

The query this code prints when I try it with your sample data is something like this:

insert into medicoes (timestamp,veiculos,coord,modalidade,sentido,local,pistas) values (?,?,?,?,?,?,?)
Swop answered 10/1, 2012 at 23:25 Comment(1)
I think your answer contains most of what I want to do. My only remaining problem is that columns is hard-coded, and I would like thes strings to be read from the file. Optimally, they should be in the same order. What do you think?Dorella
U
0

With the popular data manipulation library pandas, converting json to a sqlite table is very easy since a lot of the processing is done by pandas. Basically, the json file can be converted into a pandas DataFrame using pandas.read_json, then we can simply filter the required columns and dump into a SQLite table using to_sql.

import sqlite3
import pandas as pd

con = sqlite3.connect('data/fluxos.sqlite')
(
    pd.read_json('data/my_json.json', orient='index')
    .filter(['local', 'coord', 'sentido', 'veiculos', 'modalidade', 'pistas'])
    .to_sql('medicoes', con, index_label='timestamp', dtype={'timestamp': 'TEXT PRIMARY KEY'}, if_exists='append')
)
con.close()

However, if you don't want to stick to the built-in libraries, then instead of explicitly opening/closing a connection and inserting each row one-by-one, you can process the data beforehand and use executemany to let sqlite3 handle the loop:

import json
import sqlite3


with open('data/my_json.json') as f:
    my_json = json.load(f)

columns = ['local', 'coord', 'sentido', 'veiculos', 'modalidade', 'pistas']
processed_data = [(timestamp, *(data[c] for c in columns)) for timestamp, data in my_json.items()]

con = sqlite3.connect('data/fluxos.sqlite')
cur = con.cursor()

cur.execute('''
CREATE TABLE medicoes (
    timestamp TEXT PRIMARY KEY,
    local TEXT,
    coord TEXT,
    sentido TEXT,
    veiculos TEXT,
    modalidade TEXT,
    pistas TEXT)
''')

cur.executemany("INSERT INTO medicoes VALUES (?, ?, ?, ?, ?, ?, ?)", processed_data)
con.commit()
cur.close()
con.close()
Unbeknown answered 20/11, 2023 at 20:36 Comment(0)

© 2022 - 2025 — McMap. All rights reserved.