SQLite import csv file with comma in text fields
Asked Answered
N

3

8

I want to import csv file into SQLite db using

sqlite> .separator ,
sqlite> .mode csv data
sqlite> .import test.csv data

where data is the table name with three columns, just like the file.

  • The file has some string value that are encapsulated using double quotes.
  • Some of the string values have commas in them

Example data

Alias Spy_Name Agency
007 "Bond\, James" MI 5
Q "Mister Q" MI 5

The values "Bond\, James" should be treated as a single column, but SQLite produces an error:

Error: test.csv line 2: expected 3 columns of data but found 4

How can I make SQLite import these values correctly?

Niccolite answered 19/3, 2014 at 15:22 Comment(1)
Does the string only contain a comma "," or does it contain a comma and a backslash like this ",\"?Inarch
B
2

I know this is a bit old, but this was the first relevant google search result, so I wanted to share my solution.

Use a different separator, and remove the quotes around values.

sed -i -e 's/","/|/g' -e 's/"$//g' -e 's/^"//g' file.csv

sqlite> .separator "|"
sqlite> .import file.csv tablename
Bordereau answered 25/9, 2014 at 16:52 Comment(0)
N
2

SQLite's .import will accept a CSV line like this

fee, fi,"fo, fum"

provided that there are no space between the preceding comma and the string that is enclosed in quotes.

Since the following has a space between fi, and "fo

fee, fi, "fo, fum"

it will produce an error like:

expected 3 columns but found 4 - extras ignored

If anyone is wondering why this is the case, this was the response of Richard Hipp, author of SQLite, in two mails dated 21st May 2019 to the sqlite-users mailing list, in the thread 'CSV import does not handle fields with a comma surrounded by double'. (It should have been "double quotes", but I forgot the last word.) He wrote:

This is not valid CSV. There is an extra space character after the comma and before the double-quote.

And then

I'm going by RFC 4180. https://tools.ietf.org/html/rfc4180. On page 2 it says: "Spaces are considered part of a field and should not be ignored."

(In case anyone is wondering why I posted an Internet Archive copy of a third-party/unofficial archive, the IA copy is just from an abundance of caution. The unofficial archive is because, as far as I can tell, an official mailing list archive does not exist. The mailing list itself was discontinued some time ago.)

So the logic is that the string is to be surrounded by whitespace, it should surround the leading space too.

Transcript session follows.

###################
## incorrect.csv ##
###################   
fee, fi, "fo, fum"
#################    
## correct.csv ##
#################
fee, fi,"fo, fum" 
############################################## 
              ## test.sh ##
##############################################
echo "Importing incorrect.csv into test.db" 
sqlite3 test.db '.mode csv' 'DROP TABLE IF EXISTS incorrect;' 'CREATE TABLE IF NOT EXISTS incorrect(col1 TEXT PRIMARY KEY, col2 TEXT NOT NULL, col3 TEXT NOT NULL);' '.import incorrect.csv incorrect' '.exit'
echo
echo "Importing correct.csv into test.db"
sqlite3 test.db '.mode csv' 'DROP TABLE IF EXISTS correct;' 'CREATE TABLE IF NOT EXISTS correct(col1 TEXT PRIMARY KEY, col2 TEXT NOT NULL, col3 TEXT NOT NULL);' '.import correct.csv correct' '.exit'
echo
echo "Result of 'select * from incorrect'"
sqlite3 test.db 'select * from incorrect' '.exit'
echo
echo "Result of 'select * from correct'"
sqlite3 test.db 'select * from correct' '.exit'  
$ sh test.sh
    
Importing incorrect.csv into test.db
incorrect.csv:1: expected 3 columns but found 4 - extras ignored
    
Importing correct.csv into test.db
    
Result of 'select * from incorrect'
fee| fi| "fo
    
Result of 'select * from correct'
fee| fi|fo, fum
Norge answered 15/1, 2022 at 19:42 Comment(1)
The command csvtool cat incorrect.csv formats the file. The link is here. Then .import test.csv data doesn't produce the error.Cromer
F
0

I've experienced this issue myself and found it much much easier to modify my script so that it dumps sql queries as opposed to csv delimited values.

There are problems importing csv data into sqlite3 not only with commas, but also with new line characters.

I would suggest the following:

  • Modify your script to produce sql dumps
  • Convert the csv dump to sql queries and feed it to sqlite3
Farny answered 24/5, 2016 at 23:49 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.