Escaping Strings For Ruby SQLite Insert
Asked Answered
T

1

6

I'm creating a Ruby script to import a tab-delimited text file of about 150k lines into SQLite. Here it is so far:

require 'sqlite3'

file = File.new("/Users/michael/catalog.txt")
string = []

# Escape single quotes, remove newline, split on tabs, 
# wrap each item in quotes, and join with commas
def prepare_for_insert(s)
  s.gsub(/'/,"\\\\'").chomp.split(/\t/).map {|str| "'#{str}'"}.join(", ")
end

file.each_line do |line|
  string << prepare_for_insert(line)
end

database = SQLite3::Database.new("/Users/michael/catalog.db")

# Insert each string into the database
string.each do |str|
  database.execute( "INSERT INTO CATALOG VALUES (#{str})")
end

The script errors out on the first line containing a single quote in spite of the gsub to escape single quotes in my prepare_for_insert method:

/Users/michael/.rvm/gems/ruby-1.9.3-p0/gems/sqlite3-1.3.5/lib/sqlite3/database.rb:91:
in `initialize': near "s": syntax error (SQLite3::SQLException)

It's erroring out on line 15. If I inspect that line with puts string[14], I can see where it's showing the error near "s". It looks like this: 'Touch the Top of the World: A Blind Man\'s Journey to Climb Farther Than the Eye Can See'

Looks like the single quote is escaped, so why am I still getting the error?

Theoretical answered 8/3, 2012 at 7:33 Comment(0)
A
13

Don't do it like that at all, string interpolation and SQL tend to be a bad combination. Use a prepared statement instead and let the driver deal with quoting and escaping:

# Ditch the gsub in prepare_for_insert and...
db  = SQLite3::Database.new('/Users/michael/catalog.db')
ins = db.prepare('insert into catalog (column_name) values (?)')
string.each { |s| ins.execute(s) }

You should replace column_name with the real column name of course; you don't have to specify the column names in an INSERT but you should always do it anyway. If you need to insert more columns then add more placeholders and arguments to ins.execute.

Using prepare and execute should be faster, safer, easier, and it won't make you feel like you're writing PHP in 1999.

Also, you should use the standard CSV parser to parse your tab-separated files, XSV formats aren't much fun to deal with (they're downright evil in fact) and you have better things to do with your time than deal with their nonsense and edge cases and what not.

Annual answered 8/3, 2012 at 7:42 Comment(2)
Perfect. It works now. There are 34 columns. So, goofy as it looks, (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?) is the correct way to express the expected number of columns? I'll incorporate the CSV parser as well.Theoretical
@michaelmichael: Yes, and 34 arguments to ins.execute but you can put/keep the arguments in an array and ins.execute(*array) them to keep the goofiness in check. You can also build the placeholder using (['?'] * 34).join(',') (this is okay as you know exactly what strings you're working with, trying to use unknown strings in SQL is just asking for trouble though).Annual

© 2022 - 2024 — McMap. All rights reserved.