Transform SQL insert script into CSV format
Asked Answered
F

4

10

I'm looking for an awk command, or similar tool, to transform standard well formatted SQL insert script into csv file.
By standard I mean there is no database vendor specific stuff anywhere.
By well formatted I mean the case where each line of the sql script has a full column set to insert, even if there are NULLs. Also the order of fields to insert is the same.
Sample input SQL script:

INSERT INTO tbl VALUES (1, 'asd', 923123123, 'zx');
INSERT INTO tbl VALUES (1, NULL, 923123123, 'zxz');
INSERT INTO tbl VALUES (3, 'asd3', 923123123, NULL);

Optionally:

INSERT INTO tbl (colA, colB, colC, colD) VALUES (1, 'asd', 923123123, 'zx');

Expected output should be a csv file:

1,'asd',923123123,'zx'
1,,923123123,'zxz'
3,'asd3',923123123,

Looking for a performance efficient solution.

Friedrick answered 15/8, 2015 at 15:46 Comment(2)
csv is practically evil. if text data never contains a single-quote (aka tick = ') then i guess its ok. Other test cases to add include which would simulate common name and address data: INSERT INTO tbl VALUES (1, NULL, 923123123, 'foo '' bar'); INSERT INTO tbl VALUES (1, NULL, 923123123, 'foo , bar'); INSERT INTO tbl VALUES (1, NULL, 923123123, 'foo ,'', bar');Awakening
csv is most widely used data exchange format, if it is an evil, it is still something we have to comprehend. If you have decent tool that extracts data to csv, then you will be good.Friedrick
J
9
$ awk -F' *[(),]+ *' -v OFS=, '{for (i=2;i<NF;i++) printf "%s%s", ($i=="NULL"?"":$i), (i<(NF-1)?OFS:ORS)}' file
1,'asd',923123123,'zx'
1,,923123123,'zxz'
3,'asd3',923123123,

I'd recommend you test all potential solutions with this input:

$ cat file
INSERT INTO tbl VALUES (1, NULL, 923123123, 'foo NULL bar');

$ awk -F' *[(),]+ *' -v OFS=, '{for (i=2;i<NF;i++) printf "%s%s", ($i=="NULL"?"":$i), (i<(NF-1)?OFS:ORS)}' file
1,,923123123,'foo NULL bar'

to make sure the string NULL and blank chars are not deleted when they appear as part of a literal string.

Jansenism answered 15/8, 2015 at 17:26 Comment(1)
FPAT may be of interest as well: #45421035 Just noticed you both asked and answered that which is awesome.Brockie
S
6

Try this with GNU grep and sed:

grep -oP '\(\K[^)]*(?=\);)' file | sed 's/NULL//g;s/ //g'

Output from all four lines:

1,'asd',923123123,'zx'
1,,923123123,'zxz'
3,'asd3',923123123,
1,'asd',923123123,'zx'

or only with GNU sed:

sed 's/.*(\([^)]*\));/\1/;s/NULL//g;s/ //g' file

Output from all four lines:

1,'asd',923123123,'zx'
1,,923123123,'zxz'
3,'asd3',923123123,
1,'asd',923123123,'zx'
Shashaban answered 15/8, 2015 at 15:52 Comment(1)
Neither of those check for the word NULL being a field of it's own rather than part of a field so it will delete NULL within literal strings, and also it deletes all spaces including those within literal strings.Jansenism
C
2
awk -F'[()]' -v OFS=, '{gsub(/NULL|;/,"")}{gsub(/, /,",")}{print $(NF -1)}' file
1,'asd',923123123,'zx'
1,,923123123,'zxz'
3,'asd3',923123123,
Centrist answered 16/8, 2015 at 22:28 Comment(0)
B
0

This Python helper: https://github.com/jamesmishra/mysqldump-to-csv may be of interest. It may be a bit hacky, and it is a bit encoding broken, but it is a start, and handle extremely long INSERT lines as produced by mysqldump without blowing up.

https://mcmap.net/q/209527/-dump-all-tables-in-csv-format-using-39-mysqldump-39 mentions it, sample usage:

wget https://raw.githubusercontent.com/jamesmishra/mysqldump-to-csv/master/mysqldump_to_csv.py
mysqldump database table | python mysqldump_to_csv.py > table.csv
Brockie answered 10/10, 2023 at 16:38 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.