You can run complex SQL queries from data in a CSV file, if you have Java installed in your OS (that's pretty common) and by combining Ant (scripting) and H2 (in-memory database).
For example, if you have the file my_file.csv
as:
"name", "sex", "age", "height", "weight"
"Alex", "M", 41, 74, 170
"Bert", "M", 42, 66, 166
"Dave", "M", 39, 72, 167
"Elly", "F", 30, 70, 124
"Luke", "M", 34, 72, 163
"Omar", "M", 38, 69, 145
"Page", "F", 31, 67, 135
"Ruth", "F", 28, 65, 131
Then you can create the script1.xml
Ant script:
<project default="sql">
<target name="sql">
<sql driver="org.h2.Driver" url="jdbc:h2:mem:" userid="sa" password=""
classpath="./h2-2.1.214.jar" onerror="abort" print="true"
showheaders="true" showtrailers="true" showWarnings="true" >
create table my_data as select * from csvread('my_file.csv');
select *
from my_data a
join (
select * from my_data where sex = 'M'
) b on b.height < a.height
where a.sex = 'F'
</sql>
</target>
</project>
Download and unzip Apache Ant 1.10.14 (or other version) under the folder apache-ant-1.10.14/
. Then download the H2 database jar file in the folder. Your dir should look like:
apache-ant-1.10.14/
h2-2.1.214.jar
my_file.csv
script1.xml
Then, just run it:
apache-ant-1.10.14/bin/ant -f script1.xml
The first SQL statement loads the CSV data in memory. The second one is your "complex" query:
[sql] Executing commands
[sql] NAME,SEX,AGE,HEIGHT,WEIGHT,NAME,SEX,AGE,HEIGHT,WEIGHT
[sql] Elly,F,30,70,124,Bert,M,42,66,166
[sql] Elly,F,30,70,124,Omar,M,38,69,145
[sql] Page,F,31,67,135,Bert,M,42,66,166
[sql]
[sql] 2 of 2 SQL statements executed successfully
BUILD SUCCESSFUL
Total time: 0 seconds
That's it!
Notes:
For a variable query string, you could externalize it in a file and then use <transaction src="command.sql" />
instead of the embedded SQL I included. You can use multiple tags as needed.
For a variable CSV file name (or any parameter value) you can add -Dcsvfile=my_file.csv
in the command line and then use it in the script as ${csvfile}
. You can define many parameters, as needed.
H2 by definition is not a system-level installed database, but and embeded one. Nevertheless, you can place the JAR file anywhere you want, and then use <classpath>
tags to refer to the system classpath, instead of the simpler classpath
attrubute in this example. See Ant -> Ant Tasks -> List of Tasks -> Sql and
<classpath>
for details.
As a database, H2 is quite powerful. I would qualify its SQL support to medium-high, so it implements most of the common SQL syntax and features you would find in expensive databases.
Note that the SQL query is embedded in the script using XML format. That's why the <
symbol is escaped as <
. If you don't like this, you can place the query in an external file (no escaping necessary) using the <transaction>
tag.
You can also output the result as CSV or other format for further processing or to be consumed by other tools.
create table my_data as select * from csvread('my_file.csv');
. Then, you can run any complex query in the tablemy_data
. – Kishke