How to fetch the filename from CSV and insert it into one of columns of a table
Asked Answered
A

2

7

Suppose mt table name is Table_1

Structure of the table is something like:

EMP_id, Emp_Name, File_Name_Input_Date (columns)

I am loading Table_1 from a CSV file. The name of this CSV file is daily_01122014.csv

I am using vertical copy command to load this table:

ex copy Table_1 from '/location/daily_01122014.csv'delimiter ',';

It will load the table, but column name File_Name_Input_Date will be left blank, and I have to hard code it.

What I want is to write a shell script which will dynamically read file name and load it into column File_Name, and also extract date from file name.

i.e: If file name is daily_01122014.csv, then while loading Table_1 column File_Name should be daily_01122014.csv and Input_Date should be 2014-12-01. Since I have 50-60 files, I can not do it manually.

Assiduous answered 24/4, 2015 at 12:0 Comment(1)
solutions for your request were given !Cavitation
C
2

What you can do is adding this string to the file before you load the data to Vertica See here an example:

[dbadmin@DCXLG001 tmp]$ cat file001.csv
1,Adrian
[dbadmin@DCXLG001 tmp]$ ls -1 file*.csv|xargs -I% sed -i 's/$/,%/' %
[dbadmin@DCXLG001 tmp]$ cat file001.csv
1,Adrian,file001.csv

- where i have a file called file*.csv and i add the name of the file to the end of each line in the actual file. You can do this to any more then one file. I would use the operation system resources to do this transformation then trying to create a filler using Vetica's copy command

Cavitation answered 24/4, 2015 at 14:13 Comment(0)
T
2

You can just use CURRENT_LOAD_SOURCE() inside your COPY:

COPY t (c1, c2, c3 AS CURRENT_LOAD_SOURCE()) 
FROM '/home/load_file_1' ON exampledb_node02, '/home/load_file_2' ON exampledb_node03 
DELIMITER ',';

Alternatively, you can use a shell script:

#!/bin/bash
for INPUT_FILE in /path/to/file*.dat; do
vsql -c "copy test_file_name_insert
(
input_file_name as '$INPUT_FILE',
id,
name
)
from '/home/test_input_file.txt'
delimiter '|' ;"
done
Tradition answered 24/4, 2015 at 14:35 Comment(4)
Sounds like he also wants to parse out the date. Maybe something like TO_DATE(SPLIT_PART(SPLIT_PART(CURRENT_LOAD_SOURCE(),'_',2),'.',1),'DDMMYYYY')) would work. Not sure, he references one column in part of his question and then two later.Selfrighteous
@Selfrighteous I think you're out of line sir.Tradition
It's not easy being green, is it, @Tradition ?Selfrighteous
@Selfrighteous not if you're shopping on woot all dayTradition

© 2022 - 2024 — McMap. All rights reserved.