I have to partition a table in Hive
by a column that is also part of the table.
For example:
Table: employee
Columns: employeeId, employeeName, employeeSalary
I want to partition the table by employeeSalary, so I wrote the following query:
CREATE TABLE employee (employeeId INT, employeeName STRING, employeeSalary INT) PARTITIONED BY (ds INT);
I used the name "ds" here only because Hive didn't allow me to reuse the name employeeSalary for the partition column.
Is what I am doing correct? Also, to insert values into the table I have to use a comma-separated file. Each row of the file looks like this: 2019,John,2000
If I partition by salary, my first partition would be everyone with a salary of 2000. So the query would be:
LOAD DATA LOCAL INPATH './examples/files/kv2.txt' OVERWRITE INTO TABLE employee PARTITION (ds=2000);
Then, after the 100 entries with a salary of 2000, I have another 500 entries with a salary of 4000. So I would run the query again with the new partition value:
LOAD DATA LOCAL INPATH './examples/files/kv2.txt' OVERWRITE INTO TABLE employee PARTITION (ds=4000);
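For comparison, an alternative to running one LOAD per salary value might be Hive's dynamic partitioning, where the partition value is taken from the data itself. The following is only a minimal sketch: the staging table name employee_staging and its comma delimiter clause are assumptions for illustration and are not part of the setup above.

```sql
-- Hypothetical staging table (name assumed) that holds the raw comma-separated rows.
CREATE TABLE employee_staging (employeeId INT, employeeName STRING, employeeSalary INT)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

-- Load the whole file once, without naming any partition.
LOAD DATA LOCAL INPATH './examples/files/kv2.txt' OVERWRITE INTO TABLE employee_staging;

-- Let Hive derive partition values from the query result.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The last column of the SELECT supplies the value of the partition column ds.
INSERT OVERWRITE TABLE employee PARTITION (ds)
SELECT employeeId, employeeName, employeeSalary, employeeSalary AS ds
FROM employee_staging;
```

With this approach each distinct employeeSalary value would end up in its own ds partition, so the file would not need to be split by salary beforehand.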
Please let me know if I am right.