Data not getting loaded into Partitioned Table in Hive

7

12

I am trying to create a partitioned table so that I can update a value.

This is my sample data

1,Anne,Admin,50000,A
2,Gokul,Admin,50000,B
3,Janet,Sales,60000,A

I want to update Janet's Department to B.

To do that, I created a table partitioned by Department:

create external table trail (
    EmployeeID Int,
    FirstName String,
    Designation String,
    Salary Int
)
PARTITIONED BY (Department String)
row format delimited fields terminated by ','
location '/user/sreeveni/HIVE';

But after running the command above, no data is loaded into the trail table:

hive>select * from trail;                               
OK
Time taken: 0.193 seconds

hive>desc trail;                                        
OK
employeeid              int                     None                
firstname               string                  None                
designation             string                  None                
salary                  int                     None                
department              string                  None                

# Partition Information      
# col_name              data_type               comment             

department              string                  None   

Am I doing anything wrong?

UPDATE

As suggested, I tried to load data into my table:

load data inpath '/user/aibladmin/HIVE' overwrite into table trail Partition(Department);

But it is showing

FAILED: SemanticException [Error 10096]: Dynamic partition strict mode requires at least one static partition column. To turn this off set hive.exec.dynamic.partition.mode=nonstrict

Setting hive.exec.dynamic.partition.mode=nonstrict did not fix it either.

Is there anything else I need to do?

Billiot answered 18/9, 2014 at 6:46 Comment(0)
25

Try setting both of the properties below:

SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

And when writing an INSERT statement for a partitioned table, make sure the partition columns come last in the SELECT clause.
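For the trail table from the question, that two-step workflow might look roughly like this (the staging table name trail_stage is an assumption, not from the original post):

```sql
-- 1. Load the raw file into a non-partitioned staging table first.
CREATE TABLE trail_stage (
    EmployeeID Int, FirstName String, Designation String,
    Salary Int, Department String
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

LOAD DATA INPATH '/user/aibladmin/HIVE' OVERWRITE INTO TABLE trail_stage;

-- 2. Dynamic-partition insert: Department comes last in the SELECT.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

INSERT OVERWRITE TABLE trail PARTITION (Department)
SELECT EmployeeID, FirstName, Designation, Salary, Department
FROM trail_stage;
```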

Spermaceti answered 15/4, 2015 at 9:18 Comment(1)
I'm trying an INSERT ... SELECT into a partitioned table. The data is only 13k records. With the settings above the query runs, but I get an out-of-memory error. For fairly small data it should not run out of memory.Wollis
4

You cannot load an HDFS file directly into a partitioned Hive table without naming the partition. First create a normal (non-partitioned) table, load the file into it, and then insert that table's data into the partitioned table.

set hive.exec.dynamic.partition.mode=strict means that whenever you populate a partitioned Hive table, the statement must include at least one static partition column.

set hive.exec.dynamic.partition.mode=nonstrict means no static partition column is required; Hive can derive every partition value dynamically.
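Using the trail table from the question, the difference looks roughly like this (the source table name staging is an assumption for illustration):

```sql
-- Static partition: the Department value 'B' is fixed in the statement,
-- so this is allowed even in strict mode.
INSERT INTO TABLE trail PARTITION (Department = 'B')
SELECT EmployeeID, FirstName, Designation, Salary FROM staging;

-- Dynamic partition: Hive takes Department from the last SELECT column.
-- Strict mode rejects this form unless another partition column is static.
INSERT INTO TABLE trail PARTITION (Department)
SELECT EmployeeID, FirstName, Designation, Salary, Department FROM staging;
```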

Hapsburg answered 9/3, 2017 at 7:17 Comment(1)
How do you create a static partition column?Christmann
3

Try the following:

Start by creating the table:

create external table test23 (
    EmployeeID Int,
    FirstName String,
    Designation String,
    Salary Int
)
PARTITIONED BY (Department String)
row format delimited fields terminated by ','
location '/user/rocky/HIVE';

Create a directory in HDFS named after the partition. Note that the partition column is Department, whose values in the sample data are A and B (50000 is a Salary, not a Department):

$ hadoop fs -mkdir /user/rocky/HIVE/department=A

Create a local file abc.txt holding only the records whose Department is A. The partition column itself can be left out of the file, because Hive derives it from the directory name:

$ cat abc.txt 
1,Anne,Admin,50000
3,Janet,Sales,60000

Put it into HDFS:

$ hadoop fs -put /home/yarn/abc.txt /user/rocky/HIVE/department=A

Now register the partition with the table:

ALTER TABLE test23 ADD PARTITION(department='A');

And check the result:

select * from test23;
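Alternatively, instead of the hadoop fs -put plus ALTER TABLE steps, a LOAD DATA with a static partition value does both in one statement (a sketch, assuming a local file abc.txt that holds only that partition's rows, with partition value A taken from the sample data):

```sql
-- Moves the file into the partition's directory and registers the
-- partition in the metastore in one step.
LOAD DATA LOCAL INPATH '/home/yarn/abc.txt'
OVERWRITE INTO TABLE test23 PARTITION (department = 'A');
```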
Serpens answered 18/9, 2014 at 10:57 Comment(0)
2

Just set those two properties before you call getOrCreate() on the Spark session:

SparkSession
    .builder
    .config(new SparkConf())
    .appName(appName)
    .enableHiveSupport()
    .config("hive.exec.dynamic.partition","true")
    .config("hive.exec.dynamic.partition.mode", "nonstrict")
    .getOrCreate()
Contented answered 3/11, 2021 at 15:55 Comment(0)
1

Can you try running MSCK REPAIR TABLE table_name;? It scans the table's HDFS location and registers any partition directories that are missing from the metastore.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-RecoverPartitions(MSCKREPAIRTABLE)
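For the table in the question, that would look like this (assuming the partition directories already exist under the table's HDFS location):

```sql
-- Adds any partition directories found on HDFS but missing
-- from the metastore.
MSCK REPAIR TABLE trail;

-- Verify what was recovered.
SHOW PARTITIONS trail;
```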

Calysta answered 3/1, 2020 at 8:33 Comment(0)
0

I ran into the same problem, and yes, these two properties are needed. However, I used the JDBC driver from Scala to set them before executing the Hive statements. The problem was that I was sending a whole batch of SET statements in a single execute() call, like this:

conn = DriverManager.getConnection(conf.get[String]("hive.jdbc.url"))
conn.createStatement().execute(
  "SET spark.executor.memory = 2G; " +
  "SET hive.exec.dynamic.partition.mode = nonstrict; " +
  "SET hive.other.statements = blabla;")

For some reason, the driver was not able to interpret all these as separate statements, so I needed to execute each one of them separately.

conn = DriverManager.getConnection(conf.get[String]("hive.jdbc.url"))
conn.createStatement().execute("SET spark.executor.memory = 2G;")
conn.createStatement().execute("SET hive.exec.dynamic.partition.mode = nonstrict;")
conn.createStatement().execute("SET hive.other.statements = blabla;")
Khartoum answered 15/11, 2019 at 13:38 Comment(0)
0

Check that your source actually supplies values for the partitioning column when you rely on dynamic partitioning, and double-check the spelling of the partition column names.

Delmerdelmor answered 19/7 at 4:50 Comment(1)
As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.Forewing

© 2022 - 2024 — McMap. All rights reserved.