hive 0.13 msck repair table only lists partitions not in metastore
Asked Answered
I

7

5

I'm trying to use Hive(0.13) msck repair table command to recover partitions and it only lists the partitions not added to metastore instead of adding them to metastore as well.

here's the ouput of the command

partitions not in metastore externalexample:CreatedAt=26 04%3A50%3A56 UTC 2014/profileLocation="Chicago"

here's how I'm creating the external table

CREATE EXTERNAL TABLE IF NOT EXISTS ExternalExample(
        tweetId BIGINT, username STRING,
        txt STRING, CreatedAt STRING,
        profileLocation STRING,
        favc BIGINT,retweet STRING,retcount BIGINT,followerscount BIGINT)
    COMMENT 'This is the Twitter streaming data'
    ROW FORMAT DELIMITED
    FIELDS TERMINATED BY '\t'
    STORED AS TEXTFILE
    location '/user/hue/exttable/';

Am I missing something?

Isidore answered 5/11, 2014 at 9:30 Comment(0)
L
10

I had a similar issue with the MSCK REPAIR TABLE listing the partitions that were not in the metastore but not actually adding them (and no error message).

I tried manually adding the partition with the ALTER TABLE ADD PARTITION command, and this gave me an error message, leading me to the root cause which was that the HDFS folder containing the 'missing' partition had been set up with incorrect permissions.

Once the permissions issue was resolved, then the MSCK REPAIR TABLE command worked correctly.

If you encounter this issue, it may be worthwhile to try adding it manually with the ALTER TABLE ADD PARTITION command. It may produce a useful error message that would help you determine the root cause of the problem.

Longboat answered 12/5, 2016 at 18:3 Comment(0)
O
6

Please make sure that the name of the partitions defined in your table definition match the name of the partition on hdfs.

For example, in your table creation example, I see that you haven't defined any paritions at all.

I think you want to do something like this (note the use of PARTITIONED BY):

create external table ExternalExample(tweetId BIGINT, username STRING, txt STRING,favc BIGINT,retweet STRING,retcount BIGINT,followerscount BIGINT) PARTITIONED BY (CreatedAt STRING, profileLocation STRING) COMMENT 'This is the Twitter streaming data' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE location '/user/hue/exttable/';

Then on hdfs you should have the following folder structure:

/user/hue/exttable/CreatedAt=<someString>/profileLocation=<someString>/your-data-file
Outdated answered 18/11, 2014 at 20:41 Comment(1)
CreatedAt and profileLocation in the SQL are case-insensitive but they have to be all in lowercase in the file paths.Blanchblancha
H
5

The partition names for MSCK REPAIR TABLE ExternalTable should be in lowercase then only it will add it to hive metastore, I faced the similar issue in hive 1.2.1 where there was no support for ALTER TABLE ExternalTable RECOVER PARTITION, but after spending some time debugging found the issue that the partition names should be in lowercase i.e /some_external_path/mypartion=01 is valid and /some_external_path/myParition=01 is invalid;

Make your profileLocation to profilelocation or profile_location and test it should work.

My question is here Not able to recover partitions through alter table in Hive 1.2

Hambletonian answered 26/9, 2016 at 13:26 Comment(0)
T
0

Hive stores a list of partitions for each table in its metastore. If, however, new partitions are directly added to HDFS (manually by hadoop fs -put command), the metastore will not be aware of these partitions. you need to add partition

ALTER TABLE ExternalExample ADD PARTITION

for every partition or in short you can run

MSCK REPAIR TABLE ExternalExample; It will add any partitions that exist on HDFS but not in metastore to the metastore.

Ref https://issues.apache.org/jira/browse/HIVE-874

Teletypesetter answered 5/11, 2014 at 16:22 Comment(1)
Thanks, I understand that. but the problem I'm facing when I run MSCK REPAIR TABLE ExternalExample; is that it is not adding partitions to the metastore instead it just gives out partitions not in metastore. Is it clear now?Isidore
T
0

1) You need to specify partitions 2) Partition names must have all lower case letters . See this - https://singhanuvrat.com/hive-partition-column-name-camelcase-bad-idea-b89796d4e741#.16d7uqfot

Thay answered 30/8, 2016 at 17:57 Comment(0)
U
-1

you might not be running as the hive user:

sudo -u hive** hive -e "set hive.msck.path.validation=ignore;msck repair table T1"

set hive.msck.path.validation=ignore; ( this is for tables with large number of partitions.)

Usm answered 3/7, 2018 at 13:52 Comment(0)
A
-2

You are just missing the PARTITIONED BY (CreatedAt STRING, profileLocation STRING).

Aircrewman answered 14/10, 2015 at 8:1 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.