How do I access an HBase table in Hive and vice versa?

As a developer, I've created an HBase table for our project by importing data from an existing MySQL table using a Sqoop job. The problem is that our data analyst team is familiar with MySQL syntax, which means they can query a Hive table easily. For them, I need to expose the HBase table in Hive. I don't want to duplicate the data by populating it again in Hive. Also, duplicating data might cause consistency issues in the future.

Can I expose an HBase table in Hive without duplicating data? If yes, how do I do it? Also, if I insert, update, or delete data in my HBase table, will the updated data appear in Hive without any issues?

Sometimes our data analyst team creates tables and populates data in Hive. Can I expose those tables to HBase? If yes, how?

Accuse answered 8/5, 2015 at 15:7 Comment(0)

HBase-Hive Integration:

Creating an external table in Hive over an HBase table lets you query the HBase data from Hive without duplicating it. Hive reads from HBase at query time, so when you update or delete data in the HBase table, the changes are visible in Hive too.

Example:

Suppose you have an HBase table named hbasetable with the column families id, name and email.

Sample external table command for hive:

CREATE EXTERNAL TABLE hivehbasetable(key INT, id INT, username STRING, password STRING, email STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,id:id,name:username,name:password,email:email")
TBLPROPERTIES ("hbase.table.name" = "hbasetable");
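
Once the external table exists, analysts can query it like any other Hive table. The following is a minimal sketch, assuming the hivehbasetable definition above; the key value 42 is a hypothetical placeholder, not something from the original data.

```sql
-- Read the HBase-backed rows through Hive; no data is copied into Hive.
-- The row key 42 is a made-up example value.
SELECT key, username, email
FROM hivehbasetable
WHERE key = 42;

-- Aggregations run as ordinary Hive queries over the HBase data.
SELECT COUNT(*) AS total_rows
FROM hivehbasetable;
```

Because Hive scans HBase when the query runs, rows written to HBase after the table was created should show up in later queries without recreating the Hive table.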

For more information on Hive-Hbase integration look here

Vibrate answered 8/5, 2015 at 16:28 Comment(4)
What about the reverse? I have a Hive table and want to expose it to HBase. - Accuse
@Accuse As long as you write to a Hive table stored by the HBaseStorageHandler, the table and the data it contains will be stored in HBase. Your team can create as many HBase-stored tables (external or not) as they like in Hive and INSERT ... SELECT ... into them; the data will be available in HBase as soon as the query finishes. Try it. - Canossa
Hi, this works fine and I'm able to create the Hive table over HBase. Now I want to insert records into the table on the fly, but each insert takes a long time, around 45 seconds. By comparison, inserting records into a similar native Hive table takes about 30 seconds. I expected HBase to improve performance, but it is the other way around. Is there any way to insert data into Hive in 2-3 seconds? - Agateware
Thanks to your answer I got this working on MapR. I presume there is no support for timestamps? So we can only ever get the most recent version in Hive? - Publication
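
The reverse direction described in the comments (Hive data exposed to HBase) could look roughly like the sketch below. The table names analyst_users and analyst_users_hbase, and the choice of column families, are hypothetical and only for illustration. Depending on the Hive version and configuration, the HBase-backed table may have to be declared with CREATE EXTERNAL TABLE instead.

```sql
-- Hypothetical native Hive table that the analysts already populate.
CREATE TABLE IF NOT EXISTS analyst_users (id INT, username STRING, email STRING);

-- Hive table stored in HBase through the HBase storage handler.
-- "analyst_users_hbase" and the families "name" and "email" are assumed names.
CREATE TABLE analyst_users_hbase (key INT, username STRING, email STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,name:username,email:email")
TBLPROPERTIES ("hbase.table.name" = "analyst_users_hbase");

-- Copy the rows; after the query finishes they can be read from HBase directly.
INSERT INTO TABLE analyst_users_hbase
SELECT id, username, email
FROM analyst_users;
```

Each INSERT ... SELECT runs as a Hive job, so it suits batch loads better than single-row inserts that need to complete within seconds.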

Using Apache Phoenix

One quick solution would be to use the Apache Phoenix layer over your HBase tables. Apache Phoenix is a SQL interface that lets you run OLTP-style SQL queries against HBase. It does not copy the data; instead, it exposes a view of the data already present in HBase that you can query with SQL.
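
As a rough sketch of this approach, a Phoenix view can be mapped onto an existing HBase table. This assumes the table hbasetable with the column families and qualifiers from the Hive example in the other answer, and that all values are stored as string bytes (VARCHAR); if the stored bytes use another encoding, the declared types would need to change. The column name pk is a hypothetical label for the HBase row key.

```sql
-- Map a read-only Phoenix view onto the existing HBase table.
-- Quoted identifiers preserve the lowercase names used in HBase.
CREATE VIEW "hbasetable" (
    pk VARCHAR PRIMARY KEY,        -- the HBase row key
    "name"."username" VARCHAR,     -- column family "name", qualifier "username"
    "email"."email" VARCHAR        -- column family "email", qualifier "email"
);

-- Query the HBase data with ordinary SQL.
SELECT pk, "username", "email"
FROM "hbasetable"
LIMIT 10;
```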

Refer to these links for further details:

Nakano answered 3/2, 2020 at 22:50 Comment(0)
