If I use the HBase shell and issue:
put 'test', 'rowkey1','cf:foo', 'bar'
scan 'test'
I see the result as a string, not as bytes.
If I use HappyBase and issue:
import happybase
connection = happybase.Connection('<hostname>')
table = connection.table('test')
table.put('rowkey2', {'cf:foo': 'bar'})
for row in table.scan():
    print(row)
I again see the result as a string, not as bytes.
I have data in Hive that I aggregated and wrote to HDFS via:
INSERT OVERWRITE DIRECTORY 'aggregation_test'
SELECT device_id, device_name, sum(device_cost)
FROM devices
GROUP BY device_id, device_name
ORDER BY device_id, device_name;
However, if I issue the following in Pig:
A = LOAD 'aggregation_test' USING PigStorage(',') as (device_id:chararray, device_name:chararray, device_sum:int);
STORE A INTO 'hbase://aggregation_test'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'cf:device_name, cf:device_sum');
then scans in both the HBase shell and HappyBase show the values as bytes, not as strings.
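To be concrete about what I mean by bytes versus a string, here is a minimal sketch in plain Python. It does not touch HBase, and it is not a claim about how HBaseStorage encodes anything; the value 42 is a made-up stand-in for a device_sum.

```python
import struct

value = 42  # hypothetical device_sum, for illustration only

# Text encoding: the decimal digits as UTF-8, which reads as a string in a scan
as_text = str(value).encode("utf-8")

# Binary encoding: a 4-byte big-endian integer, which shows up as escaped bytes
as_binary = struct.pack(">i", value)

print(repr(as_text))
print(repr(as_binary))
```

The first form is what I get from the shell and HappyBase puts; the second is the kind of output I would describe as "bytes".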
I can't even look up a row by its string row key.
How can I use Pig and HBaseStorage to store data from HDFS into HBase as strings, not bytes?