How to use Impala to read Hive view containing complex types?
Asked Answered
I

0

6

I have some data that is processed and model based on case classes, and the classes can also have other case classes in them, so the final table has complex data, struct, array. Using the case class I save the data in hive using dataframe.saveAsTextFile(path).

This data sometimes changes or needs to have a different model, so for each iteration I use a suffix in the table name (some_data_v01, some_data_v03, etc.).

I also have queries that are run on a schedule on these tables, using Impala, so in order to not modify the query each time I save a a new table, I wanted to use a view that is always updated whenever I change the model.

The problem with that is I can't use Impala to create the view, because of the complex nature of the data in the tables (nested complex types). Apart from being a lot of work to expand the complex types, I want these types to be preserved (lots of level of nesting, duplication of data when joining arrays).

One solution was to create the view using Hive, like this

create view some_data as select * from some_data_v01;

But if I do this, when I want to use the table from Impala,

select * from some_data;

or even something simple, like

select some_value_not_nested, struct_type.some_int, struct_type.some_other_int from some_data;

the error is the following:

AnalysisException: Expr 'some_data_v01.struct_type' in select list returns a complex type 'STRUCT< some_int:INT, some_other_int:INT, nested_struct:STRUCT< nested_int:INT, nested_other_int:INT>, last_int:INT>'. Only scalar types are allowed in the select list.

Is there any way to access this view, or create it in some other way for it to work?

Insufferable answered 26/3, 2019 at 17:33 Comment(3)
I would recommend add the create table statement and Ideally some test sample data. The issue is essentially because the sintax is different between hive and impalaDobby
Have you checked the file formats supported by Impala for complex types? cloudera.com/documentation/enterprise/latest/topics/…Brosy
Hello. Yes, the files are parquet, so no problem there. Also, the original table can be read from both Hive and Impala, only the hive created view cannot be read from impala.Insufferable

© 2022 - 2024 — McMap. All rights reserved.