How to convert a column in H2OFrame to a python list?
I've read the PythonBooklet.pdf by H2O.ai and the python API documentation, but still can't find a clean way to do this. I know I can do either of the following:

  • Convert H2OFrame to Spark DataFrame and do a flatMap + collect or collect + list comprehension.
  • Use H2O's get_frame_data, which gives me a string of header and data separated by \n; then convert it to a list (a numeric list in my case).
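For reference, the second option can be sketched roughly as below. This is only a minimal illustration: the sample string is made up, it assumes a single-column frame whose text is a header line followed by one numeric value per line, and column_from_frame_data is a hypothetical helper name, not part of H2O.

```python
import csv
import io


def column_from_frame_data(frame_data):
    """Parse 'header\\nvalue\\nvalue...' text for one column into a list of floats."""
    rows = csv.reader(io.StringIO(frame_data))
    next(rows, None)  # skip the header line
    # Ignore blank lines and empty cells, convert everything else to float
    return [float(row[0]) for row in rows if row and row[0] != ""]


# Made-up sample standing in for the string described above
sample = '"sepal_len"\n5.1\n4.9\n4.7\n'
values = column_from_frame_data(sample)
```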

Is there a better way to do this? Thank you.

Toh answered 3/4, 2017 at 16:5 Comment(0)
M
9

You can try something like this: bring an H2OFrame into python as a pandas dataframe by calling .as_data_frame(), then call .tolist() on the column of interest.

A self-contained example with the iris dataset:

import h2o

h2o.init()
df = h2o.import_file("iris_wheader.csv")
pandas_df = df.as_data_frame()  # pull the H2OFrame into a pandas DataFrame
pandas_df['sepal_len'].tolist()  # column values as a plain Python list
Megargee answered 3/4, 2017 at 16:25 Comment(1)
Thanks! It's certainly a better solution than the other two. - Toh
J
2

You can (1) convert the H2OFrame to a pandas DataFrame and (2) convert the pandas column to a list, as follows:

pandas_df = h2o.as_list(h2oFrame)  # a pandas DataFrame when use_pandas=True (the default)
col_list = pandas_df["column"].tolist()
Jessen answered 1/2, 2019 at 21:57 Comment(1)
Thanks, it helped me; the second line did the thing for me. - Swithin
O
0

With use_pandas=False, H2O's as_list method returns a list of lists whose first row is the column name, so you need to drop that header row and flatten the rest after extracting the column, as shown below:

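# Returns a list of rows; the first row holds the column name
column_as_list_of_lists = h2o.as_list(h2oFrame[:, '<col_name>'], use_pandas=False)
# Skip the header row ([1:]) and flatten the remaining single-item rows
flat_list = [item for sublist in column_as_list_of_lists[1:] for item in sublist]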
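To see the flattening step in isolation, here is a small sketch that runs without an H2O cluster. The nested list is made-up sample data standing in for the as_list result, and it assumes the values come back as strings, so they are converted to float; drop that conversion if your values are already numeric.

```python
from itertools import chain

# Made-up stand-in for the list of lists: header row first, then one row per value
column_as_list_of_lists = [["sepal_len"], ["5.1"], ["4.9"], ["4.7"]]

# Split off the header row, keep every data row (including the last one)
header, *rows = column_as_list_of_lists

# Flatten the single-item rows and convert the string values to floats
flat_list = [float(value) for value in chain.from_iterable(rows)]
```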
Overscore answered 28/11, 2023 at 12:44 Comment(0)
