apache-arrow Questions

2

I use toPandas() on a DataFrame which is not very large, but I get the following exception: 18/10/31 19:13:19 ERROR Executor: Exception in task 127.2 in stage 13.0 (TID 2264) org.apache.spark.api....
Checkered asked 31/10, 2018 at 11:51

2

Solved

I have a somewhat large (~20 GB) partitioned dataset in parquet format. I would like to read specific partitions from the dataset using pyarrow. I thought I could accomplish this with pyarrow.parqu...
Unseasonable asked 28/12, 2017 at 5:29

1

Solved

I'm trying to return a specific structure from a pandas_udf. It worked on one cluster but fails on another. I'm trying to run a UDF on groups, which requires the return type to be a DataFrame. from py...
Phoenix asked 26/3, 2018 at 11:10

0

I'm currently writing some code to convert an arbitrary data structure to Apache Arrow vectors and got stuck on something relatively simple, namely, how to write a byte[] to a ListVector. When wri...
Macassar asked 30/10, 2017 at 8:3

1

Solved

I'm currently playing with Apache Arrow's Java API (though I use it from Scala for the code samples) to get some familiarity with this tool. As an exercise, I chose to load a CSV file into arrow v...
Saundrasaunter asked 23/10, 2017 at 9:53
