Python Apache Beam Dataflow worker-startup error: Failed to install packages: failed to install SDK: exit status 2

In my Apache Beam Python Dataflow job, before I see this error:

RuntimeError: IOError: [Errno 2] No such file or directory:
'/beam-temp-andrew_mini_vocab-..../......andrew_mini_vocab' [while running .....]

I see this error logged:

A setup error was detected in __. Please refer to the worker-startup
log for detailed information.

I've found the worker-startup logs, and the payload error is:

Failed to install packages: failed to install SDK: exit status 2

The error is not specific enough for me to debug. Any insight into what SDK isn't getting loaded? My imports for the job are extremely basic:

from __future__ import absolute_import
from __future__ import division
import argparse
import logging
import re
import apache_beam as beam
from apache_beam.io import WriteToText
from apache_beam.options.pipeline_options import PipelineOptions
from apache_beam.options.pipeline_options import SetupOptions
from apache_beam.pvalue import AsDict
Hombre answered 12/12, 2017 at 18:0 Comment(3)
How did you install Beam? Just via pip? – Cooley
I encountered exit status 2 from a subprocess; it was resolved by updating pip to a version greater than 7. The reason is that the --no-binary option, which the Apache Beam Python SDK uses, wasn't supported before that version. – Prolongate
@Andrew Cassidy - did Anuj's answer help? – Cooley

Check your pip version with pip -V, and try updating it if it is old.
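In case it helps, here is a rough sketch of that check in Python. It assumes the threshold from the comment on the question (pip releases before 7 lack --no-binary); the names MIN_PIP_MAJOR, parse_pip_major, pip_supports_no_binary and read_local_pip_version are made up for this example and are not part of pip or Beam.

```python
import re
import subprocess
import sys

# Assumption taken from the comment on the question:
# pip releases before 7 do not support the --no-binary option.
MIN_PIP_MAJOR = 7


def parse_pip_major(version_output):
    """Return the major version from output such as 'pip 9.0.1 from ...'."""
    match = re.match(r"pip\s+(\d+)", version_output.strip())
    if match is None:
        raise ValueError("unrecognized pip version output: %r" % (version_output,))
    return int(match.group(1))


def pip_supports_no_binary(version_output):
    """True if the reported pip major version is at least MIN_PIP_MAJOR."""
    return parse_pip_major(version_output) >= MIN_PIP_MAJOR


def read_local_pip_version():
    """Ask the pip that belongs to the current interpreter for its version."""
    raw = subprocess.check_output([sys.executable, "-m", "pip", "--version"])
    return raw.decode("utf-8")
```

Calling pip_supports_no_binary(read_local_pip_version()) on the machine that launches the job tells you whether the local pip is affected. If it returns False, python -m pip install --upgrade pip is the usual way to update.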

Please comment on the question if this does not help : )

Cooley answered 12/3, 2019 at 17:41 Comment(0)

Can you share your setup.py file? I had a similar problem and solved it using a setup.py file.

Refine answered 6/9, 2021 at 8:1 Comment(1)
Please provide additional details in your answer. As it's currently written, it's hard to understand your solution. – Sutherland
