Dataflow with Python flex template - launcher timeout

I'm trying to run my Python Dataflow job with a flex template. The job works fine locally when I run it with the direct runner (without the flex template), but when I try to run it with the flex template, the job gets stuck in the "Queued" status for a while and then fails with a timeout.

Here are some of the logs I found in the GCE console:

INFO:apache_beam.runners.portability.stager:Executing command: ['/usr/local/bin/python', '-m', 'pip', 'download', '--dest', '/tmp/dataflow-requirements-cache', '-r', '/dataflow/template/requirements.txt', '--exists-action', 'i', '--no-binary', ':all:']

Shutting down the GCE instance, launcher-202011121540156428385273524285797, used for launching.

Timeout in polling result file: gs://my_bucket/staging/template_launches/2020-11-12_15_40_15-6428385273524285797/operation_result.
Possible causes are:
1. Your launch takes too long time to finish. Please check the logs on stackdriver.
2. Service [email protected] may not have enough permissions to pull container image gcr.io/indigo-computer-272415/samples/dataflow/streaming-beam-py:latest or create new objects in gs://my_bucket/staging/template_launches/2020-11-12_15_40_15-6428385273524285797/operation_result.
3. Transient errors occurred, please try again.

For 1, I see no useful logs. For 2, the service account is the default service account, so it should have all the permissions.

How can I debug this further?
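
One thing I can try while the launcher VM still exists is to read its serial console output; the instance name comes from the log above, and the zone is a placeholder for wherever the launcher was created:

gcloud compute instances get-serial-port-output \
    launcher-202011121540156428385273524285797 \
    --zone <zone-of-the-launcher-vm>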

Here is my Docker file:

FROM gcr.io/dataflow-templates-base/python3-template-launcher-base

ARG WORKDIR=/dataflow/template
RUN mkdir -p ${WORKDIR}
WORKDIR ${WORKDIR}

ADD localdeps localdeps
COPY requirements.txt .
COPY main.py .
COPY setup.py .
COPY bq_field_pb2.py .
COPY bq_table_pb2.py .
COPY core_pb2.py .

ENV FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE="${WORKDIR}/requirements.txt"
ENV FLEX_TEMPLATE_PYTHON_PY_FILE="${WORKDIR}/main.py"
ENV FLEX_TEMPLATE_PYTHON_SETUP_FILE="${WORKDIR}/setup.py"

RUN pip install -U --no-cache-dir -r ./requirements.txt

I'm following this guide - https://cloud.google.com/dataflow/docs/guides/templates/using-flex-templates
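
For reference, I build and run the template with the gcloud commands from that guide, roughly like this (the template spec path, job name, and region are placeholders for my actual values; the image is the one named in the error above):

gcloud dataflow flex-template build gs://my_bucket/templates/streaming-beam-py.json \
    --image "gcr.io/indigo-computer-272415/samples/dataflow/streaming-beam-py:latest" \
    --sdk-language "PYTHON" \
    --metadata-file "metadata.json"

gcloud dataflow flex-template run "streaming-beam-py-test" \
    --template-file-gcs-location gs://my_bucket/templates/streaming-beam-py.json \
    --region us-central1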

Kippy answered 13/11, 2020 at 0:14 Comment(4)
When you run the template, can you explicitly specify a service account that has the rights to access the buckets, via --parameters service_account_email="[email protected]"? - Telecast
It didn't help. In fact, I was able to run the GitHub example project from the document above without issue. - Kippy
The guide that you linked to makes no mention of FLEX_TEMPLATE_PYTHON_SETUP_FILE (admittedly, today is more than two months after you posted your message above, and this flex template stuff seems to be changing rapidly at the moment). Do you know of other documentation that explains FLEX_TEMPLATE_PYTHON_SETUP_FILE? I cannot find any. - Gripsack
The documentation is pretty bad :/ I think I found it in the sample repo. Here is an explanation of the field: github.com/GoogleCloudPlatform/python-docs-samples/issues/… - Kippy

A possible cause of this issue can be found in the requirements.txt file. If you try to install apache-beam via the requirements file, the flex template will experience exactly the issue you are describing: jobs stay in the Queued state for some time and finally fail with "Timeout in polling result".

The reason is that such jobs are affected by this issue. It only affects flex templates; the same jobs run properly locally or with Standard Templates.

The solution is to install apache-beam separately in the Dockerfile, before the rest of the requirements:

RUN pip install -U apache-beam==<your desired version>
RUN pip install -U -r ./requirements.txt
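
Spliced into the Dockerfile from the question, that would look roughly like this (the pinned version and the [gcp] extra are illustrative; use whatever your pipeline actually targets, and the assumption is that apache-beam is removed from requirements.txt):

# Install the Beam SDK on its own; apache-beam itself no longer appears in requirements.txt.
RUN pip install -U --no-cache-dir apache-beam[gcp]==2.25.0
# Then install the remaining dependencies.
RUN pip install -U --no-cache-dir -r ./requirements.txt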
Bergson answered 16/11, 2020 at 9:11 Comment(1)
I experienced a similar problem; my pipeline wasn't timing out, but it was taking far too long to start. This fix worked for me too. See details at #65766566. - Gripsack

Download the requirements at image build time to speed up launching the Dataflow job:

FROM gcr.io/dataflow-templates-base/python3-template-launcher-base

ARG WORKDIR=/dataflow/template
RUN mkdir -p ${WORKDIR}
WORKDIR ${WORKDIR}

COPY . .

ENV FLEX_TEMPLATE_PYTHON_PY_FILE="${WORKDIR}/main.py"
ENV FLEX_TEMPLATE_PYTHON_SETUP_FILE="${WORKDIR}/setup.py"
ENV FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE="${WORKDIR}/requirements.txt"

RUN apt-get update \
    # Upgrade pip and install the requirements.
    && pip install --no-cache-dir --upgrade pip \
    && pip install --no-cache-dir -r $FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE \
    # Download the requirements to speed up launching the Dataflow job.
    && pip download --no-cache-dir --dest /tmp/dataflow-requirements-cache -r $FLEX_TEMPLATE_PYTHON_REQUIREMENTS_FILE


# Since we already downloaded all the dependencies, there's no need to rebuild everything.
ENV PIP_NO_DEPS=True
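
Note that /tmp/dataflow-requirements-cache is exactly the directory the stager command in the question's logs downloads into, so pre-populating it at image build time moves that work from job launch to image build.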
Coattail answered 19/10, 2021 at 20:37 Comment(0)
