I'm developing an API and deploying it on Google Cloud Run.
There is a prestart Python script that imports pandas and numpy. When I time the imports, numpy takes about 2 seconds and pandas about 4 seconds on Cloud Run, as opposed to less than 0.5 seconds on my local machine.
I'm using python:3.8-alpine as the base image for my Docker container (though I have also tried several non-Alpine images).
Here is the Dockerfile:
FROM python:3.8-alpine
COPY requirements.txt ./
RUN apk add --no-cache --virtual build-deps g++ gcc gfortran make libffi-dev openssl-dev file build-base \
&& apk add --no-cache libstdc++ openblas-dev lapack-dev \
&& pip install --no-cache-dir uvicorn gunicorn fastapi \
&& CFLAGS="-g0 -Wl,--strip-all -I/usr/include:/usr/local/include -L/usr/lib:/usr/local/lib" \
pip install --no-cache-dir --compile --global-option=build_ext --global-option="-j 16" -r requirements.txt \
&& rm -r /root/.cache \
&& find /usr/local/lib/python3.*/ -name 'tests' -exec rm -r '{}' + \
&& find /usr/local/lib/python3.*/site-packages/ \( -type d -a -name test -o -name tests \) -o \( -type f -a -name '*.pyc' -o -name '*.pyo' \) -exec rm -r '{}' + \
&& find /usr/local/lib/python3.*/site-packages/ -name '*.so' -print -exec /bin/sh -c 'file "{}" | grep -q "not stripped" && strip -s "{}"' \; \
&& find /usr/lib/ -name '*.so' -print -exec /bin/sh -c 'file "{}" | grep -q "not stripped" && strip -s "{}"' \; \
&& find /usr/local/lib/ -name '*.so' -print -exec /bin/sh -c 'file "{}" | grep -q "not stripped" && strip -s "{}"' \; \
&& rm -rf /usr/local/lib/python*/ensurepip \
&& rm -rf /usr/local/lib/python*/idlelib \
&& rm -rf /usr/local/lib/python*/distutils/command \
&& rm -rf /usr/local/lib/python*/lib2to3 \
&& rm -rf /usr/local/lib/python*/__pycache__/* \
&& rm -r /requirements.txt /databases.zip \
&& rm -rf /tmp/* \
&& rm -rf /var/cache/apk/* \
&& apk del build-deps g++ gcc make libffi-dev openssl-dev file build-base
CMD ["python","script.py"]
requirements.txt:
numpy==1.2.0
pandas==1.2.1
And the Python script that gets executed, script.py:
import time
ts = time.time()
import pandas
te = time.time()
print(te-ts)
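To get the separate numpy and pandas figures quoted above, each import can be timed on its own. The following is a minimal sketch rather than the exact script I ran: the helper name `time_import` is my own (hypothetical, not part of any library), and it uses `time.perf_counter` instead of `time.time` for better resolution.

```python
import importlib
import time


def time_import(module_name):
    """Return the wall-clock seconds spent importing module_name.

    Only the first import of a module is expensive; later imports are
    served from sys.modules, so call this once per module per process.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start


if __name__ == "__main__":
    # numpy goes first: pandas imports numpy itself, so the pandas figure
    # then reflects only the additional cost on top of numpy.
    for name in ("numpy", "pandas"):
        try:
            print(f"{name}: {time_import(name):.3f}s")
        except ImportError:
            print(f"{name}: not installed")
```

For a finer breakdown, CPython 3.7+ also accepts `python -X importtime script.py`, which prints the time spent in each imported module to stderr and may show which submodules dominate.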
Are these slow imports to be expected, or is there some Python import trick I'm missing?
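The only kind of trick I can think of is deferring the import until the module is first used. Below is a sketch based on the `importlib.util.LazyLoader` recipe from the Python documentation; the helper name `lazy_import` is hypothetical, the demo uses the standard-library `json` module as a stand-in, and whether this behaves well with pandas and numpy is an assumption I have not verified. As far as I can tell it would only move the cost to the first use rather than remove it.

```python
import importlib.util
import sys


def lazy_import(name):
    """Return a module object whose code runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError(f"No module named {name!r}")
    # Wrap the real loader so that executing the module is postponed.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # sets up laziness, does not run the module yet
    return module


if __name__ == "__main__":
    json = lazy_import("json")       # cheap: the module body has not run yet
    print(json.dumps({"ok": True}))  # first attribute access triggers the real import
```

In script.py this would mean replacing `import pandas` with `pandas = lazy_import("pandas")`, so the startup path stays short and the import cost is paid by whichever request touches pandas first.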
I have searched Stack Overflow and GitHub issues but found nothing similar to this issue/behavior.
Thanks in advance.