Why does it take ages to install Pandas on Alpine Linux
C

12

170

I've noticed that installing Pandas and Numpy (its dependency) in a Docker container takes much longer on an Alpine base image than on CentOS or Debian. I created a little test below to demonstrate the time difference. Aside from the few seconds Alpine takes to update and download the build dependencies needed to install Pandas and Numpy, why does setup.py take around 70x longer than the install on Debian?

Is there any way to speed up the install using Alpine as the base image or is there another base image of comparable size to Alpine that is better to use for packages like Pandas and Numpy?

Dockerfile.debian

FROM python:3.6.4-slim-jessie

RUN pip install pandas

Build Debian image with Pandas & Numpy:

[PandasDockerTest] time docker build -t debian-pandas -f Dockerfile.debian . --no-cache
    Sending build context to Docker daemon  3.072kB
    Step 1/2 : FROM python:3.6.4-slim-jessie
     ---> 43431c5410f3
    Step 2/2 : RUN pip install pandas
     ---> Running in 2e4c030f8051
    Collecting pandas
      Downloading pandas-0.22.0-cp36-cp36m-manylinux1_x86_64.whl (26.2MB)
    Collecting numpy>=1.9.0 (from pandas)
      Downloading numpy-1.14.1-cp36-cp36m-manylinux1_x86_64.whl (12.2MB)
    Collecting pytz>=2011k (from pandas)
      Downloading pytz-2018.3-py2.py3-none-any.whl (509kB)
    Collecting python-dateutil>=2 (from pandas)
      Downloading python_dateutil-2.6.1-py2.py3-none-any.whl (194kB)
    Collecting six>=1.5 (from python-dateutil>=2->pandas)
      Downloading six-1.11.0-py2.py3-none-any.whl
    Installing collected packages: numpy, pytz, six, python-dateutil, pandas
    Successfully installed numpy-1.14.1 pandas-0.22.0 python-dateutil-2.6.1 pytz-2018.3 six-1.11.0
    Removing intermediate container 2e4c030f8051
     ---> a71e1c314897
    Successfully built a71e1c314897
    Successfully tagged debian-pandas:latest
    docker build -t debian-pandas -f Dockerfile.debian . --no-cache  0.07s user 0.06s system 0% cpu 13.605 total

Dockerfile.alpine

FROM python:3.6.4-alpine3.7

RUN apk --update add --no-cache g++

RUN pip install pandas

Build Alpine image with Pandas & Numpy:

[PandasDockerTest] time docker build -t alpine-pandas -f Dockerfile.alpine . --no-cache
Sending build context to Docker daemon   16.9kB
Step 1/3 : FROM python:3.6.4-alpine3.7
 ---> 4b00a94b6f26
Step 2/3 : RUN apk --update add --no-cache g++
 ---> Running in 4b0c32551e3f
fetch http://dl-cdn.alpinelinux.org/alpine/v3.7/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.7/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.7/community/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.7/community/x86_64/APKINDEX.tar.gz
(1/17) Upgrading musl (1.1.18-r2 -> 1.1.18-r3)
(2/17) Installing libgcc (6.4.0-r5)
(3/17) Installing libstdc++ (6.4.0-r5)
(4/17) Installing binutils-libs (2.28-r3)
(5/17) Installing binutils (2.28-r3)
(6/17) Installing gmp (6.1.2-r1)
(7/17) Installing isl (0.18-r0)
(8/17) Installing libgomp (6.4.0-r5)
(9/17) Installing libatomic (6.4.0-r5)
(10/17) Installing pkgconf (1.3.10-r0)
(11/17) Installing mpfr3 (3.1.5-r1)
(12/17) Installing mpc1 (1.0.3-r1)
(13/17) Installing gcc (6.4.0-r5)
(14/17) Installing musl-dev (1.1.18-r3)
(15/17) Installing libc-dev (0.7.1-r0)
(16/17) Installing g++ (6.4.0-r5)
(17/17) Upgrading musl-utils (1.1.18-r2 -> 1.1.18-r3)
Executing busybox-1.27.2-r7.trigger
OK: 184 MiB in 50 packages
Removing intermediate container 4b0c32551e3f
 ---> be26c3bf4e42
Step 3/3 : RUN pip install pandas
 ---> Running in 36f6024e5e2d
Collecting pandas
  Downloading pandas-0.22.0.tar.gz (11.3MB)
Collecting python-dateutil>=2 (from pandas)
  Downloading python_dateutil-2.6.1-py2.py3-none-any.whl (194kB)
Collecting pytz>=2011k (from pandas)
  Downloading pytz-2018.3-py2.py3-none-any.whl (509kB)
Collecting numpy>=1.9.0 (from pandas)
  Downloading numpy-1.14.1.zip (4.9MB)
Collecting six>=1.5 (from python-dateutil>=2->pandas)
  Downloading six-1.11.0-py2.py3-none-any.whl
Building wheels for collected packages: pandas, numpy
  Running setup.py bdist_wheel for pandas: started
  Running setup.py bdist_wheel for pandas: still running...
  Running setup.py bdist_wheel for pandas: still running...
  Running setup.py bdist_wheel for pandas: still running...
  Running setup.py bdist_wheel for pandas: still running...
  Running setup.py bdist_wheel for pandas: still running...
  Running setup.py bdist_wheel for pandas: still running...
  Running setup.py bdist_wheel for pandas: finished with status 'done'
  Stored in directory: /root/.cache/pip/wheels/e8/ed/46/0596b51014f3cc49259e52dff9824e1c6fe352048a2656fc92
  Running setup.py bdist_wheel for numpy: started
  Running setup.py bdist_wheel for numpy: still running...
  Running setup.py bdist_wheel for numpy: still running...
  Running setup.py bdist_wheel for numpy: still running...
  Running setup.py bdist_wheel for numpy: finished with status 'done'
  Stored in directory: /root/.cache/pip/wheels/9d/cd/e1/4d418b16ea662e512349ef193ed9d9ff473af715110798c984
Successfully built pandas numpy
Installing collected packages: six, python-dateutil, pytz, numpy, pandas
Successfully installed numpy-1.14.1 pandas-0.22.0 python-dateutil-2.6.1 pytz-2018.3 six-1.11.0
Removing intermediate container 36f6024e5e2d
 ---> a93c59e6a106
Successfully built a93c59e6a106
Successfully tagged alpine-pandas:latest
docker build -t alpine-pandas -f Dockerfile.alpine . --no-cache  0.54s user 0.33s system 0% cpu 16:08.47 total
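
As a rough check of the "around 70x" figure, the two `time` totals from the logs above can be compared directly. This is only a sketch: the helper name `to_seconds` is made up for illustration, and it assumes the totals are printed either as plain seconds or as `minutes:seconds`.

```python
def to_seconds(elapsed: str) -> float:
    """Convert a `time` total such as '13.605' or '16:08.47' to seconds."""
    seconds = 0.0
    # Each ':'-separated field is worth 60x the field to its right.
    for part in elapsed.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


debian = to_seconds("13.605")    # total from the Debian build log
alpine = to_seconds("16:08.47")  # total from the Alpine build log
ratio = alpine / debian

print(f"Alpine build took roughly {ratio:.0f}x as long as the Debian build")
```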
Clingstone answered 28/2, 2018 at 20:10 Comment(7)
.apk now available, so zero need to build from source - pkgs.alpinelinux.org/packages?name=*pandas&branch=edgePlexiform
@jtlz2, pandas is not available on the edge branch of Alpine, which is a pity...Leanoraleant
@Leanoraleant It is available again now!Plexiform
I tried the suggestions from several comments, and still ended up trying to build pandas whenever I added it to alpine. I did some digging and found that (1) pandas is not officially packaged in apk and probably won't be any time soon, BUT, (2) pandas is available as a community supported package that installs a pre-compiled binary under /usr/lib and doesn't require you to compile it. See my answer for more info: https://mcmap.net/q/103397/-why-does-it-take-ages-to-install-pandas-on-alpine-linuxIlluviation
python:3.7-stretch image worked for meMetagnathous
I have django + celery + redis + pandas + many other packages. Switching from alpine to slim-jessie is a great idea.Synergistic
@SerhiiKushchenko Since the debian-jessie era we have had stretch and buster, which are already history. Nowadays there is bullseye. Recommending a switch to jessie in mid-2021 is a bit ignorant and smells of an antique shop.Overcloud
L
94

On Debian-based images, pip installs these packages from prebuilt files in the .whl (wheel) format:

  Downloading pandas-0.22.0-cp36-cp36m-manylinux1_x86_64.whl (26.2MB)
  Downloading numpy-1.14.1-cp36-cp36m-manylinux1_x86_64.whl (12.2MB)

WHL format was developed as a quicker and more reliable method of installing Python software than re-building from source code every time. WHL files only have to be moved to the correct location on the target system to be installed, whereas a source distribution requires a build step before installation.

The pandas and numpy wheel packages are not supported on Alpine-based images. That's why, when we install them with pip during the image build, they are always compiled from source on Alpine:

  Downloading pandas-0.22.0.tar.gz (11.3MB)
  Downloading numpy-1.14.1.zip (4.9MB)

and we can see the following inside the container while the image is building:

/ # ps aux
PID   USER     TIME   COMMAND
    1 root       0:00 /bin/sh -c pip install pandas
    7 root       0:04 {pip} /usr/local/bin/python /usr/local/bin/pip install pandas
   21 root       0:07 /usr/local/bin/python -c import setuptools, tokenize;__file__='/tmp/pip-build-en29h0ak/pandas/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n
  496 root       0:00 sh
  660 root       0:00 /bin/sh -c gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -DTHREAD_STACK_SIZE=0x100000 -fPIC -Ibuild/src.linux-x86_64-3.6/numpy/core/src/pri
  661 root       0:00 gcc -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -DTHREAD_STACK_SIZE=0x100000 -fPIC -Ibuild/src.linux-x86_64-3.6/numpy/core/src/private -Inump
  662 root       0:00 /usr/libexec/gcc/x86_64-alpine-linux-musl/6.4.0/cc1 -quiet -I build/src.linux-x86_64-3.6/numpy/core/src/private -I numpy/core/include -I build/src.linux-x86_64-3.6/numpy/core/includ
  663 root       0:00 ps aux

If we modify the Dockerfile a little:

FROM python:3.6.4-alpine3.7
RUN apk add --no-cache g++ wget
RUN wget https://pypi.python.org/packages/da/c6/0936bc5814b429fddb5d6252566fe73a3e40372e6ceaf87de3dec1326f28/pandas-0.22.0-cp36-cp36m-manylinux1_x86_64.whl
RUN pip install pandas-0.22.0-cp36-cp36m-manylinux1_x86_64.whl

we get the following error:

Step 4/4 : RUN pip install pandas-0.22.0-cp36-cp36m-manylinux1_x86_64.whl
 ---> Running in 0faea63e2bda
pandas-0.22.0-cp36-cp36m-manylinux1_x86_64.whl is not a supported wheel on this platform.
The command '/bin/sh -c pip install pandas-0.22.0-cp36-cp36m-manylinux1_x86_64.whl' returned a non-zero code: 1
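
The error above follows from the wheel's filename, which encodes the platform the binaries were built for. Below is a minimal sketch of reading those tags. The helper names `parse_wheel_filename` and `libc_family` are made up for illustration, and the mapping is a simplification of what pip really checks (the `musllinux` prefix did not exist when this answer was written; it is included only because it comes up elsewhere in this thread).

```python
def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename of the form
    name-version[-build]-python-abi-platform.whl into its fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel filename: {filename}")
    # The last three fields are always the python, abi and platform tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def libc_family(platform_tag: str) -> str:
    """Rough guess at which C library a platform tag targets."""
    if platform_tag == "any":
        return "any"    # pure-Python wheel, no compiled code
    if platform_tag.startswith("manylinux"):
        return "glibc"  # built against glibc, unusable on musl-based Alpine
    if platform_tag.startswith("musllinux"):
        return "musl"
    return "other"


for wheel in (
    "pandas-0.22.0-cp36-cp36m-manylinux1_x86_64.whl",
    "six-1.11.0-py2.py3-none-any.whl",
):
    tags = parse_wheel_filename(wheel)
    print(tags["name"], tags["platform"], libc_family(tags["platform"]))
```

This also shows why six, pytz and python-dateutil still install from wheels on Alpine: their platform tag is `any`, so no compiled code is involved.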

Unfortunately, the only way to install pandas on an Alpine image is to wait until the build finishes.

Of course if you want to use the Alpine image with pandas in CI for example, the best way to do so is to compile it once, push it to any registry and use it as a base image for your needs.

EDIT: If you want to use the Alpine image with pandas, you can pull my nickgryg/alpine-pandas docker image. It is a python image with pre-compiled pandas on the Alpine platform. It should save you time.

Legionnaire answered 1/3, 2018 at 19:27 Comment(7)
Well, that's too bad. However, it looks like six, pytz, and python-dateutil are downloading .whl packages on Alpine. Does that mean its possible to build wheels for pandas and numpy for Alpine, but that it's just not happening currently?Clingstone
No, it is not possible to build wheels for pandas and numpy on the alpine platform. Those wheels don't support it. I showed that in the answer, when I tried to install pandas from its wheel package in the alpine image.Legionnaire
@Nickolay Is there a workaround way to recycle a pandas build that has been built on alpine and then cached? (this could be hosted somewhere locally)Plexiform
The reason for this is that these wheels contain binaries built from C/C++ and linked against glibc, but Alpine does not have glibc; it uses musl instead, which means new binaries must be compiled and linked against musl.Mas
I suppose it's the same for mac M1. It's taking agesSelfassured
Pandas is just not installing at all for me now. It starts with v1.4.0, fails, tries one version lower and repeat. I did not go thru the entire error log but it seems to be some version dependency problems. Debian is better for these packages at least and if image space is a concern slim images are an option.Flyboat
I don't know whether this is logical or not, however, sometimes I just copy the working binary dependencies which were built on a python virtual environment through the pip install command. The bad thing here is that the container size increased a bit compared to the traditional approach.Manganese
P
43

ANSWER: AS OF 3/9/2020, FOR PYTHON 3, IT STILL DOESN'T!

Here is a complete working Dockerfile:

FROM python:3.7-alpine
RUN echo "@testing http://dl-cdn.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories
RUN apk add --update --no-cache py3-numpy py3-pandas@testing

The build is very sensitive to the exact python and alpine version numbers - getting these wrong seems to provoke Max Levy's error so:libpython3.7m.so.1.0 (missing) - but the above does now work for me.

My updated Dockerfile is available at https://gist.github.com/jtlz2/b0f4bc07ce2ff04bc193337f2327c13b


[Earlier Update:]

ANSWER: IT DOESN'T!

In any Alpine Dockerfile you can simply do*

RUN apk add py2-numpy@community py2-scipy@community py-pandas@edge

This is because numpy, scipy and now pandas are all available prebuilt on alpine:

https://pkgs.alpinelinux.org/packages?name=*numpy

https://pkgs.alpinelinux.org/packages?name=*scipy&branch=edge

https://pkgs.alpinelinux.org/packages?name=*pandas&branch=edge

One way to avoid rebuilding every time, or using a Docker layer, is to use a prebuilt, native Alpine Linux/.apk package, e.g.

https://github.com/sgerrand/alpine-pkg-py-pandas

https://github.com/nbgallery/apks

You can build these .apks once and use them wherever in your Dockerfile you like :)

This also saves you having to bake everything else into the Docker image before the fact - i.e. the flexibility to pre-build any Docker image you like.

PS I have put a Dockerfile stub at https://gist.github.com/jtlz2/b0f4bc07ce2ff04bc193337f2327c13b that shows roughly how to build the image. These include the important steps (*):

RUN echo "@community http://dl-cdn.alpinelinux.org/alpine/edge/community" >> /etc/apk/repositories
RUN apk update
RUN apk add --update --no-cache libgfortran
Plexiform answered 21/5, 2018 at 6:57 Comment(12)
Looks like it was recently removed? pkgs.alpinelinux.org/package/edge/testing/x86/py-pandasPlexiform
@ChrisWedgwood They are actively working on it - see github.com/alpinelinux/aports/pull/6330Plexiform
@ChrisWedgwood Working again, phew!Plexiform
@Plexiform @Leanoraleant @Chris Wedgwood This gives WARNING: The repository tag for world dependency 'py3-numpy@community' does not exist what do I do?Washedout
@Plexiform ERROR: Not committing changes due to missing repository tags this error occurs when the RUN apk ... command mentioned in the above answer is executed . Do we need to add any repositories before executing this command in dockerfile?Washedout
@Washedout Can you try prepending RUN echo "@community http://dl-cdn.alpinelinux.org/alpine/edge/community" >> /etc/apk/repositories then RUN apk update? If it works I will update the answer.Plexiform
@Washedout Got it working - see updated answer - you also need Alpine >= 3.8 to pick up libgfortran that scipy - and possibly numpy - requires.Plexiform
@Plexiform - thanks! But I'm getting: ERROR: unsatisfiable constraints: so:libpython3.7m.so.1.0 (missing): required by: py3-pandas-0.24.1-r0 [so:libpython3.7m.so.1.0] when I add the following lines to your sample Dockerfile: RUN echo "@testing http://dl-cdn.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories RUN apk add --update --no-cache py3-pandas@testing using FROM python:3.7-alpine3.8 - what am I missing?Gasaway
This seems to be a running battle... :/Plexiform
I have silly/noob question: So I ran the Dockerfile mentioned above. However, python can't import pandas ModuleNotFoundError: No module named 'pandas' Is there a special way to get python to recognize this version of pandas?Nelrsa
@Nelrsa Do #14296180 , leemendelowitz.github.io/blog/… , bic-berkeley.github.io/psych-214-fall-2016/sys_path.html help?Plexiform
@Plexiform I switched over to 3.7-slim-buster and everything went smoothly there pythonspeed.com/articles/base-image-python-docker-imagesNelrsa
P
29

Real honest advice here: switch to a Debian-based image and all your problems will be gone.

Alpine for python applications doesn't work well.

Here is an example of my dockerfile:

FROM python:3.7.6-buster

RUN pip install pandas==1.0.0
# The PyPI package is named scikit-learn; the old "sklearn" alias is deprecated.
RUN pip install scikit-learn
RUN pip install Django==3.0.2
RUN pip install cx_Oracle==7.3.0
RUN pip install excel
RUN pip install djangorestframework==3.11.0

The python:3.7.6-buster image is more appropriate in this case; in addition, you don't need any extra dependencies in the OS.

Here is a useful and recent article, https://pythonspeed.com/articles/alpine-docker-python/, which says:

Don’t use Alpine Linux for Python images

Unless you want massively slower build times, larger images, more work, and the potential for obscure bugs, you’ll want to avoid Alpine Linux as a base image. For some recommendations on what you should use, see my article on choosing a good base image.

Photolysis answered 7/2, 2020 at 17:5 Comment(4)
You can reduce the number of layers in your image by chaining the installs, i.e. RUN pip install <packageA> && pip install <packageB> and so on, instead of using a block of RUN commands. It affects your build performance :)Casilde
You can also use pip --no-cache to shave off a little more footprint. What you should really do is just put them line by line in a requirements.txt file and pip install --no-cache -r requirements.txtMas
This is the best solution I've found on Alpine for python Dockers. This solves so many issuesDebora
Love the real talk. FWIW, my fresh python:3.10-<> install times for pandas were: alpine (25m46), buster (1m7), slim (33s). I went with slim.Atonic
M
18

Just going to bring some of these answers together in one answer and add a detail I think was missed. The reason certain Python libraries, particularly optimized math and data libraries, take so long to build on Alpine is that the pip wheels for these libraries include binaries precompiled from C/C++ and linked against GNU libc (glibc), a common implementation of the C standard library. Debian, Fedora, and CentOS all (typically) use glibc, but Alpine, in order to stay lightweight, uses musl libc instead. C/C++ binaries built on a glibc system will not work on a system without glibc, and the same goes for musl.
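
To see which side of this split a given container is on, the standard library's `platform.libc_ver()` can be used as a rough check. This is a sketch rather than a definitive test: the function name `uses_glibc` is made up for illustration, and on musl-based systems such as Alpine the call typically does not report glibc, though the exact return value is not something to rely on.

```python
import platform


def uses_glibc() -> bool:
    """Return True if the running interpreter appears to be linked against glibc."""
    # libc_ver() returns a (library, version) tuple of strings.
    library, _version = platform.libc_ver()
    return library == "glibc"


if uses_glibc():
    print("glibc detected: manylinux wheels should be installable")
else:
    print("glibc not detected: expect source builds unless musl-compatible wheels exist")
```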

Pip looks first for a wheel with the correct binaries. If it can't find one, it tries to compile the binaries from the C/C++ source and link them against musl. In many cases, this won't even work unless you have the Python headers from python3-dev or build tools like make.

Now the silver lining, as others have mentioned, there are apk packages with the proper binaries provided by the community, using these will save you the (sometimes lengthy) process of building the binaries.

You can, in fact, install from a pure python .whl on alpine, but, at the time of this writing, manylinux did not support binary distributions for alpine due to the musl/gnu issue.

Update Oct 2022

Newer versions of pip support musl through the musllinux platform tag (PEP 656), which is the musl counterpart of manylinux. There is still no official musl support for CUDA, though.

Mas answered 2/10, 2019 at 23:51 Comment(0)
C
9

ATTENTION
Look at the @jtlz2 answer with the latest update

OUTDATED

The py3-pandas and py3-numpy packages moved to the Alpine testing repository, so you can install them by adding these lines to your Dockerfile:

RUN echo "http://dl-8.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories \
  && apk update \
  && apk add py3-numpy py3-pandas

Hope it helps someone!

Alpine packages links:
- py3-pandas
- py3-numpy

Alpine repositories docs info.

Crippen answered 17/9, 2019 at 11:15 Comment(4)
This worked for me! Thanks for providing an updated answer!Gemmell
Fixed in my answerPlexiform
@Plexiform Cool, thanks, but I moved to Debian buster instead of Alpine and didn't try installing it again on Alpine. Anyway, thanks for the reply; I also fixed my answer.Crippen
Just to note that py3-pandas is not available for 3.11.x, it is only in the 'edge' release as of the time I'm writing this comment. edit: Obviously it says that in post above, I just missed that reference earlier, sorry.Nu
S
6

In this case Alpine may not be the best solution. Swap the alpine image for the slim one:

FROM python:3.8.3-alpine

Change it to:

FROM python:3.8.3-slim

In my case it was resolved with this small change.

Scheldt answered 5/8, 2021 at 18:2 Comment(1)
Well, it makes sense for this to work bc you are actually going out of alpine and using Debian, and Debian is the de-facto image of python. See image variants for further information.Atween
S
2

This worked for me:

FROM python:3.8-alpine
RUN echo "@testing http://dl-cdn.alpinelinux.org/alpine/edge/testing" >> /etc/apk/repositories
RUN apk add --update --no-cache py3-numpy py3-pandas@testing
ENV PYTHONPATH=/usr/lib/python3.8/site-packages

COPY . /app
WORKDIR /app

RUN pip install -r requirements.txt

EXPOSE 5003 
ENTRYPOINT [ "python" ] 
CMD [ "app.py" ]

Most of the code here is from the answer of jtlz2 from this same thread and Faylixe from another thread.

Turns out the lighter version of pandas is found in the Alpine repository alongside py3-numpy, but it doesn't get installed in the path that Python reads imports from by default. Therefore you need to add the ENV line. Also be mindful of the Alpine version.

Shute answered 7/7, 2020 at 4:57 Comment(0)
U
1

I have solved the installation with some additional changes:

Requirements

  • Migrate from python3.8-alpine to python3.10-alpine:
docker pull python:3.10-alpine

Important!

I had to migrate because installing py3-pandas placed the package under python3.10, not under the version I was using (python3.8).

To figure out where the libraries of a package were installed, you can check that with the following command:

apk info -L py3-pandas
  • Do not install the backports.zoneinfo package on python3.9 or later (I had to add a condition in requirements.txt so that the package is installed only on versions lower than 3.9):
backports.zoneinfo==0.2.1;python_version<"3.9"
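
The marker on the requirements line above limits the backport to interpreters older than 3.9, the release in which zoneinfo joined the standard library. The same condition can be sketched in Python as follows; the function name `needs_zoneinfo_backport` is made up for illustration and is not part of pip or of the backport package.

```python
import sys


def needs_zoneinfo_backport(version_info=sys.version_info) -> bool:
    """Mirror the requirements marker python_version < "3.9"."""
    # Compare only (major, minor), which is what python_version expresses.
    return tuple(version_info[:2]) < (3, 9)


if needs_zoneinfo_backport():
    print("Python < 3.9: backports.zoneinfo is required")
else:
    print("Python >= 3.9: zoneinfo is in the standard library")
```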

Installation

After the previous changes, I proceeded to install pandas as follows:

  • Add 3 additional repositories to /etc/apk/repositories (the repositories can vary based on the version of your distribution), reference here:
for x in $(echo "main community testing"); \
    do echo "https://dl-cdn.alpinelinux.org/alpine/edge/${x}" >> /etc/apk/repositories; \
    done
  • Validate the content of the file /etc/apk/repositories:
$ cat /etc/apk/repositories
https://dl-cdn.alpinelinux.org/alpine/v3.16/main
https://dl-cdn.alpinelinux.org/alpine/v3.16/community
https://dl-cdn.alpinelinux.org/alpine/edge/main
https://dl-cdn.alpinelinux.org/alpine/edge/community
https://dl-cdn.alpinelinux.org/alpine/edge/testing
  • Install pandas (numpy is installed automatically as a dependency of pandas):
sudo apk update && sudo apk add py3-pandas
  • Set the environment variable PYTHONPATH:
export PYTHONPATH=/usr/lib/python3.10/site-packages/
  • Validate that the packages can be imported (in my case I tested it with django):
python manage.py shell 
import pandas as pd
import numpy as np
technologies =  ['Spark','Pandas','Java','Python', 'PHP']
fee = [25000,20000,15000,15000,18000]
duration = ['5o Days','35 Days',np.nan,'30 Days', '30 Days']
discount = [2000,1000,800,500,800]
columns=['Courses','Fee','Duration','Discount']
df = pd.DataFrame(list(zip(technologies,fee,duration,discount)), columns=columns)
print(df)
Unremitting answered 10/7, 2022 at 20:50 Comment(0)
I
0

pandas is considered a community supported package, so the answers pointing to edge/testing are not going to work as Alpine does not officially support pandas as a core package (it still works, it's just not supported by the core Alpine developers).

Try this Dockerfile:

FROM python:3.8-alpine
RUN echo "@community http://dl-cdn.alpinelinux.org/alpine/edge/community" >> /etc/apk/repositories \
&& apk add py3-pandas@community
ENV PYTHONPATH="/usr/lib/python3.8/site-packages"

This works for the vanilla Alpine image too, using FROM alpine:3.12.


Update: thanks to @cegprakash for raising the question about how to work with this setup when you also have a requirements.txt file that must be satisfied inside the container.

I added one line to the Dockerfile snippet to export the PYTHONPATH variable into the container runtime. If you do this, it won't matter whether pandas or numpy are included in the requirements file or not (provided they are pegged to the same version that was installed via apk).

The reason this is needed is that apk installs the py3-pandas@community package under /usr/lib, but that location is not on the default PYTHONPATH that pip checks before installing new packages. If we don't include this step to add it, pip and python will not find the package, and pip will try to download and install it under /usr/local, which is what we're trying to avoid.

And given that we really want to make sure that pip doesn't try to install pandas, I would suggest to not include pandas or numpy in the requirements.txt file if you've already installed them with apk using the above method. It's just a little extra insurance that things will go as intended.
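
To confirm inside the container that the apk location is visible to the interpreter, something like the following can be run. It is only a sketch: the helper names `is_on_sys_path` and `module_location` are made up for illustration, and the directory is the one from the Dockerfile above, so it should be adjusted to the Python version actually in use.

```python
import importlib.util
import sys

# Directory from the Dockerfile above; adjust to the Python version in use.
APK_SITE_PACKAGES = "/usr/lib/python3.8/site-packages"


def is_on_sys_path(directory, search_path=None):
    """Return True if directory is one of the interpreter's import locations."""
    search_path = sys.path if search_path is None else search_path
    wanted = directory.rstrip("/")
    return any(entry.rstrip("/") == wanted for entry in search_path)


def module_location(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


print("apk site-packages on sys.path:", is_on_sys_path(APK_SITE_PACKAGES))
print("pandas would be imported from:", module_location("pandas"))
```

If the first line prints False, PYTHONPATH has not taken effect; if the second prints a path under /usr/local, pip has installed its own copy of pandas.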

Illuviation answered 11/9, 2020 at 0:31 Comment(2)
should we include pandas and numpy in requirements.txt or is it not needed?Stebbins
@Stebbins If your requirements.txt is going to be copied into a docker container where apk add py3-pandas@community was already run via a Dockerfile directive when the image was created, then no, it is not necessary. I modified my answer with some additional explanation that I think will answer your question more thoroughly. Thank you for bringing it up :)Illuviation
W
0

The following Dockerfile worked for me to install pandas, among other dependencies as listed below.

python:3.10-alpine Dockerfile

# syntax=docker/dockerfile:1
FROM python:3.10-alpine as base

RUN apk add --update --no-cache --virtual .tmp-build-deps \
    gcc g++ libc-dev linux-headers postgresql-dev build-base \
    && apk add libffi-dev

COPY requirements.txt requirements.txt
RUN pip install --no-cache-dir --upgrade -r requirements.txt

pyproject.toml dependencies

python = "^3.10"
Django = "^3.2.9"
djangorestframework = "^3.12.4"
PyYAML = ">=5.3.0,<6.0.0"
Markdown = "^3.3.6"
uritemplate = "^4.1.1"
install = "^1.3.5"
drf-spectacular = "^0.21.0"
django-extensions = "^3.1.5"
django-filter = "^21.1"
django-cors-headers = "^3.10.1"
httpx = "^0.22.0"
channels = "^3.0.4"
daphne = "^3.0.2"
whitenoise = "^6.2.0"
djoser = "^2.1.0"
channels-redis = "^3.4.0"
pika = "^1.2.1"
backoff = "^2.1.2"
psycopg2-binary = "^2.9.3"
pandas = "^1.5.0"
Wolfort answered 3/10, 2022 at 9:7 Comment(0)
P
0

I had the same problem. When I simply changed

FROM python:3

to

FROM python:3.10

everything works fine. (Hope this will help.)

Pentastich answered 6/3, 2023 at 7:39 Comment(0)
K
-1

Alpine takes a lot of time to install pandas, and the image size is also huge. I tried the python:3.8-slim-buster base image instead. The image build was very fast, and the image was less than half the size of the Alpine Python image.

https://github.com/dguyhasnoname/k8s-cluster-checker/blob/master/Dockerfile

Kyla answered 9/8, 2020 at 11:3 Comment(0)
