No module named 'numpy' during docker build

I am following the instructions at https://github.com/huggingface/transfer-learning-conv-ai to install conv-ai from Hugging Face, but I am stuck on the Docker build step: docker build -t convai .

I am using macOS 10.15 and Python 3.8, and I increased Docker's memory to 4 GB.

I have tried the following ways to solve the issue:

  1. Adding numpy to requirements.txt
  2. Adding RUN pip3 install --upgrade setuptools to the Dockerfile
  3. Adding --upgrade to RUN pip3 install -r /tmp/requirements.txt in the Dockerfile
  4. Adding RUN pip3 install numpy before RUN pip3 install -r /tmp/requirements.txt in the Dockerfile
  5. Adding RUN apt-get install python3-numpy before RUN pip3 install -r /tmp/requirements.txt in the Dockerfile
  6. Using Python 3.6.13 because of this post, but I get exactly the same error.
  7. I am currently debugging inside the container, entering it right before the RUN pip3 install -r /tmp/requirements.txt step.

Can anyone help me with this? Thank you!!

The error:

 => [6/9] COPY . ./                                                                                                          0.0s
 => [7/9] COPY requirements.txt /tmp/requirements.txt                                                                        0.0s
 => ERROR [8/9] RUN pip3 install -r /tmp/requirements.txt                                                                   98.2s
------
 > [8/9] RUN pip3 install -r /tmp/requirements.txt:
#12 1.111 Collecting torch (from -r /tmp/requirements.txt (line 1))
#12 1.754   Downloading https://files.pythonhosted.org/packages/46/99/8b658e5095b9fb02e38ccb7ecc931eb1a03b5160d77148aecf68f8a7eeda/torch-1.8.0-cp36-cp36m-manylinux1_x86_64.whl (735.5MB)
#12 81.11 Collecting pytorch-ignite (from -r /tmp/requirements.txt (line 2))
#12 81.76   Downloading https://files.pythonhosted.org/packages/f8/d3/640f70d69393b415e6a29b27c735047ad86267921ad62682d1d756556d48/pytorch_ignite-0.4.4-py3-none-any.whl (200kB)
#12 81.82 Collecting transformers==2.5.1 (from -r /tmp/requirements.txt (line 3))
#12 82.17   Downloading https://files.pythonhosted.org/packages/13/33/ffb67897a6985a7b7d8e5e7878c3628678f553634bd3836404fef06ef19b/transformers-2.5.1-py3-none-any.whl (499kB)
#12 82.29 Collecting tensorboardX==1.8 (from -r /tmp/requirements.txt (line 4))
#12 82.50   Downloading https://files.pythonhosted.org/packages/c3/12/dcaf67e1312475b26db9e45e7bb6f32b540671a9ee120b3a72d9e09bc517/tensorboardX-1.8-py2.py3-none-any.whl (216kB)
#12 82.57 Collecting tensorflow (from -r /tmp/requirements.txt (line 5))
#12 83.12   Downloading https://files.pythonhosted.org/packages/de/f0/96fb2e0412ae9692dbf400e5b04432885f677ad6241c088ccc5fe7724d69/tensorflow-1.14.0-cp36-cp36m-manylinux1_x86_64.whl (109.2MB)
#12 95.24 Collecting spacy (from -r /tmp/requirements.txt (line 6))
#12 95.81   Downloading https://files.pythonhosted.org/packages/65/01/fd65769520d4b146d92920170fd00e01e826cda39a366bde82a87ca249db/spacy-3.0.5.tar.gz (7.0MB)
#12 97.41     Complete output from command python setup.py egg_info:
#12 97.41     Traceback (most recent call last):
#12 97.41       File "<string>", line 1, in <module>
#12 97.41       File "/tmp/pip-build-cc3a804w/spacy/setup.py", line 5, in <module>
#12 97.41         import numpy
#12 97.41     ModuleNotFoundError: No module named 'numpy'
#12 97.41     
#12 97.41     ----------------------------------------
#12 98.11 Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-cc3a804w/spacy/

@Håken Lid: this is the error I get if I add RUN pip3 install numpy right before RUN pip3 install -r /tmp/requirements.txt:

 => [ 8/10] RUN pip3 install numpy                                                                                          10.1s
 => ERROR [ 9/10] RUN pip3 install -r /tmp/requirements.txt                                                                112.4s
------                                                                                                                            
 > [ 9/10] RUN pip3 install -r /tmp/requirements.txt:                                                                             
#13 1.067 Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from -r /tmp/requirements.txt (line 1)) 
#13 1.074 Collecting torch (from -r /tmp/requirements.txt (line 2))                                                               
#13 1.656   Downloading https://files.pythonhosted.org/packages/46/99/8b658e5095b9fb02e38ccb7ecc931eb1a03b5160d77148aecf68f8a7eeda/torch-1.8.0-cp36-cp36m-manylinux1_x86_64.whl (735.5MB)                                                                           
#13 96.46 Collecting pytorch-ignite (from -r /tmp/requirements.txt (line 3))
#13 97.02   Downloading https://files.pythonhosted.org/packages/f8/d3/640f70d69393b415e6a29b27c735047ad86267921ad62682d1d756556d48/pytorch_ignite-0.4.4-py3-none-any.whl (200kB)
#13 97.07 Collecting transformers==2.5.1 (from -r /tmp/requirements.txt (line 4))
#13 97.32   Downloading https://files.pythonhosted.org/packages/13/33/ffb67897a6985a7b7d8e5e7878c3628678f553634bd3836404fef06ef19b/transformers-2.5.1-py3-none-any.whl (499kB)
#13 97.43 Collecting tensorboardX==1.8 (from -r /tmp/requirements.txt (line 5))
#13 97.70   Downloading https://files.pythonhosted.org/packages/c3/12/dcaf67e1312475b26db9e45e7bb6f32b540671a9ee120b3a72d9e09bc517/tensorboardX-1.8-py2.py3-none-any.whl (216kB)
#13 97.76 Collecting tensorflow (from -r /tmp/requirements.txt (line 6))
#13 98.27   Downloading https://files.pythonhosted.org/packages/de/f0/96fb2e0412ae9692dbf400e5b04432885f677ad6241c088ccc5fe7724d69/tensorflow-1.14.0-cp36-cp36m-manylinux1_x86_64.whl (109.2MB)
#13 109.6 Collecting spacy (from -r /tmp/requirements.txt (line 7))
#13 110.0   Downloading https://files.pythonhosted.org/packages/65/01/fd65769520d4b146d92920170fd00e01e826cda39a366bde82a87ca249db/spacy-3.0.5.tar.gz (7.0MB)
#13 111.6     Complete output from command python setup.py egg_info:
#13 111.6     Traceback (most recent call last):
#13 111.6       File "<string>", line 1, in <module>
#13 111.6       File "/tmp/pip-build-t6n57csv/spacy/setup.py", line 10, in <module>
#13 111.6         from Cython.Build import cythonize
#13 111.6     ModuleNotFoundError: No module named 'Cython'
#13 111.6     
#13 111.6     ----------------------------------------
#13 112.3 Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-t6n57csv/spacy/
------
executor failed running [/bin/sh -c pip3 install -r /tmp/requirements.txt]: exit code: 1

requirements.txt:

torch
pytorch-ignite
transformers==2.5.1
tensorboardX==1.8
tensorflow  # for tensorboardX
spacy

Dockerfile:

FROM ubuntu:18.04

MAINTAINER Loreto Parisi [email protected]

########################################  BASE SYSTEM
# set noninteractive installation
ARG DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y apt-utils
RUN apt-get install -y --no-install-recommends \
    build-essential \
    pkg-config \
    tzdata \
    curl

######################################## PYTHON3
RUN apt-get install -y \
    python3 \
    python3-pip

# set local timezone
RUN ln -fs /usr/share/zoneinfo/America/New_York /etc/localtime && \
    dpkg-reconfigure --frontend noninteractive tzdata

# transfer-learning-conv-ai
ENV PYTHONPATH /usr/local/lib/python3.6 
COPY . ./
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install -r /tmp/requirements.txt

# model zoo
RUN mkdir models && \
    curl https://s3.amazonaws.com/models.huggingface.co/transfer-learning-chatbot/finetuned_chatbot_gpt.tar.gz > models/finetuned_chatbot_gpt.tar.gz && \
    cd models/ && \
    tar -xvzf finetuned_chatbot_gpt.tar.gz && \
    rm finetuned_chatbot_gpt.tar.gz
    
CMD ["bash"]

Steps I ran so far:

git clone https://github.com/huggingface/transfer-learning-conv-ai
cd transfer-learning-conv-ai
pip install -r requirements.txt
python -m spacy download en
docker build -t convai .
Gallic answered 12/3, 2021 at 15:21 Comment(7)
What happened when you put RUN pip3 install numpy before RUN pip3 install -r /tmp/requirements.txt? If that step was successful, you should not get this ModuleNotFoundError. - Botnick
Odd. You could delete everything from the failing step onward, build the image, then run it and exec into the container with bash. That would let you test from a terminal. - Hagride
Sorry, my internet was down for a bit. I added the error for RUN pip3 install numpy to the post; this is with Python 3.6. Thanks @HåkenLid! - Gallic
Yeah, I am doing that now; this is great guidance. I will post if I find anything useful. So far I just get the same error inside the container. Thanks @jabberwocky! - Gallic
Hi guys, this is weird because I think I have already tried updating pip in the Dockerfile. Inside my Docker container, the pip version is 9.0.1 (my local pip version is 21.0.1). I found this post, which says that pip 9.0.1 is known to have a broken version of setuptools, so I am trying again inside the container with pip 21.0.1. @HåkenLid @Hagride - Gallic
It's working now!!! The lesson is that adding --upgrade to the RUN pip3 install -r requirements.txt line doesn't work, but adding a separate line, RUN pip3 install --upgrade pip, before it fixes the problem. Python 3.6 or 3.8 does not matter. Yayyyy!! @Hagride @HåkenLid - Gallic
No problem @moon, glad you solved it. - Hagride

It seems that pip does not install the pre-built wheel, but instead tries to build spacy from source. This is a fragile process and requires extra dependencies.

To avoid this, you should ensure that the Python packages pip, wheel and setuptools are up to date before proceeding with the installation.

# replace RUN pip3 install -r /tmp/requirements.txt with:
RUN python3 -m pip install --upgrade pip setuptools wheel
RUN python3 -m pip install -r /tmp/requirements.txt
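
To confirm that an outdated pip is the culprit, a check like the following can be run inside the container before the install step. This is only a sketch: the helper names are made up for illustration, and the 19.3 threshold is my assumption (as far as I know, that is the pip release that added support for manylinux2014 wheels). It calls pip as a module of the running interpreter, which is what python3 -m pip does on the command line.

```python
import re
import subprocess
import sys

# Assumption: pip 19.3 is the first release that understands manylinux2014
# wheels. Adjust this if the packages you need require something newer.
MIN_PIP = (19, 3)


def pip_major_minor(version):
    """Return (major, minor) from a pip version string such as '9.0.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised pip version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def pip_too_old(version, minimum=MIN_PIP):
    """True if the given pip version is older than the assumed minimum."""
    return pip_major_minor(version) < minimum


def installed_pip_version():
    """Ask the pip tied to this interpreter for its version, or return None."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # spelled this way so it also runs on Python 3.6
    )
    if result.returncode != 0:
        return None
    # Output has the form: pip <version> from <path> (python <x.y>)
    fields = result.stdout.split()
    return fields[1] if len(fields) > 1 else None


if __name__ == "__main__":
    version = installed_pip_version()
    if version is None:
        print("pip is not available for this interpreter")
    elif pip_too_old(version):
        print(f"pip {version} is old; upgrade pip, setuptools and wheel first")
    else:
        print(f"pip {version} looks recent enough")
```

With the pip 9.0.1 that ships in the ubuntu:18.04 image, pip_too_old returns True; with 21.0.1 it returns False.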
Forficate answered 12/3, 2021 at 16:44 Comment(5)
Thank you Haken! Can you also educate me on (1) what -m means in python -m pip and what -r means in pip install -r requirements.txt, or which CLI command I can use to look them up? (2) If I want to understand and practice more with build/wheel/setuptools, are there any good materials you can point me to, based on your experience? Thank you!! - Gallic
python -m pip and pip actually do the same thing. But the first one is the recommended usage, for reasons explained here: #25750121 - Botnick
As for wheel, pip and setuptools: they are all used to install packages in Python, usually from the PyPI package repository. The reason there are multiple tools is that this side of Python has changed a lot over the years, and new features have been added. Wheel is one of the newest parts. Its main advantage is that you can download pre-compiled builds of Python packages that are written in C. If you can't use a wheel, pip will download the source code and try to compile the package, which can fail if some library dependency is not found. - Botnick
In this case the Dockerfile uses the base image ubuntu:18.04, which is a few years old and not up to date, so the default versions of pip, wheel and setuptools will also be outdated. There is a usable wheel of the spacy package available on PyPI, but I think the old version of pip (or perhaps wheel?) doesn't know where to look for it. - Botnick
Thank you very much Haken! I learned a lot from your answer and the more detailed explanations. This made my day! - Gallic
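
For anyone wondering how pip decides whether a wheel is usable: the wheel's file name encodes the Python version, ABI and platform it was built for (the format is defined in PEP 427), and pip skips files whose tags it does not recognise. The sketch below splits the wheel file names from the build log in the question into those tags. parse_wheel_filename is a hypothetical helper written for illustration, not part of pip.

```python
from typing import NamedTuple, Optional


class WheelTags(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its PEP 427 components.

    Format: {distribution}-{version}(-{build})?-{python}-{abi}-{platform}.whl
    Hyphens inside a distribution name are written as underscores in the
    file name, so splitting on '-' is safe.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return WheelTags(dist, version, build, py, abi, plat)


if __name__ == "__main__":
    # File names copied from the build log in the question.
    for name in (
        "torch-1.8.0-cp36-cp36m-manylinux1_x86_64.whl",
        "tensorflow-1.14.0-cp36-cp36m-manylinux1_x86_64.whl",
        "pytorch_ignite-0.4.4-py3-none-any.whl",
    ):
        print(parse_wheel_filename(name))
```

The torch and tensorflow wheels in the log are tagged manylinux1, and the old pip in the container downloaded them without trouble. My guess, which I have not verified, is that the spacy wheel carries a newer manylinux tag that pip 9.0.1 does not recognise, so it falls back to the .tar.gz source archive.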
