I am developing a Python program, part of which uses headless Chrome and Selenium to perform a repetitive process. I am aiming to run the program on AWS Lambda.
The program has around 1 GB of dependencies, so the standard approach of a .zip archive containing my function code and dependencies is not an option: the total unzipped size of the function and all layers can't exceed the deployment package limit of 250 MB.
That is where the new AWS Lambda – Container Image Support comes in (I used this linked tutorial to develop the whole implementation, so please read it if you need more info). It allows me to package and deploy my Lambda function as a container image of up to 10 GB in size.
I am using the base image hosted in ECR Public provided by AWS, which runs Amazon Linux 2. In my Dockerfile I:
- Download the base image.
- Define some global variables.
- Copy my files over.
- Install my pip dependencies.
- Use yum to install some packages.
- Finally, download and install Chrome (87.0.4280.88 at the time of writing) and Chromedriver (87.0.4280.88).
There is a possibility that this is where the problem lies, but I highly doubt it, as both are the same version and ChromeDriver uses the same version number scheme as Chrome.
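The version match can be checked against what the binaries inside the image actually report, rather than against the download URLs. This is a minimal sketch, assuming both binaries print a four-part version such as `Google Chrome 87.0.4280.88` or `ChromeDriver 87.0.4280.88 (...)` when called with `--version`; the helper names `parse_version`, `same_build`, and `reported_version` are hypothetical.

```python
import re
import subprocess

# Four-part dotted version, e.g. 87.0.4280.88
VERSION_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)\.(\d+)")


def parse_version(text):
    """Return the first four-part version found in text as a tuple of ints, or None."""
    match = VERSION_RE.search(text)
    return tuple(int(part) for part in match.groups()) if match else None


def same_build(chrome_output, driver_output):
    """True when major, minor, and build numbers match (the patch number may differ)."""
    chrome = parse_version(chrome_output)
    driver = parse_version(driver_output)
    return chrome is not None and driver is not None and chrome[:3] == driver[:3]


def reported_version(binary):
    """Run `<binary> --version` and return its stdout (assumes the binary exists)."""
    result = subprocess.run(
        [binary, "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Inside the container, something like `same_build(reported_version("/usr/bin/google-chrome"), reported_version("/usr/local/bin/chromedriver"))` would compare the two paths used in this question.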
This is my Dockerfile:
# 1) DOWNLOAD BASE IMAGE.
FROM public.ecr.aws/lambda/python:3.8
# 2) DEFINE GLOBAL ARGS.
ARG MAIN_FILE="main.py"
ARG ENV_FILE="params.env"
ARG REQUIREMENTS_FILE="requirements.txt"
ARG FUNCTION_ROOT="."
ARG RUNTIME_VERSION="3.8"
# 3) COPY FILES.
# Copy The Main .py File.
COPY ${MAIN_FILE} ${LAMBDA_TASK_ROOT}
# Copy The .env File.
COPY ${ENV_FILE} ${LAMBDA_TASK_ROOT}
# Copy The requirements.txt File.
COPY ${REQUIREMENTS_FILE} ${LAMBDA_TASK_ROOT}
# Copy Helpers Folder.
COPY helpers/ ${LAMBDA_TASK_ROOT}/helpers/
# Copy Private Folder.
COPY priv/ ${LAMBDA_TASK_ROOT}/priv/
# Copy Source Data Folder.
COPY source_data/ ${LAMBDA_TASK_ROOT}/source_data/
# 4) INSTALL DEPENDENCIES.
RUN --mount=type=cache,target=/root/.cache/pip python3.8 -m pip install --upgrade pip
RUN --mount=type=cache,target=/root/.cache/pip python3.8 -m pip install wheel
RUN --mount=type=cache,target=/root/.cache/pip python3.8 -m pip install urllib3
RUN --mount=type=cache,target=/root/.cache/pip python3.8 -m pip install -r requirements.txt --default-timeout=100
# 5) DOWNLOAD & INSTALL CHROME + CHROMEDRIVER.
#RUN yum -y upgrade
RUN yum -y install wget unzip libX11 nano xorg-x11-xauth xclock xterm
# Install Chrome
RUN wget https://intoli.com/install-google-chrome.sh
RUN bash install-google-chrome.sh
# Install Chromedriver
RUN wget https://chromedriver.storage.googleapis.com/87.0.4280.88/chromedriver_linux64.zip
RUN unzip ./chromedriver_linux64.zip
RUN rm ./chromedriver_linux64.zip
RUN mv -f ./chromedriver /usr/local/bin/chromedriver
RUN chmod 755 /usr/local/bin/chromedriver
# 6) SET CMD OF HANDLER.
CMD [ "main.lambda_handler" ]
The image always builds without a problem.
This is my docker-compose.yml file:
version: "3.7"
services:
  lambda:
    image: tbg-lambda:latest
    build: .
    ports:
      - "8080:8080"
    env_file:
      - ./params.env
Now that the image is built, I can test it locally with cURL. Here, I am passing an empty JSON payload:
curl -XPOST "http://localhost:8080/2015-03-31/functions/function/invocations" -d '{}'
This runs the whole program perfectly from start to end using Chrome in headless mode, with no errors.
So great - the Docker container works locally and as expected.
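The same local invocation can be scripted from Python, which is handy for repeating the test with different events. This is a minimal sketch using only the standard library and the same URL as the cURL call; the function names `build_invocation` and `invoke`, and the 900-second default timeout, are assumptions chosen for illustration.

```python
import json
import urllib.request

# Same local endpoint as the cURL call above.
LOCAL_URL = "http://localhost:8080/2015-03-31/functions/function/invocations"


def build_invocation(payload, url=LOCAL_URL):
    """Build a POST request carrying the JSON-encoded test event."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


def invoke(payload, url=LOCAL_URL, timeout=900):
    """Send the event to the running container and return the response body as text."""
    request = build_invocation(payload, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the container running through docker-compose, `print(invoke({}))` sends the same empty payload as the cURL command.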
Let's upload it to ECR so I can use it with my Lambda function (ECR URL changed for security):
aws ecr create-repository --repository-name tbg-lambda --image-scanning-configuration scanOnPush=true
docker tag tbg-lambda:latest 123412341234.dkr.ecr.sa-east-1.amazonaws.com/tbg-lambda:latest
aws ecr get-login-password | docker login --username AWS --password-stdin 123412341234.dkr.ecr.sa-east-1.amazonaws.com
docker push 123412341234.dkr.ecr.sa-east-1.amazonaws.com/tbg-lambda:latest
Everything pushes as expected. I then create my new Lambda function, choosing "Container Image" as the function option, and attach the IAM role with all the permissions I need.
I have the memory set to the max value just to ensure this isn't the problem.
OK, so let's get to the point of failure.
I use a test event to invoke the function through the console.
Everything runs perfectly until it hits the code that creates the webdriver with Chrome:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# logPath is defined elsewhere in the project.
options = Options()
options.add_argument('--no-sandbox')
options.add_argument('--headless')
options.add_argument('--single-process')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--remote-debugging-port=9222')
options.add_argument('--disable-infobars')
driver = webdriver.Chrome(
    service_args=["--verbose", "--log-path={}".format(logPath)],
    executable_path="/usr/local/bin/chromedriver",
    options=options
)
PS: logPath is just another folder in the project directory. The logs are written there as expected and are shown below.
Here is the part of the CloudWatch logs where the error is highlighted:
Caught WebDriverException Error: unknown error: Chrome failed to start: crashed.
(unknown error: DevToolsActivePort file doesn't exist)
(The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
END RequestId: 7c933bca-5f0d-4458-9529-db28da677444
REPORT RequestId: 7c933bca-5f0d-4458-9529-db28da677444 Duration: 59104.94 ms Billed Duration: 59105 ms Memory Size: 10240 MB Max Memory Used: 481 MB
RequestId: 7c933bca-5f0d-4458-9529-db28da677444 Error: Runtime exited with error: exit status 1 Runtime.ExitError
And here is the full Chromedriver log file:
[1608748453.064][INFO]: Starting ChromeDriver 87.0.4280.88 (89e2380a3e36c3464b5dd1302349b1382549290d-refs/branch-heads/4280@{#1761}) on port 54581
[1608748453.064][INFO]: Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.
[1608748453.064][INFO]: /dev/shm not writable, adding --disable-dev-shm-usage switch
[1608748453.679][SEVERE]: CreatePlatformSocket() failed: Address family not supported by protocol (97)
[1608748453.679][INFO]: listen on IPv6 failed with error ERR_ADDRESS_UNREACHABLE
[1608748454.432][INFO]: [13826d22c628514ca452d1f2949eb011] COMMAND InitSession {
"capabilities": {
"alwaysMatch": {
"browserName": "chrome",
"goog:chromeOptions": {
"args": [ "--no-sandbox", "--headless", "--single-process", "--disable-dev-shm-usage" ],
"extensions": [ ]
},
"platformName": "any"
},
"firstMatch": [ {
} ]
},
"desiredCapabilities": {
"browserName": "chrome",
"goog:chromeOptions": {
"args": [ "--no-sandbox", "--headless", "--single-process", "--disable-dev-shm-usage" ],
"extensions": [ ]
},
"platform": "ANY",
"version": ""
}
}
[1608748454.433][INFO]: Populating Preferences file: {
"alternate_error_pages": {
"enabled": false
},
"autofill": {
"enabled": false
},
"browser": {
"check_default_browser": false
},
"distribution": {
"import_bookmarks": false,
"import_history": false,
"import_search_engine": false,
"make_chrome_default_for_user": false,
"skip_first_run_ui": true
},
"dns_prefetching": {
"enabled": false
},
"profile": {
"content_settings": {
"pattern_pairs": {
"https://*,*": {
"media-stream": {
"audio": "Default",
"video": "Default"
}
}
}
},
"default_content_setting_values": {
"geolocation": 1
},
"default_content_settings": {
"geolocation": 1,
"mouselock": 1,
"notifications": 1,
"popups": 1,
"ppapi-broker": 1
},
"password_manager_enabled": false
},
"safebrowsing": {
"enabled": false
},
"search": {
"suggest_enabled": false
},
"translate": {
"enabled": false
}
}
[1608748454.433][INFO]: Populating Local State file: {
"background_mode": {
"enabled": false
},
"ssl": {
"rev_checking": {
"enabled": false
}
}
}
[1608748454.433][INFO]: Launching chrome: /usr/bin/google-chrome --disable-background-networking --disable-client-side-phishing-detection --disable-default-apps --disable-dev-shm-usage --disable-hang-monitor --disable-popup-blocking --disable-prompt-on-repost --disable-sync --enable-automation --enable-blink-features=ShadowDOMV0 --enable-logging --headless --log-level=0 --no-first-run --no-sandbox --no-service-autorun --password-store=basic --remote-debugging-port=0 --single-process --test-type=webdriver --use-mock-keychain --user-data-dir=/tmp/.com.google.Chrome.xgjs0h data:,
mkdir: cannot create directory ‘/.local’: Read-only file system
touch: cannot touch ‘/.local/share/applications/mimeapps.list’: No such file or directory
/usr/bin/google-chrome: line 45: /dev/fd/62: No such file or directory
/usr/bin/google-chrome: line 46: /dev/fd/62: No such file or directory
prctl(PR_SET_NO_NEW_PRIVS) failed
[1223/183429.578846:FATAL:zygote_communication_linux.cc(255)] Cannot communicate with zygote
Failed to generate minidump.[1608748469.769][INFO]: [13826d22c628514ca452d1f2949eb011] RESPONSE InitSession ERROR unknown error: Chrome failed to start: crashed.
(unknown error: DevToolsActivePort file doesn't exist)
(The process started from chrome location /usr/bin/google-chrome is no longer running, so ChromeDriver is assuming that Chrome has crashed.)
[1608748469.769][DEBUG]: Log type 'driver' lost 0 entries on destruction
[1608748469.769][DEBUG]: Log type 'browser' lost 0 entries on destruction
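Because most of this log is the preferences dump, it can help to filter it down to the lines that report a problem. This is a rough sketch, assuming ChromeDriver entries keep the `[timestamp][LEVEL]: message` shape shown above; the keyword list and the name `interesting_lines` are my own choices, not anything ChromeDriver defines.

```python
import re

# ChromeDriver's own entries, e.g. "[1608748453.679][SEVERE]: message".
ENTRY_RE = re.compile(r"^\[(\d+\.\d+)\]\[(\w+)\]: (.*)$")

# Substrings (lower-case) that mark error output from Chrome or its launcher script.
KEYWORDS = ("fatal", "failed", "cannot", "no such file")


def interesting_lines(log_text):
    """Return SEVERE entries plus any line containing one of the error keywords."""
    hits = []
    for line in log_text.splitlines():
        entry = ENTRY_RE.match(line)
        level = entry.group(2) if entry else None
        lowered = line.lower()
        if level == "SEVERE" or any(word in lowered for word in KEYWORDS):
            hits.append(line)
    return hits
```

Run over the log above, this keeps the `/.local` and `/dev/fd/62` errors, the `prctl` failure, and the zygote `FATAL` line, and drops the JSON preference blocks.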
One thing I think could be the problem is the way Lambda runs this container compared with how I run it locally.
A lot of people recommend NOT running Chrome as root, so is Lambda running the container as root, and is that what is causing this? If so, how can I tell Lambda or Docker to run the code as a non-root user?
This is mentioned here: https://github.com/heroku/heroku-buildpack-google-chrome/issues/46#issuecomment-484562558
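Which user the code runs as, and which directories it can write to, can be checked from inside the handler in both environments and compared. This is a diagnostic sketch only, not a fix; the names `can_write` and `describe_runtime` and the list of paths are assumptions. The `mkdir: cannot create directory '/.local': Read-only file system` line in the log suggests the Chrome launcher script is resolving a home directory under `/`, so `HOME` is included in the report.

```python
import os
import tempfile


def can_write(directory):
    """Return True if a temporary file can be created in directory."""
    try:
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers missing directories, permission errors, and read-only file systems.
        return False


def describe_runtime(paths=("/tmp", "/dev/shm", "/")):
    """Collect the effective user ID, HOME, and writability of a few paths."""
    home = os.environ.get("HOME")
    report = {
        "euid": os.geteuid(),
        "is_root": os.geteuid() == 0,
        "home": home,
        "writable": {path: can_write(path) for path in paths},
    }
    if home:
        report["writable"][home] = can_write(home)
    return report
```

Calling `print(describe_runtime())` at the top of `lambda_handler` once locally and once on Lambda would show whether the two environments differ in user or in writable locations.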
I have been fighting with this error pretty much since AWS announced the lambda containers so any help with this would be brilliant 🙏 Please ask for any more info if I missed something!
Thanks in advance.