How to run headless Chrome with Selenium in Python?
S

11

129

I'm trying some stuff out with selenium, and I really want my script to run quickly.

I thought that running my script with headless Chrome would make it faster.

First, is that assumption correct, or does it not matter if I run my script with a headless driver?

I want headless Chrome to work, but somehow it isn't working correctly. I tried different things, and most suggested that it would work as said here in the October update:

How to configure ChromeDriver to initiate Chrome browser in Headless mode through Selenium?

But when I tried that, I saw weird console output, and it still doesn't seem to work.

Any tips appreciated.

Schlimazel answered 6/12, 2018 at 18:0 Comment(6)
That's pretty much outdated, or what do you mean? Maybe I'm missing the point; could you clarify? - Schlimazel
headless won't make it run noticeably faster - Photocell
@CoreyGoldberg how so, do you have any sources? - Schlimazel
you should benchmark both - Photocell
@Schlimazel What is the weird console output? - Merlinmerlina
I agree with @CoreyGoldberg. Though, running headless has other advantages. - Grandiloquent
L
223

To run chrome-headless just add --headless via chrome_options.add_argument, e.g.:

from selenium import webdriver 
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
# chrome_options.add_argument("--disable-extensions")
# chrome_options.add_argument("--disable-gpu")
# chrome_options.add_argument("--no-sandbox") # linux only
chrome_options.add_argument("--headless=new") # for Chrome >= 109
# chrome_options.add_argument("--headless")
# chrome_options.headless = True # older Selenium only; this property was deprecated in Selenium 4.8 and later removed
driver = webdriver.Chrome(options=chrome_options)
start_url = "https://duckgo.com"
driver.get(start_url)
print(driver.page_source.encode("utf-8"))
# b'<!DOCTYPE html><html xmlns="http://www....
driver.quit()

You wrote: "So my thought is that running it with headless chrome would make my script faster."

Try using Chrome options like --disable-extensions or --disable-gpu and benchmark it, but I wouldn't count on a substantial improvement.
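One way to follow the benchmarking advice is a small timing harness. The sketch below is only an illustration and not part of the original answer: the helper names time_calls, load_page, and compare are hypothetical, the URL is a placeholder, and it assumes Selenium 4 with a local Chrome install.

```python
import statistics
import time


def time_calls(func, repeats=3):
    """Call func() `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        durations.append(time.perf_counter() - start)
    return durations


def load_page(headless, url="https://www.example.com"):
    """Start Chrome, load one page, and quit. The URL is a placeholder."""
    # Imported lazily so the timing helper above works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    if headless:
        options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
    finally:
        driver.quit()  # always release the browser, even if the load fails


def compare(repeats=3):
    """Return the median start-load-quit time for headed and headless runs."""
    return {
        "headed": statistics.median(time_calls(lambda: load_page(False), repeats)),
        "headless": statistics.median(time_calls(lambda: load_page(True), repeats)),
    }
```

Calling compare() on your own machine gives two medians to compare. Browser startup usually dominates such a short run, so it may be worth timing your real script as well.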


References: headless-chrome

Licensee answered 6/12, 2018 at 18:32 Comment(4)
@AndroidNoobie the edit suggested by ukashima huksay is one that was implemented, if I recall correctly, in May 2018. It finds its way now for getting rep. ukashima huksay should have mentioned it though. (From Review). - Noma
@ukashima huksay next time you find this Chrome change, mention it in a comment behind the change as I did a few weeks ago somewhere on a question. See also my previous comment above this one. (From Review). - Noma
It's recommended to use chrome_options.add_argument("--headless=new") now for a better experience: selenium.dev/blog/2023/headless-is-going-away - Eanes
If you are getting a security warning, try adding chrome_options.add_argument("user-agent=fake-useragent"). This fixed it for me. - Prevaricator
O
28

Install & run containerized Chrome:

docker pull selenium/standalone-chrome
docker run --rm -d -p 4444:4444 --shm-size=2g selenium/standalone-chrome

Connect using webdriver.Remote:

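from selenium import webdriver

# Selenium 4 takes options= here; older releases took webdriver.DesiredCapabilities.CHROME instead
driver = webdriver.Remote('http://localhost:4444/wd/hub', options=webdriver.ChromeOptions())
driver.set_window_size(1280, 1024)
driver.get('https://www.google.com')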
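Right after docker run, the container may need a few seconds before it accepts sessions. Here is a minimal sketch of waiting for that, assuming the server answers the standard WebDriver status request under the same base URL used above. The function names grid_is_ready and wait_for_grid are hypothetical.

```python
import http.client
import json
import time
import urllib.request


def grid_is_ready(status_payload):
    """Return True if a parsed WebDriver status response reports ready."""
    return bool(status_payload.get("value", {}).get("ready", False))


def wait_for_grid(base_url="http://localhost:4444/wd/hub", timeout=30.0, interval=0.5):
    """Poll <base_url>/status until the server reports ready or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(base_url + "/status", timeout=5) as resp:
                if grid_is_ready(json.load(resp)):
                    return True
        except (OSError, ValueError, http.client.HTTPException):
            pass  # container still starting, or the body is not valid JSON yet
        time.sleep(interval)
    return False
```

Calling wait_for_grid() before webdriver.Remote(...) avoids a connection error when the script starts faster than the container.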
Ordinarily answered 31/1, 2020 at 19:3 Comment(6)
from selenium import webdriver and driver = webdriver.Remote('http://localhost:4444/wd/hub', webdriver.DesiredCapabilities.CHROME) - Atween
what are the advantages of this over --headless? - Destitution
Where do you get DesiredCapabilities from? I don't see the import... I think you meant to use webdriver.DesiredCapabilities? - Spagyric
@GregWoods - This is a great solution. Many websites (including websites that use 'Cloudflare DNS' to detect robots and crawlers) will see the --headless flag in Chrome and will prevent you from browsing the website with your software. By using a Docker container, you circumvent the --headless flag that can cause you to be blocked. - Shea
@GregWoods Great solution, too, when you want to run Selenium tests on a desktop-less server. Before switching to this solution, we had to install a desktop on one of our servers only because otherwise, the Chrome package would not even install. Now we can run Selenium tests on any server that has a Docker environment. - Cavity
This was great, but took some tweaking to work with selenium 4.15. Driver should now look like: driver = webdriver.Remote("http://localhost:4444/wd/hub", options=webdriver.ChromeOptions()) - Prevaricator
C
9
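from time import sleep

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

chrome_options = Options()
chrome_options.add_argument("--headless")

# Selenium 4 takes the driver path through Service; executable_path= was removed
driver = webdriver.Chrome(service=Service("./chromedriver"), options=chrome_options)
url = "https://mcmap.net/q/121965/-how-to-run-headless-chrome-with-selenium-in-python"
driver.get(url)

sleep(5)

# find_element_by_xpath was removed in Selenium 4; use find_element(By.XPATH, ...)
h1 = driver.find_element(By.XPATH, "//h1[@itemprop='name']").text
print(h1)
driver.quit()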

Then I run the script on my local machine:

➜ python script.py
Running Selenium with Headless Chrome Webdriver

It works, and it runs with headless Chrome.
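The fixed sleep(5) in the script above waits the full five seconds even when the page is ready sooner. Below is a minimal sketch of polling instead. The wait_until helper is hypothetical and written in plain Python for illustration; Selenium also ships WebDriverWait in selenium.webdriver.support.ui for the same purpose.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.25):
    """Poll condition() until it returns a truthy value, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Possible use with the script above (needs `driver` and `By` from that script):
# headings = wait_until(
#     lambda: driver.find_elements(By.XPATH, "//h1[@itemprop='name']")
# )
# print(headings[0].text)
```

find_elements returns an empty list while nothing matches, so the lambda stays falsy until the heading appears.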

Comradery answered 27/9, 2019 at 14:10 Comment(0)
S
6

If you are on Linux, you might have to add --no-sandbox as well as specific window size settings. The --no-sandbox flag is not needed on Windows if you set up the user container properly.

Use --disable-gpu only on Windows. Other platforms no longer require it. The --disable-gpu flag is a temporary workaround for a few bugs.

// Configure and start a headless Chrome browser
WebDriverManager.chromedriver().setup();
ChromeOptions chromeOptions = new ChromeOptions();
chromeOptions.addArguments("--no-sandbox");
chromeOptions.addArguments("--headless");
chromeOptions.addArguments("--disable-gpu");
// chromeOptions.addArguments("--window-size=1400,2100"); // enable this on Linux
driver = new ChromeDriver(chromeOptions);
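Since the question is about Python, the same per-platform advice can be sketched there. This is an illustration under assumptions, not the answer's own code: the headless_arguments helper is hypothetical, and the window size is the value from the commented-out line above.

```python
import sys


def headless_arguments(platform=sys.platform, window_size=(1400, 2100)):
    """Return Chrome switches for headless mode, chosen by platform."""
    args = ["--headless"]
    if platform.startswith("linux"):
        # Linux: sandbox workaround plus an explicit window size
        args.append("--no-sandbox")
        args.append("--window-size={},{}".format(*window_size))
    elif platform.startswith("win"):
        # Windows: the temporary GPU workaround described above
        args.append("--disable-gpu")
    return args


# Possible use with Selenium:
# from selenium import webdriver
# options = webdriver.ChromeOptions()
# for arg in headless_arguments():
#     options.add_argument(arg)
# driver = webdriver.Chrome(options=options)
```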
Spectacled answered 7/12, 2018 at 7:11 Comment(0)
P
4

As stated by the accepted answer:

options.add_argument("--headless")

These tips might help to speed things up especially for headless:

There are quite a few things you can do in headless mode that you can't do in non-headless mode.

Since you will be using Chrome Headless, I've found adding this reduces the CPU usage by about 20% for me (I found this to be a CPU and memory hog when looking at htop)

--disable-crash-reporter

This switch only takes effect when you are running headless. It might speed things up for you!

My settings are currently as follows. They reduce CPU usage by about 20%, though the time saving is only marginal:

options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--disable-renderer-backgrounding")
options.add_argument("--disable-background-timer-throttling")
options.add_argument("--disable-backgrounding-occluded-windows")
options.add_argument("--disable-client-side-phishing-detection")
options.add_argument("--disable-crash-reporter")
options.add_argument("--disable-oopr-debug-crash-dump")
options.add_argument("--no-crash-upload")
options.add_argument("--disable-gpu")
options.add_argument("--disable-extensions")
options.add_argument("--disable-low-res-tiling")
options.add_argument("--log-level=3")
options.add_argument("--silent")
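A long switch list like this can be kept in one place and applied in a loop, which makes it easier to toggle entries while benchmarking. This is a minimal sketch: the names CRASH_REPORTING_SWITCHES and apply_switches are hypothetical, and the tuple holds only a subset of the switches above.

```python
# A subset of the switches listed above, kept as data so entries are easy to toggle.
CRASH_REPORTING_SWITCHES = (
    "--disable-crash-reporter",
    "--disable-oopr-debug-crash-dump",
    "--no-crash-upload",
)


def apply_switches(options, switches):
    """Add each switch once, in order, to any object that has add_argument()."""
    seen = set()
    for switch in switches:
        if switch not in seen:
            seen.add(switch)
            options.add_argument(switch)
    return options


# Possible use with Selenium:
# from selenium import webdriver
# options = apply_switches(webdriver.ChromeOptions(), CRASH_REPORTING_SWITCHES)
# options.add_argument("--headless")
# driver = webdriver.Chrome(options=options)
```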

I found this to be a pretty good list (full list I think) of command line switches with explanations: https://peter.sh/experiments/chromium-command-line-switches/

Some additional things you can turn off are also mentioned here: https://github.com/GoogleChrome/chrome-launcher/blob/main/docs/chrome-flags-for-tools.md

I hope this helps someone

Peel answered 24/1, 2023 at 15:17 Comment(0)
C
4

Headless mode in Chrome was recently updated. The --headless flag now accepts a value and can be used as follows:

  • For Chrome version 109 and above, the --headless=new flag gives access to the full functionality of the Chrome browser in headless mode.
  • For Chrome versions 96 through 108, the --headless=chrome option provides the headless Chrome browser.

So, let's add

options.add_argument("--headless=new")

for newer versions of Chrome in headless mode, as mentioned above.
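If a script has to run against several Chrome versions, the rule above can be written as a small lookup. This is a sketch under assumptions: the function names are hypothetical, and the fallback to plain --headless for versions below 96 is an assumption, since the answer does not cover them.

```python
import re


def parse_major_version(version_text):
    """Extract the major version from text such as 'Google Chrome 120.0.6099.109'."""
    match = re.search(r"(\d+)\.", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    return int(match.group(1))


def headless_flag(chrome_major_version):
    """Pick the headless switch for a Chrome major version, following the rule above."""
    if chrome_major_version >= 109:
        return "--headless=new"
    if chrome_major_version >= 96:
        return "--headless=chrome"
    return "--headless"  # assumption: older versions only know the plain flag
```

For example, options.add_argument(headless_flag(parse_major_version(text))) where text holds the output of the browser's --version command.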

Creamery answered 23/2, 2023 at 3:15 Comment(0)
C
3

Once you have Selenium and the web driver installed, the following worked for me with headless Chrome on a Linux cluster:

from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument("--headless")
options.add_argument("--disable-extensions")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--no-sandbox")
options.add_experimental_option("prefs",{"download.default_directory":"/databricks/driver"})
driver = webdriver.Chrome(options=options)  # the chrome_options= keyword is deprecated in favor of options=
Contributory answered 29/4, 2020 at 12:26 Comment(0)
M
2

Steps (tested on a headless Debian Linux 9.4 server):

  1. Install Chrome and ChromeDriver:

    # install chrome
    curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add -
    echo "deb [arch=amd64]  http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google-chrome.list
    apt-get -y update
    apt-get -y install google-chrome-stable
    
    # install chrome driver
    wget https://chromedriver.storage.googleapis.com/77.0.3865.40/chromedriver_linux64.zip
    unzip chromedriver_linux64.zip
    mv chromedriver /usr/bin/chromedriver
    chown root:root /usr/bin/chromedriver
    chmod +x /usr/bin/chromedriver
    
  2. Install selenium:

    pip install selenium
    

    and run this Python code:

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service

    options = Options()
    options.add_argument("--no-sandbox")
    options.add_argument("--headless")
    options.add_argument("--start-maximized")
    options.add_argument("--window-size=1900,1080")
    # Selenium 4: the driver path goes in Service, and options= replaces chrome_options=
    driver = webdriver.Chrome(service=Service("/usr/bin/chromedriver"), options=options)
    driver.get("https://www.example.com")
    html = driver.page_source
    print(html)
    driver.quit()
    
Malayopolynesian answered 14/10, 2019 at 8:29 Comment(0)
S
2
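from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

chrome_options = Options()
chrome_options.add_argument("--headless")
# Selenium 4: pass the chromedriver path through Service instead of executable_path=
service = Service(r"C:\Program Files\Google\Chrome\Application\chromedriver.exe")
driver = webdriver.Chrome(service=service, options=chrome_options)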

This works for me.

Selene answered 13/8, 2022 at 10:1 Comment(0)
D
0

You can run headless Chrome with Selenium in Python as shown below. --headless=new is better because --headless uses the old headless mode, according to "Headless is Going Away!":

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("--headless=new") # Here
driver = webdriver.Chrome(options=options)

Or:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new") # Here
driver = webdriver.Chrome(options=options)

In addition, the examples below test Django Admin with headless Chrome, Selenium, pytest-django and Django. My answer explains how to test Django Admin with multiple headless browsers (Chrome, Microsoft Edge and Firefox), Selenium, pytest-django and Django:

# "tests/test_1.py"

import pytest
from selenium import webdriver
from django.test import LiveServerTestCase

@pytest.fixture(scope="class")
def chrome_driver_init(request):
    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    chrome_driver = webdriver.Chrome(options=options)
    request.cls.driver = chrome_driver
    yield
    chrome_driver.quit()  # quit() ends the whole session; close() only closes the current window

@pytest.mark.usefixtures("chrome_driver_init")
class Test_URL_Chrome(LiveServerTestCase):
    def test_open_url(self):
        self.driver.get(("%s%s" % (self.live_server_url, "/admin/")))
        assert "Log in | Django site admin" in self.driver.title

Or:

# "tests/conftest.py"

import pytest
from selenium import webdriver

@pytest.fixture(scope="class")
def chrome_driver_init(request):
    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")
    chrome_driver = webdriver.Chrome(options=options)
    request.cls.driver = chrome_driver
    yield
    chrome_driver.quit()  # quit() ends the whole session; close() only closes the current window

# "tests/test_1.py"

import pytest
from django.test import LiveServerTestCase

@pytest.mark.usefixtures("chrome_driver_init")
class Test_URL_Chrome(LiveServerTestCase):
    def test_open_url(self):
        self.driver.get(("%s%s" % (self.live_server_url, "/admin/")))
        assert "Log in | Django site admin" in self.driver.title
Distort answered 28/8, 2023 at 2:14 Comment(0)
F
0

There are different ways of running Chrome in headless environments. (You'll find more details in this answer: https://mcmap.net/q/122264/-downloading-with-chrome-headless-and-selenium)

One, the standard headless mode: (Faster than headed mode, but you may experience compatibility issues.)

options.add_argument("--headless")

Then there's the new Chrome headless mode as of Chrome 109: (It runs at the same speed as headed mode, as the two are virtually identical.)

options.add_argument("--headless=new")

(Between Chrome 96 and 108, that new mode used to be --headless=chrome, but it was renamed.)

You can also run regular Chrome in a headless environment if using a headless display, such as Xvfb with a Python program that controls it, such as pyvirtualdisplay. (See https://mcmap.net/q/122265/-how-do-i-run-selenium-in-xvfb and https://mcmap.net/q/117788/-can-selenium-webdriver-open-browser-windows-silently-in-the-background)

from pyvirtualdisplay import Display
from selenium import webdriver

display = Display(visible=0, size=(800, 600))
display.start()

driver = webdriver.Chrome()
driver.get('http://www.google.com')
driver.quit()

display.stop()

For more compatibility, you can try combining the above with the new Chrome headless mode:

options.add_argument("--headless=new")
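When a virtual display and a browser are both in play, it helps to release them in reverse order even if the page load fails. The sketch below shows one way to structure that. The fetch_title helper is hypothetical and takes the display and a driver factory as parameters, so it makes no claim about how pyvirtualdisplay or Selenium must be wired together.

```python
def fetch_title(display, make_driver, url):
    """Start the display, open url in a fresh driver, and always clean up both."""
    display.start()
    try:
        driver = make_driver()
        try:
            driver.get(url)
            return driver.title
        finally:
            driver.quit()  # runs even if get() raised
    finally:
        display.stop()  # runs even if the driver could not be created


# Possible use with the libraries named above:
# from pyvirtualdisplay import Display
# from selenium import webdriver
#
# def make_driver():
#     options = webdriver.ChromeOptions()
#     options.add_argument("--headless=new")
#     return webdriver.Chrome(options=options)
#
# print(fetch_title(Display(visible=0, size=(800, 600)), make_driver, "http://www.google.com"))
```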
Fail answered 29/8, 2023 at 1:2 Comment(0)
