How to set up RSelenium for R?

3

16

"everything was better back then"...

Since Firefox 49 (?), the RSelenium package can no longer be used in a straightforward way. I have searched the whole internet for a SIMPLE how-to manual for setting up RSelenium but did not find anything relevant and up to date.

Can someone provide me and all the others out there who have no clue a simple How To manual? Like:

  1. download XY
  2. open AB

so I can run code like the following

require(RSelenium)

remDr <- remoteDriver(remoteServerAddr = "localhost", port = 4444L,
                      browserName = "firefox")
remDr$open()
Gynophore answered 26/2, 2017 at 12:59 Comment(2)
Depending on what you're trying to do, you may want to try out github.com/hrbrmstr/splashr though hundreds if not thousands of folks seem to be able to do what you're asking w/o issue. – Skateboard
I have used splashr and can recommend it also 👍 – Anuran
11

Install the latest version of RSelenium (>= 1.7.1), then run the following:

library(RSelenium)
rD <- rsDriver() # runs a chrome browser, wait for necessary files to download
remDr <- rD$client
# no need for remDr$open() browser should already be open

If you want a firefox browser use rsDriver(browser = "firefox").
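The snippet above does not show how to shut things down again. Below is a minimal sketch, not an official RSelenium helper, assuming that rsDriver() returns a list with a client element (as used above) and a server element that has a stop() function. The names with_browser, action and start are hypothetical; start only exists so the pattern can be tried without a real browser.

```r
# Run `action` against a browser session and always clean up afterwards.
# `start` is a hypothetical hook: by default it is RSelenium::rsDriver.
with_browser <- function(action, browser = "firefox", start = NULL) {
  if (is.null(start)) start <- RSelenium::rsDriver
  rD <- start(browser = browser)
  on.exit({
    try(rD$client$close(), silent = TRUE)  # close the browser window
    rD$server$stop()                       # stop the Selenium server, freeing the port
  }, add = TRUE)
  action(rD$client)
}

# Example (needs RSelenium, Java and Firefox installed):
# with_browser(function(remDr) {
#   remDr$navigate("http://www.r-project.org")
#   remDr$getTitle()
# })
```

Stopping the server matters because a server left running keeps its port occupied, which can make a later rsDriver() call on the same port fail.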

This is detailed in http://rpubs.com/johndharrison/RSelenium-Basics appendix. The recommended way to run RSelenium is via Docker containers however. Instructions for use of Docker with RSelenium can be found at http://rpubs.com/johndharrison/RSelenium-Docker

ISSUES:

If you run into problems, for example because of admin rights or anti-virus software, you can run a Selenium server manually. The easiest way to do this is via the wdman package:

selCommand <-
  wdman::selenium(jvmargs = c("-Dwebdriver.chrome.verboseLogging=true"),
                  retcommand = TRUE)
> cat(selCommand)
C:\PROGRA~3\Oracle\Java\javapath\java.exe -Dwebdriver.chrome.verboseLogging=true -Dwebdriver.chrome.driver="C:\Users\john\AppData\Local\binman\binman_chromedriver\win32\2.27/chromedriver.exe" -Dwebdriver.gecko.driver="C:\Users\john\AppData\Local\binman\binman_geckodriver\win64\0.14.0/geckodriver.exe" -Dphantomjs.binary.path="C:\Users\john\AppData\Local\binman\binman_phantomjs\windows\2.1.1/phantomjs-2.1.1-windows/bin/phantomjs.exe" -jar "C:\Users\john\AppData\Local\binman\binman_seleniumserver\generic\3.0.1/selenium-server-standalone-3.0.1.jar" -port 4567

Using one of the wdman functions with the retcommand option enabled returns the command-line call that would have been run.

Now you can run the output of cat(selCommand) in a terminal:

C:\Users\john>C:\PROGRA~3\Oracle\Java\javapath\java.exe -Dwebdriver.chrome.verboseLogging=true -Dwebdriver.chrome.driver="C:\Users\john\AppData\Local\binman\binman_chromedriver\win32\2.27/chromedriver.exe" -Dwebdriver.gecko.driver="C:\Users\john\AppData\Local\binman\binman_geckodriver\win64\0.14.0/geckodriver.exe" -Dphantomjs.binary.path="C:\Users\john\AppData\Local\binman\binman_phantomjs\windows\2.1.1/phantomjs-2.1.1-windows/bin/phantomjs.exe" -jar "C:\Users\john\AppData\Local\binman\binman_seleniumserver\generic\3.0.1/selenium-server-standalone-3.0.1.jar" -port 4567
12:15:29.206 INFO - Selenium build info: version: '3.0.1', revision: '1969d75'
12:15:29.206 INFO - Launching a standalone Selenium Server
2017-02-08 12:15:29.223:INFO::main: Logging initialized @146ms
12:15:29.265 INFO - Driver class not found: com.opera.core.systems.OperaDriver
12:15:29.265 INFO - Driver provider com.opera.core.systems.OperaDriver registration is skipped:
Unable to create new instances on this machine.
12:15:29.265 INFO - Driver class not found: com.opera.core.systems.OperaDriver
12:15:29.266 INFO - Driver provider com.opera.core.systems.OperaDriver is not registered
12:15:29.271 INFO - Driver provider org.openqa.selenium.safari.SafariDriver registration is skipped:
 registration capabilities Capabilities [{browserName=safari, version=, platform=MAC}] does not match the current platform WIN10
2017-02-08 12:15:29.302:INFO:osjs.Server:main: jetty-9.2.15.v20160210
2017-02-08 12:15:29.317:INFO:osjsh.ContextHandler:main: Started o.s.j.s.ServletContextHandler@c4c815{/,null,AVAILABLE}
2017-02-08 12:15:29.332:INFO:osjs.ServerConnector:main: Started ServerConnector@4af044{HTTP/1.1}{0.0.0.0:4567}
2017-02-08 12:15:29.333:INFO:osjs.Server:main: Started @257ms
12:15:29.334 INFO - Selenium Server is up and running

Now try to start a browser:

remDr <- remoteDriver(port = 4567L, browserName = "chrome")
remDr$open()
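Before calling remDr$open(), it can help to confirm that the manually started server is reachable. The following is a rough sketch in base R, assuming the standalone server answers on the /wd/hub/status path (the usual default for Selenium 3, but an assumption here). The function names selenium_status_url and selenium_is_up are hypothetical.

```r
# Build the status URL for a Selenium standalone server.
# The "/wd/hub/status" path is an assumption about the server's defaults.
selenium_status_url <- function(host = "localhost", port = 4567L) {
  sprintf("http://%s:%d/wd/hub/status", host, as.integer(port))
}

# Return TRUE if the server replies with anything, FALSE otherwise.
selenium_is_up <- function(host = "localhost", port = 4567L) {
  reply <- tryCatch(
    suppressWarnings(
      readLines(selenium_status_url(host, port), n = 1L, warn = FALSE)
    ),
    error = function(e) character(0)  # e.g. connection refused
  )
  length(reply) > 0
}

# Example (only meaningful while the server from the terminal is running):
# if (selenium_is_up(port = 4567L)) {
#   remDr <- RSelenium::remoteDriver(port = 4567L, browserName = "chrome")
#   remDr$open()
# }
```

If selenium_is_up() returns FALSE, the problem is likely with the server itself (wrong port, firewall, server not started) rather than with RSelenium.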

If you are unable to run a Selenium server manually, you will need to address your issues (including relevant log files) to the Selenium project or the appropriate driver project (chromedriver/geckodriver/ghostdriver etc.).

Anuran answered 26/2, 2017 at 13:12 Comment(1)
When I run the rD <- rsDriver() command, RStudio crashes at BEGIN: POSTDOWNLOAD. The last message I get is: " BEGIN: POSTDOWNLOAD This application has requested the Runtime to terminate it in an unusual way." – Mauk
7
  1. Download Docker at https://www.docker.com/products/docker-desktop

  2. Run docker pull selenium/standalone-chrome-debug in a terminal (or cmd on Windows).

  3. In Docker Desktop's Dashboard, go to the "Images" tab on the left. The selenium/standalone-chrome-debug image should be listed there. Click Run.

  4. A popup will appear. There, click on "Optional Settings".

  5. Type 4445 under Ports. Click the "plus" sign and type 5901 in the new Ports input that appears. After that, click Run.

  6. Now, if you click on the Containers / Apps tab on the left, the running container should be listed.

  7. In R's console, run:

    install.packages("RSelenium")
    library(RSelenium)
    
    remDr <- remoteDriver(
            remoteServerAddr = "localhost",
            port = 4445L,
            browserName = "chrome"
    )
    
    remDr$open()
    

Every time you want RSelenium to work you will need to run the Docker container as you did in steps 3 and 5 above.
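If clicking through Docker Desktop each time gets tedious, the same container can probably be started from R. This is a sketch under assumptions: it presumes the docker command is on your PATH and that the image listens on container ports 4444 (Selenium) and 5900 (VNC), which the steps above do not state explicitly. The function name docker_run_args is hypothetical.

```r
# Build the arguments for `docker run`, mapping host ports to the
# container ports assumed here (4444 for Selenium, 5900 for VNC).
docker_run_args <- function(image = "selenium/standalone-chrome-debug",
                            selenium_port = 4445L,
                            vnc_port = 5901L) {
  c("run", "-d",
    "-p", sprintf("%d:4444", as.integer(selenium_port)),
    "-p", sprintf("%d:5900", as.integer(vnc_port)),
    image)
}

run_args <- docker_run_args()
cat("docker", run_args, "\n")  # show the command that would be run

# Uncomment to actually start the container (requires Docker to be running):
# system2("docker", run_args)
```

With the default arguments the host ports match steps 5 and 7 above, so remoteDriver(port = 4445L) should connect to the container.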

The steps also allow you to use VNC to watch what happens and debug. If you need to learn a bit about it go to https://www.realvnc.com/pt/connect/download/viewer/ More details are out of the scope of this topic.

This should take you to the point where you can follow RSelenium's basic usage vignette: https://cran.r-project.org/web/packages/RSelenium/vignettes/basics.html

You should also read about security related to exposed ports and how to handle it. These videos from R Consortium may help you out from here on: https://www.youtube.com/watch?v=OxbvFiYxEzI and https://www.youtube.com/watch?v=JcIeWiljQG4

I hope this helps you as it would have helped me some time ago.

Deference answered 8/12, 2020 at 3:17 Comment(1)
bookmarking this answer! – Bidden
2

I could never get RSelenium to work for me because I got the "could not determine server status" error. Finally, I figured out how to make it work, after reading an answer on this post, and the solution is very simple.

The issue is that Selenium by default runs with the latest versions of both Chrome and Firefox, regardless of what browser you specify. So, the Chrome driver will interfere with Firefox even if you specify Firefox. To avoid this issue, specify an older version of the Chrome driver or no driver at all (NULL).

driver <- rsDriver(browser = "firefox",
                   chromever = NULL,
                   verbose = FALSE)
remote_driver <- driver[["client"]] 
remote_driver$navigate("http://www.someurlhere.com")

This will open someurlhere.com in a Firefox browser. I still have never been able to make it work for Chrome.
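One detail worth checking when navigating: browser drivers generally expect an absolute URL including its scheme, and some reject a bare host name. A small sketch of a helper that adds a scheme when none is present; the name ensure_scheme and the default of "http" are assumptions for illustration, not part of RSelenium.

```r
# Prepend a scheme to `url` unless it already starts with one
# (something like "http://", "https://" or "file://").
ensure_scheme <- function(url, scheme = "http") {
  has_scheme <- grepl("^[A-Za-z][A-Za-z0-9+.-]*://", url)
  ifelse(has_scheme, url, paste0(scheme, "://", url))
}

ensure_scheme("www.someurlhere.com")

# Example with the driver from above:
# remote_driver$navigate(ensure_scheme("www.someurlhere.com"))
```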

Limerick answered 6/4, 2023 at 18:53 Comment(2)
I created a stack overflow account, then went and earned enough reputation just so I could upvote and comment that you're a lifesaver. No idea why this was so difficult to find as a solution but it worked fantastically for me. Thank you! – Elope
this here is what worked for me. thanks – Upswell
