Can't open an HTTPS website using SlimerJS, CasperJS, or PhantomJS

This is the first time I can't open a website using a headless browser such as PhantomJS, SlimerJS, or CasperJS. I just want to open the website, so I wrote a very basic script that opens it and takes a screenshot, but all three of them give me a blank picture.

I tried the following options:

--debug=true
--ssl-protocol=TLSv1.2 (I tried each of the available protocols)
--ignore-ssl-errors=true

Here are my scripts:

SlimerJS

var page = require("webpage").create();
// Set the viewport before loading so the page is laid out at this size.
page.viewportSize = { width: 1024, height: 768 };
page.open("https://domain/")
    .then(function (status) {
        if (status === "success") {
            page.render("screenshot.png");
        } else {
            console.log("Sorry, the page is not loaded");
        }
        page.close();
        phantom.exit();
    });

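PhantomJS

var page = require('webpage').create();
page.open('https://domain/', function (status) {
  // Report a failed load instead of silently rendering a blank image.
  if (status === 'success') {
    page.render('screenshot.png');
  } else {
    console.log('Sorry, the page is not loaded');
  }
  phantom.exit();
});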

CasperJS

var casper = require('casper').create({
  viewportSize: {width: 950, height: 950}
});

casper.start('https://domain/', function() {
    this.capture('screenshot.png');
});

casper.run();

I even tried online screen capture services to see whether they could open the site, but all of them gave me nothing as well.

Is there something I am missing?

Thermobarograph answered 15/4, 2018 at 7:43 Comment(0)

The issue is not PhantomJS as such. The site you are checking is protected by F5 network protection:

https://devcentral.f5.com/articles/these-are-not-the-scrapes-youre-looking-for-session-anomalies

So it's not that the page doesn't load. It is that the protection mechanism detects PhantomJS as a bot, based on checks they have implemented.

The easiest fix is to use Chrome instead of PhantomJS. Otherwise, expect to spend a decent amount of time investigating.
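A minimal sketch of the Chrome route using Puppeteer, which comes up in the comments under this answer. It assumes Node.js with `npm install puppeteer`; the function name `captureWithChrome` and the 1024x768 viewport are illustrative choices, not something from the original scripts. Whether the site's protection lets headless Chrome through is not confirmed here, so treat this as a starting point only.

```javascript
// Sketch only: take a screenshot with Chrome via Puppeteer.
// Assumes `npm install puppeteer`; captureWithChrome is a hypothetical helper name.
async function captureWithChrome(url, outputPath) {
  // Required inside the function so this file can be loaded
  // even on a machine where puppeteer is not installed yet.
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.setViewport({ width: 1024, height: 768 });
    // Wait until network activity has mostly settled before capturing.
    await page.goto(url, { waitUntil: 'networkidle2' });
    await page.screenshot({ path: outputPath });
  } finally {
    // Always close the browser, even if navigation fails.
    await browser.close();
  }
}
```

Usage would look like `captureWithChrome('https://domain/', 'screenshot.png').catch(console.error);`.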

Some similar questions from the past, both answered and unanswered:

Selenium and PhantomJS : webpage thinks Javascript is disabled

PhantomJS get no real content running on AWS EC2 CentOS 6

file_get_contents while bypassing javascript detection

Python POST Request Not Returning HTML, Requesting JavaScript Be Enabled

I will update this post with any further details I find. But in my experience, it is better to go with what works than to waste time on sites that don't work under PhantomJS.

Update-1

I tried importing the browser cookies into PhantomJS and it still doesn't work, which means there are some hard checks in place.

Mattheus answered 18/4, 2018 at 4:29 Comment(3)
You really gave me a clue, I will try it. By the way, what do you mean by Chrome? Do you mean Puppeteer? - Thermobarograph
You can use that if you want, or Selenium + Chrome. Selenium has bindings in almost all languages. - Mattheus
Glad to spend a lot of reputation for this information. Really worth it. Thanks, bro. - Thermobarograph

I experienced this issue with PhantomJS, and the following service args resolved it:

--ignore-ssl-errors=true
--ssl-protocol=any
--web-security=false
--proxy-type=None

I can't help you with CasperJS and SlimerJS, and I don't know exactly why this worked.
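For anyone launching PhantomJS from Node.js rather than through a Selenium driver, here is a rough sketch of passing the same flags on the command line. It assumes a `phantomjs` binary on your PATH and a script of your own called `screenshot.js`; the helper names `buildPhantomArgs` and `runPhantom` are made up for illustration. The flag values are copied verbatim from the list in this answer.

```javascript
// Sketch only: start PhantomJS with the flags from this answer.
// Assumes `phantomjs` is on PATH; the helper names are hypothetical.
const { spawn } = require('child_process');

// Flags copied verbatim from the answer.
const PHANTOM_FLAGS = [
  '--ignore-ssl-errors=true',
  '--ssl-protocol=any',
  '--web-security=false',
  '--proxy-type=None',
];

// Options go before the script path; anything after the script path
// is passed through to the script itself.
function buildPhantomArgs(scriptPath, scriptArgs = []) {
  return [...PHANTOM_FLAGS, scriptPath, ...scriptArgs];
}

// Spawn PhantomJS and forward its output to the current terminal.
function runPhantom(scriptPath, scriptArgs = []) {
  return spawn('phantomjs', buildPhantomArgs(scriptPath, scriptArgs), {
    stdio: 'inherit',
  });
}
```

Calling `runPhantom('screenshot.js')` would then be equivalent to typing the four flags followed by the script name in a shell.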

Pseudonym answered 18/4, 2018 at 2:4 Comment(0)
