CasperJS cannot trigger Twitter infinite scroll

I am trying to get some information from Twitter using CasperJS, and I'm stuck on infinite scroll. Even when I use jQuery to scroll the page down, nothing seems to work. Neither scrolling nor triggering the exact event on window (something like uiNearTheBottom) helps. Interestingly, all of these attempts work when the JS code is injected via the JS console in Firefox and Chrome. Here's the example code:

casper.thenEvaluate(function(){
    $(window).trigger('uiNearTheBottom');
});

or

casper.thenEvaluate(function(){
    document.body.scrollTop  =  document.body.scrollHeight;
});
Walden answered 8/7, 2013 at 7:23 Comment(1)
When CasperJS injects jQuery into the client-side page, it blocks content loaded by Twitter's infinite scrolling. This is a site-specific issue. Please see my answer below for a solution. - Breadstuff
P
4

If casper.scrollToBottom() (or casper.scroll_to_bottom()) fails you, the line below should serve you:

this.page.scrollPosition = { top: this.page.scrollPosition["top"] + document.body.scrollHeight, left: 0 };

A working example:

casper.start(url, function () {
    this.wait(10000, function () {
        this.page.scrollPosition = { top: this.page.scrollPosition["top"] + document.body.scrollHeight, left: 0 };
        if (this.visible("div.load-more")) {
            this.echo("I am here");
        }
    });
});

It uses the scrollPosition property of the underlying PhantomJS page object.
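
In the line above, document.body.scrollHeight is read in the CasperJS script context rather than in the loaded page, so it may not reflect the real page height. A more cautious variant, as a minimal sketch, reads the height through casper.evaluate() and then sets scrollPosition. The helper name scrollPageToBottom is hypothetical, and it assumes casper is an instance that has already loaded a page:

```javascript
// Hypothetical helper: scroll the PhantomJS viewport to the bottom of the page.
// Assumes `casper` is a CasperJS instance with a page already loaded.
function scrollPageToBottom(casper) {
    // evaluate() runs the callback inside the page, where `document` is the page's DOM.
    var height = casper.evaluate(function () {
        return document.body.scrollHeight;
    });
    // scrollPosition is a property of the underlying PhantomJS WebPage object.
    casper.page.scrollPosition = { top: height, left: 0 };
    return height;
}
```

Inside a step it could be called as scrollPageToBottom(this), followed by a this.wait(...) so that the newly requested content has time to load.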

Placer answered 10/11, 2014 at 7:34 Comment(3)
Are you sure document.body.scrollHeight is in Casper context and not inside of a casper.evaluate? - Alliber
@ArtjomB. I have added working code. In fact, I'm presently using it in a scraping job that I am doing. It involves calling the underlying code as found in PhantomJS. - Placer
There's now a working copy of Twitter scraping with CasperJS at gist.github.com/nwaomachux/35d1c424966fccd16ae1 - Placer
V
2

CasperJS is based on PhantomJS and, as per the discussion below, no window object exists for the headless browser.

You can check the discussion here

Virendra answered 8/7, 2013 at 7:41 Comment(1)
At least document exists in the page context. And the first scroll does work, but the tweets are not loading. - Walden
B
1

On Twitter you can use:

casper.scrollToBottom();
casper.wait(1000, function () {
    casper.capture("loadedContent.png");
});

But if you include jQuery as a client script, the above code won't work!

var casper = require('casper').create({
    clientScripts: [
        'jquery-1.11.0.min.js'
    ]
});

The script injection blocks Twitter's infinite scroll from loading content. On BoingBoing.net, CasperJS scrollToBottom() works with jQuery without blocking. It really depends on the site.

However, you can inject jQuery after the content has loaded.

casper.scrollToBottom();
casper.wait(1000, function () {
    casper.capture("loadedContent.png");

    // Inject client-side jQuery library
    casper.options.clientScripts.push("jquery.js");

    // And use like so...
    var height = casper.evaluate(function () {
        return $(document).height();
    });
});
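
As far as I can tell, entries in clientScripts are injected when a page finishes loading, so a script pushed afterwards may only take effect on the next page load. A minimal sketch of an alternative, assuming a local jquery.js file next to the script: inject the file straight into the already loaded page with the PhantomJS page.injectJs() method, which returns true on success. The helper name documentHeightWithJQuery is hypothetical:

```javascript
// Hypothetical helper: inject jQuery into the already loaded page, then use it.
// Assumes `casper` is a CasperJS instance and `jqueryPath` points to a local file.
function documentHeightWithJQuery(casper, jqueryPath) {
    // injectJs() is a method of the underlying PhantomJS WebPage object.
    var injected = casper.page.injectJs(jqueryPath);
    if (!injected) {
        return null; // the file could not be found or injected
    }
    // Runs in the page context, where the injected `$` is now available.
    return casper.evaluate(function () {
        return $(document).height();
    });
}
```

It would be called after the scroll and wait, for example documentHeightWithJQuery(casper, "jquery.js") inside the casper.wait() callback.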
Breadstuff answered 4/5, 2014 at 19:51 Comment(0)
Z
0

I have adapted this from a previous answer:

var iterations = 5; // number of pages to go through
var timeToWait = 2000; // time to wait in milliseconds

var last;
var list = [];

for (var i = 0; i <= iterations; i++) {
    list.push(i);
}

//evaluate this in the browser context and pass the timer back to casperjs
casper.thenEvaluate(function(iters, waitTime) {
    window.x = 0;
    var intervalID = setInterval(function() {
        console.log("Using setInterval " + window.x);
        window.scrollTo(0, document.body.scrollHeight); 

        if (++window.x === iters) {
            window.clearInterval(intervalID);
        }
    }, waitTime);
}, iterations, timeToWait);

casper.each(list, function(self, i) {

    self.wait(timeToWait, function() {
        last = i;
        this.echo('Using this.wait ' + i);
    });

});

casper.waitFor(function() {
    return (last === list[list.length - 1] && iterations === this.getGlobal('x'));
}, function() {
    this.echo('All done.');
});

Essentially what happens is I enter the page context, scroll to the bottom, and then wait 2 seconds for the content to load. Obviously I would have liked to use repeated applications of casper.scrollToBottom() or something more sophisticated, but the loading time wasn't allowing me to make this happen.
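
A sketch of the "something more sophisticated" idea, under assumptions: instead of a fixed number of timed scrolls, keep calling casper.scrollToBottom() until document.body.scrollHeight stops growing or a cap on rounds is reached. The name scrollUntilStable and its parameters are hypothetical, and treating "the height stopped growing" as the end of the feed is my assumption, not something the site guarantees:

```javascript
// Hypothetical helper: scroll repeatedly until the page height stops growing
// or maxRounds scrolls have been made, then call done(rounds, finalHeight).
// Assumes `casper` is a CasperJS instance with a page already loaded.
function scrollUntilStable(casper, waitMs, maxRounds, done) {
    var rounds = 0;

    // Read the current page height from inside the page context.
    function pageHeight() {
        return casper.evaluate(function () {
            return document.body.scrollHeight;
        });
    }

    function step(previousHeight) {
        casper.scrollToBottom();
        // Give the site time to fetch and render the next batch.
        casper.wait(waitMs, function () {
            var height = pageHeight();
            rounds += 1;
            if (height > previousHeight && rounds < maxRounds) {
                step(height); // new content arrived, scroll again
            } else {
                done(rounds, height);
            }
        });
    }

    step(pageHeight());
}
```

It is meant to be called from inside a step, for example in a casper.then() callback: scrollUntilStable(this, 2000, 10, function (rounds, height) { ... }). The fixed wait is still a guess at the loading time; a slow response can end the loop early.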

Zendejas answered 20/8, 2015 at 23:51 Comment(1)
Where is the concept of infinite scrolling here? You have just iterated over a loop. - Hoff
