HtmlUnit doesn't wait for JavaScript
I have a GWT-based page for which I would like to create an HTML snapshot using HtmlUnit. The page loads product information via Ajax/JavaScript, so a "Loading..." message is shown for about one second before the content appears.

The problem is that HtmlUnit doesn't seem to capture the information and all I'm getting is the "Loading..." span.

Below is some experimental HtmlUnit code in which I try to give the data enough time to load, but it doesn't seem to change anything: I am still unable to capture the data loaded by the GWT JavaScript.

        WebClient webClient = new WebClient();
        webClient.setJavaScriptEnabled(true);
        webClient.setThrowExceptionOnScriptError(false);
        webClient.setAjaxController(new NicelyResynchronizingAjaxController()); 

        WebRequest request = new WebRequest(new URL("<my_url>"));
        HtmlPage page = webClient.getPage(request);

        int i = webClient.waitForBackgroundJavaScript(1000);

        while (i > 0)
        {
            i = webClient.waitForBackgroundJavaScript(1000);

            if (i == 0)
            {
                break;
            }
            synchronized (page) 
            {
                System.out.println("wait");
                page.wait(500);
            }
        }

        webClient.getAjaxController().processSynchron(page, request, false);

        System.out.println(page.asXml());

Any ideas...?

Aw answered 5/4, 2011 at 16:28 Comment(0)
A
19

Thank you for responding. I should have reported sooner that I found the solution myself. Apparently it helps to initialise WebClient with Firefox:

WebClient webClient = new WebClient(BrowserVersion.FIREFOX_3_6);

This seems to work. When WebClient is initialised with the default constructor it emulates IE7, and I guess the Firefox emulation has better Ajax support and is the recommended one to use.

Aw answered 20/4, 2011 at 9:21 Comment(3)
I have to comment on this one. I had the same problem and was trying to debug the whole code. Thanks so much for this. - Paraguay
Hi. I have the same problem. Using FIREFOX instead of IE makes the pages load almost properly now, but I'm still stuck on the "...Loading..." message. It should take around 9 seconds. I used your code as well and nothing changed :( Please help. - Rao
Thank you! I lost several HOURS debugging until I found your comment! - Wimberly
C
15

I believe that by default NicelyResynchronizingAjaxController only resynchronizes AJAX calls that were caused by a user action, which it detects by tracking the thread each call originated from. Perhaps the GWT-generated JavaScript is being called from some other thread, which NicelyResynchronizingAjaxController does not wait for.

Try declaring your own AjaxController to synchronize with everything regardless of originating thread:

webClient.setAjaxController(new AjaxController(){
    @Override
    public boolean processSynchron(HtmlPage page, WebRequest request, boolean async)
    {
        return true;
    }
});
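
To make the thread-tracking idea above concrete, here is a simplified plain-Java model of that kind of decision. It is only a sketch of the behaviour as I understand it, not HtmlUnit's source: the class OriginThreadSyncPolicy and the method shouldRunSynchronously are hypothetical names invented for this illustration.

```java
/**
 * Simplified model (hypothetical, not part of HtmlUnit) of a policy that
 * only resynchronizes asynchronous calls made from the thread that created it.
 */
final class OriginThreadSyncPolicy {
    // The thread that created the policy; stands in for the "main" thread driving the page.
    private final Thread originThread = Thread.currentThread();

    /** Returns true if a request should be executed synchronously. */
    boolean shouldRunSynchronously(boolean async) {
        if (!async) {
            return true; // the request was synchronous to begin with
        }
        // Async requests are only forced to be synchronous on the originating thread.
        return Thread.currentThread() == originThread;
    }

    public static void main(String[] args) throws InterruptedException {
        OriginThreadSyncPolicy policy = new OriginThreadSyncPolicy();
        System.out.println("same thread: " + policy.shouldRunSynchronously(true));

        // A call from a background thread is left asynchronous by this policy.
        Thread background = new Thread(() ->
                System.out.println("background thread: " + policy.shouldRunSynchronously(true)));
        background.start();
        background.join();
    }
}
```

Under this model, an override that always returns true simply drops the thread check, which is what the anonymous AjaxController above does.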
Clyve answered 19/4, 2011 at 23:4 Comment(0)
O
5

As the documentation states, waitForBackgroundJavaScript is experimental:

Experimental API: May be changed in next release and may not yet work perfectly!

The following approach has always worked for me, regardless of the BrowserVersion used:

int tries = 5;  // Amount of tries to avoid infinite loop
while (tries > 0 && aCondition) {
    tries--;
    synchronized(page) {
        page.wait(2000);  // How often to check
    }
}

Note that aCondition is whatever you're checking for, e.g.:

page.getElementById("loading-text-element").asText().equals("Loading...")
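
The polling loop above can also be wrapped in a small reusable helper. The following is a minimal sketch in plain Java with no HtmlUnit dependency: the class PollingWait and the method waitWhile are hypothetical names, and it uses Thread.sleep instead of page.wait, on the assumption that the only goal is to pause between checks.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper: polls a condition a limited number of times. */
final class PollingWait {

    /**
     * Checks stillLoading up to maxTries times, sleeping intervalMillis between checks.
     * Returns true once the condition reports that loading has finished,
     * or false if it is still loading after the last try (or the thread is interrupted).
     */
    static boolean waitWhile(BooleanSupplier stillLoading, int maxTries, long intervalMillis) {
        for (int attempt = 0; attempt < maxTries; attempt++) {
            if (!stillLoading.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        // One final check after the last sleep.
        return !stillLoading.getAsBoolean();
    }

    public static void main(String[] args) {
        // Simulated page that stops "loading" after roughly 300 ms.
        long readyAt = System.currentTimeMillis() + 300;
        boolean loaded = waitWhile(() -> System.currentTimeMillis() < readyAt, 5, 200);
        System.out.println("loaded: " + loaded);
    }
}
```

With HtmlUnit, the condition passed in would be a lambda wrapping the check shown above, for example () -> page.getElementById("loading-text-element").asText().equals("Loading..."), and the boolean result tells you whether the content appeared before the tries ran out.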
Orlosky answered 13/8, 2014 at 5:1 Comment(1)
Polling like this worked fine for me so far as well.Warplane
C
3

None of the solutions provided so far worked for me. I ended up combining Dan Alvizu's solution with my own hack:

private WebClient webClient = new WebClient();

public void scrapPage() {
    makeWebClientWaitThroughJavaScriptLoadings();
    HtmlPage page = login();
    //do something that causes JavaScript loading
    waitOutLoading(page);
}

private void makeWebClientWaitThroughJavaScriptLoadings() {
    webClient.setAjaxController(new AjaxController(){
        @Override
        public boolean processSynchron(HtmlPage page, WebRequest request, boolean async)
        {
            return true;
        }
    });
}

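private void waitOutLoading(HtmlPage page) {
    // Give up after 30 seconds (an arbitrary cap) so a stuck page cannot loop forever.
    long deadline = System.currentTimeMillis() + 30000;
    while (page.asText().contains("Please wait while loading!")
            && System.currentTimeMillis() < deadline) {
        webClient.waitForBackgroundJavaScript(100);
    }
}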

Needless to say, "Please wait while loading!" should be replaced with whatever text is shown while your page is loading. If there is no text, maybe there is a way to check for existence of some gif (if that is used). Of course, you could simply provide a big enough milliseconds value if you're feeling adventurous.

Chrysanthemum answered 5/8, 2014 at 20:42 Comment(0)
