HTML - How do I know when all frames are loaded?
S

12

I'm using the .NET WebBrowser control. How do I know when a web page is fully loaded?

I want to know when the browser is no longer fetching any data (the moment when IE writes 'Done' in its status bar).

Notes:

  • The DocumentComplete/NavigateComplete events can fire multiple times for a page that contains multiple frames.
  • The browser ready state doesn't solve the problem either.
  • I have tried checking the number of frames in the frame collection and then counting how many times I get the DocumentComplete event, but this doesn't work either.
  • this.WebBrowser.IsBusy doesn't work either. It is always 'false' when checked in the DocumentComplete handler.
Sim asked 23/3, 2009 at 9:44 Comment(0)
S
0

Here's what finally worked for me:

    public bool WebPageLoaded
    {
        get
        {
            if (this.WebBrowser.ReadyState != System.Windows.Forms.WebBrowserReadyState.Complete)
                return false;

            if (this.HtmlDomDocument == null)
                return false;

            // iterate over all the Html elements. Find all frame elements and check their ready state
            foreach (IHTMLDOMNode node in this.HtmlDomDocument.all)
            {
                IHTMLFrameBase2 frame = node as IHTMLFrameBase2;
                if (frame != null)
                {
                    if (!frame.readyState.Equals("complete", StringComparison.OrdinalIgnoreCase))
                        return false;

                }
            }

            Debug.Print(this.Name + " - I think it's loaded");
            return true;
        }
    }

On each DocumentComplete event I iterate over all the HTML elements and check every frame I find (I know it can be optimized). For each frame I check its ready state. It's pretty reliable, but just as jeffamaphone said, I have already seen sites that triggered some internal refreshes. Still, the above code satisfies my needs.

Edit: every frame can contain frames within it, so I think this code should be updated to check the state of every frame recursively.
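
A minimal sketch of what that recursive check might look like. It is only an illustration of the recursion: `FrameNode` and `FrameTree` are hypothetical stand-ins for the real frame objects, not types from mshtml or Windows Forms, and the ready state is assumed to be a string such as "complete".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a frame: its ready state plus the frames nested inside it.
public sealed class FrameNode
{
    public string ReadyState;
    public readonly List<FrameNode> Children = new List<FrameNode>();

    public FrameNode(string readyState)
    {
        ReadyState = readyState;
    }
}

public static class FrameTree
{
    // Returns true only if this frame and every frame nested below it report "complete".
    public static bool AllComplete(FrameNode frame)
    {
        if (frame == null)
            return false;

        if (!string.Equals(frame.ReadyState, "complete", StringComparison.OrdinalIgnoreCase))
            return false;

        // Descend into nested frames; a single incomplete descendant fails the whole tree.
        foreach (FrameNode child in frame.Children)
        {
            if (!AllComplete(child))
                return false;
        }

        return true;
    }
}
```

With the real control, the same shape would apply to whatever object exposes a frame's ready state and its child frames; how to reach those objects is not shown here.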

Sim answered 26/3, 2009 at 10:12 Comment(0)
U
3

Here's how I solved the problem in my application:

private void wbPost_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
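    // DocumentCompleted also fires for each frame; ignore events whose URL
    // is not the URL of the top-level document.
    if (e.Url != wbPost.Url)
        return;
    /* Document now loaded */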
}
Until answered 24/2, 2010 at 21:26 Comment(1)
If you click, for example, a link in a navigation bar and it causes a new page to be loaded in a frame/iframe, you won't be happy with this solution. - Conchiferous
K
2

My approach to doing something when the page is completely loaded (including frames) is something like this:

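using System;
using System.Windows.Forms;

protected delegate void Procedure();

// Runs doNext once the next page load has finished.
// 'ie' is assumed to be the WebBrowser control of the containing form.
private void executeAfterLoadingComplete(Procedure doNext) {
    WebBrowserDocumentCompletedEventHandler handler = null;
    handler = delegate(object o, WebBrowserDocumentCompletedEventArgs e)
    {
        // React to the first DocumentCompleted only, then unsubscribe.
        ie.DocumentCompleted -= handler;

        // Poll on the UI thread until the control reports ReadyState.Complete.
        Timer timer = new Timer();
        EventHandler checker = delegate(object o1, EventArgs e1)
        {
            if (WebBrowserReadyState.Complete == ie.ReadyState)
            {
                // Disposing the timer stops further ticks before doNext runs.
                timer.Dispose();
                doNext();
            }
        };
        timer.Tick += checker;
        timer.Interval = 200; // milliseconds
        timer.Start();
    };
    ie.DocumentCompleted += handler;
}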

From my other approaches I learned some "don't"-s:

  • don't try to bend the spoon ... ;-)
  • don't try to build an elaborate construct using DocumentComplete, Frames, and HtmlWindow.Load events. Your solution will be fragile, if it works at all.
  • don't use System.Timers.Timer instead of Windows.Forms.Timer; strange errors will begin to occur in strange places if you do, because the timer runs on a different thread than the rest of your app.
  • don't use just a Timer without DocumentComplete, because it may fire before your page even begins to load and will execute your code prematurely.
Karnes answered 31/1, 2010 at 17:22 Comment(0)
C
2

Here's my tested version. Just make this your DocumentCompleted event handler and place the code that you want to be called only once into the method OnWebpageReallyLoaded(). Effectively, this approach determines when the page has been stable for 200ms and then does its thing.

// timer used to detect that the page has stopped changing
Timer m_pageHasntChangedTimer = null;

// event handler for when a document (or frame) has completed its download
private void webBrowser_DocumentCompleted( object sender, WebBrowserDocumentCompletedEventArgs e ) {
    // dynamic pages will often be loaded in parts e.g. multiple frames
    // need to check the page has remained static for a while before safely saying it is 'loaded'
    // use a timer to do this

    // destroy the old timer if it exists
    if ( m_pageHasntChangedTimer != null ) {
        m_pageHasntChangedTimer.Dispose();
    }

    // create a new timer which calls the 'OnWebpageReallyLoaded' method after 200ms
    // if an additional frame or other content is downloaded in the meantime, this timer will be destroyed
    // and the process repeated
    m_pageHasntChangedTimer = new Timer();
    EventHandler checker = delegate( object o1, EventArgs e1 ) {
        // only if the page has been stable for 200ms already
        // check the official browser state flag, (euphemistically called) 'Ready'
        // and call our 'OnWebpageReallyLoaded' method
        if ( WebBrowserReadyState.Complete == webBrowser.ReadyState ) {
            m_pageHasntChangedTimer.Dispose();
            OnWebpageReallyLoaded();
        }
    };
    m_pageHasntChangedTimer.Tick += checker;
    m_pageHasntChangedTimer.Interval = 200;
    m_pageHasntChangedTimer.Start();
}

private void OnWebpageReallyLoaded() {
    /* place your harvester code here */
}
Comedian answered 13/4, 2010 at 14:28 Comment(0)
S
0

Have you tried the WebBrowser.IsBusy property?

Systemic answered 23/3, 2009 at 13:11 Comment(1)
Yep. The web browser claims not to be busy each time the document complete handler is called. - Sim
M
0

How about using javascript in each frame to set a flag when the frame is complete, and then have C# look at the flags?

Mesonephros answered 23/3, 2009 at 13:33 Comment(2)
I don't want to manipulate the DOM tree of every site that the browser is navigating to. But suppose I do use your solution, how do I do it in javascript? - Sim
I don't see the advantage of doing this in JS vs C#. - Mansell
G
0

I don't have an alternative for you, but I wonder if the IsBusy property being true during the Document Complete handler is because the handler is still running and therefore the WebBrowser control is technically still 'busy'.

The simplest solution would be to have a loop that executes every 100 ms or so until the IsBusy flag is reset (with a max execution time in case of errors). That of course assumes that IsBusy will not be set to false at any point during page loading.

If the Document Complete handler executes on another thread, you could use a lock to send your main thread to sleep and wake it up from the Document Complete thread. Then check the IsBusy flag, re-locking the main thread if it's still true.

Germanophile answered 23/3, 2009 at 13:45 Comment(2)
But IsBusy is set to false too early. For example, if you have six frames in a web page, IsBusy is already false in the DocumentComplete event when the first frame completes loading. - Sim
Each frame gets its own webbrowser (IWebBrowser2 implementation). Likely the IsBusy attribute only applies to the specific frame. And when it's complete, it's no longer busy. - Mansell
A
0

I'm not sure it'll work, but try adding a JavaScript "onload" event handler to your frameset, like this:

function everythingIsLoaded() { alert("everything is loaded"); }
var frameset = document.getElementById("idOfYourFrameset");
if (frameset.addEventListener)
    frameset.addEventListener('load',everythingIsLoaded,false); 
else
    frameset.attachEvent('onload',everythingIsLoaded); 
Ardoin answered 25/3, 2009 at 15:22 Comment(2)
I want to be able to know if all frames are loaded for any web site, so I don't know which frames it contains. - Sim
You should do that on the frameset (parent of all frames), not on each frame. It's pretty easy to get it from any web site like this: document.getElementsByTagName('frameset')[0] - Ardoin
S
0

Can you use jQuery? Then you could easily bind frame ready events on the target frames. See this answer for directions. This blog post also has a discussion about it. Finally there is a plug-in that you could use.

The idea is that you count the number of frames in the web page using:

$("iframe").length

and then you count how many times the iframe ready event has been fired.

Sotos answered 26/3, 2009 at 7:35 Comment(0)
M
0

You will get a BeforeNavigate and DocumentComplete event for the outer web page, as well as for each frame. You know you're done when you get the DocumentComplete event for the outer web page. You should be able to use the managed equivalent of IWebBrowser2::TopLevelContainer() to determine this.

Beware, however, the website itself can trigger more frame navigations anytime it wants, so you never know if a page is truly done forever. The best you can do is keep a count of all the BeforeNavigates you see and decrement the count when you get a DocumentComplete.
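
The counting idea can be sketched as a small helper. This is only an illustration: `PendingNavigationCounter` and its members are hypothetical names, not part of any WebBrowser API, and the sketch assumes that every navigation that starts is eventually matched by exactly one completion event, which may not hold for every site.

```csharp
using System;

// Hypothetical helper: tracks how many navigations have started but not yet completed.
public sealed class PendingNavigationCounter
{
    private int pending;

    // Number of navigations that have started and not yet completed.
    public int Pending { get { return pending; } }

    // True when no started navigation is still outstanding.
    public bool IsIdle { get { return pending == 0; } }

    // Call when a navigation (outer page or frame) is about to start.
    public void OnBeforeNavigate()
    {
        pending++;
    }

    // Call when a document (outer page or frame) has completed.
    // Returns true if nothing is outstanding after this completion.
    public bool OnDocumentComplete()
    {
        // Never go below zero if a completion arrives without a matching start.
        if (pending > 0)
            pending--;
        return pending == 0;
    }
}
```

In a Windows Forms application one would presumably call OnBeforeNavigate from the control's Navigating handler and OnDocumentComplete from its DocumentCompleted handler, and treat a true return value as "probably loaded" rather than as a guarantee.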

Edit: Here's the managed docs: TopLevelContainer.

Mansell answered 26/3, 2009 at 7:39 Comment(2)
I tried counting the BeforeNavigate and DocumentComplete events in the WebBrowser control. They are not in sync... :(. There are more BeforeNavigate events than DocumentComplete events. [Maybe it has to do with caching or duplicate frames that are fetched. I don't know.] - Sim
Regarding the document complete event: in the C# WebBrowser you don't get the document object that just completed loading, just the URL. So you can't get to its browser container. - Sim
A
0

I just use the webBrowser.StatusText property. When it says "Done", everything is loaded! Or am I missing something?

Avens answered 30/3, 2010 at 20:54 Comment(0)
R
0

Checking for IE.readyState = READYSTATE_COMPLETE should work, but if that's not proving reliable for you and you literally want to know "the moment when IE writes 'Done' in its status bar", then you can do a loop until IE.StatusText contains "Done".

Report answered 3/11, 2011 at 4:10 Comment(0)