Getting HTML body content in WinForms WebBrowser after body onload event executes

I have a WebBrowser control in WinForms whose Url property is set to an external webpage. I also have an event handler for the DocumentCompleted event. Inside this handler, I'm trying to get specific elements, but wb.Document.Body seems to contain the HTML as it was before the body's onload handler ran. This is what the debugger shows for wb.Document.Body:

{System.Windows.Forms.HtmlElement}
    All: {System.Windows.Forms.HtmlElementCollection}
    CanHaveChildren: true
    Children: {System.Windows.Forms.HtmlElementCollection}
    ClientRectangle: {X = 0 Y = 0 Width = 1200 Height = 0}
    Document: {System.Windows.Forms.HtmlDocument}
    DomElement: {mshtml.HTMLBodyClass}
    ElementShim: {System.Windows.Forms.HtmlElement.HtmlElementShim}
    Enabled: true
    FirstChild: null
    htmlElement: {mshtml.HTMLBodyClass}
    Id: null
    InnerHtml: "\n"
    InnerText: null
    Name: ""
    NativeHtmlElement: {mshtml.HTMLBodyClass}
    NextSibling: null
    OffsetParent: null
    OffsetRectangle: {X = 0 Y = 0 Width = 1200 Height = 0}
    OuterHtml: "<body onload=\"evt_Login_onload(event);\" uitheme=\"Web\">\n</body>"
    OuterText: null
    Parent: {System.Windows.Forms.HtmlElement}
    ScrollLeft: 0
    ScrollRectangle: {X = 0 Y = 0 Width = 1200 Height = 0}
    ScrollTop: 0
    shimManager: {System.Windows.Forms.HtmlShimManager}
    ShimManager: {System.Windows.Forms.HtmlShimManager}
    Style: null
    TabIndex: 0
    TagName: "BODY"

"<body onload=\"evt_Login_onload(event);\" uitheme=\"Web\">\n</body>" is the pre-JavaScript content. Is there a way to capture the state of the body tag after evt_Login_onload(event); executes?

I have also tried using wb.Document.GetElementById("id"), but it returns null.

Olpe answered 21/8, 2013 at 22:28

Here is how it can be done; I've put some comments inline:

private void Form1_Load(object sender, EventArgs e)
{
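    // DocumentCompleted can fire more than once (for example, once per frame),
    // so attach the onload handler only the first time
    bool complete = false;
    this.webBrowser1.DocumentCompleted += delegate
    {
        if (complete)
            return;
        complete = true;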
        // DocumentCompleted is fired before window.onload and body.onload
        this.webBrowser1.Document.Window.AttachEventHandler("onload", delegate
        {
            // Defer this so that all other onload event handlers have fired
            // and we have returned from the MSHTML code that raised the event
            System.Threading.SynchronizationContext.Current.Post(delegate
            {
                // try webBrowser1.Document.GetElementById("id") here
                MessageBox.Show("window.onload was fired, can access DOM!");
            }, null);
        });
    };

    this.webBrowser1.Navigate("http://www.example.com");
}

Update: it's 2019 and this answer is surprisingly still getting attention, so I'd like to note that with modern C#, my recommended way of doing this would be to use async/await.
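
For illustration, here is a minimal sketch of what an async/await version could look like. It is not necessarily the code that was originally linked; NavigateAndWaitForOnloadAsync and the WebBrowserExtensions class are hypothetical names, and the sketch assumes it is called on the UI thread of a WinForms app (so a SynchronizationContext is present). It keeps the same idea as the code above: attach to window.onload from the first DocumentCompleted, then continue asynchronously after the handler has returned.

```csharp
using System;
using System.Threading.Tasks;
using System.Windows.Forms;

// Hypothetical helper class, not part of WinForms
static class WebBrowserExtensions
{
    public static async Task NavigateAndWaitForOnloadAsync(
        this WebBrowser browser, string url)
    {
        var onloadTcs = new TaskCompletionSource<bool>();
        bool attached = false;

        WebBrowserDocumentCompletedEventHandler completed = delegate
        {
            // DocumentCompleted may fire several times (e.g. for frames),
            // so attach the onload handler only once
            if (attached)
                return;
            attached = true;

            browser.Document.Window.AttachEventHandler("onload",
                delegate { onloadTcs.TrySetResult(true); });
        };

        browser.DocumentCompleted += completed;
        try
        {
            browser.Navigate(url);
            await onloadTcs.Task; // completes when window.onload fires
        }
        finally
        {
            browser.DocumentCompleted -= completed;
        }

        // Same purpose as SynchronizationContext.Post above: return from the
        // onload handler first, then continue on the UI thread
        await Task.Yield();
    }
}

// Usage, assuming a form with a WebBrowser field named webBrowser1:
//
// private async void Form1_Load(object sender, EventArgs e)
// {
//     await this.webBrowser1.NavigateAndWaitForOnloadAsync("http://www.example.com");
//     var element = this.webBrowser1.Document.GetElementById("id");
// }
```

Note that this sketch has no timeout or cancellation, so it waits indefinitely if onload never fires; a real implementation would probably want a CancellationToken.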

Nutritious answered 22/8, 2013 at 2:1 Comment(4)
+1 I can confirm this answer worked for me, though I had no need to use SynchronizationContext.Current.Post() because I'm running single-threaded. - Holierthanthou
@StevendeSalas, the whole purpose of SynchronizationContext.Current.Post was to return from the onload event handler and continue asynchronously on the same UI thread (so any possible exceptions wouldn't be thrown inside the MSHTML code firing the event). The code has since evolved a bit to use async/await instead. - Nutritious
I tried a similar approach, but it didn't work. Can you take a look at #22698487? - Silsby
@Lijo, check this: https://mcmap.net/q/24354/-how-to-cancel-task-await-after-a-timeout-period. Visit all links listed there. - Nutritious
