WebBrowser Control in a new thread

I have a list of URIs that I want "clicked". To achieve this, I'm trying to create a new WebBrowser control per URI, with a new thread per URI. The problem I'm having is that the thread ends before the document is fully loaded, so I never get to make use of the DocumentCompleted event. How can I overcome this?

var item = new ParameterizedThreadStart(ClicIt.Click); 
var thread = new Thread(item) {Name = "ClickThread"}; 
thread.Start(uriItem);

public static void Click(object o)
{
    var url = ((UriItem)o);
    Console.WriteLine(@"Clicking: " + url.Link);
    var clicker = new WebBrowser { ScriptErrorsSuppressed = true };
    clicker.DocumentCompleted += BrowseComplete;
    if (String.IsNullOrEmpty(url.Link)) return;
    if (url.Link.Equals("about:blank")) return;
    if (!url.Link.StartsWith("http://") && !url.Link.StartsWith("https://"))
        url.Link = "http://" + url.Link;
    clicker.Navigate(url.Link);
}
Minimus answered 24/11, 2010 at 17:41 Comment(0)

You have to create an STA thread that pumps a message loop. That's the only hospitable environment for an ActiveX component like WebBrowser. You won't get the DocumentCompleted event otherwise. Some sample code:

private void runBrowserThread(Uri url) {
    var th = new Thread(() => {
        var br = new WebBrowser();
        br.DocumentCompleted += browser_DocumentCompleted;
        br.Navigate(url);
        Application.Run();
    });
    th.SetApartmentState(ApartmentState.STA);
    th.Start();
}

void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e) {
    var br = sender as WebBrowser;
    if (br.Url == e.Url) {
        Console.WriteLine("Navigated to {0}", e.Url);
        Application.ExitThread();   // Stops the thread
    }
}
Unapproachable answered 24/11, 2010 at 21:9 Comment(4)
Yes! Just add System.Windows.Forms. Saved my day, too. Thanks. - Haemostatic
I'm trying to adapt this code to my situation. I have to keep the WebBrowser object alive (to save state, cookies, etc.) and perform multiple Navigate() calls over time. But I'm not sure where to place my Application.Run() call, because it blocks further code from executing. Any clues? - Subtropical
You can call Application.Exit(); to let Application.Run() return. - Wilmawilmar
How can I set STA if I am using a Task? - Concertina
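
On the last question: thread-pool threads used by Task.Run are MTA and their apartment state cannot be changed, so one common workaround is to start a dedicated STA thread and expose its result through a TaskCompletionSource. The following is only a minimal sketch of that idea. The StaTask helper is a hypothetical name, not part of the framework or of the answer above, and the work delegate still has to pump its own message loop with Application.Run(), as the answer describes.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Windows.Forms;

// Hypothetical helper (assumed name): runs a delegate on a new STA thread
// and surfaces its result or exception as a Task.
static class StaTask
{
    public static Task<T> Run<T>(Func<T> work)
    {
        var tcs = new TaskCompletionSource<T>();
        var thread = new Thread(() =>
        {
            try { tcs.SetResult(work()); }
            catch (Exception ex) { tcs.SetException(ex); }
        });
        // The apartment state must be set before the thread starts.
        thread.SetApartmentState(ApartmentState.STA);
        thread.IsBackground = true;
        thread.Start();
        return tcs.Task;
    }
}

static class Program
{
    static void Main()
    {
        Task<string> task = StaTask.Run(() =>
        {
            string title = null;
            using (var br = new WebBrowser { ScriptErrorsSuppressed = true })
            {
                br.DocumentCompleted += (s, e) =>
                {
                    if (br.Url == e.Url)
                    {
                        title = br.DocumentTitle;
                        Application.ExitThread();   // lets Application.Run() return
                    }
                };
                br.Navigate("http://www.example.com");
                Application.Run();                  // message loop on the STA thread
            }
            return title;
        });

        Console.WriteLine("Title: " + task.Result);
    }
}
```

The caller gets an ordinary Task to await or combine with others, while the browser itself never leaves the STA thread that created it.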

Here is how to organize a message loop on a non-UI thread, to run asynchronous tasks like WebBrowser automation. It uses async/await to provide the convenient linear code flow and loads a set of web pages in a loop. The code is a ready-to-run console app which is partially based on this excellent post.

using System;
using System.Threading;
using System.Threading.Tasks;
using System.Windows.Forms;

namespace ConsoleApplicationWebBrowser
{
    // by Noseratio - https://stackoverflow.com/users/1768303/noseratio
    class Program
    {
        // Entry Point of the console app
        static void Main(string[] args)
        {
            try
            {
                // download each page and dump the content
                var task = MessageLoopWorker.Run(DoWorkAsync,
                    "http://www.example.com", "http://www.example.net", "http://www.example.org");
                task.Wait();
                Console.WriteLine("DoWorkAsync completed.");
            }
            catch (Exception ex)
            {
                Console.WriteLine("DoWorkAsync failed: " + ex.Message);
            }

            Console.WriteLine("Press Enter to exit.");
            Console.ReadLine();
        }

        // navigate WebBrowser to the list of urls in a loop
        static async Task<object> DoWorkAsync(object[] args)
        {
            Console.WriteLine("Start working.");

            using (var wb = new WebBrowser())
            {
                wb.ScriptErrorsSuppressed = true;

                TaskCompletionSource<bool> tcs = null;
                WebBrowserDocumentCompletedEventHandler documentCompletedHandler = (s, e) =>
                    tcs.TrySetResult(true);

                // navigate to each URL in the list
                foreach (var url in args)
                {
                    tcs = new TaskCompletionSource<bool>();
                    wb.DocumentCompleted += documentCompletedHandler;
                    try
                    {
                        wb.Navigate(url.ToString());
                        // wait for DocumentCompleted
                        await tcs.Task;
                    }
                    finally
                    {
                        wb.DocumentCompleted -= documentCompletedHandler;
                    }
                    // the DOM is ready
                    Console.WriteLine(url.ToString());
                    Console.WriteLine(wb.Document.Body.OuterHtml);
                }
            }

            Console.WriteLine("End working.");
            return null;
        }

    }

    // a helper class to start the message loop and execute an asynchronous task
    public static class MessageLoopWorker
    {
        public static async Task<object> Run(Func<object[], Task<object>> worker, params object[] args)
        {
            var tcs = new TaskCompletionSource<object>();

            var thread = new Thread(() =>
            {
                EventHandler idleHandler = null;

                idleHandler = async (s, e) =>
                {
                    // handle Application.Idle just once
                    Application.Idle -= idleHandler;

                    // return to the message loop
                    await Task.Yield();

                    // and continue asynchronously
                    // propagate the result or exception
                    try
                    {
                        var result = await worker(args);
                        tcs.SetResult(result);
                    }
                    catch (Exception ex)
                    {
                        tcs.SetException(ex);
                    }

                    // signal to exit the message loop
                    // Application.Run will exit at this point
                    Application.ExitThread();
                };

                // handle Application.Idle just once
                // to make sure we're inside the message loop
                // and SynchronizationContext has been correctly installed
                Application.Idle += idleHandler;
                Application.Run();
            });

            // set STA model for the new thread
            thread.SetApartmentState(ApartmentState.STA);

            // start the thread and await the task
            thread.Start();
            try
            {
                return await tcs.Task;
            }
            finally
            {
                thread.Join();
            }
        }
    }
}
Wyman answered 31/10, 2013 at 23:43 Comment(4)
Thanks for that brilliant and informative answer! It's exactly what I was looking for. However, you seem to have (intentionally?) misplaced the Dispose() statement. - Mastaba
@Paweł, you're right, that code did not even compile :) I think I pasted the wrong version; now fixed. Thanks for spotting this. You may want to check a more generic approach: https://mcmap.net/q/24354/-how-to-cancel-task-await-after-a-timeout-period - Wyman
I tried to run this code, but it gets stuck on task.Wait();. Am I doing something wrong? - Flood
Hi, maybe you could help me with this one: #41534497 - the method works well, but if a Form was instantiated before the MessageLoopWorker, it stops working. - Tetrabrach

In my experience, the WebBrowser control does not like operating outside of the main application thread.

Try using HttpWebRequest instead; you can make the requests asynchronous and attach a handler for the response to know when each one succeeds:

how-to-use-httpwebrequest-net-asynchronously
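
For illustration, here is a minimal sketch of what such an asynchronous request could look like, using BeginGetResponse with a callback. The URL and the ManualResetEvent that keeps the console app alive are assumptions made for this example, not taken from the linked post. Note that a plain request like this does not share the WebBrowser control's cookies.

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading;

static class Program
{
    static void Main()
    {
        // Signalled by the callback so Main does not exit early (example-only plumbing).
        var done = new ManualResetEvent(false);
        var request = (HttpWebRequest)WebRequest.Create("http://www.example.com");

        // Start the request; the callback runs when the response headers arrive.
        request.BeginGetResponse(ar =>
        {
            try
            {
                using (var response = (HttpWebResponse)request.EndGetResponse(ar))
                using (var reader = new StreamReader(response.GetResponseStream()))
                {
                    string html = reader.ReadToEnd();
                    Console.WriteLine("{0}: {1} characters", response.StatusCode, html.Length);
                }
            }
            catch (WebException ex)
            {
                // Network failures and HTTP error statuses surface here.
                Console.WriteLine("Request failed: " + ex.Message);
            }
            finally
            {
                done.Set();
            }
        }, null);

        done.WaitOne();
    }
}
```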

Rickettsia answered 24/11, 2010 at 18:17 Comment(6)
My problem with that is this: the URI being clicked requires being logged in to the site. I can't achieve this with WebRequest. The WebBrowser uses the IE cache already, so the site's logged in. Is there a way around that? The links involve Facebook. So can I log into Facebook and click the link with WebRequest? - Minimus
@ArtW I know this is an old comment, but people can probably solve that by setting webRequest.Credentials = CredentialsCache.DefaultCredentials; - Leopardi
@Leopardi If it's an API then yes, but if it's a website with HTML elements for logging in then it'll need to use IE cookies or cache; otherwise the client doesn't know what to do with the Credentials object property and how to fill the HTML. - Sunday
@Sunday The context this whole page is talking about is using the HttpWebRequest object and C# .NET, not simple HTML and form elements being posted, like you might do with JavaScript/AJAX. But regardless, you have a receiver. And for log-on you should be using Windows Authentication, and IIS handles this automatically anyway. If you need to test them manually you can use WindowsIdentity.GetCurrent().Name after implementing impersonation and test it against an AD search, if you like. Not sure how cookies and cache would be used for any of that. - Leopardi
@Leopardi The question is talking about WebBrowser, which would indicate that HTML pages are being loaded. The OP has even said that WebRequest won't achieve what he wants, so if a website expects HTML input for login then setting the Credentials object won't work. Additionally, as the OP says, the sites include Facebook; Windows authentication will not work on this. - Sunday
@vacpcguy I hope this helps. I'll be very impressed if you can sign into Facebook with NTLM credentials. I've highlighted key information in this picture for you; if you're still arguing after this then there's absolutely no chance you'll understand. - Sunday

A simple solution in which several WebBrowser controls operate simultaneously:

  1. Create a new Windows Forms application
  2. Place a button named button1 on the form
  3. Place a text box named textBox1 on the form
  4. Set the text box properties: Multiline to true and ScrollBars to Both
  5. Write the following button1 click handler:

    textBox1.Clear();
    textBox1.AppendText(DateTime.Now.ToString() + Environment.NewLine);
    int completed_count = 0;
    int count = 10;
    for (int i = 0; i < count; i++)
    {
        int tmp = i;
        this.BeginInvoke(new Action(() =>
        {
            var wb = new WebBrowser();
            wb.ScriptErrorsSuppressed = true;
            wb.DocumentCompleted += (cur_sender, cur_e) =>
            {
                var cur_wb = cur_sender as WebBrowser;
                if (cur_wb.Url == cur_e.Url)
                {
                    textBox1.AppendText("Task " + tmp + ", navigated to " + cur_e.Url + Environment.NewLine);
                    completed_count++;
                }
            };
            wb.Navigate("https://mcmap.net/q/24377/-webbrowser-control-in-a-new-thread");
        }
        ));
    }
    
    while (completed_count != count)
    {
        Application.DoEvents();
        Thread.Sleep(10);
    }
    textBox1.AppendText("All completed" + Environment.NewLine);
    
Westleigh answered 1/11, 2017 at 7:20 Comment(0)