I am writing a C# application to measure how fast I can download web pages. I supply a list of unique domain names, then I spawn X threads and perform HttpWebRequests until the list of domains has been consumed. The problem is that no matter how many threads I use, I only get about 3 pages per second.
I discovered that System.Net.ServicePointManager.DefaultConnectionLimit is 2, but my understanding is that this limit applies per domain. Since each domain in the list is unique, it should not be the bottleneck.
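Just in case it matters, I am raising the limit anyway before starting the threads. A minimal sketch (the value 100 is arbitrary, just for illustration):

```csharp
using System;
using System.Net;

class ConnectionLimitDemo
{
    static void Main()
    {
        // Raise the global per-host connection limit before any requests
        // are created; ServicePoints that already exist keep the old value.
        ServicePointManager.DefaultConnectionLimit = 100;
        Console.WriteLine(ServicePointManager.DefaultConnectionLimit);
    }
}
```

This made no measurable difference in my tests, which is consistent with the limit being per-domain.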
Then I found a claim that the GetResponse() method blocks all other requests until the WebResponse is closed: http://www.codeproject.com/KB/IP/Crawler.aspx#WebRequest. I have not found any other information on the web to back this up; however, when I implemented the HTTP request directly with sockets, I noticed a significant speed-up (4x to 6x).
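For reference, each fetch in my threaded version looks roughly like this; the `using` blocks are meant to close the response as soon as the body has been read, and `Proxy = null` skips proxy auto-detection (the URL is just a placeholder):

```csharp
using System.IO;
using System.Net;

class Fetcher
{
    // Download one page and dispose of the response immediately,
    // returning the connection to the pool as early as possible.
    public static string Fetch(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Proxy = null; // avoid per-request proxy auto-detection delays

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Even with the responses disposed promptly like this, the throughput stays around 3 pages per second.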
So my questions: Does anyone know exactly how HttpWebRequest objects work? Is there a workaround besides the socket approach mentioned above? Are there any examples of high-speed web crawlers written in C# anywhere?