System.Net.WebClient request gets 403 Forbidden but browsers do not with Apache servers

Here's an odd one. I'm trying to read the <head> section of many different websites, and one particular type of server, Apache, sometimes responds with 403 Forbidden. Not all Apache servers do this, so it may be a configuration setting or a particular version of the server.

When I then check the URL with a web browser (Firefox, for example), the page loads fine. The code looks roughly like this:

var client = new WebClient();
var stream = client.OpenRead(new Uri("http://en.wikipedia.org/wiki/Barack_Obama"));

Normally, a 403 means an access permission check failed, but these are ordinary unsecured pages. I suspect Apache is filtering on something in the request headers, since I'm not setting any.

Maybe someone who knows more about Apache can give me some ideas of what's missing in the headers. I'd like to keep the headers as small as possible to minimize bandwidth.

Thanks

Balakirev answered 23/2, 2010 at 4:28 Comment(0)

Try setting the UserAgent header:

string _UserAgent = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; .NET CLR 1.0.3705;)";
client.Headers.Add(HttpRequestHeader.UserAgent, _UserAgent);
Laud answered 23/2, 2010 at 4:33 Comment(2)
That was the hint I needed. Thanks! - Balakirev
A 403 may also be caused by TLS issues. To verify, check the text of the WebException.Response object. Or try adding this to your code: ServicePointManager.SecurityProtocol = (SecurityProtocolType)3072; This forces TLS 1.2. - Isobelisocheim
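
The two suggestions above (send a User-Agent header, and make sure TLS 1.2 is enabled) can be combined in a short C# sketch. It assumes the .NET Framework-style WebClient used in the question; the class name BrowserLikeClient and the user-agent value passed in are illustrative, not something a particular server is known to require. Note that it adds TLS 1.2 to the enabled protocols with |= rather than replacing the set as the comment does.

```csharp
using System;
using System.Net;

static class BrowserLikeClient
{
    // Hypothetical helper: returns a WebClient that sends a User-Agent header
    // and has TLS 1.2 enabled for HTTPS requests.
    public static WebClient Create(string userAgent)
    {
        // SecurityProtocolType.Tls12 is the named form of (SecurityProtocolType)3072.
        ServicePointManager.SecurityProtocol |= SecurityProtocolType.Tls12;

        var client = new WebClient();
        client.Headers.Add(HttpRequestHeader.UserAgent, userAgent);
        return client;
    }
}
```

A caller would then use it in place of new WebClient(), for example client = BrowserLikeClient.Create("Mozilla/5.0 (compatible; MyCrawler/1.0)") followed by client.OpenRead(url).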

I had a similar problem, and the settings below solved it:

Client.Headers["Accept"] = "application/x-ms-application, image/jpeg, application/xaml+xml, image/gif, image/pjpeg, application/x-ms-xbap, application/x-shockwave-flash, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*";
Client.Headers["User-Agent"] ="Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; MDDC)";
Jaquenette answered 10/11, 2010 at 19:43 Comment(0)

It could be a matter of the UserAgent header, as "thedugas" said, or in fact anything the browser is silently configured to do. For instance, it could be a matter of not using a proxy server that the browser is using, or not using the correct credentials for the proxy server. These are things that may already be configured into the browser, so you're not aware they need to be done.
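
If the browser turns out to be going through a proxy, the WebClient can be pointed at the same one. The following is only a sketch: the class name ProxyAwareClient is hypothetical, and the proxy address and credentials are whatever your browser is actually configured with.

```csharp
using System;
using System.Net;

static class ProxyAwareClient
{
    // Hypothetical helper: returns a WebClient that sends its requests
    // through the given proxy, authenticating with the given credentials.
    public static WebClient Create(string proxyUrl, ICredentials credentials)
    {
        var proxy = new WebProxy(proxyUrl) { Credentials = credentials };
        return new WebClient { Proxy = proxy };
    }
}
```

Passing new NetworkCredential(user, password) supplies an explicit login, while CredentialCache.DefaultCredentials reuses the credentials of the logged-in Windows user.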

Kwon answered 23/2, 2010 at 4:39 Comment(0)

I had the same problem, and the answer was not obvious. I found the solution by sniffing the network traffic. When Apache serves its "Testing 1 2 3..." page, it returns HTML together with a 403 Forbidden status code. The browser ignores the status code and shows the page, but WebClient throws an exception. The solution is to read the response inside the Catch block of a Try statement. Here is my code:

            Dim Retorno As String = ""
            ' SiteWebClient is the author's own class (not shown), presumably a WebClient subclass.
            Dim Client As New SiteWebClient
            Client.Headers.Add("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 " &
                               "(KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/17.17134")
            Client.Headers.Add("Accept-Language", "pt-BR, pt;q=0.5")
            Client.Headers.Add("Accept", "text/html, application/xhtml+xml, application/xml;q=0.9,*/*;q=0.8")
            Try
                Retorno = Client.DownloadString("http://" & HostName & SitePath)
            Catch ex As System.Net.WebException
                ' A status such as 403 raises WebException; the body is still available on ex.Response.
                Dim Resposta As System.Net.HttpWebResponse = TryCast(ex.Response, System.Net.HttpWebResponse)
                ' Resposta is Nothing when the request failed without any HTTP response (e.g. a DNS error).
                If Resposta IsNot Nothing Then
                    Using WebStream As New StreamReader(Resposta.GetResponseStream(), System.Text.Encoding.UTF8)
                        Retorno = WebStream.ReadToEnd()
                    End Using
                End If
            End Try

After the Try statement, Retorno will contain the HTML response from the server, no matter which error code the server returns.

The headers have no influence on this behaviour.
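
For readers following the C# snippets earlier in the thread, the same pattern might look roughly like this in C#. It is a sketch, not a translation of the author's SiteWebClient: the class and method names are hypothetical, and it assumes the body is UTF-8, as the VB code does.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

static class ErrorBodyReader
{
    // Returns the response body carried by a WebException, or "" when the
    // failure produced no HTTP response at all (DNS error, timeout, ...).
    public static string ReadBody(WebException ex)
    {
        if (ex.Response == null) return "";
        using (var stream = ex.Response.GetResponseStream())
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    // Downloads a page; if the server answers with an error status such as
    // 403, returns the body of that error response instead of throwing.
    public static string DownloadEvenOnError(WebClient client, string url)
    {
        try
        {
            return client.DownloadString(url);
        }
        catch (WebException ex)
        {
            return ReadBody(ex);
        }
    }
}
```

Catching only WebException keeps unrelated failures, such as an invalid URL argument, visible to the caller instead of silently swallowing them.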

Search answered 15/6, 2019 at 13:29 Comment(0)