Any workaround to get text in an iFrame on another domain in a WebBrowser?

You will probably first think this is not possible because of XSS (same-origin) restrictions. But I'm trying to access this content from an application that hosts a WebBrowser, not from JavaScript code in a site.

I understand that it is not possible, and should not be possible by non-hacky means, to access this content from JavaScript, because that would be a big security issue. But it makes no sense to have this restriction in an application that hosts a WebBrowser. If I wanted to steal my application user's Facebook information, I could just do a Navigate("facebook.com") and do whatever I want in it. This is an application that hosts a WebBrowser, not a webpage.

Also, if you open in Google Chrome any webpage that contains an iFrame whose source is in another domain, right-click its content and click Inspect Element, it will show you the content. Even simpler, if you navigate to any webpage that contains an iFrame from another domain, you will see its content. If you can see it in the WebBrowser, then you should be able to access it programmatically, because it has to be somewhere in memory.

Is there any way to access this text, not through the DOM objects (they seem to be based on the same engine as JavaScript and are therefore subject to the same restrictions), but through lower-level objects such as MSHTML or SHDocVw?

Regen answered 21/9, 2011 at 19:29 Comment(1)
Have you considered using the Facebook API to get the user's information instead of web scraping? - Laughing

Would this be useful for you? It reads the src of each iframe in the loaded document and downloads that URL separately:

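// Download the source of every iframe in the currently loaded document.
foreach (HtmlElement elm in webBrowser1.Document.GetElementsByTagName("iframe"))
{
    string src = elm.GetAttribute("src");
    if (!string.IsNullOrEmpty(src))
    {
        // Resolve a relative src against the URL of the hosting page.
        Uri frameUri = new Uri(webBrowser1.Url, src);
        using (var client = new System.Net.WebClient()) // or use HttpWebRequest
        {
            string content = client.DownloadString(frameUri);
            MessageBox.Show(content);
        }
    }
}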
Riboflavin answered 28/9, 2011 at 19:13 Comment(1)
This worked! I had to copy my browser's cookies to a WebRequest to make it work on some tricky websites. Here is the code I used in case anyone else needs it: #3382998 - Regen
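The code behind that link is not reproduced in this thread. As a rough illustration of the cookie-copying step the comment describes, here is a minimal sketch that assumes the WebBrowser control shares its cookie store with WinINet and reads it through the `InternetGetCookieEx` API. The `FrameDownloader` class, its method names, and the fixed buffer size are hypothetical choices made for this example, not taken from the original post.

```csharp
using System;
using System.Net;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical helper, not part of the original answer.
static class FrameDownloader
{
    // Asks WinINet to include HttpOnly cookies, which Document.Cookie does not expose.
    private const int INTERNET_COOKIE_HTTPONLY = 0x2000;

    [DllImport("wininet.dll", CharSet = CharSet.Auto, SetLastError = true)]
    private static extern bool InternetGetCookieEx(
        string url, string cookieName, StringBuilder cookieData,
        ref int size, int flags, IntPtr reserved);

    // Returns the cookies stored for the URL as a "name=value; ..." string,
    // or null if there are none or the call fails.
    private static string GetCookieHeader(string url)
    {
        int size = 16 * 1024; // assumption: large enough for this sketch
        var buffer = new StringBuilder(size);
        bool ok = InternetGetCookieEx(url, null, buffer, ref size,
                                      INTERNET_COOKIE_HTTPONLY, IntPtr.Zero);
        return ok ? buffer.ToString() : null;
    }

    // Downloads the frame URL again, sending the browser's cookies along.
    public static string DownloadWithBrowserCookies(Uri frameUri)
    {
        string cookies = GetCookieHeader(frameUri.AbsoluteUri);
        using (var client = new WebClient())
        {
            if (!string.IsNullOrEmpty(cookies))
                client.Headers[HttpRequestHeader.Cookie] = cookies;
            return client.DownloadString(frameUri);
        }
    }
}
```

Inside the loop from the answer, `FrameDownloader.DownloadWithBrowserCookies(new Uri(src))` would take the place of the plain `DownloadString` call. Note that this issues a second request for the frame's URL, so the result can differ from what the browser has already rendered.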

Do you just need a way to request content from code?

// Method-body fragment. "webRequest" is presumably the caller's own object describing
// the request (URL, UserAgent, ContentType, Method, BytesToWrite); its type is not shown.
// Requires: using System.IO; using System.Net;
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(webRequest.URL);
request.UserAgent = webRequest.UserAgent;
request.ContentType = webRequest.ContentType;
request.Method = webRequest.Method;

// Write the request body, if there is one
if (webRequest.BytesToWrite != null && webRequest.BytesToWrite.Length > 0)
{
    using (Stream oStream = request.GetRequestStream())
    {
        oStream.Write(webRequest.BytesToWrite, 0, webRequest.BytesToWrite.Length);
    }
}

// Send the request, then read and return the whole response body
using (HttpWebResponse resp = (HttpWebResponse)request.GetResponse())
using (StreamReader sr = new StreamReader(resp.GetResponseStream()))
{
    return sr.ReadToEnd();
}
Mycostatin answered 21/9, 2011 at 19:58 Comment(1)
No, sorry, this won't do. I need to access the content in an already loaded frame inside a webpage. - Regen
