There are lots of sites that use this (imo) annoying "infinite scrolling" style. Examples of this are sites like tumblr, twitter, 9gag, etc..
I recently tried to scrape some pics off of these sites programatically with HtmlAgilityPack. like this:
HtmlWeb web = new HtmlWeb();
HtmlDocument doc = web.Load(url);
var primary = doc.DocumentNode.SelectNodes("//img[@class='badge-item-img']");
var picstring = primary.Select(r => r.GetAttributeValue("src", null)).FirstOrDefault();
This works fine, but when I tried to load in the HTML from certain sites, I noticed that I only got back a small amount of content (lets say the first 10 "posts" or "pictures", or whatever..) This made me wonder if it would be possible to simulate the "scrolling down to the bottom" of the page in c#.
This isn't just the case when I load the html programatically, when I simply go to sites like tumblr, and I check firebug or just "view source", I expected that all the content would be in there somewhere, but alot of it seems to be hidden/inserted with javascript. Only the content that is actually visible on my screen is present in the HTML source.
So my questions is: is it possible to simulate infinitely scrolling down to a page, and loading in that HTML with c# (preferably)?
(I know that I can use API's for tumblr and twitter, but i'm just trying to have some fun hacking stuff together with HtmlAgilityPack)