Programmatically get a screenshot of a page

I'm writing a specialized crawler and parser for internal use, and I need to be able to take a screenshot of a web page in order to check which colours are used throughout. The program will take in around ten web addresses and save each one as a bitmap image.

From there I plan to use LockBits to build a list of the five most-used colours in the image. To my knowledge, this is the easiest way to get the colours used within a web page, but if there is an easier way, please chime in with your suggestions.
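For reference, here is a minimal sketch of the counting step I have in mind, assuming the captured page ends up as a 32bpp Bitmap (the ColourCounter class and TopColours method are just placeholder names):

using System;
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.Linq;
using System.Runtime.InteropServices;

static class ColourCounter
{
    // Returns the 'count' most frequently used colours in the bitmap.
    public static List<Color> TopColours(Bitmap bitmap, int count)
    {
        var counts = new Dictionary<int, int>();
        Rectangle bounds = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
        BitmapData data = bitmap.LockBits(bounds, ImageLockMode.ReadOnly, PixelFormat.Format32bppArgb);
        try
        {
            int[] row = new int[data.Width];
            for (int y = 0; y < data.Height; y++)
            {
                // Copy one row of 32-bit ARGB pixels; Stride accounts for any row padding.
                Marshal.Copy(IntPtr.Add(data.Scan0, y * data.Stride), row, 0, data.Width);
                foreach (int argb in row)
                {
                    int n;
                    counts.TryGetValue(argb, out n);
                    counts[argb] = n + 1;
                }
            }
        }
        finally
        {
            bitmap.UnlockBits(data);
        }

        return counts.OrderByDescending(kv => kv.Value)
                     .Take(count)
                     .Select(kv => Color.FromArgb(kv.Key))
                     .ToList();
    }
}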

Anyway, I was going to use the ACA WebThumb ActiveX Control until I saw the price tag. I'm also fairly new to C#, having only used it for a few months. Is there a way to take a screenshot of a web page so that I can extract its colour scheme?

Thrill asked 30/12, 2009 at 18:25 Comment(6)
Haven't tried it (which is why this is a comment, not an answer) but ( dreamincode.net/code/snippet2539.htm ) seems to be a C# solution to save a web page as a bitmap.Dvina
How many pages do you crawl per month?Rehabilitate
Not many, I'm only using the images as a means to extract data so if one or two fail then it's no big problem. So far I've had no problems with it, other than the fact that it needs to use Application.Run() to move onwards.Thrill
In that case I added an answer that I think will work well, as WebBrowser.DrawToBitmap is very unreliable.Rehabilitate
I added to my answer below with another code sample to show how to do this in a Windows Forms application.Overlay
With a quick look at the feedback for the snippet Michael Todd linked to, I'd say $60 sounds like a fair price.Sharondasharos

https://screenshotlayer.com/documentation is the only free service I can find lately...

You'll need to use HttpWebRequest to download the binary image data. See the URL above for details.

using System.Drawing;
using System.IO;
using System.Net;

// Download the screenshot image returned by the service
HttpWebRequest request = (HttpWebRequest)WebRequest.Create("https://[url]");
Bitmap bitmap;
using (WebResponse response = request.GetResponse())
using (Stream stream = response.GetResponseStream())
{
    bitmap = new Bitmap(stream);
}
// now that you have a bitmap, you can do what you need to do...
Rehabilitate answered 22/3, 2010 at 22:41 Comment(8)
I'll give this method a try and see how it affects the data extraction aspect.Thrill
@PsychoDad, my stream has more than 65,535 pixels. What do I do?Zins
@Zins what do you mean 65535 pixels? That's nothing. What are the dimensions of the web page you are trying to screenshot?Rehabilitate
@PsychoDad, the page contains more than 100 PDF pages.Zins
FYI - The homepage of Websnapr says their services will be discontinued. Checked 17/12/2016.Derangement
@Mitulátbáti Added an alternativeRehabilitate
URL2PNG no longer offers a free tier: "not at this time. As they say... 'Fast, Good, Cheap. Pick two.'"Deliver
After struggling with Freezer, Axl.Web.Screenshot, and other code solutions found online, we eventually tried URL2PNG. An image rendered, but the screen just came up as gray. A support engineer helped debug this on their end, and we found that localStorage is not yet a supported feature. However, we were able to fix this in our app by adding some checks to see whether localStorage was available. After going through all this, I highly recommend URL2PNG for its ease of use and great support. Even though it's not free, it's definitely worth it IMO.Varanasi

A quick and dirty way would be to use the WinForms WebBrowser control and draw it to a bitmap. Doing this in a standalone console app is slightly tricky because you have to be aware of the implications of hosting an STA (single-threaded apartment) control while using a fundamentally asynchronous programming pattern. But here is a working proof of concept which captures a web page to an 800x600 BMP file:

namespace WebBrowserScreenshotSample
{
    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.Threading;
    using System.Windows.Forms;

    class Program
    {
        [STAThread]
        static void Main()
        {
            int width = 800;
            int height = 600;

            using (WebBrowser browser = new WebBrowser())
            {
                browser.Width = width;
                browser.Height = height;
                browser.ScrollBarsEnabled = true;

                // This will be called when the page finishes loading
                browser.DocumentCompleted += Program.OnDocumentCompleted;

                browser.Navigate("https://stackoverflow.com/");

                // This prevents the application from exiting until
                // Application.Exit is called
                Application.Run();
            }
        }

        static void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            // Now that the page is loaded, save it to a bitmap
            WebBrowser browser = (WebBrowser)sender;

            using (Graphics graphics = browser.CreateGraphics())
            using (Bitmap bitmap = new Bitmap(browser.Width, browser.Height, graphics))
            {
                Rectangle bounds = new Rectangle(0, 0, bitmap.Width, bitmap.Height);
                browser.DrawToBitmap(bitmap, bounds);
                bitmap.Save("screenshot.bmp", ImageFormat.Bmp);
            }

            // Instruct the application to exit
            Application.Exit();
        }
    }
}

To compile this, create a new console application and make sure to add assembly references for System.Drawing and System.Windows.Forms.
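If you compile outside Visual Studio, a command along these lines should work from a Developer Command Prompt (assuming the source file is saved as Program.cs):

csc /r:System.Drawing.dll /r:System.Windows.Forms.dll Program.cs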

UPDATE: I rewrote the code to avoid having to use the hacky polling WaitOne/DoEvents pattern. This code should be closer to following best practices.

UPDATE 2: You indicate that you want to use this in a Windows Forms application. In that case, forget about dynamically creating the WebBrowser control. What you want is to create a hidden (Visible=false) instance of a WebBrowser on your form and use it the same way I show above. Here is another sample which shows the user code portion of a form with a text box (webAddressTextBox), a button (generateScreenshotButton), and a hidden browser (webBrowser). While I was working on this, I discovered a peculiarity which I didn't handle before -- the DocumentCompleted event can actually be raised multiple times depending on the nature of the page. This sample should work in general, and you can extend it to do whatever you want:

namespace WebBrowserScreenshotFormsSample
{
    using System;
    using System.Drawing;
    using System.Drawing.Imaging;
    using System.IO;
    using System.Windows.Forms;

    public partial class MainForm : Form
    {
        public MainForm()
        {
            this.InitializeComponent();

            // Register for this event; we'll save the screenshot when it fires
            this.webBrowser.DocumentCompleted += 
                new WebBrowserDocumentCompletedEventHandler(this.OnDocumentCompleted);
        }

        private void OnClickGenerateScreenshot(object sender, EventArgs e)
        {
            // Disable button to prevent multiple concurrent operations
            this.generateScreenshotButton.Enabled = false;

            string webAddressString = this.webAddressTextBox.Text;

            Uri webAddress;
            if (Uri.TryCreate(webAddressString, UriKind.Absolute, out webAddress))
            {
                this.webBrowser.Navigate(webAddress);
            }
            else
            {
                MessageBox.Show(
                    "Please enter a valid URI.",
                    "WebBrowser Screenshot Forms Sample",
                    MessageBoxButtons.OK,
                    MessageBoxIcon.Exclamation);

                // Re-enable button on error before returning
                this.generateScreenshotButton.Enabled = true;
            }
        }

        private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            // This event can be raised multiple times depending on how much of the
            // document has loaded, if there are multiple frames, etc.
            // We only want the final page result, so we do the following check:
            if (this.webBrowser.ReadyState == WebBrowserReadyState.Complete &&
                e.Url == this.webBrowser.Url)
            {
                // Generate the file name here
                string screenshotFileName = Path.GetFullPath(
                    "screenshot_" + DateTime.Now.Ticks + ".png");

                this.SaveScreenshot(screenshotFileName);
                MessageBox.Show(
                    "Screenshot saved to '" + screenshotFileName + "'.",
                    "WebBrowser Screenshot Forms Sample",
                    MessageBoxButtons.OK,
                    MessageBoxIcon.Information);

                // Re-enable button before returning
                this.generateScreenshotButton.Enabled = true;
            }
        }

        private void SaveScreenshot(string fileName)
        {
            int width = this.webBrowser.Width;
            int height = this.webBrowser.Height;
            using (Graphics graphics = this.webBrowser.CreateGraphics())
            using (Bitmap bitmap = new Bitmap(width, height, graphics))
            {
                Rectangle bounds = new Rectangle(0, 0, width, height);
                this.webBrowser.DrawToBitmap(bitmap, bounds);
                bitmap.Save(fileName, ImageFormat.Png);
            }
        }
    }
}
Overlay answered 30/12, 2009 at 19:26 Comment(7)
Sorry for the huge delay; the code seems to work well, but I am struggling to use it within a form I have. I'm probably doing something stupid, but if you could give me a hand with it, it'd be much appreciated.Thrill
DrawToBitmap is not supported and will fail sometimes, leaving a blank black or blank white bitmapRehabilitate
@Overlay - Do you, by chance, have any idea why the page rendered by the IE browser control has some styles applied incorrectly?Dragging
@Jenea: I can't say without seeing a specific example. It probably depends on many factors...Overlay
@Overlay It turns out that the problem is the security settings of the IE control. If I disable IE ESC, the page renders normally. It would be nice if there were a way to disable the IE ESC settings just for the control, but that seems impossible.Dragging
Would work, but the one issue I run into is that I need a screenshot of the whole web page. If the page is larger than the browser control, I only get what is in the window, not the whole thing (see the sketch after these comments for one workaround).Mantoman
Must use browser.ScriptErrorsSuppressed = true;Rudderhead
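
Regarding the full-page question in the comments: one commonly suggested workaround (untested here, and DrawToBitmap remains officially unsupported for WebBrowser) is to resize the hidden control to the document's scroll size before drawing, roughly like this:

// Sketch only: inside OnDocumentCompleted, after the ReadyState check,
// grow the control to the full document size so DrawToBitmap renders
// the whole page rather than just the visible area.
this.webBrowser.ScrollBarsEnabled = false;
int fullWidth = this.webBrowser.Document.Body.ScrollRectangle.Width;
int fullHeight = this.webBrowser.Document.Body.ScrollRectangle.Height;
this.webBrowser.Width = fullWidth;
this.webBrowser.Height = fullHeight;

using (Bitmap bitmap = new Bitmap(fullWidth, fullHeight))
{
    this.webBrowser.DrawToBitmap(bitmap, new Rectangle(0, 0, fullWidth, fullHeight));
    bitmap.Save("fullpage.png", ImageFormat.Png);
}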

This question is old, but alternatively you can use the NuGet package Freezer. It's free, uses a recent Gecko web browser (supports HTML5 and CSS3), and ships as a single DLL.

var screenshotJob = ScreenshotJobBuilder.Create("https://google.com")
    .SetBrowserSize(1366, 768)
    .SetCaptureZone(CaptureZone.FullPage)
    .SetTrigger(new WindowLoadTrigger());

System.Drawing.Image screenshot = screenshotJob.Freeze();
Desimone answered 21/4, 2016 at 16:25 Comment(4)
Can it work with windows service to save the screenshots in a folder?Ceratoid
This is the best solution!Leralerch
Worked great for me!Skepticism
Freezer seems great when it works properly, but it constantly crashes with errors such as "RemoteWorker has shutdown", "Navigation error", or "Selected zone is 0 area". Furthermore, it no longer seems to be updated.Painkiller

There is a great WebKit-based headless browser, PhantomJS, which lets you execute arbitrary JavaScript from the command line.

Install it from http://phantomjs.org/download.html and run the following sample script from the command line:

./phantomjs ../examples/rasterize.js http://www.panoramio.com/photo/76188108 test.jpg

It will create a screenshot of the given page as a JPEG file. The upside of this approach is that you don't rely on any external provider and can easily automate screenshot capture in large quantities.
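
Since the rest of this thread is C#, here is a rough sketch of driving PhantomJS from C# for a batch of URLs; the paths to phantomjs.exe and rasterize.js below are assumptions, so adjust them for your setup:

using System.Diagnostics;

class PhantomCapture
{
    // Sketch: run PhantomJS once per URL; rasterize.js writes the screenshot file.
    public static void Capture(string url, string outputFile)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = @"C:\tools\phantomjs\phantomjs.exe",   // assumed install path
            Arguments = "rasterize.js \"" + url + "\" \"" + outputFile + "\"",
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
        }
    }
}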

Solidus answered 31/7, 2012 at 8:38 Comment(4)
+1 also being Webkit you know it renders modern web pages wellDemarcate
Pretty great tool, but it does not render my page very well; it has lots of jQuery and SlickGrid.Weatherman
Hmmm. Result should be similar to what you get with any other Webkit browserSolidus
Are you sure that the page does not load any content after document.ready event?Solidus

I used WebBrowser and it didn't work perfectly for me, especially when it needed to wait for JavaScript rendering to complete. I tried a few APIs and settled on Selenium; the most important thing about Selenium is that it does not require STAThread and can run in a simple console app as well as in services.

Give it a try:

// Requires the Selenium WebDriver NuGet package and Firefox installed locally.
using System.Drawing.Imaging;
using OpenQA.Selenium.Firefox;

class Program
{
    static void Main()
    {
        var driver = new FirefoxDriver();

        driver.Navigate()
            .GoToUrl("http://stackoverflow.com/");

        // FirefoxDriver implements ITakesScreenshot; older Selenium versions
        // accept a System.Drawing.Imaging.ImageFormat here.
        driver.GetScreenshot()
            .SaveAsFile("stackoverflow.jpg", ImageFormat.Jpeg);

        driver.Quit();
    }
}
Jesu answered 15/6, 2015 at 8:56 Comment(2)
The WebDriver for the browser must be installed; only then will it work.Most
1) the browser must be pre-installed on the server; 2) only part of the page appears in the screenshot; it's not clear how to get a full-page screenshot.Painkiller

Check this out. It seems to do what you want, and technically it approaches the problem in a very similar way, through the web browser control. It caters for a range of parameters to be passed in and has good error handling built in. The only downside is that it is an external process (an exe) that you spawn, and it creates a physical file that you read later. From your description, you were even considering web services, so I don't think that is a problem.

Regarding your latest comment about how to process several of them simultaneously, this fits well. You can spawn, say, 3, 4, 5 or more processes in parallel at any one time, or run the colour analysis on a separate thread while another capture process is in progress.
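A rough sketch of that kind of throttled fan-out (CaptureToFile here is a hypothetical helper wrapping whichever external capture tool you settle on, and urls is your list of addresses):

using System;
using System.Threading.Tasks;

// Sketch: capture several pages at once, limited to four external processes in parallel.
var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };
Parallel.ForEach(urls, options, url =>
{
    string fileName = "capture_" + Guid.NewGuid().ToString("N") + ".png";
    CaptureToFile(url, fileName);   // hypothetical wrapper around the external exe
    // The colour analysis for each finished file could also run here,
    // or be queued onto a separate thread.
});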

For image processing, I recently came across Emgu. I haven't used it myself, but it seems fascinating. It claims to be fast and to have a lot of support for graphics analysis, including reading pixel colours. If I had a graphics-processing project on hand right now, I would give it a try.

Tightlipped answered 24/3, 2010 at 20:57 Comment(0)

You may also have a look at Qt Jambi: http://qt.nokia.com/doc/qtjambi-4.4/html/com/trolltech/qt/qtjambi-index.html

They have a nice WebKit-based Java browser implementation where you can take a screenshot simply by doing something like:

    QPixmap pixmap = QPixmap.grabWidget(browser);
    pixmap.save(writeTo, "png");

Have a look at the samples - they have a nice webbrowser demo.

Belting answered 13/4, 2010 at 16:0 Comment(0)
