How to take a snapshot of a section of a web page from the shell?

I need to take a GIF snapshot of a section of a web page at a fixed time interval. The snapshot must be at full-page resolution, but it should extend only to a certain point on the page (in this case, just after a table).

What would be the best way to grab a page snapshot image like this? I'd like to just put it in a cron job and forget about it, but I'm not finding a tool that makes quick work of this.

SOLUTION:

Following @Eduardo's excellent direction, I implemented a quick, clean solution based on phantomjs and imagemagick (Mac: brew install phantomjs && brew install imagemagick):

*NOTE: If you want to remove imagemagick altogether, set a clip rectangle in rasterize.js before the page.render call: page.clipRect = { top: 10, left: 10, width: 500, height: 500 };

#! /usr/bin/env bash
# Used with PhantomJS - rasterize.js source: http://j.mp/xC7u1Z

refresh_seconds=30

while true; do
    date_now=$(date +"%Y-%m-%d %H%M")

    phantomjs rasterize.js "$1" "${date_now}-original.png"  # the URL is the first argument passed to this script
    convert "${date_now}-original.png" -crop 500x610+8+16 "${date_now}.png" # crop args: WIDTHxHEIGHT+LEFT_MARGIN+TOP_MARGIN
    rm "${date_now}-original.png"

    echo "Got image: ${date_now}.png - Now waiting ${refresh_seconds} seconds for next image..."
    sleep ${refresh_seconds}
done

And here's the js used by phantomjs in the above:

// As explained here: http://code.google.com/p/phantomjs/wiki/QuickStart

var page = new WebPage(),
    address, output, size;

if (phantom.args.length < 2 || phantom.args.length > 3) {
    console.log('Usage: rasterize.js URL filename');
    phantom.exit();
} else {
    address = phantom.args[0];
    output = phantom.args[1];
    page.viewportSize = { width: 600, height: 600 };
    page.open(address, function (status) {
        if (status !== 'success') {
            console.log('Unable to load the address!');
            phantom.exit(1); // exit on failure so the calling script does not hang
        } else {
            window.setTimeout(function () {
                page.render(output);
                phantom.exit();
            }, 200);
        }
    });
}
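
For the cron job mentioned at the top, one option is to drop the loop and let cron do the scheduling. The following is only a sketch under a few assumptions: the names RASTERIZE, snapshot_name and take_snapshot, the optional output-directory argument, and the underscore in the timestamp are illustrative and not part of the script above. Cron cannot run a job more often than once a minute, so the 30-second refresh would not carry over.

```shell
#!/usr/bin/env bash
# One-shot snapshot for cron: take a single cropped image and exit.
# Hypothetical names (not in the original script): RASTERIZE, snapshot_name,
# take_snapshot, and the optional output-directory argument.
set -eu

# Path to rasterize.js; cron jobs usually start in $HOME, so an absolute path is safer.
RASTERIZE="${RASTERIZE:-rasterize.js}"

# Timestamp without spaces, in the form YYYY-MM-DD_HHMM
snapshot_name() {
    date +"%Y-%m-%d_%H%M"
}

# take_snapshot URL [OUTPUT_DIR]: render the page, crop it, print the result path.
take_snapshot() {
    url="$1"
    base="${2:-.}/$(snapshot_name)"

    phantomjs "$RASTERIZE" "$url" "${base}-original.png"
    # crop args: WIDTHxHEIGHT+LEFT_MARGIN+TOP_MARGIN (same values as the loop version)
    convert "${base}-original.png" -crop 500x610+8+16 "${base}.png"
    rm "${base}-original.png"
    echo "${base}.png"
}

# Run only when a URL is supplied, so the file can also be sourced.
if [ "$#" -ge 1 ]; then
    take_snapshot "$@"
fi
```

A crontab entry along the lines of `*/5 * * * * /path/to/snapshot.sh "http://example.com/" /path/to/output` (both paths are placeholders) would then take one image every five minutes.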
Stonemason answered 22/2, 2012 at 7:7 Comment(2)
Any reason you don't just capture the content? I fear there is more to your question than you've provided. - Mylohyoid
@Mylohyoid nope, just wanted to take a snapshot. phantomjs turned out to be a great solution. - Stonemason

This question has already been answered here: How can I take a screenshot/image of a website using Python?

It was answered in 2009, but that option is still valid. I'll try to add a few more options.

Those tools produce full-page snapshots, which you can then crop easily with imagemagick.
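
As a rough sketch of that cropping step, assuming a full-page capture already exists on disk: the helper names crop_geometry and crop_snapshot and the file names are made up for illustration, while -crop and +repage are standard ImageMagick options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build an ImageMagick geometry string WIDTHxHEIGHT+LEFT+TOP.
crop_geometry() {
    printf '%sx%s+%s+%s\n' "$1" "$2" "$3" "$4"
}

# crop_snapshot INPUT OUTPUT WIDTH HEIGHT LEFT TOP
# +repage discards the virtual-canvas offset that -crop otherwise keeps in the PNG.
crop_snapshot() {
    convert "$1" -crop "$(crop_geometry "$3" "$4" "$5" "$6")" +repage "$2"
}

# Example (file names are placeholders):
#   crop_snapshot full-page.png table-only.png 500 610 8 16
```

The geometry is kept in its own function so the same four numbers can be reused elsewhere, for instance when choosing a clip rectangle.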

Another option available these days is PhantomJS. PhantomJS is a headless WebKit browser that you script with JavaScript (it is a standalone binary, not something that runs on Node); it lets you take a picture of a whole page or just an area of the page.

Take a look at this example:

var page = new WebPage(),
    address, output, size;

if (phantom.args.length < 2 || phantom.args.length > 3) {
    console.log('Usage: rasterize.js URL filename');
    phantom.exit();
} else {
    address = phantom.args[0];
    output = phantom.args[1];
    page.viewportSize = { width: 600, height: 600 };
    page.open(address, function (status) {
        if (status !== 'success') {
            console.log('Unable to load the address!');
            phantom.exit(1); // exit on failure so the calling script does not hang
        } else {
            window.setTimeout(function () {
                page.render(output);
                phantom.exit();
            }, 200);
        }
    });
}
Fleck answered 22/2, 2012 at 7:26 Comment(3)
Perfect! I was not familiar with phantomjs and it is an absolute gem. I was able to get this going very well with it and imagemagick. - Stonemason
You can also use clipRect to select the exact portion of the page you want to rasterize with phantomjs. code.google.com/p/phantomjs/wiki/Interface - Fleck
Updated URL for clipRect docs: github.com/ariya/phantomjs/wiki/… - Staphyloplasty
