Get the html of the javascript-rendered page (after interacting with it)
U

6

40

I would like to be able to save the state of the html page after I've interacted with it.

Say I click a checkbox, or JavaScript sets the values of various elements.

How can I save the "javascript-rendered" page?

Thanks.

Ulloa answered 13/4, 2012 at 16:37 Comment(0)
C
27

This should do it; it grabs the whole page, not just the body:

console.log(document.getElementsByTagName('html')[0].innerHTML);
Carruthers answered 13/4, 2012 at 16:39 Comment(5)
This is probably better, but again, the problem is saving it. It might need a bookmarklet, no? - Ulloa
It depends where you want to save it: you can assign it to a variable, or make an Ajax call and save it in a database or as a file. - Carruthers
But that all requires me to inject my script into the page, doesn't it? - Ulloa
No, you can save it as a string, or, if you must save it as a DOM element, you can create a document fragment or another element to store it in. You can then traverse it as normal even if it is not "injected" into your page. So you can technically grab the whole page, remove it, redraw a new one and bring back whichever old one you had. That is the theory, at least; there might be cross-browser issues. - Carruthers
@Ulloa you can do it by passing the information in the query string and rendering it on the server side. Suppose the page URL looks something like foo.com/?sex=male&age=120&type=developer. This is probably bad practice, but it will work for you and you don't need JavaScript then. - Rodriques
D
50

In Chrome (and apparently Firefox), the console provides a special copy() utility function that copies a string representation of whatever you pass it to the clipboard. You can then paste the rendered content into your preferred text editor and do whatever you want with it.

https://developers.google.com/chrome-developer-tools/docs/commandline-api#copyobject

Console Example:

copy(document.body.innerHTML);

Note: I noticed Chrome reports undefined after the method is run, however, it seems to execute correctly and the right content is in the clipboard.

Disadvantaged answered 16/11, 2013 at 21:5 Comment(7)
Fantastic! I've been searching for ages for this. No idea why it isn't rated higher - it's exactly what's needed. Thanks! - Liscomb
YOU SAVED MY LIFE! - Hoyden
This is just marvelous. Hats off to you. - Stipulation
undefined doesn't mean that the method isn't defined; it means that the method returns undefined, which it does. - Teodora
Works on Safari as well. - Tally
If you use copy(document.documentElement.outerHTML); you will get the entire HTML page including <head> (except the DOCTYPE declaration). - Wandering
This solution gets me the HTML, but it contains the scripts instead of the rendered HTML. For example, when a script creates or modifies the DOM, I just get the script and not the rendered DOM. - Earthstar
M
4

document.body.innerHTML will get you the HTML representation of the current document body.

That will not necessarily include all internal state of DOM objects, because the serialized HTML reflects attributes (the initial default state of objects), not necessarily the properties they may have been changed to. For example, ticking a checkbox sets its checked property but does not add a checked attribute, so the tick does not show up in innerHTML. The only way to guarantee you get all that state is to make a list of what state you want to save and actually get that state programmatically.
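A minimal sketch of getting that state programmatically, assuming the state you care about is limited to form controls. The helper name syncFormState is hypothetical (it is not a browser API); it copies live properties back into attributes so that a later innerHTML or outerHTML call reflects them. Password fields are skipped on purpose so their contents do not end up in the saved markup.

```javascript
// Hypothetical helper: write live form state (DOM properties) back into
// attributes/content so that HTML serialization reflects what the user sees.
function syncFormState(root) {
  for (const el of root.querySelectorAll('input, textarea, select')) {
    const tag = el.tagName.toLowerCase();
    if (tag === 'input' && el.type === 'password') {
      continue; // do not leak passwords into the saved HTML
    }
    if (tag === 'input' && (el.type === 'checkbox' || el.type === 'radio')) {
      // Add or remove the "checked" attribute to match the current property.
      el.toggleAttribute('checked', el.checked);
    } else if (tag === 'input') {
      el.setAttribute('value', el.value);
    } else if (tag === 'textarea') {
      // A textarea's serialized content is its child text, not a value attribute.
      el.textContent = el.value;
    } else if (tag === 'select') {
      for (const opt of el.options) {
        opt.toggleAttribute('selected', opt.selected);
      }
    }
  }
}

// Only runs in a browser (for example when pasted into the console).
if (typeof document !== 'undefined') {
  syncFormState(document);
  console.log(document.documentElement.outerHTML);
}
```

This only covers form controls; other state (canvas contents, scroll positions, JavaScript variables) would need its own handling.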

To answer the part of your question about saving it, you'll have to describe more about what problem you're really trying to solve.

Moriah answered 13/4, 2012 at 16:38 Comment(9)
Cool... but then I need a way to save that, right? So should I just create a bookmarklet that copies document.body.innerHTML to the clipboard? - Ulloa
@Ulloa - You'll have to describe more about the problem you're really trying to solve in order for me to answer further. - Moriah
The problem: I visit a webpage, and I want to save it (and its state) after I've interacted with it. - Ulloa
@Ulloa - Save it for what purpose? What are you going to do with the saved version? We can help a lot better if you tell us what the REAL end goal is here. After all, if you just want to view it again, a screenshot or printing the page is probably the most foolproof way of saving. I save my digital receipts to my hard disk by doing File/Save As in the browser. There are lots of different ways to save depending upon what you want to do with it later. - Moriah
To load it up again, I suppose. Or to fill out a form in HTML and convert it to PDF. Whatever the purpose, there are many. - Ulloa
@Ulloa - There is no generic answer. It depends EXACTLY on what you want to do with it later. If you just want to view it later, then just use File/Save As, print a copy of the page, or take a screenshot. If you want to reload it in exact form so you can interact with it again later as if you never closed the browser window, you probably cannot do that, because you cannot reproduce the exact JavaScript state of the page on the actual domain of the site. - Moriah
File/Save As doesn't save the currently filled-in stuff, does it? - Ulloa
Also, for the most part I'm not concerned with saving the JavaScript state, just the page state, i.e. the effects on the DOM elements. - Ulloa
@Ulloa - It depends. Sorry, as I've now said a couple of times, I can't help any more without you saying EXACTLY what you want to do with the saved version in the future. - Moriah
K
4

To get the equivalent of "view source" with the JavaScript-rendered content, including the doctype and html tags, paste this command into the Chrome console:

console.log(new XMLSerializer().serializeToString(document.doctype) + document.getElementsByTagName('html')[0].outerHTML);

In the Chrome console, hover over the end of the output and click the copy link to copy it to the clipboard.
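One caveat: on a page that declares no doctype, document.doctype is null, and passing null to serializeToString throws a TypeError. A minimal sketch that guards against this, assuming a hypothetical helper named serializePage (not a built-in) that takes the document and a serializer as arguments:

```javascript
// Hypothetical helper: return the doctype (if the page has one) followed by
// the full <html> element as a single string.
function serializePage(doc, serializer) {
  const doctype = doc.doctype
    ? serializer.serializeToString(doc.doctype) + '\n'
    : ''; // no doctype declared: skip it instead of throwing
  return doctype + doc.documentElement.outerHTML;
}

// Only runs in a browser (for example when pasted into the console).
if (typeof document !== 'undefined') {
  console.log(serializePage(document, new XMLSerializer()));
}
```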

Karlynkarma answered 17/6, 2020 at 18:39 Comment(0)
H
3

Pasting the following to your browser console ( F12 -> Console ) will automatically save a file called rendered.html to your downloads directory:

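// Create a temporary link that is never attached to the page.
let link = document.createElement("a");
// Wrap the current DOM (everything inside <html>) in a Blob and point the link at it.
link.href = URL.createObjectURL(new Blob([document.getElementsByTagName('html')[0].innerHTML], { type: 'text/html' }));
// The download attribute makes the browser save the target instead of navigating to it.
link.download = "rendered.html";
link.click();
// Release the object URL once the download has been triggered.
URL.revokeObjectURL(link.href);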
Habsburg answered 7/3, 2023 at 8:56 Comment(1)
If this would name the .html file according to the page you are saving, that'd be great. To make it absolutely repeatable, the file name would need a formatted DateTime, as well. - Chitchat
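A rough sketch of that naming idea, assuming the page title is an acceptable basis for the file name. The helper buildFileName and its slug and timestamp format are illustrative choices, not part of the original answer; the download part reuses the same Blob and link technique, but with documentElement.outerHTML so the html tag itself is included.

```javascript
// Hypothetical helper: derive a file name from a page title and a date.
// Titles without any ASCII letters or digits fall back to "rendered".
function buildFileName(title, date) {
  const slug = title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse anything unsafe into single dashes
    .replace(/^-+|-+$/g, '') || 'rendered';
  // ISO timestamp with ":" and "." replaced, since those are awkward in file names.
  const stamp = date.toISOString().replace(/[:.]/g, '-');
  return `${slug}_${stamp}.html`;
}

// Only runs in a browser (for example when pasted into the console).
if (typeof document !== 'undefined') {
  const blob = new Blob([document.documentElement.outerHTML], { type: 'text/html' });
  const link = document.createElement('a');
  link.href = URL.createObjectURL(blob);
  link.download = buildFileName(document.title, new Date());
  link.click();
  URL.revokeObjectURL(link.href);
}
```

With this format, a page titled "My Page!" saved on 7 March 2023 at 08:56 UTC would be named my-page_2023-03-07T08-56-00-000Z.html.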
C
-1

Grant's solution is the most precise one; however, you still need to fiddle with the console manually. To get the same result more conveniently, you can use the following browser extension.

Source Code

Firefox Extension

Usage

  1. Visit web page to copy rendered HTML from.
  2. Right mouse click anywhere on the page.
  3. Press "Copy page as HTML text".
  4. The clipboard now contains the same text you would have gotten by executing copy(document.body.innerHTML); in the console.
Chitchat answered 15/4, 2023 at 15:2 Comment(0)
