Get the contents from the WebView using JavaFX
I am working on a Swing application that uses JavaFX controls. In my application I have to print the HTML page displayed in the WebView. What I am trying to do is load the HTML content of the WebView into a string with the help of HtmlDocument.

To load the HTML content from the WebView, I am using the following code, but it's not working:

try
{
    String str=webview1.getEngine().getDocment().Body().outerHtml();
}
catch(Exception ex)
{
}
Colbert answered 11/1, 2013 at 7:16 Comment(0)
L
25

WebEngine.getDocument returns an org.w3c.dom.Document, not the JavaScript document object that your code seems to expect.

Unfortunately, printing an org.w3c.dom.Document takes quite a bit of code. You can try the solution from "What is the shortest way to pretty print a org.w3c.dom.Document to stdout?"; see the code below.

Note that you need to wait until the page has finished loading before working with the Document. This is why the LoadWorker is used here:

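import java.io.OutputStreamWriter;
import javafx.beans.value.ChangeListener;
import javafx.beans.value.ObservableValue;
import javafx.concurrent.Worker;
import javafx.concurrent.Worker.State;
import javafx.scene.Scene;
import javafx.scene.web.WebEngine;
import javafx.scene.web.WebView;
import javafx.stage.Stage;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// This method belongs in a class that extends javafx.application.Application.
@Override
public void start(Stage primaryStage) {
    WebView webview = new WebView();
    final WebEngine webengine = webview.getEngine();
    // The document is only available once the load worker reports SUCCEEDED.
    webengine.getLoadWorker().stateProperty().addListener(
            new ChangeListener<State>() {
                @Override
                public void changed(ObservableValue<? extends State> ov, State oldState, State newState) {
                    if (newState == Worker.State.SUCCEEDED) {
                        Document doc = webengine.getDocument();
                        try {
                            // Serialize the DOM tree to stdout as indented XML.
                            Transformer transformer = TransformerFactory.newInstance().newTransformer();
                            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "no");
                            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
                            transformer.setOutputProperty(OutputKeys.INDENT, "yes");
                            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
                            transformer.setOutputProperty("{http://xml.apache.org/xslt}indent-amount", "4");

                            transformer.transform(new DOMSource(doc),
                                    new StreamResult(new OutputStreamWriter(System.out, "UTF-8")));
                        } catch (Exception ex) {
                            ex.printStackTrace();
                        }
                    }
                }
            });
    webengine.load("http://stackoverflow.com");
    primaryStage.setScene(new Scene(webview, 800, 800));
    primaryStage.show();
}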
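The question asks for the markup as a String rather than printed to stdout. A minimal sketch of that variant, using only the JDK's XML APIs, is to point the same kind of Transformer at a StringWriter. The class name DomToString and the method toHtmlString are hypothetical names chosen for illustration, and the "html" output method is a swapped-in choice (the code above uses "xml"). The main method builds a tiny document by hand so the helper can be tried without JavaFX; in the listener above you would pass webengine.getDocument() instead.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of JavaFX: serializes a DOM Document to a String.
class DomToString {

    static String toHtmlString(Document doc) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        // Assumption: HTML serialization is wanted; "xml" is the other common choice.
        transformer.setOutputProperty(OutputKeys.METHOD, "html");
        transformer.setOutputProperty(OutputKeys.INDENT, "no");
        // Write into memory instead of System.out.
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in document built by hand; a real one would come from WebEngine.getDocument().
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element html = doc.createElement("html");
        Element body = doc.createElement("body");
        Element p = doc.createElement("p");
        p.setTextContent("hello");
        body.appendChild(p);
        html.appendChild(body);
        doc.appendChild(html);
        System.out.println(toHtmlString(doc));
    }
}
```

As with the stdout version, the helper should only be called once the load worker has reached the SUCCEEDED state.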
Lir answered 11/1, 2013 at 18:13 Comment(1)
Your method will miss content if the site's HTML has errors. I am very concerned about this topic; if you have a fix, please help. - Marlo
G
45
String html = (String) webEngine.executeScript("document.documentElement.outerHTML");
Gingerich answered 29/12, 2013 at 17:29 Comment(1)
This one-liner won't work unless it runs inside a load worker listener; otherwise it returns empty HTML. It also won't work for a site like google.com: it won't return the live DOM, only the underlying HTML/JavaScript. - Auburn
