How can I implement Mozilla readability.js on my website?
P

3

7

https://github.com/mozilla/readability (readability.js is for creating a read view for web pages)

How can I apply readability.js to the test webpage below? The problem is that readability.js deletes the elements of the page that I want to keep and leaves those that should be removed. Is there any documentation on how to use readability.js? I hope someone can help me. Thank you!

<html><head>
<title>Reader View shows only the browser in reader view</title>
    <script src="https://raw.githack.com/mozilla/readability/master/Readability.js"></script>
</head>
<body>
Everything outside the main div tag vanishes in Reader View<br>
<img class="no-print" src="http://dummyimage.com/1024x100/000/ffffff&text=This+banner+should+vanish+in+print+view">
<div>
   <h1>H1 tags outside of a p tag are hidden in reader view</h1>
   <img class="no-print" src="http://dummyimage.com/1024x100/000/ffffff&text=This+banner+is+resized+in+print+view">
   <p>
 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
 123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
 123456789 123456
</p>
</div>
    <script>
    var article = new Readability(document).parse();
    </script>
</body>
</html>

Source of the test page: "Optimize website to show reader view in Firefox"

Polanco answered 10/6, 2020 at 22:38
H
8

You can use DOMPurify and Readability together, as mentioned in the Readability docs:

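import { Readability } from '@mozilla/readability'
import DOMPurify from 'dompurify';

// Run Readability on a document and return the parsed article (or null).
function readable(doc) {
  const reader = new Readability(doc)
  const article = reader.parse()
  return article
}

// parse() modifies the document it is given, so work on a clone.
let cloneDoc = document.cloneNode(true)
let parsed = readable(cloneDoc)
// parse() can return null when no article content is found.
const markup = parsed ? DOMPurify.sanitize(parsed.content) : ''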

Here, markup is an HTML string containing the readable content. Try console.log(parsed) to see the available properties.
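
As an alternative to reading a raw console.log dump, a small helper can pick out a few fields of the parsed article. This is only a sketch: summarizeArticle is a hypothetical name, the sample object is made up, and the fields it reads (title, byline, excerpt, length) are the ones listed in the Readability README. It also covers the case where parse() returns null.

```javascript
// Hypothetical helper: summarize the object returned by Readability's parse().
// Field names (title, byline, excerpt, length) follow the Readability README.
function summarizeArticle(article) {
  // parse() returns null when it cannot find article content.
  if (!article) {
    return 'No readable content found';
  }
  const lines = [
    `Title: ${article.title || '(untitled)'}`,
    `Byline: ${article.byline || '(unknown)'}`,
    `Length: ${article.length} characters`,
  ];
  if (article.excerpt) {
    lines.push(`Excerpt: ${article.excerpt}`);
  }
  return lines.join('\n');
}

// Made-up sample standing in for a real parse() result.
const sample = { title: 'Test page', byline: null, excerpt: '', length: 512 };
console.log(summarizeArticle(sample));
```

In the snippet above you would call summarizeArticle(parsed) instead of passing the sample object.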

Huesman answered 26/8, 2020 at 14:23
L
4

Did you try this?

From their GitHub page:

"Readability's parse() works by modifying the DOM. This removes some elements in the web page. You could avoid this by passing the clone of the document object while creating a Readability object."

var documentClone = document.cloneNode(true); 
var article = new Readability(documentClone).parse();

Making a copy of the document this way means parse() works on the clone, so the real DOM is not modified.
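
The same idea can be wrapped in a small function so the clone is never forgotten. A minimal sketch, assuming the Readability constructor is passed in as a parameter (extractArticle and ReadabilityCtor are hypothetical names, chosen so the sketch does not depend on whether the library was loaded through a script tag or an import):

```javascript
// Hypothetical wrapper: run Readability on a clone so the live page is untouched.
// ReadabilityCtor is whatever constructor the page has loaded (e.g. Readability).
function extractArticle(doc, ReadabilityCtor) {
  // Deep copy of the document; parse() will modify this copy only.
  const clone = doc.cloneNode(true);
  // The result may be null if no article content is found.
  return new ReadabilityCtor(clone).parse();
}

// In a browser, with Readability.js loaded:
//   const article = extractArticle(document, Readability);
```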

Liripipe answered 16/8, 2020 at 21:0
P
0

Okay....

    // The test page has no element with id "body", so write to document.body.
    document.body.innerHTML = "<font face='Calibri' size='4'>" +
        "<h1>" + article.title + "</h1>" + article.content + "</font>";
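
The snippet above builds the page by string concatenation. A slightly safer variant, sketched here under a few assumptions, escapes the title before inserting it and uses a style attribute instead of the deprecated font tag. escapeHtml and buildReaderHtml are hypothetical names, and the sketch assumes article.title is plain text while article.content is already HTML (ideally sanitized first).

```javascript
// Escape text so it can be placed safely inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical helper: build reader-view markup from a parse() result.
// The title is escaped; the content is inserted as HTML unchanged.
function buildReaderHtml(article) {
  return '<article style="font-family: Calibri, sans-serif">' +
    '<h1>' + escapeHtml(article.title) + '</h1>' +
    article.content +
    '</article>';
}

// In a browser: document.body.innerHTML = buildReaderHtml(article);
```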
Polanco answered 11/6, 2020 at 17:25
