jTidy and TagSoup documentation


Asked 15/12, 2010 at 16:49 Answered 15/12, 2010 at 17:4

Solved java jtidy tag-soup jericho-html-parser

I'm looking for documentation (official documentation, if possible) for the TagSoup and jTidy libraries.

I want to use these libraries to manipulate HTML "tag soup" files that include XML tags from different namespaces mixed in among the HTML (HTML, XHTML or HTML5) tags.

I have tested HTMLCleaner, NekoHTML and Jericho, but I can't find documentation for jTidy and TagSoup beyond the simplest examples of cleaning up a file.

I need documentation on manipulating content, replacing tags, extracting information, etc.

Thanks

Note: After testing all the options, I ended up using StAX / Woodstox.
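
For readers wondering what the StAX route looks like for namespace-mixed markup, here is a minimal sketch. It is not the asker's code: the class name `StaxNamespaceSketch`, the method `countElementsByNamespace`, and the `urn:example:custom` namespace are all made up for illustration. It uses only the StAX API bundled with the JDK (`javax.xml.stream`); with Woodstox on the classpath, `XMLInputFactory.newInstance()` would normally pick it up instead, but that is an assumption about the setup. Note that StAX expects well-formed input, so real tag soup would have to be cleaned first.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxNamespaceSketch {

    /** Counts start elements per namespace URI in a well-formed document. */
    static Map<String, Integer> countElementsByNamespace(String xml) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            factory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, Boolean.TRUE);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    // Pull the next event; only element starts are of interest here.
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        String ns = reader.getNamespaceURI();
                        // Elements without a namespace are grouped under "".
                        counts.merge(ns == null ? "" : ns, 1, Integer::sum);
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Not well-formed: StAX does not repair tag soup.
            throw new IllegalStateException("Input is not well-formed XML", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical input: XHTML with one element from a custom namespace.
        String xml = "<html xmlns=\"http://www.w3.org/1999/xhtml\""
                + " xmlns:x=\"urn:example:custom\">"
                + "<body><p>text</p><x:widget id=\"w1\"/></body></html>";
        System.out.println(countElementsByNamespace(xml));
    }
}
```

The same loop can branch on `reader.getLocalName()` and `reader.getNamespaceURI()` to extract or rewrite specific elements while streaming.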

Isabea answered 15/12, 2010 at 16:49 Comment(4)

Did you consider Jsoup? It can't be done better/easier. It has a good Cookbook as well. – Bugeye 15/12, 2010 at 17:1

I'm testing Jsoup. It looks easy, but judging from the example code it doesn't seem to be thread-safe. Am I right? – Isabea 15/12, 2010 at 17:59

Is it just me, or does Jsoup not support output streams? – Difficile 27/9, 2015 at 17:50

@Difficile Check at the end of the question. StAX is for streams. – Isabea 27/9, 2015 at 18:47

The answer to a similar question on the tagsoup-friends Google Group may help:

Documentation for TagSoup

You've probably already seen them, but the javadoc for JTidy is available here: http://jtidy.sourceforge.net/apidocs/index.html

Goddamn answered 15/12, 2010 at 17:4 Comment(2)

So TagSoup uses the SAX API, but what about JTidy? :( Thanks – Isabea 15/12, 2010 at 17:33

JTidy does not. Basically, you give it an input stream, it parses it, and then you get the result from the output stream. – Sarah 27/2, 2012 at 4:47
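
To illustrate the SAX point from the comments: because TagSoup exposes a SAX interface, extraction logic is written as an ordinary `ContentHandler`. The sketch below is a hedged illustration, not code from the thread. `SaxLinkExtractorSketch`, `LinkCollector` and `extractLinks` are hypothetical names, and it runs against the JDK's built-in SAX parser so that it works without extra jars. The assumption is that, for real tag soup, the `XMLReader` would be swapped for `org.ccil.cowan.tagsoup.Parser` while the handler stays the same.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxLinkExtractorSketch {

    /** Hypothetical handler that collects the href of every <a> element. */
    static class LinkCollector extends DefaultHandler {
        final List<String> links = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("a".equalsIgnoreCase(localName)) {
                String href = atts.getValue("href");
                if (href != null) {
                    links.add(href);
                }
            }
        }
    }

    static List<String> extractLinks(String markup) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            // JDK parser: requires well-formed input.
            // Assumption: with TagSoup on the classpath, use
            // new org.ccil.cowan.tagsoup.Parser() here instead.
            XMLReader reader = factory.newSAXParser().getXMLReader();
            LinkCollector collector = new LinkCollector();
            reader.setContentHandler(collector);
            reader.parse(new InputSource(new StringReader(markup)));
            return collector.links;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse markup", e);
        }
    }

    public static void main(String[] args) {
        String markup = "<html><body><a href=\"a.html\">A</a>"
                + "<p><a href=\"b.html\">B</a></p></body></html>";
        System.out.println(extractLinks(markup));
    }
}
```

Keeping the logic in the handler means the choice of parser (strict XML versus a tag-soup-tolerant one) stays a one-line decision.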
