I have HTML that contains links such as the following:
<p class="Results">Web :
SPLIT<a href="http://www.google.fr/">http://www</a>
SPLIT<a href="http://www.google.fr/">.google</a>
SPLIT<a href="http://www.google.fr/">.fr/</a>
</p>
We are converting the HTML into a PDF using flying-saucer.
When I open the resulting PDF, the links are not clickable.
It seems as if the hrefs are not added.
If I remove the SPLIT word, the links are clickable, but only because the PDF reader's built-in link detector makes any valid URL clickable.
Any ideas why my links are being removed in the final PDF?
Code:
ITextRenderer itextRender = null;
Tidy tidy = new Tidy();
tidy.setXmlOut(true);
tidy.setShowWarnings(false);
// tidy.setXmlTags(false);
tidy.setInputEncoding(UTF_8_DN);
tidy.setOutputEncoding(UTF_8_DN);
tidy.setXHTML(true);
tidy.setMakeClean(true);
dataStream = new ByteArrayInputStream(data);
stream = new ByteArrayOutputStream(32 * 1024);
// Post process - convert the HTML into valid XHTML
org.w3c.dom.Document w3cDoc = tidy.parseDOM(dataStream, stream);
itextRender = new ITextRenderer();
itextRender.setDocument(w3cDoc, null);
itextRender.layout();
itextRender.createPDF(stream);
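One way to narrow this down would be to check whether the href attributes are still present in the DOM that is passed to setDocument. The following is only a sketch under assumptions: LinkDump, parse and hrefs are hypothetical names, and it parses a literal copy of the sample markup with the JDK's own XML parser rather than the w3cDoc returned by Tidy. In the real pipeline, hrefs(w3cDoc) could be called just before itextRender.setDocument(w3cDoc, null).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class LinkDump {

    // Parses a well-formed XHTML fragment with the JDK's built-in parser.
    // Assumption: in the real code the Document comes from tidy.parseDOM(...) instead.
    static Document parse(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xhtml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XHTML", e);
        }
    }

    // Collects the href value of every <a> element that has one, in document order.
    static List<String> hrefs(Document doc) {
        List<String> result = new ArrayList<>();
        NodeList anchors = doc.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            Element a = (Element) anchors.item(i);
            if (a.hasAttribute("href")) {
                result.add(a.getAttribute("href"));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Same markup as the sample at the top of the question.
        String xhtml = "<p class=\"Results\">Web : "
                + "SPLIT<a href=\"http://www.google.fr/\">http://www</a>"
                + "SPLIT<a href=\"http://www.google.fr/\">.google</a>"
                + "SPLIT<a href=\"http://www.google.fr/\">.fr/</a>"
                + "</p>";
        for (String href : hrefs(parse(xhtml))) {
            System.out.println(href);
        }
    }
}
```

If the list comes back with all three URLs for the Tidy output as well, the attributes are reaching the renderer and the problem is more likely in the PDF generation step than in the HTML clean-up.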
Update
I did various experiments and they all failed.
I tried adding the style display:block to my links - this failed.
I tried adding a form with a get action and a button/submit input - this failed even more, as every input type="button" or submit is interpreted as a text field in the final PDF.
return "<input type=\"button\" value=\"Click me\">" +
"<form action=\"http://www.example.com\" method=\"GET\">\n" +
" <input type=\"submit\" /> \n" +
"</form>"+
"</input><a href=\"" + url + "\" title=\"" + linkContent + "\" target=\"_blank\" style=\"display:block\">" + linkContent + "</a>";
For example, the "Click me" button is turned into an editable text field.