Acrobat PDF generated with FlyingSaucer & ITextRenderer seems to be removing links

I have HTML that contains links such as the following:

<p class="Results">Web : 
SPLIT<a href="http://www.google.fr/">http://www</a>
SPLIT<a href="http://www.google.fr/">.google</a>
SPLIT<a href="http://www.google.fr/">.fr/</a>
</p>

We are converting the HTML into a PDF using flying-saucer.

When I open the PDF produced by the code below, the links are not clickable. It seems as if the hrefs are never added.

If I remove the word SPLIT, the links become clickable, but only because the reader's built-in link detection makes any text that looks like a valid URL clickable.

Any ideas why my links are being removed in the final PDF?

Code:

  ITextRenderer itextRender = null;
  Tidy tidy = new Tidy();
  tidy.setXmlOut(true);
  tidy.setShowWarnings(false);
  // tidy.setXmlTags(false);
  tidy.setInputEncoding(UTF_8_DN);
  tidy.setOutputEncoding(UTF_8_DN);
  tidy.setXHTML(true);//
  tidy.setMakeClean(true);

  dataStream = new ByteArrayInputStream(data);

  stream = new ByteArrayOutputStream(32 * 1024);

  // Post-process: convert the HTML into valid XHTML
  org.w3c.dom.Document w3cDoc = tidy.parseDOM(dataStream, stream);

  itextRender = new ITextRenderer();
  itextRender.setDocument(w3cDoc, null);

  itextRender.layout();
  itextRender.createPDF(stream);
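
One way to narrow this down might be to check whether the anchors survive the Tidy step at all, before the document reaches ITextRenderer. The sketch below is not part of the original code: `LinkAudit`, `collectHrefs`, and `parse` are hypothetical names, and `parse` only stands in for `tidy.parseDOM` so that the example runs with the JDK alone. It assumes lowercase element names, as in XHTML output.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: lists the href of every <a> element in a DOM document.
class LinkAudit {

    /** Returns the href of each <a> element in document order ("" if the attribute is missing). */
    static List<String> collectHrefs(Document doc) {
        List<String> hrefs = new ArrayList<>();
        NodeList anchors = doc.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            Element anchor = (Element) anchors.item(i);
            hrefs.add(anchor.getAttribute("href"));
        }
        return hrefs;
    }

    /** Parses a well-formed XHTML string; a stand-in for the Document that Tidy returns. */
    static Document parse(String xhtml) {
        try {
            byte[] bytes = xhtml.getBytes(StandardCharsets.UTF_8);
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html><body><p class=\"Results\">Web : "
                + "SPLIT<a href=\"http://www.google.fr/\">http://www</a>"
                + "SPLIT<a href=\"http://www.google.fr/\">.google</a>"
                + "SPLIT<a href=\"http://www.google.fr/\">.fr/</a>"
                + "</p></body></html>";
        // Prints one line per anchor found in the parsed document.
        for (String href : collectHrefs(parse(xhtml))) {
            System.out.println(href);
        }
    }
}
```

In the code above, calling `LinkAudit.collectHrefs(w3cDoc)` just before `setDocument` would show whether the hrefs are already missing from the DOM or are lost later, during rendering.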

Update

I ran various experiments and they all failed. Adding the style display:block to my links made no difference. Adding a form with a GET action and a button/submit input was even worse: every input of type="button" or type="submit" is rendered as a text field in the final PDF.

return "<input type=\"button\" value=\"Click me\">" +
            "<form action=\"http://www.example.com\" method=\"GET\">\n" +
            "  <input type=\"submit\" /> \n" +
            "</form>"+
            "</input><a href=\"" + url + "\" title=\"" + linkContent + "\" target=\"_blank\" style=\"display:block\">" + linkContent + "</a>";

For example, the "Click me" button is turned into an editable text field:

[Two screenshots, not reproduced here]

Tahoe answered 10/11, 2016 at 17:25 Comment(3)
Did you ever find a solution for this? I'm running into the same issue. Terrell
@Terrell No, we moved on. The only solution I can think of is to use the underlying PDF library to overlay a link button on any text that contains http:// or https://. Links in a reader (the ones that are not auto-detected) are almost like transparent buttons. But even this solution has a problem: if a text/label box area contains more than one link, which one do you choose? Tahoe
@Terrell What I meant was: one solution would be to modify the code that generates the PDF, specifically the text blocks. If a text block has a link, take the block's x, y, width, and height, and layer a click/link overlay on top with the proper link. Tahoe
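
The overlay idea in the comments above can be sketched roughly as follows. This is only an illustration of the geometry and of the "more than one link" problem, not working Flying Saucer or iText code: `TextBlock`, `Overlay`, and `LinkOverlayPlanner` are hypothetical types, the URL pattern is a deliberately simple assumption, and actually drawing the link annotation with the PDF library is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical planner: decides where a clickable overlay could go for a text block.
class LinkOverlayPlanner {

    /** A laid-out text block: position and size in page units, plus its text (hypothetical). */
    static final class TextBlock {
        final float x, y, width, height;
        final String text;

        TextBlock(float x, float y, float width, float height, String text) {
            this.x = x; this.y = y; this.width = width; this.height = height;
            this.text = text;
        }
    }

    /** The rectangle an overlay would cover and the URL it would open (hypothetical). */
    static final class Overlay {
        final float x, y, width, height;
        final String url;

        Overlay(float x, float y, float width, float height, String url) {
            this.x = x; this.y = y; this.width = width; this.height = height;
            this.url = url;
        }
    }

    // Simplistic assumption: a URL starts with http:// or https:// and runs to the next space.
    private static final Pattern URL = Pattern.compile("https?://[^\\s<>\"]+");

    /** Returns every URL-looking substring of the text, in order of appearance. */
    static List<String> findUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL.matcher(text);
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    /**
     * Plans one overlay covering the whole block, but only when the block contains
     * exactly one URL. With zero or several URLs the choice is ambiguous, so nothing
     * is planned.
     */
    static Optional<Overlay> plan(TextBlock block) {
        List<String> urls = findUrls(block.text);
        if (urls.size() != 1) {
            return Optional.empty();
        }
        return Optional.of(
                new Overlay(block.x, block.y, block.width, block.height, urls.get(0)));
    }
}
```

Returning an empty result for blocks with several URLs sidesteps the "which do you choose" question rather than answering it; handling that case would need per-word positions, which this sketch does not assume are available.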
