wkhtmltopdf: characters in a single line partially cut between pages
A

12

48

I am working on a project using Ruby on Rails (3.1). I need to produce PDFs from HTML content, so I use the pdfkit gem.

When I convert HTML to PDF with the pdfkit gem, the characters of a single line are partially cut between pages on some pages.

wkhtmltopdf version: 0.11.0 rc1

Operating system: Linux CentOS 5.5

The images below show characters partially cut between pages.

Please suggest a solution.

Example 1

enter image description here

Example 2

enter image description here

Anette answered 9/1, 2012 at 10:9 Comment(5)
What is the full command you are using to generate the PDF? - Norseman
Command generated by the pdfkit gem: wkhtmltopdf "--page-size" "A4" "--margin-top" "5mm" "--margin-right" "5mm" "--margin-bottom" "5mm" "--margin-left" "5mm" "--encoding" "UTF-8" "--quiet" "1011284.html" "test.pdf" - Anette
What happens if you change the margin? Does it still cut it off? - Ultramicroscope
It shows the same error after changing the margin. - Anette
* { page-break-inside: avoid; page-break-after: avoid; page-break-before: avoid; } - Canalize
C
17

I did have this problem with a table:

enter image description here

Then I added this to my CSS:

table, img, blockquote {page-break-inside: avoid;}

This fixed the problem:

enter image description here
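
If the stylesheet is not easy to edit (for example, the HTML is generated elsewhere), the same rule can be injected just before conversion. This is a minimal Python sketch, assuming the page is available as a string; `inject_css` and `PAGE_BREAK_CSS` are hypothetical names used for illustration and are not part of wkhtmltopdf or pdfkit.

```python
# Rule taken from the answer above; adjust the selector list to your markup.
PAGE_BREAK_CSS = "table, img, blockquote { page-break-inside: avoid; }"


def inject_css(html, css=PAGE_BREAK_CSS):
    """Insert a <style> block before </head>, or prepend it if there is no head."""
    style = "<style>" + css + "</style>"
    if "</head>" in html:
        # Only the first </head> is replaced.
        return html.replace("</head>", style + "</head>", 1)
    return style + html


page = "<html><head></head><body><table><tr><td>row</td></tr></table></body></html>"
print(inject_css(page))
```

The resulting string can then be handed to whatever wrapper invokes wkhtmltopdf.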

Cleareyed answered 27/2, 2015 at 14:20 Comment(2)
This didn't work for me. I tried setting this property on the enclosing td, tr and div, and the line is still cropped. - Mould
What I have found is that the basic rendering of the document respects page-break-inside: avoid;, but if you use the --margin-bottom option, that rule is no longer respected and things start splitting mid-line. - Shellieshellproof
B
12

I just ran across this and found something that resolved the issue for me. In my particular case, there were divs with display: inline-block; margin-bottom: -20px;. Once I changed them to block and reset the margin-bottom, the line splitting disappeared. YMMV.

Barnyard answered 28/8, 2012 at 18:54 Comment(2)
Thank you, I had the same problem with an "article" element. After adding display: block, it worked like a charm. - Sldney
@nvahalik: To which element did you add display:block? I have a similar issue with exporting a table to PDF. SO question here - #17046885 - Visitation
S
9

According to some documentation I found (see Page Breaking), this is a known issue. The documentation suggests using the CSS page-break properties to control where pages break (assuming you are using a patched version of Qt):

The current page breaking algorithm of WebKit leaves much to be desired. Basically webkit will render everything into one long page, and then cut it up into pages. This means that if you have two columns of text where one is vertically shifted by half a line. Then webkit will cut a line into to pieces display the top half on one page. And the bottom half on another page. It will also break image in two and so on. If you are using the patched version of QT you can use the CSS page-break-inside property to remedy this somewhat. There is no easy solution to this problem, until this is solved try organising your HTML documents such that it contains many lines on which pages can be cut cleanly.
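
The "render one long page, then cut it up" behaviour described in the quote can be illustrated with a toy model. The Python sketch below is only an illustration, not wkhtmltopdf's or WebKit's actual code; the function name and the pixel values are made up. It treats each text line as a box with a top offset and a height, and reports the lines that straddle a page boundary.

```python
def lines_cut_by_page_breaks(line_tops, line_height, page_height):
    """Return the indexes of lines that straddle a page boundary.

    All values are integer pixel offsets measured from the top of the
    single long page that is later cut into pages of page_height.
    """
    cut = []
    for i, top in enumerate(line_tops):
        last_pixel = top + line_height - 1
        # A line is cut when its first and last pixel rows fall on different pages.
        if top // page_height != last_pixel // page_height:
            cut.append(i)
    return cut


# Two columns of 20px lines on 100px pages; column B is shifted by half a line.
column_a = [i * 20 for i in range(10)]
column_b = [10 + i * 20 for i in range(10)]

print(lines_cut_by_page_breaks(column_a, 20, 100))  # []
print(lines_cut_by_page_breaks(column_b, 20, 100))  # [4, 9]
```

In this model, column A breaks cleanly because every page boundary falls between two lines, while the shifted column B has one line split across every boundary. This matches the two-column case the quoted documentation mentions.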

See also: http://code.google.com/p/wkhtmltopdf/issues/detail?id=9, http://code.google.com/p/wkhtmltopdf/issues/detail?id=33 and http://code.google.com/p/wkhtmltopdf/issues/detail?id=57.

Shriek answered 9/1, 2012 at 13:4 Comment(4)
This is no longer the case. The answer below by @Cleareyed resolves any page break issues, not to mention just getting the latest version of wkhtmltopdf (0.12.2.1). Add the following to your CSS: table, img, blockquote {page-break-inside: avoid;} - Execrate
@Execrate Not right. The problem is only partially solved. page-break-inside only helps for the whole block you add it to. For example, if one paragraph/block is more than a page long, page-break-inside will not help and the text will still be cut in some cases. That is okay to fix for static text, but it is a problem with dynamically generated text, when you don't know how long a particular block will be. - Stationery
@Neel, in that case I'd say it's mostly solved. At least in my particular scenario, one paragraph/block was never going to be a problem. Quite frankly, a paragraph/block should never be longer than a normal page, but in what seems like the rare case that it is, then yes, that would be a place where the problem still exists. - Execrate
I am using 0.12.5.0 (patched QT) and it is still breaking for me. - Jordan
G
5

In my case, the issue was resolved by commenting out the following css:

html, body {
  overflow-x: hidden;
} 

In general, check if any tags have overflow set as hidden and remove it or set it to visible.

Btw, I am using wkhtmltopdf version 0.12.2.1 on Windows 8.
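
The check suggested above (look for elements whose overflow is not visible) can be done mechanically on a stylesheet. This is a rough Python sketch, assuming the CSS is available as a string. `find_overflow_rules` is a hypothetical helper, and the regex-based parsing is a simplification that ignores CSS comments, inline styles and other edge cases.

```python
import re


def find_overflow_rules(css):
    """List (selector, property, value) for overflow declarations that are not 'visible'."""
    hits = []
    # Match "selector { declarations }" blocks that contain no nested braces.
    for selector, body in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        # The lookbehind skips properties such as text-overflow.
        for prop, value in re.findall(r"(?<![\w-])(overflow(?:-[xy])?)\s*:\s*([^;]+)", body):
            if value.strip() != "visible":
                hits.append((" ".join(selector.split()), prop, value.strip()))
    return hits


css = """
html, body {
  overflow-x: hidden;
}
p { color: black; }
"""
print(find_overflow_rules(css))  # [('html, body', 'overflow-x', 'hidden')]
```

Each reported rule is only a candidate to try removing or setting to visible; whether it is the cause has to be confirmed by re-rendering the PDF.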

Goofy answered 23/4, 2015 at 6:12 Comment(0)
E
2

This is old but hopefully will help someone. I was having this issue too and tried everything, even going back to the old versions mentioned (12.1), but to no avail. I kept tweaking the CSS, throwing in page-break avoids everywhere, without much progress. Then I tweaked the CSS on the root div of my HTML, and that fixed it. I made so many tweaks that I can't be 100% sure, but I believe the issue was that the main outer div was set to 'display:table' with margin: 0 auto and a specific width. Once I removed that, it stopped cutting off images and tables mid-row, and page-break-inside: avoid then worked as expected.

I believe ultimately the code is trying to guess as best as it can exactly how many pixels high each page is, and where exactly (down to the pixel) is your content. We have to make it easy for the library to detect this by removing as much odd css in there as possible, so it's as simple as possible to calculate down to the pixel where the content lies. That's my guess.

Enciso answered 17/1, 2020 at 19:58 Comment(0)
M
2

https://github.com/ArthurHub/HTML-Renderer/issues/38

// Ask the renderer to avoid page breaks inside common elements.
var head = "<head><style type=\"text/css\"> td, h1, h2, h3, p, b, div, i, span, label, ul, li, tr, table { page-break-inside: avoid; } </style></head>";

// Wrap the existing markup (m42Notes) in a complete document that includes the style block.
PdfDocument pdf = PdfGenerator.GeneratePdf("<html>" + head + "<body>" + m42Notes + "</body></html>", configurationOptions);
Mustang answered 7/10, 2020 at 18:20 Comment(1)
Resolved my problem. - Stew
F
1

I scoured the internet for a couple of weeks, trying to overcome this issue. None of the solutions I found worked for me, but something else did.

I had a two column layout where the text was getting cut off mid-text. In the broken state, my basic structure looked like this:

@media print {
  * {
    page-break-inside: avoid;
    page-break-after: avoid;
    page-break-before: avoid;
  }
}
.col-9{
  display: inline-block;
  width: 70%;
}
.col-3{
  display: inline-block;
  width: 25%;
}

<div class="col-9">
  [a lot of text here, that would spill over multiple pages]
</div>
<div class="col-3">
  [a short sidebar here]
</div>

I fixed it by changing it to:

@media print {
  * {
    page-break-inside: avoid;
    page-break-after: avoid;
    page-break-before: avoid;
  }
}

.col-9{
  display: block;
  float: left;
  width: 70%;
}
.col-3{
  display: block;
  float: left;
  width: 25%;
}
.clear{
  clear: both;
}

<div class="col-9">
  [a lot of text here, that no longer split mid-line.]
</div>
<div class="col-3">
  [a short sidebar here]
</div>
<div class="clear"></div>

For some reason, the tool could not handle the display: inline-block setup. It works with floats. I'm running version 0.12.4.

Farci answered 10/7, 2018 at 19:42 Comment(0)
S
1

I solved the problem by adding margin-top and margin-bottom, like this:

$this->get('knp_snappy.pdf')->generateFromHtml($html, $pdfFilepath, [
        'default-header' => false,
        'header-line' => false,
        'footer-line' => false,
        'disable-javascript' => true,
        'margin-top' => '3mm',
        'margin-bottom' => '3mm',
        'margin-right' => '5mm',
        'margin-left' => '5mm',
        'orientation' => 'Landscape',
    ], true);
Southern answered 1/4, 2020 at 6:59 Comment(0)
A
0

The cut text problem is a known webkit problem and it seems developers found a solution inside wkhtmltopdf. Updating to 0.12.1 will fix the cut-text problem (if you don't want to waste time with compilations, you can just take the binary file from here: https://github.com/h4cc/wkhtmltopdf-amd64 ).

Adalai answered 12/7, 2014 at 17:10 Comment(4)
I'm using 0.12.2.1, which is a more recent version of wkhtmltopdf. I still seem to have this problem, so I don't think this is the fix (since I doubt they reintroduced the bug in a newer version). - Adman
I confirm 0.12.1 worked at the time. I haven't played with it since. - Adalai
Using wkhtmltopdf 0.12.2.1 (with patched qt), still the issue. - Cerf
I have wkhtmltopdf 0.12.6 (with patched qt) and I still have the issue, but only when I use the --margin-bottom option. I think that whatever fix is done to properly decide where to split the pages, it is done before applying the margin options, and so the margin options mess it up. - Shellieshellproof
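
Since several comments point at --margin-bottom, one way to check whether it matters for a given document is to render it twice, with and without the option, and compare the results. The Python sketch below only builds the two command lines and does not run them. The flags are copied from the command shown in the question; `wkhtmltopdf_args` is a hypothetical helper and the output file names are placeholders.

```python
def wkhtmltopdf_args(src, dst, margin_bottom=None):
    """Build a wkhtmltopdf argument list, optionally with --margin-bottom."""
    args = ["wkhtmltopdf", "--page-size", "A4", "--encoding", "UTF-8", "--quiet"]
    if margin_bottom is not None:
        args += ["--margin-bottom", margin_bottom]
    return args + [src, dst]


with_margin = wkhtmltopdf_args("1011284.html", "with-margin.pdf", margin_bottom="5mm")
without_margin = wkhtmltopdf_args("1011284.html", "without-margin.pdf")

print(" ".join(with_margin))
print(" ".join(without_margin))
```

Each list can be passed to `subprocess.run(args, check=True)` on a machine where wkhtmltopdf is installed.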
I
0

I have been putting up with this for months and finally found a fix for my situation. I'm using the GitHub CSS stylesheet in the HTML file I'm converting, and code blocks that span multiple pages get their text cut off. Nothing is missing; the line is just cut in half.

Bottom of a page:

bottom of page

Start of next page:

start of next page

So in the github stylesheet overflow is set to auto for <pre> tags.

.markdown-body .highlight pre,
.markdown-body pre {
  padding: 16px;
  overflow: auto;
...

Switching the overflow property to hidden solved it for me!

.markdown-body .highlight pre,
.markdown-body pre {
  padding: 16px;
  overflow: hidden;

Think I tried all the other answers on this page, but this is solved for me. Hope it helps someone else out :)

Ianthe answered 28/2, 2021 at 15:48 Comment(0)
P
0

I was able to find a workaround to this issue by installing wkhtmltox_0.12.6-1.bionic_amd64.deb (for Ubuntu) from https://github.com/wkhtmltopdf/packaging/releases/0.12.6-1

After updating this wkhtmltox package, the tables and text will not cut off at the end of the page anymore. This fix introduced a different issue for me, now the generated pdf has no styling. For example font-family, font-size or even text alignment are all gone, and are using some default setting.

Potbellied answered 8/6, 2021 at 10:4 Comment(1)
Re-installing fonts on my Ubuntu machine fixed the fonts issue. - Potbellied
M
0

When converting HTML to PDF with wkhtmltopdf, using display: table; and display: table-cell; in your CSS, instead of vertical-align with display: inline or display: inline-block, helps prevent word-cutting issues. This approach ensures that words are not cut off at the end of lines in the generated PDF.

Try this; it works.

Melinamelinda answered 1/3 at 11:17 Comment(0)
