Generate PDF based on HTML code (iTextSharp, PDFSharp?)
S

11

25

Can the PDFSharp library, like iTextSharp, generate PDF files that take HTML formatting into account (bold (strong), line breaks (br), etc.)?

Previously I used iTextSharp and handled it roughly like this:

 using System.IO;
 using System.Web;
 using iTextSharp.text;
 using iTextSharp.text.html.simpleparser;
 using iTextSharp.text.pdf;

 string encodingMetaTag = "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\" />";
 string htmlCode = "text <div> <b> bold </b> or <u> underlined </u> </div>";

 var sr = new StringReader(encodingMetaTag + htmlCode);
 var pdfDoc = new Document(PageSize.A4, 10f, 10f, 10f, 0f);
 var htmlparser = new HTMLWorker(pdfDoc);
 PdfWriter.GetInstance(pdfDoc, HttpContext.Current.Response.OutputStream);
 pdfDoc.Open();
 htmlparser.Parse(sr);
 pdfDoc.Close();

The HTMLWorker class took care of converting the HTML into the PDF document. So what about PDFSharp? Does PDFSharp have a similar solution?

Shrinkage answered 29/9, 2011 at 12:9 Comment(0)
S
19

I know this question is old, but here's a clean way to do it...

You can use HtmlRenderer combined with PDFSharp to accomplish this:

// Render the HTML onto an in-memory bitmap
Bitmap bitmap = new Bitmap(1200, 1800);
Graphics g = Graphics.FromImage(bitmap);
HtmlRenderer.HtmlContainer c = new HtmlRenderer.HtmlContainer();
c.SetHtml("<html><body style='font-size:20px'>Whatever</body></html>");
c.PerformPaint(g);

// Draw the bitmap as an image on the first page of a new PDF
PdfDocument doc = new PdfDocument();
PdfPage page = new PdfPage();
XImage img = XImage.FromGdiPlusImage(bitmap);
doc.Pages.Add(page);
XGraphics xgr = XGraphics.FromPdfPage(doc.Pages[0]);
xgr.DrawImage(img, 0, 0);
doc.Save(@"C:\test.pdf");
doc.Close();
        

Some people report that the final image looks a bit blurry, apparently due to automatic anti-aliasing. Here's a forum post on how to fix that: http://forum.pdfsharp.com/viewtopic.php?f=2&t=1811&start=0

Sholem answered 14/6, 2013 at 22:59 Comment(5)
It does not generate native PDF content. It renders the HTML as an image and inserts that image into the PDF. I don't think this is a proper way of generating a PDF from HTML. As far as I know, no library that can convert HTML to PDF is available yet. You have to write one yourself.Posset
Check the latest HTML Renderer; it supports native PdfSharp rendering. The HtmlRenderer.PdfSharp package is also available via NuGet.Debatable
Be aware that this is now done with XImage.FromBitmapSource instead.Clydesdale
user281848: what difference does it make?Photosphere
Any idea how to use this to convert the current page to PDF?Photosphere
M
11

No, PDFsharp does not currently include code to parse HTML files.

Michey answered 29/9, 2011 at 12:39 Comment(3)
Thank you for your reply. It's a great pity that PDFSharp does not have such functionality. Is it planned for a new version of the library? So I must find another solution or another library, or return to iTextSharp.Shrinkage
How's that for an honest reply? Way to go, PDFSharp! Clear communication.Icecold
Why the downvote? PDFsharp still does not parse HTML files. Today there is a third-party add-on that does: https://mcmap.net/q/528670/-generate-pdf-based-on-html-code-itextsharp-pdfsharpMichey
V
5

Old question, but none of the above worked for me. Then I tried the GeneratePdf method of HtmlRenderer in combination with PDFSharp. Hope it helps. You must install the NuGet package named HtmlRenderer.PdfSharp.

var doc = TheArtOfDev.HtmlRenderer.PdfSharp.PdfGenerator.GeneratePdf("Your html in a string", PageSize.A4);
  // PdfPage page = new PdfPage();                        // leftover from the bitmap-based approach; not needed here
  // XImage img = XImage.FromGdiPlusImage(bitmap);        // 'bitmap' is never defined in this snippet
  // doc.Pages.Add(page);
  // XGraphics xgr = XGraphics.FromPdfPage(doc.Pages[0]);
  // xgr.DrawImage(img, 0, 0);
  doc.Save(Server.MapPath("test.pdf"));
  doc.Close();
Vainglory answered 29/6, 2015 at 7:24 Comment(2)
I am getting the following error when I try to download the NuGet package for Xamarin.Android: "Could not install package 'HtmlRenderer.Core 1.5.0.5'. You are trying to install this package into a project that targets 'MonoAndroid,Version=v6.0', but the package does not contain any assembly references or content files that are compatible with that framework. For more information, contact the package author."Donnell
What's all the code doing in lines 2-7? Surely you only need to create the document and then save it?Fermanagh
C
4

If you only want a certain HTML string written to the PDF but not the rest, you can use the HtmlContainer from TheArtOfDev HtmlRenderer. This snippet uses version 1.5.1.

using System.IO;
using PdfSharp;
using PdfSharp.Drawing;
using PdfSharp.Pdf;
using TheArtOfDev.HtmlRenderer.PdfSharp;

//create a pdf document
using (PdfDocument doc = new PdfDocument())
{
    doc.Info.Title = "StackOverflow Demo PDF";

    //add a page
    PdfPage page = doc.AddPage();
    page.Size = PageSize.A4;

    //fonts and styles
    XFont font = new XFont("Arial", 10, XFontStyle.Regular);
    XSolidBrush brush = new XSolidBrush(XColor.FromArgb(0, 0, 0));

    using (XGraphics gfx = XGraphics.FromPdfPage(page))
    {
        //write a normal string
        gfx.DrawString("A normal string written to the PDF.", font, brush, new XRect(15, 15, page.Width, page.Height), XStringFormats.TopLeft);

        //write the html string to the pdf
        using (var container = new HtmlContainer())
        {
            var pageSize = new XSize(page.Width, page.Height);

            container.Location = new XPoint(15,  45);
            container.MaxSize = pageSize;
            container.PageSize = pageSize;
            container.SetHtml("This is a <b>HTML</b> string <u>written</u> to the <font color=\"red\">PDF</font>.<br><br><a href=\"http://www.google.nl\">www.google.nl</a>");

            using (var measure = XGraphics.CreateMeasureContext(pageSize, XGraphicsUnit.Point, XPageDirection.Downwards))
            {
                container.PerformLayout(measure);
            }

            gfx.IntersectClip(new XRect(0, 0, page.Width, page.Height));

            container.PerformPaint(gfx);
        }
    }

    //write the pdf to a byte array to serve as download, attach to an email etc.
    byte[] bin;
    using (MemoryStream stream = new MemoryStream())
    {
        doc.Save(stream, false);
        bin = stream.ToArray();
    }
}
Clearing answered 24/7, 2019 at 9:55 Comment(0)
N
3

In a project that I developed last year I used wkhtmltopdf (http://wkhtmltopdf.org/) to generate a PDF from HTML; I then read the generated file and returned it to the user.

It works fine for me and it could be an idea for you...
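
A minimal sketch of that approach, assuming the wkhtmltopdf executable is installed and on the PATH. The class and method names (WkHtmlToPdfSketch, BuildArguments, Convert) are hypothetical and only for illustration; they are not part of wkhtmltopdf or any library. It writes the HTML to a temporary file, runs the command-line tool, and reads the resulting PDF back:

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical helper (not part of any library): shells out to the wkhtmltopdf CLI.
public static class WkHtmlToPdfSketch
{
    // Builds the argument string for: wkhtmltopdf --quiet <input.html> <output.pdf>
    public static string BuildArguments(string inputPath, string outputPath)
    {
        return $"--quiet \"{inputPath}\" \"{outputPath}\"";
    }

    // Writes the HTML to a temp file, runs wkhtmltopdf, and returns the PDF bytes.
    // Assumption: "wkhtmltopdf" can be resolved via the PATH.
    public static byte[] Convert(string html)
    {
        string basePath = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        string inputPath = basePath + ".html";
        string outputPath = basePath + ".pdf";
        File.WriteAllText(inputPath, html);
        try
        {
            var startInfo = new ProcessStartInfo("wkhtmltopdf", BuildArguments(inputPath, outputPath))
            {
                UseShellExecute = false,
                CreateNoWindow = true
            };
            using (Process process = Process.Start(startInfo))
            {
                process.WaitForExit();
                if (process.ExitCode != 0)
                    throw new InvalidOperationException("wkhtmltopdf exited with code " + process.ExitCode);
            }
            return File.ReadAllBytes(outputPath);
        }
        finally
        {
            // Clean up both temp files; File.Delete does nothing if the file is missing.
            File.Delete(inputPath);
            File.Delete(outputPath);
        }
    }
}
```

The returned bytes can then be written to the HTTP response or attached to an email. Starting a separate process per request is not free, so caching or queueing the conversions may be worth considering under load.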

Necking answered 30/9, 2011 at 10:15 Comment(2)
Don't use this lib. It consumes almost 50% CPU for a single request.Kemper
He asked about a .NET solution.Olnay
E
2

Have you heard of NReco.PdfGenerator? I may be answering very late, but I thought it might help. It is very simple and works well.

var htmlContent = String.Format("<body>Hello world: {0}</body>", 
        DateTime.Now);
var htmlToPdf = new NReco.PdfGenerator.HtmlToPdfConverter();
var pdfBytes = htmlToPdf.GeneratePdf(htmlContent);

Edit: I came here with the question of converting HTML to PDF using PDFSharp and found out that PDFSharp cannot do it. Then I found NReco, and it worked for me, so I felt it might help someone just like me.

Evadne answered 6/1, 2017 at 22:39 Comment(7)
Is NReco in any way related to "PDFSharp"? After all the OP wants a solution for PDFSharp...Defoliate
I came here with the same question about converting HTML to PDF using PDFSharp and found out that PDFSharp cannot do it. Then I found NReco, and it worked for me, so I felt it might help someone just like me. Thanks for downvoting.Evadne
Sorry, I did not downvote (why should I, that would cost me rep, too). Seemingly someone else wondered the same and considered it easier to downvote than to comment...Defoliate
That being said, why didn't you simply put that into your answer? The answer as it is now sounds like an ad for NReco, and it does not indicate what your comment does.Defoliate
@MuraliKrishna thanks for this idea. I was not able to parse HTML via PDFSharp, and NReco works fine in my case.Shirring
Note that NReco is limited for commercial usage. If someone is looking into this to use in their commercial project, be mindful of using it.Taeniacide
NReco allows the community license to be used for 1 server deployment, unlike every other commercial option. It also properly parses css unlike TheArtOfDev.HtmlRenderer. No, I do not work for NReco.Mcnamara
D
1

I know this is a really old question, but I realized that nobody has actually given an accurate method for rendering HTML into a PDF. Based on my tests, I found that you need the following code to do it successfully.

Bitmap bitmap = new Bitmap(790, 1800);
Graphics g = Graphics.FromImage(bitmap);
XGraphics xg = XGraphics.FromGraphics(g, new XSize(bitmap.Width, bitmap.Height));
TheArtOfDev.HtmlRenderer.PdfSharp.HtmlContainer c = new TheArtOfDev.HtmlRenderer.PdfSharp.HtmlContainer();
c.SetHtml("Your html in a string here");

PdfDocument pdf = new PdfDocument();
PdfPage page = new PdfPage();
XImage img = XImage.FromGdiPlusImage(bitmap);
pdf.Pages.Add(page);
XGraphics xgr = XGraphics.FromPdfPage(pdf.Pages[0]);
c.PerformLayout(xgr);
c.PerformPaint(xgr);
xgr.DrawImage(img, 0, 0);
pdf.Save("test.pdf");

There is another way to do it, but you might have problems with the size.

PdfDocument pdf = PdfGenerator.GeneratePdf(text, PageSize.A4);
pdf.Save("test.pdf");
Delft answered 13/10, 2015 at 22:42 Comment(1)
Did you see the first comment on the top-voted answer? I agree with that sentiment that this is not the appropriate solution based on what the asker was looking for. Inserting an image of html into a PDF document is not the goal.Newark
C
1

HTML Renderer for PDF using PdfSharp can render the HTML in one of two ways before inserting it into the PDF:

  1. as an image, or
  2. as text.

To render as an image, please refer to the code from Diego's answer.

To render as text, please refer to the code below:

static void Main(string[] args)
{
    string html = File.ReadAllText(@"C:\Temp\Test.html");
    PdfDocument pdf = PdfGenerator.GeneratePdf(html, PageSize.A4, 20, null, OnStylesheetLoad, OnImageLoadPdfSharp);
    pdf.Save(@"C:\Temp\Test.pdf");
}

public static void OnImageLoadPdfSharp(object sender, HtmlImageLoadEventArgs e)
{
    var imgObj = Image.FromFile(@"C:\Temp\Test.png");
    e.Callback(XImage.FromGdiPlusImage(imgObj));    
}

public static void OnStylesheetLoad(object sender, HtmlStylesheetLoadEventArgs e)
{
    e.SetStyleSheet = @"h1, h2, h3 { color: navy; font-weight:normal; }";
}

HTML code

<html>
    <head>
        <title></title>
        <link rel="Stylesheet" href="StyleSheet" />      
    </head>
    <body>
        <h1>Images
            <img src="ImageIcon" />
        </h1>
    </body>
</html>
Carothers answered 8/8, 2017 at 7:37 Comment(1)
Gotta say ... "it works" .. but the HTML renderer is EXTREMELY limited :(Equi
P
1

Unfortunately, HtmlRenderer is not an appropriate library to be used in a project based on .NET 5.0:

System.IO.FileLoadException: 'Could not load file or assembly 'HtmlRenderer,
Version=1.5.0.6, Culture=neutral, PublicKeyToken=null'. The located assembly's 
manifest definition does not match the assembly reference. (0x80131040)'

Also, I found that the dependency package HtmlRenderer.PdfSharp produces the following warning message:

Package 'HtmlRenderer.PdfSharp 1.5.0.6' was restored using 
'.NETFramework,Version=v4.6.1, .NETFramework,Version=v4.6.2, 
.NETFramework,Version=v4.7, .NETFramework,Version=v4.7.1, 
.NETFramework,Version=v4.7.2, .NETFramework,Version=v4.8' instead of the project 
target framework 'net5.0'. This package may not be fully compatible with your project.

By the way, I managed to render HTML as PDF using another library IronPDF:

License.LicenseKey = "license key";
var renderer = new ChromePdfRenderer();
PdfDocument pdf = await renderer.RenderHtmlAsPdfAsync(yourHtml);
pdf.SaveAs("your html as pdf.pdf");

The line with License.LicenseKey is not necessary and you can remove it, but then your PDF will be generated with an IronPDF watermark at the end of each page. IronPDF does offer a trial license key.

Pieria answered 10/1, 2022 at 4:42 Comment(0)
S
0

After a long struggle, I succeeded using Polybioz.HtmlRenderer.PdfSharp.Core.

=> HtmlRenderer.PdfSharp.Core is a partial port of HtmlRenderer.PdfSharp for .NET Core

It works in .NET 5.0 :)

My solution, directly inspired by VDWWD and others:

        // 'document' is an existing PdfDocument created elsewhere
        PdfPage page = document.AddPage();
        XGraphics gfx_ = XGraphics.FromPdfPage(page);
        using (var container = new HtmlContainer())
        {
            var pageSize = new XSize(page.Width, page.Height);

            var x = 5;
            var y = 100;


            container.Location = new XPoint(x, y);
            container.MaxSize = pageSize;
            container.PageSize = pageSize;


            const string latinstuff =
            "Facin exeraessisit la consenim iureet dignibh eu <b>facilluptat</b> vercil dunt autpat. " +
            "Ecte magna faccum dolor sequisc iliquat, quat, quipiss equipit accummy niate magna " +
            "facil iure eraesequis am velit, quat atis dolore dolent luptat nulla adio odipissectet " +
            "lan venis do essequatio conulla facillandrem <u>zzriusci</u> bla ad minim inis nim velit eugait " +
            "aut aut lor at ilit ut nulla ate te eugait alit augiamet ad magnim iurem il eu feuissi.\n" +
            "Guer sequis duis eu feugait luptat lum adiamet, si tate dolore mod eu facidunt adignisl in " +
            "henim dolorem nulla faccum vel inis dolutpatum iusto od min ex euis adio exer sed del " +
            "dolor ing enit veniamcon vullutat praestrud molenis ciduisim doloborem ipit nulla consequisi.\n" +
            "Nos adit pratetu eriurem delestie del ut lumsandreet nis exerilisit wis nos alit venit praestrud " +
            "dolor sum volore facidui blaor erillaortis ad ea augue corem dunt nis  iustinciduis euisi.\n" +
            "Ut ulputate volore min ut nulpute dolobor sequism olorperilit autatie modit wisl illuptat dolore " +
            "min ut in ute doloboreet ip ex et am dunt at.";


            //container.SetHtml("This is a <b>HTML</b> string <u>written</u> to the <font color=\"red\">PDF</font>.<br><br><a href=\"http://www.google.nl\">www.google.nl</a>");
            string text = "This is a <b>HTML</b> string <u>written</u> to the <font color=\"red\">PDF</font>.<br>" +
                $"<br><a href=\"http://www.google.nl\">www.google.nl</a>{DateTime.Now.ToLongTimeString()}";
            text +=  latinstuff;
            text += "<p style=\"text-align:center;\">" + latinstuff  + "</p>";                
            text += "<p style=\"text-align:justify;\">" + latinstuff + "</p>";
            text += "<div width=\"70mm\" style=\"text-align:justify;\"><p>" + latinstuff + "</p></div>";

            container.SetHtml(text);

            using (var measure = XGraphics.CreateMeasureContext(pageSize, XGraphicsUnit.Point, XPageDirection.Downwards))
            {
                container.PerformLayout(measure);
            }

            gfx_.IntersectClip(new XRect(0, 0, page.Width + 400, page.Height));
            container.PerformLayout(gfx_);
            container.PerformPaint(gfx_);
        }

The only thing that remains is that the bold is not rendered in latinstuff (the word "facilluptat" should be bold); see my screenshot of the page.

Superphysical answered 23/9, 2022 at 8:21 Comment(0)
I
-2

I recommend NReco.PdfGenerator because it has both free and paid licenses and is easy to install from NuGet.

Main page: https://www.nrecosite.com/pdf_generator_net.aspx

Documentation: https://www.nrecosite.com/doc/NReco.PdfGenerator/

If you want to create a PDF from an HTML file, try:

String html = File.ReadAllText("main.html");
var htmlToPdf = new NReco.PdfGenerator.HtmlToPdfConverter();
htmlToPdf.GeneratePdf(html, null, "C:/Users/Tmp/Desktop/mapa.pdf");
Infatuate answered 13/4, 2018 at 21:26 Comment(0)
