wkhtmltopdf output stream & download dialog
Is it possible to get a PDF stream created by wkhtmltopdf from any HTML file and pop up a download dialog in IE/Firefox/Chrome etc.?

At the moment I get my output stream with this code:

public class Printer
{
    public static MemoryStream GeneratePdf(StreamReader Html, MemoryStream pdf, Size pageSize)
    {
        Process p;
        StreamWriter stdin;
        ProcessStartInfo psi = new ProcessStartInfo();

        psi.FileName =  @"C:\PROGRA~1\WKHTML~1\wkhtmltopdf.exe";

        // run the conversion utility 
        psi.UseShellExecute = false;
        psi.CreateNoWindow = true;
        psi.RedirectStandardInput = true;
        psi.RedirectStandardOutput = true;
        psi.RedirectStandardError = true;

        // note that we tell wkhtmltopdf to be quiet and not run scripts 
        psi.Arguments = "-q -n --disable-smart-shrinking " + (pageSize.IsEmpty ? "" : "--page-width " + pageSize.Width + "mm --page-height " + pageSize.Height + "mm") + " - -";

        p = Process.Start(psi);

        try
        {
            stdin = p.StandardInput;
            stdin.AutoFlush = true;
            stdin.Write(Html.ReadToEnd());
            stdin.Dispose();

            CopyStream(p.StandardOutput.BaseStream, pdf);
            p.StandardOutput.Close();
            pdf.Position = 0;

            p.WaitForExit(10000);

            return pdf;
        }
        catch
        {
            return null;
        }
        finally
        {
            p.Dispose();
        }
    }

    public static void CopyStream(Stream input, Stream output)
    {
        byte[] buffer = new byte[32768];
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            output.Write(buffer, 0, read);
        }
    }
}

Then I want to display the dialog:

MemoryStream PDF = Printer.GeneratePdf(Rd, PDFStream, Size);

byte[] byteArray1 = PDF.ToArray();
PDF.Flush();
PDF.Close();
Response.BufferOutput = true;

Response.Clear();
Response.ClearHeaders();
Response.AddHeader("Content-Disposition", "attachment; filename=Test.pdf");
Response.ContentType = "application/octet-stream";
Response.BinaryWrite(byteArray1);
Response.End();

With MemoryStreams created from a PDF file this works fine, but here I only get an empty page. The byte array has 1270 bytes.

Poppied answered 24/4, 2012 at 6:20 Comment(1)
Did you fix this issue? Did my solution help? - Microwatt
Is this still a problem?

I just created a new ASP.NET website to test this on my computer after installing wkhtmltopdf 0.11.0 rc2, and it created the PDF without problems. My version was only slightly different.

In my CSHTML I had:

MemoryStream PDFStream = new MemoryStream();
MemoryStream PDF = Derp.GeneratePdf(PDFStream);
byte[] byteArray1 = PDF.ToArray();
PDF.Flush();
PDF.Close();
Response.BufferOutput = true;
Response.Clear();
Response.ClearHeaders();
Response.AddHeader("Content-Disposition", "attachment; filename=Test.pdf");
Response.ContentType = "application/octet-stream";
Response.BinaryWrite(byteArray1);
Response.End();

My Derp class:

public class Derp
{
    public static MemoryStream GeneratePdf(MemoryStream pdf)
    {
        using (StreamReader Html = new StreamReader(@"Z:\HTMLPage.htm"))
        {
            Process p;
            StreamWriter stdin;
            ProcessStartInfo psi = new ProcessStartInfo();
            psi.FileName = @"C:\wkhtmltopdf\wkhtmltopdf.exe";
            psi.UseShellExecute = false;
            psi.CreateNoWindow = true;
            psi.RedirectStandardInput = true;
            psi.RedirectStandardOutput = true;
            psi.RedirectStandardError = true;
            psi.Arguments = "-q -n --disable-smart-shrinking " + " - -";
            p = Process.Start(psi);
            try
            {
                stdin = p.StandardInput;
                stdin.AutoFlush = true;
                stdin.Write(Html.ReadToEnd());
                stdin.Dispose();
                CopyStream(p.StandardOutput.BaseStream, pdf);
                p.StandardOutput.Close();
                pdf.Position = 0;
                p.WaitForExit(10000);
                return pdf;
            }
            catch
            {
                return null;
            }
            finally
            {
                p.Dispose();
            }
        }
    }

    public static void CopyStream(Stream input, Stream output)
    {
        byte[] buffer = new byte[32768];
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            output.Write(buffer, 0, read);
        }
    }
}
Microwatt answered 16/8, 2012 at 16:31 Comment(5)
So if you have problems with the output, check how the source HTML reaches the converter and whether the HTML itself has problems. - Microwatt
What's the + " - -"; for at the end of your arguments? - Whirl
@Mvision I concatenated it clumsily, but it means that streams are used for input and output instead of files. For example, to have wkhtmltopdf read its input from STDIN instead of a file or URL and write to a file, you'd use wkhtmltopdf.exe - output.pdf. Similarly, wkhtmltopdf.exe input.html - reads an input file and writes to STDOUT (or maybe STDERR, I can't remember). - Microwatt
Thanks! But I would recommend two things: use Stream.CopyTo instead of writing your own implementation, and use a UTF-8 writer for stdin, since wkhtmltopdf expects UTF-8 by default and Process uses a different encoding by default: using (StreamWriter stdin = new StreamWriter(p.StandardInput.BaseStream, Encoding.UTF8)) - Merge
Excellent points @Merge. I haven't worked with C# in many years and I don't have a Windows machine now, so I don't feel confident testing or editing this. Feel free to edit or add your own answer if you have the time :) - Microwatt
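The two suggestions in the comment from Merge could look roughly like the sketch below. It assumes only the .NET base class library; PipeHelpers, WriteUtf8 and ReadAll are hypothetical names made up for illustration. It also uses new UTF8Encoding(false) so that no byte order mark is written, which is a choice made here and not something the comment asks for.

```csharp
using System;
using System.IO;
using System.Text;

public static class PipeHelpers
{
    // Writes text as UTF-8 without a byte order mark.
    // The target stream is left open so the caller decides when to close it.
    public static void WriteUtf8(Stream target, string text)
    {
        using (var writer = new StreamWriter(target, new UTF8Encoding(false), 4096, leaveOpen: true))
        {
            writer.Write(text);
        } // disposing the writer flushes its buffer into the target stream
    }

    // Stream.CopyTo replaces the hand-written CopyStream loop.
    public static byte[] ReadAll(Stream source)
    {
        using (var buffer = new MemoryStream())
        {
            source.CopyTo(buffer);
            return buffer.ToArray();
        }
    }
}
```

With a real process you would pass p.StandardInput.BaseStream to WriteUtf8, then close p.StandardInput so wkhtmltopdf sees the end of its input, and pass p.StandardOutput.BaseStream to ReadAll.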
My take on this, based on @Nenotlep's answer. This is only the PDF generation part.

I am using async. I created a new StreamWriter because wkhtmltopdf expects UTF-8 by default, but the process's standard input is set to a different encoding when it starts.

I removed p.WaitForExit(...) since I wasn't handling the failure case and it would hang anyway on await tStandardOutput. If a timeout is needed, you would have to call Wait on the individual tasks with a CancellationToken or a timeout and handle the result accordingly.

public static async Task<byte[]> GeneratePdf(string html, Size pageSize)
{
    ProcessStartInfo psi = new ProcessStartInfo
    {
        FileName = @"C:\PROGRA~1\WKHTML~1\wkhtmltopdf.exe",
        UseShellExecute = false,
        CreateNoWindow = true,
        RedirectStandardInput = true,
        RedirectStandardOutput = true,
        RedirectStandardError = true,
        Arguments = "-q -n --disable-smart-shrinking " 
            + (pageSize.IsEmpty ? "" : "--page-width " + pageSize.Width 
            + "mm --page-height " + pageSize.Height + "mm") + " - -"
    };

    using (var p = Process.Start(psi))
    using (var pdfStream = new MemoryStream())
    using (var utf8Writer = new StreamWriter(p.StandardInput.BaseStream, 
                            Encoding.UTF8))
    {
        await utf8Writer.WriteAsync(html);
        utf8Writer.Close(); // closing stdin signals end of input to wkhtmltopdf

        // read stdout and stderr concurrently so neither pipe fills up
        var tStandardOutput = p.StandardOutput.BaseStream.CopyToAsync(pdfStream);
        var tStandardError = p.StandardError.ReadToEndAsync();

        await tStandardOutput;
        string errors = await tStandardError;

        if (!string.IsNullOrEmpty(errors))
        {
            //deal with errors
        }

        return pdfStream.ToArray();
    }
}
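If a timeout is needed, one possible approach is to race each task against Task.Delay. This is only a sketch: TaskTimeout and WithTimeout are hypothetical names, not part of .NET or of the answer's code, and a timeout here does not stop the underlying work.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskTimeout
{
    // Awaits the task, or throws TimeoutException if it does not finish in time.
    public static async Task WithTimeout(Task task, TimeSpan timeout)
    {
        using (var cts = new CancellationTokenSource())
        {
            Task delay = Task.Delay(timeout, cts.Token);
            Task finished = await Task.WhenAny(task, delay);
            if (finished != task)
            {
                throw new TimeoutException("Operation did not complete within " + timeout);
            }
            cts.Cancel(); // the task won the race, so stop the timer
            await task;   // rethrow the task's own exception, if any
        }
    }

    // Same idea for tasks that return a value, such as ReadToEndAsync().
    public static async Task<T> WithTimeout<T>(Task<T> task, TimeSpan timeout)
    {
        await WithTimeout((Task)task, timeout);
        return await task;
    }
}
```

In the method above this would replace the plain awaits, for example await TaskTimeout.WithTimeout(tStandardOutput, TimeSpan.FromSeconds(10)). Because the pending read keeps running after a timeout, the caller would probably also want to catch TimeoutException and call p.Kill().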

Things I haven't included in there but could be useful as a reference:

  • you can pass the authentication cookie, if needed, using --cookie
  • you can add a base tag whose href points to the server hosting your HTML page, so that additional requests from the page (CSS, images, etc.) resolve correctly
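A rough sketch of both points, under some assumptions: HtmlPrep, AddBaseTag and CookieArgument are hypothetical helper names, the cookie name is assumed to contain no spaces or quotes, and the value is URL-encoded because the wkhtmltopdf help text describes --cookie as taking a name and a URL-encoded value.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlPrep
{
    // Inserts <base href="..."> right after the first opening <head> tag.
    // If the document has no <head> tag, the HTML is returned unchanged.
    public static string AddBaseTag(string html, string baseUrl)
    {
        string tag = "<base href=\"" + WebUtility.HtmlEncode(baseUrl) + "\">";
        // (\s[^>]*)? allows attributes on <head> but does not match <header>
        var head = new Regex(@"<head(\s[^>]*)?>", RegexOptions.IgnoreCase);
        return head.Replace(html, m => m.Value + tag, 1);
    }

    // Builds a "--cookie name value " fragment to prepend to the other arguments.
    public static string CookieArgument(string name, string value)
    {
        return "--cookie " + name + " " + Uri.EscapeDataString(value) + " ";
    }
}
```

The fragment returned by CookieArgument would go before the trailing " - -" in psi.Arguments, and AddBaseTag would be applied to the HTML string before it is written to stdin.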
Merge answered 27/11, 2018 at 21:33 Comment(0)
