I am using wkhtmltopdf.exe (version 0.12.0 final) to generate PDF files from HTML files. I call it from .NET C#.
My problem is getting JavaScript, stylesheets and images to work when the HTML specifies only relative paths. Right now it works if I use absolute paths, but not with relative paths, which makes the whole HTML generation a bit too complicated. I have boiled what I do down to the following example:
string CMDPATH = @"C:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe";
string HTML = string.Format(
    "<div><img src=\"{0}\" /></div><div><img src=\"{1}\" /></div><div>{2}</div>",
    "./sohlogo.png",
    "./ACLASS.jpg",
    DateTime.Now.ToString());
WriteFile(HTML, "test.html");
Process p;
ProcessStartInfo psi = new ProcessStartInfo();
psi.FileName = CMDPATH;
psi.UseShellExecute = false;
psi.WorkingDirectory = AppDomain.CurrentDomain.BaseDirectory;
psi.CreateNoWindow = true;
psi.RedirectStandardInput = true;
psi.RedirectStandardOutput = true;
psi.RedirectStandardError = true;
psi.Arguments = "-q - -";
p = Process.Start(psi);
StreamWriter stdin = p.StandardInput;
stdin.AutoFlush = true;
stdin.Write(HTML);
stdin.Dispose();
MemoryStream pdfstream = new MemoryStream();
CopyStream(p.StandardOutput.BaseStream, pdfstream);
p.StandardOutput.Close();
pdfstream.Position = 0;
WriteFile(pdfstream, "test.pdf");
p.WaitForExit(10000);
int test = p.ExitCode;
p.Dispose();
I have tried relative paths such as "./sohlogo.png" and simply "sohlogo.png". Both display correctly in the browser via the HTML file, but neither works in the PDF file. There is no data in the error stream.
The following command line works like a charm with the relative paths:
"c:\Program Files\wkhtmltopdf\bin\wkhtmltopdf.exe" test.html test.pdf
I could really use some input at this stage, so any help is much appreciated!
Just for reference, the WriteFile and CopyStream methods look like this:
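For comparison with the stdin approach, here is a minimal sketch of making the working file-based call from C#. It avoids stdin entirely but needs the HTML file on disk. The helper names BuildStartInfo and ConvertFile are hypothetical, and I am only assuming that the relative paths resolve because wkhtmltopdf knows where test.html lives, which it cannot know when the HTML arrives on stdin.

```csharp
using System;
using System.Diagnostics;

public static class FileBasedConversion
{
    // Hypothetical helper: mirrors the command line
    // "wkhtmltopdf.exe test.html test.pdf", with -q added as in the stdin example.
    public static ProcessStartInfo BuildStartInfo(string exePath, string workingDir,
                                                  string htmlFile, string pdfFile)
    {
        ProcessStartInfo psi = new ProcessStartInfo();
        psi.FileName = exePath;
        psi.WorkingDirectory = workingDir;
        psi.UseShellExecute = false;
        psi.CreateNoWindow = true;
        // Quote both file names so paths containing spaces survive argument parsing.
        psi.Arguments = string.Format("-q \"{0}\" \"{1}\"", htmlFile, pdfFile);
        return psi;
    }

    // Hypothetical helper: runs the conversion and returns the exit code.
    public static int ConvertFile(string exePath, string workingDir,
                                  string htmlFile, string pdfFile, int timeoutMs)
    {
        using (Process p = Process.Start(BuildStartInfo(exePath, workingDir, htmlFile, pdfFile)))
        {
            // Only read ExitCode once the process is known to have exited.
            if (!p.WaitForExit(timeoutMs))
            {
                p.Kill();
                throw new TimeoutException("wkhtmltopdf did not finish in time.");
            }
            return p.ExitCode;
        }
    }
}
```

With the values from the example, the call would be ConvertFile(CMDPATH, AppDomain.CurrentDomain.BaseDirectory, "test.html", "test.pdf", 10000), and the PDF is then read back from test.pdf instead of from the output stream.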
public static void WriteFile(MemoryStream stream, string path)
{
    using (FileStream writer = new FileStream(path, FileMode.Create))
    {
        byte[] bytes = stream.ToArray();
        writer.Write(bytes, 0, bytes.Length);
        writer.Flush();
    }
}
public static void WriteFile(string text, string path)
{
    using (StreamWriter writer = new StreamWriter(path))
    {
        writer.WriteLine(text);
        writer.Flush();
    }
}
public static void CopyStream(Stream input, Stream output)
{
    byte[] buffer = new byte[32768];
    int read;
    while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
    {
        output.Write(buffer, 0, read);
    }
}
EDIT: My Workaround for Neo Nguyen.
I could not get this to work with relative paths. Instead, I wrote a method that prepends a root path to every path. It solves my problem, so maybe it will solve yours:
/// <summary>
/// Prepends basedir to each path x in src="x" or href="x" in the input html text
/// </summary>
/// <param name="html">the initial html</param>
/// <param name="basedir">the basedir to prepend</param>
/// <returns>the new html</returns>
public static string MakeRelativePathsAbsolute(string html, string basedir)
{
    // Group 1: attribute name and opening quote, group 2: the path, group 3: closing quote
    string pathpattern = "((?:href|src)=[\"'])(.*?)([\"'])";
    // SM20140214: tested that both chrome and wkhtmltopdf.exe understand "C:\Dir\..\image.png" and "C:\Dir\.\image.png"
    // Path.Combine("C:/
    return Regex.Replace(html, pathpattern, match =>
    {
        string path = match.Groups[2].Value;
        if (string.IsNullOrEmpty(path))
        {
            // Leave empty src/href attributes untouched
            return match.Value;
        }
        // Rebuild the attribute from its parts instead of using string.Replace,
        // which would also hit the attribute name when the path is e.g. "src"
        string newpath = UrlEncode(Path.Combine(basedir, path));
        return match.Groups[1].Value + newpath + match.Groups[3].Value;
    });
}
private static string UrlEncode(string url)
{
    // Quick and dirty: only spaces and '#' are encoded
    return url.Replace(" ", "%20").Replace("#", "%23");
}
I tried different System.Uri.Escape*** methods such as System.Uri.EscapeDataString(), but their URL encoding was too severe for wkhtmltopdf to understand. For lack of time I just wrote the quick and dirty UrlEncode above.
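To illustrate what "too severe" means here, this small sketch puts the two encoders side by side. The class and method names (EncodingComparison, QuickEncode, FullEscape) are made up for the comparison; QuickEncode is just a copy of UrlEncode above, and the sample path is an assumption.

```csharp
using System;

public static class EncodingComparison
{
    // Same quick-and-dirty encoder as UrlEncode above: only spaces and '#'.
    public static string QuickEncode(string url)
    {
        return url.Replace(" ", "%20").Replace("#", "%23");
    }

    // Uri.EscapeDataString escapes every character outside the unreserved set
    // (letters, digits, '-', '.', '_', '~'), including the ':' and '\' of a
    // Windows path.
    public static string FullEscape(string url)
    {
        return Uri.EscapeDataString(url);
    }
}
```

For the sample path C:\My Dir\image #1.png, FullEscape returns C%3A%5CMy%20Dir%5Cimage%20%231.png, while QuickEncode returns C:\My%20Dir\image%20%231.png. The drive colon and the backslashes are escaped in the first result, which would explain why the result no longer works as a path.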