Prawn::Errors::IncompatibleStringEncoding: Your document includes text that's not compatible with the Windows-1252 character set

Asked 7/9, 2017 at 4:25 Answered 15/9, 2017 at 19:38

ruby-on-rails pdf utf-8 fonts prawn

Below is my Prawn code that renders a name onto a PDF:

def initialize(opportunity_application)
  pdf = Prawn::Document.new(:page_size => [1536, 2048], :page_layout => :landscape)
  cell_1 = pdf.make_cell(content: "Eylül Çamcı".force_encoding('iso-8859-1').encode('utf-8'), borders: [], size: 66, :text_color => "000000", padding: [0,0,0,700], font: "app/assets/fonts/opensans.ttf")

  t = pdf.make_table [[cell_1]]
  t.draw
  pdf.render_file "tmp/mos_certificates/application_test.pdf"
end

When rendering the Turkish name Eylül Çamcı, I get the following error:

Prawn::Errors::IncompatibleStringEncoding: Your document includes text that's not compatible with the Windows-1252 character set.
If you need full UTF-8 support, use TTF fonts instead of PDF's built-in fonts.

I'm already using a TTF font that supports the characters in that name. What can I do to print the name correctly?

Naima answered 7/9, 2017 at 4:25 Comment(3)

Are you following these instructions: #37287476? – Basque 7/9, 2017 at 4:50

I tried this as well, and it spouted the same error. Here is the gist of what I tried - gist.github.com/mikevic/e1617641704aed9d8642b54fb5ea0351 – Naima 8/9, 2017 at 14:48

Aren't you missing font "Opensans"? I checked your gist. In the following post they first update the font families to add a new one, "Arial" => { :normal => "/assets/fonts/Arial.ttf", :italic => "/assets/fonts/Arial Italic.ttf", }, and then they tell Prawn to use that font family with font "Arial". – Basque 9/9, 2017 at 5:26

It seems Turkish is missing in iso-8859-1.

On the other hand iso-8859-9 should work.

So you may try changing your code like this (note the ISO number that I changed):

...
cell_1 = pdf.make_cell(content: "Eylül Çamcı".force_encoding('iso-8859-9').encode('utf-8'), borders: [], size: 66, :text_color => "000000", padding: [0,0,0,700], font: "app/assets/fonts/opensans.ttf")
...

And here is a fun link that covers not only the character set but also other internationalization differences for Turkey.

Edit 1: I made a basic check, and it seems the text is already in UTF-8. So why convert it to iso-8859 and then back to UTF-8?

Can you please try "Eylül Çamcı".force_encoding('utf-8') alone?

irb(main):013:0> "Eylül Çamcı".encoding
=> #<Encoding:UTF-8>
irb(main):014:0> "Eylül Çamcı".force_encoding('UTF-8')
=> "Eylül Çamcı"
irb(main):015:0>
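To illustrate why the round trip in the question is suspicious, here is a minimal sketch using only core Ruby. The helper name latin1_round_trip is made up for this example. It relabels the UTF-8 bytes as ISO-8859-1 and then transcodes them to UTF-8, which is what force_encoding('iso-8859-1').encode('utf-8') does:

```ruby
# Hypothetical helper: reproduces the force_encoding + encode chain
# from the question on a copy of the input string.
def latin1_round_trip(text)
  # force_encoding only changes the encoding label; the bytes stay the same.
  relabelled = text.dup.force_encoding("ISO-8859-1")
  # encode then treats every single byte as one Latin-1 character.
  relabelled.encode("UTF-8")
end

name = "Eylül Çamcı"
garbled = latin1_round_trip(name)

puts name.length     # 11 characters
puts name.bytesize   # 14 bytes, because ü, Ç and ı take two bytes each in UTF-8
puts garbled.length  # 14 characters, one per original byte
puts garbled == name # false
```

So the string handed to Prawn is no longer the original name, which is one more reason to pass the UTF-8 literal through unchanged.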

Edit 2: Can you also check your font path? Does the font file exist, and is the path correct?

#Rails.root.join('app/assets/fonts/opensans.ttf')
cell_1 = pdf.make_cell(content: "Eylül Çamcı".force_encoding('utf-8'), borders: [], size: 66, :text_color => "000000", padding: [0,0,0,700], font: Rails.root.join('app/assets/fonts/opensans.ttf'))

Haemocyte answered 11/9, 2017 at 6:25 Comment(3)

Sorry, but I'm still getting the same error when I try to change the ISO code as well :/ – Naima 11/9, 2017 at 7:5

Sorry, it seems I misled you. Please check my edited answer. It may not solve the problem, but it may give us a clue. – Haemocyte 11/9, 2017 at 7:33

One more edit. If it does not solve either, I am sorry friend. The rest of the code seems OK to me. – Haemocyte 11/9, 2017 at 7:48

I'm not sure I remember how Prawn works, but PDF files don't support UTF-8, which is the default Ruby encoding for String objects.

In fact, PDF files only support ASCII encoding using internal fonts - any other encoding requires that you bring your own font (which is also recommended for portability).

The workaround is to use character maps (CMaps): either custom CMaps or predefined ones (BYO font).

Generally, PDF files include an embedded font (or a subset of a font) and a CMap that maps the value of a byte (or of a number of bytes) to a desired font glyph. For example, it could map 97, which is 'a' in ASCII, to the å glyph when using the specified font.
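As a rough mental model only (this is not how Prawn or the PDF format actually stores anything), the idea of mapping codes to glyphs can be sketched with a plain Ruby Hash. The names CODE_TO_GLYPH and render_codes are invented for this illustration, and the code values are arbitrary:

```ruby
# Toy model of a code-to-glyph map: the numbers have no meaning on their own,
# the map decides which glyph each code stands for.
CODE_TO_GLYPH = {
  1 => "E", 2 => "y", 3 => "l", 4 => "ü", 5 => " ",
  6 => "Ç", 7 => "a", 8 => "m", 9 => "c", 10 => "ı"
}.freeze

# Looks up every code in the map and joins the resulting glyphs.
# fetch raises KeyError for a code the map does not define.
def render_codes(codes, map)
  codes.map { |code| map.fetch(code) }.join
end

puts render_codes([1, 2, 3, 4, 3, 5, 6, 7, 8, 9, 10], CODE_TO_GLYPH)
# prints "Eylül Çamcı"
```

The point is that the text survives only if the font in use defines a glyph for every code, which is why loading a suitable font matters.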

Last time I used Prawn, I think it supported TTF fonts and created font maps automatically using UTF-8 Strings for the text input, but you have to load an appropriate font into Prawn and remember to use it!

You can see an example in this answer.

Good Luck!

EDIT

I updated the answer to reflect @mkl's comments.

@mkl pointed out that other encodings are supported or possible (BYO font), including some predefined multibyte encodings (which use predefined CMaps).

Carnauba answered 15/9, 2017 at 19:38 Comment(4)

"In fact, PDF files only support ASCII encoding." - this simply is wrong. There is a wide palette of possible encodings for fonts in PDFs, both single byte and multi byte. Merely UTF-8 happens not to be among them. – Kudos 18/9, 2017 at 4:19

@Kudos - I think you're mistaken. Multi-byte encodings aren't possible in the PDF format and any encoding other than ASCII (with a limited number of built in fonts) requires that you bring your own font and map the glyphs. You might be thinking of the authoring tool rather than the file format. – Carnauba 18/9, 2017 at 9:50

"requires that you bring your own font" - but what is the problem with that? Embedding fonts actually is a necessity if you want PDFs to be really portable. That being said, though, even if only considering the standard 14 fonts there is much more than merely ASCII; please have a look at Annex D of the PDF specification ISO 32000-1 (part 2 has been released this year but I could not compare yet). And beyond those standard 14 fonts, PDF supports many predefined multi-byte encodings (cf. e.g. section 9.7.5 in ISO 32000-1) and an option to build your own encodings. – Kudos 18/9, 2017 at 11:11

@Kudos - I updated my answer to reflect your comments. Let me know if you have further input. – Carnauba 18/9, 2017 at 11:55

From this answer about forcing strings to UTF-8 from any encoding:

"Forcing" an encoding is easy, however it won't convert the characters just change the encoding:
str = str.force_encoding("UTF-8")
str.encoding.name # => 'UTF-8'
If you want to perform a conversion, use encode

Indeed, as @MehmetKaplan said:

It seems Turkish is missing in iso-8859-1.

On the other hand iso-8859-9 should work.

Therefore, you won't need force_encoding anymore, just encode:

[37] pry(main)> "Eylül Çamcı".encode('iso-8859-1')
Encoding::UndefinedConversionError: U+0131 from UTF-8 to ISO-8859-1
from (pry):39:in `encode'
[38] pry(main)> "Eylül Çamcı".encode('iso-8859-9')
=> "Eyl\xFCl \xC7amc\xFD"

This means you have to drop UTF-8 entirely in your code:

content: "Eylül Çamcı".encode('iso-8859-9'),
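If you want to see which characters in the name actually fall outside the Windows-1252 character set named in the error message, a small check with core Ruby is enough. This is just a sketch; the helper name unencodable_chars is made up for this example:

```ruby
# Hypothetical helper: returns the distinct characters of text
# that cannot be transcoded into the given charset.
def unencodable_chars(text, charset)
  text.each_char.select do |ch|
    begin
      ch.encode(charset)
      false # the character exists in the target charset
    rescue Encoding::UndefinedConversionError
      true  # no equivalent in the target charset
    end
  end.uniq
end

name = "Eylül Çamcı"
p unencodable_chars(name, "Windows-1252") # => ["ı"]
p unencodable_chars(name, "ISO-8859-9")   # => []
```

Only the dotless ı (U+0131) is the problem for Windows-1252; ü and Ç are fine there.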

Ballance answered 12/9, 2017 at 11:13 Comment(1)

I'm still getting the same error :/ Do you think it has something to do with the fonts? I have checked on Google Fonts and Opensans supports the string I am trying with. – Naima 14/9, 2017 at 3:30
