I made some progress on this. Starting with code from http://koivi.com/fill-pdf-form-fields/, I modified the value encoding to output numeric character references for any characters outside the ASCII range.
Now with pitulski's special strings:
Poznań Śródmieście Ćwiartka Ósma
outputs Pozna ródmiecie wiartka Ósma
with some box shapes superimposed
ęóąśłżźćńĘÓĄŚŁŻŹĆŃ
outputs óÓ
with more box shapes. I think the box shapes may be characters my server doesn't recognize.
I tried it with some French characters: ùûüÿ€’“”«»àâæçéèêëïôœÙÛÜŸÀÂÆÇÉÈÊËÏÎÔ
and they all came out OK, but some of them were overlapping.
--edit-- I just tried entering these manually into the form and got the same result minus the box shapes (using Evince). I then tried a different form (created by someone else): after entering ęóąśłżźćńĘÓĄŚŁŻŹĆŃ, only ółÓŁ was displayed. It looks like the result depends on which characters are included in the document's embedded fonts.
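To check that the encoding step itself is not losing characters, the conversion can be run on its own and parsed back. This is only a sketch: the sample string and variable names are illustrative, and it assumes the mbstring and SimpleXML extensions are available.

```php
<?php
// Sketch only: sample input and variable names are illustrative.
$input = 'Poznań Śródmieście';

// Same conversion map as in the function: code points 0x80..0xffff, no offset.
$convmap = array(0x0080, 0xffff, 0, 0xffff);
$encoded = mb_encode_numericentity(
    htmlspecialchars($input, ENT_COMPAT, 'UTF-8'),
    $convmap,
    'UTF-8'
);

// Wrap the encoded value in a minimal element and let an XML parser decode it.
$xml = simplexml_load_string('<value>' . $encoded . '</value>');
$decoded = (string) $xml;

echo $encoded, "\n";               // Pozna&#324; &#346;r&#243;dmie&#347;cie
var_dump($decoded === $input);     // bool(true)
```

If the round trip returns the original string, the XFDF data is intact, which is consistent with the missing glyphs coming from the form's fonts rather than from the encoding.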
/*
KOIVI HTML Form to FDF Parser for PHP (C) 2004 Justin Koivisto
Version 1.2.?
Last Modified: 2013/01/17 - Jon Hulka(jon dot hulka at gmail dot com)
- changed character encoding, all non-ascii characters get encoded as numeric character references
This library is free software; you can redistribute it and/or modify it
under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation; either version 2.1 of the License, or (at
your option) any later version.
This library is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public
License for more details.
You should have received a copy of the GNU Lesser General Public License
along with this library; if not, write to the Free Software Foundation,
Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
Full license agreement notice can be found in the LICENSE file contained
within this distribution package.
Justin Koivisto
justin dot koivisto at gmail dot com
http://koivi.com
*/
/**
* createXFDF
*
* Takes values passed via associative array and generates XFDF file format
* with that data for the pdf address supplied.
*
* @param string $file The pdf file - url or file path accepted
* @param array $info data to use in key/value pairs no more than 2 dimensions
* @param string $enc encoding named in the XML declaration, default UTF-8 (match server output: default_charset in php.ini); values themselves are read as UTF-8
* @return string The XFDF data for acrobat reader to use in the pdf form file
*/
function createXFDF($file,$info,$enc='UTF-8'){
$data=
'<?xml version="1.0" encoding="'.$enc.'"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<fields>';
foreach($info as $field => $val){
$data.='
<field name="'.htmlspecialchars($field, ENT_COMPAT, 'UTF-8').'">';
if(is_array($val)){
foreach($val as $opt)
//2013.01.17 - Jon Hulka - all non-ascii characters get character references
$data.='
<value>'.mb_encode_numericentity(htmlspecialchars($opt),array(0x0080, 0xffff, 0, 0xffff), 'UTF-8').'</value>';
// $data.='<value>'.htmlentities($opt,ENT_COMPAT,$enc).'</value>'."\n";
}else{
$data.='
<value>'.mb_encode_numericentity(htmlspecialchars($val),array(0x0080, 0xffff, 0, 0xffff), 'UTF-8').'</value>';
// $data.='<value>'.htmlentities($val,ENT_COMPAT,$enc).'</value>'."\n";
}
$data.='
</field>';
}
$data.='
</fields>
<ids original="'.md5($file).'" modified="'.time().'" />
<f href="'.htmlspecialchars($file, ENT_COMPAT, 'UTF-8').'" />
</xfdf>';
return $data;
}
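The escaping expression appears twice in the function, so one option is to factor it into a small helper. The sketch below is an assumption, not part of the original library: `xfdfEscape` is a hypothetical name, and the conversion map is widened to 0x10ffff so that characters beyond U+FFFF would also become references.

```php
<?php
/**
 * Hypothetical helper (not part of the original library): escape a UTF-8
 * string for use in XFDF element content or attribute values.
 */
function xfdfEscape($text){
    // Escape &, <, > and quotes first so the markup stays well-formed.
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    // Then turn every code point from 0x80 up to 0x10ffff into a
    // decimal numeric character reference.
    return mb_encode_numericentity(
        $escaped,
        array(0x80, 0x10ffff, 0, 0x1fffff),
        'UTF-8'
    );
}

// Example use for a field name and a value.
echo '<field name="' . xfdfEscape('miasto "A&B"') . '">', "\n";
echo '<value>' . xfdfEscape('Ćwiartka Ósma') . '</value>', "\n";
```

Because the output is pure ASCII, it does not depend on the encoding named in the XML declaration. It does not help with glyphs that are missing from the form's fonts.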