Adding metadata in EPS file using Java

I'm currently reading and writing an .EPS file in order to add or modify metadata (keywords and tags) in it.

PS: The file encoding is Windows-1251 (Cp1251, Russian).

I'm reading the EPS file like this (lines is a global variable that collects the file's lines):

try (BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(file), "Cp1251"))) {
    String line;
    while((line = br.readLine()) != null) {
        if(line.contains("</xmpTPg:SwatchGroups>")) {
            lines.add(line);
            lines.add(descriptionKwrds);
        }
        else
            lines.add(line);
        System.out.println(line);
    }
} catch (FileNotFoundException ex) {
    Logger.getLogger(script.class.getName()).log(Level.SEVERE, null, ex);
} catch (UnsupportedEncodingException ex) {
    Logger.getLogger(script.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
    Logger.getLogger(script.class.getName()).log(Level.SEVERE, null, ex);
}

In the code above, descriptionKwrds is the metadata (tags) that I want to add to the EPS file, for example:

String descriptionKwrds = "<photoshop:AuthorsPosition>icon vector illustration symbol bubble sign</photoshop:AuthorsPosition>";

And I'm writing the EPS file like this:

try {
    try (BufferedWriter out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(file.getName()), "Cp1251"))) {
        for(String s : lines)
            out.write(s + "\n");
        out.flush();
    }
} catch (FileNotFoundException ex) {
    Logger.getLogger(script.class.getName()).log(Level.SEVERE, null, ex);
} catch (IOException ex) {
    Logger.getLogger(script.class.getName()).log(Level.SEVERE, null, ex);
}

The file is read and written without errors, but when I open the newly generated file, it is reported as corrupted.

The files before and after manipulation are file1 and file2 respectively. I'm using the online EPS Converter to open the EPS files.

How can I achieve this?

Katusha answered 11/6, 2019 at 6:10 Comment(2)
I have no idea what metadata you think you are adding; there are neither tags nor keywords (other than PostScript operators) in an EPS file. You haven't supplied any before or after example to look at, and you haven't said what you are using to open the file after creation. Bear in mind that PostScript is a programming language; altering the program by randomly poking it isn't likely to succeed, you need to understand the program first. If you post a before and after file I'll take a look at them. - Kalvin
Hi @KenS, I have added the info you asked for. Please take a look. Thanks! - Katusha

OK, your problem is that your EPS file is an 'EPS with preview'. In addition to the actual PostScript program, there is a bitmap which any application placing the EPS on a page can use to display a 'preview' to the user.

The file has binary data at the beginning, like this:

C5 D0 D3 C6 20 00 00 00 DC 49 05 00 00 00 00 00
00 00 00 00 FC 49 05 00 AE AC 4D 00 FF FF 00 00

If you read Adobe Technical Note 5002 "Encapsulated PostScript File Format Specification" and look at page 23 you will see that it defines the DOS EPS Binary File Header, which begins hex C5D0D3C6, just as your file does. So you can see your file has a DOS header, which defines a preview.

Now bytes 4-7 define the start of the PostScript, and bytes 8-11 define the length of the PostScript section. All of these values are stored little-endian (low byte first), so 20 00 00 00 is 0x20. Bytes 12-15 are the start of the Metafile (0 in your case, so not present) and 16-19 are its byte length, again 0. Then at bytes 20-23 there is the start of the TIFF representation, and bytes 24-27 are the length of the TIFF. Finally there's the checksum of the header in the remaining two bytes; here we have 0xFFFF, which means 'ignore the checksum'. In this case the header has been padded out with two bytes (0x00) to make the total 32 bytes, which is why the offset of the PostScript section is 0x20.
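The field layout described above can be decoded in a few lines of Java. This is only a minimal sketch: the class name EpsDosHeader and its field names are made up for illustration, and it assumes the whole file has already been read into a byte array.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper that decodes the DOS EPS binary header fields. */
final class EpsDosHeader {
    final long psOffset, psLength, wmfOffset, wmfLength, tiffOffset, tiffLength;
    final int checksum;

    private EpsDosHeader(ByteBuffer b) {
        psOffset   = u32(b, 4);   // bytes 4-7: start of the PostScript section
        psLength   = u32(b, 8);   // bytes 8-11: length of the PostScript section
        wmfOffset  = u32(b, 12);  // bytes 12-15: start of the Metafile (0 = absent)
        wmfLength  = u32(b, 16);  // bytes 16-19: length of the Metafile
        tiffOffset = u32(b, 20);  // bytes 20-23: start of the TIFF preview
        tiffLength = u32(b, 24);  // bytes 24-27: length of the TIFF preview
        checksum   = b.getShort(28) & 0xFFFF; // bytes 28-29: 0xFFFF = ignore
    }

    /** Reads an unsigned 32-bit value using the buffer's byte order. */
    private static long u32(ByteBuffer b, int index) {
        return b.getInt(index) & 0xFFFFFFFFL;
    }

    static EpsDosHeader parse(byte[] file) {
        if (file.length < 30) {
            throw new IllegalArgumentException("Too short for a DOS EPS header");
        }
        if ((file[0] & 0xFF) != 0xC5 || (file[1] & 0xFF) != 0xD0
                || (file[2] & 0xFF) != 0xD3 || (file[3] & 0xFF) != 0xC6) {
            throw new IllegalArgumentException("Magic bytes C5 D0 D3 C6 not found");
        }
        // The header fields are little-endian.
        return new EpsDosHeader(ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN));
    }
}
```

The bytes could come from something like Files.readAllBytes(Paths.get("file1.eps")); the file name here is just a placeholder.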

Your problem is that you have added content to the PostScript section (therefore changing its size) but have not updated the file header to contain the new length of the PostScript section or the new position of the preview, so any EPS consumer won't be able to strip the preview. In effect you have corrupted the PostScript program.

You either need to update the file header, or strip the preview bitmap by removing the file header and trimming the bitmap off the end to produce a 'pure' EPS file (i.e. one with no preview).
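The second option (stripping the preview) might look roughly like this in Java. It is a sketch under assumptions, not a tested tool: EpsPreviewStripper and stripPreview are hypothetical names, the file is assumed to fit in memory, and a file without the DOS header is returned unchanged.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

/** Hypothetical helper: turns an 'EPS with preview' into a pure EPS. */
final class EpsPreviewStripper {

    static byte[] stripPreview(byte[] file) {
        if (!hasDosHeader(file)) {
            return file.clone(); // already a pure EPS, nothing to strip
        }
        ByteBuffer b = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
        long psOffset = b.getInt(4) & 0xFFFFFFFFL; // bytes 4-7
        long psLength = b.getInt(8) & 0xFFFFFFFFL; // bytes 8-11
        if (psOffset + psLength > file.length) {
            throw new IllegalArgumentException("Header points past end of file");
        }
        // Keep only the PostScript section; header and bitmap are dropped.
        return Arrays.copyOfRange(file, (int) psOffset, (int) (psOffset + psLength));
    }

    private static boolean hasDosHeader(byte[] file) {
        return file.length >= 30
                && (file[0] & 0xFF) == 0xC5 && (file[1] & 0xFF) == 0xD0
                && (file[2] & 0xFF) == 0xD3 && (file[3] & 0xFF) == 0xC6;
    }
}
```

Note that this only works on a file whose header is still correct, so it would have to be applied to the original file before any content is added to the PostScript section.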

I almost forgot to add some clarification: you are not updating 'keywords' or 'tags' in the EPS file. You are adding PostScript-language program code which executes PostScript operators. In this case, when run through a 'Distiller'-like PostScript interpreter (that is, one which produces PDF as an output), the PDF file will have its metadata altered. You aren't altering the metadata of the EPS at all (that's done with the comments in the header). For a PostScript consumer which is not a Distiller, the changes you have made will have no effect at all.

[Update]

Modifying the header of 'file2' (that is the file which has had pdfmarks added) like this:

C5 D0 D3 C6 20 00 00 00 32 26 05 00 00 00 00 00
00 00 00 00 52 26 05 00 AE AC 4D 00 FF FF 00 00

results in a working file. It seems that the modifications actually made the file shorter. The original size of the PostScript section was 0x0549DC and the offset of the TIFF bitmap was 0x0549FC. After modification the size of the PostScript section is 0x052632 and the offset of the TIFF bitmap is 0x052652.
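The numbers above follow from one rule: in this file the TIFF preview starts directly after the PostScript section, so its offset is the PostScript offset (0x20) plus the PostScript length. A small check of that arithmetic, with the hypothetical class name EpsOffsetCheck:

```java
/** Hypothetical arithmetic check for the header values quoted above. */
final class EpsOffsetCheck {

    /** TIFF offset when the preview directly follows the PostScript section. */
    static long tiffOffset(long psOffset, long psLength) {
        return psOffset + psLength;
    }

    public static void main(String[] args) {
        long psOffset = 0x20;
        long lengthBefore = 0x0549DC;
        long lengthAfter = 0x052632;

        // prints "TIFF offset before: 0x0549FC"
        System.out.printf("TIFF offset before: 0x%06X%n", tiffOffset(psOffset, lengthBefore));
        // prints "TIFF offset after:  0x052652"
        System.out.printf("TIFF offset after:  0x%06X%n", tiffOffset(psOffset, lengthAfter));
        // prints "PostScript section shrank by 9130 bytes"
        System.out.printf("PostScript section shrank by %d bytes%n", lengthBefore - lengthAfter);
    }
}
```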

I have a sneaking suspicion that this is due to CR/LF translation, and if so this will also have corrupted the TIFF bitmap stored at the end of the file (I notice the binary at the end does indeed appear to be different). You need to read and write this file as a binary file, not text.
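Putting the two points together (treat the file as bytes, and keep the header in step with the content), a Java version could look something like the sketch below. Everything named here is hypothetical (EpsBinaryPatcher, insertAfterMarker), and it makes several assumptions: the file fits in memory, the marker occurs inside the PostScript section, any preview stored after the PostScript section simply shifts by the number of inserted bytes, and setting the checksum to 0xFFFF ('ignore') is acceptable.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper: inserts bytes into the PostScript section and fixes the header. */
final class EpsBinaryPatcher {

    static byte[] insertAfterMarker(byte[] file, byte[] marker, byte[] addition) {
        if (file.length < 30 || (file[0] & 0xFF) != 0xC5 || (file[1] & 0xFF) != 0xD0
                || (file[2] & 0xFF) != 0xD3 || (file[3] & 0xFF) != 0xC6) {
            throw new IllegalArgumentException("No DOS EPS binary header");
        }
        ByteBuffer in = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
        int psOffset = in.getInt(4);
        int psLength = in.getInt(8);
        int psEnd = psOffset + psLength;
        if (psOffset < 30 || psLength < 0 || psEnd > file.length) {
            throw new IllegalArgumentException("Header offsets are out of range");
        }
        int found = indexOf(file, marker, psOffset, psEnd);
        if (found < 0) {
            throw new IllegalArgumentException("Marker not found in PostScript section");
        }
        int insertAt = found + marker.length;

        // Copy everything byte for byte; no charset decoding, no line-ending changes.
        byte[] out = new byte[file.length + addition.length];
        System.arraycopy(file, 0, out, 0, insertAt);
        System.arraycopy(addition, 0, out, insertAt, addition.length);
        System.arraycopy(file, insertAt, out, insertAt + addition.length,
                file.length - insertAt);

        ByteBuffer hdr = ByteBuffer.wrap(out).order(ByteOrder.LITTLE_ENDIAN);
        hdr.putInt(8, psLength + addition.length);        // new PostScript length
        shiftIfAfter(hdr, 12, psEnd, addition.length);    // Metafile offset
        shiftIfAfter(hdr, 20, psEnd, addition.length);    // TIFF offset
        hdr.putShort(28, (short) 0xFFFF);                 // assumption: 'ignore the checksum'
        return out;
    }

    /** Moves an offset field only if that section lies after the PostScript section. */
    private static void shiftIfAfter(ByteBuffer hdr, int field, int psEnd, int delta) {
        int offset = hdr.getInt(field);
        if (offset >= psEnd) {
            hdr.putInt(field, offset + delta);
        }
    }

    /** Naive byte search for pattern in data[from, to). */
    private static int indexOf(byte[] data, byte[] pattern, int from, int to) {
        outer:
        for (int i = from; i + pattern.length <= to; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (data[i + j] != pattern[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }
}
```

With the code from the question, the marker would be the bytes of "</xmpTPg:SwatchGroups>" and the addition the bytes of descriptionKwrds in the file's encoding (plus whatever line ending the file already uses), read with Files.readAllBytes and written back with Files.write instead of a BufferedReader/BufferedWriter pair.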

Kalvin answered 11/6, 2019 at 9:40 Comment(9)
Thanks for your explanation. It is based on the PostScript language, but I am modifying the EPS file in Java. PS: I don't know much about PostScript. - Katusha
Well, you are already modifying the PostScript file (EPS is PostScript). In fact, while my explanation deals with PostScript, that isn't the point. The file, being an EPS with preview, has a header. The header has byte offsets and lengths of various parts of the file content. If you modify the content, you must either modify the header so that those offsets and lengths are correct, or remove the header and the preview bitmap. None of that has anything to do with PostScript. The URL I linked to above gives the structure of the header, should you want to modify the EPS header. - Kalvin
As I'm not a PostScript developer, I really need to modify the EPS file in Java. - Katusha
I'm not suggesting using any specific language, let alone PostScript. You need to modify the bytes at the beginning of the file; how you do that is entirely up to you. You either need to calculate the new size of the PostScript section after you've modified it, and set the sizes and offsets of the PostScript and bitmap bytes in the header, or strip the header and the preview entirely. All of this is simply editing the bytes in the file; it has nothing to do with PostScript at all. As I don't understand Java I can't write it for you. - Kalvin
In which software did you check the file info, such as the bytes and the header? - Katusha
I used a binary editor, in this case Microsoft Visual Studio, but any binary editor will do. - Kalvin
Do you know how to do it in PostScript? - Katusha
To be honest, I wouldn't even try; PostScript is not a great language for this kind of thing. Also, you need to know how much data you added to the body of the PostScript program so you can add that to the offset of the bitmap, and to the length of the PostScript section in the header. - Kalvin
@user5377037 could you please answer this question (#56798125)? - Mathis
