CMap unrecoverably broken
Hi,

I recently encountered an issue with LuaTeX (1.18.0) + Ghostscript (10.04.0), and it turns out that the culprit seems to be LuaTeX :) It may be summarized as follows:

1. Compile the following `test.tex` file with `lualatex`:

   --8<---------------cut here---------------start------------->8---
   \documentclass{article}
   \begin{document}
   Foo.
   \end{document}
   --8<---------------cut here---------------end--------------->8---

2. Then run the following command:

   --8<---------------cut here---------------start------------->8---
   gs -V -P- -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sstdout=%stderr "-sOutputFile=test.pdf.pdf" test.pdf
   --8<---------------cut here---------------end--------------->8---

On GNU/Linux, this returns:

┌────
│ GPL Ghostscript 10.04.0 (2024-09-18)
│ Copyright (C) 2024 Artifex Software, Inc. All rights reserved.
│ This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
│ see the file COPYING for details.
│ Processing pages 1 through 1.
│ Page 1
│
│ The following errors were encountered at least once while processing this file:
│ CMap unrecoverably broken
│
│ **** This file had errors that were repaired or ignored.
│ **** The file was produced by:
│ **** >>>> LuaTeX-1.18.0 <<<<
│ **** Please notify the author of the software that produced this
│ **** file that it does not conform to Adobe's published PDF
│ **** specification.
└────

You can find more details on how I traced this to LuaTeX in the TeX Live mailing list archive and in Ghostscript's Bugzilla:

┌────
│ https://tug.org/pipermail/tex-live/2024-December/050985.html
│ https://bugs.ghostscript.com/show_bug.cgi?id=708201
└────

Best regards.
--
Denis
On 12/19/2024 8:37 PM, Denis Bitouzé wrote:
> I recently encountered an issue with LuaTeX (1.18.0) + Ghostscript (10.04.0) and it turns out that the culprit seems to be LuaTeX :)
> [...]

It's a GS issue ... it can't handle an in itself valid

    0 beginbfrange
    endbfrange

because after all zero means zero entries. I have a patch for that but will delay discussing it with luigi until after the x-mas etc. days, as there is little hurry with this one.

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
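The "zero means zero entries" argument can be illustrated with a minimal Python sketch of a reader that takes the declared count literally. This is neither Ghostscript's parser nor the patch mentioned above; the names `TOKEN`, `_hex` and `parse_bfranges` are made up for illustration, and comments and string literals in the CMap are not handled.

```python
import re

# Hex strings, array brackets, and bare words (numbers, operators).
TOKEN = re.compile(r"<[0-9A-Fa-f\s]*>|\[|\]|[^\s<>\[\]]+")

def _hex(tokens, i):
    """Return tokens[i] if it is a hex string, else raise ValueError."""
    if i >= len(tokens) or not tokens[i].startswith("<"):
        found = tokens[i] if i < len(tokens) else "end of input"
        raise ValueError(f"expected hex string, got {found!r}")
    return tokens[i]

def parse_bfranges(cmap_text):
    """Collect (lo, hi, dst) entries from every bfrange block in cmap_text."""
    tokens = TOKEN.findall(cmap_text)
    ranges, i = [], 0
    while i < len(tokens):
        if tokens[i] != "beginbfrange":
            i += 1
            continue
        if i == 0 or not tokens[i - 1].isdigit():
            raise ValueError("beginbfrange needs an integer count")
        count = int(tokens[i - 1])      # operand preceding beginbfrange
        i += 1
        for _ in range(count):          # a count of 0 reads zero entries
            lo, hi = _hex(tokens, i), _hex(tokens, i + 1)
            i += 2
            if i < len(tokens) and tokens[i] == "[":
                end = tokens.index("]", i)   # destination given as an array
                dst = tokens[i + 1:end]
                i = end + 1
            else:
                dst = _hex(tokens, i)        # single destination string
                i += 1
            ranges.append((lo, hi, dst))
        if i >= len(tokens) or tokens[i] != "endbfrange":
            raise ValueError("expected endbfrange after the declared entries")
        i += 1
    return ranges
```

With this reading, `parse_bfranges("0 beginbfrange\nendbfrange")` simply returns an empty list, whereas a block that declares one entry but contains none raises `ValueError`.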
Hi Hans,

On Wed, 25 Dec 2024, Hans Hagen wrote:
> It's a GS issue ... it can't handle an in itself valid

Might be, but the statement:

│ **** file that it does not conform to Adobe's published PDF
│ **** specification.

says something different. I am not sure about the PDF spec here, but if an empty beginbfrange is not allowed per spec, then luatex should not produce it.

Best regards

Norbert

--
PREINING Norbert https://www.preining.info
arXiv / Cornell University + IFMGA Guide + TU Wien + TeX Live
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13
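What "should not produce it" could look like on the writer side is sketched below in Python. This is not how LuaTeX writes its CMaps; `format_bfranges` is a hypothetical helper, and the limit of 100 entries per block is the usual CMap convention, used here as an assumption.

```python
def format_bfranges(entries, max_per_block=100):
    """Render bfrange blocks for (lo, hi, dst) integer triples.

    An empty entry list yields no block at all, so a zero-count
    "0 beginbfrange ... endbfrange" pair is never written.
    """
    lines = []
    for start in range(0, len(entries), max_per_block):
        chunk = entries[start:start + max_per_block]
        lines.append(f"{len(chunk)} beginbfrange")
        for lo, hi, dst in chunk:
            lines.append(f"<{lo:04X}> <{hi:04X}> <{dst:04X}>")
        lines.append("endbfrange")
    return "\n".join(lines)
```

Because the loop runs over chunks of the entry list, the empty case falls out naturally without a special check.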
On Wed, 25 Dec 2024 22:58:40 +0900, Norbert Preining wrote:
> Hi Hans,
>
> On Wed, 25 Dec 2024, Hans Hagen wrote:
>> It's a GS issue ... it can't handle an in itself valid
>
> Might be, but the statement:
>
> │ **** file that it does not conform to Adobe's published PDF
> │ **** specification.
>
> says something different. I am not sure about the PDF spec here, but if an empty beginbfrange is not allowed per spec, then luatex should not produce it.

In the ghostscript bug report Ken relaxed this statement to "senseless but legal" and the next gs seems to accept that:
https://bugs.ghostscript.com/show_bug.cgi?id=708042#c3

The PDF/Postscript specs do not really say if 0 is allowed as value (typical vagueness) but as all readers seem to handle that ...

--
Ulrike Fischer
http://www.troubleshooting-tex.de/
On 12/25/2024 3:35 PM, Ulrike Fischer wrote:
> On Wed, 25 Dec 2024 22:58:40 +0900, Norbert Preining wrote:
>> Hi Hans,
>>
>> On Wed, 25 Dec 2024, Hans Hagen wrote:
>>> It's a GS issue ... it can't handle an in itself valid
>>
>> Might be, but the statement:
>>
>> │ **** file that it does not conform to Adobe's published PDF
>> │ **** specification.
>>
>> says something different. I am not sure about the PDF spec here, but if an empty beginbfrange is not allowed per spec, then luatex should not produce it.
>
> In the ghostscript bug report Ken relaxed this statement to "senseless but legal" and the next gs seems to accept that.
>
> https://bugs.ghostscript.com/show_bug.cgi?id=708042#c3
>
> The PDF/Postscript specs do not really say if 0 is allowed as value (typical vagueness) but as all readers seem to handle that ...

In the meantime I've seen plenty of pdf files (not produced by pdf/luatex) that contain (for instance) "senseless but legal" operator / font usage in the page stream so this zero 'array' is not the worst we can have, especially because what the tex engines produce is (apart from possible user injections) about as clean as one can get.

One should also keep in mind that there might be plenty of documents out there that have zero entry bfrange's so it would be interesting to see what the impact is of gs suddenly quitting on 'senseless' for a few decades of tex documents out there.

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
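To get a rough idea of how many existing documents contain such zero-entry blocks, one could scan files for the pattern. The Python sketch below is only an approximation: it assumes the CMap streams are stored uncompressed (for example after converting the file with qpdf's QDF mode), and `count_empty_bfranges` is a name made up for illustration.

```python
import re

# A count of exactly 0, then beginbfrange and endbfrange with only
# whitespace in between. The lookbehind keeps "10 beginbfrange" from matching.
EMPTY_BFRANGE = re.compile(rb"(?<![0-9])0\s+beginbfrange\s+endbfrange")

def count_empty_bfranges(pdf_bytes):
    """Count zero-entry bfrange blocks in raw (uncompressed) PDF bytes."""
    return len(EMPTY_BFRANGE.findall(pdf_bytes))
```

For a file on disk this could be called as `count_empty_bfranges(open(path, "rb").read())`; compressed streams would report zero hits even if the block is present.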
Hi

On Wed, 25 Dec 2024, Ulrike Fischer wrote:
> In the ghostscript bug report Ken relaxed this statement to "senseless but legal" and the next gs seems to accept that.

Great to hear!

> The PDF/Postscript specs do not really say if 0 is allowed as value (typical vagueness) but as all readers seem to handle that ...

:-( One of the happy things in life ;-)

On Thu, 26 Dec 2024, Hans Hagen wrote:
> In the meantime I've seen plenty of pdf files (not produced by pdf/luatex) that contain (for instance) "senseless but legal" operator / font usage in

That is not a valid excuse, though. Just because other software creates incorrect pdfs does not make it ok to create incorrect pdfs.

> One should also keep in mind that there might be plenty of documents out there that have zero entry bfrange's so it would be interesting to see what the

Yes, and so it is good that the next gs seems to accept it. Still, having PDFs produced by "us" (as the TeX world) being generally ok in most cases is very much preferable.

Best regards, happy holidays, and all the best for the next year

Norbert

--
PREINING Norbert https://www.preining.info
arXiv / Cornell University + IFMGA Guide + TU Wien + TeX Live
GPG: 0x860CDC13 fp: F7D8 A928 26E3 16A1 9FA0 ACF0 6CAC A448 860C DC13
On Wed, 25 Dec 2024 at 15:04, Norbert Preining wrote:
> Hi Hans,
>
> On Wed, 25 Dec 2024, Hans Hagen wrote:
>> It's a GS issue ... it can't handle an in itself valid
>
> Might be, but the statement:
>
> │ **** file that it does not conform to Adobe's published PDF
> │ **** specification.
>
> says something different. I am not sure about the PDF spec here, but if an empty beginbfrange is not allowed per spec, then luatex should not produce it.

hm, I always consider such messages from GS, but I prefer verapdf.
In this case we can start from pdf-a1b-2005.mkiv:

copy the icc profile from CTAN /support/colorprofiles/sRGB.icc into
texmf-dist/tex/context/colors/icc/sRGB.icc

run
    context --luatex pdf-a1b-2005.mkiv

run
    verapdf pdf-a1b-2005.pdf

that should say

<validationReport jobEndStatus="normal" profileName="PDF/A-1B validation profile" statement="PDF file is compliant with Validation Profile requirements." isCompliant="true">
  <details passedRules="128" failedRules="0" passedChecks="808" failedChecks="0"></details>
</validationReport>

make a qdf with
    qpdf -qdf pdf-a1b-2005.pdf pdf-a1b-2005.qdf

edit pdf-a1b-2005.qdf adding
    0 beginbfrange
    endbfrange

fix qdf with
    fix-qdf pdf-a1b-2005.qdf >pdf-a1b-2005.qdf.pdf

and finally run
    verapdf pdf-a1b-2005.qdf.pdf :

<validationReport jobEndStatus="normal" profileName="PDF/A-1B validation profile" statement="PDF file is compliant with Validation Profile requirements." isCompliant="true">
  <details passedRules="128" failedRules="0" passedChecks="814" failedChecks="0"></details>
</validationReport>

But even in this case, if I change it to
    1 beginbfrange
    endbfrange
I still have a

<validationReport jobEndStatus="normal" profileName="PDF/A-1B validation profile" statement="PDF file is compliant with Validation Profile requirements." isCompliant="true">
  <details passedRules="128" failedRules="0" passedChecks="814" failedChecks="0"></details>
</validationReport>

but a "warning" message (not in the report!)

Dec 27, 2024 12:50:01 PM org.verapdf.pd.font.cmap.CMapFactory getCMap
WARNING: Can't parse CMap CMap 19 0 obj, using default
java.io.IOException: CMap contains invalid entry in bfrange. Expected TT_HEXSTRING but got TT_KEYWORD

One could say that the pdf is not valid because it's malformed, and verapdf is wrong here.

--
luigi
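When comparing several veraPDF runs like the ones above, the relevant fields can be pulled out of the report with a few lines of Python. This is a minimal sketch that assumes the report is the bare `<validationReport>` fragment as quoted in this message (a complete veraPDF report may wrap it in further elements); `summarize_report` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

def summarize_report(xml_text):
    """Extract compliance flag and check counts from a validationReport fragment."""
    root = ET.fromstring(xml_text)      # root element is <validationReport>
    details = root.find("details")
    return {
        "compliant": root.get("isCompliant") == "true",
        "failed_rules": int(details.get("failedRules")),
        "passed_checks": int(details.get("passedChecks")),
    }
```

Applied to the reports quoted above, only `passed_checks` differs between the runs (808 versus 814). The CMap parser warning is printed separately and would not show up in this summary, which is the point made in the message.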
participants (5)
- Denis Bitouzé
- Hans Hagen
- luigi scarso
- Norbert Preining
- Ulrike Fischer