[pdftex-Bugs][4014] Drops characters from included pdf figure
Bugs item #4014, was opened at 2009-04-08 17:24
Status: Closed
Priority: 3
Submitted By: Andrey Paramonov (pent)
Assigned to: Nobody (None)
Summary: Drops characters from included pdf figure
Category: PDF inclusion
Group: None
Resolution: Wont Fix
Initial Comment:
To reproduce:
1) Download the attached file. It is a pdf figure containing some non-latin characters. Note that the font subset it uses is embedded.
2) Create the following minimal LaTeX document:
\documentclass{article}
\usepackage[T2A]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[russian]{babel}
\usepackage{graphicx}
\begin{document}
\includegraphics{fig}
\end{document}
3) pdflatex it. Open the resulting pdf file and note that all non-latin characters have disappeared.
My pdflatex says:
---------
Running `LaTeX' on `test' with ``pdflatex "\nonstopmode\input{test.tex}"''
This is pdfTeXk, Version 3.141592-1.40.3 (Web2C 7.5.6)
%&-line parsing enabled.
entering extended mode
LaTeX2e <2005/12/01>
Babel
----------------------------------------------------------------------
Comment By: The Thanh Han (hanthethanh)
Date: 2009-04-12 14:27
Message:
the problem we have is basically this: we have fontA in fig.pdf and fontB on disk. They have the same name. How can pdftex detect whether fontB is a superset of fontA, so that fontA is replaced by fontB only when it is safe to do so?
This sounds simple, but it's not easy to do, for several reasons:
- pdftex parses fontB at the end, not during pdf inclusion. At this phase it's too late to change the decision whether to copy fontA.
- telling whether "fontB is a superset of fontA" is difficult. They might differ in various ways, not only in glyph set.
- fontA might be in Type1C format, which pdftex cannot parse.
----------------------------------------------------------------------
Comment By: Andrey Paramonov (pent)
Date: 2009-04-10 19:50
Message:
Hello!
Thanks for the detailed explanations.
I've dug a bit deeper and indeed, there are two versions of NimbusRomNo9L-Regu in my system:
/usr/share/fonts/type1/gsfonts/n021003l.pfb (v.1.0.6, dated 08/02/2007, Debian package gsfonts)
/usr/share/texmf-texlive/fonts/type1/urw/times/utmr8a.pfb (v.1.0.5, dated 01/10/2006, Debian package texlive-fonts-recommended)
The workarounds you suggest are helpful, but I still think the problem must be eliminated completely, if there is a way to do so.
You are probably right that different fonts should always have different names. But such an assertion only works in an ideal world. In reality, if user A sends a figure to user B, there is a (rather high) chance that their versions of the font do not match. Even on my local system I'm not sure which of the packages to blame: gsfonts for not renaming the font after a major upgrade, or texlive-fonts-recommended for providing an old version of the font (or myself for using Debian at all ;-).
In my opinion, we should be more tolerant of what we receive. If there is a way to detect the "glyph xxx undefined" situation beforehand, pdftex should do so and act as if \pdfmapline{-l7x-utmr NimbusRomNo9L-Regu} (for my example) were present in the document. Is there a way to implement such behavior?
Thanks for your effort,
Andrey
----------------------------------------------------------------------
Comment By: The Thanh Han (hanthethanh)
Date: 2009-04-09 21:28
Message:
I think it's useful to have a summary of the problem for further reference:
- fig.pdf contains a font named NimbusRomNo9L-Regu
- pdftex sees that it has the same font on disk and tries to use the font on disk instead of copying the font from fig.pdf. (Reason: if we have e.g. fig.pdf & fig2.pdf, the font would be included once instead of twice)
- however these 2 fonts are different: the one embedded in fig.pdf has extra glyphs that the font on disk doesn't have. pdftex cannot know this, since the font names are the same.
- hence the output pdf doesn't have those glyphs either.
workarounds/solutions:
- if you are in a hurry, just say: \pdfinclusioncopyfonts=1
- for more detailed info, see bug report #2092
- the proper solution IMO is to change the font name to denote that it differs from the original NimbusRomNo9L-Regu
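For reference, a minimal sketch of the first workaround applied to the test document from the initial comment. It assumes the figure is saved as fig.pdf next to the .tex file; putting the setting in the preamble is just one convenient place, chosen so that it takes effect before the figure is included.

```latex
\documentclass{article}
\usepackage[T2A]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage[russian]{babel}
\usepackage{graphicx}
% Copy the fonts embedded in included pdf files
% instead of replacing them with same-named fonts from disk.
\pdfinclusioncopyfonts=1
\begin{document}
\includegraphics{fig}
\end{document}
```

The trade-off is the one given above as the reason for the default: a font shared by several included figures may end up in the output more than once.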
It's questionable whether pdftex should use the font on disk instead of copying the font from fig.pdf by default. There are 2 main arguments for this behavior:
- that behavior was the default since the beginning, and it's not wise to change it; it can break pdf inclusion in existing documents
- if this behavior is not desired, it can be changed easily (\pdfinclusioncopyfonts=1)
btw, if you think this is a bug, please describe what should be the correct behavior in your opinion.
----------------------------------------------------------------------
Comment By: The Thanh Han (hanthethanh)
Date: 2009-04-09 05:41
Message:
if the font has been changed, the font name should have been changed too.
----------------------------------------------------------------------
Comment By: Andrey Paramonov (pent)
Date: 2009-04-09 04:13
Message:
Yes, probably a different version of the font was used to create the figure. However, I don't see why that is forbidden.
The font embedding feature has been *designed* to resolve the problems with different versions of fonts. If pdftex ignores the feature, and does so consciously, I see the problem as being in pdftex.
The discussion and workarounds mentioned in #2092 suggest that by default, pdftex tries to be clever and drops embedded fonts that would later be included anyway. In this case however pdftex becomes overconfident and drops fonts it cannot later restore. To me, this is a clear optimization bug.
Andrey Paramonov
----------------------------------------------------------------------
Comment By: The Thanh Han (hanthethanh)
Date: 2009-04-08 19:05
Message:
update: the link was wrong again; somehow a ';' was automatically inserted into the link.
Please see bug #2092 for a similar problem and workaround.
----------------------------------------------------------------------
Comment By: The Thanh Han (hanthethanh)
Date: 2009-04-08 19:02
Message:
update: the previous link was broken, correction:
http://sarovar.org/tracker/index.php?func=detail&aid=2092&group_id=106&atid=493
----------------------------------------------------------------------
Comment By: The Thanh Han (hanthethanh)
Date: 2009-04-08 18:59
Message:
I guess the figure was created with a modified version of utmr8a.pfb (NimbusRomNo9L-Regu), since the original utmr8a.pfb doesn't have non-latin glyphs. This is not a bug in pdftex. For a workaround, please see
http://sarovar.org/tracker/index.php?func=detail&aid=2092&group_id=106&atid=493
----------------------------------------------------------------------
You can respond by visiting:
http://sarovar.org/tracker/?func=detail&atid=493&aid=4014&group_id=106