20 Sep
2009
12:17 a.m.
Hans Hagen
put an ActualText tag on anything that happens not to match what you would get from the ToUnicode mapping.
Hm, if one knows the character (say c), then why not adapt the ToUnicode vector?
The same glyph could correspond to different Unicode characters in the source, and a ToUnicode map can only give one answer per glyph. This is exactly what normally happens with hyphens. In practice, what I see with my method is that discretionary hyphens always get an ActualText, and if the font is older and has glyph names like "Asmall" or "ffl" (which I don't bother handling specially), then the substituted stuff gets an ActualText. I could look at the font's internal encoding the way I think Cairo does, but it doesn't matter a whole lot.
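To make the comparison concrete, here is a rough Python sketch of the idea, not the actual implementation. It assumes glyphs are shown one at a time with two-byte codes (as with an Identity-H CID font), that `to_unicode` is a plain dict from glyph code to string, and that a discretionary hyphen has empty source text; the function names and the sample glyph codes are made up for illustration.

```python
def actual_text_hex(text):
    """Encode text as a PDF hex string: UTF-16BE with a leading byte order mark."""
    return "<FEFF" + text.encode("utf-16-be").hex().upper() + ">"


def wrap_glyphs(glyphs, to_unicode):
    """Return one content-stream fragment per glyph.

    glyphs     -- list of (glyph_code, source_text) pairs (hypothetical input shape)
    to_unicode -- dict mapping glyph_code to the text ToUnicode would yield
    """
    fragments = []
    for code, source in glyphs:
        show = f"<{code:04X}> Tj"
        if to_unicode.get(code) == source:
            # ToUnicode already gives the right answer; no tag needed.
            fragments.append(show)
        else:
            # Mismatch (or no mapping at all): wrap in a marked-content
            # span that carries the source text as ActualText.
            tag = actual_text_hex(source)
            fragments.append(f"/Span <</ActualText {tag} >> BDC {show} EMC")
    return fragments


# Assumed sample data: 0x0010 is the hyphen glyph, mapped to "-" by ToUnicode.
to_unicode = {0x0024: "A", 0x0010: "-"}
glyphs = [
    (0x0024, "A"),    # matches ToUnicode
    (0x0010, "-"),    # a real hyphen: matches
    (0x0010, ""),     # discretionary hyphen: same glyph, different source text
    (0x00C0, "ffl"),  # ligature glyph with no ToUnicode entry
]
for fragment in wrap_glyphs(glyphs, to_unicode):
    print(fragment)
```

The two hyphen entries use the same glyph code but only one of them gets tagged, which is the case that adapting the ToUnicode vector alone cannot cover.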