Philipp Reichmuth wrote:
I've been writing a script that sifts through the unic-xxx.tex files to get a readable mapping of which Unicode characters are supported using \Amacron-style names.
mtxtools can create such lists using the unicode consortium glyph table, mojca's mapping list and the enco/regi files; we use mtxtools to create the tables needed for xetex (used for case mapping) and luatex (more extensive manipulations)
In the process I found one bug and something that might be another bug:
- the Cyrillic block (unic-004.tex) is missing an \unknownchar line for U+04CF, so that the remaining (few) glyphs are off by one
just mail me the patched file
- the Hebrew block (unic-005.tex) starts with a \numexpr line indicating an offset of 224 = E0; however, the first character in the list is U+05D0. So either the whole block is off by 16, starting at 0x04F0 instead of 0x0500, or the 224 should be a 208 (=D0) instead. BTW unic-005.tex is the only file with Macintosh line endings. Are the unic-xxx files automatically generated or maintained by hand?
maintained by hand, again, just send me the fixed file, but we need to make sure that the fix is ok (i.e. works as expected)
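The offset arithmetic for the Hebrew block can be checked with a few lines of Python. This is only a sketch: it assumes, as the message implies, that unic-005.tex covers the 256 code points starting at 0x0500, and the helper names are made up for illustration.

```python
# Assumption: each unic-xxx.tex file covers one block of 256 code points,
# so unic-005.tex would start at 0x0500. Helper names are hypothetical.
BLOCK_SIZE = 0x100


def code_point(block: int, offset: int) -> int:
    """Code point reached by applying an offset inside a given block."""
    return block * BLOCK_SIZE + offset


def implied_block_start(first_code_point: int, offset: int) -> int:
    """Block start that would make the offset land on the first character."""
    return first_code_point - offset


# Offset as found in the file: 224 = 0xE0
print(f"U+{code_point(5, 224):04X}")                 # U+05E0
# Offset that matches the first listed character: 208 = 0xD0
print(f"U+{code_point(5, 208):04X}")                 # U+05D0
# Block start implied if the offset 224 were kept and U+05D0 came first
print(f"0x{implied_block_start(0x05D0, 224):04X}")   # 0x04F0
```

With a block start of 0x0500, only the offset 208 lands on U+05D0, which is consistent with the second of the two explanations given for the mismatch.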
Incidentally, it would be trivial now to put the list of ConTeXt glyphs on the Wiki, if anyone's interested.
there is a file contextnames.txt in the distributions (maintained by mojca), while the not yet distributed char-def.lua has the info for luatex
I wanted to use this to work towards better support for the whole range of ConTeXt glyphs with OpenType fonts under XeTeX, by reading what ConTeXt glyphs are available in a font and building a "\catcode`ā=\active \def ā {\amacron}"-style list for the rest. (Unfortunately this kind of list would be font-specific, but the generic alternative would be a huge list of active characters with an \ifnum\XeTeXcharglyph"....>0 macro behind it, and that would probably be quite slow.) I wonder if there is a more intelligent way to achieve this goal; since part of the logic for mapping code points into glyph macros exists already, it would be easier if there was a way to reuse that.
best take a look at mtxtools; if needed we can generate the definitions; concerning speed, it will not be that slow, because tex is quite fast on such tests (unless \XeTeXcharglyph is slow due to lib access); the biggest thing is to make sure that things don't expand in unwanted ways. (i must find time to update my xetex bin; i must admit that i never tried to use open type fonts in xetex, the mac is broken)
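As a rough illustration of the font-specific list discussed above, the following Python sketch generates one active-character definition for each glyph name whose code point is not available in the font. The function name, the mapping and the set of supported code points are all hypothetical; in practice the mapping would come from the unic-xxx.tex files or contextnames.txt, and the supported set from the font. The emitted lines follow the shape quoted in the message, but without a space between the active character and the opening brace, since TeX does not skip a space after an active character and would treat it as part of the parameter text.

```python
def active_definitions(glyph_names, supported):
    """Return one TeX definition line per code point missing from the font.

    glyph_names: dict mapping code point -> ConTeXt glyph name (hypothetical data)
    supported:   set of code points the font provides directly (hypothetical data)
    """
    lines = []
    for cp, name in sorted(glyph_names.items()):
        if cp in supported:
            continue  # the font has this glyph, no fallback macro needed
        ch = chr(cp)
        # Make the character active and let it expand to the glyph macro.
        lines.append(f"\\catcode`{ch}=\\active \\def {ch}{{\\{name}}}")
    return lines


# Illustrative input only: U+0100 is assumed present in the font, U+0101 is not.
names = {0x0100: "Amacron", 0x0101: "amacron"}
for line in active_definitions(names, supported={0x0100}):
    print(line)
```

Whether such definitions expand safely in all contexts is exactly the concern raised in the reply, so any generated list would still need testing inside ConTeXt.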
The best way out would be if I could enable ConTeXt's UTF-8 regime while running XeTeX in \XeTeXinputencoding=bytes mode, but I haven't gotten that to work yet.
maybe mojca has

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
| www.pragma-pod.nl
-----------------------------------------------------------------