Wrong spacing for OpenType math subscripts

Johannes Rosenberger

14 Mar 2021

8:12 p.m.

Hi LuaTeX developers,

first of all, thank you very much for all your work on this and the great OpenType support and possibility of fixing internals using Lua code!

In the following document, the math subscripts are so close to the following characters that they sometimes overlap:

    \documentclass{article}
    \usepackage{unicode-math}

    \def\supp{{\rm supp}}

    \begin{document}

    $$ I(P_X,P_{Y|X}) = \sum_{(X,Y) \in \supp P_{XY}} P_{XY}(X,Y) \log_2 \frac{P_{XY}(X,Y)}{P_X(X) P_Y(Y)}. $$

    \end{document}

The problem cannot be fixed by simply adding spacing, e.g. \Uordopenspacing, because the space needed is different per character. We need italics correction in all circumstances where an italic symbol is last in the subscript and something follows the noad. According to the manual (Section 7.6.4), it is applied only to base characters, not to subscripts.

Setting \mathitalicsmode=1 to force it changes nothing, because italics correction is then only applied in the subscript to the characters before the last one. Maybe LuaTeX assumes the last character to be followed by nothing, because nothing follows in the mlist of the subscript, even if something follows after the whole noad.

Setting \mathitalicsmode=1 and adding manual 0-kerns (like \/) helps. This is implemented by the following Lua code:

    \directlua{
      ty = node.types()

      % Append a zero-width kern to the end of a subscript.
      sub_addkern = function(sub)
        scan_sub_addkern(sub)
        local kern = node.new("kern")
        if ty[sub.id] == "math_char" or ty[sub.id] == "math_char_text" then
          % Single character: wrap it in a noad inside a new sub_mlist.
          local old = node.new("noad", 0)
          old.nucleus = sub
          sub = node.new("sub_mlist")
          sub.head = old
          node.insert_after(sub.head, old, kern)
        elseif ty[sub.id] == "sub_box" or ty[sub.id] == "sub_mlist" then
          % Existing list: add the kern unless the list already ends in one.
          local tail = node.tail(sub.head)
          if not (ty[tail.id] == "kern") then
            node.insert_after(sub.head, tail, kern)
          end
        end
        return sub
      end

      % Visit the fields of a node that may contain subscripts.
      scan_sub_addkern = function(n)
        if n.head then n.head = scan_all_sub_addkern(n.head) end
        if n.sub then n.sub = sub_addkern(n.sub) end
        if n.nucleus then scan_sub_addkern(n.nucleus) end
        if n.num then scan_sub_addkern(n.num) end
        if n.denom then scan_sub_addkern(n.denom) end
      end

      scan_all_sub_addkern = function(head)
        for n in node.traverse(head) do
          scan_sub_addkern(n)
        end
        return head
      end

      luatexbase.add_to_callback("pre_mlist_to_hlist_filter", scan_all_sub_addkern,
        "default mlist_to_hlist + italics-corrected subscript")
    }

XeTeX does this right out of the box.

Best,
Johannes
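The manual form of this workaround, before it is automated with the callback, can be sketched as a small comparison file. This is only an illustration: it assumes LuaLaTeX with the default font of unicode-math, the two test formulas are made up for the comparison, and the claim that a trailing \/ triggers the correction under \mathitalicsmode=1 is taken from the message above, not from independent testing.

```latex
% Compile with lualatex.
\documentclass{article}
\usepackage{unicode-math}
\begin{document}

% Default behaviour: the subscript may touch the following parenthesis.
$$ P_{XY}(X,Y) \quad P_X(X) P_Y(Y) $$

% Workaround from the message above: enable italic corrections in math
% and end every subscript with \/ (a zero-width kern in math mode).
\mathitalicsmode=1
$$ P_{XY\/}(X,Y) \quad P_{X\/}(X) P_{Y\/}(Y) $$

\end{document}
```

Comparing the two displayed lines in the output shows whether the chosen font needs the correction at all.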


Hans Hagen

15 Mar

3:08 a.m.

On 3/15/2021 3:12 AM, Johannes Rosenberger wrote:

> Hi LuaTeX developers,
>
> first of all, thank you very much for all your work on this and the great OpenType support and possibility of fixing internals using Lua code!
>
> In the following document, the math subscripts are so close to the following characters that they sometimes overlap:
>
> \documentclass{article} \usepackage{unicode-math}
>
> \def\supp{{\rm supp}}
>
> \begin{document}
>
> $$ I(P_X,P_{Y|X}) = \sum_{(X,Y) \in \supp P_{XY}} P_{XY}(X,Y) \log_2 \frac{P_{XY}(X,Y)}{P_X(X) P_Y(Y)}. $$
>
> \end{document}

I can't speak for LaTeX, but it has to do with the fact that some of the TeX-related fonts have old-school metrics: they lie about the width. The traditional engine always adds italic correction and in some cases removes it afterwards (for that reason the 'width' is the 'real width' minus the 'italic correction'). In OpenType the real width is used and such corrections are done with math kerns (and most OTF TeX fonts lack them); italic correction is only added in a few cases.

I guess that XeTeX has a classical math engine, in which case it will add italic corrections to the width, so that hides the issue with fonts.

There are / have been ways to force LuaTeX into old-school mode, but none of them is good enough to catch all cases, so we now have two code paths: old-school TeX fonts and OpenType, cf. Cambria (which sets the standard).

We'll see if the Gyre fonts will get this done the OpenType way. If not, we'll have to work around it, and your solution is one of them, although it might not work out well with a font that has real widths and also italic corrections, which then get applied. There are several solutions for it; which approach is chosen depends on the macro package and taste.

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
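The width bookkeeping described in this reply can be written out as a short derivation. The symbols are assumptions introduced here for illustration (stored width, real width, italic correction, and the advance used by each code path), and the last two lines are an inference from the explanation above, not a statement about the engine's source code. The fragment needs amsmath.

```latex
\begin{align*}
  w_{\text{font}} &= w_{\text{real}} - \delta
    && \text{(old-school metrics: stored width omits the italic correction $\delta$)} \\
  a_{\text{trad}} &= w_{\text{font}} + \delta = w_{\text{real}}
    && \text{(traditional engine, correction kept)} \\
  a_{\text{otf}} &= w_{\text{font}}
    && \text{(OpenType path, no italic correction added)} \\
  a_{\text{trad}} - a_{\text{otf}} &= \delta
    && \text{(space missing after the glyph)}
\end{align*}
```

Under these assumptions, a font with old-school widths and no math kerns ends up tighter by exactly its italic correction on the OpenType path, which matches the overlap reported for the subscripts.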

Johannes Rosenberger

2:36 p.m.

Excerpts from Hans Hagen's message of March 15, 2021 10:08 am:

> I can't speak for LaTeX, but it has to do with the fact that some of the TeX-related fonts have old-school metrics: they lie about the width. The traditional engine always adds italic correction and in some cases removes it afterwards (for that reason the 'width' is the 'real width' minus the 'italic correction'). In OpenType the real width is used and such corrections are done with math kerns (and most OTF TeX fonts lack them); italic correction is only added in a few cases.

Aren't the two measures dual? The two concepts seem to me simply like positive (italic) and negative (kerning) correction. I don't see how it should be a problem to apply both if the respective amounts are specified by the font. This is the case for Latin Modern Math, for example; the correction is simply not applied by LuaTeX. I can only imagine that applying both could be a problem if the font specifies both variants alternatively, such that applying both would result in an overcorrection. But wouldn't this be a particularly badly designed font?

So: Could compatibility with the $real top edge width = width + it. correction$ really hurt fonts with $real bottom edge width = width - kerning$?

The OpenType spec [1] lets one apply italic correction to a base glyph to shift the superscript to the right. This seems very similar to applying italic correction to a subscript to shift the following base glyph right.

> There are / have been ways to force LuaTeX into old-school mode, but none of them is good enough to catch all cases, so we now have two code paths: old-school TeX fonts and OpenType, cf. Cambria (which sets the standard).

Unfortunately I have no Cambria to test, because I'm on Linux. Are OpenType fonts relying on italics correction ignored, then? Apparently, fonts such as Latin Modern Math are effectively unusable for maths involving upper-case indices if they are uncorrected. Or rather: LuaTeX is unusable if you want to use such a font.

> We'll see if the Gyre fonts will get this done the OpenType way. If not, we'll have to work around it, and your solution is one of them, although it might not work out well with a font that has real widths and also italic corrections, which then get applied. There are several solutions for it; which approach is chosen depends on the macro package and taste.

As above: What kind of font could this be? Why should italics correction be specified on top of the real maximal width? If the 'real width' is not the slanted/italic character width, how is it real?

The TeX Gyre fonts don't seem to be developed further at lightning speed; the latest TeX Gyre fonts are from 2016. Do you know of any free math font which has intrinsic math kerning, so that I can test it?

> Can you post the PDF of the test?
> -- luigi scarso

Here is my correction code, together with example documents, as tex+pdf for some fonts:

https://git.sr.ht/~jorsn/luatex-math-spacing

Information on how to easily use other fonts is in the GNUMakefile.

Best,
Johannes

[1]: https://docs.microsoft.com/en-us/typography/opentype/spec/math
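The two informal width relations in this message can be stated in symbols. The notation is an assumption made here for readability: w is the width stored in the font, the italic correction is added at the top edge, and k is the kern subtracted at the bottom edge. The fragment needs amsmath.

```latex
\begin{align*}
  w_{\text{top}}    &= w + \delta_{\text{it}}
    && \text{(real top edge width: width plus italic correction)} \\
  w_{\text{bottom}} &= w - k
    && \text{(real bottom edge width: width minus kerning)}
\end{align*}
```

Read this way, the question in the message is whether an engine can honour the first relation for fonts that specify an italic correction without disturbing fonts that only rely on the second.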

Hans Hagen

5:37 p.m.

On 3/15/2021 9:36 PM, Johannes Rosenberger wrote:

> Excerpts from Hans Hagen's message of March 15, 2021 10:08 am:
>
>> I can't speak for LaTeX, but it has to do with the fact that some of the TeX-related fonts have old-school metrics: they lie about the width. The traditional engine always adds italic correction and in some cases removes it afterwards (for that reason the 'width' is the 'real width' minus the 'italic correction'). In OpenType the real width is used and such corrections are done with math kerns (and most OTF TeX fonts lack them); italic correction is only added in a few cases.
>
> Aren't the two measures dual? The two concepts seem to me simply like positive (italic) and negative (kerning) correction. I don't see how it should be a problem to apply both if the respective amounts are specified by the font. This is the case for Latin Modern Math, for example; the correction is simply not applied by LuaTeX. I can only imagine that applying both could be a problem if the font specifies both variants alternatively, such that applying both would result in an overcorrection. But wouldn't this be a particularly badly designed font?

different concepts (traditional tex often has rather special metrics in math, partly because of limitations in the tfm format wrt the number of properties and so on) .. also, opentype math kerns are more advanced and meant for shape-following kerns

> So: Could compatibility with the $real top edge width = width + it. correction$ really hurt fonts with $real bottom edge width = width - kerning$?

what magically might work for one font might fail for another, so you fix this and break that .. i gave up on finding a heuristic (and context has some tweaks like 'fixing' widths and adding staircase kerns, but that is of little use to you)

> The OpenType spec [1] lets one apply italic correction to a base glyph to shift the superscript to the right. This seems very similar to applying italic correction to a subscript to shift the following base glyph right.
>
>> There are / have been ways to force LuaTeX into old-school mode, but none of them is good enough to catch all cases, so we now have two code paths: old-school TeX fonts and OpenType, cf. Cambria (which sets the standard).
>
> Unfortunately I have no Cambria to test, because I'm on Linux.

fwiw cambria and microsoft opentype math have set the standard

> Are OpenType fonts relying on italics correction ignored, then? Apparently, fonts such as Latin Modern Math are effectively unusable for maths involving upper-case indices if they are uncorrected. Or rather: LuaTeX is unusable if you want to use such a font.

they have more advanced staircase kerns (italic correction is more a text thing; you need different corrections for super, mid and sub, left and right)

>> We'll see if the Gyre fonts will get this done the OpenType way. If not, we'll have to work around it, and your solution is one of them, although it might not work out well with a font that has real widths and also italic corrections, which then get applied. There are several solutions for it; which approach is chosen depends on the macro package and taste.
>
> As above: What kind of font could this be? Why should italics correction be specified on top of the real maximal width? If the 'real width' is not the slanted/italic character width, how is it real?

normally italic correction *is* defined (for text fonts) to correct the width somehow for bounding characters; some fonts let the shape stick out, others go right, so the correction can be positive or negative (and there are more boundary cases to consider)

> The TeX Gyre fonts don't seem to be developed further at lightning speed; the latest TeX Gyre fonts are from 2016. Do you know of any free math font which has intrinsic math kerning, so that I can test it?

there will probably be an update release this year, and at some point there will be no more updates (so a year is not saying much)

e.g. dejavu (free because development was paid by user groups), lucidaot (non-free but cheap @ tug because also sponsored by it); even cambria is not that expensive (and occasionally buying a font is definitely cheaper than buying a phone every few years, so you could give lucida a try as you can use it lifelong)

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl
-----------------------------------------------------------------
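The "staircase kerns" mentioned in this reply correspond to the cut-in kerning of the OpenType MATH table, where a glyph can carry up to four kern tables (top right, top left, bottom right, bottom left). As a hedged summary of how one such table is evaluated, as described in the OpenType MATH specification: with correction heights $h_0 < h_1 < \dots < h_{n-1}$ and kern values $k_0, \dots, k_n$, the kern at attachment height $h$ is a step function. The symbol names are assumptions chosen here; the fragment needs amsmath.

```latex
\[
  k(h) =
  \begin{cases}
    k_0 & \text{if } h < h_0, \\
    k_i & \text{if } h_{i-1} \le h < h_i, \quad 1 \le i \le n-1, \\
    k_n & \text{if } h \ge h_{n-1}.
  \end{cases}
\]
```

A single italic correction is one number per glyph, whereas this scheme lets the kern follow the glyph shape at the height where a subscript or superscript actually attaches, which is the difference the reply points to.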

luigi scarso

3:13 a.m.

On Mon, Mar 15, 2021 at 9:15 AM Johannes Rosenberger wrote:

> Hi LuaTeX developers,
>
> first of all, thank you very much for all your work on this and the great OpenType support and possibility of fixing internals using Lua code!
>
> In the following document, the math subscripts are so close to the following characters that they sometimes overlap:

Can you post the PDF of the test?

--
luigi

