[luatex-fonts] non-ascii filenames in font cache
Hi Hans,

the font cache currently drops non-ascii bytes when creating file names by means of containers.cleanname(). Dohyun Kim sent a fix for data-con.lua (see below). My own test with the unicode library leads to some odd results.

Also I noticed that as a pattern, [^%w%d] is a bit redundant since %d is a subset of %w in both string and unicode.utf8.

Regards
Philipp

#!/usr/bin/env texlua

local non_ascii_names = {
  [[华文仿宋.ttf]],
  [[华文细黑.ttf]],
  [[华文黑体.ttf]],
}

--- [a]: current data-con
--- [b]: include non-ascii (proposed by Dohyun Kim)
--- [c]: with selene unicode

for i = 1, #non_ascii_names do
  local name = non_ascii_names[i]
  print""
  print("[a]", name, string.gsub(string.lower(name), "[^%w%d]+","-"))
  print("[b]", name, string.gsub(string.lower(name), "[^%w%d\128-\255]+","-"))
  print("[c]", name, unicode.utf8.gsub(unicode.utf8.lower(name), "[^%w%d]+","-"))
end
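The redundancy remark can be checked for the byte-oriented string library with a short sketch. It walks over all 256 byte values and counts those matched by %d but not by %w. It assumes the default "C" locale (Lua character classes are locale dependent) and says nothing about unicode.utf8, which is not exercised here. The variable name digits_outside_w is made up for this illustration.

```lua
-- Count bytes that match %d but not %w (string library only).
-- Assumption: default "C" locale, since character classes follow the C locale.
local digits_outside_w = 0
for b = 0, 255 do
  local c = string.char(b)
  if c:find("^%d$") and not c:find("^%w$") then
    digits_outside_w = digits_outside_w + 1
  end
end
print(digits_outside_w)

-- Consequently both patterns should clean a name identically.
local sample = "Some Font-01.ttf"
local with_d    = string.gsub(string.lower(sample), "[^%w%d]+", "-")
local without_d = string.gsub(string.lower(sample), "[^%w]+", "-")
print(with_d, without_d)
```

If the count is zero, dropping %d from the character class does not change which bytes are replaced.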
On 4/28/2013 12:04 PM, Philipp Gesang wrote:
the font cache currently drops non-ascii bytes when creating file names by means of containers.cleanname(). Dohyun Kim sent a fix for data-con.lua (see below). My own test with the unicode library leads to some odd results.
strange that it wasn't noticed before as it's rather old code

function containers.cleanname(name)
    return (gsub(lower(name),"[^%w\128-\255]+","-"))
end

is good enough i guess

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | voip: 087 875 68 74 | www.pragma-ade.com | www.pragma-pod.nl
-----------------------------------------------------------------
On 4/28/2013 12:04 PM, Philipp Gesang wrote:
the font cache currently drops non-ascii bytes when creating file names by means of containers.cleanname(). Dohyun Kim sent a fix for data-con.lua (see below). My own test with the unicode library leads to some odd results.
strange that it wasn't noticed before as it's rather old code
Personally I would rename the files instead of reporting it.
function containers.cleanname(name) return (gsub(lower(name),"[^%w\128-\255]+","-")) end
is good enough i guess
Of course, thanks! Philipp
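To make the effect of the change concrete, here is a minimal sketch comparing the old pattern with the one Hans proposes. The names cleanname_old and cleanname_new are hypothetical stand-ins written for this illustration, not the actual containers code, and the sketch assumes a UTF-8 encoded source file and the default "C" locale.

```lua
-- Hypothetical stand-ins for the old and the proposed containers.cleanname.
-- Assumptions: UTF-8 source file, default "C" locale.
local gsub, lower = string.gsub, string.lower

-- Old behaviour: every run of non-alphanumeric bytes becomes one "-",
-- which includes all bytes of a multi-byte UTF-8 sequence.
local function cleanname_old(name)
  return (gsub(lower(name), "[^%w%d]+", "-"))
end

-- Proposed behaviour: bytes 128-255 are kept, so UTF-8 sequences survive.
local function cleanname_new(name)
  return (gsub(lower(name), "[^%w\128-\255]+", "-"))
end

local names = { "华文仿宋.ttf", "华文细黑.ttf" }
for i = 1, #names do
  print(cleanname_old(names[i]), cleanname_new(names[i]))
end
```

With the old pattern both names collapse to the same cache name, because the Chinese characters and the dot form a single run that is replaced by one "-". With the new pattern only the dot is replaced, so the two names stay distinct.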
On Sun, Apr 28, 2013 at 12:56:25PM +0200, Hans Hagen wrote:
On 4/28/2013 12:04 PM, Philipp Gesang wrote:
the font cache currently drops non-ascii bytes when creating file names by means of containers.cleanname(). Dohyun Kim sent a fix for data-con.lua (see below). My own test with the unicode library leads to some odd results.
strange that it wasn't noticed before as it's rather old code
I noticed it long ago (by reading the code), but since I didn't have any fonts with non-ASCII filenames, I didn't bother. Regards, Khaled
On 28.04.2013 at 14:08, Khaled Hosny wrote:
On Sun, Apr 28, 2013 at 12:56:25PM +0200, Hans Hagen wrote:
On 4/28/2013 12:04 PM, Philipp Gesang wrote:
the font cache currently drops non-ascii bytes when creating file names by means of containers.cleanname(). Dohyun Kim sent a fix for data-con.lua (see below). My own test with the unicode library leads to some odd results.
strange that it wasn't noticed before as it's rather old code
I noticed it long ago (by reading the code), but since I didn't have any fonts with non-ASCII filenames, I didn't bother.
IIRC this was on purpose because there had been problems when fonts used non-ascii characters. Wolfgang
On 4/28/2013 2:15 PM, Wolfgang Schuster wrote:
On 28.04.2013 at 14:08, Khaled Hosny wrote:
On Sun, Apr 28, 2013 at 12:56:25PM +0200, Hans Hagen wrote:
On 4/28/2013 12:04 PM, Philipp Gesang wrote:
the font cache currently drops non-ascii bytes when creating file names by means of containers.cleanname(). Dohyun Kim sent a fix for data-con.lua (see below). My own test with the unicode library leads to some odd results.
strange that it wasn't noticed before as it's rather old code
I noticed it long ago (by reading the code), but since I didn't have any fonts with non-ASCII filenames, I didn't bother.
IIRC this was on purpose because there had been problems when fonts used non-ascii characters.
indeed, and as this patch only involves caching it means that the problem moves elsewhere

Hans
participants (5)
- Hans Hagen
- Khaled Hosny
- Philipp Gesang
- Philipp Gesang
- Wolfgang Schuster