Bruno Le Floch
Hello David,
comparing LuaTeX and LilyPond integration of their respective extension languages. [...] I've added some rough sketches at the end of the article that should make clear why this can't be done in formats alone but will require primitive support as well if things are supposed to turn out nicely.
I believe that some of the points you raise, and the syntax you propose, could be obtained at the format level. Below, I'm just throwing ideas out, feel free to kill most of them.
For instance, it is possible to get the syntax \luadef parshape ... \endluadef (i.e., replacing end by \endluadef in your example): just read everything from \luadef to \endluadef with verbatim category codes.
That does not work inside of macros.
A side note: rather than using \noexpand in
\directlua{tex.print("\noexpand\\message{Hi}")}
you can use \unexpanded as
\def\nexplua#1{\directlua{\unexpanded{#1}}}
\nexplua{tex.print("\\message{Hi}")}
I was pretty sure I tried several combinations of \unexpanded, but I have to admit that this appears to work. Ah, I think I tried being too clever, using something like \directlua\expandafter{\unexpanded ... or something similar, which would be rather pointless indeed.
One solution is to follow the footsteps of \verb, changing category codes before reading the lua code.
You can't do that in macros and macro arguments. [...]
Either way, category code changes will encounter a big problem: the user will write working code, then try to put it in a macro, and fail, because the TeX parser will not know that one part of the definition is meant to become Lua code.
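To make the failure mode concrete, here is a minimal plain LuaTeX sketch of the \verb-style approach. The names \luaverbatim and \doluaverbatim are hypothetical, invented for this illustration; only % and # are neutralised, and the backslash is left alone to keep the sketch short.

```tex
% Hypothetical helper: switch % and # to "other" (catcode 12)
% BEFORE the Lua code is tokenized, then hand it to \directlua.
\def\luaverbatim{\begingroup
  \catcode`\%=12 \catcode`\#=12
  \doluaverbatim}
\def\doluaverbatim#1{\endgroup\directlua{#1}}

% At top level the argument is tokenized after the catcode change,
% so Lua sees its own length (#) and modulo (%) operators:
\luaverbatim{local t = {1, 2, 3}; tex.print(tostring(#t % 2))}

% Wrapped in a macro, the same text is tokenized when the \def is
% read, with the normal catcodes still in effect: % starts a TeX
% comment and #t is an illegal parameter reference, so this fails.
% \def\wrapped{\luaverbatim{local t = {1, 2, 3}; tex.print(tostring(#t % 2))}}
\bye
```

The last definition is left commented out on purpose: uncommenting it shows that the catcode switch inside \luaverbatim comes too late once the text has already been read as part of a macro body.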
A way out would be that LuaTeX's "eyes" recognize what part of the code is TeX, and what part is Lua. In fact, you allude to this possibility when proposing a new catcode. It is possible to achieve this distinction while keeping the existing catcodes:
But that won't help against % being a comment character and # being a hash mark and so on.
\def\foo#1#2{%
  %
  % Here, normal TeX catcodes are in effect.
  % This is a comment, but we can do useful
  \message{things with #1 and #2.}
  %
  #(-- This is a Lua comment, then code.
    function mess(x)
      tex.print( "\\message{argument = " .. x .. "}")
    end
    mess(#(#1#))
  #)%
}
Here, I've gone for using #( and #), i.e., a macro parameter character (catcode 6) followed by a parenthesis, to switch between the TeX interpreter and the Lua interpreter.
#(...) already has a meaning in Lua, so it's not good as the escape back into TeX. But I'd rather use something like

  #(
    function mess(x) ... end
    mess(tex.detokenize (tex.get_undelimited ()))
  #){#1}%
}

This is at least communicating with proper syntax and data structures.

What I proposed was that Lua code is read in line by line until no unfinished block remains, reverting back to TeX automatically after that. But of course, there are various possibilities for the actual syntax. The important thing is rather that source code in user-maintainable situations should belong either to the Lua lexer and tokenizer or the TeX lexer and tokenizer and not run through both. And that needs to be integrated at a rather low level.

It may even be possible to juggle something like that into the existing read callbacks. But without some basic mostly format-independent documented and promoted standard framework underlying the LuaTeX documentation (like the plain TeX format is a standard framework presented for iniTeX) I don't see that finding consistent and/or widespread use.

-- 
David Kastrup