Heiko,

\pdfescapestring{#}

returns a doubled hash of category 12. Is that ok? I mean, if it were two hashes of category 6 it wouldn't be that strange, but all the characters returned by \pdfescape* are \other. So I think the result of \pdfescapestring{#} should be just one hash of cat 12?

pdftex-1.40.0-beta-20060213

--
Pawe/l Jackowski
P.Jackowski@gust.org.pl
On Thu, Feb 16, 2006 at 08:09:29AM +0100, Paweł Jackowski wrote:
Heiko,
\pdfescapestring{#}
returns a doubled hash of category 12. Is that ok? I mean, if it were two hashes of category 6 it wouldn't be that strange, but all the characters returned by \pdfescape* are \other. So I think the result of \pdfescapestring{#} should be just one hash of cat 12?
It's correct. \pdfescape* calls the classic routine
"tokens_to_string" to get a string representation of the tokens.
Then the escape rules are applied.
During tokens_to_string, TeX doubles # and
adds a space after macro names.
"tokens_to_string" calls "show_token_list", which is defined by original TeX,
and is itself called by "print_pdf_toks", which many primitives use.
This is the same behaviour as \message, \detokenize, ...
Example for illustration:
\nopagenumbers
\tt
\message{[#|\relax|\string#]}
\pdfescapestring{[#|\relax|\string#]}
\begingroup
\escapechar=-1
\message{[#|\relax]}
\pdfescapestring{[#|\relax|\string#]}
\endgroup
\detokenize{[#|\relax|\string#]}
\begingroup
\catcode`/=0
\catcode`\\=12
/message{[\relax]}
/pdfescapestring{[\relax]}
/detokenize{[\relax]}
/endgroup
\end
Yours sincerely
Heiko
On Thu, Feb 16, 2006 at 09:43:35AM +0100, Heiko Oberdiek wrote:
Example for illustration:
Another example that shows the purpose of the \pdfescape* primitives:
\begingroup
\obeylines
\def^^M{^^J}%
\obeyspaces
\immediate\pdfobj{<<
/Name (String)
/Hello#World (Hello#World)
/abc/\relax0 (abc/\relax0)
The result is invalid PDF. With the \string\pdfescape* functions we get valid PDF name and string objects:
/\pdfescapename{Name} (\pdfescapestring{String})
/\pdfescapename{Hello#World} (\pdfescapestring{Hello#World})
/\pdfescapename{abc/\relax_x} (\pdfescapestring{abc/\relax_x})
}%
\endgroup
\nopagenumbers
\null
\end
Yours sincerely
Heiko
Pawel:
returns a doubled hash of category 12. Is that ok? I mean, if it were two hashes of category 6 it wouldn't be that strange, but all the characters returned by \pdfescape* are \other. So I think the result of \pdfescapestring{#} should be just one hash of cat 12?
Heiko:
It's correct. \pdfescape* calls the classic routine "tokens_to_string" to get a string representation of the tokens. Then the escape rules are applied.
During tokens_to_string, TeX doubles # and adds a space after macro names.
"tokens_to_string" calls "show_token_list", which is defined by original TeX, and is itself called by "print_pdf_toks", which many primitives use.
Same behaviour as \message, \detokenize, ...
It convinces me that all of those commands should work consistently, and I won't insist on changing anything -)

Yes, I know it complies with all the character-catcode-6 rules, but I was just wondering whether there is any benefit to that behaviour in the case of \pdfescape*. I mean

\message{#} -> ## (writing)
\def\hash{##} -> # (reading)

is TeXish and symmetric. But

\pdfescapename{#} -> #23#23
\pdfescapehex{#} -> 2323
\pdfescapestring{#} -> ##

is just a consequence of the hash rules, and the doubled hash is always redundant. And of course a doubled hash of category 12 remains a doubled hash

\pdfunescapehex{#23#23} -> ##

unless it is passed through \scantokens or something...

So the fact that the \pdfescape* primitives double a hash is not useful. In practice, you need to use \string# to avoid the doubling, and I couldn't find a realistic example where one would need the hashes doubled. But still, it is good that all those commands are consistent.

Best,
--
Pawe/l Jackowski
P.Jackowski@gust.org.pl
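The \string# workaround mentioned above can be tried with a minimal plain pdfTeX sketch like the following. The macro names \doubled and \single are hypothetical, chosen only for illustration, and the file is assumed to be run with a pdfTeX binary that provides the \pdfescape* primitives. The results are written to the terminal and the log with \message, so no expected output is claimed here beyond what the thread itself reports.

```tex
% Minimal sketch (assumption: a pdfTeX that has the \pdfescape* primitives).
% \doubled and \single are hypothetical names used only for this illustration.

% A category-6 hash goes through tokens_to_string; the thread reports
% that it comes out as two hashes of category 12.
\edef\doubled{\pdfescapestring{#}}

% \string# is expanded first and yields a category-12 hash,
% so tokens_to_string has no macro parameter character to double.
\edef\single{\pdfescapestring{\string#}}

% Print both results; \meaning avoids any further doubling by \message,
% because it produces characters of category 12 (and spaces) only.
\message{[doubled: \meaning\doubled]}
\message{[single: \meaning\single]}

% The same workaround applied to the name and hex variants.
\message{[name: \pdfescapename{\string#}]}
\message{[hex: \pdfescapehex{\string#}]}

\end
```

Comparing the two \meaning lines in the log shows whether the doubling is present for the plain # and absent for \string#.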
Paweł Jackowski wrote:
Pawel:
returns a doubled hash of category 12. Is that ok? I mean, if it were two hashes of category 6 it wouldn't be that strange, but all the characters returned by \pdfescape* are \other. So I think the result of \pdfescapestring{#} should be just one hash of cat 12?
Heiko:
It's correct. \pdfescape* calls the classic routine "tokens_to_string" to get a string representation of the tokens. Then the escape rules are applied.
During tokens_to_string, TeX doubles # and adds a space after macro names.
"tokens_to_string" calls "show_token_list", which is defined by original TeX, and is itself called by "print_pdf_toks", which many primitives use.
Same behaviour as \message, \detokenize, ...
It convinces me that all of those commands should work consistently, and I won't insist on changing anything -)
Yes, I know it complies with all the character-catcode-6 rules, but I was just wondering whether there is any benefit to that behaviour in the case of \pdfescape*. I mean
\message{#} -> ## (writing)
\def\hash{##} -> # (reading)
is TeXish and symmetric. But
\pdfescapename{#} -> #23#23
\pdfescapehex{#} -> 2323
\pdfescapestring{#} -> ##
is just a consequence of the hash rules, and the doubled hash is always redundant. And of course a doubled hash of category 12 remains a doubled hash
\pdfunescapehex{#23#23} -> ##
unless it is passed through \scantokens or something...
So the fact that the \pdfescape* primitives double a hash is not useful. In practice, you need to use \string# to avoid the doubling, and I couldn't find a realistic example where one would need the hashes doubled. But still, it is good that all those commands are consistent.
in general i think that it would be handy to have a way to disable the hash duplication (also in the other situations), because now one sometimes ends up parsing them away; something like \pdfduplicatesixcode [0|1] or so; getting rid of the space after the \cs is more tricky, since the info about a space originally being there is lost at that stage.

Hans

-----------------------------------------------------------------
Hans Hagen | PRAGMA ADE
Ridderstraat 27 | 8061 GH Hasselt | The Netherlands
tel: 038 477 53 69 | fax: 038 477 53 74 | www.pragma-ade.com
| www.pragma-pod.nl
-----------------------------------------------------------------
participants (3)
- Hans Hagen
- Heiko Oberdiek
- Paweł Jackowski