XML processing beginner's question
Hello list,

having the xml data at the bottom, I would like to process it so that the result is like this:

---
What it is (e.g. bold formatted)
date: 2023-08-01 (italic)
Description (small font size)
Another text (small font size)

hd1 - Header 1

§ 1 First (A first short description)
AAAAAAAAAA
BBBBBBBBBB

§ 2 Second (A second short description)
CCCCCCCCCC
DDDDDDDDDD

§ 3 Third (A third short description)
EEEEEEEEEE
FFFFFFFFFF
---

How can I process the <element>s differently? The first element contains a <date> tag and so it differs from the other ones. The second element's <name> tag contains the word "Header" which makes it different again. The other elements contain a <shortdescription> tag that they all have in common.

What could be the appropriate xml setups to generate the above output?

Michael

---
xml data:

\startbuffer[xmlcontent]
<?xml version="1.0" encoding="UTF-8" ?>
<document>
  <element>
    <mdata>
      <name>What it is</name>
      <date>2023-08-01</date>
    </mdata>
    <tdata>
      <content>
        <p>Description</p>
        <p>Another text</p>
      </content>
    </tdata>
  </element>
  <element>
    <mdata>
      <num>hd1</num>
      <name>Header 1</name>
    </mdata>
    <tdata>
      <content>
        <p>Text of Header 1</p>
      </content>
    </tdata>
  </element>
  <element>
    <mdata>
      <num>1</num>
      <name>First</name>
      <shortdescription>A first short description</shortdescription>
    </mdata>
    <tdata>
      <content>
        <p>AAAAAAAAAA</p>
        <p>BBBBBBBBBB</p>
      </content>
    </tdata>
  </element>
  <element>
    <mdata>
      <num>2</num>
      <name>Second</name>
      <shortdescription>A second short description</shortdescription>
    </mdata>
    <tdata>
      <content>
        <p>CCCCCCCCCC</p>
        <p>DDDDDDDDDD</p>
      </content>
    </tdata>
  </element>
  <element>
    <mdata>
      <num>3</num>
      <name>Third</name>
      <shortdescription>A third short description</shortdescription>
    </mdata>
    <tdata>
      <content>
        <p>EEEEEEEEEE</p>
        <p>FFFFFFFFFF</p>
      </content>
    </tdata>
  </element>
</document>
\stopbuffer
Have you looked at chapter 3.10 "Testing" of the manual xml-mkiv.pdf? There are a lot of commands there that should help you, such as

\xmldoiftext {#1} {/mdata/date} {\bf \xmlflush {#1}}

or \xmldoifelsetext.

There's also \xmlfilter, which you can use to test for the content of tags. And of course, you can process in Lua and search for strings or use lpeg. However, your question is a bit vague now. Show us some code you have and we can take it from there; that's easier than writing the whole setup for you.

Thomas

On 8/21/23 17:29, Michael Löscher wrote:
How can I process the <element>s differently? The first element contains a <date> tag and so it differs from the other ones. The second element's <name> tag contains the word "Header" which makes it different again. The other elements contain a <shortdescription> tag that they all have in common.
What could be the appropriate xml setups to generate the above output?
Yes, I have done that. But I don't seem to have the basic context of how the processing works in order. All I have so far is this as a starting point:

\startxmlsetups xml:mysetup
  \xmlsetsetup{main}{document|element|mdata|tdata|name|date|num|content|shortdescription|p}{xml:*}
\stopxmlsetups

\xmlregistersetup{xml:mysetup}

\startxmlsetups xml:mysetup:document
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:mysetup:element
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:mysetup:mdata
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:mysetup:tdata
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:mysetup:name
  \xmlflush
\stopxmlsetups

\startxmlsetups xml:mysetup:num
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:mysetup:content
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:mysetup:shortdescription
  \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:mysetup:p
  \xmlflush{#1}\par
\stopxmlsetups

\starttext
\xmlprocessbuffer {mysetup}{xmlcontent}{}
\stoptext

On 21.08.2023 at 17:45, Thomas A. Schmitz wrote:
Have you looked at chapter 3.10 "Testing" of the manual xml-mkiv.pdf? There are a lot of commands there that should help you, such as
\xmldoiftext {#1} {/mdata/date} {\bf \xmlflush {#1}}
or \xmldoifelsetext.
There's also \xmlfilter, which you can use to test for the content of tags. And of course, you can process in Lua and search for strings or use lpeg. However, your question is a bit vague now. Show us some code you have and we can take it from there; that's easier than writing the whole setup for you.
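An editorial note on the starting point above, offered as a hedged sketch rather than as anything stated in the thread: as far as I understand the mechanism, `xml:*` in \xmlsetsetup resolves to `xml:` followed by the element name, so the setups would need names such as xml:document rather than xml:mysetup:document. The id given to \xmlprocessbuffer (mysetup) also differs from the id the mapping is attached to (main), and the name setup calls \xmlflush without its argument. The example below assumes a shortened buffer with the arbitrary name demo, passes #1 to \xmlsetsetup so that the mapping follows whichever document is loaded, and leaves mdata, tdata and content unmapped on the assumption that unmapped elements are simply flushed.

```tex
% Editorial sketch, not from the thread. The buffer name "demo" is arbitrary.
\startbuffer[demo]
<document>
  <element>
    <mdata><name>What it is</name><date>2023-08-01</date></mdata>
    <tdata><content><p>Description</p><p>Another text</p></content></tdata>
  </element>
</document>
\stopbuffer

\startxmlsetups xml:mysetup
    % #1 is the id of the document being loaded ("main" below);
    % xml:* resolves to xml:document, xml:element, xml:name, ...
    \xmlsetsetup{#1}{document|element|name|date|p}{xml:*}
\stopxmlsetups

\xmlregistersetup{xml:mysetup}

\startxmlsetups xml:document
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:element
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:name
    {\bf \xmlflush{#1}}\par
\stopxmlsetups

\startxmlsetups xml:date
    {\it \xmlflush{#1}}\par
\stopxmlsetups

\startxmlsetups xml:p
    \xmlflush{#1}\par
\stopxmlsetups

\starttext
    % same id as the one the setups end up attached to via #1
    \xmlprocessbuffer{main}{demo}{}
\stoptext
```

This only gets the plumbing right; it does not yet treat the three kinds of <element> differently.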
On 8/21/23 17:59, Michael Löscher wrote:
Yes, I have done that. But I don't seem to have the basic context of how the processing works in order. All I have so far is this as a starting point:
Really? I told you about the various commands \xmldoif, but there's nothing in your starting point. I don't want to provide anybody homework solutions, so just to give you an idea to get you started:

\startxmlsetups xml:mysetup
  \xmlsetsetup{main}{document|element|mdata|tdata|name|date|num|content|shortdescription|p}{xml:*}
\stopxmlsetups

\xmlregistersetup{xml:mysetup}

\startxmlsetups xml:document
  \xmlflush {#1}
\stopxmlsetups

\startxmlsetups xml:element
  \xmlflush {#1}
\stopxmlsetups

\startxmlsetups xml:mdata
  \xmldoifelsetext {#1} {/date}
    {{\bf \xmltext {#1} {name}}\par
     {\it \xmltext {#1} {date}}\par}
    {\xmltext {#1} {content}\par}
\stopxmlsetups

This will process the name in bold and the date in italic. But I'm sure you can do better after reading and digesting the chapter I referred to.

Thomas
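Building on the idea above, here is one way the three kinds of <element> from the original question might be told apart. It is an editorial sketch, not Thomas's solution, and it rests on assumptions: the tests are done in the element setup rather than in mdata, an element with neither <date> nor <shortdescription> is treated as the header (instead of searching <name> for the word "Header"), the buffer is a shortened copy with the arbitrary name demo, and the font switches (\bf, \it, \tfx) are only stand-ins for the requested formatting.

```tex
% Editorial sketch, not from the thread. Buffer name and styling are assumptions.
\startbuffer[demo]
<document>
  <element>
    <mdata><name>What it is</name><date>2023-08-01</date></mdata>
    <tdata><content><p>Description</p><p>Another text</p></content></tdata>
  </element>
  <element>
    <mdata><num>hd1</num><name>Header 1</name></mdata>
    <tdata><content><p>Text of Header 1</p></content></tdata>
  </element>
  <element>
    <mdata><num>1</num><name>First</name>
      <shortdescription>A first short description</shortdescription></mdata>
    <tdata><content><p>AAAAAAAAAA</p><p>BBBBBBBBBB</p></content></tdata>
  </element>
</document>
\stopbuffer

\startxmlsetups xml:mysetup
    \xmlsetsetup{#1}{document|element|p}{xml:*}
\stopxmlsetups

\xmlregistersetup{xml:mysetup}

\startxmlsetups xml:document
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:element
    \xmldoifelsetext{#1}{/mdata/date}{
        % case 1: the element that carries a date
        {\bf \xmltext{#1}{/mdata/name}}\par
        {\it date: \xmltext{#1}{/mdata/date}}\par
        {\tfx \xmlfirst{#1}{/tdata/content}}
    }{
        \xmldoifelsetext{#1}{/mdata/shortdescription}{
            % case 3: numbered entries with a short description
            § \xmltext{#1}{/mdata/num}\space
            \xmltext{#1}{/mdata/name}\space
            (\xmltext{#1}{/mdata/shortdescription})\par
            \xmlfirst{#1}{/tdata/content}
        }{
            % case 2 (assumed): everything else is a header line
            \xmltext{#1}{/mdata/num}\space -\space \xmltext{#1}{/mdata/name}\par
        }
    }
    \blank
\stopxmlsetups

\startxmlsetups xml:p
    \xmlflush{#1}\par
\stopxmlsetups

\starttext
    \xmlprocessbuffer{main}{demo}{}
\stoptext
```

Doing the tests at the element level keeps name, num and the content paragraphs reachable from one place, which seems convenient here because the wanted output mixes fields from mdata and tdata.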
-----Original Message-----
From: Thomas A. Schmitz
Sent: Monday, 21 August 2023 18:20
To: mailing list for ConTeXt users
Subject: [NTG-context] Re: XML processing beginner's question
Just to add to this: you can also take a more XSLT-like approach and test directly in the match pattern:

\startxmlsetups xml:mysetup
  \xmlsetsetup{main}{document}{xml:*}
  \xmlsetsetup{main}{element[@class="myclass"]}{xml:element-with-attribute}
  \xmlsetsetup{main}{element[./subelement-one]}{xml:element-with-subelement-one}
  \xmlsetsetup{main}{element[./subelement-two]}{xml:element-with-subelement-two}
\stopxmlsetups

\xmlregistersetup{xml:mysetup}

\startxmlsetups xml:document
  \xmlflush {#1}
\stopxmlsetups

\startxmlsetups xml:element-with-attribute
  0
\stopxmlsetups

\startxmlsetups xml:element-with-subelement-one
  1
\stopxmlsetups

\startxmlsetups xml:element-with-subelement-two
  2
\stopxmlsetups

But I think the way this is processed differs a bit from XSLT. In XSLT the most specific match is applied, but ConTeXt seems to proceed from top to bottom until it finds a match. (Is that correct?)

Best,
Denis
On 8/22/2023 9:06 AM, denis.maier@unibe.ch wrote:
But, I think the way this is processed differs a bit from XSLT. In XSLT the most specific match will be applied, but ConTeXt seems to proceed from top to bottom until it finds a match. (Is that correct?)

it just associates the most recent match with a setup. if needed we could extend the mechanism with variants, but i have to admit that it has been stable (mostly untouched) for close to 15 years now, so all has to be done very carefully; it has also been tuned for performance and large scale throughput
Hans ----------------------------------------------------------------- Hans Hagen | PRAGMA ADE Ridderstraat 27 | 8061 GH Hasselt | The Netherlands tel: 038 477 53 69 | www.pragma-ade.nl | www.pragma-pod.nl -----------------------------------------------------------------
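If the most recent match is the one that sticks, as described above, then a general rule can be given first and a more specific one afterwards. The following is an editorial sketch under that reading, not something posted in the thread: the class attribute, the buffer name demo and the setup names are made up for illustration, and the attribute predicate follows the form in Denis's example (written here with single quotes).

```tex
% Editorial sketch, not from the thread. Attribute, buffer and setup names are hypothetical.
\startbuffer[demo]
<document>
  <element>plain</element>
  <element class="myclass">special</element>
</document>
\stopbuffer

\startxmlsetups xml:demo
    \xmlsetsetup{#1}{document}{xml:*}
    % general rule first ...
    \xmlsetsetup{#1}{element}{xml:element:general}
    % ... specific rule last, so it is the most recent match for those nodes
    \xmlsetsetup{#1}{element[@class='myclass']}{xml:element:special}
\stopxmlsetups

\xmlregistersetup{xml:demo}

\startxmlsetups xml:document
    \xmlflush{#1}
\stopxmlsetups

\startxmlsetups xml:element:general
    general: \xmlflush{#1}\par
\stopxmlsetups

\startxmlsetups xml:element:special
    {\bf special: \xmlflush{#1}}\par
\stopxmlsetups

\starttext
    \xmlprocessbuffer{main}{demo}{}
\stoptext
```

Swapping the two element lines would be a quick way to check the behaviour: under the reading above, the general rule would then be the most recent match for every element, including the one with the attribute.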
On 8/21/2023 5:59 PM, Michael Löscher wrote:
Yes, I have done that. But I don't seem to have the basic context of how the processing works in order. All I have so far is this as a starting point:

you can also find examples in the test suite (xml subpath)
Hans Hagen
participants (5)
- denis.maier@unibe.ch
- Hans Hagen
- Hans Hagen
- Michael Löscher
- Thomas A. Schmitz