How to extract tags present above a closing tag in an XML document?
I have an XML document with the following format:
<root>
<W> word </W>
<W> word2 </W>
<Some_random_tag>
<W> word 3 </W>
</Some_random_tag>
.
.
.
<X>
<W> </W>
<W> </W>
</X>
<W> word4 </W>
<W> word5 </W>
<Some_random_tag>
<W> word6 </W>
</Some_random_tag>
.
.
.
<X>
<W> </W>
<W> </W>
</X>
</root>
There are a bunch of tags above each </X>. They can appear in any order; that is, a <W> may be nested inside some other tag or sit directly under the root. I wish to extract all the text in the <W> tags that appear before the first </X> and put it in a list. Then the text in the <W> tags between the first </X> and the second </X> should be put in a new list.
How do I do so?
What I tried: I used the xml module. Since there is no particular place where a <W> may appear, I iterate over all the tags. However, this way I can't tell when an <X> element closes, which is a problem because there may be some <W> tags inside the <X> tag as well. Other than this method, I am out of ideas.
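For illustration, here is a minimal sketch of one way to see where an element closes. It assumes the xml module mentioned above is Python's standard-library xml.etree.ElementTree; the SAMPLE string and the trace_events helper are hypothetical names made up for this example. Asking iterparse for both "start" and "end" events yields an "end" event at each closing tag, so the position of </X> relative to the <W> tags becomes visible.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, shortened sample; not the asker's real document.
SAMPLE = "<root><W>a</W><X><W>b</W></X><W>c</W></root>"

def trace_events(xml_text):
    """Return (event, tag) pairs in the order the parser reports them."""
    source = io.BytesIO(xml_text.encode("utf-8"))
    return [(event, elem.tag)
            for event, elem in ET.iterparse(source, events=("start", "end"))]

events = trace_events(SAMPLE)
for event, tag in events:
    # An ("end", "X") pair marks the point where </X> was read.
    print(event, tag)
```

In this trace the ("end", "X") pair comes after the "end" event of the <W> nested inside <X> and before the "start" event of the next <W>, which is the ordering the question needs.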
EDIT: To make things clear:
<root>
<W> word </W>
<W> word2 </W>
<Some_random_tag>
<W> word 3 </W>
</Some_random_tag>
<X>
<W>alice </W>
<W>bob </W>
</X>
<W> word4 </W>
<W> word5 </W>
<Some_random_tag>
<W> word6 </W>
</Some_random_tag>
<X>
<W>one </W>
<W>two </W>
</X>
</root>
In the above example, I need word, word2, word 3, alice, bob in one list (all the text in the <W> tags above the first </X>), and word4, word5, word6, one, two in another list (all the text in the <W> tags between the first </X> and the second </X>).
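The grouping described here could be sketched as follows. This is only one possible approach, assuming Python's standard-library xml.etree.ElementTree; the function name group_w_text_by_x and its parameters are hypothetical. It relies on iterparse reporting "end" events in the order the closing tags appear, so every </W> adds text to the current list and every </X> closes that list. Stripping whitespace, skipping empty <W> elements, and keeping any words found after the last </X> as a final group are assumptions, not requirements stated in the question.

```python
import io
import xml.etree.ElementTree as ET

EXAMPLE = """<root>
<W> word </W>
<W> word2 </W>
<Some_random_tag>
<W> word 3 </W>
</Some_random_tag>
<X>
<W>alice </W>
<W>bob </W>
</X>
<W> word4 </W>
<W> word5 </W>
<Some_random_tag>
<W> word6 </W>
</Some_random_tag>
<X>
<W>one </W>
<W>two </W>
</X>
</root>"""

def group_w_text_by_x(xml_text, word_tag="W", boundary_tag="X"):
    """Collect <W> text into lists, starting a new list at each </X>."""
    groups = []
    current = []
    source = io.BytesIO(xml_text.encode("utf-8"))
    # "end" events arrive in the order the closing tags occur in the file.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == word_tag:
            text = (elem.text or "").strip()
            if text:  # assumption: empty <W> elements are ignored
                current.append(text)
        elif elem.tag == boundary_tag:
            groups.append(current)
            current = []
    if current:  # assumption: words after the last </X> form a final group
        groups.append(current)
    return groups

result = group_w_text_by_x(EXAMPLE)
print(result)
# [['word', 'word2', 'word 3', 'alice', 'bob'], ['word4', 'word5', 'word6', 'one', 'two']]
```

Because the <W> tags inside an <X> close before that <X> does, their text lands in the same list as the words preceding it, which matches the expected output in the example above.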