XSLT script to extract all motion and arrival verbs into a separate untokenized document
I need to aggregate all instances of motion and arrival verbs in a single place, without the tokenized text.
XSLT should:
- Search for a list of key words in the translations, found in `following-sibling::spanGrp[@type='annotations'][1]` and selected with `<xsl:for-each select="$annotations/span[@ana='#S' and @xml:lang='en']">`. Key words should include: ("come", "coming", "came", "comes", "go", "went", "goes", "going", "arrive", "arrives", "arrived", "got there", "got here", "went home", "go home", "come home", "came home", ...)
- For each sentence whose `following-sibling::spanGrp[@type='annotations'][1]` contains one of the key words:
  - copy at the `<span type="S">` level (remove spaces)
  - (one tab over) copy `span[@ana='#S' and @xml:lang='en']`
- Can manually refine the results in a spreadsheet
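A minimal sketch of what such a stylesheet might look like, as a starting point rather than a finished script. It makes several assumptions that are not stated in this issue: an XSLT 2.0 processor (for example Saxon) is available, the input elements are in the TEI namespace, each sentence is itself a `<span type="S">` whose next `spanGrp[@type='annotations']` sibling holds the English translation, and tab-separated text is an acceptable output format for the spreadsheet step. The variable names `keywords`, `pattern`, and `translation` are hypothetical.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only. Assumes XSLT 2.0 and TEI-namespaced input;
     drop xpath-default-namespace if the corpus has no namespace. -->
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xpath-default-namespace="http://www.tei-c.org/ns/1.0">

  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Key words taken from the issue; the list there is open-ended,
       so extend this sequence as needed. -->
  <xsl:variable name="keywords" select="(
      'come', 'coming', 'came', 'comes',
      'go', 'went', 'goes', 'going',
      'arrive', 'arrives', 'arrived',
      'got there', 'got here',
      'went home', 'go home', 'come home', 'came home')"/>

  <!-- XPath regular expressions have no \b, so whole-word matching is
       approximated with (^|\W) before and ($|\W) after the key word. -->
  <xsl:variable name="pattern"
      select="concat('(^|\W)(', string-join($keywords, '|'), ')($|\W)')"/>

  <xsl:template match="/">
    <!-- Assumption: every sentence is a span of type S. -->
    <xsl:for-each select="//span[@type='S']">
      <xsl:variable name="annotations"
          select="following-sibling::spanGrp[@type='annotations'][1]"/>
      <xsl:variable name="translation"
          select="normalize-space(string-join(
                    $annotations/span[@ana='#S' and @xml:lang='en'], ' '))"/>

      <!-- Keep the sentence only if its translation has a key word
           (case-insensitive). -->
      <xsl:if test="matches($translation, $pattern, 'i')">
        <!-- Column 1: sentence text with all whitespace removed. -->
        <xsl:value-of select="translate(normalize-space(.), ' ', '')"/>
        <!-- Column 2 (one tab over): the English translation. -->
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="$translation"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Because the output is one sentence per line with a tab between the two columns, it should open directly in a spreadsheet for manual refinement. Short key words such as "go" are matched as whole words only, so "ago" is not a hit, but false positives like "go on" would still need to be filtered by hand.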