adjective and adverbial use of VERBs
KoichiYasuoka opened this issue · 28 comments
In Classical Chinese, VERBs are often used as adjectives or adverbials. For example, the sentence "立天下之正位" has two VERBs, 立 and 正, where 立 is the root verb and 正 modifies 位 (NOUN). For another example, the sentence "必正立" also has two VERBs, 正 and 立, where 立 is the root verb and 正 modifies 立 (VERB). Now we annotate 正 as amod in "立天下之正位", and 正 as advmod in "必正立" as shown below. But validate.py says advmod to VERB is invalid and amod to VERB is valid.
validate.py says advmod to VERB is invalid and amod to VERB is valid
Instead of "says it is valid", I would interpret it as "does not say it is invalid" :-) The validator may not be smart enough to catch all errors, and, sadly, some errors just cannot be detected automatically (although the amod error probably could).
Anyways, a VERB constitutes a clause (even if just a one-word clause), hence I would use advcl instead of advmod, and acl instead of amod. The validator will accept it.
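The suggestion above can be sketched as a small check. This is only an illustration and not the logic of validate.py: the function name `suggest_clausal_relations` and the token dictionaries are hypothetical, and the tag and relation given to 必 are assumptions made for the example.

```python
# Hypothetical helper, not part of validate.py: it flags VERB tokens that are
# attached with a word-level modifier relation and names the clausal
# counterpart proposed in this thread (advmod -> advcl, amod -> acl).
CLAUSAL_COUNTERPART = {"advmod": "advcl", "amod": "acl"}


def suggest_clausal_relations(tokens):
    """tokens: list of dicts with the keys id, form, upos, head, deprel."""
    suggestions = []
    for tok in tokens:
        base = tok["deprel"].split(":")[0]  # ignore relation subtypes
        if tok["upos"] == "VERB" and base in CLAUSAL_COUNTERPART:
            suggestions.append(
                (tok["id"], tok["form"], tok["deprel"], CLAUSAL_COUNTERPART[base])
            )
    return suggestions


# 必正立 with 正 annotated as advmod, as described in the issue.
# The UPOS and deprel of 必 are assumptions for this example.
sentence = [
    {"id": 1, "form": "必", "upos": "ADV", "head": 3, "deprel": "advmod"},
    {"id": 2, "form": "正", "upos": "VERB", "head": 3, "deprel": "advmod"},
    {"id": 3, "form": "立", "upos": "VERB", "head": 0, "deprel": "root"},
]

print(suggest_clausal_relations(sentence))
```

Only 正 is reported here, because 必 is not tagged VERB in this sketch and 立 is the root.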
AUX does not have its own modifiers (with a few exceptions such as conj or fixed). In UD, the AUX+VERB set (meaning the main VERB which is modified by the AUX) is understood as one verbal element, and any modifiers of this element are attached to its head, that is, to the main VERB. Hence in the above example, you would have
advmod(事, 不)*
aux(事, 可)
advcl(事, 以)
- One of the possible exceptions is negation. So you can actually attach the first 不 directly to the auxiliary, and the validator should accept it if 不 has the feature Polarity=Neg.
fixed seems suitable for 可以 and Polarity=Neg for 不. Now I'm trying tentatively to eliminate invalid advmods.
If I'm not mistaken, deriving different parts of speech is quite common and flexible in Classical Chinese. Verbs are often used as nouns, nouns as verbs, both nouns and verbs as adjectives or adverbs, adjectives as verbs, etc. The problem is it's all zero derivation in Classical Chinese (at least as it is written), so syntax and context are the only clues for interpretation and disambiguation.
The root issue here I think is that if the word is tagged with its original POS -- rather than its derived POS -- then we can end up with weird deprel-POS links as seen above (e.g. VERB as advmod parent of another VERB). Although it may be possible to save the specific examples illustrated above by analyzing the verb as a one-word clause modifying another verb or noun as suggested by Dan (rather than as an adverb or adjective derived from a verb), I don't think that's possible with all the part-of-speech derivations.
For example, nouns can be used as transitive verbs. If in such a case it's still tagged with the original part of speech NOUN, then there is the dilemma of how to link it to its direct object, since an obj parent should normally have the tag VERB, not NOUN. I can't think of any noun-noun relation that would be a reasonable substitute for this case (unlike, perhaps, in the above advcl and acl cases).
So it seems like there are at least two options on the table?
- tag words according to their original/underived POS no matter what, and allowing some weird deprel-POS matches (e.g. VERB as advmod parent of another VERB, NOUN as obj parent of another NOUN)
- tag words according to their derived POS, then no weird matches between the deprel and the POS tags
There could also be separate layers of annotations, one for the original POS and one for the derived POS, so neither information is lost/hidden?
- tag words according to their original/underived POS no matter what, and allowing some weird deprel-POS matches (e.g. VERB as advmod parent of another VERB, NOUN as obj parent of another NOUN)
This option looks like our Classical Chinese Treebank, but the NOUN-NOUN phrase treatment differs. We never use obj between NOUNs; we use compound, nmod, nsubj (right-to-left) and flat, conj, compound:redup (left-to-right). If we use obj, its head word should be VERB. Please scrutinize our 孟子 here (16MB PDF).
Just reading UD annotation guidelines: "Simple Clauses"
A simple clause minimally consists of a predicate together with its core argument dependents, but may be extended with oblique modifiers.
Oops. According to this guideline, we cannot regard a single-word VERB as constituting a clause. It reminds me that we decided to use advmod for "必正立" (and amod for "立天下之正位") one year ago when we scrutinized the guidelines. Well, has the guideline changed since then?
This option looks like our Classical Chinese Treebank, but the NOUN-NOUN phrase treatment differs.
What I meant there is in reference to my example where when a word that is originally a noun is being used as verb, it would, according to your practice (I assume), still be tagged NOUN, even though it is lexically and syntactically functioning as a verb. This, I believe, might be the actual cause of the issue you reported.
Let me illustrate with an example (rough glossing/translation):
或時 睡顿 , 則 杖 之
sometimes fall.sleep , then hit.with.stick 3rd.person.pronoun
"Sometimes (he) fell asleep, then (he) would hit him with a stick."
The word 杖 is originally a noun meaning "stick(/rod/cane)", but here it has been derived as a verb meaning "to hit with a stick". Third person pronoun 之 is its direct object. If 杖 is tagged as NOUN, then we would end up with 杖 NOUN --obj--> 之 PRON, which makes no sense because one would expect the head of obj to be a VERB.
On the other hand, if your treebank would tag cases like 杖 in the above example as a VERB (in which case 杖 VERB --obj--> 之 PRON is good), then is there a reason why in your examples, the 正 in 立天下之正位 cannot be tagged as ADJ and 正 in 必正立 cannot be tagged as ADV to reflect their actual lexical/syntactic usage in those phrases?
A clause can definitely consist of a single verb in the case where there are no arguments overtly expressed. Here is one example from English:
He sat down smiling.
Here "smiling" is annotated as "acl", because it expresses a secondary predication with the implicit subject being co-referential with the subject of the main clause. In languages that allow pro-drop there are many more possibilities of having single-word clauses.
Well, then, is "smiling" in "He sat down with smiling face" also acl? It seems to me that "-ing" examples are not suitable for discussing isolating languages, because "smiling" is sometimes "smile" + "-ing" in such languages...
What I meant there is in reference to my example where when a word that is originally a noun is being used as verb, it would, according to your practice (I assume), still be tagged NOUN, even though it is lexically and syntactically functioning as a verb.
Sorry, it's your misunderstanding. "Please scrutinize our 孟子 here (16MB PDF)" again. You can find, especially in pp.449-519, several words are treated as NOUN and VERB. In addition, you cannot find any ADJ there.
In English, the "-ing" forms are sometimes tagged ADJ and attached as amod, as can be seen in this query: http://hdl.handle.net/11346/PMLTQ-WFN7
If 正 is really a verb in 必正立, then I would expect that it can be understood as describing an action where somebody or something (unknown) corrected the standing. If it is not possible to see the action of correcting (or state of being corrected) there, then I think 正 is an adverb derived from the verb 正.
"不可以不事親"
If 以 means "use" here, what does the whole clause mean???
Edwin G. Pulleyblank: Outline of Classical Chinese Grammar, UBC Press, p.23:
Only transitive verbs may follow 可 "possible" directly, in which case they must be understood as passive; that is, the subject of 可 is the object (or patient) of the verb --- 人可殺 "the man is possible to kill" = "the man may be killed". A transitive verb in an active sense, or an intransitive verb requires 可以, rather than 可 alone --- 王可以殺人 "it is possible for the king to kill a man" or "the king can kill a man"; 王可以來 "it is possible for the king to come" or "the king can come". In this construction 以, which as a verb means "use" and as a coverb (or preposition) is used for the instrument, fills the role of passive transitive verb complement to 可. That is, the meaning of instrument is extended to include agency: "the king may be used to" → "the king may be the agent to".
So can I say 以 here is an active voice marker, rather than a verb meaning "use"?
So can I say 以 here is an active marker?
I cannot answer precisely, since I'm not sure what you mean by the word "marker". If you mean mark as the deprel, it seems a better choice than advmod in 可以. But, from a practical point of view, we have already finished changing them into fixed (as suggested here). Please see this commit.
OK. Anyways, it has lost all the properties of verb methinks... 以 is more like an adverb here... if it is used to modify 可.
Now I've tentatively tried to introduce VerbForm=Conv for VERBs with advmod (transgressive use of VERBs which do not have objects or subjects). Please see this commit. We understand it is a bad choice for isolating languages, but it is a better choice for us than changing their POS into ADV...
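A change like this could be scripted roughly as follows. This is a minimal sketch under stated assumptions, not the code from the commit mentioned above: `add_feature` and `mark_converbs` are hypothetical names, tokens are plain dictionaries, and features are re-sorted alphabetically (case-insensitively) as the CoNLL-U format expects.

```python
def add_feature(feats, name, value):
    """Return a FEATS string with name=value added (or replaced).

    An empty FEATS column is written as "_" in CoNLL-U.
    """
    pairs = {} if feats == "_" else dict(item.split("=", 1) for item in feats.split("|"))
    pairs[name] = value
    # Keep the features in case-insensitive alphabetical order.
    return "|".join(f"{key}={pairs[key]}" for key in sorted(pairs, key=str.lower))


def mark_converbs(tokens):
    """Add VerbForm=Conv to every VERB token attached as advmod (hypothetical rule)."""
    for tok in tokens:
        if tok["upos"] == "VERB" and tok["deprel"] == "advmod":
            tok["feats"] = add_feature(tok["feats"], "VerbForm", "Conv")
    return tokens


# 正 in 必正立, annotated as a VERB with advmod as described in this thread.
tokens = mark_converbs([
    {"form": "正", "upos": "VERB", "feats": "_", "deprel": "advmod"},
    {"form": "立", "upos": "VERB", "feats": "_", "deprel": "root"},
])
print([tok["feats"] for tok in tokens])
```

Only the token attached as advmod receives the feature; the root verb keeps an empty FEATS column.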
One of the possible exceptions is negation. So you can actually attach the first 不 directly to the auxiliary, and the validator should accept it if 不 has the feature Polarity=Neg.
In May of 2019 the validator accepted the case above, but now the validator for UD 2.5 does not accept the negation of aux. We added Polarity=Neg for all 不 in May, so how should we deal with the new validator?
The current validator is supposed to be almost the same as in May; changes in source code involve mostly adding a unique id to each test. (But in the beginning of May, the online validation page temporarily did not show all errors reported by the validator; so you would have to compare the current results either to the actual validation script in May, or to the online validation report in April.)
I am now looking at the source code of the test. It actually checks several conditions to establish that the child node is a negation particle: besides the Polarity=Neg feature, it also assumes that the tag will be PART, following the description of PART and the v2 guidelines explanation.
In general, the PART tag should be used restrictively and only when no other tag is possible.
Umm... The negation words of Classical Chinese, 不, 未, 弗, 非 and so on, do not satisfy this restriction, since they were originally verbs (in the ancient era) and were later used as adverbs to negate other verbs (including, later, auxiliary verbs). So they were sometimes used alone. Well, I can change their tags into PART with my script, but does that follow the guideline?
Perhaps it will help if I extend the condition to also accept ADV?
Perhaps it will help if I extend the condition to also accept ADV?
Yes, in the negation of AUX, if the new validator accepts ADV with Polarity=Neg, our problem in Classical Chinese validation will be resolved.
# text = 不肖者不得不及
1 不 不 ADV v,副詞,否定,無界 Polarity=Neg 2 advmod _ Gloss=not|SpaceAfter=No
2 肖 肖 VERB v,動詞,変化,性質 _ 3 acl _ Gloss=resemble|SpaceAfter=No
3 者 者 PART p,助詞,提示,* _ 7 nsubj _ Gloss=that-which|SpaceAfter=No
4 不 不 ADV v,副詞,否定,無界 Polarity=Neg 5 advmod _ Gloss=not|SpaceAfter=No
5 得 得 AUX v,助動詞,可能,* Mood=Pot 7 aux _ Gloss=must|SpaceAfter=No
6 不 不 ADV v,副詞,否定,無界 Polarity=Neg 7 advmod _ Gloss=not|SpaceAfter=No
7 及 及 VERB v,動詞,行為,移動 _ 0 root _ Gloss=reach|SpaceAfter=No
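The condition discussed above can be sketched against this sentence. This is not the validator's source code: `parse_rows` and `aux_child_problems` are hypothetical names, the allowed relations (conj, fixed) and the Polarity=Neg plus PART/ADV condition are taken from this thread only, and splitting columns on whitespace is a simplification (real CoNLL-U columns are tab-separated).

```python
CONLLU = """\
# text = 不肖者不得不及
1 不 不 ADV v,副詞,否定,無界 Polarity=Neg 2 advmod _ Gloss=not|SpaceAfter=No
2 肖 肖 VERB v,動詞,変化,性質 _ 3 acl _ Gloss=resemble|SpaceAfter=No
3 者 者 PART p,助詞,提示,* _ 7 nsubj _ Gloss=that-which|SpaceAfter=No
4 不 不 ADV v,副詞,否定,無界 Polarity=Neg 5 advmod _ Gloss=not|SpaceAfter=No
5 得 得 AUX v,助動詞,可能,* Mood=Pot 7 aux _ Gloss=must|SpaceAfter=No
6 不 不 ADV v,副詞,否定,無界 Polarity=Neg 7 advmod _ Gloss=not|SpaceAfter=No
7 及 及 VERB v,動詞,行為,移動 _ 0 root _ Gloss=reach|SpaceAfter=No
"""


def parse_rows(text):
    """Read the ten columns of each token line (whitespace split is a simplification)."""
    rows = []
    for line in text.strip().splitlines():
        if line.startswith("#"):
            continue  # skip sentence-level comments
        cols = line.split()
        rows.append({"id": int(cols[0]), "form": cols[1], "upos": cols[3],
                     "feats": cols[5], "head": int(cols[6]), "deprel": cols[7]})
    return rows


def aux_child_problems(rows, negation_upos=("PART", "ADV")):
    """Report children of AUX that are neither conj/fixed nor a negation word."""
    upos_by_id = {row["id"]: row["upos"] for row in rows}
    problems = []
    for row in rows:
        if upos_by_id.get(row["head"]) != "AUX":
            continue  # only children of auxiliaries are checked
        if row["deprel"].split(":")[0] in {"conj", "fixed"}:
            continue
        is_negation = ("Polarity=Neg" in row["feats"].split("|")
                       and row["upos"] in negation_upos)
        if not is_negation:
            problems.append((row["id"], row["form"], row["deprel"]))
    return problems


rows = parse_rows(CONLLU)
print(aux_child_problems(rows))                           # PART or ADV accepted
print(aux_child_problems(rows, negation_upos=("PART",)))  # PART only
```

With ADV accepted, token 4 (不 attached to the AUX 得) passes; when only PART is accepted, the same token is reported.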
Done in UniversalDependencies/tools@0b142d4. A few of those errors are still there but most are gone.
Thank you very much, @dan-zeman , I've just fixed five bugs of aux. Thank you again, and please help me about the problem of orphan.
and please help me about the problem of orphan.
Something weird is happening on Github today. Some issues freeze instead of opening. And I wrote an answer in that orphan issue a couple of hours ago but I don't see it there now. Stay tuned, I will write it again.