UniversalDependencies/tools

More than one copula.

Closed this issue · 11 comments

We want to register more than one copula lemma.
Japanese has a normal copula and an honorific copula.

We also have contracted forms of the copula in speech corpora.

We hope the restriction on copula words can be relaxed for Japanese and other morphologically rich languages.
http://quest.ms.mff.cuni.cz/udvalidator/cgi-bin/unidep/langspec/specify_auxiliary.pl?lcode=ja

I agree with this suggestion. However, although these elements can be traced back to verbs in older attestations, where the copular construction could be attached via xcomp, modern usage in at least some languages diverges considerably and does not fit that analysis well. I am not talking about a hypothetical situation but about an existing analysis that treats a postposition as a token of its own (see i- and -dur: the annotations give the latter the lemma i-, which is wrong). To improve the situation for languages that are typologically similar to Japanese, we need a way to express these modern surface forms of copular constructions consistently.

I think that the opposition of normal vs. honorific copula can be understood as a deficient paradigm, which is an exception to the general one-copula-lemma rule. It is described in the guidelines and also in the introductory paragraphs on the specify_auxiliary page. However, for each of the copula lemmas, one must fill out the field "Deficient" and describe which part of the deficient paradigm the lemma covers. I filled in this field for the current copula だ (saying that it is the normal, non-honorific copula); once the field is filled, the system allows adding other copulas.
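To make the rule above concrete, here is a minimal Python sketch of it. This is not the code behind specify_auxiliary.pl; the class `CopulaRegistry`, its methods, and the placeholder lemma `HONORIFIC_COPULA` are all hypothetical, and the sketch assumes only what is stated above: a further copula can be added once every registered copula has its "Deficient" field filled.

```python
class CopulaRegistry:
    """Toy model (hypothetical) of the one-copula rule with the 'Deficient' exception."""

    def __init__(self):
        # Maps copula lemma -> text of its "Deficient" field ("" if unfilled).
        self._deficient = {}

    def add(self, lemma, deficient=""):
        if self._deficient:
            # A second copula is only allowed when every copula, existing
            # or new, says which part of the deficient paradigm it covers.
            if not all(self._deficient.values()):
                raise ValueError("fill 'Deficient' for existing copulas first")
            if not deficient:
                raise ValueError("new copula needs a 'Deficient' description")
        self._deficient[lemma] = deficient

    def describe(self, lemma, deficient):
        # Fill in (or update) the "Deficient" field of a registered copula.
        self._deficient[lemma] = deficient

    def lemmas(self):
        return sorted(self._deficient)


registry = CopulaRegistry()
registry.add("だ")  # first copula: no description required in this model
registry.describe("だ", "normal, non-honorific copula")
# HONORIFIC_COPULA is a placeholder, not an actual lemma from the thread.
registry.add("HONORIFIC_COPULA", "honorific copula")
```

In this model, calling `add` for a second lemma before `describe` has been called for だ raises `ValueError`, which mirrors the interface refusing further copulas until the field is filled.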

Could you advise how to deal with contracted forms of the copula in a speech corpus?

I don't know exactly what the nature of the contracted copula is. But if it is a contraction, then I suppose it can be linked to the full, uncontracted form. In that case it should get the lemma of the full form, and no additional entry in the system is needed (because the validator checks the lemma, not the surface form).
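The point that only the lemma matters can be illustrated with a small Python sketch. This is not the validator's actual code; the function name `find_copula_violations`, the allow-list `ALLOWED_COPULA_LEMMAS`, and the sample rows (including the placeholder form `CONTRACTED`) are assumptions made up for illustration. It reads CoNLL-U token lines and reports tokens attached as `cop` whose lemma is not in the allowed set, ignoring the surface form.

```python
# Hypothetical allow-list; the real list is maintained via specify_auxiliary.pl.
ALLOWED_COPULA_LEMMAS = {"だ"}


def find_copula_violations(conllu_text, allowed=ALLOWED_COPULA_LEMMAS):
    """Return (id, form, lemma) for each 'cop' token whose lemma is not allowed."""
    violations = []
    for line in conllu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if len(cols) != 10:
            continue  # not a well-formed CoNLL-U token line
        token_id, form, lemma, deprel = cols[0], cols[1], cols[2], cols[7]
        if "-" in token_id or "." in token_id:
            continue  # skip multiword-token ranges and empty nodes
        # Compare the universal part of the relation (before any subtype).
        if deprel.split(":")[0] == "cop" and lemma not in allowed:
            violations.append((token_id, form, lemma))
    return violations


# Invented rows: a contracted surface form that carries the full form's lemma.
sample = "\n".join([
    "\t".join(["1", "学生", "学生", "NOUN", "_", "_", "0", "root", "_", "_"]),
    "\t".join(["2", "CONTRACTED", "だ", "AUX", "_", "_", "1", "cop", "_", "_"]),
])
print(find_copula_violations(sample))  # prints []
```

Because the check looks only at the LEMMA column, the contracted token passes as long as it is lemmatized to the full form, so no extra entry is needed for it.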

Is there a way to remove lemmas from the auxiliary list via http://quest.ms.mff.cuni.cz/udvalidator/cgi-bin/unidep/langspec/specify_auxiliary.pl ?
We would like to reclassify some auxiliary words as copulas.

No, the interface does not support removal. Let me know the lemmas that should be removed and I will remove them manually in the back end.

Removed.

Thanks!

Done.