Praise in other languages

Question

gaborcsardi opened this issue 8 years ago · 37 comments

gaborcsardi commented 8 years ago

Specifically Chinese first, via @Avatoo. \o/

We need to work out some simple architecture first.

Answer 1 · 2016-04-18T07:59:25.000Z

I can help for praise in French 😄

Answer 2 · 2016-04-18T08:06:47.000Z

@masalmon Cool, I'll soon update the code to handle multiple languages.

Answer 3 · 2016-04-18T08:08:40.000Z

Merveilleux ! Fantastique ! Superbe !

Answer 4 · 2016-04-20T22:43:02.000Z

I am thinking about a good way to do this. The goal would be to be able to write

praise(gettext("Your tests are ${adjective}!"))

or something like this, and then get praise in multiple languages. Two things are required for this:

1. We need to add some translations to testthat or whatever package we want to add international praise to. This would use the usual NLS system.
2. We need to add the parts of speech in other languages. E.g. adjectif for French, etc.

E.g. the gettext translates the string above to "Vos tests sont ${adjectif}", and then we just use this template as we are using it now.

Does this make sense? Or do we want to try automatic translation via the google translate API? I guess that could be error prone, so maybe the NLS way is better?
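
A minimal sketch of the translate-then-fill flow described above, in base R only. Everything here is hypothetical: `catalog_fr` stands in for a real NLS message catalog, and `lookup_translation()`, `fill_template()` and the word lists are illustrative names, not part of praise or testthat.

```r
# Hypothetical stand-in for an NLS catalog: English template -> French template.
catalog_fr <- c(
  "Your tests are ${adjective}!" = "Vos tests sont ${adjectif} !"
)

# Hypothetical word lists, keyed by the placeholder name used in templates.
words <- list(
  adjective = c("great", "superb"),
  adjectif  = c("superbes", "fantastiques")
)

# Return the translated template, or the input unchanged if none is known.
lookup_translation <- function(msg, catalog) {
  if (msg %in% names(catalog)) catalog[[msg]] else msg
}

# Replace every ${name} placeholder with a random word from words[[name]].
fill_template <- function(template, words) {
  m <- gregexpr("\\$\\{[^}]+\\}", template, perl = TRUE)
  placeholders <- regmatches(template, m)[[1]]
  for (p in placeholders) {
    key <- substr(p, 3, nchar(p) - 1)  # strip the leading "${" and trailing "}"
    template <- sub(p, sample(words[[key]], 1), template, fixed = TRUE)
  }
  template
}

translated <- lookup_translation("Your tests are ${adjective}!", catalog_fr)
fill_template(translated, words)
```

The point of the split is that the sentence and the word lists are translated independently: the template only has to name the right part of speech for its language.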

Answer 5 · 2016-04-21T08:50:50.000Z

Would it be a lot of work to test the google translate API on a few examples to see how bad the results are?

Answer 6 · 2016-04-21T11:04:43.000Z

The thing is, even if we can use Google Translate, we also want a way that lets people have more control. So why not start with that?

Answer 7 · 2016-12-09T14:06:00.000Z

Hi @gaborcsardi following up on this -- a bit late sorry. What exactly could I do to help make praise work for French too (apart from contributing words)?

@chucheria would like to contribute for Spanish.

Answer 8 · 2016-12-09T14:13:43.000Z

@masalmon Thanks!

I guess we would need to decide what "work" means. For example, consider the praise in testthat. It is implemented like this:

praise::praise("Your tests are ${adjective}!")
praise::praise("${EXCLAMATION} - ${adjective} code.")

So how would (hypothetical) user Hadley add support for other languages? Or would all this be automatic? The two obvious solutions are:

1. We translate all non-template words in praise via Google Translate or something similar, if we detect a French locale. Then we just substitute in the templated words, i.e. the nice adjectives, to get a sentence in French.
2. We require user Hadley to supply templates in various languages. We might help user Hadley with translation tips via an automatic translation service.

The first solution is nice if it works well, and maybe it works well for simple templates. Maybe we can implement both solutions.

What do you think?

Answer 9 · 2016-12-09T14:26:56.000Z

I guess the first solution is easier? Or in the case of a package like testthat, I could translate all the templates, because there are not many anyway?

Also, the idea would be to have people contribute the nice adjectives (because then you only need to know your language and a few git commands), but I guess that part is easy.

Answer 10 · 2016-12-09T14:33:40.000Z

> I guess the first solution is easier? Or in the case of a package like testthat, I could translate all the templates, because there are not many anyway?

Anyway, maybe we can implement both? Let's implement the automatic way, and see how it works. Btw. Google translate is not free any more, but maybe this works: http://www.r-pkg.org/pkg/RYandexTranslate

> Also, the idea would be to have people contribute the nice adjectives (because then you only need to know your language and a few git commands), but I guess that part is easy.

Agreed.

Answer 11 · 2016-12-17T18:31:13.000Z

I've just installed RYandexTranslate & registered for the free service (at last!). They seem to use two-letter language codes.

I've also looked at your commit regarding language detection, is there a particular reason you use Sys.getlocale() instead of Sys.getlocale(category = "LC_COLLATE")?

Answer 12 · 2016-12-17T18:46:14.000Z

One last small thing for today: I looked at the praise code in testthat, and the praising and encouraging sentences are hard-coded. Should praise have categories for this (an "english_congratulation.R" and an "english_encouragement.R"), and can we hope to have them replaced in testthat?

The Yandex API works well for the unique sentence to be translated in testthat:

> translate(api_key, text = "Your tests are", lang = "en-fr")
$lang
[1] "en-fr"

$text
[1] "Vos tests sont"

Answer 13 · 2016-12-17T18:49:13.000Z

I've just realized that in languages like French ${adjective} will need to be ${singular-adjective} and ${plural-adjective}.
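
A minimal sketch of what number-aware placeholders could look like, assuming each word carries both forms in one table. The `${singular-adjective}` and `${plural-adjective}` names come from the comment above; `adjectifs` and `praise_fr()` are hypothetical and not part of praise.

```r
# Hypothetical French word list: each adjective carries both number forms.
adjectifs <- data.frame(
  singular = c("superbe", "fantastique", "merveilleux"),
  plural   = c("superbes", "fantastiques", "merveilleux"),
  stringsAsFactors = FALSE
)

# Pick one adjective at random, then fill whichever form the template asks for.
praise_fr <- function(template, words = adjectifs) {
  pick <- words[sample(nrow(words), 1), ]
  out <- gsub("${singular-adjective}", pick$singular, template, fixed = TRUE)
  gsub("${plural-adjective}", pick$plural, out, fixed = TRUE)
}

praise_fr("Vos tests sont ${plural-adjective} !")
praise_fr("Ton code est ${singular-adjective} !")
```

This only covers number. Gender agreement would need further columns, and the number of placeholder variants grows with each grammatical feature a language marks.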

Answer 14 · 2016-12-19T08:53:47.000Z

> I've also looked at your commit regarding language detection, is there a particular reason you use Sys.getlocale() instead of Sys.getlocale(category = "LC_COLLATE")?

Don't remember. Looks like this is what I am doing: 0ca9979#diff-951791f1fb37d9e5b0f0cf852ce38d83R30

I suppose we can add LC_COLLATE here as well, I don't really see why you would have that set up and the others not, but I don't know much about locales.

> Should praise have categories for this (an "english_congratulation.R" and an "english_encouragement.R"), and can we hope to have them replaced in testthat?

Maybe, but in general I would leave writing sentences up to package authors depending on praise.

> I've just realized that in languages like French ${adjective} will need to be ${singular-adjective} and ${plural-adjective}.

Hmmm, yeah, that's a problem, and more "complicated" languages will be even worse.

So I would keep it simple and use the auto-translation for suggestions only. Maybe the manual praise translation is even better, then people speaking various languages can just contribute translations to testthat and other praising packages. How about this?

Answer 15 · 2016-12-19T09:03:02.000Z

On my PC (Windows):

> Sys.getlocale()
[1] "LC_COLLATE=Spanish_Spain.1252;LC_CTYPE=Spanish_Spain.1252;LC_MONETARY=Spanish_Spain.1252;LC_NUMERIC=C;LC_TIME=Spanish_Spain.1252"
> Sys.getlocale("LC_COLLATE")
[1] "Spanish_Spain.1252"

so the substr wouldn't work with Sys.getlocale()?

I don't understand what you mean by manual praise translation? How would this work for the praising packages?

Answer 16 · 2016-12-19T09:07:24.000Z

> so the substr wouldn't work with Sys.getlocale()?

OK, we'll need to read more about locales I suppose. Or find good code that gives a two or three letter code from the locales.
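
A minimal sketch of such a helper, assuming it is enough to look at a single locale category and to handle two shapes of locale string: the Windows style shown above ("Spanish_Spain.1252") and the POSIX style ("fr_FR.UTF-8"). The `windows_names` table and `locale_to_lang()` are hypothetical, and the table is deliberately incomplete.

```r
# Hypothetical lookup from Windows-style language names to two-letter codes.
windows_names <- c(
  Spanish = "es", French = "fr", German = "de",
  English = "en", Hungarian = "hu"
)

locale_to_lang <- function(locale = Sys.getlocale("LC_COLLATE"),
                           default = "en") {
  # Keep only the part before the first "_" or ".", i.e. drop the
  # territory and the encoding.
  lang <- sub("[_.].*$", "", locale)
  # Windows style: a full language name that we have to map to a code.
  if (lang %in% names(windows_names)) {
    return(windows_names[[lang]])
  }
  # POSIX style: already a two- or three-letter lowercase code.
  if (grepl("^[a-z]{2,3}$", lang)) {
    return(lang)
  }
  # Anything else ("C", "POSIX", unknown names) falls back to the default.
  default
}

locale_to_lang("Spanish_Spain.1252")  # "es"
locale_to_lang("fr_FR.UTF-8")         # "fr"
locale_to_lang("C")                   # "en"
```

Asking for one category avoids parsing the semicolon-separated string that a bare `Sys.getlocale()` returns on Windows. Whether LC_COLLATE is the right category to consult is a separate question that this sketch leaves open.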

> I don't understand what you mean by manual praise translation? How would this work for the praising packages?

Package author writes the sentences in all languages she knows. (She can get help from auto-translation, but I would implement auto-translation later.) Then people that know other languages can submit pull requests that add support for other languages that praise supports. I think this is good, because it encourages collaboration.

Answer 17 · 2016-12-19T09:09:39.000Z

Ok, I'll try to make myself wiser about locales in the next weeks.
And could some examples be kept in the praise package itself if they are general sentences?

Answer 18 · 2016-12-19T09:11:19.000Z

> And could some examples be kept in the praise package itself if they are general sentences?

Sure, that makes a lot of sense. We can have a praise_code() function or praise_package() or some generic function, e.g. praise_this("package").

Answer 19 · 2016-12-19T09:12:30.000Z

What would the praise_this("package") function do? Create the infrastructure for recognizing language?

Answer 20 · 2016-12-19T09:13:44.000Z

Oh, no, sorry, these would be just praising sentences that are kept within praise, and they could be translated to all languages we support.

Answer 21 · 2016-12-21T10:18:30.000Z

There's an R package for plurals, but only for English, what a pity: https://github.com/hrbrmstr/pluralize

Answer 22 · 2016-12-21T10:47:39.000Z

@masalmon No prob, if we go the "manual" way, we don't really need that.

Btw. I think hunspell can do this for all languages that it supports, but we don't need to worry about it now.

Answer 23 · 2017-01-15T16:21:31.000Z

Just a summary of the discussion (cc @chucheria ) @gaborcsardi please correct me if I'm wrong which I quite likely am :-)

- The international branch of this package has e.g. english-adverbs.R; for each new language we have to add all the corresponding .R files. @chucheria & I could create these files and they'd be filled in during a git workshop, even if the rest of the international structure of the package isn't ready, because these collections of words will still be useful at some point.
- The code for recognizing the locale needs to be improved a bit. Note that we'll have to write the correspondence between a 2/3-letter language code and the full name of the language.
- The code for recognizing the locale will be used in generic functions inside praise.
- However, the international possibilities will be useful only if
  - maintainers of packages using praise, as testthat does, agree to have their R code modified so that it includes recognition of the locale, and
  - volunteers submit translations of the package's sentences to the package maintainer,

  so that if the locale is a language other than English that is offered by both praise and the package itself (you need the adverbs in Spanish in praise and the sentences in Spanish in testthat, for instance), the package can output messages in this language.
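
The correspondence between language codes and full names mentioned in the summary could be as small as a named vector. This is a sketch only: `lang_names` and `word_file()` are hypothetical, and file names such as "french-adverbs.R" are extrapolated from the english-adverbs.R pattern, not taken from the repository.

```r
# Hypothetical mapping from two-letter language codes to the names used
# in word-list file names.
lang_names <- c(en = "english", fr = "french", es = "spanish", hu = "hungarian")

# Build the file name of a word list for a language code and part of speech.
word_file <- function(lang, part) {
  if (!lang %in% names(lang_names)) {
    stop("Unsupported language code: ", lang)
  }
  paste0(lang_names[[lang]], "-", part, ".R")
}

word_file("en", "adverbs")  # "english-adverbs.R"
word_file("fr", "adverbs")  # "french-adverbs.R"
```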

Answer 24 · 2017-01-16T19:07:22.000Z

I would not put locale stuff in testthat & co, I would just do something like

praise_lang("You are ${adjective}!", lang = "en")
praise_lang("Du bist ${adjective}!", lang = "de")

or something like this.

Answer 25 · 2017-02-08T08:47:15.000Z

Or even just

praise("You are ${adjective}!", lang = "en")
praise("Du bist ${adjective}!", lang = "de")

or

praise(
  en = "You are ${adjective}!",
  de = "Du bist ${adjective}!"
)
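
A minimal sketch of how the named-template form could behave, assuming the caller passes one template per language and the function picks one by language code. `praise_multi()` and the `adjectives` lists are hypothetical; this is not the interface of praise().

```r
# Hypothetical word lists per language.
adjectives <- list(
  en = c("great", "superb"),
  de = c("toll", "super")
)

# Templates are passed as named arguments, one per language code.
praise_multi <- function(..., lang = "en") {
  templates <- c(...)
  # Fall back to English when the requested language has no template.
  # (This sketch assumes an "en" template is always supplied.)
  if (!lang %in% names(templates)) lang <- "en"
  word <- sample(adjectives[[lang]], 1)
  sub("${adjective}", word, templates[[lang]], fixed = TRUE)
}

praise_multi(
  en = "You are ${adjective}!",
  de = "Du bist ${adjective}!",
  lang = "de"
)
```

In a real version `lang` would presumably default to the detected locale rather than to "en", so package authors would not have to pass it at all.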

Answer 26 · 2017-02-08T08:48:50.000Z

Another way would be to use gettext...

Answer 27 · 2017-02-08T09:10:23.000Z

What is gettext?

Answer 28 · 2017-02-08T09:11:13.000Z

The standard way to translate text messages. See ?gettext.

Answer 29 · 2017-02-08T09:21:39.000Z

So with gettext, people could just write

praise("You are ${adjective}!")

as before, but then praise() would check if the "You are ${adjective}!" string has a translation in the current locale, either

- in the calling package, or
- in praise itself.

After the translation, we would do the templating, as before, using the detected language.

Then the messages would need to be translated using e.g. msgtools. But the word lists would be the same as before.
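
A minimal sketch of the lookup order described above, using base R's `gettext()` with an explicit `domain`. It assumes the usual "R-<package>" naming for R-level message domains; `translate_template()` is a hypothetical helper, and detecting the calling package is left to the caller.

```r
# Try each translation domain in order and return the first translation found.
# gettext() returns its input unchanged when no translation is available,
# which is how a miss is detected here.
translate_template <- function(template,
                               domains = c("R-testthat", "R-praise")) {
  for (domain in domains) {
    translated <- gettext(template, domain = domain)
    if (!identical(translated, template)) {
      return(translated)
    }
  }
  template
}

# With no catalog installed for the current locale, the template comes
# back as is and is then filled in English, as before.
translate_template("You are ${adjective}!")
```

One limitation of this check is that a translation identical to the original string is indistinguishable from a missing one, which is harmless here because the result is the same either way.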

Answer 30 · 2017-02-08T09:51:13.000Z

This sounds like the easiest solution?

Answer 31 · 2017-02-08T10:12:58.000Z

For the users, yes. Even for people adding new words.

For people dealing with the translation system (=us), not really. :)

Answer 32 · 2017-02-08T10:14:19.000Z

But then we can praise ourselves ;-)

Answer 33 · 2017-02-08T17:10:26.000Z

OK, I implemented a framework: https://github.com/rladies/praise/tree/international

I'll write a short guide on how to add translations, and then we can test it on you if you don't mind. :)

Btw. we'll need to re-organize the package a bit, because non-ASCII characters are not allowed in code. So I'll move the words to data/ or inst/.
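
A minimal sketch of keeping non-ASCII words out of the R code by reading them from a UTF-8 text file. The temporary file stands in for a hypothetical word list shipped under inst/; the file layout and `read_words()` are assumptions, not the structure of the international branch.

```r
# Create a small UTF-8 word list. The \u escape keeps this script ASCII-only;
# a real word-list file would simply contain the accented characters.
path <- tempfile(fileext = ".txt")
writeLines(enc2utf8(c("g\u00e9nial", "superbe")), path, useBytes = TRUE)

# Read one word per line and mark the strings as UTF-8.
read_words <- function(path) {
  readLines(path, encoding = "UTF-8")
}

words <- read_words(path)
length(words)   # 2
nchar(words)    # 6 7
```

In an installed package the path would come from `system.file()` instead of `tempfile()`.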

Answer 34 · 2017-02-08T17:41:48.000Z

Awesome! Looking forward to testing it.

Génial ! J'ai hâte de le tester !

Answer 35 · 2017-02-23T11:12:41.000Z

Here is a short how-to: https://github.com/rladies/praise/blob/international/inst/international.md

I have added Hungarian, not too many words, just a PoC.

FYI.

Answer 36 · 2017-02-23T11:54:33.000Z

I'll have a better look next week but this looks AWESOME! 👏👏👏

Answer 37 · 2019-06-15T11:54:17.000Z

For other languages it's important to account for gender: the expressions for men are not the same as for women, and getting it wrong can change the spelling and also the meaning, from very good to very bad ^_^