Inconsistent inclusion of nikud in Hebrew results
NeatNit opened this issue · 2 comments
Not sure if this is a bug in this tool or in something further upstream, but I'm seeing inconsistent inclusion of nikud, the Hebrew phonetic notation (also spelt niqqud or nikkud, listed here so future searches find this issue), when translating words into Hebrew:
~ $ trans -b -no-bidi en:he hello
שלום
~ $ trans -b -no-bidi en:he more
יותר
~ $ trans -b -no-bidi en:he less
פָּחוֹת
~ $ trans -b -no-bidi en:he lesser
קָטָן יוֹתֵר
~ $ trans -b -no-bidi en:he indeed
אכן
~ $ trans -b -no-bidi en:he element
אֵלֵמֶנט
~ $ trans -b -no-bidi en:he opposite
מול
~ $ trans -b -no-bidi en:he above
מֵעַל
Seemingly at random, some results include nikud and some do not. For example "less" translates with nikud, "more" without.
I noticed this bit in the docs, which I originally thought was relevant and indicative that this is a bug:
In brief mode, phonetic notation (if any) is not shown by default. To enable this, put an at sign “@” in front of the language code
But as I type this and try the listed example with and without the flag, I realise it's something completely different and not related to the target language's superfluous notation.
Either way, for consistent output I think it should always show a translation without nikud (nikud is extremely rare in everyday writing, but always appears in dictionaries).
I noticed that the full output does show good options without nikud:
Version info:
Translate Shell 0.9.7.1
platform Linux
terminal type xterm-256color
bi-di emulator [N/A]
gawk (GNU Awk) 5.3.0
fribidi (GNU FriBidi) 1.0.13
audio player mpv --no-config
terminal pager less
web browser xdg-open
user locale en_US.UTF-8 (English)
host language en
source language auto
target language en
translation engine auto
proxy [NONE]
user-agent Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/104.0.0.0 Safari/537.36 Edg/104.0.1293.54
ip version [DEFAULT]
theme default
init file [NONE]
running in Termux on Android
As far as I'm aware, the output of trans is consistent with Google Translate (https://translate.google.com/), which does include nikud (most of the time).
As trans is just a command-line interface which is mostly language-agnostic, we can't fix this on our part, unless Google's API provides both nikud-marked text and regular text (which is not the case so far).
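For anyone who needs consistent output in the meantime, one possible workaround is to strip the marks after the fact. The sketch below is not part of trans; `strip_nikud` is a hypothetical helper name, and it assumes that removing every nonspacing mark in the Unicode range U+0591 to U+05C7 (nikud points plus cantillation marks) is acceptable for your use case. Punctuation in that range, such as maqaf, is left alone.

```python
import unicodedata


def strip_nikud(text: str) -> str:
    """Return text with Hebrew points and cantillation marks removed.

    Hypothetical helper, not part of translate-shell.
    """
    # Decompose first so precomposed presentation forms (e.g. U+FB4B)
    # are split into a base letter followed by combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    kept = [
        ch
        for ch in decomposed
        # Drop only nonspacing marks (category Mn) inside U+0591..U+05C7;
        # Hebrew punctuation in that range has a different category.
        if not ("\u0591" <= ch <= "\u05c7" and unicodedata.category(ch) == "Mn")
    ]
    # Recompose so unrelated text (e.g. accented Latin letters) is unchanged.
    return unicodedata.normalize("NFC", "".join(kept))


# Example with one of the results reported above ("less").
print(strip_nikud("פָּחוֹת"))
```

Results that already come back without nikud, such as the one for "hello", pass through unchanged, so the function can be applied to every line of output regardless of what the engine returned.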
If you want translation without nikud then I suggest using Bing as the engine: