Add a generic way to expand symbols regardless of translation

Question

Opened this issue 7 days ago · 13 comments

Is your feature request related to a problem? Please describe.

There are several requests to expand the symbols file, such as #6904, #5194, #16720. They all seem to stall on the concern that adding additional symbols will break speech for some languages.

Describe the solution you'd like

Based on the way CLDR was added to NVDA, I propose making that logic more generic, i.e. by adding further symbol dictionaries next to CLDR that can be enabled or disabled from a checkable list in the speech settings panel. So instead of having the "Include Unicode Consortium data (including emoji) when processing characters and symbols" checkbox, add a checkable list called "Additional symbol processing dictionaries" that contains the "Unicode Consortium data (including emoji)" item, among others.
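To make the idea concrete, here is a minimal sketch of how a set of optional dictionaries could be merged into the active symbols. It is not NVDA's actual implementation: `SymbolDictionary`, `build_active_symbols` and the "ancient" dictionary are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SymbolDictionary:
    """Hypothetical description of one optional symbol dictionary."""

    name: str  # internal identifier, e.g. "cldr"
    display_name: str  # label shown in the checkable list
    symbols: dict = field(default_factory=dict)  # character -> spoken name


def build_active_symbols(base, available, enabled_names):
    """Merge the base symbols with every enabled optional dictionary.

    Entries already present win, so an optional dictionary can only add
    characters and never overrides an existing one (an assumption of this sketch).
    """
    merged = dict(base)
    for dictionary in available:
        if dictionary.name not in enabled_names:
            continue  # unchecked in the list, so skipped entirely
        for char, spoken in dictionary.symbols.items():
            merged.setdefault(char, spoken)
    return merged


base = {"+": "plus"}
available = [
    SymbolDictionary(
        "cldr",
        "Unicode Consortium data (including emoji)",
        {"\U0001F600": "grinning face"},
    ),
    # Hypothetical dictionary; U+12000 is CUNEIFORM SIGN A in Unicode.
    SymbolDictionary("ancient", "Ancient languages", {"\U00012000": "cuneiform sign a"}),
]

# Only the dictionaries checked by the user are merged in.
active = build_active_symbols(base, available, enabled_names={"cldr"})
```

With only "cldr" checked, the emoji is available and the cuneiform sign is not; checking "ancient" as well would add it without touching the base symbols.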

Additional context

In #16732, @yeatersink proposed adding some symbol dictionaries for several ancient languages. I think that if we follow the approach outlined above, we could create an ancient languages symbols dictionary that contains all these symbols.

Answer 1 · 2024-06-25T20:57:07.000Z

@LeonarddeR, thanks for having opened the issue.

I would rather have described the use case, before describing a solution.
The use case seems to be: find the best way to read languages for which no synth exists such as ancient languages.
A solution may be to expand symbol files; maybe that's not the only one?

Answer 2 · 2024-06-26T07:33:06.000Z

Greetings,
Thank you to @LeonarddeR for re-raising this issue. I sincerely appreciate it.

@CyrilleB79 I will present the problem, then the proposed solution:

1. The problem
We need access to several ancient languages for required education. These languages are often grouped together in dictionaries, which demands access to a multitude of languages on the same page, at the same time. This is also true for grammars where books are written in one language to teach another language.

NVDA already has an automatic language switching feature, which allows this to happen with the languages NVDA already recognizes. Adding new or additional languages should not be a problem.
We need a solution to make this happen, and my team and I are willing to do what needs to be done to make it work.

2. The solution
So far, we have written tables for several languages for liblouis and Duxbury. Thus, we have braille. However, we need speech. NVDA seems to be the most logical solution.

What I would like to see is to be able to add a new language to NVDA for each of the languages that we need.

NVDA has an automatic language switching feature which will accommodate several languages on the same page. It seems that adding a new language for each of the languages we have written braille for would be the most natural and helpful solution. The liblouis tables are already in NVDA; we just need speech.

This is why we opened this issue and created a pull request.

3. What we have done so far

We have tested the characters and symbols for all the languages by adding them to our English locale on our own personal computer. We have created locale folders for each individual language already.

We used the NVDA speech synthesizer, and it speaks the characters when they are part of the English locale.

Ideally, we could use NVDA, Microsoft voices, or the Eloquence TTS to simply speak what the characters are, while also recognizing them as their own language. This way we could check a box to tell NVDA what language it is or what language to speak.

For example:
Akkadian has a US system and a German system. They use the same Unicode characters, but have different names and different braille for them. So we would like to be able to tell NVDA that we want it to speak the German names, just as we can choose the German liblouis braille table, rather than NVDA being confused about whether it is the US system or the German system.

I know we can work this out. Please help me, I desperately need this for school, and Leonard is going to be needing it in the fall for his school as well.

Answer 3 · 2024-06-26T08:07:17.000Z

@LeonarddeR This would actually be a great solution. I'm actually learning Spanish at the minute and it would be nice to be able to enable something that could tell me when I encounter a Spanish n which NVDA currently calls a regular n.

I love the idea of being able to have a list of options in terms of character sets that are currently enabled. It would be great if we could have a sub-menu inside settings or something that could have a keyboard shortcut for easy access.

Answer 4 · 2024-06-26T08:30:12.000Z

Some thoughts.

@CyrilleB79 I tried to come up with a generic approach to cover several use cases as outlined in the mentioned issues, so I see this particular issue as solution-focused rather than as a place to discuss the underlying problems. Again, the problem regarding ancient languages is not an isolated issue.
@yeatersink I think we should treat these ancient languages separately from the concept of supported locales in NVDA, and rather as a bunch of foreign characters NVDA should know how to pronounce. That's exactly what symbol dictionaries are meant for. Your point about the German and English systems for Akkadian emphasizes this. That case can be solved by providing the English symbol names in a dictionary for English, and the German names in a dictionary specifically for German. Automatic language switching is mainly meant for speech synthesizers to know when to switch languages, but these ancient languages don't have a proper speech synthesizer, apart from Hebrew.
@paulGeoghegan I'm not sure about the Spanish n, do you mean Ñ? That's properly announced with eSpeak here; OneCore stays silent, though. I think that should be treated as a different issue.

Answer 5 · 2024-06-26T08:41:09.000Z

@LeonarddeR

@paulGeoghegan I'm not sure about the Spanish n, do you mean Ñ? That's properly announced with eSpeak here; OneCore stays silent, though. I think that should be treated as a different issue.

My example was just an illustration of how others could use it, because I love the idea of having support for different character sets that you could enable and disable when you need them.

Yes, I am using OneCore, so I might try out eSpeak.

Answer 6 · 2024-06-26T09:06:01.000Z

The solution proposed by @LeonarddeR seems promising because it takes into account the fact that characters may have different names in different languages (e.g. the difference between English and German names). In contrast, the solution provided in the PR was not adaptable for users speaking a language other than English.

Answer 7 · 2024-06-26T09:15:07.000Z

@CyrilleB79 to be honest, it might not be adaptable either way. For example, the Akkadian character names are defined in English, so a non-English speaker may not understand them. This does not apply to every language, but to a lot of them. This approach would, however, at least let users of other languages enable additional character sets if they wish.

Answer 8 · 2024-06-26T10:06:01.000Z

It is common sense that if a symbol is not defined in a locale, English is used as a fallback. Also note that enabling these additional character sets would be opt-in, they'd still be disabled by default.
That said, ideally these new symbols would be translated to other languages as well, but it is up to every locale maintainer to do this or not.

Answer 9 · 2024-06-26T11:45:48.000Z

@LeonarddeR exactly. I just wanted to make it clear to @CyrilleB79 that they would technically be language-specific.

Answer 10 · 2024-06-26T12:31:10.000Z

@LeonarddeR exactly. I just wanted to make it clear to @CyrilleB79 that they would technically be language-specific.

It was clear to me and that was the sense of my comment #16739 (comment).

To be extra clear, this new subset of characters should be provided in English, and translators should be offered the opportunity to translate them. If they don't, the character's name will fall back to English.
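The fallback described here could be sketched as follows. This is only an illustration under assumed names: `spoken_name` and the `dictionaries` layout are hypothetical and do not reflect how NVDA actually stores or resolves symbols.

```python
def spoken_name(char, locale, dictionaries, fallback_locale="en"):
    """Return the spoken name for char, preferring the user's locale.

    dictionaries maps a locale code to a {character: name} mapping.
    If the locale has no entry for char, the English entry is used.
    Returns None when neither locale defines the character.
    """
    for candidate in (locale, fallback_locale):
        name = dictionaries.get(candidate, {}).get(char)
        if name is not None:
            return name
    return None


# U+12000 is CUNEIFORM SIGN A; the German dictionary is still untranslated.
dictionaries = {
    "en": {"\U00012000": "cuneiform sign a"},
    "de": {},
}

# A German user gets the English name until a translator adds a German one.
name_for_german_user = spoken_name("\U00012000", "de", dictionaries)
```

Once a translator adds an entry to the "de" mapping, the same call returns the translated name without any other change.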

Answer 11 · 2024-07-01T16:00:52.000Z

@seanbudd Curious to know what you think about this approach. Would NV Access accept a PR for this?

Answer 12 · 2024-07-01T20:58:59.000Z

I agree this would be a huge improvement in screen reading different kinds of content.
It would be nice if the checkable list grouped symbols into something like:

Language specific symbols
Mathematical symbols
Musical symbols
Other scientific symbols
Emojis

But note that in the case of mathematical symbols at least, I added many of them to symbols.dic some years ago, so they are now translated into many languages. We should be careful not to cause conflicts between the current symbols.dic file and the optional symbol dictionaries.
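One way to guard against such conflicts would be a simple check for characters defined in both places, for instance as part of a test. The sketch below assumes the core file and each optional dictionary have already been parsed into plain character-to-name mappings; `find_conflicts` and the sample entries are hypothetical.

```python
def find_conflicts(core, optional):
    """List characters that an optional dictionary shares with the core file.

    core is the {character: name} mapping from the core symbols file.
    optional maps a dictionary name to its own {character: name} mapping.
    Returns {dictionary name: sorted list of conflicting characters}.
    """
    conflicts = {}
    for name, symbols in optional.items():
        overlap = sorted(set(core) & set(symbols))
        if overlap:
            conflicts[name] = overlap
    return conflicts


# Illustrative data only, not the real contents of any NVDA file.
core = {"+": "plus", "\u2211": "sum"}
optional = {
    "math": {"\u2211": "n-ary summation", "\u222b": "integral"},
    "music": {"\u266f": "sharp"},
}

conflicts = find_conflicts(core, optional)
```

Here only the "math" dictionary is reported, because it redefines a character that the core mapping already contains.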

Answer 13 · 2024-07-02T00:36:47.000Z

@LeonarddeR - feel free to open a PR