Speech-Rule-Engine/speech-rule-engine

Nemeth: spacing bug

NSoiffer opened this issue · 2 comments

I have a feeling we discussed this issue in the past. I finally heard back from the BANA Nemeth committee about "what is a function". The question is whether there is a fixed list (as in the green book) or whether it is open-ended. The short answer is that the committee says it is open-ended. If something is a function name, it is supposed to get a space after it (Rule 119).

In sre-tests\input\nemeth\aata.json, there is this test:

    "Appl_141": {
      "id": "m-12748",
      "url": "http://abstract.ups.edu/aata/exercises-vect.html#ZRY",
      "input": "<math><mi>Hom</mi><mo data-mjx-texclass=\"NONE\">⁡</mo><mo stretchy=\"false\">(</mo><mi>V</mi><mo>,</mo><mi>W</mi><mo stretchy=\"false\">)</mo></math>",
      "tex": "\\Hom(V, W)",
      "expected": "⠠⠓⠕⠍⠷⠠⠧⠠⠀⠠⠺⠾",
      "reference": {
        "m-12748": "http://abstract.ups.edu/aata/exercises-vect.html#ZRY"
      }
    },

The correct output should have a space before the paren (⠷): "⠠⠓⠕⠍⠀⠷⠠⠧⠠⠀⠠⠺⠾"
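
To make the one-cell difference easy to see, here is a minimal Python sketch that builds the current expected string from code points and inserts a braille blank (U+2800) after the function name. The helper `insert_space_after` is made up for illustration and is not part of SRE or sre-tests.

```python
BLANK = "\u2800"  # braille blank cell (the "space" in the expected strings)

# Cells taken from the test above, written as escapes to avoid
# confusing the braille blank with an ordinary space.
HOM = "\u2820\u2813\u2815\u280d"  # capital indicator, h, o, m
REST = "\u2837\u2820\u2827\u2820\u2800\u2820\u283a\u283e"  # ( V , blank W )


def insert_space_after(braille: str, prefix: str) -> str:
    """Return `braille` with a blank cell inserted right after `prefix`.

    Hypothetical helper: it only illustrates the expected change and says
    nothing about how SRE produces its output.
    """
    if not braille.startswith(prefix):
        raise ValueError("braille string does not start with the given prefix")
    return prefix + BLANK + braille[len(prefix):]


current = HOM + REST                       # what the test expects today
corrected = insert_space_after(current, HOM)  # what it should expect

print(current)
print(corrected)
```

The two printed strings differ only in the extra blank cell between the function name and the opening paren.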

BANA says they will have a new reference in a few weeks. This is what they sent me regarding function names:

This list is not meant to be exhaustive. While working on the update we recognized that the list would be very long and still not complete for the exact reason that mathematicians create their own function names. We placed the following sentence before the list:

A partial list of function names and their abbreviated forms is given below.

Unfortunately that means following form and inserting the space. I’ve removed the section numbers for some sort of propriety, but if the language is approved this is what the revised code says about spacing:

A space is left after an unmodified function name or its abbreviated form. If the function name or its abbreviated form carries a superscript, subscript, modifier, or other braille indicator, the space follows the superscript, subscript, termination of modifier, or other braille indicator.
If two or more consecutive function names or their abbreviated forms occur, they may be printed with or without a space between them. The transcription follows print spacing. When there is doubt concerning the presence of a space in print between the function names or their abbreviated forms, a space should be inserted in the transcription.
The expression which follows or precedes the function name or its abbreviated form is spaced in accordance with the other spacing rules of this Code.
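
The first paragraph of the quoted rule can be sketched roughly as follows. This is only an illustration under assumptions: the `Piece` type, its fields, and `join_with_function_spacing` are hypothetical, the input is assumed to be already transcribed braille, and the sketch ignores the "follow print" rule for consecutive function names. It also assumes no blank is wanted when the function name ends the expression, which the quoted text does not state.

```python
from dataclasses import dataclass
from typing import List

BLANK = "\u2800"  # braille blank cell


@dataclass
class Piece:
    """One already-transcribed chunk of an expression (hypothetical type)."""
    braille: str               # braille for the chunk itself
    is_function: bool = False  # True for a function name or its abbreviation
    modifiers: str = ""        # braille for superscript, subscript, or modifier


def join_with_function_spacing(pieces: List[Piece]) -> str:
    """Join pieces, leaving a blank after each function name.

    If the function name carries modifiers, the blank goes after them,
    as the quoted rule describes. No blank is added after the last piece
    (an assumption, see the lead-in).
    """
    out = []
    for index, piece in enumerate(pieces):
        out.append(piece.braille + piece.modifiers)
        is_last = index == len(pieces) - 1
        if piece.is_function and not is_last:
            out.append(BLANK)
    return "".join(out)


# Hom(V, W): function name followed by its parenthesized arguments.
hom = Piece("\u2820\u2813\u2815\u280d", is_function=True)
args = Piece("\u2837\u2820\u2827\u2820\u2800\u2820\u283a\u283e")
print(join_with_function_spacing([hom, args]))
```

Since the committee says the list of function names is open-ended, deciding when `is_function` is true is the hard part, and this sketch does not address it.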

While on the topic of function names we also say:

Follow print when transcribing a function name or its abbreviated form. All abbreviated function names are transcribed in Nemeth Code. Function names used in mathematical context that are not abbreviated are also transcribed in Nemeth Code.

Thanks, Neal. I believe there was also a related question about how to transcribe a colon, where the BANA explanations are very confusing.

However, this seems to be a genuine SRE/Nemeth bug, as the expression with lowercase hom works fine and adds the missing space. Both cases work for trace (Tr/tr), for instance, but not for kernel (Ker/ker).