charlesLoder/hebrew-transliteration

Add `PASS_THROUGH` option

Closed this issue · 3 comments

For an ADDITIONAL_FEATURE, writing the callback can get messy:

const heb = require("./dist/index");
const rules = require("./dist/rules");

const result = heb.transliterate("בְּרֵאשִׁ֖ית וַיַּבְדֵּל", {
  ADDITIONAL_FEATURES: [
    {
      // matches any sheva in a syllable that is NOT preceded by a vowel character
      HEBREW: "(?<![\u{05B1}-\u{05BB}\u{05C7}].*)\u{05B0}",
      FEATURE: "syllable",
      TRANSLITERATION: function (syllable, _hebrew, schema) {
        const next = syllable.next;
        // discrepancy here: in havarotjs SHEVA is simply the character
        // whereas transliteration is concerned with a specific sheva, a vocal sheva
        const nextVowel = next.vowelName === "SHEVA" ? "VOCAL_SHEVA" : next.vowelName;

        if (next && nextVowel) {
          const vowel = schema[nextVowel] || "";
          // replaceAndTransliterate is an internal helper function
          return rules.replaceAndTransliterate(syllable.text, new RegExp("\u{05B0}", "u"), vowel, schema);
        }

        return syllable.text;
      }
    }
  ]
});

// bērēʾšît wayyabdēl

Namely, you have to imprt a rule and use it.

The PASS_THROUGH option could work like this:

const result = heb.transliterate("בְּרֵאשִׁ֖ית וַיַּבְדֵּל", {
  ADDITIONAL_FEATURES: [
    {
      // matches any sheva in a syllable that is NOT preceded by a vowel character
      HEBREW: "(?<![\u{05B1}-\u{05BB}\u{05C7}].*)\u{05B0}",
      FEATURE: "syllable",
      PASS_THROUGH: true,
      TRANSLITERATION: function (syllable, _hebrew, schema) {
        const next = syllable.next;
        // discrepancy here: in havarotjs SHEVA is simply the character
        // whereas transliteration is concerned with a specific sheva, a vocal sheva
        const nextVowel = next.vowelName === "SHEVA" ? "VOCAL_SHEVA" : next.vowelName;

        if (next && nextVowel) {
          const vowel = schema[nextVowel] || "";
          return syllable.text.replace(new RegExp("\u{05B0}", "u"), vowel);
        }

        return syllable.text;
      }
    }
  ]
});

This way no import is used, and it can continue to map characters in the rules as usual. No need to implement existing logic.

bērēʾšît wayyabdēl would be wrong because shewa is a short vowel and the b in the second word is spirantizated to v.

bērēʾšît wayyabdēl would be wrong because shewa is a short vowel and the b in the second word is spirantizated to v.

In standard academic SBL transliteration, the default is actually to not spirantize unless necessary — SBLHS §5.1.1.4 — it does seem counter intuitive though

The other impetus for this, is that on the site a schema can be downloaded as a json file. If there is an import statement on top, then it's not valid json.