daniel-sc/ng-extract-i18n-merge

The single quote character ' keeps alternating between being encoded as &apos; and being left as is from one extract to the next, resulting in massive diffs

Opened this issue · 4 comments

Describe the bug
The single quote character ' keeps alternating between being encoded as &apos; and being left as is from one extract to the next, resulting in massive diffs. If the file initially contains ', it will be switched to &apos;, and vice versa. The issue only occurs when localized strings are added to or removed from the localized file being merged.

This is very annoying as all the entries of the file having a single quote in them (we are talking about thousands) keep being marked as changed by the source control system, which makes reviewing the changes impossible.

Setup/Configuration
Example of such a diff:

       <trans-unit id="4330443193346340995" datatype="html">
         <source>Number of trees planted every year</source>
-        <target state="translated">Nombre total d'arbres plantés chaque année</target>
+        <target state="translated">Nombre total d&apos;arbres plantés chaque année</target>
       </trans-unit>

Config in angular.json

"extract-i18n": {
  "builder": "ng-extract-i18n-merge:ng-extract-i18n-merge",
  "options": {
    "browserTarget": "my-project:build",
    "format": "xliff",
    "includeContext": false,
    "outputPath": "projects/my-project/src/locales",
    "targetFiles": [
      "messages.fr.xlf",
      "../../../messages.xlf"
    ],
    "sourceFile": "../../../messages.xlf"
  }
},

Expected behavior
The file remains stable.

Version (please complete the following information):

  • Angular: 17.0.9
  • OS: Linux
  • nodejs: 20.10.0
  • ng-extract-i18n-merge version: 2.9.1

Hi @nkosi23
I understand your point: if the encoding switches back and forth, that would be a bug.
With the given information, I could not reproduce this. For me, both ' and &apos; get encoded to &apos;.
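
To make the expected behavior concrete, here is a minimal sketch of an apostrophe encoding step that is stable across repeated runs. It is an illustration only, not the code used by ng-extract-i18n-merge; the function name `encodeApostrophes` is hypothetical.

```typescript
// Hypothetical helper, NOT taken from ng-extract-i18n-merge.
// Writes every literal apostrophe in a text node as the XML entity &apos;.
function encodeApostrophes(text: string): string {
  // The entity "&apos;" contains no literal apostrophe, so text that is
  // already encoded passes through unchanged (the step is idempotent).
  return text.replace(/'/g, "&apos;");
}

const raw = "Nombre total d'arbres plantés chaque année";
const encoded = "Nombre total d&apos;arbres plantés chaque année";

console.log(encodeApostrophes(raw) === encoded);     // true
console.log(encodeApostrophes(encoded) === encoded); // true
```

With a step like this, both spellings converge on the same output, so a file would only flip back and forth if something else rewrites it between extractions.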

  • Is it possible that some other translation tool is interfering here?
  • I notice you have "../../../messages.xlf" in your targetFiles, which is not the intended usage. "messages.xlf" is the sourceFile, i.e. the file that translations from the templates/source code are extracted to, so it should not be a targetFile at the same time. (Maybe you'd rather have a "messages.en.xlf" as a second target file?)

I'd appreciate it if you could report back on whether this resolves your issue, or provide a (small) reproduction repository.
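
The configuration point above (the sourceFile also appearing in targetFiles) can be checked mechanically. The following is a rough sketch under stated assumptions: the `ExtractOptions` interface and the `targetsEqualToSource` function are hypothetical names, and paths are compared exactly as written rather than resolved against outputPath.

```typescript
// Hypothetical shape covering only the two options discussed here.
interface ExtractOptions {
  sourceFile: string;
  targetFiles: string[];
}

// Returns the targetFiles entries that are textually identical to sourceFile.
// Assumption: paths are compared as written, without resolving them.
function targetsEqualToSource(options: ExtractOptions): string[] {
  return options.targetFiles.filter((target) => target === options.sourceFile);
}

// Values copied from the configuration in the original report.
const reported: ExtractOptions = {
  sourceFile: "../../../messages.xlf",
  targetFiles: ["messages.fr.xlf", "../../../messages.xlf"],
};

// One entry is reported: the source file path listed as a target.
console.log(targetsEqualToSource(reported));
```

Because the comparison is purely textual, two different spellings of the same path would not be caught; a stricter check would need to resolve both paths first.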

Sorry for the delay in replying. I've had quite a lot of work lately and also wanted to take the time to see whether the issue would reappear after fixing the angular.json file as you suggested (thanks for pointing this out). My config file now looks like this:

"i18n": {
  "sourceLocale": "en",
  "locales": {
    "fr": {
      "translation": "projects/my-project/src/locales/messages.fr.xlf"
    }
  }
},
...
"extract-i18n": {
  "builder": "ng-extract-i18n-merge:ng-extract-i18n-merge",
  "options": {
    "browserTarget": "my-project:build",
    "format": "xliff",
    "includeContext": false,
    "trim": true,
    "outputPath": "projects/my-project/src/locales",
    "targetFiles": [
      "messages.fr.xlf"
    ],
    "sourceFile": "../../messages.xlf"
  }
},

But unfortunately, just as I was starting to believe that the issue had disappeared, it happened again. It looks like as soon as a certain category of change/diff is detected, the encoding is no longer deterministic, at least for this character.

I do not use any other translation tool; I only use (and only have installed) the built-in i18n tool provided by Angular plus your package. Hopefully someone has ideas, but I will try to create a repro.

@nkosi23 is it possible that you have some automatic code formatting, such as Prettier, enabled in your IDE? A reproduction would be great!

I have the same issue, but it's likely because of auto-formatting. I installed the Red Hat XML extension for VS Code to find errors that appeared after a merge (e.g. missing closing tags). Unfortunately, XML formatting is turned on by default: the extension escapes ' to &apos;, but it seems that ng extract-i18n changes it back.
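
To tell whether a noisy diff is purely this kind of formatter-versus-extractor flip, one option is to compare the old and new text with apostrophe encoding normalized away. This is a small diagnostic sketch, not part of any of the tools mentioned; the helper names `decodeApostrophes` and `differsOnlyInApostropheEncoding` are hypothetical.

```typescript
// Hypothetical diagnostic helpers; the names are illustrative only.

// Rewrites every &apos; entity as a literal apostrophe.
function decodeApostrophes(xml: string): string {
  return xml.replace(/&apos;/g, "'");
}

// True when the two strings differ, but only in how apostrophes are written.
function differsOnlyInApostropheEncoding(before: string, after: string): boolean {
  return before !== after && decodeApostrophes(before) === decodeApostrophes(after);
}

// The two <target> lines from the diff in the original report.
const before = `<target state="translated">Nombre total d'arbres plantés chaque année</target>`;
const after = `<target state="translated">Nombre total d&apos;arbres plantés chaque année</target>`;

console.log(differsOnlyInApostropheEncoding(before, after)); // true
```

If every changed line in a diff passes this check, the change carries no translation content and most likely comes from two tools disagreeing on escaping.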