WeblateOrg/weblate

CDATA tags in Android stripped off on import and export

NLLAPPS opened this issue · 4 comments

Describe the issue

Hi, thanks for the great tool. I've been playing around with Weblate 5.7.2 and noticed that it strips CDATA tags when uploading and downloading Android string resources.

I have seen #1211 and translate/translate#4320

But the issue still seems to be present.

I already tried

  • I've read and searched the documentation.
  • I've searched for similar filed issues in this repository.

Steps to reproduce the behavior

Create a project

Add a component with
File format: Android String Resource component.

Upload a base translation file containing CDATA.

For example:
<string name="service_terms_agreement_notice">By continuing you accept our <![CDATA[<a href="%1$s">Terms of Use</a>]]> and <![CDATA[<a href="%2$s">Privacy Policy</a>]]></string>

Observe that the string becomes (in the text field in Weblate):
By continuing you accept our <a href="%1$s">Terms of Use</a> and <a href="%2$s">Privacy Policy</a>

Observe that the string in the exported (downloaded) XML string resource becomes:
By continuing you accept our &lt;a href="%1$s"&gt;Terms of Use&lt;/a&gt; and &lt;a href="%2$s"&gt;Privacy Policy&lt;/a&gt;
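The escaped output above is what a generic XML round trip produces when the parser folds CDATA sections into ordinary text. A minimal sketch of that effect, using Python's standard-library `xml.etree.ElementTree` purely as a stand-in (an assumption for illustration, not the code path Weblate or translate-toolkit actually uses):

```python
import xml.etree.ElementTree as ET

# The source string from this report, as it appears in strings.xml.
SOURCE = (
    '<string name="service_terms_agreement_notice">By continuing you accept our '
    '<![CDATA[<a href="%1$s">Terms of Use</a>]]> and '
    '<![CDATA[<a href="%2$s">Privacy Policy</a>]]></string>'
)

element = ET.fromstring(SOURCE)

# ElementTree merges CDATA content into the element text, so the
# information that it was wrapped in CDATA is gone after parsing.
parsed_text = element.text

# On serialization the markup characters are escaped as entities.
serialized = ET.tostring(element, encoding="unicode")

print(parsed_text)
print(serialized)
```

Once the CDATA boundary is lost at parse time, the serializer has no way to restore it, which matches the two observations above.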

If, however, you create a new string named "service_terms_agreement_notice" and type:
By continuing you accept our <![CDATA[<a href="%1$s">Terms of Use</a>]]> and <![CDATA[<a href="%2$s">Privacy Policy</a>]]>

Formatting is preserved in the text field in Weblate, so it looks like:
By continuing you accept our <![CDATA[<a href="%1$s">Terms of Use</a>]]> and <![CDATA[<a href="%2$s">Privacy Policy</a>]]>

Observe that the exported (downloaded) XML string resource still contains the escaped string:
By continuing you accept our &lt;a href="%1$s"&gt;Terms of Use&lt;/a&gt; and &lt;a href="%2$s"&gt;Privacy Policy&lt;/a&gt;

Expected behavior

CDATA and HTML formatting should be preserved in the exported/downloaded string resource as:
By continuing you accept our <![CDATA[<a href="%1$s">Terms of Use</a>]]> and <![CDATA[<a href="%2$s">Privacy Policy</a>]]>
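For comparison, a parser that keeps CDATA sections as distinct nodes can write them back unchanged. The sketch below uses the standard-library `xml.dom.minidom` only to illustrate the expected round trip; it is an assumption for demonstration and not how Weblate or translate-toolkit handles Android resources.

```python
from xml.dom import minidom

# The source string from this report, as it appears in strings.xml.
SOURCE = (
    '<string name="service_terms_agreement_notice">By continuing you accept our '
    '<![CDATA[<a href="%1$s">Terms of Use</a>]]> and '
    '<![CDATA[<a href="%2$s">Privacy Policy</a>]]></string>'
)

document = minidom.parseString(SOURCE)
root = document.documentElement

# minidom represents each CDATA section as its own node type,
# separate from the plain text nodes around it.
cdata_nodes = [
    node for node in root.childNodes
    if node.nodeType == minidom.Node.CDATA_SECTION_NODE
]

# Serializing the element writes the CDATA sections back out.
round_tripped = root.toxml()

print(len(cdata_nodes))
print(round_tripped)
```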

Screenshots

No response

Exception traceback

No response

How do you run Weblate?

Docker container

Weblate versions

  • Weblate: 5.7.2

  • Django: 5.1.1

  • siphashc: 2.4.1

  • translate-toolkit: 3.13.3

  • lxml: 5.3.0

  • pillow: 10.4.0

  • nh3: 0.2.18

  • python-dateutil: 2.9.0.post0

  • social-auth-core: 4.5.4

  • social-auth-app-django: 5.4.2

  • django-crispy-forms: 2.3

  • oauthlib: 3.2.2

  • django-compressor: 4.5.1

  • djangorestframework: 3.15.2

  • django-filter: 24.3

  • django-appconf: 1.0.6

  • user-agents: 2.2.0

  • filelock: 3.16.1

  • rapidfuzz: 3.9.7

  • openpyxl: 3.1.5

  • celery: 5.4.0

  • django-celery-beat: 2.7.0

  • kombu: 5.4.1

  • translation-finder: 2.16

  • weblate-language-data: 2024.6

  • html2text: 2024.2.26

  • pycairo: 1.27.0

  • PyGObject: 3.50.0

  • diff-match-patch: 20230430

  • requests: 2.32.3

  • django-redis: 5.4.0

  • hiredis: 3.0.0

  • sentry-sdk: 2.14.0

  • Cython: 3.0.11

  • mistletoe: 1.4.0

  • GitPython: 3.1.43

  • borgbackup: 1.4.0

  • pyparsing: 3.1.4

  • ahocorasick_rs: 0.22.0

  • python-redis-lock: 4.0.0

  • charset-normalizer: 3.3.2

  • cyrtranslit: 1.1.1

  • Python: 3.12.6

  • Git: 2.39.5

  • psycopg: 3.2.2

  • psycopg-binary: 3.2.2

  • phply: 1.2.6

  • ruamel.yaml: 0.18.6

  • tesserocr: 2.7.1

  • boto3: 1.35.21

  • zeep: 4.2.1

  • aeidon: 1.15

  • iniparse: 0.5

  • mysqlclient: 2.2.4

  • Mercurial: 6.8.1

  • git-svn: 2.39.5

  • git-review: 2.4.0

  • PostgreSQL server: 16.4

  • Database backends: django.db.backends.postgresql

  • PostgreSQL implementation: psycopg3 (binary)

  • Cache backends: default:RedisCache, avatar:FileBasedCache

  • OS encoding: filesystem=utf-8, default=utf-8

  • Celery: redis://weblate-redis:6379/1, redis://weblate-redis:6379/1, regular

  • Platform: Linux 6.1.0-26-amd64 (x86_64)

Weblate deploy checks

Additional context

No response

Another thing I have noticed is this:

I alter the faulty string and change it to:
By continuing you accept our <![CDATA[<a href="%1$s">Terms of Use</a>]]> and <![CDATA[<a href="%2$s">Privacy</a>

Then I make another change to my source files, upload them to the server, and run
Project > Component > Manage > Rescan

Weblate alters the string value and removes CDATA, but only in the string editing interface. So you see
By continuing you accept our <a href=%1$s>Terms of Use</a> and <a href=%2$s>Privacy Policy</a>
while translating to another language, and the CDATA sections are missing.

For example, say you want to translate to German.
Your source file in the file system has the string as:
By continuing you accept our <![CDATA[<a href="%1$s">Terms of Use</a>]]> and <![CDATA[<a href="%2$s">Privacy</a>

Weblate shows it to you in the edit section as:
By continuing you accept our <a href=%1$s>Terms of Use</a> and <a href=%2$s>Privacy Policy</a>

Hence, you do not add CDATA to the translation, and the final export for German looks like:
By continuing you accept our &lt;a href="%1$s"&gt;Terms of Use&lt;/a&gt; and &lt;a href="%2$s"&gt;Privacy Policy&lt;/a&gt;

And once all changes are committed, the source translation you uploaded as
By continuing you accept our <![CDATA[<a href="%1$s">Terms of Use</a>]]> and <![CDATA[<a href="%2$s">Privacy</a>

again becomes (in the file system, i.e. the repo):

By continuing you accept our &lt;a href="%1$s"&gt;Terms of Use&lt;/a&gt; and &lt;a href="%2$s"&gt;Privacy Policy&lt;/a&gt;

The issue you've reported needs to be addressed in the translate-toolkit. Please file the issue there, and include links to any relevant specifications about the formats (if applicable).

Thank you for your report; the issue you have reported has just been fixed.

  • In case you see a problem with the fix, please comment on this issue.
  • In case you see a similar problem, please open a separate issue.
  • If you are happy with the outcome, don’t hesitate to support Weblate by making a donation.

Hi and thank you for the update. Unfortunately, the CDATA issue is not resolved. The editing interface shows and seems to save CDATA properly, but it is stripped when the strings file is downloaded.

Steps to reproduce:

  1. Create and edit a string containing CDATA as described above.
  2. Observe that CDATA is preserved when the string is saved/edited/updated in the interface.
  3. Download the strings file for Android.
  4. Observe that CDATA is stripped and the string has turned into
    <string name="service_terms_agreement_notice">By continuing you accept our &lt;a href="%1$s"&gt;Terms of Use&lt;/a&gt; and &lt;a href="%2$s"&gt;Privacy Policy&lt;/a&gt;</string>

While it should actually be exported as
<string name="service_terms_agreement_notice">By continuing you accept our <![CDATA[<a href="%1$s">Terms of Use</a>]]> and <![CDATA[<a href="%2$s">Privacy Policy</a>]]></string>
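To make the reproduction easier to verify, a downloaded file can be compared against the expected text with a small script. This is a minimal sketch; the helper names `cdata_sections` and `lost_cdata` are hypothetical and not part of Weblate or translate-toolkit, and the regular expression only looks for literal CDATA markers rather than parsing the XML.

```python
import re

# Matches one CDATA section and captures its content (non-greedy).
CDATA_PATTERN = re.compile(r"<!\[CDATA\[(.*?)\]\]>", re.DOTALL)


def cdata_sections(xml_text: str) -> list[str]:
    """Return the content of every CDATA section found in xml_text."""
    return CDATA_PATTERN.findall(xml_text)


def lost_cdata(expected: str, exported: str) -> list[str]:
    """Return CDATA contents present in expected but missing from exported."""
    remaining = cdata_sections(exported)
    lost = []
    for section in cdata_sections(expected):
        if section in remaining:
            remaining.remove(section)
        else:
            lost.append(section)
    return lost


# Expected and actual export, copied from this report.
EXPECTED = (
    '<string name="service_terms_agreement_notice">By continuing you accept our '
    '<![CDATA[<a href="%1$s">Terms of Use</a>]]> and '
    '<![CDATA[<a href="%2$s">Privacy Policy</a>]]></string>'
)
EXPORTED = (
    '<string name="service_terms_agreement_notice">By continuing you accept our '
    '&lt;a href="%1$s"&gt;Terms of Use&lt;/a&gt; and '
    '&lt;a href="%2$s"&gt;Privacy Policy&lt;/a&gt;</string>'
)

for section in lost_cdata(EXPECTED, EXPORTED):
    print("CDATA lost:", section)
```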