IFRCGo/go-web-app

[PROD] User content translation issues

Opened this issue · 5 comments

tovari commented

We have noticed a few weird cases after the first few FRs got translated:

  • #531

  • Styling codes may appear not only in Arabic but in other languages as well. In this example, https://go.ifrc.org/emergencies/6542#surge, the code is not displayed in the input language (French) but appears in the 3 other languages.

  • part of the text may not be translated (https://go.ifrc.org/emergencies/6542#surge):
    The French input:
    image
    The English translation contains some French (although the same text is translated elsewhere in the same field):
    image
    The Spanish translation, oddly, even contains English text alongside the untranslated French strings:
    image

  • #529

  • #530

Tasks

tovari commented
  • Updating text on the admin page in the original language should trigger the auto translation (unless the "Skip auto translation" attribute is checked). Currently, updating the original text does not update the translations.
  • Emergency and FR title translations should be composed from translated components: the ISO3 code shouldn't be translated, disaster type translations are in the database, and only the user-added part should be translated by the translation API. #1032 is also related to this.
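To make the first task concrete, here is a minimal sketch of the decision it describes: re-translate when the original-language text changes, unless the "Skip auto translation" attribute is checked. The `TranslatableRecord` class and `needs_retranslation` function are hypothetical names for illustration, not the actual go-api models or signals.

```python
from dataclasses import dataclass


@dataclass
class TranslatableRecord:
    """Hypothetical stand-in for an admin-edited record (not the go-api model)."""
    original_text: str
    skip_auto_translation: bool = False


def needs_retranslation(record: TranslatableRecord, new_text: str) -> bool:
    """Return True when saving new_text should queue an auto translation."""
    if record.skip_auto_translation:
        # The editor opted out, so manual translations are left untouched.
        return False
    # Only re-translate when the original-language text actually changed.
    return new_text.strip() != record.original_text.strip()
```

In a Django backend, a check like this would most likely run when the record is saved, but where exactly it belongs depends on how the existing translation pipeline is wired.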

Hey @tovari! For this, we're currently not storing the generated and user-defined parts separately, so we won't be able to do this for the existing field reports / emergencies. Going forward, we can store the user-defined part of the title separately and only translate that.

CC: @udaynwa @samshara

  • Emergency and FR title translations should be composed from translated components: the ISO3 code shouldn't be translated, disaster type translations are in the database, and only the user-added part should be translated by the translation API. Translating the title also changes the country; #1032 is also related to this.

@tovari

Auto-generated titles in the IFRC GO system are currently translated as whole units using the Translation API. To improve accuracy and consistency, translations should be composed by translating individual components of the titles.

Example:

  • Field Report title: {ISO3}: {Disaster} - {Date} {SUMMARY}
    • ISO3 (Country Code) should remain untranslated.
    • Disaster should use static translations from the system.
    • Date should remain untranslated.
    • SUMMARY should be translated using the Translation API.
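The component rules above could be sketched roughly as follows. This is an illustration only: `DISASTER_TRANSLATIONS` stands in for the static translations stored in the database, and the `translate` callable stands in for the Translation API client; both names and the sample translations are assumptions.

```python
from typing import Callable

# Hypothetical static translations; in GO these would come from the database.
DISASTER_TRANSLATIONS = {
    "Storm": {"en": "Storm", "fr": "Tempête", "es": "Tormenta"},
}


def compose_field_report_title(
    iso3: str,
    disaster: str,
    date: str,
    summary: str,
    lang: str,
    translate: Callable[[str, str], str],
) -> str:
    """Build '{ISO3}: {Disaster} - {Date} {SUMMARY}' for one language."""
    # ISO3 and date are passed through untranslated.
    # The disaster label uses the static lookup, falling back to the source label.
    disaster_label = DISASTER_TRANSLATIONS.get(disaster, {}).get(lang, disaster)
    # Only the user-written summary goes through the translation service.
    summary_text = translate(summary, lang) if summary else ""
    return f"{iso3}: {disaster_label} - {date} {summary_text}".strip()
```

With a fake translator such as `lambda text, lang: f"[{lang}] {text}"`, calling the function for `lang="fr"` leaves `NPL` and the date unchanged, swaps in the static disaster label, and sends only the summary to the translator.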

Technical Details

Auto-generated Titles in IFRC GO:

  1. DREF Application
    • {Country} {Disaster} {Year}
  2. DREF Operational Update
    • {Country} {Disaster} {Year}
  3. DREF Final Report
    • {Country} {Disaster} {Year}
  4. Flash Update
    • {Countries} - {Hazard} {Date}
  5. Field Report / Emergency
    • {ISO3}: COVID-19
    • {ISO3}: COVID-19 #{Field Report Number} ({Date})
    • {ISO3}: {Disaster} - {Date} {SUMMARY} #{Field Report Number} ({Date})
    • {ISO3}: {Disaster} - {Date} {SUMMARY}
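If title generation moves to the backend, the patterns listed above could be kept in one place as format strings. The sketch below is one possible shape; the dictionary keys and the `generate_title` helper are hypothetical names, and the placeholders simply mirror the patterns in this comment.

```python
# Hypothetical registry of the title patterns listed above.
TITLE_TEMPLATES = {
    "dref_application": "{country} {disaster} {year}",
    "dref_operational_update": "{country} {disaster} {year}",
    "dref_final_report": "{country} {disaster} {year}",
    "flash_update": "{countries} - {hazard} {date}",
    "field_report_covid": "{iso3}: COVID-19",
    "field_report_covid_numbered": "{iso3}: COVID-19 #{number} ({date})",
    "field_report_numbered": "{iso3}: {disaster} - {date} {summary} #{number} ({date})",
    "field_report": "{iso3}: {disaster} - {date} {summary}",
}


def generate_title(kind: str, **components) -> str:
    """Fill the template for `kind`; raises KeyError if a component is missing."""
    return TITLE_TEMPLATES[kind].format(**components)
```

For example, `generate_title("flash_update", countries="Nepal, India", hazard="Flood", date="2024-07-01")` fills the Flash Update pattern. Keeping the components as named arguments is what allows each one to be translated (or left alone) before the title is assembled.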

Current Implementation

  • Titles are generated on the frontend.
  • Titles are translated as a whole using the Translation API on the backend.

Planned Solution

  • Move title generation logic to the backend to facilitate translation of individual components.
  • Store all title components separately.
  • Update Field Report model and forms to store the SUMMARY component separately.
  • Add a migration to identify and separate the SUMMARY component for existing Field Reports.

Migration of Field Report: Possible Cases

  1. Structure and Data Match

    • Previous Summary: NPL: Storm - 2022/10/12 100 people displaced due to flooding
    • Country: NPL
    • Disaster: Storm
    • Date: 2022-10-12
    • New Summary: 100 people displaced due to flooding
    • New Title (en): NPL: Storm - 2022/10/12 100 people displaced due to flooding
  2. Structure Match and Data Mismatch

    • Previous Summary: NPL: Storm - 2022/10/12 100 people displaced due to flooding
    • Country: IND
    • Disaster: Storm
    • Date: 2022-10-12
    • New Summary: 100 people displaced due to flooding
    • New Title (en): ?
  3. Structure Mismatch

    • Previous Summary: (Storm) 100 people affected storm NPL 2021-10-12
    • Country: IND
    • Disaster: Storm
    • Date: 2022-10-12
    • New Summary: ?
    • New Title (en): ?
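The three cases could be detected with a check along these lines. This is a sketch, not the analysis script actually used: the regular expression covers only the plain `{ISO3}: {Disaster} - {Date} {SUMMARY}` pattern (no COVID-19 or `#{Field Report Number}` variants), assumes the `YYYY/MM/DD` date format from the examples, and all names are hypothetical.

```python
import re
from datetime import date
from typing import Optional, Tuple

# Matches '{ISO3}: {Disaster} - {Date} {SUMMARY}' with a YYYY/MM/DD date.
TITLE_RE = re.compile(
    r"^(?P<iso3>[A-Z]{3}): (?P<disaster>.+?) - "
    r"(?P<date>\d{4}/\d{2}/\d{2})\s*(?P<summary>.*)$"
)


def classify(
    previous_summary: str, country_iso3: str, disaster: str, start_date: date
) -> Tuple[str, Optional[str]]:
    """Return (case, extracted_summary) for one existing Field Report."""
    match = TITLE_RE.match(previous_summary)
    if match is None:
        # Case 3: the stored text does not follow the generated pattern.
        return "structure_mismatch", None
    data_ok = (
        match["iso3"] == country_iso3
        and match["disaster"] == disaster
        and match["date"] == start_date.strftime("%Y/%m/%d")
    )
    # Case 1 when the parsed parts agree with the stored fields, else case 2.
    case = "structure_and_data_match" if data_ok else "structure_match_data_mismatch"
    return case, match["summary"]
```

Under this pattern, everything after the date is treated as the user-written summary, including a leading number such as `100`. For case 2, the function still returns the extracted summary, but which country to use in the new title remains an open question.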

Field Report Title Analysis using Production Database

Field Report Count: 16,557

General Field Report Count: 14,604

  • Structure Match and Data Match: General 398, Field Report 98
  • Structure Match and Data Mismatch: General 30, Field Report 3
  • General Structure Match Count: 428
  • General Structure Mismatch Count: 14,176

COVID Field Report Count: 1,953

  • COVID Structure Match and Data Match: COVID 0, COVID FR Match 49
  • COVID Structure Match and Data Mismatch: COVID 5, COVID FR 0
  • COVID Structure Match Count: 54
  • COVID Mismatch Count: 1,899

Overall Count

  • Field Report Count: 16,557
  • Structure Match Count: 583
  • Structure Mismatch Count: 15,974

Additional Data

  • Country Missing Data: 0
  • Summary Missing Data: 4,169
  • Start Date Missing Data: 11,690
  • Report Date Missing Data: 11,324
  • Event Missing Data: 11,286
  • Disaster Missing Data: 0

Note

We also need to consider how to handle cases of data mismatch, where the disaster in the field report and the type of disaster in the emergency report do not match.

cc @udaynwa @tnagorra @thenav56

tovari commented

@samshara, as we discussed in the developer call, we don't need to do this (change titles, modify translated titles) for existing emergencies and FRs. We need to do this for future emergencies and FRs.