Development of the TWEC method [Palmonari et al.] with language shift
- Verify that the vector representations of a word translated into different languages are similar
- We used the TWEC approach to align documents translated into different languages
- We used TWEC as-is and our own modified version of TWEC, which we call TWEC-IIS (Identify injection substitution)
- Analysis of the results obtained
- Proceedings of the European Parliament (1996-2011)
- Extracted from the website of the European Parliament
- Composed of 1,946,253 unstructured sentences. Download dataset here
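
The similarity check named in the first point of the list can be sketched as a cosine comparison between the vectors of a word and its translation, taken from two embedding spaces that have already been aligned. This is only a minimal illustration and not the project's code: the function names, the dictionary-based embeddings, the toy vectors, and the English/Italian word pairs are all assumptions made for the example. In practice the vectors would come from the models trained with TWEC or TWEC-IIS.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def translation_similarities(emb_a, emb_b, pairs):
    """Score each (word_a, word_b) translation pair found in both embeddings.

    emb_a and emb_b are hypothetical word -> vector mappings for two
    languages, assumed to live in the same (already aligned) space.
    Pairs with a missing word are skipped.
    """
    scores = {}
    for word_a, word_b in pairs:
        if word_a in emb_a and word_b in emb_b:
            scores[(word_a, word_b)] = cosine_similarity(emb_a[word_a], emb_b[word_b])
    return scores


# Toy vectors, invented for illustration only.
emb_en = {"house": [1.0, 0.0, 0.0], "dog": [0.0, 1.0, 0.0]}
emb_it = {"casa": [0.9, 0.1, 0.0], "cane": [0.0, 1.0, 0.0]}
pairs = [("house", "casa"), ("dog", "cane"), ("cat", "gatto")]

sims = translation_similarities(emb_en, emb_it, pairs)
for (word_a, word_b), score in sims.items():
    print(f"{word_a} / {word_b}: {score:.3f}")
```

A score close to 1 for a translation pair suggests that the two words occupy a similar position in the shared space. The pair ("cat", "gatto") is skipped here because neither toy embedding contains it.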
If you want to know more about this project, please have a look at the presentation.