Using public corpora in Transit NXT
Welcome to a new tooltip about Transit NXT. Today we would like to mention a very useful resource. The translation workload of some large international institutions generates a great deal of reference material, which anyone can use if the institutions release it in a suitable form, such as the Translation Memory eXchange format (TMX). This is the case with the multilingual document collections of the United Nations and the European Commission (Uncorpora and DGT, respectively).
You can find other similar, though less massive, resources online. They have been released into the public domain by large institutions, such as the European Medicines Agency, the European Central Bank and several other European institutions, or even by communities and groups of volunteer translators who localize free and open source software into their local language and then release the translation memory back to the community. The OPUS corpus is an initiative to centralize this kind of public resource.

These vast resources can be downloaded and converted into Transit NXT's language pair format by means of the Import TMX functionality, which you will find in the resources bar (button Reference material > TMX interface), as we saw in the tooltip How to use a translation memory from another tool.
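Before importing a large public TMX file, it can be useful to check which languages it actually contains. The following is a minimal Python sketch, not part of Transit NXT: the function name count_tmx_languages is hypothetical, and it assumes that the file marks the language of each variant with the xml:lang attribute on the tuv element (as in TMX 1.4) or with a plain lang attribute (as in older TMX versions).

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Expanded name of the xml:lang attribute as ElementTree reports it
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def count_tmx_languages(source):
    """Count translation unit variants (tuv) per language code in a TMX file.

    `source` may be a file path or a binary file object.
    Assumption: languages are given in xml:lang (TMX 1.4) or lang (older TMX).
    """
    counts = Counter()
    # iterparse streams the document, so the segment text of a large
    # corpus does not have to be held in memory all at once
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "tu":
            for tuv in elem.iter("tuv"):
                lang = tuv.get(XML_LANG) or tuv.get("lang")
                if lang:
                    # Normalize case so that EN-GB and en-gb are counted together
                    counts[lang.lower()] += 1
            # Discard the processed translation unit's content
            elem.clear()
    return counts
```

For example, calling count_tmx_languages("corpus.tmx") on a downloaded file (the file name here is only a placeholder) returns a counter that maps each language code to the number of segments available in that language, which shows at a glance whether your language combination is covered.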
Once you have done that, you will be able to add the collection of language pairs to any project and potentially obtain concordance and fuzzy matches, provided, of course, that you translate in one of the language combinations it contains.

And that's all for now. Thanks for reading, and please do not hesitate to send your comments or questions or to ask for specific tooltips.