Forum: CAT Tools Technical Help
Topic: Merging translation memories
Poster: FarkasAndras
Post title: No
[quote]TransAndLoc wrote:
Hello,
In our translation company we are experts about translation memory issues.
A tmx file can be opened using Notepad or even better Notepad++ (recommended).
Open both tmx files using Notepad++. Copy the text from <body> until </body>. Do NOT include the tags <body> nor </body> only the text in between. Then paste this text in the second tmx file right after the tag </body>. Save. Now you will have both translation memories in one tmx file.
[/quote]
Yes, you can do this, but I would definitely not recommend making it standard practice. You might get mixed language codes, or languages in mixed order, or you might mess it up one time and struggle to find where you went wrong. The two TMX files might be in different encodings, and I'm not 100% sure that non-ASCII characters will copy-paste correctly in all scenarios between files in different encodings. I do mess around in TMX files manually sometimes, but it's not the best practice for regular use, especially for someone who doesn't know the internals that well. I have written programs that generate and read TMX files, so I have a good grasp of what the format is like and can fix problems. A cryptic error message from a CAT tool after an incorrect copy-paste would stop most people dead in their tracks.
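To make the language-code point concrete, here is a minimal Python sketch that lists the language codes used in a TMX file, so two files can be compared before anyone merges them. The function name language_codes and the sample data are made up for illustration. It assumes the codes sit in an xml:lang attribute on each tuv element (as in TMX 1.4) or in a plain lang attribute (as I recall older versions doing). Passing the raw bytes to the parser lets it honor the encoding declared in the file, which sidesteps the copy-paste encoding question.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def language_codes(tmx_bytes):
    """Return the set of language codes found on <tuv> elements."""
    root = ET.fromstring(tmx_bytes)
    codes = set()
    for tuv in root.iter("tuv"):
        # Assumption: xml:lang (TMX 1.4) or plain lang (older files).
        code = tuv.get(XML_LANG) or tuv.get("lang")
        if code:
            codes.add(code)
    return codes

# Two tiny hypothetical files that use different codes for English.
first = (b'<tmx version="1.4"><header/><body><tu>'
         b'<tuv xml:lang="en-US"><seg>Hi</seg></tuv>'
         b'<tuv xml:lang="hu-HU"><seg>Szia</seg></tuv>'
         b'</tu></body></tmx>')
second = (b'<tmx version="1.4"><header/><body><tu>'
          b'<tuv xml:lang="EN"><seg>Bye</seg></tuv>'
          b'<tuv xml:lang="hu-HU"><seg>Viszlat</seg></tuv>'
          b'</tu></body></tmx>')

# Codes that appear in only one of the two files.
mismatch = language_codes(first) ^ language_codes(second)
print(sorted(mismatch))
```

If the printed list is not empty, a blind copy-paste would produce a TM with mixed language codes.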
By the way, you need to use character references to get tags to show up here. I fixed part of your post to make it more intelligible. The character reference problem that broke your post is exactly the one that will break your TMs if you follow this procedure (see below). But then an expert knows this already, right?
[quote]TransAndLoc wrote:
Now you can use Word to clean the code.
...
[/quote]
Yes, you can do this too, but you shouldn't. If you want to convert TMX to a table, there are better solutions. Quite apart from the potential for errors, this solution doesn't handle character references, so all <, >, & and quote characters will be messed up. It's also a little tedious for regular use. In short, you'd get a mess if you used this method on real-world TMX files.
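For illustration, this is roughly what handling character references properly looks like. It is a minimal Python sketch, not a finished converter: tmx_to_rows and the sample segment are hypothetical, and it simply flattens any inline markup inside seg to its text. The point is that an XML parser decodes the character references for you, which find-and-replace in Word does not.

```python
import xml.etree.ElementTree as ET

def tmx_to_rows(tmx_bytes):
    """Return one row per <tu>, one cell per <tuv>, with references decoded."""
    root = ET.fromstring(tmx_bytes)
    rows = []
    for tu in root.iter("tu"):
        row = []
        for tuv in tu.iter("tuv"):
            seg = tuv.find("seg")
            # itertext() also collects text nested in inline elements.
            row.append("".join(seg.itertext()) if seg is not None else "")
        rows.append(row)
    return rows

# Hypothetical sample: the file stores &lt; &gt; &amp; &quot;, not the raw characters.
sample = b"""<tmx version="1.4"><header/><body>
<tu>
<tuv xml:lang="en"><seg>Use &lt;b&gt; for bold &amp; &quot;strong&quot; text</seg></tuv>
<tuv xml:lang="hu"><seg>3 &lt; 5</seg></tuv>
</tu>
</body></tmx>"""

rows = tmx_to_rows(sample)
print(rows[0][0])
print(rows[0][1])
```

The rows can then be written out with the csv module or pasted into a spreadsheet, with the <, >, & and quote characters intact.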
To reply to the OP's question, the solution is:
1) require all translators to send TMX files in all cases. All CAT tools can import/export TMX so it shouldn't be a problem. There is no need for you to handle other formats.
2) find a tool that will merge TMX files for you. There are many options, including any CAT tool you may have.
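If you are curious what such a tool does at its core, here is a minimal Python sketch under stated assumptions. merge_tmx is a made-up name, the header of the first file is kept as-is, nothing is deduplicated, and the language codes are not checked. A CAT tool or a dedicated merger does considerably more, and ElementTree drops any DOCTYPE line, so treat this as an illustration rather than a recommendation.

```python
import xml.etree.ElementTree as ET

def merge_tmx(first_bytes, second_bytes):
    """Append every <tu> of the second TMX to the <body> of the first."""
    first = ET.fromstring(first_bytes)
    second = ET.fromstring(second_bytes)
    body = first.find("body")
    other_body = second.find("body")
    if body is None or other_body is None:
        raise ValueError("both inputs need a <body> element")
    for tu in other_body.findall("tu"):
        body.append(tu)
    # Serialize as UTF-8; character references are re-escaped on output.
    return ET.tostring(first, encoding="utf-8", xml_declaration=True)

# Hypothetical one-unit files for demonstration.
tm_a = (b'<tmx version="1.4"><header srclang="en"/><body><tu>'
        b'<tuv xml:lang="en"><seg>A &amp; B</seg></tuv>'
        b'</tu></body></tmx>')
tm_b = (b'<tmx version="1.4"><header srclang="en"/><body><tu>'
        b'<tuv xml:lang="en"><seg>C &lt; D</seg></tuv>'
        b'</tu></body></tmx>')

merged = merge_tmx(tm_a, tm_b)
print(len(ET.fromstring(merged).findall("./body/tu")))
```

Because the parser reads and writes the files as XML, the inserted units land inside the body element and the & and < characters stay correctly escaped in the output.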
[Edited at 2016-02-03 10:52 GMT]