Forum: CAT Tools Technical Help
Topic: memoQ error: id attribute of tag "bpt"
Poster: Patrick Hopkins
Post title: Here's the answer
I couldn't find an answer to this problem on the web, but I did figure it out for myself, so I thought I'd share the solution here (even though it's been about 2 years since the original post) because I'm pretty sure others will run into it.
A few points to start with:
- You'll need a text editor to fix the problem. I suggest something like Notepad++ because it can handle big files and highlights tags. Otherwise standard Notepad or similar should also work.
- You'll also need a program to open ZIP files (like 7-zip or similar).
- You need to know whether the memoQ file you're trying to import is compressed (zipped) or not. I'm no memoQ expert, but I believe you can tell the difference by the extension of the filename. If the extension ends with the letter "z" (e.g. translation.docx_eng-US.mqxlz) then I believe the file is compressed/zipped.
- The memoQ error will specify a line number and position. Keep that error open on the monitor or write down those two numbers for reference later.
Now let's fix the problem. I'll be explaining from a Windows point of view. For other OSs you'll have to adapt.
memoQ files (like those of other CAT tools) are basically XML files in text format, so they're fairly easy to fix when there are problems. You just have to know what to look for.
If your memoQ file is zipped (i.e. extension ends in "z") then right click on it and "Open archive" with 7-Zip. You will see one or more files. Extract the one that ends in the extension .mqxliff to your desktop (it's probably named document.mqxliff).
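If you'd rather script the extraction than click through 7-Zip, here is a minimal Python sketch. It assumes, as the post does, that the "z" file is an ordinary ZIP archive containing a member whose name ends in .mqxliff; the function name extract_mqxliff is made up for this example.

```python
import zipfile
from pathlib import Path


def extract_mqxliff(archive_path, out_dir):
    """Extract every archive member ending in .mqxliff; return the extracted paths.

    Assumption: the memoQ "z" file is a plain ZIP archive.
    """
    extracted = []
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            if name.lower().endswith(".mqxliff"):
                # ZipFile.extract returns the path of the file it wrote.
                extracted.append(Path(archive.extract(name, out_dir)))
    return extracted
```

For example, extract_mqxliff("translation.docx_eng-US.mqxlz", "extracted") should leave the .mqxliff file in a folder called "extracted", ready to open in a text editor.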
Right click the extracted file and open it with Notepad++ or some other text editor. You should be able to read the text clearly as it is in XML format. If you see random letters and codes then the file is compressed/zipped, so you'll need to close the file and uncompress it as explained above.
Use the key combination Ctrl-G to go to the Line Number specified by the memoQ error.
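If the line is very long, the position number from the error helps too. The Python sketch below pulls out the text around a given line and position. It assumes both numbers are 1-based, which is a guess about how memoQ reports them; context_at is a hypothetical helper name.

```python
def context_at(text, line_no, position, width=40):
    """Return up to `width` characters on each side of the reported position.

    Assumption: line_no and position are both 1-based.
    """
    line = text.split("\n")[line_no - 1]
    index = position - 1
    start = max(0, index - width)
    end = min(len(line), index + width)
    return line[start:end]
```

Read the extracted file into a string, call context_at with the two numbers from the error, and you should see the tags near the spot memoQ is complaining about.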
You will see that the XML file basically has a few lines for each segment, but the ones that interest us are those for the source and the target. Following is an example (square brackets [] stand in for the angle brackets of the real file, otherwise the lines won't be reproduced correctly in this message):
[source xml:space="preserve" mq:segpart="1842" mq:hasfollowingobject="hasfollowingobject"][bpt id="1" ctype="bold"]{}[/bpt][bpt id="1" rid="1"][/bpt][bpt id="3" rid="2"][/bpt]Informazioni relative agli strumenti finanziari derivati ex art. 2427-bis del Codice Civile[ept id="4" rid="2"][/ept][ept id="5" rid="1"][/ept][ept id="1"]{}[/ept][/source]
[target xml:space="preserve"][bpt id="1" ctype="bold"]{}[/bpt][bpt id="1" rid="1"][/bpt][bpt id="3" rid="2"][/bpt]Information on derivative financial instruments pursuant to art. 2427-bis of the Italian Civil Code[ept id="4" rid="2"][/ept][ept id="5" rid="1"][/ept][ept id="1"]{}[/ept][/target]
The source and target lines hold the two parts of the segment (in this case Italian and English). The bpt tags are used to add bold, italics etc., and each bpt is given an ID number (id="1", id="2" etc.). Ideally, the tags are repeated in both the source and the target, so what's italic in the source is also italic in the target.
So where's the error?
On a few different occasions the file generated by memoQ itself (!) contained an error: the same ID number was given to two bpt tags in the same line. In fact, if you look at my example lines above, you'll see that id="1" appears twice among the bpt tags in both the source and the target. The bpt tags have id=1, id=1 and id=3. That's your error: two bpt tags share id 1. All you need to do is change the id of the second one (the bpt that also has rid="1") from "1" to "2" in both the source and the target, first making sure no other tag in the line already has id="2". Note that rid="2" in the example is a different attribute and doesn't clash.
For my example lines the update would look like this.
[source xml:space="preserve" mq:segpart="1842" mq:hasfollowingobject="hasfollowingobject"][bpt id="1" ctype="bold"]{}[/bpt][bpt id="2" rid="1"][/bpt][bpt id="3" rid="2"][/bpt]Informazioni relative agli strumenti finanziari derivati ex art. 2427-bis del Codice Civile[ept id="4" rid="2"][/ept][ept id="5" rid="1"][/ept][ept id="1"]{}[/ept][/source]
[target xml:space="preserve"][bpt id="1" ctype="bold"]{}[/bpt][bpt id="2" rid="1"][/bpt][bpt id="3" rid="2"][/bpt]Information on derivative financial instruments pursuant to art. 2427-bis of the Italian Civil Code[ept id="4" rid="2"][/ept][ept id="5" rid="1"][/ept][ept id="1"]{}[/ept][/target]
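Picking the replacement number by eye is easy to get wrong on a long line. As a rough aid, the Python sketch below returns the smallest positive number not already used as an id on a line. It works on the real file text (angle brackets, not the square brackets used in this message), it assumes the ids are plain integers as in the example, and next_free_id is a made-up name.

```python
import re

# Matches id="N" but not rid="N": \b needs a word boundary before "id".
ID_ATTR = re.compile(r'\bid="(\d+)"')


def next_free_id(line):
    """Return the smallest positive integer not used as an id on this line."""
    used = {int(value) for value in ID_ATTR.findall(line)}
    candidate = 1
    while candidate in used:
        candidate += 1
    return candidate
```

The sketch only suggests a number. The edit itself is still the manual change shown above, made in both the source and the target line.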
Save the file and you're done.
At this point you can import this file (document.mqxliff) directly, unzipped. There's no need to rezip it.
I've had a few cases where I had to do this process 2 or 3 times with the same file because there were other lines with repeated IDs. The problem was always the same, as was the fix. Eventually you fix them all and then the file imports OK.
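To avoid the import-fix-import loop, you could list every affected line in one pass. This is a minimal Python sketch, not a memoQ feature. It assumes each source or target element sits on a single line (as in the example), that the file is UTF-8, and that the real file uses angle brackets. The function names are invented for this example.

```python
import re
from collections import Counter

# Captures the id of each bpt tag; \bid skips the rid attribute.
BPT_ID = re.compile(r'<bpt\b[^>]*?\bid="([^"]*)"')


def duplicate_bpt_ids(line):
    """Return the bpt id values that occur more than once on one line."""
    counts = Counter(BPT_ID.findall(line))
    return sorted(value for value, n in counts.items() if n > 1)


def scan_text(text):
    """Return (line_number, duplicated_ids) for each line with repeated bpt ids."""
    findings = []
    for number, line in enumerate(text.split("\n"), start=1):
        dupes = duplicate_bpt_ids(line)
        if dupes:
            findings.append((number, dupes))
    return findings


def scan_file(path):
    """Scan a file on disk; utf-8-sig also copes with a leading BOM."""
    with open(path, encoding="utf-8-sig") as handle:
        return scan_text(handle.read())
```

Calling scan_file("document.mqxliff") returns a list of line numbers with the ids repeated on each, so you can fix them all in the editor before trying the import again.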
That's it.
Good luck!