TMLookup | Well...

Forum: CAT Tools Technical Help
Topic: TMLookup
Poster: FarkasAndras
Post title: Well...

[quote]Michael Joseph Wdowiak Beijer wrote:

[quote]FarkasAndras wrote:

[quote]Michael Joseph Wdowiak Beijer wrote:

The TMX was created by Déjà Vu X3. Just sent it to you.

Thanks for the new version!

Michael [/quote]

As expected, this is due to creative tag formatting. The tmx has the tag split between two lines and TMLookup expects it to be on one line. Again, like a previous issue, this is because TMLookup doesn't have a proper xml parser because I can't be bothered to learn how to implement one. So instead of wrangling with a horrible coding problem I'm left wrangling with somewhat less horrible troubleshooting problems every now and then. So it goes. I could just look for xml:lang= without the tuv, but in principle some tmx files could have other elements where the language is specified with xml:lang=, not just the text itself. So then it could break on those. This can be solved in multiple ways of course, but none of them are trivial or appetizing to me. Implementing a proper parser is the least appetizing of all. So maybe I'll fix this... maybe not. Adding the language codes to the filename should work.

In the meantime, the new version of sqlite that will allow for somewhat fancier/faster text searches is trickling down the pipeline. It went through two stages and it is now one step away from where I can start fiddling with it. We'll see.
[/quote]

Indeed, I just had another look at it, and they sure got creative with the line breaks.

Until you fix it I'll just add the language codes to the filename, which seems to work fine. I might also mention to Atril support that their TMXs are a little weird. [/quote]
To be fair, their tmx is perfectly valid. Perhaps a little unusual, but fine. It's my "parsing" that is not up to scratch.
BTW, if you have a lot of tmx files to import, you could fix them by removing the line breaks after <tuv. sed and other tools can do mass replacements across multiple files.
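
If you'd rather not fight with sed, here is a rough Python sketch of the same mass replacement. It assumes the only problem is a line break right after <tuv, and that the files are UTF-8 (many tmx files are UTF-16, so pass the right encoding). The function names are made up for this example and are not part of TMLookup. Try it on copies first.

```python
import re
from pathlib import Path

# "<tuv" followed by whitespace that contains at least one line break.
_TUV_BREAK = re.compile(r"<tuv\s*[\r\n]\s*")


def join_tuv_lines(text: str) -> str:
    """Pull the attributes of a split <tuv ...> tag back onto the "<tuv" line."""
    return _TUV_BREAK.sub("<tuv ", text)


def fix_tmx_files(folder: str, encoding: str = "utf-8") -> int:
    """Rewrite every *.tmx file in `folder` in place; return how many changed."""
    changed = 0
    for path in Path(folder).glob("*.tmx"):
        # Work on bytes so the remaining line endings are left untouched.
        original = path.read_bytes().decode(encoding)
        fixed = join_tuv_lines(original)
        if fixed != original:
            path.write_bytes(fixed.encode(encoding))
            changed += 1
    return changed
```

Something like fix_tmx_files("my_tmx_folder", encoding="utf-16") would then process a whole folder in one go. Tags that are already on one line are left alone.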

I don't want the codes in filenames to overrule the language codes read from inside the file... the latter tend to be more reliable, I would think.
But in your specific example, "nl-BE" should be read by TML as nl and "en-US" as en... it chops off the bit after the hyphen.
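
To illustrate the chopping, a minimal sketch in Python. This is not TMLookup's actual code; the function name is hypothetical, and the lowercasing is my own assumption on top of what is described here.

```python
def base_language(code: str) -> str:
    """Reduce a code like "nl-BE" to its primary part, "nl"."""
    # Keep only what comes before the first hyphen.
    primary = code.strip().split("-", 1)[0]
    # Assumption: compare codes case-insensitively by lowercasing them.
    return primary.lower()
```

So base_language("nl-BE") gives "nl" and base_language("en-US") gives "en", while a plain "en" passes through unchanged.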
