Channel: ProZ.com Translation Forums

From Ms Word table to TMX file | @Hans

Forum: CAT Tools Technical Help
Topic: From Ms Word table to TMX file
Poster: Samuel Murray
Post title: @Hans

[quote]Hans Lenting wrote:
  • Replace bold and italic formatting and ampersands with markup. [/quote]
    Before you deal with formatting, you have to ask yourself what kind of a TMX file your target system will accept. If it's a fairly modern system, it should be able to handle standard TMX formatting tags, but it may be that your CAT tool has specific additional requirements, e.g. that the formatting tags must look a certain way.

    For example:

    The sentence "The [b]cat[/b] sat on the [u][i]mat[/i][/u]." would have to end up like this:
    <seg>The <bpt type="b">{{b}}</bpt>cat<ept>{{/b}}</ept> sat on the <bpt type="u">{{u}}</bpt><bpt type="i">{{i}}</bpt>mat<ept>{{/i}}</ept><ept>{{/u}}</ept>.</seg>
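Just to make the idea concrete, here is a rough Python sketch of that conversion. It assumes the text has already been exported with [b], [i] and [u] markers, and that the target tool wants the {{b}}-style placeholders from my example. Both are assumptions, and the function name to_seg is made up for illustration. Your CAT tool may well expect something different (e.g. numbered tag pairs), so check a TMX that the tool itself exported first.

```python
import re
from xml.sax.saxutils import escape

# Assumption: formatting is marked with [b]...[/b], [i]...[/i], [u]...[/u].
TAG = re.compile(r"\[(/?)([biu])\]")

def to_seg(text):
    """Turn bracket markup into a <seg> with bpt/ept placeholder tags."""
    parts = []
    pos = 0
    for m in TAG.finditer(text):
        # Escape plain text so that &, < and > are valid inside XML.
        parts.append(escape(text[pos:m.start()]))
        closing, name = m.group(1), m.group(2)
        if closing:
            parts.append("<ept>{{/%s}}</ept>" % name)
        else:
            parts.append('<bpt type="%s">{{%s}}</bpt>' % (name, name))
        pos = m.end()
    parts.append(escape(text[pos:]))
    return "<seg>" + "".join(parts) + "</seg>"

print(to_seg("The [b]cat[/b] sat on the [u][i]mat[/i][/u]."))
```

The escape() call also takes care of the ampersands Hans mentioned, since a bare & would make the TMX file invalid XML.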

    The problem is, it's easy to replace a single bold character with the same character plus markup, but it's not easy to replace a whole run of bold characters with that same run wrapped in a single pair of tags. And I don't think any CAT tool can automatically convert this:
    <b>t</b><b>h</b><b>i</b><b>s</b>
    into this:
    <b>this</b>

    Can you think of a Find syntax in Word that would find a piece of bold text and select the entire bold run? I can't. As far as I can tell, this is because Word's wildcard search (its version of regex) is non-greedy, so you can't tell it to select an entire piece of bold text.

    That said (just thinking out loud), you could tell it to replace this:
    </b><b>
    with nothing.
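
If you do that replacement outside Word, on the exported text, it could look something like this in Python. This is only a sketch: it assumes simple <b>, <i> and <u> tags without attributes, and merge_adjacent is a made-up name.

```python
def merge_adjacent(text, tags=("b", "i", "u")):
    """Remove close/open pairs of the same tag that sit directly next to each other."""
    changed = True
    # Repeat until nothing changes, because merging one tag type
    # can bring two tags of another type next to each other.
    while changed:
        changed = False
        for t in tags:
            merged = text.replace("</%s><%s>" % (t, t), "")
            if merged != text:
                text, changed = merged, True
    return text

print(merge_adjacent("<b>t</b><b>h</b><b>i</b><b>s</b>"))  # <b>this</b>
```

Note that this only merges tags that touch each other. Two bold words separated by a non-bold space will stay as two separate bold runs.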

    [Edited at 2022-08-21 13:58 GMT]
