Forum: CAT Tools Technical Help
Topic: CAT tool with easily adjustable segmentation rules
Poster: Samuel Murray
Post title: @Csaba
[quote]Csaba Lehel wrote:
I still think changing the segmentation rules would be the simplest way, but the only thing I discovered is that this hidden character at the end of cells is called either 'end of cell mark', or 'end of row mark' (or marker). [/quote]
I think block-level segmentation (i.e. at paragraph or cell boundaries) is pretty much hard-coded in most CAT tools. You can adjust the segmentation rules only within those block-level elements; you can't create a rule that makes a segment span more than one block-level element. I know of no CAT tool that can do this with Word or Excel files.
Expand/merge and shrink/split in most CAT tools won't let you merge across block-level boundaries either. The merge/split feature is mostly there to correct cases where the CAT tool guessed the segmentation of a piece of text wrongly.
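To illustrate the point about block-level boundaries, here is a minimal Python sketch of two-stage segmentation. It is not how any particular CAT tool is implemented; the function names and the single break rule are assumptions made up for the example. The document is first cut into blocks (paragraphs or cells), and the adjustable rule is then applied inside each block separately, so no rule ever sees text from two blocks at once.

```python
import re

# Hypothetical adjustable rule: break after ., ! or ? when whitespace follows.
BREAK_RULE = re.compile(r"(?<=[.!?])\s+")


def segment_block(block):
    """Apply the sentence-level rule inside one block (paragraph or cell)."""
    return [s for s in BREAK_RULE.split(block.strip()) if s]


def segment_document(blocks):
    """Segment every block on its own; a segment can never span two blocks."""
    segments = []
    for block in blocks:
        segments.extend(segment_block(block))
    return segments


# A sentence that runs over two table cells stays split, whatever the rule says.
cells = ["This sentence starts in one cell", "and ends in the next. Second sentence."]
print(segment_document(cells))
# ['This sentence starts in one cell', 'and ends in the next.', 'Second sentence.']
```

Changing BREAK_RULE only changes what happens inside each cell. The split between the two cells happens before the rule is ever consulted.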
[quote]At the moment I have just OmegaT, that has nothing similar to join or split. [/quote]
That's right: OmegaT doesn't have a built-in merge and split feature yet. To merge or split in OmegaT, you have to edit the segmentation rules, which are quite complicated. You can merge and split using a script in OmegaT, though (just ask in the OmegaT forum where to get it).
In any case, OmegaT doesn't allow you to set segmentation rules that expand segments beyond the paragraph/cell boundary, so a merge/split feature wouldn't help you here.
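For a rough idea of what editing segmentation rules involves, here is a simplified Python sketch of break and exception rules, each with a "before" and an "after" pattern. It is loosely modelled on SRX-style rules; the rule list, the first-match-wins order and the function name are assumptions for illustration, and OmegaT's own rule format and matching may differ.

```python
import re

# Each rule is (is_break, pattern_before, pattern_after).
# Rules are tried in order; the first one that matches at a position decides.
RULES = [
    (False, r"\b(?:Mr|Dr|etc)\.", r"\s"),  # exception: no break after these abbreviations
    (True, r"[.!?]", r"\s"),               # break after sentence-ending punctuation
]


def segment(text, rules=RULES):
    """Split one block of text according to ordered break/exception rules."""
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in rules:
            ends_with_before = re.search(before + r"\Z", text[:pos])
            starts_with_after = re.match(after, text[pos:])
            if ends_with_before and starts_with_after:
                if is_break:
                    segments.append(text[start:pos].strip())
                    start = pos
                break  # first matching rule wins, whether break or exception
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


text = "Dr. Smith arrived. He sat down."
print(segment(text))             # ['Dr. Smith arrived.', 'He sat down.']
print(segment(text, RULES[1:]))  # ['Dr.', 'Smith arrived.', 'He sat down.']
```

In this model, "merging" means adding an exception rule and "splitting" means adding a break rule, which is why doing it for individual segments is so cumbersome. Both kinds of rule still only operate inside a single paragraph or cell.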
The best current OmegaT solution is to edit the source file itself, so that the text you want as one segment sits in a single paragraph or cell, and then press F5 to reload the project.
[Edited at 2019-06-09 19:16 GMT]