Forum: CAT Tools Technical Help
Topic: Trados and PDF files
Poster: José Henrique Lamensdorf
Post title: Getting it straight
[quote]Clarisa Moraña wrote:
Fadwa, a PDF file is a non editable file... [/quote]
There are two major types of PDF files:
a) editable, "live" or distilled; and
b) non-editable, "dead", or scanned.
The second type is equivalent to a hard copy original. A letter "O" will be nothing other than a round-shaped drawing. It is necessary to do OCR for any computer-driven (e.g. Trados-powered) translation.
Editable PDF files contain the entire text, which can be translated. Such files can be generated by any software capable of printing to a Postscript printer (Postscript is a standard, not a brand).
To give you the complete picture, Postscript printers (no longer manufactured, AFAIK) used a special (computer) language with the same name. The computer sent an entire file "translated" into this language as a *.ps file, and a Postscript-compatible printer would print it accurately. So this is a printing file, i.e. everything is in place, though no longer assembled together as it was in the originating file (e.g. Word, Excel, PPT, InDesign, FrameMaker, whatever).
Instead of printing to hard copy, the *.ps file can also be "distilled" into a PDF file, whose main feature is the availability of reader programs for each computer/operating system. The advantage of the PDF is that any computer fitted with its specific reader program will open a PDF exactly the same, no matter if it's a PC with Windows, Linux, or DOS, a Macintosh, or any Android device (tablet, cell phone, TV box).
Previously, translators working on DTP-ed files had to either own and learn to operate the specific DTP app used (their proprietary files are not mutually compatible), or team up with a suitable DTP operator. There are some converters from PDF to DOC; however, they tend to create an uncontrollable quantity of text boxes in the latter, which most CAT tools have trouble with.
The modern trend for translating DTP-ed publications is to work directly on PDF files. While Trados (and other CAT tools) can trespass into them to get that text translated, they can't adjust layout issues created by text swelling, shrinking, or shifting position in the translation process.
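To make the "text swelling" issue concrete, here is a minimal Python sketch that flags translated segments likely to overflow their original space. The function name flag_swollen and the 1.2 threshold are my own assumptions for illustration. Character count is only a crude stand-in for rendered width, which really depends on the font and size used in the PDF.

```python
def flag_swollen(pairs, max_ratio=1.2):
    """Return (source, target, ratio) for targets that grew past max_ratio.

    pairs is an iterable of (source_text, target_text) tuples.
    max_ratio is an arbitrary illustrative threshold, not a standard value.
    """
    flagged = []
    for source, target in pairs:
        if not source:
            continue  # skip empty segments to avoid dividing by zero
        ratio = len(target) / len(source)
        if ratio > max_ratio:
            flagged.append((source, target, ratio))
    return flagged


segments = [("Settings", "Configurações"), ("Open", "Abrir")]
for source, target, ratio in flag_swollen(segments, max_ratio=1.3):
    print(f"{source!r} -> {target!r} grew by a factor of {ratio:.2f}")
```

A check like this only tells you where to look. Fixing the layout still takes a DTP-capable tool.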
So some solutions have come up. One of them is [url= [url removed] ]Infix[/url]. It is a PDF editor, fitted with special features for translation. Basically, it tags both the text and the PDF, and exports all the text in TXT, XML, or XLIFF format to be translated outside - using any CAT tool you like - and later imported back, each text block into the right place, with the proper font, size, color etc. As it is a PDF editor, it includes all the DTP tools you may need to adjust text that no longer fits the space provided.
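The export/translate/import round trip described above can be sketched in Python with the standard library only. This is NOT how Infix works internally; it only illustrates the general idea of tagging each text block with an ID, writing the blocks to a bare-bones XLIFF 1.2 file, and mapping the translations back by ID. The function names, the block IDs, and the file name "document.pdf" are all hypothetical.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"
ET.register_namespace("", XLIFF_NS)  # serialize without a namespace prefix


def export_blocks(blocks, src_lang, tgt_lang):
    """Build an XLIFF 1.2 string with one trans-unit per tagged text block."""
    root = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(root, f"{{{XLIFF_NS}}}file", {
        "original": "document.pdf",  # hypothetical file name
        "source-language": src_lang,
        "target-language": tgt_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    for block_id, text in blocks.items():
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit", {"id": block_id})
        ET.SubElement(unit, f"{{{XLIFF_NS}}}source").text = text
    return ET.tostring(root, encoding="unicode")


def add_targets(xliff_text, translate):
    """Stand-in for the CAT tool step: add a <target> to every unit."""
    root = ET.fromstring(xliff_text)
    for unit in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        source = unit.find(f"{{{XLIFF_NS}}}source")
        target = ET.SubElement(unit, f"{{{XLIFF_NS}}}target")
        target.text = translate(source.text)
    return ET.tostring(root, encoding="unicode")


def import_blocks(xliff_text):
    """Map each block ID back to its translated text."""
    root = ET.fromstring(xliff_text)
    translated = {}
    for unit in root.iter(f"{{{XLIFF_NS}}}trans-unit"):
        target = unit.find(f"{{{XLIFF_NS}}}target")
        translated[unit.get("id")] = target.text
    return translated


# Block IDs stand for "the right place" in the PDF; the dict lookup
# stands in for the translator working in a CAT tool.
blocks = {"p1-b1": "User Manual", "p1-b2": "Read before use."}
memory = {"User Manual": "Manual do Usuário",
          "Read before use.": "Leia antes de usar."}
xliff = export_blocks(blocks, "en", "pt-BR")
print(import_blocks(add_targets(xliff, memory.get)))
```

The point of the IDs is that the CAT tool never needs to know anything about the page layout: it sees only source and target text, while the PDF editor keeps the position, font, size, and color associated with each ID.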
Of course, this is easier said than done. There are several other issues (e.g. partially embedded fonts among them) at play, but it is definitely faster, cheaper, and more efficient than the translator buying and learning to operate 3-4 different DTP apps (usually $$$). After all, a translator is not expected to change the existing layout.