Forum: CAT Tools Technical Help
Topic: How to extract content from SGML to create TMX file
Poster: gianghl1983
Dear ProZ users,
Recently, I received a bilingual text in SGML, shown below (about 40,000 English-Vietnamese sentences). In the sample below, I replaced the < and > symbols with [[[ and ]]] respectively.
I want to add these sentences to my TM, but I could not find a way to convert this file to TMX.
Does anyone know a solution for this?
Thank you!
------------------------------------------
[[[doc id='N0001']]]
[[[head]]]
[[[title]]]What is a Fenqing ?[[[/title]]]
[[[corpus url=' [url removed] ']]]EVBCorpus[[[/corpus]]]
[[[author [email removed] ']]]Quoc-Hung Ngo, Werner Winiwarter[[[/author]]]
[[[citation]]]Quoc-Hung Ngo, Werner Winiwarter, (2012). "Building an English-Vietnamese Bilingual Corpus for Machine Translation", International Conference on Asian Language Processing 2012 (IALP 2012), pp. 157-160, Ha Noi, Vietnam[[[/citation]]]
[[[/head]]]
[[[text]]]
[[[spair id='1']]]
[[[s id='en1']]]What is a Fenqing ?[[[/s]]]
[[[s id='vn1']]]Fenqing là gì ?[[[/s]]]
[[[/spair]]]
[[[spair id='2']]]
[[[s id='en2']]]Fenqing is a Chinese word which literally means " angry youth " .[[[/s]]]
[[[s id='vn2']]]Fenqing là một từ tiếng Hoa mà nghĩa đen là " thanh niên phẫn nộ " .[[[/s]]]
[[[/spair]]]
[[[spair id='3']]]
[[[s id='en3']]]This word has many translations in English such as cynical youth , young nationalists , hysterical youth and angry young men .[[[/s]]]
[[[s id='vn3']]]Từ này có nhiều cách dịch sang tiếng Anh như là thanh niên hoài nghi , thanh niên theo chủ nghĩa dân tộc , thanh niên cuồng loạn và thanh niên tức giận .[[[/s]]]
[[[/spair]]]
[[[spair id='4']]]
....
[[[/text]]]
[[[/doc]]]
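For a file shaped like the sample above, one possible route is a short script that pulls each spair out with a regular expression and writes the pairs as TMX translation units. The following is only a minimal sketch in Python under several assumptions: every spair holds exactly one 'en' sentence followed by one 'vn' sentence, the file is UTF-8, and the function names, the creationtool value, and the file names in the usage note are made up for illustration. The sample uses 'vn' in its ids, but the sketch writes the ISO 639-1 code 'vi' for Vietnamese in the TMX.

```python
import html
import re
import xml.etree.ElementTree as ET

# One <spair> block: an English <s> followed by a Vietnamese <s>.
# Assumption: ids look like 'en1' / 'vn1', as in the sample.
SPAIR_RE = re.compile(
    r"<spair\b[^>]*>\s*"
    r"<s id='en\d+'>(.*?)</s>\s*"
    r"<s id='vn\d+'>(.*?)</s>\s*"
    r"</spair>",
    re.DOTALL,
)


def extract_pairs(sgml_text):
    """Return a list of (english, vietnamese) sentence pairs."""
    # Undo the [[[ ... ]]] substitution used in the forum post, if present.
    text = sgml_text.replace("[[[", "<").replace("]]]", ">")
    pairs = []
    for en, vi in SPAIR_RE.findall(text):
        # Decode entities such as &amp; so they are not escaped twice later.
        pairs.append((html.unescape(en).strip(), html.unescape(vi).strip()))
    return pairs


def build_tmx(pairs, src_lang="en", tgt_lang="vi"):
    """Build a TMX 1.4 tree with one <tu> per sentence pair."""
    tmx = ET.Element("tmx", {"version": "1.4"})
    ET.SubElement(tmx, "header", {
        "creationtool": "sgml2tmx-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "sgml",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for en, vi in pairs:
        tu = ET.SubElement(body, "tu")
        for lang, seg_text in ((src_lang, en), (tgt_lang, vi)):
            tuv = ET.SubElement(tu, "tuv", {"xml:lang": lang})
            ET.SubElement(tuv, "seg").text = seg_text
    return ET.ElementTree(tmx)


def convert_file(src_path, dst_path):
    """Read an SGML corpus file and write the sentence pairs as TMX."""
    with open(src_path, encoding="utf-8") as f:
        pairs = extract_pairs(f.read())
    build_tmx(pairs).write(dst_path, encoding="utf-8", xml_declaration=True)
    return len(pairs)
```

Calling convert_file("corpus.sgml", "corpus.tmx") (both file names are placeholders) would write the TMX file and return the number of pairs found. Incomplete pairs, such as the truncated spair id='4' in the sample, are simply skipped by the regular expression, so it is worth comparing the returned count with the expected number of pairs.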