TMLookup | options

Forum: CAT Tools Technical Help
Topic: TMLookup
Poster: FarkasAndras
Post title: options

[quote]Michael Beijer wrote:

Okay, so I have another question about #1: is there any clever way to make it so that the program "merely" updates the db, rather than re-creates the whole darned thing? The reason I am asking is that my current database took a hell of a long time to create, and I wouldn't want to have to wait that long every time one or two TMXs in my very large collection change. In LogiTerm, there is a distinction between updating a "module" (read: db) and indexing it (basically creating it from scratch again).
[/quote]
I'm not sure whether indexes can be updated in SQLite instead of being rebuilt from scratch; I'm guessing not.

The intended use is this: you have your large db(s) that you use on an ongoing basis. You import them once, perhaps assign them to F1-F8 for quick access (I added F5-F8, so there are 8 slots now), and leave them alone.
Then, if a job comes along with some reference TMs, client-specific glossaries or the like, you dump those in your autoimport folder and have them accessible via F9. This "current project" collection changes all the time, which the autoimport feature makes quick and painless, and it's normally not very large. In my practice it will probably often contain a single tmx of a couple hundred TUs. If you have a set of TMXes that amounts to hundreds of thousands of TUs and you use them for various different projects, you're probably better off just making a 'normal' db out of them.

[quote]Michael Beijer wrote:
Oh yeah, and another thing. In CafeTran, we can create individual "tables" inside these SQLite databases. So instead of creating an entire new database for each subject area, or particular folder of TMXs, e.g., you could also just utilise one massive db, but have it split into multiple tables. Hans vd Broek (aka MetaArkadia) mentioned this somewhere I think. Would this be something that you could implement in TMLookup at some point? I imagine an ideal world, in which TMLookup has a dialogue in its UI with kind of a table, containing a long list of different TMX categories (let's call them "modules"), which are all stored inside one massive DB in tables, with little checkboxes next to each of them, and the user could then select which of these individual tables he wished to update or even reindex completely (e.g. depending on whether he had just updated a particular client’s TMX that morning). Just thinking out loud ;) [/quote]
Well, whether various sub-tms are in separate files or in separate tables in the same db file doesn't make much difference to my mind. Having them in separate files makes it easier to selectively delete, update, move, share with others etc. If and when I manage to add support for multiple DBs to TMLookup, you will be able to pick and choose in your list of dbs and get hits from many dbs at the same time in the same table (in whatever arbitrary ranking order you prefer, possibly colour coded). That makes it feasible to split your data into many dbs. Those smaller dbs would be easier/faster to update than a single large db.

If you often need to update parts of a large miscellaneous db whose content has changed, one option is to try the delete hits feature: delete the contents of TMX number 53 version 23 from the db (based on the source field), then import TMX number 53 version 24.
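
To illustrate the delete-and-reimport idea in plain SQLite terms, here is a minimal Python sketch. It is not TMLookup's actual code or schema: the table name `tm`, the columns `en`, `nl` and `source`, the file names and the helper `replace_tmx_contents` are all made up for the example. It only shows that removing the rows that came from one file and inserting the new version touches just those rows, not the whole db.

```python
import sqlite3


def replace_tmx_contents(conn, old_source, new_source, new_units):
    """Delete every row tagged old_source, then insert new_units tagged new_source.

    Hypothetical helper: assumes a table tm(en, nl, source) where 'source'
    records the file each translation unit came from.
    Returns the number of rows that were deleted.
    """
    with conn:  # one transaction: commits on success, rolls back on error
        deleted = conn.execute(
            "DELETE FROM tm WHERE source = ?", (old_source,)
        ).rowcount
        conn.executemany(
            "INSERT INTO tm (en, nl, source) VALUES (?, ?, ?)",
            [(en, nl, new_source) for en, nl in new_units],
        )
    return deleted


# Small in-memory db standing in for a large miscellaneous db (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tm (en TEXT, nl TEXT, source TEXT)")
conn.execute("CREATE INDEX tm_source ON tm (source)")
conn.executemany(
    "INSERT INTO tm (en, nl, source) VALUES (?, ?, ?)",
    [
        ("house", "huis", "tmx53_v23.tmx"),
        ("tree", "boom", "tmx53_v23.tmx"),
        ("water", "water", "other.tmx"),
    ],
)
conn.commit()

# Swap version 23 of the file for version 24; rows from other files stay put.
removed = replace_tmx_contents(
    conn,
    "tmx53_v23.tmx",
    "tmx53_v24.tmx",
    [("house", "huis"), ("tree", "boom"), ("bridge", "brug")],
)
print(removed)  # 2
```

The delete and the insert run in one transaction, so if the import of the new version fails halfway, the old rows are not lost.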

By the way, importing tmx files is much slower than importing tabbed txt files. You probably can't switch to using tabbed files, but if you can, it will help. You can add language codes to the file name to get the languages recognized correctly. Put the two-letter language codes at the end of the file name, separated by a hyphen (reference_tm_en-nl.txt). If the db has the same columns (en & nl) the file will be read based on the language codes. This is of course not necessary with tmx files because tmx files have the language codes inside the file.
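
If you generate tabbed files with a script, a quick check like the following can confirm that a file name follows the naming convention described above. This is only a sketch of the convention, not how TMLookup parses names internally; the function `language_codes_from_filename` is invented for the example, and I am assuming the codes must be the last thing before the extension.

```python
import re
from pathlib import Path


def language_codes_from_filename(filename):
    """Return (lang1, lang2) if the name ends in two hyphen-separated
    two-letter codes before the extension, e.g. reference_tm_en-nl.txt.
    Returns None if the name does not follow that pattern.
    """
    stem = Path(filename).stem  # file name without directory or extension
    # The codes must be preceded by a non-letter (or the start of the name),
    # so a name like 'open-nl.txt' is not misread as en-nl.
    match = re.search(r"(?:^|[^A-Za-z])([A-Za-z]{2})-([A-Za-z]{2})$", stem)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).lower()


print(language_codes_from_filename("reference_tm_en-nl.txt"))  # ('en', 'nl')
print(language_codes_from_filename("glossary.txt"))            # None
```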