STAR's toolbox for language professionals
This week's blog post gives you a concise overview of STAR's tool range for multilingual information management. The article is an extract from the 250th Premium Edition of Jost Zetzsche's Translation Toolbox, a newsletter subscribed to by 10,500 language professionals.
Starry, Starry Night
Don McLean's "Vincent" is still one of those songs that makes me cry just a little every time I hear it. I don't think he really had STAR AG's many products in mind when he serenaded the "starry, starry night" in the song's refrain -- but we can make it work for the purposes of this article.
An entire team at STAR took a couple of hours last week to guide me through several of those products, and some truly are interesting.
We've discussed translation memory-based authoring a good number of times here and elsewhere. From our perspective as translators, it's a no-brainer: If technical authors used translation memories and termbases as they write in the same way translators do, not only would the documentation they produce be more consistent, there would be a much greater number of matches when it comes to the translation phase. After all, many of the authoring choices would be based on matches in existing translation memories, which in turn would turn into matches again when it comes to the new translation pass.
In my very first column in the ATA Chronicle eight years ago, I described a prime opportunity for translators with experience in working with (the challenges of) translation memory and terminology maintenance to act as consultants for technical writers. In this case, we're looking at a scenario where a technology that we as the end users didn't use (almost) disappeared because of us. It was really up to us to make this technology a go -- technical writers are just about as thrilled about it as we were when translation memory was first offered (i.e., "not"), and since we didn't take on these (ahem, very well-paid) consulting roles, nothing happened.
SDL scratched its AuthorAssistant last year, and Sajan's Authoring Coach suffered the same fate. As far as I know, only two companies offer TM-based authoring products today: Across with its crossAuthor and -- now I finally come to where I was going the whole time -- STAR with its MindReader.
MindReader can be used in Word (which is part of the 1800-euro single-seat package), FrameMaker, and Arbortext (the necessary add-ons for these systems cost extra), as well as STAR's own content management system GRIPS. The way it works is simple enough: You work within Word or FrameMaker as you always have, but you also see a second, independent MindReader pane that shows the segments from the memory that match what you are currently writing. As with translation memory, you can set the fuzziness level, you can take the segment from MindReader over with a keyboard shortcut, or -- and this is where STAR's particular way of dealing with translation memory or, as they call it, "reference material," comes into play -- you can open the originating file and copy a lot more than just that one segment.
The source material for the "authoring memory" can be Transit reference materials, Transit projects, XLIFF files (including SDLXLIFF), or even existing source documents (in which case you don't get the benefit from the additional TM leverage).
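To picture what such an authoring-memory lookup with an adjustable fuzziness level involves, here is a minimal sketch in Python. It is not STAR's implementation: the function name `authoring_matches`, the `min_similarity` parameter, and the sample segments are all made up for illustration, and the similarity score comes from Python's standard `difflib` rather than from any real TM engine.

```python
from difflib import SequenceMatcher


def authoring_matches(draft, memory, min_similarity=0.7):
    """Return (score, segment) pairs from `memory` that resemble `draft`.

    `min_similarity` plays the role of the adjustable fuzziness level:
    1.0 admits only exact matches, lower values admit looser ones.
    """
    scored = []
    for segment in memory:
        # Character-based similarity between 0.0 and 1.0, ignoring case.
        score = SequenceMatcher(None, draft.lower(), segment.lower()).ratio()
        if score >= min_similarity:
            scored.append((score, segment))
    # Best match first, as a match pane would list it.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Hypothetical authoring memory built from earlier documentation.
memory = [
    "Switch off the machine before cleaning.",
    "Wear safety goggles at all times.",
    "Switch off the machine before opening the cover.",
]

for score, segment in authoring_matches("Switch off the machine before cleaning it.", memory):
    print(f"{score:.0%}  {segment}")
```

Raising `min_similarity` narrows the list to near-exact matches, while lowering it surfaces looser candidates that the author may still prefer to reuse rather than rephrase.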
Terminology work also operates as it does in a translation project. The terms that should be used are automatically confirmed (or you are asked to avoid those that really SHOULD NOT be used), and you can add terms on the fly. The benefit of the latter is that if the project goes into many languages, it's actually the original author who controls what kind of terminology should be sitting in the termbase. While the translators still have to translate the terms, they don't have to worry about selecting which ones to add since that has already been done for them.
Another STAR product is called MindReader for Outlook, and it does exactly what you would guess. It creates a database from all your past sent emails and suggests text based on what you are presently entering (the suggested text is displayed in the lower half of the email you are composing). Very clever if you write a lot of repetitive emails (if you're as creative as I am you won't get many matches . . . just kidding -- I've actually bought the program and am having lots of fun with it). True to STAR's "memory principle," you can also open the complete previous email and copy more than just a sentence at a time. And the fuzziness level of the automatic searches is also adjustable in a newly added Outlook toolbar.
The price is not much of a deterrent for this tool (49 euros). Potentially more problematic is its use of resources (it requires SQL Server in the background, which is a little resource-hungry), but that hasn't been enough of a problem for me to uninstall it again.
Here is what I really like about this tool, though: It took a little bit of creativity for translation tool developers to come up with TM-based authoring for technical authors. But they're part of the same supply chain as translators, so it didn't require a stroke of genius to come up with it. What I think is so cool about a tool like MindReader for Outlook is that its audience really has nothing at all to do with translators. It's everyone. I love it when I see that the technologies I'm using can also directly benefit my mom, my high school friends, and my neighbors. That's thinking outside the box.
It also took some thinking outside the box for the MT implementation that STAR is offering. STAR Moses is, as the name implies, based on the open-source Moses engine, like most statistical machine translation engines. Unlike many other providers, STAR doesn't leave the training of STAR Moses up to the user but offers it as a paid (and ongoing) service. And while it's possible to use the STAR Moses engine outside of STAR Transit (some large corporate clients use it as an internal and required "Google Translate substitute" with a similar web interface), the most interesting approach comes into effect as one of the resources within STAR Transit.
I have previously talked about MT-based fuzzy match repair. Déjà Vu is the primary tool that uses this method. The idea is that if I can query a machine translation engine just for the "incorrect" part of a fuzzy TM match and automatically replace that component with what the MT offers, I might be miles ahead (well, probably just centimeters, but every little bit counts, right?). This is a great way of dealing with MT as a productivity enhancement, and I think it will be one of the ways we'll deal with MT in the future.
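The general idea of fuzzy match repair can be sketched in a few lines of Python. This is only an illustration under generous assumptions, not how Déjà Vu or any other tool actually does it: the `TOY_MT` dictionary stands in for a real MT engine or termbase, the function names are invented, and the sketch only handles the simplest case in which a single stretch of the source was replaced.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for an MT engine or termbase lookup.
TOY_MT = {"front": "vordere", "rear": "hintere"}


def translate_fragment(text):
    """Return a translation for a short fragment, or None if unknown."""
    return TOY_MT.get(text)


def repair_fuzzy_match(new_source, tm_source, tm_target):
    """Patch a fuzzy TM target so that it fits `new_source`.

    Returns the repaired target, or None if the case is too complex
    for this simple approach.
    """
    old_tokens, new_tokens = tm_source.split(), new_source.split()
    matcher = SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)
    diffs = [op for op in matcher.get_opcodes() if op[0] != "equal"]
    # Only handle exactly one replaced stretch of words.
    if len(diffs) != 1 or diffs[0][0] != "replace":
        return None
    _, i1, i2, j1, j2 = diffs[0]
    old_translation = translate_fragment(" ".join(old_tokens[i1:i2]))
    new_translation = translate_fragment(" ".join(new_tokens[j1:j2]))
    if not old_translation or not new_translation:
        return None
    # The outdated term must occur exactly once in the TM target,
    # otherwise we cannot know which occurrence to replace.
    if tm_target.count(old_translation) != 1:
        return None
    return tm_target.replace(old_translation, new_translation)


print(repair_fuzzy_match(
    "Open the rear cover",
    "Open the front cover",
    "Öffnen Sie die vordere Abdeckung",
))
```

A real implementation would also have to deal with inflection, word order, and terms that appear more than once, which is why the sketch gives up (returns `None`) as soon as the case is not trivially safe.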
Another way might be what STAR has developed and dubbed the "TM validated MT match." The thinking is this: Traditionally we have looked at MT as something that should come into play if there is no perfect or fuzzy match within the TM. This makes sense because the TM is, of course, the gold standard, created as it is by us (or our team). What if, STAR thought, we also displayed MT suggestions alongside fuzzy matches? They might be of as good or even better quality, especially if it's just a terminological difference that makes the TM match fuzzy. And what if we evaluated the MT suggestions (I always hate saying "MT match") on the basis of our fuzzy TM matches?
Here's an example (that STAR used in the presentation):
The source sentence is "Pressure increase too slow when filling reservoir"
The fuzzy TM match is "Druckanstieg zu schnell bei Füllung des Tanks" ["Pressure increase too rapid when filling reservoir"]
The MT suggestion is "Druckanstieg zu langsam bei Füllung des Tanks" ["Pressure increase too slow when filling reservoir"]
The program is able to compare the fuzzy TM match (for which it "knows" that there is only one unknown term) with the MT suggestion to find out that there is no other difference between the two than that particular term. It then concludes that the MT suggestion in all likelihood is correct -- the worst it could be is to have one term incorrect -- and it becomes an "Advanced MT match."
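The comparison described above can be approximated with a short Python sketch. To be clear, this is my own simplified reading of the idea and not STAR's algorithm: the function names are invented, the comparison is done on whitespace-separated words, and the TM source segment (the English side of the fuzzy match) is assumed to be available alongside its translation.

```python
from difflib import SequenceMatcher


def word_diffs(a_words, b_words):
    """Return the non-equal edit operations between two word lists."""
    matcher = SequenceMatcher(a=a_words, b=b_words, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


def is_single_word_swap(diffs):
    """True if the only difference is one word replaced by one word."""
    if len(diffs) != 1:
        return False
    tag, i1, i2, j1, j2 = diffs[0]
    return tag == "replace" and i2 - i1 == 1 and j2 - j1 == 1


def is_tm_validated(source, tm_source, tm_target, mt_target):
    """Check whether a fuzzy TM match backs up an MT suggestion.

    The MT suggestion counts as validated if the new source differs from
    the TM source by exactly one word AND the MT suggestion differs from
    the TM target by exactly one word.
    """
    source_diffs = word_diffs(tm_source.split(), source.split())
    target_diffs = word_diffs(tm_target.split(), mt_target.split())
    return is_single_word_swap(source_diffs) and is_single_word_swap(target_diffs)


validated = is_tm_validated(
    source="Pressure increase too slow when filling reservoir",
    tm_source="Pressure increase too rapid when filling reservoir",
    tm_target="Druckanstieg zu schnell bei Füllung des Tanks",
    mt_target="Druckanstieg zu langsam bei Füllung des Tanks",
)
print(validated)
```

Note that the check says nothing about whether the swapped word itself is right; it only establishes that everything else in the MT suggestion is already confirmed by the TM, which is exactly the "worst case: one incorrect term" reasoning.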
Now, this is not going to work really well with Google Translate or Microsoft Translator because the suggestions will likely be all over the place and therefore will make it unlikely that the fuzzy TM matches will prove to be valuable in evaluating the MT suggestions. But if your MT engine is essentially based on your and maybe some additional TMs, you likely will have a much greater success rate.
Now, Germans have the reputation of not being very enthusiastic, and I'm very happy to either prove that wrong or be one of the exceptions -- because I am very enthusiastic about all the different ideas that are popping up left and right when it comes to a productive and intelligent use of machine translation. It's so much "fun" (remember, I've lived on the US West Coast for almost 20 years!) to see bright minds come up with new ideas. And what's great for us is that the best ideas so far have always come back to the resource that we hold most dear: the translation memory.
Other STAR products include the FormatChecker. This tool checks about 50 different potential errors in Word or FrameMaker documents, ranging from typographical errors to duplicated spaces, paragraph marks, manual references, and many others. The intention is to create well-formed documents before the translation even starts, thus aiming at a better return on translation memory matches and/or better entry of data into the translation memory.
A massive STAR tool is STAR CLM (Corporate Language Management) for, not surprisingly, the language management within corporations. It's a system that is able to monitor content changes in content management systems through watch folders and largely automates the production chain. While it does not have to be used alongside STAR Transit, it is certainly streamlined for its use. Competitors? Won't surprise you to find SDL, Across, and Plunet in that list.
Let me know if you want to know more and I can direct you to the right person.
About the author:
Jost Zetzsche is a German-American translator, sinologist, translation tool expert and writer who lives in Oregon. Among other publications about translation and translation tools he has published The Translator’s Tool Box: A Computer Primer for Translators, an extensive 400+ page manual about computer assisted translation technology and tools. His bi-weekly Translation Toolbox newsletter is subscribed to by more than 10,500 readers.