Is the Machine Translation Market Maturing?

By Jost Zetzsche

One sign of a maturing industry, or of a maturing branch within an industry, is the emergence of companies that cater specifically to it. When all kinds of companies and products rose up in the early 2000s to cater specifically to translators (I’m not thinking of translation environment tools here, but things like accounting tools, conversion apps, etc.), I took that as a sign that “we” had kind of arrived. We were important enough that companies could actually be sustained by building tools for us.

The machine translation (MT) space is especially interesting in this respect. It’s no accident that most MT developers are not known primarily for building MT engines (think Microsoft, Google, Yandex, Baidu, Naver, Amazon, etc.). These are companies that might sell access to their MT services, but commerce alone likely wouldn’t justify the cost of building and maintaining them. The primary purpose of MT at those companies, in all these cases, is to strengthen their core product (most often their search engines). Seen that way, it does make sense to build an MT engine and then sell it as an additional benefit of sorts.

There are exceptions, of course. While DeepL is as secretive as ever, I think it’s probably not far-fetched to assume that it’s now profitable. And companies like Systran, KantanMT, and a handful of others are also…well, at least surviving. The point is that despite the very large role that MT plays in most of our professional lives, very few MT products have become profitable.

There have also been many attempts to either automate the selection of MT, like the now-defunct FairTradeTranslation, or verify the level of post-editing, like the Swiss Post-Editing Score. The Translation Automation User Society (TAUS) has proposed many approaches to MT over the years, most of which were eventually abandoned. Nevertheless, I’m always interested when there are new products that try to approach this from a slightly different angle or, maybe more correctly, stubbornly try something again that hasn’t worked in the past.

Achim Ruopp is the founder and chief executive officer of Polyglot Technology, a small company that offers a product they call MT Decider.1 Its premise is really quite simple, but it might be very helpful for some.

Most translators use MT in some way. Sometimes it’s the client who decides what engine to use, but sometimes it’s up to us. The vast majority of translators will then make a choice between the major available engines without any customization. (I know there are exceptions, but overall I believe that’s true.) Some translators have educated opinions on which engine is currently the best available engine for the language combination in question, and if it’s a language combination like English>Spanish or English>German (or vice versa), those opinions may stay educated even over a period of time because the output quality of those large language combinations doesn’t change too much. This might be different for smaller language combinations. If, for instance, you work in Finnish, Lithuanian, or Kazakh, the output of the respective MT engines might vary significantly from month to month. Why? Because the corpora the engines are trained on are relatively small, and any addition might have a significant impact on the quality of the MT output in comparison to its competing engines.

Or let’s take another case. Say you run a midsize language services provider that often needs to work with 20+ languages for which you would like to use MT (in whatever capacity). Clearly, you don’t have the insight and knowledge to judge which engine is the best for each language pair. This is where MT Decider comes in: you can subscribe to quarterly reports that run tests on large MT engines (right now: Microsoft, Google, Amazon, and DeepL) and provide a current readout on the quality of the output of said engines. Each engine and language combination is tested with two different evaluation systems: the much-maligned BLEU (bilingual evaluation understudy) and the newer COMET (Crosslingual Optimized Metric for Evaluation of Translation), which itself is based on a neural system. These systems are not flawless in evaluating the output, but they’re able to detect changes from one report to the next, and that’s really what’s potentially interesting. (This is also something that’s otherwise unavailable unless you use a much more expensive system like Intento, which, I was recently told by Intento, really doesn’t focus on smaller providers anymore.)
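For the more technically inclined, here is a minimal sketch of what "detecting changes from one report to the next" could look like if you kept the scores in a small script. Everything in it is an assumption made for illustration: the engine names ("Engine A" and so on), the language pairs, the score values, and the function names are all invented, and none of it reflects the actual format or contents of an MT Decider report.

```python
from typing import Dict, Tuple

# Hypothetical structure: language pair -> engine -> metric score.
# All names and numbers below are invented for illustration only;
# they are NOT taken from any real report.
Report = Dict[str, Dict[str, float]]

previous: Report = {
    "en>fi": {"Engine A": 0.81, "Engine B": 0.84, "Engine C": 0.79},
    "en>de": {"Engine A": 0.88, "Engine B": 0.87, "Engine C": 0.86},
}
current: Report = {
    "en>fi": {"Engine A": 0.86, "Engine B": 0.83, "Engine C": 0.80},
    "en>de": {"Engine A": 0.88, "Engine B": 0.87, "Engine C": 0.86},
}


def best_engine(scores: Dict[str, float]) -> str:
    """Return the engine with the highest score for one language pair."""
    return max(scores, key=scores.get)


def leader_changes(prev: Report, curr: Report) -> Dict[str, Tuple[str, str]]:
    """List language pairs whose top-scoring engine differs between reports."""
    changes = {}
    for pair in prev.keys() & curr.keys():
        old, new = best_engine(prev[pair]), best_engine(curr[pair])
        if old != new:
            changes[pair] = (old, new)
    return changes


def score_deltas(prev: Report, curr: Report, pair: str) -> Dict[str, float]:
    """Per-engine score change for one pair (current minus previous)."""
    return {
        engine: round(curr[pair][engine] - prev[pair][engine], 4)
        for engine in curr[pair]
        if engine in prev[pair]
    }
```

Calling `leader_changes(previous, current)` on the made-up data flags only the smaller language pair, which is the kind of shift described above. The sketch compares scores within a single metric only; BLEU and COMET values sit on different scales and shouldn't be mixed in one table.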

As an individual translator, you can buy a quarterly subscription for $29 for two language combinations. As a language services provider, it’ll cost you $299 for all 48 available language combinations. You can see a list of the available languages and other information on the Polyglot Technology site I linked to.

There are still a few shortcomings I imagine Achim will eventually address. (For example, it’s not currently possible to do assessments of customized engines or specific domains.) But if the system sees some uptake, I’m sure there will be improvements.

For show-and-tell purposes, you can download an English>French report from Polyglot Technology. If you just want to have one report but not subscribe, you can pay the first subscription fee, get the report, and then unsubscribe again.

  1. See

Jost Zetzsche is a translation industry and translation technology consultant. He is the author of Characters with Character: 50 Ways to Rekindle Your Love Affair with Language.

This column has two goals: to inform the community about technological advances and to encourage the use and appreciation of technology among translation professionals.

The ATA Chronicle © 2023 All rights reserved.