English-Czech COVID-19 Glossary

One way to say “to do one’s bit” in Czech is “přispět svou troškou do mlýna,” which means “to contribute one’s little bit to the mill.” For me, the mill was the global COVID-19 pandemic and the little bit I contributed was my English-Czech glossary of terms related to it.

No need for me to paint the situation in much detail since we’re all too familiar with it. It was mid-March and international borders were closed, the kids returned home from college, and we were sewing face masks. As I thought of my elderly relatives isolated in their homes and teared up looking at the faces of nurses and doctors ravaged with welts from endless shifts wearing tight-fitting respirators, goggles, and face shields, I realized that I, too, am a part of this community of humans trying to protect the vulnerable, help the sick, and preserve the way we function as a society. It seems almost funny in retrospect that the best thing I could think of doing was to create a glossary.

It wasn’t just the novel coronavirus that was spreading rapidly. There was an outpouring of information about the virus, the virus-altered reality we found ourselves in, the measures taken to mitigate it, and the research done toward overcoming it. Reading up on all this, I started to do what comes naturally to most translators and interpreters—asking myself if I could express those thoughts and name those concepts in my other working language.

I work in English and Czech and focus on the medical and pharmaceutical domains. My daily routine involves reading about developments in these fields. With the pandemic, major newspapers were chock-full of medical and scientific information. What is this virus? How do we defeat it? What tools do we need? The volume of information was, and still is, enormous. Creating the glossary was my way to get a grip on this information and prepare for the translation work that was sure to come my way. I decided not to keep this as my own personal asset (we translators tend to be pretty protective of our glossaries under normal circumstances). Instead, I’ve made the glossary available for free to anyone who might need it.

Figure 1: An example of several initial rows of the published glossary

The Glossary at a Glance

At the time of this writing, the glossary contains a collection of over 600 terms posted on my blog.1 I first posted 100 terms, which tripled and then doubled in size with subsequent updates. The blog posts contain a table with the English and Czech terms (see Figure 1) and downloadable .pdf and .tbx files that facilitate using the glossary according to the user’s needs (such as importing it into a computer-assisted translation [CAT] tool).

I have great respect and admiration for the work professionally trained terminologists do and feel compelled at this point to explain that my collection of terms is not a proper terminology database but rather what I call a glossary of the “translator’s friend” variety. It’s merely a well-researched list of language equivalents in current use to serve as a starting point for terminological research based on the context of the actual source text.

Choosing My Sources

My primary starting point is the Czech and English newspapers I read daily. I read the majority of articles related to COVID-19 and follow the links provided or research terms that catch my eye. This often leads to hours of terminological fun!

I also read recommendations/guidelines published by major international and national health care organizations (in English, I tend to limit myself to sources from the U.S., U.K., and European Union).

I also monitor legislative documents from the European Union. The major advantage of these documents for terminology research is that they are bilingual, which greatly accelerates term acquisition. I systematically look for all COVID-19-related legislation published on EUR-Lex,2 download the English and Czech versions, and do terminology research on these documents.

In addition, I find it important to listen to podcasts and TV/radio interviews. One of my priorities was to capture the language as it’s actually used by medical professionals. For example, the name a national standard for personal protective equipment gives to a protective gown might be completely different from what medical professionals actually call it.

A great stream of good sources for terminology research also comes from my wife, who is a nutrition therapist and stays informed of the latest developments as a part of her work. She forwards me interesting articles she comes across so that I can read them and research the terminology. I find this a great help because identifying good sources can be quite time-consuming when there is so much out there.

Figure 2: Working on file alignment using LogiTerm Pro

Extracting Terminology

I usually use the manual method: when I see an interesting term, I simply copy it into an Excel spreadsheet. I don’t do terminology research at the time of term extraction, so this spreadsheet has rows of Czech terms without English equivalents and vice versa.

If I come across bilingual sources (notably international guidelines available both in English and in Czech and the European Union legislation), I use automated term extraction. To do that, I first align the documents to create a bilingual (.tmx) file. There are a number of tools allowing us to do this but my tool of choice is LogiTerm Pro by Terminotix,3 which I find very fast, accurate and quite painless to use. (See Figure 2.) (As an aside, I learned about this tool at ATA’s Annual Conference in San Diego in 2012 and am still congratulating myself on attending that conference and that particular panel discussion where I was introduced to it.)
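For readers curious about what an alignment step produces, the following is a minimal sketch of a .tmx file written from sentence pairs that have already been aligned. It does not reproduce how LogiTerm Pro works; the function name `build_tmx`, the sample sentences, and the header values are assumptions chosen for illustration only.

```python
import xml.etree.ElementTree as ET

# The xml:lang attribute lives in the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tmx(pairs, src_lang="en", tgt_lang="cs"):
    """Return a <tmx> element holding one translation unit per aligned pair."""
    tmx = ET.Element("tmx", {"version": "1.4"})
    # Header values below are placeholders, not settings from any real tool.
    ET.SubElement(tmx, "header", {
        "creationtool": "example-script",
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "plain-text",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in pairs:
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return tmx


# Hypothetical aligned sentences, not taken from any actual source document.
aligned_pairs = [
    ("Wash your hands often.", "Často si myjte ruce."),
    ("Keep your distance.", "Dodržujte odstup."),
]
tmx_text = ET.tostring(build_tmx(aligned_pairs), encoding="unicode")
```

Each `tu` element carries one English and one Czech segment, which is the pairing a term extraction tool can then work from.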

I then use SynchroTerm, also by Terminotix, to identify term candidates, which I evaluate before extracting the terms I find suitable. (See Figure 3.)

Figure 3: Automated term candidate identification in SynchroTerm

Finding Equivalents in the Second Language

I wanted to start this section with “And now for the fun part!” but actually the term extraction and the source reading prior to it were fun as well.4 But really, this part is when glossary creation gets to be its most interesting.

When I have enough terms identified in individual languages (in the first iteration of my glossary, this was 100 terms and the next two updates were several times larger), I treat the lists of terms as translation projects.

I have a COVID-19 translation project created in my CAT tool (I use SDL Trados Studio5), one for each language direction (English>Czech and Czech>English). I add the lists of terms into these projects as translatable files and start “translating.” (See Figure 4.) This allows me to use the functionality of the translation tool, such as concordance search and termbase connectivity. Termbases are shared by the projects regardless of language direction, and I have my COVID-19 glossary assigned as a termbase to my translation projects and update it often, which allows me to see right away if I have already researched a certain term.

One tool I must mention here is IntelliWebSearch6, which greatly increases the speed of web searches. I can easily do several dozen searches when researching each term. If I didn’t have this tool with its automatic customized search capabilities, I don’t think I would be writing this article. I would probably still be copy-and-pasting “covidiot” into my web browser’s search box.

When the research/translation phase is done, I save the target files and paste them into the master Excel spreadsheet containing the glossary. After a little maintenance (checking for possible duplicates, implementing changes to previously created term pairs based on new research), the glossary is ready for converting and publishing.
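The maintenance step can also be scripted. The sketch below is one possible way to do it, assuming the glossary has been read into a list of (English, Czech) pairs; the function names and the sample rows are hypothetical and are not part of the workflow described above.

```python
from collections import defaultdict


def normalize(term):
    """Lowercase the term and collapse runs of whitespace for comparison."""
    return " ".join(term.casefold().split())


def drop_exact_duplicates(rows):
    """Keep the first occurrence of each (English, Czech) pair."""
    seen = set()
    unique = []
    for en, cs in rows:
        key = (normalize(en), normalize(cs))
        if key not in seen:
            seen.add(key)
            unique.append((en, cs))
    return unique


def find_conflicts(rows):
    """Return English terms that have more than one distinct Czech equivalent."""
    by_term = defaultdict(set)
    for en, cs in rows:
        by_term[normalize(en)].add(normalize(cs))
    return {en: sorted(cs) for en, cs in by_term.items() if len(cs) > 1}


# Hypothetical rows for illustration only.
rows = [
    ("face mask", "rouška"),
    ("Face  mask", "rouška"),   # same pair, different case and spacing
    ("face mask", "ústenka"),   # second equivalent worth reviewing
    ("ventilator", "plicní ventilátor"),
]
unique_rows = drop_exact_duplicates(rows)
conflicts = find_conflicts(rows)
```

A term with several equivalents is not necessarily an error, since synonyms are common, so the conflict list is meant as a prompt for review rather than as something to delete automatically.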

Excel spreadsheets, although a convenient intermediate step in glossary creation, are not, in my opinion, good to use as actual glossaries. We want the terms from the glossary to be offered to us consistently and effortlessly as suggestions while we translate. Excel cannot do that for us. That’s why I always convert them to termbase files to be used in CAT tools. The universal .tbx format I publish enables sharing across tools and, for my own use, I convert the glossary from .xlsx to .sdltb (this is a termbase file format used in SDL MultiTerm, the terminology tool of SDL Trados Studio). To perform these conversions, I use a super handy tool called Glossary Converter.7 (This tool is tied to SDL Trados Studio and SDL MultiTerm and is not a standalone application.)
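To give a sense of what a .tbx file contains, here is a minimal sketch that writes term pairs into a structure modeled on the older TBX-Basic layout (`martif`, `termEntry`, `langSet`, `tig`, `term`). It is not the output of Glossary Converter and is not guaranteed to validate or to import into any particular tool; the function name `build_tbx`, the entry IDs, and the sample terms are assumptions.

```python
import xml.etree.ElementTree as ET

# The xml:lang attribute lives in the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def build_tbx(term_pairs, src_lang="en", tgt_lang="cs"):
    """Return a <martif> element with one termEntry per bilingual pair."""
    martif = ET.Element("martif", {"type": "TBX-Basic", XML_LANG: src_lang})
    header = ET.SubElement(martif, "martifHeader")
    file_desc = ET.SubElement(header, "fileDesc")
    source_desc = ET.SubElement(file_desc, "sourceDesc")
    ET.SubElement(source_desc, "p").text = "Illustrative glossary export"
    body = ET.SubElement(ET.SubElement(martif, "text"), "body")
    for number, (src, tgt) in enumerate(term_pairs, start=1):
        # Entry IDs are generated here; a real termbase may use its own scheme.
        entry = ET.SubElement(body, "termEntry", {"id": f"c{number}"})
        for lang, term in ((src_lang, src), (tgt_lang, tgt)):
            lang_set = ET.SubElement(entry, "langSet", {XML_LANG: lang})
            tig = ET.SubElement(lang_set, "tig")
            ET.SubElement(tig, "term").text = term
    return martif


# Hypothetical term pairs for illustration only.
term_pairs = [
    ("herd immunity", "kolektivní imunita"),
    ("contact tracing", "trasování kontaktů"),
]
tbx_root = build_tbx(term_pairs)
tbx_text = ET.tostring(tbx_root, encoding="unicode")
```

To save the result, `ET.ElementTree(tbx_root).write("glossary.tbx", encoding="utf-8", xml_declaration=True)` would write it to a file (the file name is an example).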

Publishing the Glossary

I have a website for my translation business and publish the glossary in the blog section. I described my wishes to my “web guy,” Teo, and let him do the rest. While I have designed, built, and managed my own websites in the past, I realized over time that it’s best to focus on my core linguistic abilities and leave this type of work to experts. Yes, it does cost money (I actually ended up spending several hundred dollars to publish this free glossary) but it also saves a lot of time and frustration. Besides, I’m sure we would all appreciate it if Teo sought help from professional translators when he needs something translated. It feels good to be a part of an ecosystem of experts!

I announced the glossary on several translation discussion groups and on social media. I don’t have much of a following on Twitter so the response there was a bit underwhelming, but the response from the translation groups and from LinkedIn was very good. I saw my hitherto sleepy website go from single-digit visitorship in most months (mostly myself and Teo, I suspect), to around a thousand new visitors after my glossary was published.

Figure 4: Working on language equivalents in SDL Trados Studio

How to Use the Glossary if Czech (or English) Isn’t Your Thing

Chances are very good that Czech is not one of your working languages. I hope this article might encourage you to delve into glossary building beyond the obligatory Excel spreadsheet or (the horror!) jotting down terms in a notepad or (should I even go there?) on sticky notes destined to peel off the wall behind your monitor and fall into a dust-bunny-inhabited terminology limbo.

If you’re interested in creating your own COVID-19 glossary with English (or Czech) as one of the languages, you can take my glossary as a starting point. Just get rid of the Czech (or English) and use the resulting monolingual list of terms as the basis for adding equivalents in your own language. To make this easy, I have published the Excel spreadsheet with my glossary on my blog as well.
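Stripping one language can be done by hand in Excel, or with a few lines of script. The sketch below assumes the spreadsheet has been saved as CSV with a header row; the column names `English` and `Czech` and the sample rows are hypothetical and may not match the published file.

```python
import csv
import io


def monolingual_terms(csv_text, keep_column):
    """Return the unique, non-empty terms from one column, in original order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    seen = set()
    terms = []
    for row in reader:
        term = (row.get(keep_column) or "").strip()
        if term and term not in seen:
            seen.add(term)
            terms.append(term)
    return terms


# Hypothetical CSV export for illustration only.
sample_csv = """English,Czech
face mask,rouška
face mask,ústenka
herd immunity,kolektivní imunita
"""
english_terms = monolingual_terms(sample_csv, "English")
czech_terms = monolingual_terms(sample_csv, "Czech")
```

The resulting list can then be loaded into a CAT tool as a translatable file, in the same way the term lists are treated as translation projects earlier in this article.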

Let’s Kick This Pandemic by Working Together

This is not a great time we’re going through right now. We do have, however, a chance to reassess and to come closer together in response to the challenges we face. My glossary is a tiny attempt to go in that direction. I can see a culture of more robust sharing and participation emerging in our profession and I strongly hope this is also a trend for humanity as a whole.

Notes
  1. You can access the blog at www.czechtrans.com/blog.
  2. https://eur-lex.europa.eu
  3. https://terminotix.com/index.asp?lang=en
  4. Special footnote for my children (on the off chance they ever read this). Okay, this does sound a bit nerdy.
  5. www.sdltrados.com
  6. www.intelliwebsearch.com
  7. https://appstore.sdl.com/language/app/gloss

Tomáš Barendregt is a medical and pharmaceutical translator working in Czech and English. He has over 25 years of experience as a freelance and in-house translator and interpreter. Tomáš lives in the Driftless area of southwestern Wisconsin, an ideal place for someone living the dual life of family man and ice-hockey enthusiast. He has lived half his life in the U.S. and half in the Czech Republic. He works as a Czech localization specialist at Blueprint Technologies. Contact: tomas@czechtrans.com.

Remember, if you have any ideas and/or suggestions regarding helpful resources or tools you would like to see featured, please e-mail Jost Zetzsche at jzetzsche@internationalwriters.com.

The ATA Chronicle © 2020 All rights reserved.