Terminology Management: What You Should Know

By Marina Ilari

Terminology management is the process of identifying, storing, and managing customer, company, or product-specific terminology that needs to be translated in a definitive way. Simply put, it’s the process of building a glossary of preferred translations (and of translations the client doesn’t want used) that you’ll then upload to a termbase. The termbase comprises not only terms but also essential information about them, such as images, notes, definitions, and examples. This article examines some of the uses, features, and benefits of terminology management for translators and project managers.

Benefits of Terminology Management

There are some clear benefits of having a solid terminology management process in place for your translation and localization projects.

  1. It allows you to achieve effective and accurate translations by organizing the usage of terms with a clear set of rules to ensure that the correct term is used within a specified context.
  2. It ensures consistency with the client’s preferred terminology across all projects and improves the translation quality by reducing the possibility of choosing incorrect translations for specific terms.
  3. It saves time by providing a quick and easy way to check that terms are used correctly in the translation, and it allows you to manage the instructions and preferences in one centralized place.

In other words, terminology management helps with consistency, accuracy, and efficiency.

Termbases and Glossaries

The terms “glossary” and “termbase” are sometimes used interchangeably, but they are different.

In its most basic form—and in the context of this industry—a glossary contains a list of terms along with their preferred translations. It’s usually created using Excel or CSV files and includes two columns: one for the source term and another for the target term. Depending on the complexity of the project, it can also be useful to add clarifications such as definitions and usage samples.

That’s when a glossary morphs into a termbase, which is a list of terms paired with corresponding terms in the target language along with additional information about the terms, including definitions, examples, and clarifications. Termbases are typically integrated into a computer-assisted translation (CAT) tool, which can run quality checks and help ensure the termbase is used properly.

Some useful information that can be added to a termbase includes:

  • Information about usage within context. For example, you could specify if the translation included in the termbase should be used for a specific context, such as call to action (CTA) buttons (prompts on a website that tell the user to take some specified action), or if it should be used when the term is acting as a verb or a noun.
  • Specifications about the term’s part of speech, gender, number, and definition.
  • Reference images for terms that might need visual context.
  • A list of forbidden terms (terms the client doesn’t want you to use).
  • Case-sensitivity rules.
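One way to picture the difference between a glossary row and a termbase entry is as a data structure. The sketch below is illustrative only: the class name `TermEntry` and all of its field names are assumptions made for this example, not the schema of any particular CAT tool, and the sample definition and forbidden term are placeholder values.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TermEntry:
    """One termbase entry (hypothetical fields, not a real tool's schema)."""
    source: str                        # source-language term
    target: str                        # preferred translation
    definition: str = ""               # optional definition
    context: str = ""                  # usage note, e.g. "CTA buttons only"
    part_of_speech: str = ""           # e.g. "noun" or "verb"
    gender: str = ""
    number: str = ""
    image_path: Optional[str] = None   # reference image, if any
    forbidden: List[str] = field(default_factory=list)  # rejected translations
    case_sensitive: bool = False


# A bare glossary row only fills in the first two fields.
glossary_row = TermEntry(source="King Mackerel", target="maccarello reale")

# A termbase entry carries the extra information listed above.
termbase_entry = TermEntry(
    source="King Mackerel",
    target="maccarello reale",
    definition="A type of fish.",
    part_of_speech="noun",
    number="singular",
    forbidden=["Scomberomorus regalis"],
)
```

Keeping the optional fields empty by default means a two-column glossary can be loaded as-is and enriched later.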

Creating a Termbase

Most CAT tools allow you to create a termbase within the program itself. You can also create a termbase from a glossary that you import into the “Termbase Editor” of your CAT tool. That glossary could be an Excel file but, depending on the tool utilized, it could also be in other formats.

Once you’ve chosen the file to be uploaded to the tool, the most important considerations are:

  • Which column will be imported as terms.
  • Which column will be imported as definitions (if any).
  • Which column will be imported as “other field” (if any).
  • Which content will be included as “do not import” (especially if you only want to import some languages to your termbase from the total included in your glossary).

These options may appear differently in your CAT tool, but Figure 1 below shows how you can leverage the glossaries you have in Excel and turn them into a termbase.

Figure 1: Leveraging glossaries in Excel
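The column-mapping decisions listed above can also be sketched in code. This is a minimal illustration, assuming the glossary has been saved from Excel as CSV; the column names, the `COLUMN_ROLES` mapping, and the `import_glossary` function are all hypothetical and do not correspond to any specific CAT tool’s import dialog.

```python
import csv
import io

# Hypothetical glossary saved as CSV; the French column is a placeholder
# for a language we do not want in this termbase.
GLOSSARY_CSV = """English,Italian,French,Definition,Note
King Mackerel,maccarello reale,<French term>,A type of fish,Menu items
"""

# The import decisions: which role each column plays.
COLUMN_ROLES = {
    "English": "source term",
    "Italian": "target term",
    "French": "do not import",
    "Definition": "definition",
    "Note": "other field",
}


def import_glossary(csv_text, roles):
    """Turn glossary rows into termbase entries according to the column roles."""
    entries = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = {}
        for column, value in row.items():
            # Columns without an assigned role are skipped as well.
            role = roles.get(column, "do not import")
            if role != "do not import":
                entry[role] = (value or "").strip()
        entries.append(entry)
    return entries


entries = import_glossary(GLOSSARY_CSV, COLUMN_ROLES)
```

Treating unmapped columns as “do not import” is a design choice for this sketch: it keeps unexpected columns out of the termbase unless you assign them a role explicitly.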

How Termbases Work

When the termbase is properly integrated in the CAT tool, source texts are automatically searched for relevant terminology. Your suggested translations are also searched for all terms included in your termbase. The terms are displayed on the screen of your CAT tool’s editor. It’s then the translator’s or reviewer’s job to decide if the translation shown by the tool is suitable for the context.

Figure 2 shows a screenshot of what this looks like in the CAT tool. Terms entered in the termbase appear highlighted in blue. For example, for “King Mackerel” and “Spanish Mackerel” in English, which are types of fish, you have “maccarello reale” and “maccarello reale maculato,” respectively, in Italian.

Figure 2: The results of a termbase search displayed in a CAT tool’s editor
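The automatic lookup described in this section can be approximated with a simple search. The sketch below is an assumption-laden illustration: the `TERMBASE` dictionary and the `find_terms` function are hypothetical, and it uses plain case-insensitive whole-word matching, whereas real CAT tools typically offer more flexible matching.

```python
import re

# Hypothetical two-entry termbase using the fish names from this article.
TERMBASE = {
    "King Mackerel": "maccarello reale",
    "Spanish Mackerel": "maccarello reale maculato",
}


def find_terms(segment, termbase):
    """Return (position, matched text, suggested translation) for each hit."""
    hits = []
    for source, target in termbase.items():
        # Whole-word, case-insensitive match of the source term.
        pattern = r"\b" + re.escape(source) + r"\b"
        for match in re.finditer(pattern, segment, flags=re.IGNORECASE):
            hits.append((match.start(), match.group(0), target))
    return sorted(hits)


segment = "Grilled king mackerel and Spanish Mackerel are on the menu."
hits = find_terms(segment, TERMBASE)
# hits == [(8, 'king mackerel', 'maccarello reale'),
#          (26, 'Spanish Mackerel', 'maccarello reale maculato')]
```

The function only suggests translations; as noted above, deciding whether a suggestion fits the context remains the translator’s or reviewer’s job.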

Each CAT tool may use different codes to import information along with the term entry. For example, in memoQ, if you highlight a term in red in the Excel file and specify that it’s a “NonTerm” in the “Term_Info” column (column “I”), it will be imported as a forbidden term. (See Figure 3.)

Figure 3: CAT tools may use different codes to import information along with the term entry

Termbases versus Translation Memories

Let’s take a look at how a termbase differs from a translation memory, as these two concepts are frequently confused. Most CAT tools include both features, but they serve different purposes.

A termbase is a searchable database or glossary containing single words or expressions that are included in a deliberate manner. A translation memory stores whole segments included in one or several projects once a linguist has confirmed those segments. While a termbase will show preferred translations for terms, a translation memory will allow you to look for previous translations for whole paragraphs or even for single terms or expressions. Previous translations may not always be suitable for your current needs, so the linguist will have to decide. This decision is based on the number of previous entries that use the same translation as well as the context of each one.

Online Termbases in Multilingual Projects

When you have different translators working with a termbase, most CAT tools offer a setting that allows you to moderate a termbase from the moment it’s created. Moderating a termbase means you have to confirm any term entries before they can be seen by all the translators working with that termbase. In other words, you essentially assume the role of a terminologist.

As the old saying goes, when you have too many cooks in the kitchen, things can get messy. That’s why it’s strongly recommended to use moderated termbases when working with several translators, especially if it’s a large group, as people will not only have different preferences for the translation of the same term but will also apply different criteria for deciding which terms should be added to your termbase.
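The moderation workflow can be pictured as a queue of proposals that only the moderator can release. The following is a minimal sketch under that assumption; the `ModeratedTermbase` class and its method names are hypothetical and are not taken from any CAT tool.

```python
from dataclasses import dataclass, field


@dataclass
class ModeratedTermbase:
    """Toy moderated termbase: proposals stay hidden until approved."""
    approved: dict = field(default_factory=dict)  # visible to all translators
    pending: list = field(default_factory=list)   # waiting for the moderator

    def propose(self, translator, source, target):
        self.pending.append((translator, source, target))

    def approve(self, index):
        _, source, target = self.pending.pop(index)
        self.approved[source] = target

    def reject(self, index):
        self.pending.pop(index)

    def lookup(self, source):
        # Translators only ever see approved entries.
        return self.approved.get(source)


tb = ModeratedTermbase()
tb.propose("translator_a", "King Mackerel", "maccarello reale")
tb.propose("translator_b", "King Mackerel", "Scomberomorus regalis")

before_review = tb.lookup("King Mackerel")  # nothing is visible yet

tb.approve(0)  # the moderator accepts the first proposal
tb.reject(0)   # and rejects the competing one (now first in the queue)

after_review = tb.lookup("King Mackerel")
```

Two translators proposed different translations for the same term, but only the one the moderator confirmed ever reached the shared termbase.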

Exporting Your Termbase

CAT tools allow you to export your termbases. This is useful because it allows you to back up your termbase and share it with colleagues who may be working with different CAT tools. Exporting also allows you to quickly make changes to the termbase outside the CAT tool and import it back.

Exporting is particularly useful when you’re working with several languages and want to make many changes simultaneously, as you can quickly add, modify, or delete all the information you want and simply reimport it to keep your termbase updated. Always remember that if you want to modify an existing entry using this procedure, you have to pay special attention to the Entry ID (aka Concept ID). (See Figure 4.)

Figure 4: Entry ID Number field
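To see why the Entry ID matters, consider the following sketch of a reimport. It assumes, purely for illustration, that a row whose Entry ID matches an existing entry updates that entry, while a row without a matching ID is added as a new entry. The `reimport` function and the field names are hypothetical; check your own tool’s documentation for its actual behavior.

```python
# Hypothetical termbase keyed by Entry ID.
termbase = {
    101: {"en": "King Mackerel", "it": "maccarello reale"},
    102: {"en": "Spanish Mackerel", "it": "maccarello reale maculato"},
}


def reimport(termbase, rows):
    """Merge edited rows into a copy of the termbase, matching on Entry ID."""
    updated = {entry_id: dict(fields) for entry_id, fields in termbase.items()}
    next_id = max(updated, default=0) + 1
    for row in rows:
        fields = {key: value for key, value in row.items() if key != "entry_id"}
        entry_id = row.get("entry_id")
        if entry_id in updated:
            # Known ID: the existing entry is modified in place.
            updated[entry_id].update(fields)
        else:
            # Missing or unknown ID: the row becomes a brand-new entry.
            updated[next_id] = fields
            next_id += 1
    return updated


edited = {"en": "King Mackerel", "it": "maccarello reale", "note": "Approved by client"}

with_id = reimport(termbase, [{"entry_id": 101, **edited}])
without_id = reimport(termbase, [{"entry_id": None, **edited}])

# with_id still has two entries, and entry 101 now carries the note.
# without_id has three entries: the old 101 is untouched and a duplicate was added.
```

Under these assumptions, losing the Entry ID during editing turns an intended update into a duplicate entry.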

Quality Assurance

One of the most powerful uses of the termbase is in conjunction with a quality assurance (QA) tool. These tools help us automatically detect errors in a translation. Most CAT tools have an integrated QA feature, but there are also stand-alone tools, such as Xbench, Verifika, and QA Distiller. These can be set up according to your needs and allow you to check things such as punctuation, numbers, spacing, consistency, and terminology.

QA tools can be set to show a warning if a term from your termbase hasn’t been used. They can also be configured to warn you when a forbidden term has been used.

In the example in Figure 5, instead of using “maccarello reale” in Italian for “King Mackerel” in English, the translator used “Scomberomorus regalis.”

Please note that in most CAT tools you can set a QA warning when a forbidden term is used as a translation for a given term or every time the forbidden term is used at all (regardless of having a specific matching term in the source text).
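Both kinds of warning, and the two ways of treating forbidden terms, can be sketched as follows. This is a simplified illustration, not how any particular QA tool is implemented: the `qa_check` function, its `forbid_everywhere` parameter, and the sample segments are assumptions made for this example.

```python
import re

# Hypothetical termbase entry with one forbidden translation.
TERMBASE = [
    {
        "source": "King Mackerel",
        "target": "maccarello reale",
        "forbidden": ["Scomberomorus regalis"],
    },
]


def contains(text, term):
    """Case-insensitive whole-word search."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def qa_check(source_segment, target_segment, termbase, forbid_everywhere=False):
    """Return terminology warnings for one source/target segment pair."""
    warnings = []
    for entry in termbase:
        source_present = contains(source_segment, entry["source"])
        # Warning 1: the source term is there but the preferred translation is not.
        if source_present and not contains(target_segment, entry["target"]):
            warnings.append(
                f'Missing term: "{entry["target"]}" expected for "{entry["source"]}"'
            )
        # Warning 2: a forbidden translation was used, either for this source
        # term only or anywhere at all, depending on the setting.
        if source_present or forbid_everywhere:
            for bad in entry["forbidden"]:
                if contains(target_segment, bad):
                    warnings.append(f'Forbidden term used: "{bad}"')
    return warnings


paired = qa_check("King Mackerel with lemon",
                  "Scomberomorus regalis al limone", TERMBASE)
unpaired_default = qa_check("Fish of the day", "Scomberomorus regalis", TERMBASE)
unpaired_strict = qa_check("Fish of the day", "Scomberomorus regalis", TERMBASE,
                           forbid_everywhere=True)
```

In the first call both warnings fire. In the last two calls the source text does not contain the term, so the forbidden translation is flagged only when the stricter setting is enabled.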

Managing Term Entries: Regular Expressions

Now that we’ve reviewed how to create and work with a termbase, let’s focus on the options for ongoing term inclusion. This is when regular expressions (RegEx) may be useful.

A simple definition of RegEx is that they are sequences of characters that define search patterns. They work like a small, specialized language for describing text: a wide variety of characters, each with its own specific meaning, are combined into a string that represents a pattern to be matched against other strings of text. RegEx are used most often in text searches, find-and-replace operations, and input validation.

For example, in regular expressions the pipe (|) symbol indicates alternatives: either the expression to its left or the one to its right can match. For instance, “a|b” matches either “a” or “b.” In a termbase entry, some CAT tools instead read a pipe placed after the root of a word as the end of the invariant stem, so “canci|ón” will match both the singular “canción” and the plural “canciones.”

When using a termbase with a QA tool, the pipe can be a lifesaver, especially in highly inflected languages like Russian or Polish. Just by inserting the pipe after the root of the word, you can eliminate the false errors (also called “false positives,” when an error reported by the QA tool isn’t actually an error) that conjugation, declension, or plurals would otherwise cause in your QA report.
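The two readings of the pipe can be compared in a short sketch. It uses Python’s standard `re` module for the regular expression part; the `stem_pattern` function is a hypothetical helper that imitates the stem-marker behavior and is not how any CAT tool is actually implemented.

```python
import re

# In a standard regular expression, the pipe means "either side may match".
alternation = re.findall(r"a|b", "cab")             # ['a', 'b']
as_plain_regex = re.findall(r"canci|ón", "canción")  # ['canci', 'ón']


def stem_pattern(term):
    """Read 'stem|ending' as: match any word that starts with the stem."""
    stem = term.split("|", 1)[0]
    return re.compile(r"\b" + re.escape(stem) + r"\w*", flags=re.IGNORECASE)


pattern = stem_pattern("canci|ón")
matches = pattern.findall("Una canción, dos canciones.")
# matches == ['canción', 'canciones']
```

Read as a plain regular expression, “canci|ón” only finds the two fragments separately. Read as a stem marker, it finds whole inflected words, which is what removes the false positives from a QA report.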

False positives due to the use of plurals can also be avoided by inserting an asterisk (*) at the end of your term, in tools that treat it as a wildcard for any ending. In regular expressions proper, the asterisk is known as a repeater symbol, meaning the preceding character can be found 0 or more times. For example, the regular expression ca*t will match the strings ct, cat, caat, caaat, etc.
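The repeater behavior is easy to verify with Python’s standard `re` module. The sketch below also shows a point worth checking in your own tool: in a strict regular expression, an asterisk at the end of a term repeats only the last character, so covering a plural ending requires a different pattern. The word “song” is just an illustrative term chosen for this example.

```python
import re

# The asterisk repeats the preceding character zero or more times.
words = ["ct", "cat", "caat", "caaat", "cut"]
repeater = re.compile(r"ca*t")
matched = [word for word in words if repeater.fullmatch(word)]
# matched == ['ct', 'cat', 'caat', 'caaat']

# In strict regex syntax, "song*" means "son" followed by zero or more "g",
# so it does not cover the plural "songs".
strict = re.fullmatch(r"song*", "songs")

# To accept any ending after the stem, a regex needs something like \w*.
any_ending = re.fullmatch(r"song\w*", "songs")
```

Here `strict` is `None` while `any_ending` is a successful match, which is why it matters whether your tool interprets the asterisk as a regular expression repeater or as a wildcard.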

The use of RegEx could vary depending on the tool you’re using. I only provided basic examples of what you can do, so I strongly recommend that you delve deeper into this topic for effective terminology management.

Figure 5: QA tools showing a warning when a forbidden term has been used

To Summarize…

Terminology management is the process of building a glossary with preferred translations (or forbidden ones) to be uploaded to a termbase. That termbase can work in the context of a QA tool or CAT tool with an integrated quality assurance feature.

The nature of the content to be translated can directly affect the criteria you use to create your termbase. If you set up your termbase correctly, you’ll be able to:

  • Have preferred translations within context.
  • Set preferred capitalization rules.
  • Manage forbidden terminology.

Other essential tools that should be used in tandem with termbases are style guides and translation memories, which can help improve your translation quality while reducing the time spent on the actual work.

At the end of the day, it’s important to stress that all these tools need to be managed and leveraged by people. Relying on the expert linguist’s ability to make logical and contextualized decisions is the best terminology management tool that you can possibly have.

Marina Ilari, CT is an ATA-certified English>Spanish translator with over 17 years of experience in the translation industry. She is an expert in translation tools and managing projects in English and Spanish. She has worked as a translator, editor, and quality assurance specialist for many companies around the world with a special focus on creative translations and video game localization. She is the chief executive officer of Terra Translations and co-host of the podcast about translation, En Pantuflas. marina@terratranslations.com

