tekom - Europe

FAIRterm 2.0: A Web Application for FAIR Terminology Management

Dr. Federica Vezzani holds a PhD in terminology and is a tenure-track assistant professor at the Department of Linguistic and Literary Studies of the University of Padova, Italy. She is a member of ISO/TC 37 “Language and Terminology”. Her main research interests are terminology, specialised translation, and technical communication. In particular, she focuses on the management of multilingual terminology in the medical domain, and she has developed the FAIR terminology paradigm for the optimal organisation of findable, accessible, interoperable, and reusable terminological data.

This article summarizes the latest IUNTC talk, held on Thursday, December 4, which introduced FAIRterm 2.0, the updated version of a web application developed to support the management of terminological data in accordance with the FAIR principles (Findable, Accessible, Interoperable, Reusable). The new release introduces improved functions for creating, curating, and disseminating terminology resources, ensuring alignment with international standards and promoting their long-term sustainability. The talk showed how the tool can benefit researchers and professionals in terminology, technical communication, and specialized translation by enhancing the visibility, accessibility, and reusability of terminological data.
 


The FAIR terminology paradigm and motivation

At the beginning of the talk, the FAIR terminology paradigm, developed during the speaker's doctoral research, was introduced. The core idea was to adapt the FAIR data principles, originally defined for research data in general, to the specific case of terminological resources. FAIR-compliant terminological data should be structured so that they can be searched and discovered efficiently, accessed via standard protocols, combined with other language resources, and reused in different tools, projects, and communities.

Terminology tools are never neutral: they always embody theoretical assumptions about what terminology is and how concepts, terms, and languages relate to one another. FAIRTerm was therefore built on a solid theoretical foundation, drawing on ISO standards and on terminology science more broadly.

Theoretical foundations: objects, concepts, and terms

The theoretical background from terminology science underlying FAIRTerm was then briefly reviewed. In this approach, terminology is seen as a discipline with two interrelated dimensions: a conceptual dimension and a linguistic dimension. Terminologists are not only interested in terms as linguistic forms, but also in concepts as the extra-linguistic units of knowledge they designate.

Drawing on ISO standards, the key notion of “object” was described as anything perceivable or conceivable. Objects can be physical, immaterial, or even imaginary. Objects have properties such as shape or function. Concepts arise at a higher level of abstraction: they are units of knowledge created by unique combinations of characteristics, which themselves are abstractions of properties. This was illustrated using a set-theoretic view, where a concept is defined by the set of characteristics it combines.
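The set-theoretic view can be made concrete with a small sketch. The characteristics and the two vehicle concepts below are invented for illustration and were not part of the talk. The subset test encodes the common assumption that a specific concept has all characteristics of its generic concept plus at least one more.

```python
# Hypothetical characteristics; each concept is modelled as the set it combines.
vehicle = frozenset({"transports people or goods", "moves on land"})
electric_vehicle = vehicle | {"powered by an electric motor"}


def is_specific_of(candidate: frozenset, generic: frozenset) -> bool:
    """True if `candidate` has every characteristic of `generic` plus at least one more."""
    return generic < candidate  # proper-subset test on the characteristic sets


print(is_specific_of(electric_vehicle, vehicle))  # True
print(is_specific_of(vehicle, electric_vehicle))  # False
```

Modelling a concept as an immutable set keeps the focus on the combination of characteristics rather than on any term used to name it.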

A crucial point highlighted was that terminology aims to describe shared expert knowledge rather than individual mental representations. Concepts are not merely “units of thought” in one person’s mind, but collectively agreed reference points within a community of specialists, for example the shared understanding of a specific pathology among medical experts.

Only once the conceptual layer has been modelled does attention turn to the linguistic dimension, where concepts are verbalized through terms in natural languages. A single concept may be designated by several synonymous or variant terms in each language.

Data modelling and ISO standards

Building on this conceptual framework, an entity–relationship (ER) schema was defined for the database underlying FAIRTerm. At its core, a concept can be verbalized in many languages and can be designated by any number of terms. Conversely, a given term is linked to exactly one concept entry in a specific domain. This asymmetry reflects the concept-oriented approach: entries are organised around concepts, not around words.

The ER model was aligned with international standards for terminological databases, in particular the Terminological Markup Framework (TMF, ISO 16642), ISO-based data category specifications and repositories, and TBX (TermBase eXchange), the XML-based serialization of the TMF model used as a standard interchange format for terminology.

In TMF, a terminology resource is defined as a set of concept entries. Each concept entry describes exactly one concept and is subdivided into language sections, which in turn contain one or more term sections. This results in a hierarchical structure: concept level, language level, and term level. Additional levels exist, such as term-component sections for cases where individual components of multiword terms need to be described separately.
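The three-level hierarchy can be sketched as nested data structures. This is a minimal illustration, not FAIRTerm's actual schema: the class and attribute names are assumptions, and the term-component level is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class TermSection:
    term: str  # term level: one designation of the concept


@dataclass
class LanguageSection:
    lang: str  # language level, identified by a language code such as "en"
    term_sections: list[TermSection] = field(default_factory=list)


@dataclass
class ConceptEntry:
    concept_id: str  # concept level: exactly one concept per entry
    language_sections: dict[str, LanguageSection] = field(default_factory=dict)

    def add_term(self, lang: str, term: str) -> TermSection:
        """Attach a term to this concept, creating the language section if needed."""
        section = self.language_sections.setdefault(lang, LanguageSection(lang))
        term_section = TermSection(term)
        section.term_sections.append(term_section)
        return term_section


entry = ConceptEntry("c1")  # placeholder identifier
entry.add_term("en", "neighbourhood electric vehicle")
entry.add_term("en", "NEV")
for lang, section in entry.language_sections.items():
    print(lang, [ts.term for ts in section.term_sections])
```

Because a term section only exists inside one concept entry, the structure also reflects the concept-oriented rule that a term belongs to exactly one concept.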

Particular attention was devoted to the choice of which data categories to assign to each level. For example, FAIRTerm attaches definitions to the language level, allowing the same concept to have well-formed definitions in multiple languages. By contrast, data categories such as part of speech, gender, and number are clearly term-level properties.

This modelling approach was illustrated with an example TBX entry from the e-mobility domain, where a single concept entry includes several language sections (for English, French, and Italian). Within each language section, term sections store designations like “neighbourhood electric vehicle” and its acronym “NEV”, with appropriate metadata and subject-field information.
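To give a rough idea of what such an entry looks like, the sketch below builds a simplified TBX-style fragment with Python's standard library. It is not the entry shown in the talk: the identifier and the term-type values are placeholders, only the English section is included because the other terms are not given here, and the result is a fragment rather than a complete, validated TBX file.

```python
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

entry = ET.Element("conceptEntry", {"id": "c1"})  # placeholder id
subject = ET.SubElement(entry, "descrip", {"type": "subjectField"})
subject.text = "e-mobility"

lang_sec = ET.SubElement(entry, "langSec", {XML_LANG: "en"})
# Term-type values are illustrative assumptions.
for text, term_type in [
    ("neighbourhood electric vehicle", "fullForm"),
    ("NEV", "acronym"),
]:
    term_sec = ET.SubElement(lang_sec, "termSec")
    ET.SubElement(term_sec, "term").text = text
    ET.SubElement(term_sec, "termNote", {"type": "termType"}).text = term_type

ET.indent(entry)  # pretty-printing; requires Python 3.9 or later
print(ET.tostring(entry, encoding="unicode"))
```

Further language sections would be added as sibling `langSec` elements under the same `conceptEntry`.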

From PhD prototype to first FAIRTerm version

The first FAIRTerm implementation, developed around 2019, was strongly translation-oriented. The interface displayed source and target terms side by side, mirroring the same data categories on each side. This reflected the original focus on translators and on bilingual data organization. Early feedback came primarily from students in specialized translation.

The project gained momentum when FAIRTerm was adopted by the European Parliament’s Terminology Coordination Unit (TermCoord) for the “Terminology without Borders” initiative. This large collaborative project involved partners across Europe and beyond, working on terminology in multiple domains such as health, environment, and migration. In this context, the number of FAIRTerm users grew from a very small group to more than 200 contributors.

This scaling up revealed several shortcomings of the first version. The balance between conceptual and linguistic dimensions was weak: most data categories focused on the linguistic layer, and the conceptual structuring of the termbase was underexploited. The principle of term autonomy—according to which each term designating a concept should be describable with the same set of fields—was not fully implemented. In addition, the original configuration was primarily descriptive, whereas technical communicators needed prescriptive fields indicating which terms should or should not be used in documentation and corporate communication.

Design and features of the current FAIRTerm web application

To address these issues, a new version of FAIRTerm was developed and presented at the meeting.

The current interface is explicitly organized around the three-level structure of concept, language, and term. When creating a new concept entry, users first select a subject field from a controlled vocabulary derived from EuroVoc, ensuring a degree of interoperability for domain labelling. Since EuroVoc does not cover all possible domains, FAIRTerm also provides a free-text “sub-subject field” for more fine-grained or project-specific domains.

At the concept level, users can declare hierarchical relationships between concepts within their termbase: generic–specific relations, as well as partitive and comprehensive relations. This supports the development of concept systems grounded in standard terminological theory.

At the language level, users add language sections for each language in which the concept is described. Language codes are taken from standard ISO language lists. For each language section, a definition, external references and sources, and notes can be entered—for example, to indicate whether a definition has been drafted by the terminologist or taken from an external standard.

At the term level, one or more term sections can be added per language. Each term section contains the term itself, a usage label (preferred, admitted, deprecated, or obsolete), and morphosyntactic information such as part of speech, gender, and number, using values adopted from the Universal Dependencies project to maximize interoperability. Further fields record the term type (abbreviation, acronym, appellation, borrowed term, etc.), contextual examples and their sources, cross-references, register information, collocations, and notes.

A key improvement over the first version is that every term section in a given language has access to the same configuration of data categories. This enforces the principle of term autonomy and avoids uneven documentation. Additional language sections can be added to the same concept entry to support multilingual work.
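Term autonomy can be illustrated with a small sketch in which every term section is created from the same field list. The field names are a hypothetical subset of the data categories listed above, and the usage labels assigned to the two example terms are purely illustrative. None of this is FAIRTerm's actual code.

```python
from enum import Enum


class UsageLabel(Enum):
    PREFERRED = "preferred"
    ADMITTED = "admitted"
    DEPRECATED = "deprecated"
    OBSOLETE = "obsolete"


# Hypothetical subset of term-level data categories.
TERM_FIELDS = ("term", "usage", "part_of_speech", "term_type", "context", "note")


def new_term_section(term: str, usage: UsageLabel, **values: str) -> dict:
    """Create a term section that always exposes the full set of fields."""
    unknown = set(values) - set(TERM_FIELDS)
    if unknown:
        raise KeyError(f"unknown data categories: {sorted(unknown)}")
    section = dict.fromkeys(TERM_FIELDS)  # every field present, empty by default
    section.update(values, term=term, usage=usage)
    return section


full = new_term_section(
    "neighbourhood electric vehicle", UsageLabel.PREFERRED, part_of_speech="NOUN"
)
short = new_term_section("NEV", UsageLabel.ADMITTED, term_type="acronym")
print(full.keys() == short.keys())  # True
```

Both sections carry the same keys even though different fields are filled in, which is the property the principle asks for.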

FAIRTerm supports exporting terminology in TBX and in an auxiliary spreadsheet format, enabling import into other tools such as computer-assisted translation systems or authoring environments. As an illustration, an entry from a student project in law was shown for the French concept “droit subjectif”, with the corresponding Italian term “diritto soggettivo”, fully populated in the interface.

Access model, collaboration, and data import

FAIRTerm is freely available for individual users. Interested users can fill in a web form to request credentials, after which they receive a username and password by email and can start working in their personal workspace. In the current version, each user only sees their own data, and no built-in collaboration mode is available.

In the earlier, translation-oriented version, a collaborative mode existed in which project owners and teachers could oversee contributions from students and partners. However, managing multi-user projects, role differentiation, and version control proved too complex. For this reason, collaborative functionalities have not yet been reintroduced, although a role-based model (with administrators, editors, and reviewers) is envisaged for future development.

During the question session, it was asked whether FAIRTerm allows importing existing data. It was explained that this is possible via a structured Excel template that mirrors the TBX-based internal model. Users can populate the template with their terms, definitions, and metadata; the development team can then upload the file into the user’s private account and convert the content into FAIRTerm entries without data loss.
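The general idea of converting a flat template into concept-oriented entries can be sketched as follows. The column names are assumptions, CSV text stands in for the Excel file, and the actual conversion is carried out by the development team with tooling not described in the talk.

```python
import csv
import io
from collections import defaultdict

# Hypothetical template layout; the real template may use different columns.
TEMPLATE = """concept_id,lang,term
c1,fr,droit subjectif
c1,it,diritto soggettivo
"""


def load_template(text: str) -> dict:
    """Group flat rows into concept -> language -> list of terms."""
    entries = defaultdict(lambda: defaultdict(list))
    for row in csv.DictReader(io.StringIO(text)):
        entries[row["concept_id"]][row["lang"]].append(row["term"])
    # Convert the nested defaultdicts to plain dicts for a stable result.
    return {cid: dict(langs) for cid, langs in entries.items()}


print(load_template(TEMPLATE))
```

Rows that share a concept identifier end up in the same entry, so one spreadsheet line per term is enough to rebuild the three-level structure.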

Future work and outlook

In the final part of the presentation, a “to-do list” for FAIRTerm was outlined. One high-priority item is the implementation of a search and browse interface allowing users to inspect all entries in their termbase and retrieve concepts by term or by domain. So far, the focus has been on data entry rather than on retrieval.

Another open question concerns subject-field classification. While EuroVoc offers a useful standardized vocabulary, it is not always aligned with the needs of terminologists working in areas such as sports or emerging technologies. The development team plans to consult the community on whether to continue with EuroVoc, adopt alternative classifications, or allow users to define their own domain hierarchies in a controlled way.

On the conceptual side, the relations supported by FAIRTerm are expected to be extended beyond the current hierarchical links (generic and partitive) to include associative relations. In the longer term, the goal is to enable automatic visualization of concept systems.

Two further technical goals are the implementation of automatic saving and support for additional export formats. While TBX export is already available and constitutes a major step towards interoperability, many semantic web and linked open data initiatives rely on formats such as RDF, SKOS, and OWL. Work is therefore underway to explore systematic conversion from TBX to RDF-based representations.
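One possible shape of such a conversion is sketched below, emitting SKOS in Turtle syntax from plain strings. This is not the team's approach. The mapping from usage labels to SKOS label properties, the `ex:` namespace, and the usage labels given to the example terms are all assumptions made for illustration.

```python
# Assumed mapping from usage labels to SKOS label properties.
USAGE_TO_SKOS = {
    "preferred": "skos:prefLabel",
    "admitted": "skos:altLabel",
    "deprecated": "skos:hiddenLabel",
    "obsolete": "skos:hiddenLabel",
}


def literal(text: str, lang: str) -> str:
    """Return a language-tagged Turtle literal with backslashes and quotes escaped."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def concept_to_turtle(concept_id: str, terms: list[tuple[str, str, str]]) -> str:
    """Serialize one concept; `terms` holds (language code, term, usage label) triples."""
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        "@prefix ex: <http://example.org/concept/> .",  # placeholder namespace
        "",
        f"ex:{concept_id} a skos:Concept ;",
    ]
    props = [
        f"    {USAGE_TO_SKOS[usage]} {literal(term, lang)}"
        for lang, term, usage in terms
    ]
    lines.append(" ;\n".join(props) + " .")
    return "\n".join(lines)


print(concept_to_turtle("c1", [
    ("fr", "droit subjectif", "preferred"),
    ("it", "diritto soggettivo", "preferred"),
]))
```

A production converter would more likely use an RDF library and would also need to carry over definitions, sources, and concept relations.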

The meeting concluded with appreciative comments from participants. Overall, the presentation demonstrated how a theoretically grounded, FAIR-compliant approach to terminology can be implemented in a practical web application. FAIRTerm emerges as a promising tool for terminologists, translators, and technical communicators who aim to build reusable and interoperable terminology resources for multilingual knowledge sharing.