The preceding sections mentioned several formats that are used for terminology management. Because terminology can be organised in many different database formats, it is unlikely that a single relational database format, such as an SQL schema with a fixed structure, could be agreed on for terminology interchange. Nevertheless, a powerful SGML-based means of terminology interchange was developed, originally in cooperation with the Text Encoding Initiative (TEI) and the Localisation Industry Standards Association (LISA). LISA is an association of companies and institutions working on the translation and adaptation of software into different languages; TEI is an initiative for encoding texts with their relevant structural and semantic information.
The goal of this cooperation was to produce a platform-independent and publicly available format. The result is the Machine-Readable Terminology Interchange Format (MARTIF), also known as ISO (FDIS) 12200. ISO (FDIS) 12620 describes 150 data categories for MARTIF. This inventory, though already large, is not intended to prescribe a fixed set of categories but to structure the categories and to show which ones are possible. The large number results from the differing needs and approaches of the working groups involved, even though categories whose names were purely synonymous have already been merged. Unfortunately, MARTIF does not meet the needs of approaches to terminology that are not concept-oriented, i.e. lexicographic and NLP approaches, because it presupposes a concept orientation rather than a word orientation.
The data categories for MARTIF are divided into ten sections, which are grouped into four classes. The four classes with their sections are (compare Melby Wright, 1998):
SGML code, and consequently MARTIF, is not easy to read, but it is human-readable. This is a major advantage of SGML documents over other types of data, such as binary files.
Just as HTML browsers (such as Netscape) present WWW documents written in HTML, user-friendly interfaces could present MARTIF data. A MARTIF parser could be embedded in a "black box" that guides the user to enter only data conforming to the MARTIF specification and that shows the user only the relevant information. MARTIF tags, though they may seem complex, are just a way of structuring data. The design of MARTIF allows new data categories to be introduced as attribute values of Generic Identifiers (GIs); see also section 3.1. The advantage is that the system stays as open as possible. If the data categories were GIs themselves, each of them would have to be defined in the Document Type Definition (DTD), and interchange is only possible if everybody uses the same DTD. Expressing data categories as attribute values of GIs, on the other hand, allows new categories to be added without changing the DTD and therefore gives greater flexibility.
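The difference between data categories as attribute values and data categories as GIs can be illustrated with a minimal sketch. The fragment below is only written in the style of MARTIF: the element names, the `type` attribute and its values are assumptions made for illustration and are not taken from ISO (FDIS) 12200, and the fragment is written as well-formed XML so that the Python standard library can parse it. The point is that a generic reader can collect every data category without knowing the category names in advance.

```python
import xml.etree.ElementTree as ET

# Hypothetical MARTIF-style entry (names are illustrative assumptions).
# Data categories appear as values of the "type" attribute, not as tag names.
SAMPLE = """
<termEntry id="e1">
  <descrip type="subjectField">phonetics</descrip>
  <descrip type="definition">smallest contrastive unit of sound</descrip>
  <langSet lang="en">
    <term>phoneme</term>
    <termNote type="partOfSpeech">noun</termNote>
  </langSet>
</termEntry>
"""


def collect_categories(xml_text):
    """Return (element name, data category, value) for every typed element."""
    root = ET.fromstring(xml_text)
    rows = []
    for element in root.iter():  # document order, root included
        category = element.get("type")
        if category is not None:
            rows.append((element.tag, category, (element.text or "").strip()))
    return rows


rows = collect_categories(SAMPLE)
for row in rows:
    print(row)
```

Adding a further category to the sample, for instance another `descrip` element with a new `type` value, changes neither the set of tag names nor the reader. Had each category been a tag of its own, the set of permitted tags, and with it the DTD, would have had to change.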
Another advantage of the MARTIF format is that links to external targets can be placed inside the document. References, as well as any other kind of hypertext link, can easily be included.
In its first stage, MARTIF still requires programmers to inspect the source files before implementing import tools for new systems. To achieve so-called blind access, in which anyone can interchange termbanks between any system and MARTIF, further standardisation would be needed, especially of data categories, specific subject fields, etc.
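Why an agreed inventory of data categories matters for import can be sketched as follows. The example assumes a hypothetical set of categories that two exchange partners have agreed on, and a hypothetical MARTIF-style entry written as well-formed XML; none of the names are taken from the standard. An import tool can then report the categories it does not know, instead of a programmer having to read the source file first.

```python
import xml.etree.ElementTree as ET

# Hypothetical inventory agreed on by two exchange partners (an assumption).
AGREED_CATEGORIES = {"subjectField", "definition", "partOfSpeech"}

# Hypothetical incoming entry; "usageRegion" is not part of the agreed set.
ENTRY = """<termEntry id="e1">
  <descrip type="subjectField">phonetics</descrip>
  <descrip type="usageRegion">UK</descrip>
  <langSet lang="en"><term>phoneme</term></langSet>
</termEntry>"""


def unknown_categories(xml_text, agreed=AGREED_CATEGORIES):
    """Return the sorted data categories in xml_text that are not agreed on."""
    root = ET.fromstring(xml_text)
    found = {el.get("type") for el in root.iter() if el.get("type") is not None}
    return sorted(found - set(agreed))


unknown = unknown_categories(ENTRY)
print(unknown)
```

An empty result would mean that the entry uses only categories both sides understand. Any other result names exactly the categories for which further agreement would be needed.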
Thorsten Trippel
Fri May 21 13:04:11 MET DST 1999