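KORPUSBASIERTE EXTRAKTION VON (MEHRWORT-) TERMINOLOGIEKANDIDATEN: EINE STUDIE ANHAND DER JURISTISCHEN FACHSPRACHE DES DEUTSCHEN [Corpus-Based Extraction of (Multi-Word) Terminology Candidates: A Study Based on German Legal Language]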

Date
2023
Authors
Tarloyan Astghik
Elbakidze Tamar
Lipateliani Maia
Osepashvili Nino
Hovsepyan Hayk
Khachatryan Gohar
Dallakyan Meri
Straube Annika
Babych Bogdan
Atayan Vahram
Publisher
Lingva
Abstract
This paper presents a pilot study on terminology extraction. Its main objective is to determine to what extent automated terminology extraction becomes more effective when it is based on keyword identification rather than on structure-based term extraction alone. In the project, single- and multi-word terms are extracted from a specialized legal corpus of decisions of the German Federal Court of Justice (Bundesgerichtshof, BGH). Two relatively simple procedures for automated terminology extraction are tested. In the first, multi-word expressions are identified using a purely syntactic pattern. In the second, the keywords of the BGH corpus are evaluated as terminological candidates and serve as a starting point for identifying multi-word terms. The two methods are compared using precision and recall. Both measures are calculated by matching the automatically extracted terminological candidates against manually annotated terms in the BGH corpus, which the study treats as the gold standard annotation.
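The evaluation step described in the abstract can be illustrated with a minimal sketch. It assumes exact string matching after lowercasing and whitespace trimming, which may differ from the matching procedure actually used in the study. The function name `precision_recall` and the toy term lists are hypothetical and are not taken from the paper or the BGH corpus annotation.

```python
def precision_recall(candidates, gold):
    """Compare extracted term candidates with gold-standard terms.

    Assumption: a candidate counts as correct only if it matches a gold
    term exactly after lowercasing and trimming whitespace.
    """
    extracted = {c.strip().lower() for c in candidates}
    reference = {g.strip().lower() for g in gold}
    true_positives = extracted & reference

    # Precision: share of extracted candidates that are gold terms.
    precision = len(true_positives) / len(extracted) if extracted else 0.0
    # Recall: share of gold terms that were extracted.
    recall = len(true_positives) / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical toy data, invented for illustration only.
gold_terms = [
    "Revision",
    "Berufungsgericht",
    "einstweilige Verfügung",
    "Schadensersatz",
]
pattern_candidates = [
    "einstweilige Verfügung",
    "vorliegender Fall",
    "Schadensersatz",
    "letzte Instanz",
]

p, r = precision_recall(pattern_candidates, gold_terms)
print(f"precision={p:.2f} recall={r:.2f}")
```

Running the same function once on the output of the pattern-based procedure and once on the output of the keyword-based procedure would yield the two pairs of scores that the comparison relies on.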
Description
DOI: 10.51307/18293107/laph/23.64-128
Keywords
Linguistics and Philology