Creating Corpora of Finland’s Sign Languages
Salonen, J., Takkinen, R., Puupponen, A., Nieminen, H., & Pippuri, O. (2016). Creating Corpora of Finland’s Sign Languages. In E. Efthimiou, S.-E. Fotinea, T. Hanke, J. Hochgesang, J. Kristoffersen, & J. Mesch (Eds.), Workshop Proceedings : 7th Workshop on the Representation and Processing of Sign Languages: Corpus Mining / Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016) (pp. 179-184). European Language Resources Association (ELRA).
© The Authors, 2016. This work is distributed under the terms of a Creative Commons License.
This paper discusses the process of creating corpora of the sign languages used in Finland, Finnish Sign Language (FinSL) and
Finland-Swedish Sign Language (FinSSL). It describes the process of recruiting informants and collecting data, editing and storing the data, the
general principles of annotation, and the creation of a web-based lexical database, the FinSL Signbank, developed on the basis of the
NGT Signbank, which is a branch of the Auslan Signbank. The corpus project of Finland’s Sign Languages (CFINSL) started in
2014 at the Sign Language Centre of the University of Jyväskylä. Its aim is to collect conversations and narrations from 80 FinSL
users and 20 FinSSL users who are living in different parts of Finland. The participants are filmed in signing sessions led by a native
signer in the Audio-visual Research Centre at the University of Jyväskylä. The edited material is stored in the storage service
provided by the CSC – IT Center for Science, and the metadata will be recorded in the CMDI metadata format. Every informant is asked to sign
a consent form in which they state the purposes for which their signing may be used. The corpus data are annotated using the ELAN
tool. At present, annotations are created at the levels of glosses and translations.
