The International Comparable Corpus : Challenges in building multilingual spoken and written comparable corpora
Čermáková, A., Jantunen, J., Jauhiainen, T., Kirk, J., Křen, M., Kupietz, M., & Uí Dhonnchadha, E. (2021). The International Comparable Corpus : Challenges in building multilingual spoken and written comparable corpora. Research in Corpus Linguistics, 9(1), 89-103. https://doi.org/10.32714/ricl.09.01.06
Published in
Research in Corpus Linguistics
Date
2021
Copyright
© 2021 Research in Corpus Linguistics
This paper reports on the efforts of twelve national teams in building the International Comparable Corpus (ICC; https://korpus.cz/icc), which will contain highly comparable datasets of spoken, written and electronic registers. The languages currently covered are Czech, Finnish, French, German, Irish, Italian, Norwegian, Polish, Slovak, Swedish and, more recently, Chinese, as well as English, which is considered to be the pivot language. The goal of the project is to provide much-needed data for contrastive corpus-based linguistics. The ICC is committed to re-using existing multilingual resources as much as possible, and its design is modelled, with various adjustments, on the International Corpus of English (ICE). As such, ICC will contain approximately the same balance of 40 percent written language and 60 percent spoken language, distributed across 27 different text types and contexts. A number of issues encountered by the project teams are discussed, ranging from copyright and data sustainability to technical advances in data distribution.
Publisher
Asociacion Espanola de Linguistica de Corpus
ISSN
2243-4712
Original source
http://ricl.aelinco.es/first-view/155-Article%20Text-1147-1-10-20210618.pdf
Publication in research information system
https://converis.jyu.fi/converis/portal/detail/Publication/98442746
Similar items
Showing items with a similar title or keywords.
- Étude comparative du futur simple dans un corpus littéraire finno-français
  Karhunen, Anna (2008)
- Between context and comparability : Exploring new solutions for a familiar methodological challenge in qualitative comparative research
  Kosmützky, Anna; Nokkala, Terhi; Diogo, Sara (Wiley-Blackwell, 2020)
  Finding the balance between adequately describing the uniqueness of the context of studied phenomena and maintaining sufficient common ground for comparability and analytical generalisation has widely been recognised as a ...
- Do concepts and methods have ethics?
  Laihonen, Petteri (Language on the Move, 2020)
- Corpora, phraseology and dictionaries : How does corpus research intersect language teaching and learning?
  Jantunen, Jarmo Harri (Uusfilologinen Yhdistys, 2016)
  This article discusses the role of corpus data in language learning and teaching as well as the benefits of using authentic language data in learner dictionary writing. It has been argued that acquiring and teaching ...
- Word clouds and beyond : corpus linguistic self-study material package for English for academic purposes
  Hokkanen, Jere (2019)
  English is the common language of the academic world, its lingua franca. In this community, English is an essential requirement for full participation in international contexts. A tertiary-level learner needs academic English ...
Unless otherwise stated, publicly available JYX metadata (excluding abstracts) may be freely reused under the CC0 license.