Constructing a Norwegian academic wordlist

Conference paper by Kristin Hagen, Janne Bondi Johannessen and Arash Saidi in Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), 2016.

Abstract

We present the development of a Norwegian Academic Wordlist (AKA list) for the Norwegian Bokmål variety. To identify specific academic vocabulary we developed a 100-million-word academic corpus based on the University of Oslo archive of digital publications. Other corpora were used for testing and developing general word lists. We tried two different methods, those of Carlund et al. (2012) and Gardner & Davies (2013), and compared them. The resulting list is presented on a web site, where the words can be inspected in different ways, and freely downloaded.

Access the paper on the homepage of LREC.

Published Aug. 16, 2017 2:05 PM - Last modified May 2, 2024 10:44 AM