Knjižnica Filozofskog fakulteta
Sveučilišta u Zagrebu
Faculty of Humanities and Social Sciences Institutional Repository

Statistical Language Models for Croatian Weather-domain Corpus



Načinović, Lucia and Martinčić-Ipšić, Sanda and Ipšić, Ivo. (2009). Statistical Language Models for Croatian Weather-domain Corpus. In: 2nd International Conference “The Future of Information Sciences: INFuture2009 – Digital Resources and Knowledge Sharing”, 4-6 November 2009, Zagreb, Croatia.



Statistical language modelling estimates the regularities of natural languages. Language models are used in speech recognition, machine translation and other speech and language technology applications. This paper presents a procedure for building language models for a Croatian weather-domain corpus. Several types of statistical n-gram language models and smoothing methods are described, and the resulting models are compared in terms of their estimated perplexity.
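
To make the terms in the abstract concrete, the following is a minimal sketch of a bigram model with add-one (Laplace) smoothing and a perplexity estimate. It is not the authors' procedure: the abstract does not say which smoothing methods or toolkit were used, add-one smoothing is chosen here only for brevity, and the function names (`train_bigram`, `build_vocab`, `bigram_prob`, `perplexity`) and the toy English sentences are hypothetical.

```python
import math
from collections import Counter

BOS, EOS, UNK = "<s>", "</s>", "<unk>"  # hypothetical marker tokens


def train_bigram(sentences):
    """Count unigrams and bigrams over tokenized, boundary-padded sentences."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = [BOS] + tokens + [EOS]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def build_vocab(unigrams):
    """Predictable words: every training type except <s>, plus an <unk> slot."""
    return (set(unigrams) - {BOS}) | {UNK}


def bigram_prob(prev, word, unigrams, bigrams, vocab_size):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)


def perplexity(sentences, unigrams, bigrams, vocab):
    """Perplexity = 2 ** (-average log2 probability per predicted token)."""
    log_sum, count = 0.0, 0
    for tokens in sentences:
        # Map out-of-vocabulary words to <unk> before scoring.
        mapped = [t if t in vocab else UNK for t in tokens]
        padded = [BOS] + mapped + [EOS]
        for prev, word in zip(padded, padded[1:]):
            p = bigram_prob(prev, word, unigrams, bigrams, len(vocab))
            log_sum += math.log2(p)
            count += 1
    return 2 ** (-log_sum / count)


if __name__ == "__main__":
    # Toy training data, invented for illustration only.
    train = [["sunny", "and", "warm"], ["cloudy", "and", "cold"]]
    unigrams, bigrams = train_bigram(train)
    vocab = build_vocab(unigrams)
    print(perplexity([["sunny", "and", "warm"]], unigrams, bigrams, vocab))
    print(perplexity([["windy", "and", "rainy"]], unigrams, bigrams, vocab))
```

A lower perplexity means the model assigns higher probability to the test text, which is why it can serve as the comparison criterion between models. In this sketch the sentence seen in training scores lower than the one containing unseen words.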

Item Type: Published conference work (Lecture)
Uncontrolled Keywords: statistical language modelling, n-gram, smoothing methods, Croatian weather-domain corpus
Subjects: Information sciences > Social-humanistic informatics; Information sciences > Natural language processing, lexicography and encyclopedic science
Departments: Department of Information Science
Date Deposited: 19 May 2017 09:29
Last Modified: 19 May 2017 09:29
