Statistical Language Models for Croatian Weather-domain Corpus

Načinović, Lucia and Martinčić-Ipšić, Sanda and Ipšić, Ivo. (2009). Statistical Language Models for Croatian Weather-domain Corpus. In: 2nd International Conference “The Future of Information Sciences: INFuture2009 – Digital Resources and Knowledge Sharing”, 4-6 November 2009, Zagreb, Croatia.

Official URL: http://infoz.ffzg.hr/INFuture/2009/papers/INFuture2009.pdf

Abstract

Statistical language modelling estimates the regularities in natural languages. Language models are used in speech recognition, machine translation and other speech and language technology applications. This paper presents a procedure for building language models for a Croatian weather-domain corpus. Different types of statistical n-gram language models and smoothing methods are presented, and the resulting models are compared in terms of their estimated perplexity.
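
To make the terms in the abstract concrete, the following is a minimal sketch of a bigram model with add-one (Laplace) smoothing and a perplexity estimate. It is an illustration only, not the procedure used in the paper: the abstract does not say which smoothing methods or n-gram orders were evaluated, so add-one smoothing is swapped in as the simplest case. The toy English sentences, the function names and the `<unk>` handling are all assumptions made for this example.

```python
import math
from collections import Counter

UNK, BOS, EOS = "<unk>", "<s>", "</s>"

def train_bigram(sentences):
    """Count bigram contexts and bigrams over tokenized sentences."""
    context_counts, bigram_counts = Counter(), Counter()
    vocab = {EOS, UNK}  # tokens that can be predicted (BOS is context only)
    for sent in sentences:
        vocab.update(sent)
        tokens = [BOS] + sent + [EOS]
        for prev, cur in zip(tokens, tokens[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab

def bigram_prob(prev, cur, context_counts, bigram_counts, vocab):
    """Add-one smoothed estimate of P(cur | prev)."""
    return (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + len(vocab))

def perplexity(sentences, context_counts, bigram_counts, vocab):
    """Perplexity = 2 ** (negative average log2 probability per predicted token)."""
    log_prob, n_tokens = 0.0, 0
    for sent in sentences:
        # Map words never seen in training to the unknown-word token.
        mapped = [w if w in vocab else UNK for w in sent]
        tokens = [BOS] + mapped + [EOS]
        for prev, cur in zip(tokens, tokens[1:]):
            p = bigram_prob(prev, cur, context_counts, bigram_counts, vocab)
            log_prob += math.log2(p)
            n_tokens += 1
    return 2 ** (-log_prob / n_tokens)

# Hypothetical toy data standing in for a weather-domain corpus.
train = [
    ["sunny", "in", "the", "morning"],
    ["rain", "in", "the", "afternoon"],
    ["cloudy", "in", "the", "evening"],
]
held_out = [["rain", "in", "the", "morning"]]

model = train_bigram(train)
print("held-out perplexity:", perplexity(held_out, *model))
```

Lower perplexity on held-out text means the model assigns higher probability to that text, which is the basis on which models are compared. Other smoothing methods would replace only `bigram_prob` in this sketch.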

Item Type:	Published conference work (Lecture)
Uncontrolled Keywords:	statistical language modelling, n-gram, smoothing methods, Croatian weather-domain corpus
Subjects:	Information sciences > Social-humanistic informatics; Information sciences > Natural language processing, lexicography and encyclopedic science; Linguistics
Departments:	Department of Information Science
Date Deposited:	19 May 2017 09:29
Last Modified:	19 May 2017 09:29
URI:	http://darhiv.ffzg.unizg.hr/id/eprint/8392
