Library of the Faculty of Humanities and Social Sciences, University of Zagreb
(Knjižnica Filozofskog fakulteta Sveučilišta u Zagrebu)
Faculty of Humanities and Social Sciences Institutional Repository

Tagset Reductions in Morphosyntactic Tagging of Croatian Texts



Agić, Željko and Tadić, Marko and Dovedan, Zdravko. (2009). Tagset Reductions in Morphosyntactic Tagging of Croatian Texts. In: 2nd International Conference “The Future of Information Sciences: INFuture2009 – Digital Resources and Knowledge Sharing”, 4-6 November 2009, Zagreb, Croatia.



Morphosyntactic tagging of Croatian texts is performed with stochastic taggers by using a language model built on a manually annotated corpus implementing the Multext East version 3 specifications for Croatian. Tagging accuracy in this framework is basically predefined, i.e. proportionally dependent on two things: the size of the training corpus and the number of different morphosyntactic tags encompassed by that corpus. Since the 100 kw Croatia Weekly newspaper corpus by definition makes a rather small language model in terms of stochastic tagging of free domain texts, the paper presents an approach dealing with tagset reductions. Several meaningful subsets of the Croatian Multext-East version 3 morphosyntactic tagset specifications are created and applied on Croatian texts with the CroTag stochastic tagger, measuring overall tagging accuracy and F1-measures. Obtained results are discussed in terms of applying different reductions in different natural language processing systems and specific tasks defined by specific user requirements.
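
As a rough illustration of the idea in the abstract, the sketch below shows how reducing a positional tagset can change measured accuracy and per-tag F1. It is not the authors' method or the CroTag interface: the reduction used here (keeping only the first few positions of each tag) is an assumption chosen for simplicity, the function names `reduce_tag`, `accuracy` and `f1_per_tag` are hypothetical, and the four example tags are made-up illustrative data in the Multext-East positional style, not results from the paper.

```python
from collections import Counter

def reduce_tag(msd: str, keep: int) -> str:
    """Assumed reduction: keep only the first `keep` positions of a positional tag."""
    return msd[:keep]

def accuracy(gold, pred):
    """Share of tokens whose predicted tag equals the gold tag."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

def f1_per_tag(gold, pred):
    """Per-tag F1, computed as 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted tag was wrong for this token
            fn[g] += 1  # gold tag was missed for this token
    scores = {}
    for tag in set(gold) | set(pred):
        denom = 2 * tp[tag] + fp[tag] + fn[tag]
        scores[tag] = 2 * tp[tag] / denom if denom else 0.0
    return scores

# Illustrative data only (not taken from the paper).
gold = ["Ncmsn", "Vmip3s", "Afpmsn", "Ncfsa"]
pred = ["Ncmsa", "Vmip3s", "Afpmsn", "Ncfsn"]

full_accuracy = accuracy(gold, pred)

# Evaluate again after reducing both sides to the first position (part of speech).
gold_pos = [reduce_tag(t, 1) for t in gold]
pred_pos = [reduce_tag(t, 1) for t in pred]
pos_accuracy = accuracy(gold_pos, pred_pos)

print(full_accuracy, pos_accuracy)
print(f1_per_tag(gold_pos, pred_pos))
```

In this toy case both errors are case confusions on nouns, so they disappear once the tags are reduced to part of speech. A coarser tagset scores higher but carries less information, which is the trade-off a downstream system would have to weigh.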

Item Type: Published conference work (Lecture)
Uncontrolled Keywords: morphosyntactic tagging, part-of-speech tagging, stochastic tagger, Multext East tagset, tagset reductions, Croatian language
Subjects: Information sciences > Social-humanistic informatics
Information sciences > Natural language processing, lexicography and encyclopedic science
Departments: Department of Linguistics
Department of Information Science
Date Deposited: 24 Feb 2017 09:46
Last Modified: 24 Feb 2017 09:46
