Knjižnica Filozofskog fakulteta
Sveučilišta u Zagrebu
Faculty of Humanities and Social Sciences Institutional Repository

Sentence classification and clause detection for Croatian

Vučković, Kristina and Agić, Željko and Tadić, Marko. (2010). Sentence classification and clause detection for Croatian. In: 7th International Conference on Formal Approaches to South Slavic and Balkan Languages, 4-6 October 2010, Dubrovnik, Croatia.

PDF (Croatian) - Published Version

Abstract

We present a method for classifying Croatian sentences by structure and for detecting the independent and dependent clauses within them, together with an evaluation of the method. A prototype system applying the method was implemented in the NooJ linguistic development environment, both for this experiment and for later use in a prototype rule-based chunking and shallow parsing system for Croatian. With regard to pre-processing, we implemented and evaluated three approaches to designing the system: (1) no pre-processing of input sentences, (2) automatic morphosyntactic tagging of sentences with the CroTag stochastic tagger, and (3) manual morphosyntactic annotation of input sentences. All three approaches were evaluated on sentence classification and clause detection in terms of precision and recall. The highest-scoring system was the one that took sentences with manually assigned morphosyntactic tags as input; it achieved an overall F1-measure of 0.861 (P: 0.928, R: 0.813). The paper describes the system design and experiment setup in more detail, then discusses the results and directions for future research.

Item Type: Published conference work (Paper)
Uncontrolled Keywords: sentence detection, sentence classification, clause detection, Croatian language
Subjects: Information sciences > Social-humanistic informatics; Linguistics
Departments: Department of Information Science; Department of Linguistics
Date Deposited: 12 Nov 2012 14:47
Last Modified: 09 Jun 2015 08:43
URI: http://darhiv.ffzg.unizg.hr/id/eprint/1943
