Knjižnica Filozofskog fakulteta
Sveučilišta u Zagrebu
Faculty of Humanities and Social Sciences Institutional Repository

Croatian Web Text Summarizer (CroWebSum)

Mikelić Preradović, Nives and Ljubešić, Nikola and Boras, Damir. (2010). Croatian Web Text Summarizer (CroWebSum). In: ITI 2010, 32nd International Conference on Information Technology Interfaces, June 21-24, 2010, Cavtat.

Abstract

The paper describes automatic summarization of newspaper texts in the Croatian language. The goal of CroWebSum is to generate high-quality extracts that are coherent and retain the relevant information from the original text. The preliminary evaluation shows that extracts of 10% of the original text length have good coherence, while extracts of 5% of the original text length still convey the most relevant information. CroWebSum also performed quite well when cutting news down to SMS size (a maximum of 160 characters). The research led us to conclude that, to improve text summarization for the Croatian language, we should develop a technique that uses context vectors to calculate the semantic similarity between terms in the document, as well as a pronoun resolution algorithm.

Item Type: Published conference work (Paper)
Uncontrolled Keywords: Newspaper text summarizer, SweSum, Croatian language, extract, inflected language
Subjects: Information sciences > Social-humanistic informatics
Departments: Department of Information Science
Date Deposited: 03 Feb 2016 11:54
Last Modified: 03 Feb 2016 11:54
URI: http://darhiv.ffzg.unizg.hr/id/eprint/5951
