Knjižnica Filozofskog fakulteta
Sveučilišta u Zagrebu
Faculty of Humanities and Social Sciences Institutional Repository

Sentence Alignment as the Basis for Translation Memory Database

Seljan, Sanja and Gašpar, Angelina and Pavuna, Damir. (2007). Sentence Alignment as the Basis for Translation Memory Database. In: 1st International Scientific Conference "The Future of Information Sciences: INFuture2007 – Digital Information and Heritage", 7-9 November 2007, Zagreb.

PDF (English), 176kB

Abstract

Sentence alignment is the basis for computer-assisted translation (CAT), terminology management, term extraction, word alignment and cross-linguistic information retrieval. The translation memory (TM) created by the sentence alignment process in turn serves as the basis for further research on translation equivalencies. Automatic sentence alignment of parallel texts faces two types of problems: robustness, and discrepancies between source and target texts in layout and omissions, both of which affect the accuracy of the alignment process. This paper presents research on sentence alignment carried out on Croatian-English parallel texts (laws, regulations, acts and decisions) using the alignment tool WinAlign 7.5.0 from SDL Trados 2006 Professional. The alignment process and its impact on the creation of translation memories are presented through a comparison of translation memories that differ in the level of expert intervention in setting up the alignment program and in preparing the source text for segmentation. Recommendations are given for further development using statistical analysis, automatic learning techniques and language knowledge.
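
To illustrate how a translation memory can be built from aligned sentences, and why omissions and segmentation matter, here is a minimal sketch in Python. It is not the method used by WinAlign or by the authors. The function names, the `max_ratio` threshold and the sample sentences are all hypothetical, and the segmentation rule is deliberately naive.

```python
import re


def split_sentences(text):
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace.

    Assumption for illustration only; real legal texts need rules for
    abbreviations, numbering and layout.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def align_one_to_one(source_text, target_text, max_ratio=2.0):
    """Pair segments by position and flag pairs whose lengths diverge.

    Returns (pairs, complete). `complete` is False when the two texts
    yield different numbers of segments, e.g. because of an omission.
    `max_ratio` is a hypothetical threshold, not a tuned value.
    """
    src = split_sentences(source_text)
    tgt = split_sentences(target_text)
    pairs = []
    for s, t in zip(src, tgt):
        # Character-length ratio of the longer to the shorter segment.
        ratio = max(len(s), len(t)) / max(1, min(len(s), len(t)))
        pairs.append({"source": s, "target": t, "suspect": ratio > max_ratio})
    return pairs, len(src) == len(tgt)


# Made-up sample text, not taken from the paper's corpus.
hr = "Ovaj Zakon stupa na snagu danom objave. Ministar donosi pravilnik."
en = ("This Act shall enter into force on the day of its publication. "
      "The Minister shall adopt an ordinance.")

memory, complete = align_one_to_one(hr, en)
for unit in memory:
    print(unit["source"], "|", unit["target"], "| suspect:", unit["suspect"])
print("segment counts match:", complete)
```

Positional pairing only works when both texts segment into the same number of sentences. A single omitted sentence shifts every later pair, which is one reason the preparation of the source text and the setup of the alignment program influence the quality of the resulting translation memory.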

Item Type: Published conference work (Lecture)
Uncontrolled Keywords: sentence, alignment, translation memory, computer-assisted translation (CAT), tool, segmentation, set up
Subjects: Information sciences > Social-humanistic informatics
Information sciences > Natural language processing, lexicography and encyclopedic science
Departments: Department of Information Science
Date Deposited: 22 Feb 2017 08:35
Last Modified: 22 Feb 2017 08:35
URI: http://darhiv.ffzg.unizg.hr/id/eprint/7924
