Knjižnica Filozofskog fakulteta
Sveučilišta u Zagrebu
Faculty of Humanities and Social Sciences Institutional Repository

Building family trees with NooJ


Kocijan, Kristina and Požega, Marko. (2015). Building family trees with NooJ. In: Formalising Natural Languages with NooJ 2014: Selected Papers from the NooJ 2014 International Conference. Cambridge Scholars Publishing, Newcastle upon Tyne, pp. 198-210. ISBN 1-4438-7558-9

This is the latest version of this item.

Image (PNG, graphical representation) (English) - Other (148kB)
PDF (English) - Submitted Version (655kB)
Presentation (pptx) (English) - Other (1MB)


The Croatian language uses a separate term for each member of a family tree, and these terms may also differ between geographical regions of Croatia. We use NooJ to build a family tree by applying syntactic grammars to Croatian obituaries. Obituaries in Croatia have a recognizable form with few variations. Their structure can be divided into four main sections (notice of the death, name of the deceased, date and place of the funeral, grieving family). For this project, sections two and four are of special interest, as they can be used to build a family tree of both living and deceased relatives of the deceased. The first task in the project is to recognize the deceased person, since other relationships may differ depending on the gender of the deceased. Then a list of grieving family members, with or without their names (sometimes with a notation such as 'the family of deceased brother/sister/…'), is annotated; again, some relationships differ depending on the gender of the name that follows the relationship type. Across Croatia, the same relationship is not always expressed with the same term. For example, a nephew is referred to as nećak, bratanac or sinovac, depending on the region. So far, we have been able to include all the variations in our grammars. The relationships are provided in three different formats:

1. Singular relationship and the name of a person: brat Ivan i brat Nikola (brother Ivan and brother Nikola), annotated as <BROTHER='Ivan'> <BROTHER='Nikola'>;
2. Plural relationship of the same gender and a list of names: braća Ivan, Petar i Nikola (brothers Ivan, Petar and Nikola), also annotated as singular relationships <BROTHER='Ivan'> <BROTHER='Petar'> <BROTHER='Nikola'>;
3. Plural relationship including both genders and a list of names: djeca Nikola, Ana, Frano, Marija (children Nikola, Ana, Frano, Marija), annotated as <SON='Nikola'> <DAUGHTER='Ana'> <SON='Frano'> <DAUGHTER='Marija'>, where each person is annotated as a <SON> or a <DAUGHTER> depending on the gender of the name, and not as a group <CHILDREN>.

However, each of these cases can carry additional information, such as: obitelj pokojnog brata Nikole (family of the deceased brother Nikola); brat Nikola s obitelji (brother Nikola with his family); kćer Marija sa suprugom i kćerkom (daughter Marija with her husband and her daughter). After the annotation process, an XML file is produced containing tags <FAMILY DECEASED="Name" WIFE="Name" SON="Name" DAUGHTER="Name" …>. From this file, a picture with the relationships drawn is generated using Python.
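
The expansion of plural relationships into per-person annotations can be illustrated with a minimal Python sketch. This is not the authors' implementation: in the paper the annotation is done by NooJ syntactic grammars and dictionaries. The names `NAME_GENDER`, `RELATION_TAGS`, `annotate` and `family_edges` are hypothetical, the gender lexicon is a hand-written stand-in covering only the example names from the abstract, and the function assumes a single relationship term followed by a list of names.

```python
import re

# Hypothetical mini-lexicon (assumption): gender of each example name.
# The paper relies on NooJ dictionaries for this information.
NAME_GENDER = {
    "Ivan": "m", "Petar": "m", "Nikola": "m", "Frano": "m",
    "Ana": "f", "Marija": "f",
}

# Hypothetical mapping (assumption): relationship term -> tag per name gender.
RELATION_TAGS = {
    "brat": {"m": "BROTHER"},                  # singular: brother
    "braća": {"m": "BROTHER"},                 # plural, same gender: brothers
    "djeca": {"m": "SON", "f": "DAUGHTER"},    # plural, mixed gender: children
}


def annotate(phrase):
    """Turn 'term Name, Name i Name' into a list of (TAG, name) pairs."""
    term, _, rest = phrase.partition(" ")
    tags = RELATION_TAGS[term]
    # Names are separated by commas or by the conjunction 'i' (and).
    names = [n for n in re.split(r",\s*|\s+i\s+", rest) if n]
    return [(tags[NAME_GENDER[name]], name) for name in names]


def family_edges(deceased, phrases):
    """Collect (deceased, TAG, name) triples that a drawing step could use."""
    return [(deceased, tag, name)
            for phrase in phrases
            for tag, name in annotate(phrase)]
```

Under these assumptions, `annotate("djeca Nikola, Ana, Frano, Marija")` yields one `SON` or `DAUGHTER` pair per name rather than a single group entry, mirroring the third format above. The triples returned by `family_edges` are one possible intermediate form for a drawing step; the abstract does not specify how the picture is generated.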

Item Type: Book Section
Uncontrolled Keywords: family tree, family relationships, kinship, terminology, local grammars, syntactic grammars, NooJ, Python
Subjects: Information sciences > Social-humanistic informatics
Departments: Department of Information Science
Date Deposited: 08 Jun 2015 13:50
Last Modified: 26 Jul 2018 11:48
