Knjižnica Filozofskog fakulteta
Sveučilišta u Zagrebu
Faculty of Humanities and Social Sciences Institutional Repository

A corpus-based analysis of English phrasal verbs in legal domain

Bilić, Marija. (2018). A corpus-based analysis of English phrasal verbs in legal domain. PhD Thesis. Filozofski fakultet u Zagrebu, Department of Linguistics.
(Postgraduate doctoral programme in linguistics) [supervisor: Zovko Dinković, Irena].

Full text: PDF (Croatian, 4MB)

Abstract

The thesis presents a corpus-based analysis of English phrasal verbs in the legal domain. A phrasal verb is a structure consisting of a verb and one or two morphologically invariable particles that acts as a single lexical and semantic unit. English phrasal verbs are chosen for the analysis because they are one of the most characteristic and productive features of the English language, yet they are complex and difficult to acquire because of their structural, syntactic and semantic features. Legal language is chosen for two reasons. First, it is a genre characterised by unambiguousness, precision, repetition and concision, i.e. a genre in complete opposition to phrasal verbs, which are very often polysemous, non-transparent and redundant (since they are multi-word units). Second, the legislation of the EU and the Republic of Croatia is publicly available. The following hypotheses are tested:
1) Phrasal verbs can be automatically extracted via the particles (adverbs, prepositions) they consist of, using keyword extraction programs that produce a list of the most frequent words, in which function words (adverbs, prepositions, articles, pronouns, etc.) are ranked at the top.
2) Phrasal verbs are frequent in English legal language, both as the source and the target language, regardless of their redundancy, polysemy and the principle of language economy, since they are a typical feature of both general and legal language, as confirmed by the existence of phrasal verb dictionaries and numerous scientific papers.
3) Print and digital reference resources are not comprehensive in content, i.e. they do not offer more information than dictionaries of general English. Furthermore, the analysis of the reference resources reveals their advantages and disadvantages.
4) The contrastive analysis shows greater application of transposition and modulation than of other translation techniques, owing to the different nature of the two languages (English as an analytic and Croatian as a synthetic language) and the specific features of legal language (precision, clarity, unambiguousness, uniformity, etc.), in which no addition to or omission from the message is allowed.
5) Texts originally written in English and translations into English differ in the presence, frequency and type of phrasal verbs, as well as in the accuracy of their usage.
The thesis is divided into eight chapters. The introductory chapter presents the motivation, aims and hypotheses of the research as well as its methodology and detailed plan. The second chapter discusses the problem of the definition and classification of phrasal verbs and presents their structural, syntactic and semantic features. The final part of the chapter discusses the use of phrasal verbs in different genres. The third chapter defines language for specific purposes and legal language, with a special focus on the legal language of the European Union and the Republic of Croatia. It also presents a critical review of the resources (manuals, guidelines, etc.) available to translators of EU and Croatian legislation in terms of the attention they give to phrasal verbs. The fourth chapter presents the development of corpus linguistics in general and in Croatia, the different types of corpora, the definition and advantages of corpus-based contrastive and translatological study, and an overview of corpus-based research on phrasal verbs. The fifth chapter presents the problem of defining translation equivalence, historically and today, the importance of consistency in translation and an overview of different models of translation techniques.
This research is based on the models of Vinay and Darbelnet (1958/1995) and Delisle (1993), with a special focus on the techniques of literal translation, transposition, modulation, amplification, economy, addition and omission. The sixth chapter presents the research, which is based on the manual and (semi-)automatic extraction of data for the purposes of quantitative and qualitative linguistic and translatological study of phrasal verbs in the legal domain. The seventh chapter presents the overall results, the possibilities for further research and the abstract in English. The eighth chapter presents the bibliography, appendices and curriculum vitae. The research consists of nine phases. The first phase includes the building of a bi-directional English-Croatian parallel corpus and comparable English and Croatian corpora of publicly available legal texts consisting of 743,936 words in total. Although the corpora of texts originally written in English (EU_en, 16 documents) and Croatian (RH_hr, 8 documents) contain almost the same number of words, the RH parallel corpus (RH_en-RH_hr) contains 20% more words than the EU parallel corpus (EU_en-EU_hr), since the corpus of translations into English (RH_en) contains 20% more words than the corpus of texts originally written in Croatian (RH_hr). The aim of the research is to find out whether this difference in the number of words is related not only to the fact that English is an analytic language and Croatian a synthetic language, but also to the different use of phrasal verbs. The second phase tests the possibility of automatically extracting phrasal verbs via the particles they consist of, on a pilot corpus consisting of 10 EU_en documents and using WordSmith Tools 6.0.
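The particle-based extraction described in the second phase can be illustrated with a minimal Python sketch. It is not the WordSmith Tools procedure used in the thesis: the particle list, the sample sentence and the function name `candidate_phrasal_verbs` are assumptions made purely for illustration. The sketch collects every word that is directly followed by a particle, which deliberately over-generates and therefore needs manual checking.

```python
import re

# Hypothetical particle list for illustration only; the thesis's own list is not given here.
PARTICLES = {"to", "out", "up", "down", "off", "over", "forth"}


def candidate_phrasal_verbs(text, particles=PARTICLES):
    """Return (word, particle) pairs where a particle directly follows a word.

    Every word + particle bigram counts as a candidate, so many hits
    will not be phrasal verbs and must be filtered by hand.
    """
    tokens = re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())
    return [(prev, tok) for prev, tok in zip(tokens, tokens[1:]) if tok in particles]


# Invented sample sentence in the style of EU legislation.
sample = ("Member States shall refer to the conditions set out in Annex I "
          "and report to the Commission.")
print(candidate_phrasal_verbs(sample))
```

In this invented sentence the sketch returns "refer to" and "set out" but also "report to", which shows why candidates found through particles alone still have to be checked against a dictionary or by a human reader.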
The list of phrasal verbs is checked against the reference dictionary Cambridge Phrasal Verbs Dictionary (2015), and the efficiency of the system is evaluated via the statistical measures of Precision, Recall and F-measure. The results of the pilot research show that the automatic extraction of phrasal verbs via the particles they consist of is possible but, since Precision is only 7.5%, considerable human intervention is needed to refine the results initially offered by the system. Thus, the results prove that the first hypothesis is partially correct. The measure of Recall shows that the automatic extraction is, regardless of the low level of Precision, more efficient than a purely manual method of extraction. The semi-automatic extraction is undoubtedly a much faster, simpler and more organised method of research, which offers many different possibilities of analysis. The third phase includes the semi-automatic extraction of phrasal verbs via particles from the English comparable corpus using WordSmith Tools 6.0, and a discussion of the similarities and differences between the EU_en and RH_en corpora in terms of the presence and frequency of particles. The results confirm the results of the pilot research, according to which the particles forming phrasal verbs account for only 2% of the total number of words. Thus, the results prove that the second hypothesis is incorrect. Among the particles that form phrasal verbs, the most productive are the particle to, which is a high-frequency function word, and the particles out, up and down, which have a low frequency in the total number of words but most frequently form a phrasal verb. The fourth phase includes the comparison of the presence and frequency of phrasal verbs in the two English subcorpora. The results show that RH_en contains more phrasal verbs (EU_en: 1792; RH_en: 2286), and EU_en contains more different phrasal verbs (EU_en: 67; RH_en: 52).
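The Precision, Recall and F-measure evaluation mentioned for the pilot extraction can be sketched as follows. This is a minimal illustration that assumes the balanced F-measure (F1). The function name `precision_recall_f1` and all the data are hypothetical: the toy numbers are chosen only so that Precision comes out at 7.5%, and they are not the counts from the thesis.

```python
def precision_recall_f1(extracted, gold):
    """Compare extracted candidates with a reference (gold) list of phrasal verbs."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data (invented): 4 reference items, 40 candidates of which 3 are correct.
gold = {"refer to", "set out", "lay down", "carry out"}
extracted = {"refer to", "set out", "lay down"} | {f"noise{i} to" for i in range(37)}

p, r, f = precision_recall_f1(extracted, gold)
print(f"Precision={p:.3f} Recall={r:.3f} F1={f:.3f}")
```

With these toy numbers Recall is high while Precision is low, the same pattern the abstract reports: the extraction finds most of the reference items but buries them among many false candidates.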
Although the English corpora share 36 phrasal verbs, there are considerable differences in their frequency, especially in the case of the 5 most frequent phrasal verbs (EU_en: refer to, relate to, lay down, set out and carry out; RH_en: refer to, carry out, follow up, relate to and pertain to). Thus, the results prove that the fifth hypothesis is correct. In both English subcorpora, the 5 most frequent phrasal verbs account for nearly half, and the 25 most frequent phrasal verbs for more than 90%, of all such items. The fifth phase includes the comparison of the list of phrasal verbs in the two English subcorpora with the existing lists of phrasal verbs in general language (BNC: Gardner and Davies, 2007) and legal language (CEUE: Trebits, 2009). The results show that phrasal verbs have a lower frequency in the English comparable corpus than in the BNC (which is expected, since phrasal verbs are generally more frequent in general language than in language for specific purposes), almost the same frequency in EU_en as in CEUE (which is expected, since both corpora consist of EU documents) and a lower frequency in RH_en than in CEUE. When the lists of the 25 most frequent phrasal verbs are compared, the results show that CEUE is more similar to the BNC than the English comparable corpora are, and that EU_en is more similar to CEUE than RH_en is. CEUE and EU_en differ considerably in terms of the phrasal verbs set out, carry out and lay down, while CEUE and RH_en differ in terms of the phrasal verbs set out, base on, carry out and follow up. The differences between CEUE and EU_en can be explained by the fact that the corpora consist of different types of EU documents. The sixth phase includes the analysis of the structural, syntactic and semantic features of phrasal verbs in the English comparable corpora.
The WordList and Concord programmes of WordSmith Tools 6.0 are used to verify the initial list of phrasal verbs via the verbal segment of phrasal verbs and to analyse the context in which phrasal verbs are used. Semantic features of phrasal verbs are analysed via translation equivalents, which are obtained using the Find option of the Word tool to extract, from the bi-directional English-Croatian parallel corpora, sentences that contain phrasal verbs and their translation equivalents. Croatian translation equivalents are used only for the analysis of the correct usage of English phrasal verbs and the analysis of the translation techniques applied in the translation process. The correct usage of phrasal verbs is checked against different reference resources, i.e. print resources such as the Cambridge Phrasal Verbs Dictionary (CPVD) (2015) and the Longman Dictionary of English Language and Culture (LD) (1998), and digital resources such as The Free Dictionary by Farlex (FD) and Collocations Dictionary (CD). Translation equivalents of the 25 most frequent phrasal verbs are checked against different print reference resources, i.e. the dictionary of general language Veliki englesko-hrvatski rječnik (Bujas, 1999) and the dictionaries of legal language Englesko-hrvatski i Europske unije pravni rječnik (Marunica, 2003), Englesko-hrvatski rječnik prava međunarodnih odnosa, kriminalistike i forenzičnih znanosti, kriminologije i sigurnosti (Gačić, 2004) and Englesko-hrvatski rječnik prava i međunarodnih i poslovnih odnosa (Gačić, 2010), as well as digital resources such as Četverojezični rječnik prava Europske unije (2003), the EU's multilingual term base IATE and the EU's multilingual thesaurus EuroVoc. Furthermore, the application of guidelines from the publications available to legal translators in the European Union and the Republic of Croatia is also assessed.
The results show that prepositional phrasal verbs (EU_en: 657; RH_en: 906) account for 37% (EU_en) and 40% (RH_en) of all phrasal verbs. The most problematic proved to be dispose of, allow for, depart from, derive from and stem from. The problematic meaning of dispose of can be seen in the print and digital reference resources as well. Phrasal verbs consisting of two particles (EU_en: 701; RH_en: 797) account for 39% (EU_en) and 35% (RH_en) of all phrasal verbs. The most problematic proved to be follow up with (and the noun follow-up to), reach out to and sign up to/for. The problematic meaning and use of follow up with can be seen in the print and digital reference resources as well. Adverbial phrasal verbs (EU_en: 434; RH_en: 583) account for 24% (EU_en) and 25% (RH_en) of all phrasal verbs. The most problematic proved to be take over (RH_en). The phrasal verbs carry over, follow up, join up, phase out, start up, tide over and write off (EU_en) and fill out, pay off and start up (RH_en) are used only in the form of a noun or an adjective, which emphasises the tendency towards nominalisation of phrasal verbs. However, nouns derived from phrasal verbs proved to be problematic in terms of the use of a hyphen (-). The nouns rollout and handover, which according to the handbook English Style Guide (Point 3.23) can be written as a single word or hyphenated, also proved to be problematic, since WordSmith Tools 6.0 recognises them as one-word nouns and not as multi-word nouns. Therefore, they were not included in the initial list of phrasal verbs but were added after the verification of the list via the verbal segment of phrasal verbs using the programme WS WordList. The verification via the verbal segment also revealed examples of structures such as noun-related/based + noun and nominalisations such as conformity assessment procedures/tasks, etc.
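The shares by structural type reported above can be recomputed from the raw counts with a short Python sketch. The counts are those stated in the abstract. The dictionary layout and the function name `shares` are illustrative assumptions.

```python
# Counts by structural type as stated in the abstract.
counts = {
    "EU_en": {"prepositional": 657, "two-particle": 701, "adverbial": 434},
    "RH_en": {"prepositional": 906, "two-particle": 797, "adverbial": 583},
}


def shares(by_type):
    """Return the total and each type's share in percent, to one decimal place."""
    total = sum(by_type.values())
    return total, {name: round(100 * n / total, 1) for name, n in by_type.items()}


for corpus, by_type in counts.items():
    total, pct = shares(by_type)
    print(corpus, total, pct)
```

The per-type counts sum to 1792 for EU_en and 2286 for RH_en, and the recomputed shares agree with the rounded percentages in the abstract to within rounding (the adverbial share in RH_en comes out at 25.5%).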
Such structures are more frequent in EU_en, which explains the lower frequency of the phrasal verbs conform to, relate to and base on in EU_en and underlines the tendency towards nominalisation in legal language. Furthermore, the verification via the verbal segment revealed examples where, through inversion, more than 20 words are inserted between the verb and the particle of phrasal verbs such as refer to, relate to, pertain to, base on and combine with. Such examples were excluded from the initial list of phrasal verbs since the verb is too distant from the particle (see Figure 8). The results show that the English comparable corpus contains two large groups of synonymous phrasal verbs. One group consists of the prepositional phrasal verbs related to/relating to, pertaining to, referring to and associated with. The other group consists of the phrasal verbs provide for, set out, lay down, set forth, lay out and refer to (i.e. referred to in), which are most often used in formulaic structures with the particle in, after nouns such as activities, tasks, procedures, etc. and before nouns such as (indent of) point, (sub)paragraph, Section, Article, Chapter, Regulation, Decision, Annex, etc., i.e. nouns which are typical of legal language. The results show that the general opinion that phrasal verbs are mostly used in informal style is not correct. The phrasal verbs set forth, lay out and refer to have, as a result of their tendency towards polysemy and their flexibility in changing their syntactic features, developed a new meaning in the legal domain. The results thus prove that the second hypothesis is correct in the sense that phrasal verbs are used in legal language, both as the source and the target language, regardless of their redundancy, polysemy and the principle of language economy, since they are a characteristic feature of both general and legal English.
However, synonymy and polysemy are not desirable features in legal language, which calls for precision, consistency, monosemy and transparency. The results show that the translators of the EU legislation mostly apply the technique of literal translation (49%), then economy (19%) and transposition (16%), whereas amplification is applied in only 0.5% of cases. The technique of modulation is applied in 16% of the cases (3% of which are due to mistakes in EU_en). The translators of the Croatian legislation apply the techniques of literal translation (29%), modulation (28%) and amplification (27%) to an almost equal extent. The technique of transposition is applied in 14%, and economy in only 1%, of the cases. It must also be emphasised that half of the cases of modulation are translation mistakes, and that there are more serious instances of modulation of meaning and relationships than in the EU corpus. These results support the hypothesis that texts originally written in English and translations into English differ in terms of the correct use of phrasal verbs. Greater use of the technique of economy in the EU corpus and of amplification in the RH corpus explains the fact that RH_en is 30% longer than EU_en. The parallel corpora do not contain examples in which the techniques of addition and omission are applied, which proves that the fourth hypothesis is partially correct. However, the part of the fourth hypothesis stating that the techniques of transposition and modulation are applied more than other techniques is not correct. As far as the print and digital reference resources are concerned, the results show that the dictionary of general English, Veliki englesko-hrvatski rječnik (Bujas, 1999), although it offers rather general translation equivalents and examples, is more useful than the dictionary of legal language, Englesko-hrvatski i Europske unije pravni rječnik (Marunica, 2003), which lists only a few phrasal verbs (i.e.
phrasal verbs are mostly listed without the particle) and offers general, incomplete and, on occasion, even contradictory translation equivalents and examples. The dictionaries of legal language Englesko-hrvatski rječnik prava međunarodnih odnosa, kriminalistike i forenzičnih znanosti, kriminologije i sigurnosti (Gačić, 2004) and Englesko-hrvatski rječnik prava i međunarodnih i poslovnih odnosa (Gačić, 2010) proved to be excellent resources which contain an impressive number of translation equivalents and examples and can thus be very helpful to translators. These results prove that, in the case of Marunica (2003), the third hypothesis is correct, whereas in the case of Gačić (2004, 2010) it is incorrect. The digital dictionary Četverojezični rječnik prava Europske unije (2003) lists only the basic or the most frequent meaning of phrasal verbs, as announced in its Preface. The EU's term base IATE lists only a few translation equivalents, most often inviting translators to apply the technique of economy, which explains the fact that the translators of the EU legislation, besides the technique of literal translation, mostly apply the technique of economy. It must also be emphasised that IATE contains entries where phrasal verbs are listed but no translation equivalents are offered. The EU's thesaurus EuroVoc proved not to be useful for this research since it does not list any phrasal verbs. The results therefore show that the third hypothesis is correct in the case of digital reference resources, which, in accordance with the tendency towards language economy and uniformity, list only the basic and most frequent, as well as the most economical, translation equivalents. The seventh phase includes the semi-automatic extraction of the 100 most frequent nouns in the English comparable corpus using the WS WordList and Concord programmes, with the aim of analysing the extent to which phrasal verbs are used with the most frequent nouns and the context they appear in.
The results show that phrasal verbs are used in only 78 (EU_en) and 151 (RH_en) cases with the 100 most frequent nouns, and that these are mostly the phrasal verbs refer to, set out, draw up and lay down, i.e. those phrasal verbs which proved to be typical of legal language. The eighth phase includes the semi-automatic extraction of the 50 most frequent verbs with the purpose of analysing the share of phrasal verbs in the total number of verbs in the English comparable corpora. The results show that the 50 most frequent verbs account for about half of the total number of verbs, and that RH_en contains more verbs than EU_en, which supports the finding that nominalisation is applied more frequently in texts originally written in English than in translations. Phrasal verbs account for about 20% of the total tokens of the 50 most frequent verbs, with the second group of synonymous phrasal verbs, which are typical of legal language, accounting for 12% (EU_en) and 8% (RH_en). The list of the 50 most frequent verbs also includes potential one-word synonyms of these phrasal verbs, as well as of the phrasal verb carry out, which points to inconsistency and synonymy in legal language, a language that insists on monosemy, precision and consistency, especially in the case of the EU, where there are 24 language versions. The ninth phase therefore includes the semi-automatic extraction of one-word synonyms of the most frequent adverbial phrasal verbs (carry out, draw up and set out/lay down) via their most frequent translation equivalents, using the Find option of the Word tool, with the aim of verifying the inconsistency in legal language. The results show the co-existence of phrasal verbs and their one-word synonyms at the level of individual documents and of the English comparable corpus as a whole.
The solution to the problems of synonymy, polysemy, inconsistency and inaccuracy identified in this research can be sought in greater application of the instructions from the EU's handbook How to write clearly (2012):
- the instruction "Cut out excess nouns - verb forms are livelier", in the case of the phrasal verb carry out, which can be updated to: whenever possible, instead of structures such as carry out (or its one-word synonyms) + noun (mostly in -tion, but also others, e.g. review), use one-word verbs (e.g. carry out/perform evaluation = to evaluate);
- the instruction "KISS: Keep It Short and Simple", in the case of the first group of synonymous phrasal verbs, which can be updated to: whenever possible, instead of the verbs concerning, associated with, pertaining to, regarding, referring to, related to, relating to, use the particles about, of, on (the Croatian equivalents are the particles o and od).
In accordance with the abovementioned instructions, a new instruction can be suggested for the second group of synonymous phrasal verbs: whenever possible, instead of the structures laid down in, laid out in, referred to in, provided (for) in, set forth in, set out in and their one-word synonyms, use the particle under. The substitution of the second group of synonymous phrasal verbs and their one-word synonyms with the particle under would reduce language redundancy, inconsistency, synonymy, polysemy and the use of passive structures. The suggested substitution is in line with the plain legal English movement, i.e. the tendency to render legal English more similar to general language, which can be observed in different handbooks and guidelines available to translators.
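The suggested substitution with the particle under can be illustrated with a naive Python sketch. It is only an illustration of the idea, not a tool from the thesis: the regular expression, the function name `simplify` and the sample phrases are all assumptions. A real rewrite would need human review, since not every occurrence of these structures can be replaced safely.

```python
import re

# Formulaic structures named in the abstract, each followed by the particle "in".
FORMULAIC = re.compile(
    r"\b(?:laid down|laid out|referred to|provided(?: for)?|set forth|set out) in\b",
    flags=re.IGNORECASE,
)


def simplify(sentence):
    """Replace each formulaic structure with the single particle 'under'."""
    return FORMULAIC.sub("under", sentence)


# Invented sample phrases in the style of EU legislation.
for phrase in (
    "the procedures laid down in Article 5",
    "the tasks referred to in Annex II",
    "the conditions provided for in paragraph 2",
):
    print(simplify(phrase))
```

The sketch maps several synonymous source structures onto one target form, which is the many-to-one reduction the proposed instruction aims at.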
Furthermore, the suggested substitution would facilitate the translation process, since an n:n relationship would be reduced to a 1:1 relationship, with the particle iz being used as the Croatian equivalent of the English particle under, which would greatly improve the quality of machine translation and computer-assisted translation. Although the suggested substitution entails the elimination of phrasal verbs typical of legal language, it is in line with the overall skopos of the translation process, namely achieving translation equivalence across 24 languages. The expected scientific contribution of this research comprises: a theoretical discussion of the problematic presence of phrasal verbs in legal language; a methodology for extracting phrasal verbs from bilingual and comparable corpora; a contrastive linguistic and translatological study of phrasal verbs; the use and critical study of print and digital reference resources and tools; a classification and evaluation of the translation techniques applied in the translation process; an analysis of consistency in the use of phrasal verbs; and the updating of existing instructions, and the suggestion of new ones, for dealing with phrasal verbs in legal language. The aim of this research is to improve the quality and consistency of legal translations and of the available language resources, to contribute to the development of language and translation technologies and machine translation, and to facilitate the acquisition and accurate use of phrasal verbs among legal translators.

Item Type: PhD Thesis
Uncontrolled Keywords: phrasal verbs, particles, corpus-based analysis, legal language, language resources, translation techniques
Subjects: English language and literature
Linguistics
Departments: Department of Linguistics
Supervisor: Zovko Dinković, Irena
Additional Information: Postgraduate doctoral programme in linguistics (Poslijediplomski doktorski studij lingvistike)
Date Deposited: 21 May 2018 10:47
Last Modified: 21 May 2018 10:47
URI: http://darhiv.ffzg.unizg.hr/id/eprint/10037
