Legislative register analysis of Croatian and Italian: intralingual, interlingual and translational perspectives

Lalli Paćelat, Ivana; Tadić, Marko
Legislative register analysis of Croatian and Italian: intralingual, interlingual and translational perspectives // Using Corpora in Contrastive and Translation Studies
Lancaster, United Kingdom: UCREL, Lancaster University, 2014, pp. 37-39 (lecture, international peer review, abstract, scientific)

Lancaster, United Kingdom, 24-26 July 2014

International peer review

Corpus-based Translation Studies; Corpus-based Contrastive linguistics; Parallel corpora for Croatian; Register Analysis; Translation Universals

1 Introduction
Translation and contrastive linguistic studies have benefited significantly from corpora, and from multilingual corpora in particular (McEnery and Xiao 2008: 18). It is probably not widely known that the use of a computerised parallel corpus in contrastive research was pioneered in 1968 by Rudolf Filipović in Croatia, for the first time in the history of linguistics (Tadić et al. 2012: 76). Although the first English-Croatian parallel corpus was compiled only a year after the publication of the Brown corpus (Kučera and Francis 1967), large parallel corpora for Croatian are still missing (Tadić et al. 2012: 77). Building a large Italian-Croatian parallel corpus of EU legislation has been made possible by the availability of the Croatian translations of the Acquis Communautaire and the possibility of aligning them with the JRC-Acquis (Steinberger et al. 2006).
2 Corpus-based translation and contrastive studies
Corpus-based translation studies has shown that a translated text differs from a non-translated text and that, independently of the language, translations share some properties (e.g. Baker 1996; Bernardini 2011; Laviosa 2002; Xiao 2010). Whether absolute universals exist, or only general tendencies in translated texts, is still widely debated (cf. Bernardini and Zanettin 2004; Chesterman 2004; Mauranen 2008; Teich 2003; Xiao 2010; Xiao and Dai 2014). Research has also been conducted on the differences between registers in translated and non-translated texts and across languages, proposing different methodological approaches (e.g. Biber 1995; Neumann 2010; Teich 2003). Biber (1995: 363) holds out ‘the possibility of patterns of register variation across languages’. The legal register is, on the one hand, defined as one of the most ‘national registers’ (Cortelazzo 1997: 37), which is ‘culture dependent’ (Engberg 2006: 68); on the other hand, it tends to display a universal character, known as ‘legalese’ (cf. Novak 2010: 3; Tiersma 2006: 552).
3 Aim of the research
The aim of the research is to describe the lexico-grammatical features of the legislative registers of Croatian and Italian and to compare them, in order to find similarities and to see whether legislative registers do indeed have some universal features. Furthermore, the research aims to find out whether the translations have the same lexico-grammatical features as the target-language legislative register or whether they belong to a special register. The hypothesis predicts that, given the nature of the legislative register, the lexico-grammatical features are similar in both languages, regardless of how frequently those features occur in the reference corpora. Given the existence of universal translation features, it is assumed that the translated texts are more similar to one another than parallel texts of related languages.
4 Methodology and corpus design
The basic requirements for register analysis according to Biber (1995) are the comparative approach, quantitative analysis and a representative sample. To meet these requirements, six corpora belonging to four different corpus types are employed in the study: firstly, reference corpora for both languages, (1) the Croatian National Corpus (HNK v3.0) and (2) the Corpus di Italiano Scritto (CORIS); secondly, (3) a specialized bilingual comparable corpus composed of national legislative documents in both languages (subcorpora of HNK v3.0 and CORIS); thirdly, (4, 5) monolingual corpora of original national legislative documents and translations of legislative documents of the European Union in the same language, used as comparable corpora; and lastly, (6) a parallel corpus consisting of Croatian and Italian translations of legal documents of the European Union. For the description of corpus parameters, see Tadić (2002, 2009) for HNK and Rossini Favretti et al. (2002) for CORIS.
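The abstract does not name the statistical test applied in the quantitative comparison. As a minimal sketch of how the frequency of one feature might be compared across two corpora of different sizes, the following assumes normalisation per million tokens and the log-likelihood statistic often used in corpus linguistics. The function names and the counts in the usage lines are hypothetical and are not taken from the study.

```python
import math


def per_million(count, corpus_size):
    """Normalised frequency: occurrences per million tokens."""
    return count / corpus_size * 1_000_000


def log_likelihood(count_a, size_a, count_b, size_b):
    """Two-corpus log-likelihood for a single feature.

    Expected counts assume the feature is equally frequent in both
    corpora; larger values indicate a larger frequency difference.
    """
    total = count_a + count_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    ll = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


# Hypothetical counts of one feature (e.g. a PoS tag) in two corpora.
legislative = (150, 1_000_000)  # (feature count, corpus size in tokens)
reference = (100, 1_000_000)

print(per_million(*legislative), per_million(*reference))
print(log_likelihood(*legislative, *reference))
```

With one degree of freedom, a value above 3.84 is conventionally read as significant at p < 0.05; whether the study used this threshold or this test is not stated in the abstract.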
The approach adopted in this study is a hybrid one, without a theoretical framework established a priori, although the corpora are annotated at the part-of-speech (PoS) and lemma levels. The analysis is performed using WordSmith Tools v6.0 (Scott 2013), NoSketch Engine (Rychlý 2007) for HNK v3.0 (Tadić 2009) and, for CORIS, the online interface designed by F. Tamburini. The selection of linguistic features for the quantitative analysis follows previous studies (e.g. Biber and Conrad 2009; Cortelazzo 2013; Rovere 2005; Teich 2003; Venturi 2011; Xiao and Dai 2014) and is driven by data obtained from the primary corpora. In order to investigate the properties of translated texts, considered as a special register type, and to find out whether universal features of legislative texts exist across different languages, linguistic features at both the lexical and the grammatical level are quantitatively analysed and statistically evaluated across all the corpora and the two languages in question.
5 Conclusion
The results showed that the legislative registers of Italian and Croatian share some universal features known as ‘legalese’. While greater similarities were found, for example, in the distribution of parts of speech, less correspondence was observed in the grammatical means of expressing impersonality and nominal style. Hence, the results of this study confirm that the two languages share the same features of the legislative register, which need not necessarily be expressed by the same grammatical means. However, even at this level, correspondence was found in the majority of cases. Translational corpora in both languages show the existence of universal translation features, but not always the same features and not with the same frequency (the Italian translational corpus shows a tendency towards normalization and the Croatian translational corpus towards levelling out).
However, these features do not make the translations considerably different from comparable original texts in the same language. The results show the largest number of similarities between the specialized and translational corpora in the same language, which confirms the authenticity of the translations and their orientation towards the target language and, in particular, towards the features of the target register.
References
Baker, M. 1996. “Corpus-based Translation Studies: The challenges that lie ahead”. In Somers, H. (ed.), Terminology, LSP and Translation: Studies in Language Engineering in Honour of Juan C. Sager, (175-187). Amsterdam: John Benjamins.
Bernardini, S. 2011. “Monolingual comparable corpora and parallel corpora in the search for features of translated language”. SYNAPS, 26, 2-13.
Bernardini, S. and Zanettin, F. 2004. “When is a universal not a universal? Some limits of current corpus-based methodologies for the investigation of translation universals”. In Mauranen, A. and Kujamäki, P. (eds.), Translation Universals: Do They Exist?, (51-62). Amsterdam: John Benjamins.
Biber, D. 1995. Dimensions of register variation: a cross-linguistic comparison. Cambridge: Cambridge University Press.
Biber, D. 2009. “A corpus-driven approach to formulaic language in English: multi-word patterns in speech and writing”. International Journal of Corpus Linguistics, 14 (3), 275-311.
Chesterman, A. 2004. “Beyond the particular”. In Mauranen, A. and Kujamäki, P. (eds.), Translation Universals: Do They Exist?, (33-49). Amsterdam: John Benjamins.
Cortelazzo, M. A. 1997. “Lingua e diritto in Italia. Il punto di vista dei linguisti”. In Schena, L. (ed.), La lingua del diritto: difficoltà traduttive e applicazioni didattiche, (35-50). Milano: Università Bocconi, Centro linguistico.
Cortelazzo, M. A. 2013. “Leggi italiane e direttive europee a confronto”. In Realizzazioni testuali ibride in contesto europeo. Lingue dell’UE e lingue nazionali a confronto, (57-66). Trieste: EUT - Edizioni Università di Trieste.
Engberg, J. 2006. “Languages for Specific Purposes”. In Brown, K. (ed.), Encyclopedia of Language and Linguistics, 2nd ed., (679-683). Oxford: Pergamon Press.
Kučera, H. and Francis, W. N. 1967. Computational Analysis of Present Day American English. Providence, RI: Brown University Press.
Laviosa, S. 2002. Corpus-based Translation Studies: Theory, Findings, Applications. Amsterdam/Atlanta: Rodopi.
Mauranen, A. 2008. “Universal tendencies in translation”. In Anderman, G. and Rogers, M. (eds.), Incorporating Corpora: The Linguist and the Translator, (32-48). Clevedon: Multilingual Matters.
McEnery, T. and Xiao, R. 2008. “Parallel and comparable corpora: what is happening?”. In Anderman, G. and Rogers, M. (eds.), Incorporating Corpora: The Linguist and the Translator, (18-31). Clevedon: Multilingual Matters.
Neumann, S. 2010. “Quantitative Register Analysis Across Languages”. In Swain, E. (ed.), Thresholds and Potentialities of Systemic Functional Linguistics: Multilingual, Multimodal and Other Specialised Discourses, (85-113). Trieste: EUT - Edizioni Università di Trieste.
Novak, B. 2010. Funkcionalna stilistika hrvatskoga zakonodavstva. Unpublished PhD thesis, Zagreb: Faculty of Humanities and Social Sciences, University of Zagreb.
Rossini Favretti, R., Tamburini, F. and De Santis, C. 2002. “CORIS/CODIS: A corpus of written Italian based on a defined and a dynamic model”. In Wilson, A., Rayson, P. and McEnery, T. (eds.), A Rainbow of Corpora: Corpus Linguistics and the Languages of the World, (27-38). Munich: Lincom-Europa.
Rovere, G. 2005. Capitoli di linguistica giuridica: ricerche su corpora elettronici. Alessandria: Edizioni dell’Orso.
Rychlý, P. 2007. “A Modular Corpus Manager”. In 1st Workshop on Recent Advances in Slavonic Natural Language Processing, (65-70). Brno: Masaryk University.
Scott, M. 2013. WordSmith Tools Manual, version 6. Liverpool: Lexical Analysis Software.
Steinberger, R., Pouliquen, B., Widiger, A., Ignat, C., Erjavec, T., Tufis, D. and Varga, D. 2006. “The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages”. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC2006), (2142-2147). Genoa, Italy.
Tadić, M. 2002. “Building the Croatian National Corpus”. LREC2002 Proceedings, Las Palmas-Paris, Vol. II, 441-446.
Tadić, M. 2009. “New version of the Croatian National Corpus”. In Hlaváčková, D., Horák, A., Osolsobě, K. and Rychlý, P. (eds.), After Half a Century of Slavonic Natural Language Processing, (199-205). Brno: Masaryk University.
Tadić, M., Brozović-Rončević, D. and Kapetanović, A. 2012. Hrvatski jezik u digitalnom dobu - The Croatian Language in the Digital Age. Heidelberg: Springer.
Teich, E. 2003. Cross-linguistic variation in system and text. Berlin & New York: Mouton de Gruyter.
Tiersma, P. 2006. “Languages for Specific Purposes”. In Brown, K. (ed.), Encyclopedia of Language and Linguistics, 2nd ed., (679-683). Oxford: Pergamon Press.
Venturi, G. 2011. Lingua e diritto: una prospettiva linguistico-computazionale. Unpublished PhD thesis, University of Turin. Available online at: page_id=81 [15.09.2013].
Xiao, R. 2010. “How different is translated Chinese from native Chinese”. International Journal of Corpus Linguistics, 15 (1), 5-35.
Xiao, R. and Dai, G. 2014. “Lexical and grammatical properties of Translational Chinese: translation universal hypotheses reevaluated from the Chinese perspective”. Corpus Linguistics and Linguistic Theory.

Filozofski fakultet, Zagreb,
Sveučilište Jurja Dobrile u Puli