Data source: CROSBI

Selection, Implementation and Testing of Language Sample Analysis Measures for the Web-Based Application MultiDis (CROSBI ID 723422)

Conference contribution in proceedings | abstract of conference presentation | international peer review

Hržica, Gordana ; Košutar, Sara ; Karl, Dario ; Kramarić, Matea. Selection, Implementation and Testing of Language Sample Analysis Measures for the Web-Based Application MultiDis // LLOD Approaches for Language Data Research and Management LLODREAM2022: International Scientific Interdisciplinary Conference / Autorių kolektyvas (ed.). Vilnius: Mykolo Romerio universitetas, 2022. pp. 40-41

Responsibility details

Hržica, Gordana ; Košutar, Sara ; Karl, Dario ; Kramarić, Matea

English

Selection, Implementation and Testing of Language Sample Analysis Measures for the Web-Based Application MultiDis

Purpose: MultiDis is a new web-based application for analyzing spoken and written language samples. It provides information about the language abilities of children and adults and thus facilitates language assessment. The aim of this paper is to present the selection, implementation, and testing of language measures in the MultiDis application. We will present the application, the process of selecting the measures we implemented, the language resources needed to calculate them, and the results of testing. MultiDis is currently being developed for Croatian, but it could be scaled up for multilingual analysis.

Design/methodology/approach: Language samples can be analyzed along several dimensions, such as productivity, lexical diversity, and syntactic complexity. A set of (semi-)automatic measures was selected to assess language abilities (e.g., number of lemmas, moving-average type-token ratio, mean length of communication unit). The next step was the integration of an open-source Python library for lemmatization, part-of-speech tagging, and syntactic parsing (Stanza; Qi et al., 2020). To test whether these tasks and the subsequent calculation of language measures can be performed successfully on spoken language samples, we uploaded 150 short narrative samples produced by children in a storytelling task.

Findings: Lemmatization and part-of-speech tagging are fairly accurate (>85% of cases), accurate enough that they do not interfere with the calculation of the currently implemented measures of productivity and lexical diversity. Syntactic parsing has been an obstacle that is currently being resolved.

Research limitations/implications: The MultiDis web application is still under development, although the current version fulfils its main purpose: it allows for (semi-)automatic spoken language analysis.

Practical implications: There is increasing awareness of the importance of language sample analysis as a complementary method in language assessment. The time needed for transcription and the linguistic knowledge required for manual analysis are considered the main obstacles to its implementation (Pezold et al., 2020). Therefore, a tool for the automatic calculation of language measures, such as the MultiDis application, could make naturalistic language assessment more feasible.

Originality/Value: The value of this study lies in proposing a new application for lemmatization and part-of-speech tagging that allows for more reliable calculation of measures of productivity, lexical diversity, and syntactic complexity. Selecting appropriate measures for language assessment is challenging because many are available. Applying language technologies developed for large bodies of written text to spoken language is also challenging. Success in some parts of automated tagging (lemmatization and part-of-speech tagging) allows for the reliable calculation of measures of productivity and lexical diversity. Future work on syntactic parsing is expected to enable the implementation of measures of syntactic complexity.
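As an illustration of the kind of measures the abstract names, the following is a minimal Python sketch of three of them: the number of distinct lemmas, a type-token ratio averaged over a moving window, and the mean length of communication unit. It is not the MultiDis implementation. The function names, the window size, and the toy English lemmas are all assumptions made for the example, and the input is assumed to be already lemmatized and segmented into communication units (for instance by a tool such as Stanza).

```python
from typing import List, Sequence


def number_of_lemmas(lemmas: Sequence[str]) -> int:
    """Count distinct lemmas in a sample (hypothetical helper)."""
    return len(set(lemmas))


def moving_average_ttr(tokens: Sequence[str], window: int = 50) -> float:
    """Average the type-token ratio over every window of `window` tokens.

    The default window size is an assumption, not a value from the source.
    """
    if not tokens:
        return 0.0
    if len(tokens) <= window:
        # Sample shorter than one window: fall back to the plain TTR.
        return len(set(tokens)) / len(tokens)
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


def mean_length_of_unit(units: Sequence[Sequence[str]]) -> float:
    """Mean number of tokens per communication unit."""
    if not units:
        return 0.0
    return sum(len(unit) for unit in units) / len(units)


# Toy sample: three communication units, already lemmatized (invented data).
units: List[List[str]] = [
    ["dog", "chase", "cat"],
    ["cat", "run", "away"],
    ["dog", "bark"],
]
lemmas: List[str] = [lemma for unit in units for lemma in unit]

distinct = number_of_lemmas(lemmas)            # 6 distinct lemmas
mattr = moving_average_ttr(lemmas, window=4)   # 0.85 for this toy sample
mlcu = mean_length_of_unit(units)              # 8 tokens / 3 units
```

Computing the ratio over fixed-size windows makes the lexical diversity score less sensitive to sample length than a plain type-token ratio, which matters when narrative samples vary in length.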

Language Sample Analysis ; Microstructural Measures ; Lemmatization ; Part-of-speech tagging ; Syntactic parsing

Contribution details

40-41.

2022.

published

Host publication details

LLOD Approaches for Language Data Research and Management LLODREAM2022: International Scientific Interdisciplinary Conference

Autorių kolektyvas

Vilnius: Mykolo Romerio universitetas

978-609-488-041-4

Conference details

LLOD approaches for language data research and management (LLODREAM 2022)

lecture

21-22 September 2022

Vilnius, Lithuania

Related research fields

Philology, Interdisciplinary social sciences, Interdisciplinary humanities, Speech and language pathology