Bibliographic record number: 1277384
Generating Representative Phrase Sets for Text Entry Experiments by GA-Based Text Corpora Sampling
Generating Representative Phrase Sets for Text Entry Experiments by GA-Based Text Corpora Sampling // Mathematics, 11 (2023), 11; 2550, 26 doi:10.3390/math11112550 (international peer review, article, scientific)
CROSBI ID: 1277384
Title
Generating Representative Phrase Sets for Text Entry Experiments by GA-Based Text Corpora Sampling
Authors
Ljubic, Sandi ; Salkanovic, Alen
Source
Mathematics (2227-7390) 11 (2023), 11; 2550, 26
Type, subtype, and category of work
Journal papers, article, scientific
Keywords
text entry ; phrase sets ; text corpus sampling ; genetic algorithm ; Kullback–Leibler divergence
Abstract
In the field of human–computer interaction (HCI), text entry methods can be evaluated through controlled user experiments or predictive modeling techniques. While the modeling approach requires a language model, the empirical approach necessitates representative text phrases for the experimental stimuli. In this context, finding a phrase set with the best language representativeness belongs to the class of optimization problems in which a solution is sought in a large search space. We propose a genetic algorithm (GA)-based method for extracting a target phrase set from the available text corpus, optimizing its language representativeness. Kullback–Leibler divergence is utilized to evaluate candidates, considering the digram probability distributions of both the source corpus and the target sample. The proposed method is highly customizable, outperforms typical random sampling, and exhibits language independence. The representative phrase sets generated by the proposed solution facilitate a more valid comparison of the results from different text entry studies. The open source implementation enables the easy customization of the GA-based sampling method, promotes its immediate utilization, and facilitates the reproducibility of this study. In addition, we provide heuristic guidelines for preparing the text entry experiments, which consider the experiment’s intended design and the phrase set to be generated with the proposed solution.
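The core idea of the abstract (score a candidate phrase set by the Kullback–Leibler divergence between its digram distribution and that of the source corpus, then search for a low-divergence set with a genetic algorithm) can be illustrated with a minimal sketch. This is not the authors' open source implementation. The function names, the divergence direction D(sample || corpus), the truncation selection, the union-based crossover, and all parameter defaults are assumptions chosen for illustration only.

```python
import math
import random
from collections import Counter


def digram_distribution(phrases):
    """Relative frequencies of adjacent character pairs, counted within each phrase."""
    counts = Counter()
    for phrase in phrases:
        counts.update(phrase[i:i + 2] for i in range(len(phrase) - 1))
    total = sum(counts.values()) or 1  # avoid division by zero for empty input
    return {digram: n / total for digram, n in counts.items()}


def kl_divergence(sample_dist, corpus_dist):
    """D(sample || corpus) in bits (assumed direction).

    Finite without smoothing because every digram in the sample
    also occurs in the corpus the sample was drawn from.
    """
    return sum(q * math.log2(q / corpus_dist[d]) for d, q in sample_dist.items())


def fitness(indices, corpus, corpus_dist):
    """Lower is better: divergence of the selected phrases from the corpus."""
    selected = [corpus[i] for i in indices]
    return kl_divergence(digram_distribution(selected), corpus_dist)


def ga_sample(corpus, set_size, pop_size=30, generations=50,
              mutation_rate=0.2, seed=0):
    """Hypothetical GA: individuals are lists of distinct phrase indices."""
    rng = random.Random(seed)
    corpus_dist = digram_distribution(corpus)
    all_idx = range(len(corpus))
    population = [rng.sample(all_idx, set_size) for _ in range(pop_size)]

    def score(individual):
        return fitness(individual, corpus, corpus_dist)

    for _ in range(generations):
        population.sort(key=score)
        survivors = population[:pop_size // 2]  # truncation selection keeps the best
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            pool = sorted(set(a) | set(b))      # crossover: draw from both parents
            child = rng.sample(pool, set_size)
            if rng.random() < mutation_rate:    # mutation: swap in an unused phrase
                unused = [i for i in all_idx if i not in child]
                if unused:
                    child[rng.randrange(set_size)] = rng.choice(unused)
            children.append(child)
        population = survivors + children

    best = min(population, key=score)
    return [corpus[i] for i in best], score(best)


if __name__ == "__main__":
    toy_corpus = ["the quick brown fox", "jumps over the lazy dog",
                  "text entry methods", "phrase sets for experiments",
                  "genetic algorithm sampling", "human computer interaction",
                  "language model", "random sampling baseline"]
    phrases, divergence = ga_sample(toy_corpus, set_size=3)
    print(phrases, round(divergence, 4))
```

Because the best individuals always survive into the next generation, the final divergence in this sketch can never be worse than that of the best randomly drawn initial set, which is the sense in which a GA-based search can improve on plain random sampling.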
Original language
English
Scientific fields
Computing
PAPER AFFILIATIONS
Institutions:
Faculty of Engineering, Rijeka (Tehnički fakultet, Rijeka)
Journal indexed in:
- Current Contents Connect (CCC)
- Web of Science Core Collection (WoSCC)
- Science Citation Index Expanded (SCI-EXP)
- SCI-EXP, SSCI and/or A&HCI
- Scopus