Bibliographic record no.: 1191369
Named entity recognition for addresses: an empirical study
Named entity recognition for addresses: an empirical study // IEEE Access, 10 (2022), 42094-42106 doi:10.1109/ACCESS.2022.3167418 (international peer review, article, scientific)
CROSBI ID: 1191369
Title
Named entity recognition for addresses: an empirical study
Authors
Čeović, Helena ; Kurdija, Adrian Satja ; Delač, Goran ; Šilić, Marin
Source
IEEE Access (2169-3536) 10 (2022); 42094-42106
Type, subtype and category of work
Journal papers, article, scientific
Keywords
named entity recognition ; natural language processing ; address entity ; BERT ; BiLSTM-CRF architecture ; Flair ; BiLSTM-CNN architecture
Abstract
In this paper, we develop a high-performing named entity recognition model for addresses that deals with challenges including the diversity, ambiguity and complexity of the address entity. Several model architectures are used to train the classifier, including logistic regression and random forest models as well as the more complex bidirectional LSTM network with a conditional random field layer (BiLSTM-CRF), implemented using the Flair framework. Experiments are conducted with variously configured models on two sets of corpora, tagged differently according to the granularity of the address entity: the entire address, and the address split into subparts. For both corpora, the best results are achieved by a BiLSTM-CRF model with a single RNN layer trained on either standalone BERT embeddings or a stacked combination of BERT and GloVe embeddings.
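The two tagging granularities mentioned in the abstract can be illustrated with a minimal sketch. It assumes a BIO tagging scheme, which this record does not confirm; the sentence, the label names (HOUSE_NUMBER, STREET, CITY, ADDRESS) and both helper functions are hypothetical and are not taken from the paper's corpora.

```python
# Illustrative only: tokens, label names and the BIO scheme are assumptions,
# not taken from the paper or its corpora.
tokens = ["Send", "it", "to", "12", "Main", "Street", ",", "Zagreb", "today"]

# Fine-grained annotation: (start, end_exclusive, label) spans for address subparts.
subpart_spans = [(3, 4, "HOUSE_NUMBER"), (4, 6, "STREET"), (7, 8, "CITY")]


def to_bio(n_tokens, spans):
    """Convert token-index spans to a list of BIO tags."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def merge_to_whole_address(spans, label="ADDRESS"):
    """Collapse all subpart spans into one span covering the entire address."""
    start = min(s for s, _, _ in spans)
    end = max(e for _, e, _ in spans)
    return [(start, end, label)]


# Corpus variant 1: the address tagged as subparts.
fine_tags = to_bio(len(tokens), subpart_spans)
# Corpus variant 2: the entire address tagged as a single entity.
coarse_tags = to_bio(len(tokens), merge_to_whole_address(subpart_spans))

for token, fine, coarse in zip(tokens, fine_tags, coarse_tags):
    print(f"{token:8} {fine:16} {coarse}")
```

In the coarse variant the comma between the street and the city falls inside the single ADDRESS span, whereas in the fine-grained variant it stays outside every entity.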
Original language
English
Scientific fields
Computer science
CONNECTIONS OF THE WORK
Projects:
EK-EFRR-KK.05.1.1.02.0024 - VODIME - Vode Imotske krajine (VODIME) (Šilić, Marin; Andrić, Ivo, EK)
HRZZ-IP-2018-01-6423 - Pouzdani kompozitni primjenski sustavi zasnovani na web uslugama (RELS) (Srbljić, Siniša, HRZZ)
Institutions:
Faculty of Electrical Engineering and Computing, Zagreb
The journal is indexed in:
- Current Contents Connect (CCC)
- Web of Science Core Collection (WoSCC)
- Science Citation Index Expanded (SCI-EXP)
- SCI-EXP, SSCI and/or A&HCI
- Scopus