
Bibliographic record no. 1194202

Toward Practical Usage of the Attention Mechanism as a Tool for Interpretability


Tutek, Martin; Šnajder, Jan
Toward Practical Usage of the Attention Mechanism as a Tool for Interpretability // IEEE Access, 10 (2022), 47011-47030 doi:10.1109/ACCESS.2022.3169772 (international peer review, article, scientific)


CROSBI ID: 1194202

Title
Toward Practical Usage of the Attention Mechanism as a Tool for Interpretability

Authors
Tutek, Martin ; Šnajder, Jan

Source
IEEE Access (2169-3536) 10 (2022); 47011-47030

Type, subtype, and category of work
Journal articles, article, scientific

Keywords
Natural language processing ; Explainable AI ; Interpretability ; LSTM ; GRU ; Recurrent Neural Network

Abstract
Natural language processing (NLP) has been one of the subfields of artificial intelligence most affected by the recent neural revolution. Architectures such as recurrent neural networks (RNNs) and attention-based transformers helped propel the state of the art across various NLP tasks, such as sequence classification, machine translation, and natural language inference. However, if neural models are to be used in high-stakes decision-making scenarios, the explainability of their decisions becomes a paramount issue. The attention mechanism has offered some transparency into the workings of otherwise black-box RNN models: attention weights (scalar values assigned to input words) invite interpretation as the importance of those words, providing a simple method of interpretability. Recent work, however, has questioned the faithfulness of this practice. Subsequent experiments have shown that faithfulness of attention weights may still be achieved by incorporating word-level objectives into the training process of neural networks. In this article, we present a study that extends the techniques for improving faithfulness of attention based on regularization methods that promote retention of word-level information. We perform extensive experiments on a wide array of recurrent neural architectures and analyze to what extent the explanations provided by inspecting attention weights correlate with the human notion of importance. We find that incorporating tying regularization consistently improves both the faithfulness (−0.14 F1, +0.07 Brier, on average) and plausibility (+53.6% attention mass on salient tokens) of explanations obtained through inspecting attention weights across the analyzed datasets and models.
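
To illustrate the kind of setup the abstract describes, below is a minimal sketch, assuming PyTorch: an RNN classifier with additive attention whose weights are read off as word-importance scores, one plausible form of a word-level "tying" regularizer (pulling each hidden state toward its input embedding), and the attention-mass-on-salient-tokens plausibility proxy. The class and function names, the exact regularizer form, and the regularization coefficient are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch (not the paper's code): BiLSTM-free, single-direction LSTM
# classifier with additive attention, plus a hypothetical "tying" regularizer.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentiveRNN(nn.Module):
    def __init__(self, vocab_size, emb_dim=100, hid_dim=100, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.attn = nn.Linear(hid_dim, 1)       # additive attention scorer
        self.out = nn.Linear(hid_dim, n_classes)

    def forward(self, tokens):
        emb = self.embed(tokens)                              # (B, T, E)
        hidden, _ = self.rnn(emb)                             # (B, T, H)
        scores = self.attn(torch.tanh(hidden)).squeeze(-1)    # (B, T)
        alpha = F.softmax(scores, dim=-1)                     # attention weights, read as word importance
        context = torch.bmm(alpha.unsqueeze(1), hidden).squeeze(1)  # (B, H)
        return self.out(context), alpha, hidden, emb

def tying_loss(hidden, emb):
    # Assumed form of the word-level tying regularizer: mean squared distance
    # between each hidden state and the corresponding input embedding
    # (requires emb_dim == hid_dim), promoting retention of word-level information.
    return ((hidden - emb) ** 2).mean()

# Training-step sketch: task loss plus an (assumed) tying coefficient of 0.1.
model = AttentiveRNN(vocab_size=10000)
tokens = torch.randint(0, 10000, (8, 20))      # dummy batch of token ids
labels = torch.randint(0, 2, (8,))             # dummy labels
logits, alpha, hidden, emb = model(tokens)
loss = F.cross_entropy(logits, labels) + 0.1 * tying_loss(hidden, emb)
loss.backward()

# Plausibility proxy from the abstract: attention mass on (human-annotated) salient tokens.
salient_mask = torch.zeros_like(alpha)
salient_mask[:, :3] = 1.0                      # dummy rationale: first three tokens marked salient
attention_mass = (alpha * salient_mask).sum(dim=-1)  # per-example mass on salient tokens
```
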

Original language
English

Scientific fields
Information and communication sciences



CONNECTIONS OF THE WORK


Projects:
KK.01.1.1.01.009 - Napredne metode i tehnologije u znanosti o podatcima i kooperativnim sustavima (DATACROSS) (Šmuc, Tomislav; Lončarić, Sven; Petrović, Ivan; Jokić, Andrej; Palunko, Ivana) (CroRIS)

Institutions:
Fakultet elektrotehnike i računarstva, Zagreb

Profiles:

Jan Šnajder (author)

Martin Tutek (author)

Links to the full text:

doi: 10.1109/ACCESS.2022.3169772 · ieeexplore.ieee.org

Cite this publication:

Tutek, Martin; Šnajder, Jan
Toward Practical Usage of the Attention Mechanism as a Tool for Interpretability // IEEE Access, 10 (2022), 47011-47030 doi:10.1109/ACCESS.2022.3169772 (international peer review, article, scientific)
Tutek, M. & Šnajder, J. (2022) Toward Practical Usage of the Attention Mechanism as a Tool for Interpretability. IEEE Access, 10, 47011-47030. doi:10.1109/ACCESS.2022.3169772
@article{article,
  author   = {Tutek, Martin and \v{S}najder, Jan},
  title    = {Toward Practical Usage of the Attention Mechanism as a Tool for Interpretability},
  journal  = {IEEE Access},
  year     = {2022},
  volume   = {10},
  pages    = {47011-47030},
  issn     = {2169-3536},
  doi      = {10.1109/ACCESS.2022.3169772},
  keywords = {Natural language processing, Explainable AI, Interpretability, LSTM, GRU, Recurrent Neural Network}
}

The journal is indexed in:


  • Current Contents Connect (CCC)
  • Web of Science Core Collection (WoSCC)
    • Science Citation Index Expanded (SCI-EXP)
    • SCI-EXP, SSCI and/or A&HCI
  • Scopus

