Iconographic Image Captioning for Artworks

Cetinic, Eva

Bibliographic record number: 1231368

Iconographic Image Captioning for Artworks // Pattern Recognition. ICPR International Workshops and Challenges : Proceedings, Part III / Del Bimbo, Alberto ; Cucchiara, Rita ; Sclaroff, Stan ; Farinella, Giovanni Maria ; Mei, Tao ; Bertini, Marco ; Jair Escalante, Hugo ; Vezzani, Roberto (eds.).
Cham: Springer, 2021. pp. 502-516 doi:10.1007/978-3-030-68796-0_36 (lecture, international peer review, full paper (in extenso), scientific)

CROSBI ID: 1231368

Title
Iconographic Image Captioning for Artworks

Authors
Cetinic, Eva

Type, subtype and category of work
Papers in conference proceedings, full paper (in extenso), scientific

Source
Pattern Recognition. ICPR International Workshops and Challenges : Proceedings, Part III / Del Bimbo, Alberto ; Cucchiara, Rita ; Sclaroff, Stan ; Farinella, Giovanni Maria ; Mei, Tao ; Bertini, Marco ; Jair Escalante, Hugo ; Vezzani, Roberto - Cham : Springer, 2021, 502-516

ISBN
978-3-030-68795-3

Conference
International Workshop on Fine Art Pattern Extraction and Recognition (FAPER 2020)

Place and date
Online, 11 January 2021

Type of participation
Lecture

Type of review
International peer review

Keywords
image captioning ; vision-language models ; fine-tuning ; visual art

Abstract
Image captioning implies automatically generating textual descriptions of images based only on the visual input. Although this has been an extensively addressed research topic in recent years, not many contributions have been made in the domain of art historical data. In this particular context, the task of image captioning is confronted with various challenges such as the lack of large-scale datasets of image-text pairs, the complexity of meaning associated with describing artworks and the need for expert-level annotations. This work aims to address some of those challenges by utilizing a novel large-scale dataset of artwork images annotated with concepts from the Iconclass classification system designed for art and iconography. The annotations are processed into clean textual descriptions to create a dataset suitable for training a deep neural network model on the image captioning task. Motivated by the state-of-the-art results achieved in generating captions for natural images, a transformer-based vision-language pre-trained model is fine-tuned using the artwork image dataset. Quantitative evaluation of the results is performed using standard image captioning metrics. The quality of the generated captions and the model’s capacity to generalize to new data are explored by employing the model on a new collection of paintings and performing an analysis of the relation between commonly generated captions and the artistic genre. The overall results suggest that the model can generate meaningful captions that exhibit a stronger relevance to the art historical context, particularly in comparison to captions obtained from models trained only on natural image datasets.
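
The abstract mentions that concept annotations are processed into clean textual descriptions before training. A minimal Python sketch of what such a step might look like follows; it is not the author's actual preprocessing, which this record does not describe. The function name `labels_to_caption`, the cleaning rules (dropping parenthetical qualifiers, lowercasing, removing duplicates) and the example labels are all hypothetical and chosen only for illustration.

```python
import re


def labels_to_caption(labels):
    """Join concept labels into one cleaned caption string (illustrative only)."""
    cleaned = []
    seen = set()
    for label in labels:
        # Assumed rule: drop parenthetical qualifiers, then collapse whitespace.
        text = re.sub(r"\([^)]*\)", " ", label)
        text = re.sub(r"\s+", " ", text).strip().lower()
        # Assumed rule: keep each distinct label once, in first-seen order.
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return ", ".join(cleaned)


# Hypothetical labels, not taken from the dataset described in the abstract.
example = ["Landscape with river (summer)", "landscape with river", "Horse"]
caption = labels_to_caption(example)
print(caption)
```

Joining labels with commas keeps the sketch simple; a real pipeline would likely need rules specific to the annotation vocabulary.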

Original language
English

Scientific fields
Interdisciplinary technical sciences

WORK ASSOCIATIONS

Institutions:
Institut "Ruđer Bošković", Zagreb

Profiles:

Eva Cetinić (author)

Links to the full text:

doi arxiv.org link.springer.com

Indexed in:

Scopus
