Učinkovita semantička segmentacija slike piramidnom fuzijom

Marin Oršić

Bibliographic record number: 1171013

Učinkovita semantička segmentacija slike piramidnom fuzijom, 2021, doctoral dissertation, Faculty of Electrical Engineering and Computing, Zagreb

CROSBI ID: 1171013

Title
Učinkovita semantička segmentacija slike piramidnom fuzijom
(Efficient semantic image segmentation using pyramidal fusion)

Authors
Marin Oršić

Type, subtype, and category of work
Theses, doctoral dissertation

Faculty
Faculty of Electrical Engineering and Computing

Place
Zagreb

Date
12 November

Year
2021

Pages
94

Supervisor
Siniša Šegvić

Keywords
semantic segmentation, real-time inference, shared resolution pyramid, computer vision, deep learning

Abstract
The emergence of large datasets and the resilience of convolutional models have enabled successful training of very large semantic segmentation models. However, high capacity implies high computational complexity and therefore hinders real-time operation. We therefore study compact architectures which aim at high accuracy in spite of modest capacity. We propose a novel semantic segmentation approach based on a shared pyramidal representation, that is, a resolution pyramid with shared parameters, and on fusion of heterogeneous features along the upsampling path. The proposed pyramidal fusion approach is especially effective for dense inference in images with large scale variance, due to strong regularization effects induced by feature sharing across the resolution pyramid, and in very large images, due to the very large receptive field of the resulting predictions. Validation and ablation experiments support our design choices and suggest that the approach succeeds by acting as a large ensemble of relatively simple models; interpretation of the decision process points to the same ensemble effect, as well as to a large receptive range and strong gradient flow towards early layers. Our best model achieves 76.4% mIoU on the Cityscapes test set and runs in real time on low-power embedded devices. In this thesis, we describe the main components of a real-time semantic segmentation system based on deep convolutional models. We are concerned with convolutional encoders used for recognition, as well as decoders, which are crucial for obtaining accurate results. We extensively evaluate the developed method over a range of public and domestic datasets. Finally, we present results from the 2020 instance of the Robust Vision Challenge.
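
The abstract reports accuracy as mIoU (mean intersection-over-union). For readers unfamiliar with the metric, the following is a minimal pure-Python sketch of how it is commonly computed from a confusion matrix. It is not the thesis code: the function name `mean_iou`, its parameters, and the ignore label 255 are assumptions made for illustration, and benchmark evaluations typically accumulate the confusion matrix over the whole dataset rather than over a single image.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean intersection-over-union from two flat sequences of class ids.

    Hypothetical helper for illustration; ignore_label=255 is an assumed
    convention for unlabeled pixels.
    """
    # conf[t][p] counts pixels of true class t that were predicted as class p.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # unlabeled pixels do not contribute to the score
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fp = sum(conf[t][c] for t in range(num_classes)) - tp
        fn = sum(conf[c]) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Four pixels, two classes, last pixel unlabeled.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2))
```

Per-class IoU is TP / (TP + FP + FN), so a class is penalized both for pixels it misses and for pixels it wrongly claims; averaging over classes gives small and large classes equal weight.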

Original language
English

Scientific fields
Computer science

RELATED RECORDS

Projects:
IP-2020-02-5851 - Advanced dense prediction for computer vision (ADEPT) (Šegvić, Siniša)

Institutions:
Faculty of Electrical Engineering and Computing, Zagreb

Profiles:

Marin Oršić (author)