Bibliographic record no.: 1171013
Učinkovita semantička segmentacija slike piramidnom fuzijom
Učinkovita semantička segmentacija slike piramidnom fuzijom, 2021, doctoral dissertation, Faculty of Electrical Engineering and Computing, Zagreb
CROSBI ID: 1171013
Title
Učinkovita semantička segmentacija slike piramidnom fuzijom
(Efficient semantic image segmentation using pyramidal fusion)
Authors
Marin Oršić
Type, subtype and category of work
Theses, doctoral dissertation
Faculty
Faculty of Electrical Engineering and Computing
Place
Zagreb
Date
12.11
Year
2021
Pages
94
Mentor
Siniša Šegvić
Keywords
semantic segmentation, real-time inference, shared resolution pyramid, computer vision, deep learning
Abstract
The emergence of large datasets and the resilience of convolutional models have enabled successful training of very large semantic segmentation models. However, high capacity implies high computational complexity and therefore hinders real-time operation. We therefore study compact architectures which aim at high accuracy in spite of modest capacity. We propose a novel semantic segmentation approach based on shared pyramidal representation and fusion of heterogeneous features along the upsampling path. The proposed pyramidal fusion approach is especially effective for dense inference in images with large scale variance due to strong regularization effects induced by feature sharing across the resolution pyramid. Interpretation of the decision process suggests that our approach succeeds by acting as a large ensemble of relatively simple models, as well as due to a large receptive range and strong gradient flow towards early layers. We propose a novel semantic segmentation approach based on pyramidal representation with shared parameters and fusion of heterogeneous features along the upsampling path. The proposed pyramidal fusion approach is especially effective for dense inference in very large images due to the very large receptive field of the resulting predictions. Validation and ablation experiments support our design choices and suggest that the proposed approach succeeds by acting as an ensemble of relatively simple models. Our best model achieves 76.4% mIoU on Cityscapes test and runs in real time on low-power embedded devices. In this thesis, we describe the main components of a real-time semantic segmentation system based on deep convolutional models. We are concerned with convolutional encoders used for recognition, as well as decoders which are crucial for obtaining accurate results. We extensively evaluate the developed method on a range of public and domestic datasets. Finally, we present results from the 2020 instance of the Robust Vision Challenge.
Original language
English
Scientific fields
Computing
RELATED TO THE WORK
Projects:
IP-2020-02-5851 - Advanced Dense Prediction for Computer Vision (ADEPT) (Šegvić, Siniša) (CroRIS)
Institutions:
Faculty of Electrical Engineering and Computing, Zagreb