Data source: CROSBI

Convolutional architecture for efficient semantic segmentation of large images (CROSBI ID 442804)

Thesis | doctoral dissertation

Ivan Krešo. Convolutional architecture for efficient semantic segmentation of large images / Siniša Šegvić (mentor). Zagreb: Fakultet elektrotehnike i računarstva, 2021.

Responsibility details

Ivan Krešo

Siniša Šegvić

English

Convolutional architecture for efficient semantic segmentation of large images

This thesis investigates semantic segmentation of large natural images. We focus on images recorded with a camera mounted on a vehicle. Such images do not suffer from photographer bias and therefore usually contain examples that are harder to generalize to. For semantic segmentation, this means that objects appear at a wide range of scales, so the method must work well both for small objects far from the camera and for large objects near it. The focus of the thesis is on applying convolutional neural networks to semantic segmentation of large images in an efficient manner. The thesis starts by introducing the problem of semantic segmentation and explaining its relation to the problem of object localization. The introduction concludes by discussing the challenges of the problem. The next chapter introduces the required concepts from the field of machine learning. In particular, we review convolutional neural networks and compare the DenseNet and ResNet architectures. The main contributions of the thesis are as follows. First, we develop a scale-invariant convolutional model for semantic segmentation which alleviates the problem of learning the same object at a wide range of scales. We also contribute a new dataset for semantic segmentation of driving scenes, containing ground-truth semantic segmentations for 445 hand-picked images from the KITTI dataset. Second, we develop an efficient asymmetric architecture for dense prediction on large images, based on a densely connected feature extractor and lightweight ladder-style upsampling. Third, we present the results from the Robust Vision Challenge 2018, where we achieved second place. The challenge addressed cross-dataset and cross-domain training of dense prediction models.
We found that mixed batches ensure the most stable evolution of batchnorm parameters. The low incidence of foreign classes indicates that our models succeeded in implicitly learning to distinguish the domains. Finally, the presented architecture is computationally and memory efficient and achieves a favorable trade-off between inference speed and generalization accuracy.

computer vision, semantic segmentation


Publication details

104

01.07.2021.

defended

Details of the degree-granting institution

Fakultet elektrotehnike i računarstva

Zagreb

Related field

Computing
