|   | 
Details
   web
Record
Author Pau Rodriguez
Title Towards Robust Neural Models for Fine-Grained Image Recognition Type Book Whole
Year 2019 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Fine-grained recognition, i.e. identifying similar subcategories of the same superclass, is central to human activity. Recognizing a friend, finding bacteria in microscopic imagery, or discovering a new kind of galaxy, are just but few examples. However, fine-grained image recognition is still a challenging computer vision task since the differences between two images of the same category can overwhelm the differences between two images of different fine-grained categories. In this regime, where the difference between two categories resides on subtle input changes, excessively invariant CNNs discard those details that help to discriminate between categories and focus on more obvious changes, yielding poor classification performance.
On the other hand, CNNs with too much capacity tend to memorize instance-specific details, thus causing overfitting. In this thesis,motivated by the
potential impact of automatic fine-grained image recognition, we tackle the previous challenges and demonstrate that proper alignment of the inputs, multiple levels of attention, regularization, and explicitmodeling of the output space, results inmore accurate fine-grained recognitionmodels, that generalize better, and are more robust to intra-class variation. Concretely, we study the different stages of the neural network pipeline: input pre-processing, attention to regions, feature activations, and the label space. In each stage, we address different issues that hinder the recognition performance on various fine-grained tasks, and devise solutions in each chapter: i)We deal with the sensitivity to input alignment on fine-grained human facial motion such as pain. ii) We introduce an attention mechanism to allow CNNs to choose and process in detail the most discriminate regions of the image. iii)We further extend attention mechanisms to act on the network activations,
thus allowing them to correct their predictions by looking back at certain
regions, at different levels of abstraction. iv) We propose a regularization loss to prevent high-capacity neural networks to memorize instance details by means of almost-identical feature detectors. v)We finally study the advantages of explicitly modeling the output space within the error-correcting framework. As a result, in this thesis we demonstrate that attention and regularization seem promising directions to overcome the problems of fine-grained image recognition, as well as proper treatment of the input and the output space.
Address March 2019
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Gonzalez;Josep M. Gonfaus;Xavier Roca
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue (up) Edition
ISSN ISBN 978-84-948531-3-5 Medium
Area Expedition Conference
Notes ISE; 600.119 Approved no
Call Number Admin @ si @ Rod2019 Serial 3258
Permanent link to this record