|
Records |
Links |
|
Author |
Misael Rosales |
|
|
Title |
Empirical Simulation Moldel of Intravascular Ultrasound |
Type |
Miscellaneous |
|
Year |
2002 |
Publication |
Director: P. Radeva, Master Thesis. |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
CVC (UAB) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ Ros2002 |
Serial |
323 |
|
Permanent link to this record |
|
|
|
|
Author |
Jose Antonio Rodriguez; Florent Perronnin |
|
|
Title |
Handwritten word-spotting using hidden Markov models and universal vocabularies |
Type |
Journal Article |
|
Year |
2009 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
42 |
Issue |
9 |
Pages |
2103-2116 |
|
|
Keywords |
Word-spotting; Hidden Markov model; Score normalization; Universal vocabulary; Handwriting recognition |
|
|
Abstract |
Handwritten word-spotting is traditionally viewed as an image matching task between one or multiple query word-images and a set of candidate word-images in a database. This is a typical instance of the query-by-example paradigm. In this article, we introduce a statistical framework for the word-spotting problem which employs hidden Markov models (HMMs) to model keywords and a Gaussian mixture model (GMM) for score normalization. We explore the use of two types of HMMs for the word modeling part: continuous HMMs (C-HMMs) and semi-continuous HMMs (SC-HMMs), i.e. HMMs with a shared set of Gaussians. We show on a challenging multi-writer corpus that the proposed statistical framework is always superior to a traditional matching system which uses dynamic time warping (DTW) for word-image distance computation. A very important finding is that the SC-HMM is superior when labeled training data is scarce—as low as one sample per keyword—thanks to the prior information which can be incorporated in the shared set of Gaussians. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0031-3203 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ RoP2009 |
Serial |
1053 |
|
Permanent link to this record |
|
|
|
|
Author |
Jose Antonio Rodriguez; Florent Perronnin |
|
|
Title |
Score Normalization for Hmm-based Word Spotting Using Universal Background Model |
Type |
Conference Article |
|
Year |
2008 |
Publication |
International Conference on Frontiers in Handwriting Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
82–87 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Montreal (Canada) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ RoP2008c |
Serial |
1067 |
|
Permanent link to this record |
|
|
|
|
Author |
Jose Antonio Rodriguez; Florent Perronnin |
|
|
Title |
Local Gradient Histogram Features for Word Spotting in Unconstrained Handwritten Documents |
Type |
Conference Article |
|
Year |
2008 |
Publication |
International Conference on Frontiers in Handwriting Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
7–12 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Montreal (Canada) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ RoP2008b |
Serial |
1066 |
|
Permanent link to this record |
|
|
|
|
Author |
Jose Antonio Rodriguez; Florent Perronnin |
|
|
Title |
Local Gradient Histogram Features for Word Spotting in Unconstrained Handwritten Documents |
Type |
Book Chapter |
|
Year |
2008 |
Publication |
Graphics Recognition: Recent Advances and New Opportunities |
Abbreviated Journal |
|
|
|
Volume |
5046 |
Issue |
|
Pages |
188–198 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
W. Liu, J. Llados, J.M. Ogier |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ RoP2008a |
Serial |
992 |
|
Permanent link to this record |
|
|
|
|
Author |
Adriana Romero |
|
|
Title |
Assisting the training of deep neural networks with applications to computer vision |
Type |
Book Whole |
|
Year |
2015 |
Publication |
PhD Thesis, Universitat de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Deep learning has recently been enjoying an increasing popularity due to its success in solving challenging tasks. In particular, deep learning has proven to be effective in a large variety of computer vision tasks, such as image classification, object recognition and image parsing. Contrary to previous research, which required engineered feature representations, designed by experts, in order to succeed, deep learning attempts to learn representation hierarchies automatically from data. More recently, the trend has been to go deeper with representation hierarchies.
Learning (very) deep representation hierarchies is a challenging task, which
involves the optimization of highly non-convex functions. Therefore, the search
for algorithms to ease the learning of (very) deep representation hierarchies from data is extensive and ongoing.
In this thesis, we tackle the challenging problem of easing the learning of (very) deep representation hierarchies. We present a hyper-parameter free, off-the-shelf, simple and fast unsupervised algorithm to discover hidden structure from the input data by enforcing a very strong form of sparsity. We study the applicability and potential of the algorithm to learn representations of varying depth in a handful of applications and domains, highlighting the ability of the algorithm to provide discriminative feature representations that are able to achieve top performance.
Yet, while emphasizing the great value of unsupervised learning methods when
labeled data is scarce, the recent industrial success of deep learning has revolved around supervised learning. Supervised learning is currently the focus of many recent research advances, which have shown to excel at many computer vision tasks. Top performing systems often involve very large and deep models, which are not well suited for applications with time or memory limitations. More in line with the current trends, we engage in making top performing models more efficient, by designing very deep and thin models. Since training such very deep models still appears to be a challenging task, we introduce a novel algorithm that guides the training of very thin and deep models by hinting their intermediate representations.
Very deep and thin models trained by the proposed algorithm end up extracting feature representations that are comparable or even better performing
than the ones extracted by large state-of-the-art models, while compellingly
reducing the time and memory consumption of the model. |
|
|
Address |
October 2015 |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Carlo Gatta;Petia Radeva |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ Rom2015 |
Serial |
2707 |
|
Permanent link to this record |
|
|
|
|
Author |
Jordi Roca; A.Owen; G.Jordan; Y.Ling; C. Alejandro Parraga; A.Hurlbert |
|
|
Title |
Inter-individual Variations in Color Naming and the Structure of 3D Color Space |
Type |
Abstract |
|
Year |
2011 |
Publication |
Journal of Vision |
Abbreviated Journal |
VSS |
|
|
Volume |
12 |
Issue |
2 |
Pages |
166 |
|
|
Keywords |
|
|
|
Abstract |
36.307
Many everyday behavioural uses of color vision depend on color naming ability, which is neither measured nor predicted by most standardized tests of color vision, for either normal or anomalous color vision. Here we demonstrate a new method to quantify color naming ability by deriving a compact computational description of individual 3D color spaces. Methods: Individual observers underwent standardized color vision diagnostic tests (including anomaloscope testing) and a series of custom-made color naming tasks using 500 distinct color samples, either CRT stimuli (“light”-based) or Munsell chips (“surface”-based), with both forced- and free-choice color naming paradigms. For each subject, we defined his/her color solid as the set of 3D convex hulls computed for each basic color category from the relevant collection of categorised points in perceptually uniform CIELAB space. From the parameters of the convex hulls, we derived several indices to characterise the 3D structure of the color solid and its inter-individual variations. Using a reference group of 25 normal trichromats (NT), we defined the degree of normality for the shape, location and overlap of each color region, and the extent of “light”-“surface” agreement. Results: Certain features of color perception emerge from analysis of the average NT color solid, e.g.: (1) the white category is slightly shifted towards blue; and (2) the variability in category border location across NT subjects is asymmetric across color space, with least variability in the blue/green region. Comparisons between individual and average NT indices reveal specific naming “deficits”, e.g.: (1) Category volumes for white, green, brown and grey are expanded for anomalous trichromats and dichromats; and (2) the focal structure of color space is disrupted more in protanopia than other forms of anomalous color vision. The indices both capture the structure of subjective color spaces and allow us to quantify inter-individual differences in color naming ability. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1534-7362 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
Admin @ si @ ROJ2011 |
Serial |
1758 |
|
Permanent link to this record |
|
|
|
|
Author |
David Augusto Rojas |
|
|
Title |
Colouring Local Feature Detection for Matching |
Type |
Report |
|
Year |
2009 |
Publication |
CVC Technical Report |
Abbreviated Journal |
|
|
|
Volume |
133 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
Computer Vision Center |
Thesis |
Master's thesis |
|
|
Publisher |
|
Place of Publication |
Bellaterra, Barcelona |
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
Admin @ si @ Roj2009 |
Serial |
2392 |
|
Permanent link to this record |
|
|
|
|
Author |
Adriana Romero; Carlo Gatta |
|
|
Title |
Do We Really Need All These Neurons? |
Type |
Conference Article |
|
Year |
2013 |
Publication |
6th Iberian Conference on Pattern Recognition and Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
7887 |
Issue |
|
Pages |
460--467 |
|
|
Keywords |
Retricted Boltzmann Machine; hidden units; unsupervised learning; classification |
|
|
Abstract |
Restricted Boltzmann Machines (RBMs) are generative neural networks that have received much attention recently. In particular, choosing the appropriate number of hidden units is important as it might hinder their representative power. According to the literature, RBM require numerous hidden units to approximate any distribution properly. In this paper, we present an experiment to determine whether such amount of hidden units is required in a classification context. We then propose an incremental algorithm that trains RBM reusing the previously trained parameters using a trade-off measure to determine the appropriate number of hidden units. Results on the MNIST and OCR letters databases show that using a number of hidden units, which is one order of magnitude smaller than the literature estimate, suffices to achieve similar performance. Moreover, the proposed algorithm allows to estimate the required number of hidden units without the need of training many RBM from scratch. |
|
|
Address |
Madeira; Portugal; June 2013 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-38627-5 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IbPRIA |
|
|
Notes |
MILAB; 600.046 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RoG2013 |
Serial |
2311 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Rodriguez |
|
|
Title |
Towards Robust Neural Models for Fine-Grained Image Recognition |
Type |
Book Whole |
|
Year |
2019 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Fine-grained recognition, i.e. identifying similar subcategories of the same superclass, is central to human activity. Recognizing a friend, finding bacteria in microscopic imagery, or discovering a new kind of galaxy, are just but few examples. However, fine-grained image recognition is still a challenging computer vision task since the differences between two images of the same category can overwhelm the differences between two images of different fine-grained categories. In this regime, where the difference between two categories resides on subtle input changes, excessively invariant CNNs discard those details that help to discriminate between categories and focus on more obvious changes, yielding poor classification performance.
On the other hand, CNNs with too much capacity tend to memorize instance-specific details, thus causing overfitting. In this thesis,motivated by the
potential impact of automatic fine-grained image recognition, we tackle the previous challenges and demonstrate that proper alignment of the inputs, multiple levels of attention, regularization, and explicitmodeling of the output space, results inmore accurate fine-grained recognitionmodels, that generalize better, and are more robust to intra-class variation. Concretely, we study the different stages of the neural network pipeline: input pre-processing, attention to regions, feature activations, and the label space. In each stage, we address different issues that hinder the recognition performance on various fine-grained tasks, and devise solutions in each chapter: i)We deal with the sensitivity to input alignment on fine-grained human facial motion such as pain. ii) We introduce an attention mechanism to allow CNNs to choose and process in detail the most discriminate regions of the image. iii)We further extend attention mechanisms to act on the network activations,
thus allowing them to correct their predictions by looking back at certain
regions, at different levels of abstraction. iv) We propose a regularization loss to prevent high-capacity neural networks to memorize instance details by means of almost-identical feature detectors. v)We finally study the advantages of explicitly modeling the output space within the error-correcting framework. As a result, in this thesis we demonstrate that attention and regularization seem promising directions to overcome the problems of fine-grained image recognition, as well as proper treatment of the input and the output space. |
|
|
Address |
March 2019 |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Jordi Gonzalez;Josep M. Gonfaus;Xavier Roca |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-84-948531-3-5 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE; 600.119 |
Approved |
no |
|
|
Call Number |
Admin @ si @ Rod2019 |
Serial |
3258 |
|
Permanent link to this record |
|
|
|
|
Author |
Jose Antonio Rodriguez |
|
|
Title |
Statistical frameworks and prior information modeling in handwritten word-spotting |
Type |
Book Whole |
|
Year |
2009 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Handwritten word-spotting (HWS) is the pattern analysis task that consists in finding keywords in handwritten document images. So far, HWS has been applied mostly to historical documents in order to build search engines for such image collections. This thesis addresses the problem of word-spotting for detecting important keywords in business documents. This is a first step towards the process of automatic routing of correspondence based on content.
However, the application of traditional HWS techniques fails for this type of documents. As opposed to historical documents, real business documents present a very high variability in terms of writing styles, spontaneous writing, crossed-out words, spelling mistakes, etc. The main goal of this thesis is the development of pattern recognition techniques that lead to a high-performance HWS system for this challenging type of data.
We develop a statistical framework in which word models are expressed in terms of hidden Markov models and the a priori information is encoded in a universal vocabulary of Gaussian codewords. This systems leads to a very robust performance in word-spotting task. We also find that by constraining the word models to the universal vocabulary, the a priori information of the problem of interest can be exploited for developing new contributions. These include a novel writer adaptation method, a system for searching handwritten words by generating typed text images, and a novel model-based similarity between feature vector sequences. |
|
|
Address |
Barcelona (Spain) |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Gemma Sanchez;Josep Llados;Florent Perronnin |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ Rod2009 |
Serial |
1266 |
|
Permanent link to this record |
|
|
|
|
Author |
Jose Antonio Rodriguez |
|
|
Title |
Pen-based Interfaces and Recognition: Application to Proofreading Interpretation |
Type |
Report |
|
Year |
2006 |
Publication |
CVC Technical Report #96 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
CVC (UAB) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ Rod2006 |
Serial |
669 |
|
Permanent link to this record |
|
|
|
|
Author |
David Roche |
|
|
Title |
A Statistical Framework for Terminating Evolutionary Algorithms at their Steady State |
Type |
Book Whole |
|
Year |
2015 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
As any iterative technique, it is a necessary condition a stop criterion for terminating Evolutionary Algorithms (EA). In the case of optimization methods, the algorithm should stop at the time it has reached a steady state so it can not improve results anymore. Assessing the reliability of termination conditions for EAs is of prime importance. A wrong or weak stop criterion can negatively aect both the computational eort and the nal result.
In this Thesis, we introduce a statistical framework for assessing whether a termination condition is able to stop EA at its steady state. In one hand a numeric approximation to steady states to detect the point in which EA population has lost its diversity has been presented for EA termination. This approximation has been applied to dierent EA paradigms based on diversity and a selection of functions covering the properties most relevant for EA convergence. Experiments show that our condition works regardless of the search space dimension and function landscape and Dierential Evolution (DE) arises as the best paradigm. On the other hand, we use a regression model in order to determine the requirements ensuring that a measure derived from EA evolving population is related to the distance to the optimum in xspace.
Our theoretical framework is analyzed across several benchmark test functions
and two standard termination criteria based on function improvement in f-space and EA population x-space distribution for the DE paradigm. Results validate our statistical framework as a powerful tool for determining the capability of a measure for terminating EA and select the x-space distribution as the best-suited for accurately stopping DE in real-world applications. |
|
|
Address |
July 2015 |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Debora Gil;Jesus Giraldo |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
IAM; 600.075 |
Approved |
no |
|
|
Call Number |
Admin @ si @ Roc2015 |
Serial |
2686 |
|
Permanent link to this record |
|
|
|
|
Author |
Jordi Roca |
|
|
Title |
Constancy and inconstancy in categorical colour perception |
Type |
Book Whole |
|
Year |
2012 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
To recognise objects is perhaps the most important task an autonomous system, either biological or artificial needs to perform. In the context of human vision, this is partly achieved by recognizing the colour of surfaces despite changes in the wavelength distribution of the illumination, a property called colour constancy. Correct surface colour recognition may be adequately accomplished by colour category matching without the need to match colours precisely, therefore categorical colour constancy is likely to play an important role for object identification to be successful. The main aim of this work is to study the relationship between colour constancy and categorical colour perception. Previous studies of colour constancy have shown the influence of factors such the spatio-chromatic properties of the background, individual observer's performance, semantics, etc. However there is very little systematic study of these influences. To this end, we developed a new approach to colour constancy which includes both individual observers' categorical perception, the categorical structure of the background, and their interrelations resulting in a more comprehensive characterization of the phenomenon. In our study, we first developed a new method to analyse the categorical structure of 3D colour space, which allowed us to characterize individual categorical colour perception as well as quantify inter-individual variations in terms of shape and centroid location of 3D categorical regions. Second, we developed a new colour constancy paradigm, termed chromatic setting, which allows measuring the precise location of nine categorically-relevant points in colour space under immersive illumination. Additionally, we derived from these measurements a new colour constancy index which takes into account the magnitude and orientation of the chromatic shift, memory effects and the interrelations among colours and a model of colour naming tuned to each observer/adaptation state. Our results lead to the following conclusions: (1) There exists large inter-individual variations in the categorical structure of colour space, and thus colour naming ability varies significantly but this is not well predicted by low-level chromatic discrimination ability; (2) Analysis of the average colour naming space suggested the need for an additional three basic colour terms (turquoise, lilac and lime) for optimal colour communication; (3) Chromatic setting improved the precision of more complex linear colour constancy models and suggested that mechanisms other than cone gain might be best suited to explain colour constancy; (4) The categorical structure of colour space is broadly stable under illuminant changes for categorically balanced backgrounds; (5) Categorical inconstancy exists for categorically unbalanced backgrounds thus indicating that categorical information perceived in the initial stages of adaptation may constrain further categorical perception. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
|
Place of Publication |
|
Editor |
Maria Vanrell;C. Alejandro Parraga |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
Admin @ si @ Roc2012 |
Serial |
2893 |
|
Permanent link to this record |
|
|
|
|
Author |
Javier Rodenas; Bhalaji Nagarajan; Marc Bolaños; Petia Radeva |
|
|
Title |
Learning Multi-Subset of Classes for Fine-Grained Food Recognition |
Type |
Conference Article |
|
Year |
2022 |
Publication |
7th International Workshop on Multimedia Assisted Dietary Management |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
17–26 |
|
|
Keywords |
|
|
|
Abstract |
Food image recognition is a complex computer vision task, because of the large number of fine-grained food classes. Fine-grained recognition tasks focus on learning subtle discriminative details to distinguish similar classes. In this paper, we introduce a new method to improve the classification of classes that are more difficult to discriminate based on Multi-Subsets learning. Using a pre-trained network, we organize classes in multiple subsets using a clustering technique. Later, we embed these subsets in a multi-head model structure. This structure has three distinguishable parts. First, we use several shared blocks to learn the generalized representation of the data. Second, we use multiple specialized blocks focusing on specific subsets that are difficult to distinguish. Lastly, we use a fully connected layer to weight the different subsets in an end-to-end manner by combining the neuron outputs. We validated our proposed method using two recent state-of-the-art vision transformers on three public food recognition datasets. Our method was successful in learning the confused classes better and we outperformed the state-of-the-art on the three datasets. |
|
|
Address |
Lisboa; Portugal; October 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MADiMa |
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ RNB2022 |
Serial |
3797 |
|
Permanent link to this record |