|
Records |
Links |
|
Author |
Arash Akbarinia; Raquel Gil Rodriguez; C. Alejandro Parraga |
|
|
Title |
Colour Constancy: Biologically-inspired Contrast Variant Pooling Mechanism |
Type |
Conference Article |
|
Year |
2017 |
Publication |
28th British Machine Vision Conference |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Pooling is a ubiquitous operation in image processing algorithms that allows for higher-level processes to collect relevant low-level features from a region of interest. Currently, max-pooling is one of the most commonly used operators in the computational literature. However, it can lack robustness to outliers due to the fact that it relies merely on the peak of a function. Pooling mechanisms are also present in the primate visual cortex where neurons of higher cortical areas pool signals from lower ones. The receptive fields of these neurons have been shown to vary according to the contrast by aggregating signals over a larger region in the presence of low contrast stimuli. We hypothesise that this contrast-variant-pooling mechanism can address some of the shortcomings of maxpooling. We modelled this contrast variation through a histogram clipping in which the percentage of pooled signal is inversely proportional to the local contrast of an image. We tested our hypothesis by applying it to the phenomenon of colour constancy where a number of popular algorithms utilise a max-pooling step (e.g. White-Patch, Grey-Edge and Double-Opponency). For each of these methods, we investigated the consequences of replacing their original max-pooling by the proposed contrast-variant-pooling. Our experiments on three colour constancy benchmark datasets suggest that previous results can significantly improve by adopting a contrast-variant-pooling mechanism. |
|
|
Address |
London; September 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
BMVC |
|
|
Notes |
NEUROBIT; 600.068; 600.072 |
Approved |
no |
|
|
Call Number |
Admin @ si @ AGP2017 |
Serial |
2992 |
|
Permanent link to this record |
|
|
|
|
Author |
Arash Akbarinia; C. Alejandro Parraga |
|
|
Title |
Biologically Plausible Colour Naming Model |
Type |
Conference Article |
|
Year |
2015 |
Publication |
European Conference on Visual Perception ECVP2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Poster |
|
|
Address |
Liverpool; UK; August 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECVP |
|
|
Notes |
NEUROBIT; 600.068 |
Approved |
no |
|
|
Call Number |
Admin @ si @ AkP2015 |
Serial |
2660 |
|
Permanent link to this record |
|
|
|
|
Author |
C. Alejandro Parraga; Arash Akbarinia |
|
|
Title |
NICE: A Computational Solution to Close the Gap from Colour Perception to Colour Categorization |
Type |
Journal Article |
|
Year |
2016 |
Publication |
PLoS One |
Abbreviated Journal |
Plos |
|
|
Volume |
11 |
Issue |
3 |
Pages |
e0149538 |
|
|
Keywords |
|
|
|
Abstract |
The segmentation of visible electromagnetic radiation into chromatic categories by the human visual system has been extensively studied from a perceptual point of view, resulting in several colour appearance models. However, there is currently a void when it comes to relate these results to the physiological mechanisms that are known to shape the pre-cortical and cortical visual pathway. This work intends to begin to fill this void by proposing a new physiologically plausible model of colour categorization based on Neural Isoresponsive Colour Ellipsoids (NICE) in the cone-contrast space defined by the main directions of the visual signals entering the visual cortex. The model was adjusted to fit psychophysical measures that concentrate on the categorical boundaries and are consistent with the ellipsoidal isoresponse surfaces of visual cortical neurons. By revealing the shape of such categorical colour regions, our measures allow for a more precise and parsimonious description, connecting well-known early visual processing mechanisms to the less understood phenomenon of colour categorization. To test the feasibility of our method we applied it to exemplary images and a popular ground-truth chart obtaining labelling results that are better than those of current state-of-the-art algorithms. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
NEUROBIT; 600.068 |
Approved |
no |
|
|
Call Number |
Admin @ si @ PaA2016a |
Serial |
2747 |
|
Permanent link to this record |
|
|
|
|
Author |
Xavier Otazu; Olivier Penacchio; Xim Cerda-Company |
|
|
Title |
Brightness and colour induction through contextual influences in V1 |
Type |
Conference Article |
|
Year |
2015 |
Publication |
Scottish Vision Group 2015 SGV2015 |
Abbreviated Journal |
|
|
|
Volume |
12 |
Issue |
9 |
Pages |
1208-2012 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Carnoustie; Scotland; March 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
SGV |
|
|
Notes |
NEUROBIT; |
Approved |
no |
|
|
Call Number |
Admin @ si @ OPC2015a |
Serial |
2632 |
|
Permanent link to this record |
|
|
|
|
Author |
Olivier Penacchio; Xavier Otazu; A. wilkins; J. Harris |
|
|
Title |
Uncomfortable images prevent lateral interactions in the cortex from providing a sparse code |
Type |
Conference Article |
|
Year |
2015 |
Publication |
European Conference on Visual Perception ECVP2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Liverpool; uk; August 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECVP |
|
|
Notes |
NEUROBIT; |
Approved |
no |
|
|
Call Number |
Admin @ si @ POW2015 |
Serial |
2633 |
|
Permanent link to this record |
|
|
|
|
Author |
Xavier Otazu; Olivier Penacchio; Xim Cerda-Company |
|
|
Title |
An excitatory-inhibitory firing rate model accounts for brightness induction, colour induction and visual discomfort |
Type |
Conference Article |
|
Year |
2015 |
Publication |
Barcelona Computational, Cognitive and Systems Neuroscience |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Barcelona; June 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
BARCCSYN |
|
|
Notes |
NEUROBIT; |
Approved |
no |
|
|
Call Number |
Admin @ si @ OPC2015b |
Serial |
2634 |
|
Permanent link to this record |
|
|
|
|
Author |
Eduardo Tusa; Arash Akbarinia; Raquel Gil Rodriguez; Corina Barbalata |
|
|
Title |
Real-Time Face Detection and Tracking Utilising OpenMP and ROS |
Type |
Conference Article |
|
Year |
2015 |
Publication |
3rd Asia-Pacific Conference on Computer Aided System Engineering |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
179 - 184 |
|
|
Keywords |
RGB-D; Kinect; Human Detection and Tracking; ROS; OpenMP |
|
|
Abstract |
The first requisite of a robot to succeed in social interactions is accurate human localisation, i.e. subject detection and tracking. Later, it is estimated whether an interaction partner seeks attention, for example by interpreting the position and orientation of the body. In computer vision, these cues usually are obtained in colour images, whose qualities are degraded in ill illuminated social scenes. In these scenarios depth sensors offer a richer representation. Therefore, it is important to combine colour and depth information. The
second aspect that plays a fundamental role in the acceptance of social robots is their real-time-ability. Processing colour and depth images is computationally demanding. To overcome this we propose a parallelisation strategy of face detection and tracking based on two different architectures: message passing and shared memory. Our results demonstrate high accuracy in
low computational time, processing nine times more number of frames in a parallel implementation. This provides a real-time social robot interaction. |
|
|
Address |
Quito; Ecuador; July 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
APCASE |
|
|
Notes |
NEUROBIT |
Approved |
no |
|
|
Call Number |
Admin @ si @ TAG2015 |
Serial |
2659 |
|
Permanent link to this record |
|
|
|
|
Author |
Arash Akbarinia; C. Alejandro Parraga |
|
|
Title |
Dynamically Adjusted Surround Contrast Enhances Boundary Detection, European Conference on Visual Perception |
Type |
Conference Article |
|
Year |
2016 |
Publication |
European Conference on Visual Perception |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Barcelona; Spain; August 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECVP |
|
|
Notes |
NEUROBIT |
Approved |
no |
|
|
Call Number |
Admin @ si @ AkP2016b |
Serial |
2900 |
|
Permanent link to this record |
|
|
|
|
Author |
C. Alejandro Parraga; Arash Akbarinia |
|
|
Title |
Colour Constancy as a Product of Dynamic Centre-Surround Adaptation |
Type |
Conference Article |
|
Year |
2016 |
Publication |
16th Annual meeting in Vision Sciences Society |
Abbreviated Journal |
|
|
|
Volume |
16 |
Issue |
12 |
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Colour constancy refers to the human visual system's ability to preserve the perceived colour of objects despite changes in the illumination. Its exact mechanisms are unknown, although a number of systems ranging from retinal to cortical and memory are thought to play important roles. The strength of the perceptual shift necessary to preserve these colours is usually estimated by the vectorial distances from an ideal match (or canonical illuminant). In this work we explore how much of the colour constancy phenomenon could be explained by well-known physiological properties of V1 and V2 neurons whose receptive fields (RF) vary according to the contrast and orientation of surround stimuli. Indeed, it has been shown that both RF size and the normalization occurring between centre and surround in cortical neurons depend on the local properties of surrounding stimuli. Our stating point is the construction of a computational model which includes this dynamical centre-surround adaptation by means of two overlapping asymmetric Gaussian kernels whose variances are adjusted to the contrast of surrounding pixels to represent the changes in RF size of cortical neurons and the weights of their respective contributions are altered according to differences in centre-surround contrast and orientation. The final output of the model is obtained after convolving an image with this dynamical operator and an estimation of the illuminant is obtained by considering the contrast of the far surround. We tested our algorithm on naturalistic stimuli from several benchmark datasets. Our results show that although our model does not require any training, its performance against the state-of-the-art is highly competitive, even outperforming learning-based algorithms in some cases. Indeed, these results are very encouraging if we consider that they were obtained with the same parameters for all datasets (i.e. just like the human visual system operates). |
|
|
Address |
Florida; USA; May 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
VSS |
|
|
Notes |
NEUROBIT |
Approved |
no |
|
|
Call Number |
Admin @ si @ PaA2016b |
Serial |
2901 |
|
Permanent link to this record |
|
|
|
|
Author |
Arash Akbarinia; C. Alejandro Parraga; Marta Exposito; Bogdan Raducanu; Xavier Otazu |
|
|
Title |
Can biological solutions help computers detect symmetry? |
Type |
Conference Article |
|
Year |
2017 |
Publication |
40th European Conference on Visual Perception |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Berlin; Germany; August 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECVP |
|
|
Notes |
NEUROBIT |
Approved |
no |
|
|
Call Number |
Admin @ si @ APE2017 |
Serial |
2995 |
|
Permanent link to this record |
|
|
|
|
Author |
Arash Akbarinia |
|
|
Title |
Computational Model of Visual Perception: From Colour to Form |
Type |
Book Whole |
|
Year |
2017 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
The original idea of this project was to study the role of colour in the challenging task of object recognition. We started by extending previous research on colour naming showing that it is feasible to capture colour terms through parsimonious ellipsoids. Although, the results of our model exceeded state-of-the-art in two benchmark datasets, we realised that the two phenomena of metameric lights and colour constancy must be addressed prior to any further colour processing. Our investigation of metameric pairs reached the conclusion that they are infrequent in real world scenarios. Contrary to that, the illumination of a scene often changes dramatically. We addressed this issue by proposing a colour constancy model inspired by the dynamical centre-surround adaptation of neurons in the visual cortex. This was implemented through two overlapping asymmetric Gaussians whose variances and heights are adjusted according to the local contrast of pixels. We complemented this model with a generic contrast-variant pooling mechanism that inversely connect the percentage of pooled signal to the local contrast of a region. The results of our experiments on four benchmark datasets were indeed promising: the proposed model, although simple, outperformed even learning-based approaches in many cases. Encouraged by the success of our contrast-variant surround modulation, we extended this approach to detect boundaries of objects. We proposed an edge detection model based on the first derivative of the Gaussian kernel. We incorporated four types of surround: full, far, iso- and orthogonal-orientation. Furthermore, we accounted for the pooling mechanism at higher cortical areas and the shape feedback sent to lower areas. Our results in three benchmark datasets showed significant improvement over non-learning algorithms.
To summarise, we demonstrated that biologically-inspired models offer promising solutions to computer vision problems, such as, colour naming, colour constancy and edge detection. We believe that the greatest contribution of this Ph.D dissertation is modelling the concept of dynamic surround modulation that shows the significance of contrast-variant surround integration. The models proposed here are grounded on only a portion of what we know about the human visual system. Therefore, it is only natural to complement them accordingly in future works. |
|
|
Address |
October 2017 |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
C. Alejandro Parraga |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-84-945373-4-9 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
NEUROBIT |
Approved |
no |
|
|
Call Number |
Admin @ si @ Akb2017 |
Serial |
3019 |
|
Permanent link to this record |
|
|
|
|
Author |
Xim Cerda-Company |
|
|
Title |
Understanding color vision: from psychophysics to computational modeling |
Type |
Book Whole |
|
Year |
2019 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
In this PhD we have approached the human color vision from two different points of view: psychophysics and computational modeling. First, we have evaluated 15 different tone-mapping operators (TMOs). We have conducted two experiments that
consider two different criteria: the first one evaluates the local relationships among intensity levels and the second one evaluates the global appearance of the tonemapped imagesw.r.t. the physical one (presented side by side). We conclude that the rankings depend on the criterion and they are not correlated. Considering both criteria, the best TMOs are KimKautz (Kim and Kautz, 2008) and Krawczyk (Krawczyk, Myszkowski, and Seidel, 2005). Another conclusion is that a more standardized evaluation criteria is needed to do a fair comparison among TMOs.
Secondly, we have conducted several psychophysical experiments to study the
color induction. We have studied two different properties of the visual stimuli: temporal frequency and luminance spatial distribution. To study the temporal frequency we defined equiluminant stimuli composed by both uniform and striped surrounds and we flashed them varying the flash duration. For uniform surrounds, the results show that color induction depends on both the flash duration and inducer’s chromaticity. As expected, in all chromatic conditions color contrast was induced. In contrast, for striped surrounds, we expected to induce color assimilation, but we observed color contrast or no induction. Since similar but not equiluminant striped stimuli induce color assimilation, we concluded that luminance differences could be a key factor to induce color assimilation. Thus, in a subsequent study, we have studied the luminance differences’ effect on color assimilation. We varied the luminance difference between the target region and its inducers and we observed that color assimilation depends on both this difference and the inducer’s chromaticity. For red-green condition (where the first inducer is red and the second one is green), color assimilation occurs in almost all luminance conditions.
Instead, for green-red condition, color assimilation never occurs. Purple-lime
and lime-purple chromatic conditions show that luminance difference is a key factor to induce color assimilation. When the target is darker than its surround, color assimilation is stronger in purple-lime, while when the target is brighter, color assimilation is stronger in lime-purple (’mirroring’ effect). Moreover, we evaluated whether color assimilation is due to luminance or brightness differences. Similarly to equiluminance condition, when the stimuli are equibrightness no color assimilation is induced. Our results support the hypothesis that mutual-inhibition plays a major role in color perception, or at least in color induction.
Finally, we have defined a new firing rate model of color processing in the V1
parvocellular pathway. We have modeled two different layers of this cortical area: layers 4Cb and 2/3. Our model is a recurrent dynamic computational model that considers both excitatory and inhibitory cells and their lateral connections. Moreover, it considers the existent laminar differences and the cells’ variety. Thus, we have modeled both single- and double-opponent simple cells and complex cells, which are a pool of double-opponent simple cells. A set of sinusoidal drifting gratings have been used to test the architecture. In these gratings we have varied several spatial properties such as temporal and spatial frequencies, grating’s area and orientation. To reproduce the electrophysiological observations, the architecture has to consider the existence of non-oriented double-opponent cells in layer 4Cb and the lack of lateral connections between single-opponent cells. Moreover, we have tested our lateral connections simulating the center-surround modulation and we have reproduced physiological measurements where for high contrast stimulus, the
result of the lateral connections is inhibitory, while it is facilitatory for low contrast stimulus. |
|
|
Address |
March 2019 |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Xavier Otazu |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-84-948531-4-2 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
NEUROBIT |
Approved |
no |
|
|
Call Number |
Admin @ si @ Cer2019 |
Serial |
3259 |
|
Permanent link to this record |
|
|
|
|
Author |
David Berga; Xavier Otazu |
|
|
Title |
Computations of top-down attention by modulating V1 dynamics |
Type |
Conference Article |
|
Year |
2020 |
Publication |
Computational and Mathematical Models in Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
St. Pete Beach; Florida; May 2020 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MODVIS |
|
|
Notes |
NEUROBIT |
Approved |
no |
|
|
Call Number |
Admin @ si @ BeO2020a |
Serial |
3376 |
|
Permanent link to this record |
|
|
|
|
Author |
David Berga |
|
|
Title |
Understanding Eye Movements: Psychophysics and a Model of Primary Visual Cortex |
Type |
Book Whole |
|
Year |
2019 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Humansmove their eyes in order to learn visual representations of the world. These eye movements depend on distinct factors, either by the scene that we perceive or by our own decisions. To select what is relevant to attend is part of our survival mechanisms and the way we build reality, as we constantly react both consciously and unconsciously to all the stimuli that is projected into our eyes. In this thesis we try to explain (1) how we move our eyes, (2) how to build machines that understand visual information and deploy eyemovements, and (3) how to make these machines understand tasks in order to decide for eye movements.
(1) We provided the analysis of eye movement behavior elicited by low-level feature distinctiveness with a dataset of 230 synthetically-generated image patterns. A total of 15 types of stimuli has been generated (e.g. orientation, brightness, color, size, etc.), with 7 feature contrasts for each feature category. Eye-tracking data was collected from 34 participants during the viewing of the dataset, using Free-Viewing and Visual Search task instructions. Results showed that saliency is predominantly and distinctively influenced by: 1. feature type, 2. feature contrast, 3. Temporality of fixations, 4. task difficulty and 5. center bias. From such dataset (SID4VAM), we have computed a benchmark of saliency models by testing performance using psychophysical patterns. Model performance has been evaluated considering model inspiration and consistency with human psychophysics. Our study reveals that state-of-the-art Deep Learning saliency models do not performwell with synthetic pattern images, instead, modelswith Spectral/Fourier inspiration outperform others in saliency metrics and are more consistent with human psychophysical experimentation.
(2) Computations in the primary visual cortex (area V1 or striate cortex) have long been hypothesized to be responsible, among several visual processing mechanisms, of bottom-up visual attention (also named saliency). In order to validate this hypothesis, images from eye tracking datasets have been processed with a biologically plausible model of V1 (named Neurodynamic SaliencyWaveletModel or NSWAM). Following Li’s neurodynamic model, we define V1’s lateral connections with a network of firing rate neurons, sensitive to visual features such as brightness, color, orientation and scale. Early subcortical processes (i.e. retinal and thalamic) are functionally simulated. The resulting saliency maps are generated from the model output, representing the neuronal activity of V1 projections towards brain areas involved in eye movement control. We want to pinpoint that our unified computational architecture is able to reproduce several visual processes (i.e. brightness, chromatic induction and visual discomfort) without applying any type of training or optimization and keeping the same parametrization. The model has been extended (NSWAM-CM) with an implementation of the cortical magnification function to define the retinotopical projections towards V1, processing neuronal activity for each distinct view during scene observation. Novel computational definitions of top-down inhibition (in terms of inhibition of return and selection mechanisms), are also proposed to predict attention in Free-Viewing and Visual Search conditions. Results show that our model outperforms other biologically-inpired models of saliency prediction as well as to predict visual saccade sequences, specifically for nature and synthetic images. We also show how temporal and spatial characteristics of inhibition of return can improve prediction of saccades, as well as how distinct search strategies (in terms of feature-selective or category-specific inhibition) predict attention at distinct image contexts.
(3) Although previous scanpath models have been able to efficiently predict saccades during Free-Viewing, it is well known that stimulus and task instructions can strongly affect eye movement patterns. In particular, task priming has been shown to be crucial to the deployment of eye movements, involving interactions between brain areas related to goal-directed behavior, working and long-termmemory in combination with stimulus-driven eyemovement neuronal correlates. In our latest study we proposed an extension of the Selective Tuning Attentive Reference Fixation ControllerModel based on task demands (STAR-FCT), describing novel computational definitions of Long-TermMemory, Visual Task Executive and Task Working Memory. With these modules we are able to use textual instructions in order to guide the model to attend to specific categories of objects and/or places in the scene. We have designed our memorymodel by processing a visual hierarchy of low- and high-level features. The relationship between the executive task instructions and the memory representations has been specified using a tree of semantic similarities between the learned features and the object category labels. Results reveal that by using this model, the resulting object localizationmaps and predicted saccades have a higher probability to fall inside the salient regions depending on the distinct task instructions compared to saliency. |
|
|
Address |
July 2019 |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Xavier Otazu |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-84-948531-8-0 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
NEUROBIT |
Approved |
no |
|
|
Call Number |
Admin @ si @ Ber2019 |
Serial |
3390 |
|
Permanent link to this record |
|
|
|
|
Author |
David Berga; Xavier Otazu |
|
|
Title |
Modeling Bottom-Up and Top-Down Attention with a Neurodynamic Model of V1 |
Type |
Journal Article |
|
Year |
2020 |
Publication |
Neurocomputing |
Abbreviated Journal |
NEUCOM |
|
|
Volume |
417 |
Issue |
|
Pages |
270-289 |
|
|
Keywords |
|
|
|
Abstract |
Previous studies suggested that lateral interactions of V1 cells are responsible, among other visual effects, of bottom-up visual attention (alternatively named visual salience or saliency). Our objective is to mimic these connections with a neurodynamic network of firing-rate neurons in order to predict visual attention. Early visual subcortical processes (i.e. retinal and thalamic) are functionally simulated. An implementation of the cortical magnification function is included to define the retinotopical projections towards V1, processing neuronal activity for each distinct view during scene observation. Novel computational definitions of top-down inhibition (in terms of inhibition of return, oculomotor and selection mechanisms), are also proposed to predict attention in Free-Viewing and Visual Search tasks. Results show that our model outpeforms other biologically inspired models of saliency prediction while predicting visual saccade sequences with the same model. We also show how temporal and spatial characteristics of saccade amplitude and inhibition of return can improve prediction of saccades, as well as how distinct search strategies (in terms of feature-selective or category-specific inhibition) can predict attention at distinct image contexts. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
NEUROBIT |
Approved |
no |
|
|
Call Number |
Admin @ si @ BeO2020c |
Serial |
3444 |
|
Permanent link to this record |