toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Michael Teutsch; Angel Sappa; Riad I. Hammoud edit  url
isbn  openurl
  Title Computer Vision in the Infrared Spectrum: Challenges and Approaches Type Book Whole
  Year 2021 Publication Synthesis Lectures on Computer Vision Abbreviated Journal  
  Volume 10 Issue 2 Pages 1-138  
  Keywords  
  Abstract Human visual perception is limited to the visual-optical spectrum. Machine vision is not. Cameras sensitive to the different infrared spectra can enhance the abilities of autonomous systems and visually perceive the environment in a holistic way. Relevant scene content can be made visible especially in situations, where sensors of other modalities face issues like a visual-optical camera that needs a source of illumination. As a consequence, not only human mistakes can be avoided by increasing the level of automation, but also machine-induced errors can be reduced that, for example, could make a self-driving car crash into a pedestrian under difficult illumination conditions. Furthermore, multi-spectral sensor systems with infrared imagery as one modality are a rich source of information and can provably increase the robustness of many autonomous systems. Applications that can benefit from utilizing infrared imagery range from robotics to automotive and from biometrics to surveillance. In this book, we provide a brief yet concise introduction to the current state-of-the-art of computer vision and machine learning in the infrared spectrum. Based on various popular computer vision tasks such as image enhancement, object detection, or object tracking, we first motivate each task starting from established literature in the visual-optical spectrum. Then, we discuss the differences between processing images and videos in the visual-optical spectrum and the various infrared spectra. An overview of the current literature is provided together with an outlook for each task. Furthermore, available and annotated public datasets and common evaluation methods and metrics are presented. In a separate chapter, popular applications that can greatly benefit from the use of infrared imagery as a data source are presented and discussed. Among them are automatic target recognition, video surveillance, or biometrics including face recognition. Finally, we conclude with recommendations for well-fitting sensor setups and data processing algorithms for certain computer vision tasks. We address this book to prospective researchers and engineers new to the field but also to anyone who wants to get introduced to the challenges and the approaches of computer vision using infrared images or videos. Readers will be able to start their work directly after reading the book supported by a highly comprehensive backlog of recent and relevant literature as well as related infrared datasets including existing evaluation frameworks. Together with consistently decreasing costs for infrared cameras, new fields of application appear and make computer vision in the infrared spectrum a great opportunity to face nowadays scientific and engineering challenges.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1636392431 Medium  
  Area Expedition Conference  
  Notes (up) MSIAU Approved no  
  Call Number Admin @ si @ TSH2021 Serial 3666  
Permanent link to this record
 

 
Author Armin Mehri edit  isbn
openurl 
  Title Deep learning based architectures for cross-domain image processing Type Book Whole
  Year 2023 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Human vision is restricted to the visual-optical spectrum. Machine vision is not.
Cameras sensitive to diverse infrared spectral bands can improve the capacities of
autonomous systems and provide a comprehensive view. Relevant scene content
can be made visible, particularly in situations when sensors of other modalities,
such as a visual-optical camera, require a source of illumination. As a result, increasing the level of automation not only avoids human errors but also reduces
machine-induced errors. Furthermore, multi-spectral sensor systems with infrared
imagery as one modality are a rich source of information and can conceivably
increase the robustness of many autonomous systems. Robotics, automobiles,
biometrics, security, surveillance, and the military are some examples of fields
that can profit from the use of infrared imagery in their respective applications.
Although multimodal spectral sensors have come a long way, there are still several
bottlenecks that prevent us from combining their output information and using
them as comprehensive images. The primary issue with infrared imaging is the lack
of potential benefits due to their cost influence on sensor resolution, which grows
exponentially with greater resolution. Due to the more costly sensor technology
required for their development, their resolutions are substantially lower than thoseof regular digital cameras.
This thesis aims to improve beyond-visible-spectrum machine vision by integrating multi-modal spectral sensors. The emphasis is on transforming the produced images to enhance their resolution to match expected human perception, bring the color representation close to human understanding of natural color, and improve machine vision application performance. This research focuses mainly on two tasks, image Colorization and Image Super resolution for both single- and cross-domain problems. We first start with an extensive review of the state of the art in both tasks, point out the shortcomings of existing approaches, and then present our solutions to address their limitations. Our solutions demonstrate that low-cost channel information (i.e., visible image) can be used to improve expensive channel
information (i.e., infrared image), resulting in images with higher quality and closer to human perception at a lower cost than a high-cost infrared camera.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher IMPRIMA Place of Publication Editor Angel Sappa  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-126409-1-5 Medium  
  Area Expedition Conference  
  Notes (up) MSIAU Approved no  
  Call Number Admin @ si @ Meh2023 Serial 3959  
Permanent link to this record
 

 
Author Xavier Soria edit  isbn
openurl 
  Title Single sensor multi-spectral imaging Type Book Whole
  Year 2019 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The image sensor, nowadays, is rolling the smartphone industry. While some phone brands explore equipping more image sensors, others, like Google, maintain their smartphones with just one sensor; but this sensor is equipped with Deep Learning to enhance the image quality. However, what all brands agree on is the need to research new image sensors; for instance, in 2015 Omnivision and PixelTeq presented new CMOS based image sensors defined as multispectral Single Sensor Camera (SSC), which are capable of capturing multispectral bands. This dissertation presents the benefits of using a multispectral SSCs that, as aforementioned, simultaneously acquires images in the visible and near-infrared (NIR) bands. The principal benefits while addressing problems related to image bands in the spectral range of 400 to 1100 nanometers, there are cost reductions in the hardware and software setup because only one SSC is needed instead of two, and the images alignment are not required any more. Concerning to the NIR spectrum, many works in literature have proven the benefits of working with NIR to enhance RGB images (e.g., image enhancement, remove shadows, dehazing, etc.). In spite of the advantage of using SSC (e.g., low latency), there are some drawback to be solved. One of this drawback corresponds to the nature of the silicon-based sensor, which in addition to capture the RGB image, when the infrared cut off filter is not installed it also acquires NIR information into the visible image. This phenomenon is called RGB and NIR crosstalking. This thesis firstly faces this problem in challenging images and then it shows the benefit of using multispectral images in the edge detection task.
The RGB color restoration from RGBN image is the topic tackled in RGB and NIR crosstalking. Even though in the literature a set of processes have been proposed to face this issue, in this thesis novel approaches, based on DL, are proposed to subtract the additional NIR included in the RGB channel. More precisely, an Artificial Neural Network (NN) and two Convolutional Neural Network (CNN) models are proposed. As the DL based models need a dataset with a large collection of image pairs, a large dataset is collected to address the color restoration. The collected images are from challenging scenes where the sunlight radiation is sufficient to give absorption/reflectance properties to the considered scenes. An extensive evaluation has been conducted on the CNN models, differences from most of the restored images are almost imperceptible to the human eye. The next proposal of the thesis is the validation of the usage of SSC images in the edge detection task. Three methods based on CNN have been proposed. While the first one is based on the most used model, holistically-nested edge detection (HED) termed as multispectral HED (MS-HED), the other two have been proposed observing the drawbacks of MS-HED. These two novel architectures have been designed from scratch (training from scratch); after the first architecture is validated in the visible domain a slight redesign is proposed to tackle the multispectral domain. Again, another dataset is collected to face this problem with SSCs. Even though edge detection is confronted in the multispectral domain, its qualitative and quantitative evaluation demonstrates the generalization in other datasets used for edge detection, improving state-of-the-art results.
 
  Address September 2019  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Angel Sappa  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-948531-9-7 Medium  
  Area Expedition Conference  
  Notes (up) MSIAU; 600.122 Approved no  
  Call Number Admin @ si @ Sor2019 Serial 3391  
Permanent link to this record
 

 
Author Angel Sappa (ed) edit  doi
isbn  openurl
  Title ICT Applications for Smart Cities Type Book Whole
  Year 2022 Publication ICT Applications for Smart Cities Abbreviated Journal  
  Volume 224 Issue Pages  
  Keywords Computational Intelligence; Intelligent Systems; Smart Cities; ICT Applications; Machine Learning; Pattern Recognition; Computer Vision; Image Processing  
  Abstract Part of the book series: Intelligent Systems Reference Library (ISRL)

This book is the result of four-year work in the framework of the Ibero-American Research Network TICs4CI funded by the CYTED program. In the following decades, 85% of the world's population is expected to live in cities; hence, urban centers should be prepared to provide smart solutions for problems ranging from video surveillance and intelligent mobility to the solid waste recycling processes, just to mention a few. More specifically, the book describes underlying technologies and practical implementations of several successful case studies of ICTs developed in the following smart city areas:

• Urban environment monitoring
• Intelligent mobility
• Waste recycling processes
• Video surveillance
• Computer-aided diagnose in healthcare systems
• Computer vision-based approaches for efficiency in production processes

The book is intended for researchers and engineers in the field of ICTs for smart cities, as well as to anyone who wants to know about state-of-the-art approaches and challenges on this field.
 
  Address September 2022  
  Corporate Author Thesis  
  Publisher Springer Place of Publication Editor Angel Sappa  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title ISRL  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-031-06306-0 Medium  
  Area Expedition Conference  
  Notes (up) MSIAU; MACO Approved no  
  Call Number Admin @ si @ Sap2022 Serial 3812  
Permanent link to this record
 

 
Author Jorge Bernal edit  openurl
  Title Polyp Localization and Segmentation in Colonoscopy Images by Means of a Model of Appearance for Polyps Type Book Whole
  Year 2012 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Colorectal cancer is the fourth most common cause of cancer death worldwide and its survival rate depends on the stage in which it is detected on hence the necessity for an early colon screening. There are several screening techniques but colonoscopy is still nowadays the gold standard, although it has some drawbacks such as the miss rate. Our contribution, in the field of intelligent systems for colonoscopy, aims at providing a polyp localization and a polyp segmentation system based on a model of appearance for polyps. To develop both methods we define a model of appearance for polyps, which describes a polyp as enclosed by intensity valleys. The novelty of our contribution resides on the fact that we include in our model aspects of the image formation and we also consider the presence of other elements from the endoluminal scene such as specular highlights and blood vessels, which have an impact on the performance of our methods. In order to develop our polyp localization method we accumulate valley information in order to generate energy maps, which are also used to guide the polyp segmentation. Our methods achieve promising results in polyp localization and segmentation. As we want to explore the usability of our methods we present a comparative analysis between physicians fixations obtained via an eye tracking device and our polyp localization method. The results show that our method is indistinguishable to novice physicians although it is far from expert physicians.  
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor F. Javier Sanchez;Fernando Vilariño  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area 800 Expedition Conference  
  Notes (up) MV Approved no  
  Call Number Admin @ si @ Ber2012 Serial 2211  
Permanent link to this record
 

 
Author Joan M. Nuñez edit  isbn
openurl 
  Title Vascular Pattern Characterization in Colonoscopy Images Type Book Whole
  Year 2015 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Colorectal cancer is the third most common cancer worldwide and the second most common malignant tumor in Europe. Screening tests have shown to be very e ective in increasing the survival rates since they allow an early detection of polyps. Among the di erent screening techniques, colonoscopy is considered the gold standard although clinical studies mention several problems that have an impact in the quality of the procedure. The navigation through the rectum and colon track can be challenging for the physicians which can increase polyp miss rates. The thorough visualization of the colon track must be ensured so that
the chances of missing lesions are minimized. The visual analysis of colonoscopy images can provide important information to the physicians and support their navigation during the procedure.
Blood vessels and their branching patterns can provide descriptive power to potentially develop biometric markers. Anatomical markers based on blood vessel patterns could be used to identify a particular scene in colonoscopy videos and to support endoscope navigation by generating a sequence of ordered scenes through the di erent colon sections. By verifying the presence of vascular content in the endoluminal scene it is also possible to certify a proper
inspection of the colon mucosa and to improve polyp localization. Considering the potential uses of blood vessel description, this contribution studies the characterization of the vascular content and the analysis of the descriptive power of its branching patterns.
Blood vessel characterization in colonoscopy images is shown to be a challenging task. The endoluminal scene is conformed by several elements whose similar characteristics hinder the development of particular models for each of them. To overcome such diculties we propose the use of the blood vessel branching characteristics as key features for pattern description. We present a model to characterize junctions in binary patterns. The implementation
of the junction model allows us to develop a junction localization method. We
created two data sets including manually labeled vessel information as well as manual ground truths of two types of keypoint landmarks: junctions and endpoints. The proposed method outperforms the available algorithms in the literature in experiments in both, our newly created colon vessel data set, and in DRIVE retinal fundus image data set. In the latter case, we created a manual ground truth of junction coordinates. Since we want to explore the descriptive potential of junctions and vessels, we propose a graph-based approach to
create anatomical markers. In the context of polyp localization, we present a new method to inhibit the in uence of blood vessels in the extraction valley-pro le information. The results show that our methodology decreases vessel in
uence, increases polyp information and leads to an improvement in state-of-the-art polyp localization performance. We also propose a polyp-speci c segmentation method that outperforms other general and speci c approaches.
 
  Address November 2015  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Fernando Vilariño  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-943427-6-9 Medium  
  Area Expedition Conference  
  Notes (up) MV Approved no  
  Call Number Admin @ si @ Nuñ2015 Serial 2709  
Permanent link to this record
 

 
Author Onur Ferhat edit  isbn
openurl 
  Title Analysis of Head-Pose Invariant, Natural Light Gaze Estimation Methods Type Book Whole
  Year 2017 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Eye tracker devices have traditionally been only used inside laboratories, requiring trained professionals and elaborate setup mechanisms. However, in the recent years the scientific work on easier–to–use eye trackers which require no special hardware—other than the omnipresent front facing cameras in computers, tablets, and mobiles—is aiming at making this technology common–place. These types of trackers have several extra challenges that make the problem harder, such as low resolution images provided by a regular webcam, the changing ambient lighting conditions, personal appearance differences, changes in head pose, and so on. Recent research in the field has focused on all these challenges in order to provide better gaze estimation performances in a real world setup.

In this work, we aim at tackling the gaze tracking problem in a single camera setup. We first analyze all the previous work in the field, identifying the strengths and weaknesses of each tried idea. We start our work on the gaze tracker with an appearance–based gaze estimation method, which is the simplest idea that creates a direct mapping between a rectangular image patch extracted around the eye in a camera image, and the gaze point (or gaze direction). Here, we do an extensive analysis of the factors that affect the performance of this tracker in several experimental setups, in order to address these problems in future works. In the second part of our work, we propose a feature–based gaze estimation method, which encodes the eye region image into a compact representation. We argue that this type of representation is better suited to dealing with head pose and lighting condition changes, as it both reduces the dimensionality of the input (i.e. eye image) and breaks the direct connection between image pixel intensities and the gaze estimation. Lastly, we use a face alignment algorithm to have robust face pose estimation, using a 3D model customized to the subject using the tracker. We combine this with a convolutional neural network trained on a large dataset of images to build a face pose invariant gaze tracker.
 
  Address September 2017  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Fernando Vilariño  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-945373-5-6 Medium  
  Area Expedition Conference  
  Notes (up) MV Approved no  
  Call Number Admin @ si @ Fer2017 Serial 3018  
Permanent link to this record
 

 
Author Fernando Vilariño edit  openurl
  Title 3D Scanning of Capitals at Library Living Lab Type Book Whole
  Year 2019 Publication “Living Lab Projects 2019”. ENoLL. Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (up) MV; DAG; 600.140; 600.121;SIAI Approved no  
  Call Number Admin @ si @ Vil2019c Serial 3463  
Permanent link to this record
 

 
Author Fernando Vilariño edit   pdf
openurl 
  Title A Machine Learning Approach for Intestinal Motility Assessment with Capsule Endoscopy Type Book Whole
  Year 2006 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Intestinal motility assessment with video capsule endoscopy arises as a novel and challenging clinical fieldwork. This technique is based on the analysis of the patterns of intestinal contractions obtained by labelling all the motility events present in a video provided by a capsule with a wireless micro-camera, which is ingested by the patient. However, the visual analysis of these video sequences presents several im- portant drawbacks, mainly related to both the large amount of time needed for the visualization process, and the low prevalence of intestinal contractions in video.
In this work we propose a machine learning system to automatically detect the intestinal contractions in video capsule endoscopy, driving a very useful but not fea- sible clinical routine into a feasible clinical procedure. Our proposal is divided into two different parts: The first part tackles the problem of the automatic detection of phasic contractions in capsule endoscopy videos. Phasic contractions are dynamic events spanning about 4-5 seconds, which show visual patterns with a high variability. Our proposal is based on a sequential design which involves the analysis of textural, color and blob features with powerful classifiers such as SVM. This approach appears to cope with two basic aims: the reduction of the imbalance rate of the data set, and the modular construction of the system, which adds the capability of including domain knowledge as new stages in the cascade. The second part of the current work tackles the problem of the automatic detection of tonic contractions. Tonic contrac- tions manifest in capsule endoscopy as a sustained pattern of the folds and wrinkles of the intestine, which may be prolonged for an undetermined span of time. Our proposal is based on the analysis of the wrinkle patterns, presenting a comparative study of diverse features and classification methods, and providing a set of appro- priate descriptors for their characterization. We provide a detailed analysis of the performance achieved by our system both in a qualitative and a quantitative way.
 
  Address CVC (UAB)  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Place of Publication Editor Petia Radeva  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue 84-933652-7-0 Edition  
  ISSN ISBN Medium  
  Area 800 Expedition Conference  
  Notes (up) MV;SIAI Approved no  
  Call Number Admin @ si @ Vil2006; IAM @ iam @ Vil2006 Serial 738  
Permanent link to this record
 

 
Author Arash Akbarinia edit  isbn
openurl 
  Title Computational Model of Visual Perception: From Colour to Form Type Book Whole
  Year 2017 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The original idea of this project was to study the role of colour in the challenging task of object recognition. We started by extending previous research on colour naming showing that it is feasible to capture colour terms through parsimonious ellipsoids. Although, the results of our model exceeded state-of-the-art in two benchmark datasets, we realised that the two phenomena of metameric lights and colour constancy must be addressed prior to any further colour processing. Our investigation of metameric pairs reached the conclusion that they are infrequent in real world scenarios. Contrary to that, the illumination of a scene often changes dramatically. We addressed this issue by proposing a colour constancy model inspired by the dynamical centre-surround adaptation of neurons in the visual cortex. This was implemented through two overlapping asymmetric Gaussians whose variances and heights are adjusted according to the local contrast of pixels. We complemented this model with a generic contrast-variant pooling mechanism that inversely connect the percentage of pooled signal to the local contrast of a region. The results of our experiments on four benchmark datasets were indeed promising: the proposed model, although simple, outperformed even learning-based approaches in many cases. Encouraged by the success of our contrast-variant surround modulation, we extended this approach to detect boundaries of objects. We proposed an edge detection model based on the first derivative of the Gaussian kernel. We incorporated four types of surround: full, far, iso- and orthogonal-orientation. Furthermore, we accounted for the pooling mechanism at higher cortical areas and the shape feedback sent to lower areas. Our results in three benchmark datasets showed significant improvement over non-learning algorithms.
To summarise, we demonstrated that biologically-inspired models offer promising solutions to computer vision problems, such as, colour naming, colour constancy and edge detection. We believe that the greatest contribution of this Ph.D dissertation is modelling the concept of dynamic surround modulation that shows the significance of contrast-variant surround integration. The models proposed here are grounded on only a portion of what we know about the human visual system. Therefore, it is only natural to complement them accordingly in future works.
 
  Address October 2017  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor C. Alejandro Parraga  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-945373-4-9 Medium  
  Area Expedition Conference  
  Notes (up) NEUROBIT Approved no  
  Call Number Admin @ si @ Akb2017 Serial 3019  
Permanent link to this record
 

 
Author Xim Cerda-Company edit  isbn
openurl 
  Title Understanding color vision: from psychophysics to computational modeling Type Book Whole
  Year 2019 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract In this PhD we have approached the human color vision from two different points of view: psychophysics and computational modeling. First, we have evaluated 15 different tone-mapping operators (TMOs). We have conducted two experiments that
consider two different criteria: the first one evaluates the local relationships among intensity levels and the second one evaluates the global appearance of the tonemapped imagesw.r.t. the physical one (presented side by side). We conclude that the rankings depend on the criterion and they are not correlated. Considering both criteria, the best TMOs are KimKautz (Kim and Kautz, 2008) and Krawczyk (Krawczyk, Myszkowski, and Seidel, 2005). Another conclusion is that a more standardized evaluation criteria is needed to do a fair comparison among TMOs.
Secondly, we have conducted several psychophysical experiments to study the
color induction. We have studied two different properties of the visual stimuli: temporal frequency and luminance spatial distribution. To study the temporal frequency we defined equiluminant stimuli composed by both uniform and striped surrounds and we flashed them varying the flash duration. For uniform surrounds, the results show that color induction depends on both the flash duration and inducer’s chromaticity. As expected, in all chromatic conditions color contrast was induced. In contrast, for striped surrounds, we expected to induce color assimilation, but we observed color contrast or no induction. Since similar but not equiluminant striped stimuli induce color assimilation, we concluded that luminance differences could be a key factor to induce color assimilation. Thus, in a subsequent study, we have studied the luminance differences’ effect on color assimilation. We varied the luminance difference between the target region and its inducers and we observed that color assimilation depends on both this difference and the inducer’s chromaticity. For red-green condition (where the first inducer is red and the second one is green), color assimilation occurs in almost all luminance conditions.
Instead, for green-red condition, color assimilation never occurs. Purple-lime
and lime-purple chromatic conditions show that luminance difference is a key factor to induce color assimilation. When the target is darker than its surround, color assimilation is stronger in purple-lime, while when the target is brighter, color assimilation is stronger in lime-purple (’mirroring’ effect). Moreover, we evaluated whether color assimilation is due to luminance or brightness differences. Similarly to equiluminance condition, when the stimuli are equibrightness no color assimilation is induced. Our results support the hypothesis that mutual-inhibition plays a major role in color perception, or at least in color induction.
Finally, we have defined a new firing rate model of color processing in the V1
parvocellular pathway. We have modeled two different layers of this cortical area: layers 4Cb and 2/3. Our model is a recurrent dynamic computational model that considers both excitatory and inhibitory cells and their lateral connections. Moreover, it considers the existent laminar differences and the cells’ variety. Thus, we have modeled both single- and double-opponent simple cells and complex cells, which are a pool of double-opponent simple cells. A set of sinusoidal drifting gratings have been used to test the architecture. In these gratings we have varied several spatial properties such as temporal and spatial frequencies, grating’s area and orientation. To reproduce the electrophysiological observations, the architecture has to consider the existence of non-oriented double-opponent cells in layer 4Cb and the lack of lateral connections between single-opponent cells. Moreover, we have tested our lateral connections simulating the center-surround modulation and we have reproduced physiological measurements where for high contrast stimulus, the
result of the lateral connections is inhibitory, while it is facilitatory for low contrast stimulus.
 
  Address March 2019  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Xavier Otazu  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-948531-4-2 Medium  
  Area Expedition Conference  
  Notes (up) NEUROBIT Approved no  
  Call Number Admin @ si @ Cer2019 Serial 3259  
Permanent link to this record
 

 
Author David Berga edit  isbn
openurl 
  Title Understanding Eye Movements: Psychophysics and a Model of Primary Visual Cortex Type Book Whole
  Year 2019 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Humansmove their eyes in order to learn visual representations of the world. These eye movements depend on distinct factors, either by the scene that we perceive or by our own decisions. To select what is relevant to attend is part of our survival mechanisms and the way we build reality, as we constantly react both consciously and unconsciously to all the stimuli that is projected into our eyes. In this thesis we try to explain (1) how we move our eyes, (2) how to build machines that understand visual information and deploy eyemovements, and (3) how to make these machines understand tasks in order to decide for eye movements.
(1) We provided the analysis of eye movement behavior elicited by low-level feature distinctiveness with a dataset of 230 synthetically-generated image patterns. A total of 15 types of stimuli has been generated (e.g. orientation, brightness, color, size, etc.), with 7 feature contrasts for each feature category. Eye-tracking data was collected from 34 participants during the viewing of the dataset, using Free-Viewing and Visual Search task instructions. Results showed that saliency is predominantly and distinctively influenced by: 1. feature type, 2. feature contrast, 3. Temporality of fixations, 4. task difficulty and 5. center bias. From such dataset (SID4VAM), we have computed a benchmark of saliency models by testing performance using psychophysical patterns. Model performance has been evaluated considering model inspiration and consistency with human psychophysics. Our study reveals that state-of-the-art Deep Learning saliency models do not performwell with synthetic pattern images, instead, modelswith Spectral/Fourier inspiration outperform others in saliency metrics and are more consistent with human psychophysical experimentation.
(2) Computations in the primary visual cortex (area V1 or striate cortex) have long been hypothesized to be responsible, among several visual processing mechanisms, of bottom-up visual attention (also named saliency). In order to validate this hypothesis, images from eye tracking datasets have been processed with a biologically plausible model of V1 (named Neurodynamic SaliencyWaveletModel or NSWAM). Following Li’s neurodynamic model, we define V1’s lateral connections with a network of firing rate neurons, sensitive to visual features such as brightness, color, orientation and scale. Early subcortical processes (i.e. retinal and thalamic) are functionally simulated. The resulting saliency maps are generated from the model output, representing the neuronal activity of V1 projections towards brain areas involved in eye movement control. We want to pinpoint that our unified computational architecture is able to reproduce several visual processes (i.e. brightness, chromatic induction and visual discomfort) without applying any type of training or optimization and keeping the same parametrization. The model has been extended (NSWAM-CM) with an implementation of the cortical magnification function to define the retinotopical projections towards V1, processing neuronal activity for each distinct view during scene observation. Novel computational definitions of top-down inhibition (in terms of inhibition of return and selection mechanisms), are also proposed to predict attention in Free-Viewing and Visual Search conditions. Results show that our model outperforms other biologically-inpired models of saliency prediction as well as to predict visual saccade sequences, specifically for nature and synthetic images. We also show how temporal and spatial characteristics of inhibition of return can improve prediction of saccades, as well as how distinct search strategies (in terms of feature-selective or category-specific inhibition) predict attention at distinct image contexts.
(3) Although previous scanpath models have been able to efficiently predict saccades during Free-Viewing, it is well known that stimulus and task instructions can strongly affect eye movement patterns. In particular, task priming has been shown to be crucial to the deployment of eye movements, involving interactions between brain areas related to goal-directed behavior, working and long-termmemory in combination with stimulus-driven eyemovement neuronal correlates. In our latest study we proposed an extension of the Selective Tuning Attentive Reference Fixation ControllerModel based on task demands (STAR-FCT), describing novel computational definitions of Long-TermMemory, Visual Task Executive and Task Working Memory. With these modules we are able to use textual instructions in order to guide the model to attend to specific categories of objects and/or places in the scene. We have designed our memorymodel by processing a visual hierarchy of low- and high-level features. The relationship between the executive task instructions and the memory representations has been specified using a tree of semantic similarities between the learned features and the object category labels. Results reveal that by using this model, the resulting object localizationmaps and predicted saccades have a higher probability to fall inside the salient regions depending on the distinct task instructions compared to saliency.
 
  Address July 2019  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Xavier Otazu  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-948531-8-0 Medium  
  Area Expedition Conference  
  Notes (up) NEUROBIT Approved no  
  Call Number Admin @ si @ Ber2019 Serial 3390  
Permanent link to this record
 

 
Author Xavier Baro edit  openurl
  Title Probabilistic Darwin Machines: A New Approach to Develop Evolutionary Object Detection Type Book Whole
  Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Ever since computers were invented, we have wondered whether they might perform some of the human quotidian tasks. One of the most studied and still nowadays less understood problem is the capacity to learn from our experiences and how we generalize the knowledge that we acquire. One of that unaware tasks for the persons and that more interest is awakening in different scientific areas since the beginning, is the one that is known as pattern recognition. The creation of models that represent the world that surrounds us, help us for recognizing objects in our environment, to predict situations, to identify behaviors... All this information allows us to adapt ourselves and to interact with our environment. The capacity of adaptation of individuals to their environment has been related to the amount of patterns that are capable of identifying.

This thesis faces the pattern recognition problem from a Computer Vision point of view, taking one of the most paradigmatic and extended approaches to object detection as starting point. After studying this approach, two weak points are identified: The first makes reference to the description of the objects, and the second is a limitation of the learning algorithm, which hampers the utilization of best descriptors.

In order to address the learning limitations, we introduce evolutionary computation techniques to the classical object detection approach.

After testing the classical evolutionary approaches, such as genetic algorithms, we develop a new learning algorithm based on Probabilistic Darwin Machines, which better adapts to the learning problem. Once the learning limitation is avoided, we introduce a new feature set, which maintains the benefits of the classical feature set, adding the ability to describe non localities. This combination of evolutionary learning algorithm and features is tested on different public data sets, outperforming the results obtained by the classical approach.
 
  Address Barcelona (Spain)  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Vitria  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (up) OR;HuPBA;MV Approved no  
  Call Number BCNPCL @ bcnpcl @ Bar2009 Serial 1262  
Permanent link to this record
 

 
Author F. Pla; Petia Radeva; Jordi Vitria edit  isbn
openurl 
  Title Pattern Recognition: Progress, Directions and Applications Type Book Whole
  Year 2006 Publication Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 84-933652-6-2 Medium  
  Area Expedition Conference  
  Notes (up) OR;MILAB;MV Approved no  
  Call Number BCNPCL @ bcnpcl @ PRV2006b Serial 771  
Permanent link to this record
 

 
Author A. Sanfeliu; Juan J. Villanueva; Jordi Vitria edit  openurl
  Title Image Analysis and Pattern Recognition. Type Book Whole
  Year 1997 Publication Image Analysis and Pattern Recognition. Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes (up) OR;MV Approved no  
  Call Number BCNPCL @ bcnpcl @ SVV1997 Serial 56  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: