toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author David Geronimo; Angel Sappa; Daniel Ponsa; Antonio Lopez edit   pdf
url  doi
openurl 
  Title (up) 2D-3D based on-board pedestrian detection system Type Journal Article
  Year 2010 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU  
  Volume 114 Issue 5 Pages 583–595  
  Keywords Pedestrian detection; Advanced Driver Assistance Systems; Horizon line; Haar wavelets; Edge orientation histograms  
  Abstract During the next decade, on-board pedestrian detection systems will play a key role in the challenge of increasing traffic safety. The main target of these systems, to detect pedestrians in urban scenarios, implies overcoming difficulties like processing outdoor scenes from a mobile platform and searching for aspect-changing objects in cluttered environments. This makes such systems combine techniques in the state-of-the-art Computer Vision. In this paper we present a three module system based on both 2D and 3D cues. The first module uses 3D information to estimate the road plane parameters and thus select a coherent set of regions of interest (ROIs) to be further analyzed. The second module uses Real AdaBoost and a combined set of Haar wavelets and edge orientation histograms to classify the incoming ROIs as pedestrian or non-pedestrian. The final module loops again with the 3D cue in order to verify the classified ROIs and with the 2D in order to refine the final results. According to the results, the integration of the proposed techniques gives rise to a promising system.  
  Address Computer Vision and Image Understanding (Special Issue on Intelligent Vision Systems), Vol. 114(5):583-595  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1077-3142 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number ADAS @ adas @ GSP2010 Serial 1341  
Permanent link to this record
 

 
Author Daniel Hernandez; Antonio Espinosa; David Vazquez; Antonio Lopez; Juan C. Moure edit   pdf
url  doi
openurl 
  Title (up) 3D Perception With Slanted Stixels on GPU Type Journal Article
  Year 2021 Publication IEEE Transactions on Parallel and Distributed Systems Abbreviated Journal TPDS  
  Volume 32 Issue 10 Pages 2434-2447  
  Keywords Daniel Hernandez-Juarez; Antonio Espinosa; David Vazquez; Antonio M. Lopez; Juan C. Moure  
  Abstract This article presents a GPU-accelerated software design of the recently proposed model of Slanted Stixels, which represents the geometric and semantic information of a scene in a compact and accurate way. We reformulate the measurement depth model to reduce the computational complexity of the algorithm, relying on the confidence of the depth estimation and the identification of invalid values to handle outliers. The proposed massively parallel scheme and data layout for the irregular computation pattern that corresponds to a Dynamic Programming paradigm is described and carefully analyzed in performance terms. Performance is shown to scale gracefully on current generation embedded GPUs. We assess the proposed methods in terms of semantic and geometric accuracy as well as run-time performance on three publicly available benchmark datasets. Our approach achieves real-time performance with high accuracy for 2048 × 1024 image sizes and 4 × 4 Stixel resolution on the low-power embedded GPU of an NVIDIA Tegra Xavier.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.124; 600.118 Approved no  
  Call Number Admin @ si @ HEV2021 Serial 3561  
Permanent link to this record
 

 
Author David Vazquez; Jorge Bernal; F. Javier Sanchez; Gloria Fernandez Esparrach; Antonio Lopez; Adriana Romero; Michal Drozdzal; Aaron Courville edit   pdf
url  openurl
  Title (up) A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images Type Journal Article
  Year 2017 Publication Journal of Healthcare Engineering Abbreviated Journal JHCE  
  Volume Issue Pages 2040-2295  
  Keywords Colonoscopy images; Deep Learning; Semantic Segmentation  
  Abstract Colorectal cancer (CRC) is the third cause of cancer death world-wide. Currently, the standard approach to reduce CRC-related mortality is to perform regular screening in search for polyps and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are polyp miss- rate and inability to perform visual assessment of polyp malignancy. These drawbacks can be reduced by designing Decision Support Systems (DSS) aim- ing to help clinicians in the different stages of the procedure by providing endoluminal scene segmentation. Thus, in this paper, we introduce an extended benchmark of colonoscopy image segmentation, with the hope of establishing a new strong benchmark for colonoscopy image analysis research. The proposed dataset consists of 4 relevant classes to inspect the endolumninal scene, tar- geting different clinical needs. Together with the dataset and taking advantage of advances in semantic segmentation literature, we provide new baselines by training standard fully convolutional networks (FCN). We perform a compar- ative study to show that FCN significantly outperform, without any further post-processing, prior results in endoluminal scene segmentation, especially with respect to polyp segmentation and localization.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; MV; 600.075; 600.085; 600.076; 601.281; 600.118 Approved no  
  Call Number VBS2017b Serial 2940  
Permanent link to this record
 

 
Author Juan Borrego-Carazo; Carles Sanchez; David Castells; Jordi Carrabina; Debora Gil edit  openurl
  Title (up) A benchmark for the evaluation of computational methods for bronchoscopic navigation Type Journal Article
  Year 2022 Publication International Journal of Computer Assisted Radiology and Surgery Abbreviated Journal IJCARS  
  Volume 17 Issue 1 Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM Approved no  
  Call Number Admin @ si @ BSC2022 Serial 3832  
Permanent link to this record
 

 
Author Onur Ferhat; Fernando Vilariño; F. Javier Sanchez edit  url
openurl 
  Title (up) A cheap portable eye-tracker solution for common setups. Type Journal Article
  Year 2014 Publication Journal of Eye Movement Research Abbreviated Journal JEMR  
  Volume 7 Issue 3 Pages 1-10  
  Keywords  
  Abstract We analyze the feasibility of a cheap eye-tracker where the hardware consists of a single webcam and a Raspberry Pi device. Our aim is to discover the limits of such a system and to see whether it provides an acceptable performance. We base our work on the open source Opengazer (Zielinski, 2013) and we propose several improvements to create a robust, real-time system which can work on a computer with 30Hz sampling rate. After assessing the accuracy of our eye-tracker in elaborated experiments involving 12 subjects under 4 different system setups, we install it on a Raspberry Pi to create a portable stand-alone eye-tracker which achieves 1.42° horizontal accuracy with 3Hz refresh rate for a building cost of 70 Euros.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ;SIAI Approved no  
  Call Number Admin @ si @ FVS2014 Serial 2435  
Permanent link to this record
 

 
Author Diego Velazquez; Pau Rodriguez; Josep M. Gonfaus; Xavier Roca; Jordi Gonzalez edit  url
openurl 
  Title (up) A Closer Look at Embedding Propagation for Manifold Smoothing Type Journal Article
  Year 2022 Publication Journal of Machine Learning Research Abbreviated Journal JMLR  
  Volume 23 Issue 252 Pages 1-27  
  Keywords Regularization; emi-supervised learning; self-supervised learning; adversarial robustness; few-shot classification  
  Abstract Supervised training of neural networks requires a large amount of manually annotated data and the resulting networks tend to be sensitive to out-of-distribution (OOD) data.
Self- and semi-supervised training schemes reduce the amount of annotated data required during the training process. However, OOD generalization remains a major challenge for most methods. Strategies that promote smoother decision boundaries play an important role in out-of-distribution generalization. For example, embedding propagation (EP) for manifold smoothing has recently shown to considerably improve the OOD performance for few-shot classification. EP achieves smoother class manifolds by building a graph from sample embeddings and propagating information through the nodes in an unsupervised manner. In this work, we extend the original EP paper providing additional evidence and experiments showing that it attains smoother class embedding manifolds and improves results in settings beyond few-shot classification. Concretely, we show that EP improves the robustness of neural networks against multiple adversarial attacks as well as semi- and
self-supervised learning performance.
 
  Address 9/2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes Approved no  
  Call Number Admin @ si @ VRG2022 Serial 3762  
Permanent link to this record
 

 
Author Marco Pedersoli; Andrea Vedaldi; Jordi Gonzalez; Xavier Roca edit   pdf
doi  openurl
  Title (up) A coarse-to-fine approach for fast deformable object detection Type Journal Article
  Year 2015 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 48 Issue 5 Pages 1844-1853  
  Keywords  
  Abstract We present a method that can dramatically accelerate object detection with part based models. The method is based on the observation that the cost of detection is likely to be dominated by the cost of matching each part to the image, and not by the cost of computing the optimal configuration of the parts as commonly assumed. Therefore accelerating detection requires minimizing the number of
part-to-image comparisons. To this end we propose a multiple-resolutions hierarchical part based model and a corresponding coarse-to-fine inference procedure that recursively eliminates from the search space unpromising part
placements. The method yields a ten-fold speedup over the standard dynamic programming approach and is complementary to the cascade-of-parts approach of [9]. Compared to the latter, our method does not have parameters to be determined empirically, which simplifies its use during the training of the model. Most importantly, the two techniques can be combined to obtain a very significant speedup, of two orders of magnitude in some cases. We evaluate our method extensively on the PASCAL VOC and INRIA datasets, demonstrating a very high increase in the detection speed with little degradation of the accuracy.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE; 600.078; 602.005; 605.001; 302.012 Approved no  
  Call Number Admin @ si @ PVG2015 Serial 2628  
Permanent link to this record
 

 
Author Alicia Fornes; Josep Llados; Gemma Sanchez; Xavier Otazu; Horst Bunke edit  doi
openurl 
  Title (up) A Combination of Features for Symbol-Independent Writer Identification in Old Music Scores Type Journal Article
  Year 2010 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR  
  Volume 13 Issue 4 Pages 243-259  
  Keywords  
  Abstract The aim of writer identification is determining the writer of a piece of handwriting from a set of writers. In this paper, we present an architecture for writer identification in old handwritten music scores. Even though an important amount of music compositions contain handwritten text, the aim of our work is to use only music notation to determine the author. The main contribution is therefore the use of features extracted from graphical alphabets. Our proposal consists in combining the identification results of two different approaches, based on line and textural features. The steps of the ensemble architecture are the following. First of all, the music sheet is preprocessed for removing the staff lines. Then, music lines and texture images are generated for computing line features and textural features. Finally, the classification results are combined for identifying the writer. The proposed method has been tested on a database of old music scores from the seventeenth to nineteenth centuries, achieving a recognition rate of about 92% with 20 writers.  
  Address  
  Corporate Author Thesis  
  Publisher Springer-Verlag Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1433-2833 ISBN Medium  
  Area Expedition Conference  
  Notes DAG; CAT;CIC Approved no  
  Call Number FLS2010b Serial 1319  
Permanent link to this record
 

 
Author Tadashi Araki; Nobutaka Ikeda; Nilanjan Dey; Sayan Chakraborty; Luca Saba; Dinesh Kumar; Elisa Cuadrado Godia; Xiaoyi Jiang; Ajay Gupta; Petia Radeva; John R. Laird; Andrew Nicolaides; Jasjit S. Suri edit  doi
openurl 
  Title (up) A comparative approach of four different image registration techniques for quantitative assessment of coronary artery calcium lesions using intravascular ultrasound Type Journal Article
  Year 2015 Publication Computer Methods and Programs in Biomedicine Abbreviated Journal CMPB  
  Volume 118 Issue 2 Pages 158-172  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB Approved no  
  Call Number Admin @ si @ AID2015 Serial 2640  
Permanent link to this record
 

 
Author Frederic Sampedro; Sergio Escalera; Anna Domenech; Ignasi Carrio edit  doi
openurl 
  Title (up) A computational framework for cancer response assessment based on oncological PET-CT scans Type Journal Article
  Year 2014 Publication Computers in Biology and Medicine Abbreviated Journal CBM  
  Volume 55 Issue Pages 92–99  
  Keywords Computer aided diagnosis; Nuclear medicine; Machine learning; Image processing; Quantitative analysis  
  Abstract In this work we present a comprehensive computational framework to help in the clinical assessment of cancer response from a pair of time consecutive oncological PET-CT scans. In this scenario, the design and implementation of a supervised machine learning system to predict and quantify cancer progression or response conditions by introducing a novel feature set that models the underlying clinical context is described. Performance results in 100 clinical cases (corresponding to 200 whole body PET-CT scans) in comparing expert-based visual analysis and classifier decision making show up to 70% accuracy within a completely automatic pipeline and 90% accuracy when providing the system with expert-guided PET tumor segmentation masks.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA;MILAB Approved no  
  Call Number Admin @ si @ SED2014 Serial 2606  
Permanent link to this record
 

 
Author Maria Oliver; G. Haro; Mariella Dimiccoli; B. Mazin; C. Ballester edit   pdf
doi  openurl
  Title (up) A Computational Model for Amodal Completion Type Journal Article
  Year 2016 Publication Journal of Mathematical Imaging and Vision Abbreviated Journal JMIV  
  Volume 56 Issue 3 Pages 511–534  
  Keywords Perception; visual completion; disocclusion; Bayesian model;relatability; Euler elastica  
  Abstract This paper presents a computational model to recover the most likely interpretation
of the 3D scene structure from a planar image, where some objects may occlude others. The estimated scene interpretation is obtained by integrating some global and local cues and provides both the complete disoccluded objects that form the scene and their ordering according to depth.
Our method first computes several distal scenes which are compatible with the proximal planar image. To compute these different hypothesized scenes, we propose a perceptually inspired object disocclusion method, which works by minimizing the Euler's elastica as well as by incorporating the relatability of partially occluded contours and the convexity of the disoccluded objects. Then, to estimate the preferred scene we rely on a Bayesian model and define probabilities taking into account the global complexity of the objects in the hypothesized scenes as well as the effort of bringing these objects in their relative position in the planar image, which is also measured by an Euler's elastica-based quantity. The model is illustrated with numerical experiments on, both, synthetic and real images showing the ability of our model to reconstruct the occluded objects and the preferred perceptual order among them. We also present results on images of the Berkeley dataset with provided figure-ground ground-truth labeling.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; 601.235 Approved no  
  Call Number Admin @ si @ OHD2016b Serial 2745  
Permanent link to this record
 

 
Author Karim Lekadir; Alfiia Galimzianova; Angels Betriu; Maria del Mar Vila; Laura Igual; Daniel L. Rubin; Elvira Fernandez-Giraldez; Petia Radeva; Sandy Napel edit  doi
openurl 
  Title (up) A Convolutional Neural Network for Automatic Characterization of Plaque Composition in Carotid Ultrasound Type Journal Article
  Year 2017 Publication IEEE Journal Biomedical and Health Informatics Abbreviated Journal J-BHI  
  Volume 21 Issue 1 Pages 48-55  
  Keywords  
  Abstract Characterization of carotid plaque composition, more specifically the amount of lipid core, fibrous tissue, and calcified tissue, is an important task for the identification of plaques that are prone to rupture, and thus for early risk estimation of cardiovascular and cerebrovascular events. Due to its low costs and wide availability, carotid ultrasound has the potential to become the modality of choice for plaque characterization in clinical practice. However, its significant image noise, coupled with the small size of the plaques and their complex appearance, makes it difficult for automated techniques to discriminate between the different plaque constituents. In this paper, we propose to address this challenging problem by exploiting the unique capabilities of the emerging deep learning framework. More specifically, and unlike existing works which require a priori definition of specific imaging features or thresholding values, we propose to build a convolutional neural network (CNN) that will automatically extract from the images the information that is optimal for the identification of the different plaque constituents. We used approximately 90 000 patches extracted from a database of images and corresponding expert plaque characterizations to train and to validate the proposed CNN. The results of cross-validation experiments show a correlation of about 0.90 with the clinical assessment for the estimation of lipid core, fibrous cap, and calcified tissue areas, indicating the potential of deep learning for the challenging task of automatic characterization of plaque composition in carotid ultrasound.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no menciona Approved no  
  Call Number Admin @ si @ LGB2017 Serial 2931  
Permanent link to this record
 

 
Author Marta Ligero; Alonso Garcia Ruiz; Cristina Viaplana; Guillermo Villacampa; Maria V Raciti; Jaid Landa; Ignacio Matos; Juan Martin Liberal; Maria Ochoa de Olza; Cinta Hierro; Joaquin Mateo; Macarena Gonzalez; Rafael Morales Barrera; Cristina Suarez; Jordi Rodon; Elena Elez; Irene Braña; Eva Muñoz-Couselo; Ana Oaknin; Roberta Fasani; Paolo Nuciforo; Debora Gil; Carlota Rubio Perez; Joan Seoane; Enriqueta Felip; Manuel Escobar; Josep Tabernero; Joan Carles; Rodrigo Dienstmann; Elena Garralda; Raquel Perez Lopez edit  url
doi  openurl
  Title (up) A CT-based radiomics signature is associated with response to immune checkpoint inhibitors in advanced solid tumors Type Journal Article
  Year 2021 Publication Radiology Abbreviated Journal  
  Volume 299 Issue 1 Pages 109-119  
  Keywords  
  Abstract Background Reliable predictive imaging markers of response to immune checkpoint inhibitors are needed. Purpose To develop and validate a pretreatment CT-based radiomics signature to predict response to immune checkpoint inhibitors in advanced solid tumors. Materials and Methods In this retrospective study, a radiomics signature was developed in patients with advanced solid tumors (including breast, cervix, gastrointestinal) treated with anti-programmed cell death-1 or programmed cell death ligand-1 monotherapy from August 2012 to May 2018 (cohort 1). This was tested in patients with bladder and lung cancer (cohorts 2 and 3). Radiomics variables were extracted from all metastases delineated at pretreatment CT and selected by using an elastic-net model. A regression model combined radiomics and clinical variables with response as the end point. Biologic validation of the radiomics score with RNA profiling of cytotoxic cells (cohort 4) was assessed with Mann-Whitney analysis. Results The radiomics signature was developed in 85 patients (cohort 1: mean age, 58 years ± 13 [standard deviation]; 43 men) and tested on 46 patients (cohort 2: mean age, 70 years ± 12; 37 men) and 47 patients (cohort 3: mean age, 64 years ± 11; 40 men). Biologic validation was performed in a further cohort of 20 patients (cohort 4: mean age, 60 years ± 13; 14 men). The radiomics signature was associated with clinical response to immune checkpoint inhibitors (area under the curve [AUC], 0.70; 95% CI: 0.64, 0.77; P < .001). In cohorts 2 and 3, the AUC was 0.67 (95% CI: 0.58, 0.76) and 0.67 (95% CI: 0.56, 0.77; P < .001), respectively. A radiomics-clinical signature (including baseline albumin level and lymphocyte count) improved on radiomics-only performance (AUC, 0.74 [95% CI: 0.63, 0.84; P < .001]; Akaike information criterion, 107.00 and 109.90, respectively). Conclusion A pretreatment CT-based radiomics signature is associated with response to immune checkpoint inhibitors, likely reflecting the tumor immunophenotype. © RSNA, 2021 Online supplemental material is available for this article. See also the editorial by Summers in this issue.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM; 600.145 Approved no  
  Call Number Admin @ si @ LGV2021 Serial 3593  
Permanent link to this record
 

 
Author Razieh Rastgoo; Kourosh Kiani; Sergio Escalera edit  url
openurl 
  Title (up) A deep co-attentive hand-based video question answering framework using multi-view skeleton Type Journal Article
  Year 2023 Publication Multimedia Tools and Applications Abbreviated Journal MTAP  
  Volume 82 Issue Pages 1401–1429  
  Keywords  
  Abstract In this paper, we present a novel hand –based Video Question Answering framework, entitled Multi-View Video Question Answering (MV-VQA), employing the Single Shot Detector (SSD), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), Bidirectional Encoder Representations from Transformers (BERT), and Co-Attention mechanism with RGB videos as the inputs. Our model includes three main blocks: vision, language, and attention. In the vision block, we employ a novel representation to obtain some efficient multiview features from the hand object using the combination of five 3DCNNs and one LSTM network. To obtain the question embedding, we use the BERT model in language block. Finally, we employ a co-attention mechanism on vision and language features to recognize the final answer. For the first time, we propose such a hand-based Video-QA framework including the multi-view hand skeleton features combined with the question embedding and co-attention mechanism. Our framework is capable of processing the arbitrary numbers of questions in the dataset annotations. There are different application domains for this framework. Here, as an application domain, we applied our framework to dynamic hand gesture recognition for the first time. Since the main object in dynamic hand gesture recognition is the human hand, we performed a step-by-step analysis of the hand detection and multi-view hand skeleton impact on the model performance. Evaluation results on five datasets, including two datasets in VideoQA, two datasets in dynamic hand gesture, and one dataset in hand action recognition show that MV-VQA outperforms state-of-the-art alternatives.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA Approved no  
  Call Number Admin @ si @ RKE2023b Serial 3881  
Permanent link to this record
 

 
Author Jaykishan Patel; Alban Flachot; Javier Vazquez; David H. Brainard; Thomas S. A. Wallis; Marcus A. Brubaker; Richard F. Murray edit  url
openurl 
  Title (up) A deep convolutional neural network trained to infer surface reflectance is deceived by mid-level lightness illusions Type Journal Article
  Year 2023 Publication Journal of Vision Abbreviated Journal JV  
  Volume 23 Issue 9 Pages 4817-4817  
  Keywords  
  Abstract A long-standing view is that lightness illusions are by-products of strategies employed by the visual system to stabilize its perceptual representation of surface reflectance against changes in illumination. Computationally, one such strategy is to infer reflectance from the retinal image, and to base the lightness percept on this inference. CNNs trained to infer reflectance from images have proven successful at solving this problem under limited conditions. To evaluate whether these CNNs provide suitable starting points for computational models of human lightness perception, we tested a state-of-the-art CNN on several lightness illusions, and compared its behaviour to prior measurements of human performance. We trained a CNN (Yu & Smith, 2019) to infer reflectance from luminance images. The network had a 30-layer hourglass architecture with skip connections. We trained the network via supervised learning on 100K images, rendered in Blender, each showing randomly placed geometric objects (surfaces, cubes, tori, etc.), with random Lambertian reflectance patterns (solid, Voronoi, or low-pass noise), under randomized point+ambient lighting. The renderer also provided the ground-truth reflectance images required for training. After training, we applied the network to several visual illusions. These included the argyle, Koffka-Adelson, snake, White’s, checkerboard assimilation, and simultaneous contrast illusions, along with their controls where appropriate. The CNN correctly predicted larger illusions in the argyle, Koffka-Adelson, and snake images than in their controls. It also correctly predicted an assimilation effect in White's illusion. It did not, however, account for the checkerboard assimilation or simultaneous contrast effects. These results are consistent with the view that at least some lightness phenomena are by-products of a rational approach to inferring stable representations of physical properties from intrinsically ambiguous retinal images. Furthermore, they suggest that CNN models may be a promising starting point for new models of human lightness perception.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MACO; CIC Approved no  
  Call Number Admin @ si @ PFV2023 Serial 3890  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: