|   | 
Details
   web
Records
Author Mark Philip Philipsen; Anders Jorgensen; Thomas B. Moeslund; Sergio Escalera
Title RGB-D Segmentation of Poultry Entrails Type Conference Article
Year 2016 Publication 9th Conference on Articulated Motion and Deformable Objects Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Best commercial paper award.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference AMDO
Notes HuPBA;MILAB Approved no
Call Number (up) Admin @ si @ PJM2016 Serial 2848
Permanent link to this record
 

 
Author Francesco Pelosin; Saurav Jha; Andrea Torsello; Bogdan Raducanu; Joost Van de Weijer
Title Towards exemplar-free continual learning in vision transformers: an account of attention, functional and weight regularization Type Conference Article
Year 2022 Publication IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) Abbreviated Journal
Volume Issue Pages
Keywords Learning systems; Weight measurement; Image recognition; Surgery; Benchmark testing; Transformers; Stability analysis
Abstract In this paper, we investigate the continual learning of Vision Transformers (ViT) for the challenging exemplar-free scenario, with special focus on how to efficiently distill the knowledge of its crucial self-attention mechanism (SAM). Our work takes an initial step towards a surgical investigation of SAM for designing coherent continual learning methods in ViTs. We first carry out an evaluation of established continual learning regularization techniques. We then examine the effect of regularization when applied to two key enablers of SAM: (a) the contextualized embedding layers, for their ability to capture well-scaled representations with respect to the values, and (b) the prescaled attention maps, for carrying value-independent global contextual information. We depict the perks of each distilling strategy on two image recognition benchmarks (CIFAR100 and ImageNet-32) – while (a) leads to a better overall accuracy, (b) helps enhance the rigidity by maintaining competitive performances. Furthermore, we identify the limitation imposed by the symmetric nature of regularization losses. To alleviate this, we propose an asymmetric variant and apply it to the pooled output distillation (POD) loss adapted for ViTs. Our experiments confirm that introducing asymmetry to POD boosts its plasticity while retaining stability across (a) and (b). Moreover, we acknowledge low forgetting measures for all the compared methods, indicating that ViTs might be naturally inclined continual learners. 1
Address New Orleans; USA; June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes LAMP; 600.147 Approved no
Call Number (up) Admin @ si @ PJT2022 Serial 3784
Permanent link to this record
 

 
Author Cristina Palmero; Oleg V Komogortsev; Sergio Escalera; Sachin S Talathi
Title Multi-Rate Sensor Fusion for Unconstrained Near-Eye Gaze Estimation Type Conference Article
Year 2023 Publication Proceedings of the 2023 Symposium on Eye Tracking Research and Applications Abbreviated Journal
Volume Issue Pages 1-8
Keywords
Abstract The power requirements of video-oculography systems can be prohibitive for high-speed operation on portable devices. Recently, low-power alternatives such as photosensors have been evaluated, providing gaze estimates at high frequency with a trade-off in accuracy and robustness. Potentially, an approach combining slow/high-fidelity and fast/low-fidelity sensors should be able to exploit their complementarity to track fast eye motion accurately and robustly. To foster research on this topic, we introduce OpenSFEDS, a near-eye gaze estimation dataset containing approximately 2M synthetic camera-photosensor image pairs sampled at 500 Hz under varied appearance and camera position. We also formulate the task of sensor fusion for gaze estimation, proposing a deep learning framework consisting in appearance-based encoding and temporal eye-state dynamics. We evaluate several single- and multi-rate fusion baselines on OpenSFEDS, achieving 8.7% error decrease when tracking fast eye movements with a multi-rate approach vs. a gaze forecasting approach operating with a low-speed sensor alone.
Address Tubingen; Germany; May 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ETRA
Notes HUPBA Approved no
Call Number (up) Admin @ si @ PKE2023 Serial 3923
Permanent link to this record
 

 
Author Vishwesh Pillai; Pranav Mehar; Manisha Das; Deep Gupta; Petia Radeva
Title Integrated Hierarchical and Flat Classifiers for Food Image Classification using Epistemic Uncertainty Type Conference Article
Year 2022 Publication IEEE International Conference on Signal Processing and Communications Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The problem of food image recognition is an essential one in today’s context because health conditions such as diabetes, obesity, and heart disease require constant monitoring of a person’s diet. To automate this process, several models are available to recognize food images. Due to a considerable number of unique food dishes and various cuisines, a traditional flat classifier ceases to perform well. To address this issue, prediction schemes consisting of both flat and hierarchical classifiers, with the analysis of epistemic uncertainty are used to switch between the classifiers. However, the accuracy of the predictions made using epistemic uncertainty data remains considerably low. Therefore, this paper presents a prediction scheme using three different threshold criteria that helps to increase the accuracy of epistemic uncertainty predictions. The performance of the proposed method is demonstrated using several experiments performed on the MAFood-121 dataset. The experimental results validate the proposal performance and show that the proposed threshold criteria help to increase the overall accuracy of the predictions by correctly classifying the uncertainty distribution of the samples.
Address Bangalore; India; July 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference SPCOM
Notes MILAB; no menciona Approved no
Call Number (up) Admin @ si @ PMD2022 Serial 3796
Permanent link to this record
 

 
Author Miquel Angel Piera; Jose Luis Muñoz; Debora Gil; Gonzalo Martin; Jordi Manzano
Title A Socio-Technical Simulation Model for the Design of the Future Single Pilot Cockpit: An Opportunity to Improve Pilot Performance Type Journal Article
Year 2022 Publication IEEE Access Abbreviated Journal ACCESS
Volume 10 Issue Pages 22330-22343
Keywords Human factors ; Performance evaluation ; Simulation; Sociotechnical systems ; System performance
Abstract The future deployment of single pilot operations must be supported by new cockpit computer services. Such services require an adaptive context-aware integration of technical functionalities with the concurrent tasks that a pilot must deal with. Advanced artificial intelligence supporting services and improved communication capabilities are the key enabling technologies that will render future cockpits more integrated with the present digitalized air traffic management system. However, an issue in the integration of such technologies is the lack of socio-technical analysis in the design of these teaming mechanisms. A key factor in determining how and when a service support should be provided is the dynamic evolution of pilot workload. This paper investigates how the socio-technical model-based systems engineering approach paves the way for the design of a digital assistant framework by formalizing this workload. The model was validated in an Airbus A-320 cockpit simulator, and the results confirmed the degraded pilot behavioral model and the performance impact according to different contextual flight deck information. This study contributes to practical knowledge for designing human-machine task-sharing systems.
Address Feb 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM; Approved no
Call Number (up) Admin @ si @ PMG2022 Serial 3697
Permanent link to this record
 

 
Author Ruben Perez Tito; Khanh Nguyen; Marlon Tobaben; Raouf Kerkouche; Mohamed Ali Souibgui; Kangsoo Jung; Lei Kang; Ernest Valveny; Antti Honkela; Mario Fritz; Dimosthenis Karatzas
Title Privacy-Aware Document Visual Question Answering Type Miscellaneous
Year 2023 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Document Visual Question Answering (DocVQA) is a fast growing branch of document understanding. Despite the fact that documents contain sensitive or copyrighted information, none of the current DocVQA methods offers strong privacy guarantees.
In this work, we explore privacy in the domain of DocVQA for the first time. We highlight privacy issues in state of the art multi-modal LLM models used for DocVQA, and explore possible solutions.
Specifically, we focus on the invoice processing use case as a realistic, widely used scenario for document understanding, and propose a large scale DocVQA dataset comprising invoice documents and associated questions and answers. We employ a federated learning scheme, that reflects the real-life distribution of documents in different businesses, and we explore the use case where the ID of the invoice issuer is the sensitive information to be protected.
We demonstrate that non-private models tend to memorise, behaviour that can lead to exposing private information. We then evaluate baseline training schemes employing federated learning and differential privacy in this multi-modal scenario, where the sensitive information might be exposed through any of the two input modalities: vision (document image) or language (OCR tokens).
Finally, we design an attack exploiting the memorisation effect of the model, and demonstrate its effectiveness in probing different DocVQA models.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG Approved no
Call Number (up) Admin @ si @ PNT2023 Serial 4012
Permanent link to this record
 

 
Author C. Alejandro Parraga; Xavier Otazu; Arash Akbarinia
Title Modelling symmetry perception with banks of quadrature convolutional Gabor kernels Type Conference Article
Year 2019 Publication 42nd edition of the European Conference on Visual Perception Abbreviated Journal
Volume Issue Pages 224-224
Keywords
Abstract Mirror symmetry is a property most likely to be encountered in animals than in medium scale vegetation or inanimate objects in the natural world. This might be the reason why the human visual system has evolved to detect it quickly and robustly. Indeed, the perception of symmetry assists higher-level visual processing that are crucial for survival such as target recognition and identification irrespective of position and location. Although the task of detecting symmetrical objects seems effortless to us, it is very challenging for computers (to the extent that it has been proposed as a robust “captcha” by Funk & Liu in 2016). Indeed, the exact mechanism of symmetry detection in primates is not well understood: fMRI studies have shown that symmetrical shapes activate specific higher-level areas of the visual cortex (Sasaki et al.; 2005) and similarly, a large body of psychophysical experiments suggest that the symmetry perception is critically influenced by low-level mechanisms (Treder; 2010). In this work we attempt to find plausible low-level mechanisms that might form the basis for symmetry perception. Our simple model is made from banks of (i) odd-symmetric Gabors (resembling edge-detecting V1 neurons); and (ii) banks of larger odd- and even-symmetric Gabors (resembling higher visual cortex neurons), that pool signals from the 'edge image'. As reported previously (Akbarinia et al, ECVP2017), the convolution of the symmetrical lines with the two Gabor kernels of alternative phase produces a minimum in one and a maximum in the other (Osorio; 1996), and the rectification and combination of these signals create lines which hint of mirror symmetry in natural images. We improved the algorithm by combining these signals across several spatial scales. Our preliminary results suggest that such multiscale combination of convolutional operations might form the basis for much of the operation of the HVS in terms of symmetry detection and representation.
Address Leuven; Belgium; August 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECVP
Notes NEUROBIT; 600.128 Approved no
Call Number (up) Admin @ si @ POA2019 Serial 3371
Permanent link to this record
 

 
Author Olivier Penacchio; Xavier Otazu; Laura Dempere-Marco
Title A Neurodynamical Model of Brightness Induction in V1 Type Journal Article
Year 2013 Publication PloS ONE Abbreviated Journal Plos
Volume 8 Issue 5 Pages e64086
Keywords
Abstract Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas. Recent neurophysiological evidence suggests that brightness information might be explicitly represented in V1, in contrast to the more common assumption that the striate cortex is an area mostly responsive to sensory information. Here we investigate possible neural mechanisms that offer a plausible explanation for such phenomenon. To this end, a neurodynamical model which is based on neurophysiological evidence and focuses on the part of V1 responsible for contextual influences is presented. The proposed computational model successfully accounts for well known psychophysical effects for static contexts and also for brightness induction in dynamic contexts defined by modulating the luminance of surrounding areas. This work suggests that intra-cortical interactions in V1 could, at least partially, explain brightness induction effects and reveals how a common general architecture may account for several different fundamental processes, such as visual saliency and brightness induction, which emerge early in the visual processing pathway.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes CIC Approved no
Call Number (up) Admin @ si @ POD2013 Serial 2242
Permanent link to this record
 

 
Author Diego Porres
Title Discriminator Synthesis: On reusing the other half of Generative Adversarial Networks Type Conference Article
Year 2021 Publication Machine Learning for Creativity and Design, Neurips Workshop Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Generative Adversarial Networks have long since revolutionized the world of computer vision and, tied to it, the world of art. Arduous efforts have gone into fully utilizing and stabilizing training so that outputs of the Generator network have the highest possible fidelity, but little has gone into using the Discriminator after training is complete. In this work, we propose to use the latter and show a way to use the features it has learned from the training dataset to both alter an image and generate one from scratch. We name this method Discriminator Dreaming, and the full code can be found at this https URL.
Address Virtual; December 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference NEURIPSW
Notes ADAS; 601.365 Approved no
Call Number (up) Admin @ si @ Por2021 Serial 3597
Permanent link to this record
 

 
Author Ferran Poveda
Title Computer Graphics and Vision Techniques for the Study of the Muscular Fiber Architecture of the Myocardium Type Book Whole
Year 2013 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis Ph.D. thesis
Publisher Place of Publication Editor Debora Gil;Enric Marti
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM Approved no
Call Number (up) Admin @ si @ Pov2013 Serial 2417
Permanent link to this record
 

 
Author Olivier Penacchio; Xavier Otazu; A. wilkins; J. Harris
Title Uncomfortable images prevent lateral interactions in the cortex from providing a sparse code Type Conference Article
Year 2015 Publication European Conference on Visual Perception ECVP2015 Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Liverpool; uk; August 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECVP
Notes NEUROBIT;CIC Approved no
Call Number (up) Admin @ si @ POW2015 Serial 2633
Permanent link to this record
 

 
Author Olivier Penacchio; Xavier Otazu; Arnold J Wilkings; Sara M. Haigh
Title A mechanistic account of visual discomfort Type Journal Article
Year 2023 Publication Frontiers in Neuroscience Abbreviated Journal FN
Volume 17 Issue Pages
Keywords
Abstract Much of the neural machinery of the early visual cortex, from the extraction of local orientations to contextual modulations through lateral interactions, is thought to have developed to provide a sparse encoding of contour in natural scenes, allowing the brain to process efficiently most of the visual scenes we are exposed to. Certain visual stimuli, however, cause visual stress, a set of adverse effects ranging from simple discomfort to migraine attacks, and epileptic seizures in the extreme, all phenomena linked with an excessive metabolic demand. The theory of efficient coding suggests a link between excessive metabolic demand and images that deviate from natural statistics. Yet, the mechanisms linking energy demand and image spatial content in discomfort remain elusive. Here, we used theories of visual coding that link image spatial structure and brain activation to characterize the response to images observers reported as uncomfortable in a biologically based neurodynamic model of the early visual cortex that included excitatory and inhibitory layers to implement contextual influences. We found three clear markers of aversive images: a larger overall activation in the model, a less sparse response, and a more unbalanced distribution of activity across spatial orientations. When the ratio of excitation over inhibition was increased in the model, a phenomenon hypothesised to underlie interindividual differences in susceptibility to visual discomfort, the three markers of discomfort progressively shifted toward values typical of the response to uncomfortable stimuli. Overall, these findings propose a unifying mechanistic explanation for why there are differences between images and between observers, suggesting how visual input and idiosyncratic hyperexcitability give rise to abnormal brain responses that result in visual stress.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes NEUROBIT Approved no
Call Number (up) Admin @ si @ POW2023 Serial 3886
Permanent link to this record
 

 
Author Ricardo Dario Perez Principi; Cristina Palmero; Julio C. S. Jacques Junior; Sergio Escalera
Title On the Effect of Observed Subject Biases in Apparent Personality Analysis from Audio-visual Signals Type Journal Article
Year 2021 Publication IEEE Transactions on Affective Computing Abbreviated Journal TAC
Volume 12 Issue 3 Pages 607-621
Keywords
Abstract Personality perception is implicitly biased due to many subjective factors, such as cultural, social, contextual, gender and appearance. Approaches developed for automatic personality perception are not expected to predict the real personality of the target, but the personality external observers attributed to it. Hence, they have to deal with human bias, inherently transferred to the training data. However, bias analysis in personality computing is an almost unexplored area. In this work, we study different possible sources of bias affecting personality perception, including emotions from facial expressions, attractiveness, age, gender, and ethnicity, as well as their influence on prediction ability for apparent personality estimation. To this end, we propose a multi-modal deep neural network that combines raw audio and visual information alongside predictions of attribute-specific models to regress apparent personality. We also analyse spatio-temporal aggregation schemes and the effect of different time intervals on first impressions. We base our study on the ChaLearn First Impressions dataset, consisting of one-person conversational videos. Our model shows state-of-the-art results regressing apparent personality based on the Big-Five model. Furthermore, given the interpretability nature of our network design, we provide an incremental analysis on the impact of each possible source of bias on final network predictions.
Address 1 July-Sept. 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number (up) Admin @ si @ PPJ2019 Serial 3312
Permanent link to this record
 

 
Author C. Alejandro Parraga; Olivier Penacchio; Maria Vanrell
Title Retinal Filtering Matches Natural Image Statistics at Low Luminance Levels Type Journal Article
Year 2011 Publication Perception Abbreviated Journal PER
Volume 40 Issue Pages 96
Keywords
Abstract The assumption that the retina’s main objective is to provide a minimum entropy representation to higher visual areas (ie efficient coding principle) allows to predict retinal filtering in space–time and colour (Atick, 1992 Network 3 213–251). This is achieved by considering the power spectra of natural images (which is proportional to 1/f2) and the suppression of retinal and image noise. However, most studies consider images within a limited range of lighting conditions (eg near noon) whereas the visual system’s spatial filtering depends on light intensity and the spatiochromatic properties of natural scenes depend of the time of the day. Here, we explore whether the dependence of visual spatial filtering on luminance match the changes in power spectrum of natural scenes at different times of the day. Using human cone-activation based naturalistic stimuli (from the Barcelona Calibrated Images Database), we show that for a range of luminance levels, the shape of the retinal CSF reflects the slope of the power spectrum at low spatial frequencies. Accordingly, the retina implements the filtering which best decorrelates the input signal at every luminance level. This result is in line with the body of work that places efficient coding as a guiding neural principle.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes CIC Approved no
Call Number (up) Admin @ si @ PPV2011 Serial 1720
Permanent link to this record
 

 
Author Marcin Przewiezlikowski; Mateusz Pyla; Bartosz Zielinski; Bartłomiej Twardowski; Jacek Tabor; Marek Smieja
Title Augmentation-aware Self-supervised Learning with Guided Projector Type Miscellaneous
Year 2023 Publication arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Self-supervised learning (SSL) is a powerful technique for learning robust representations from unlabeled data. By learning to remain invariant to applied data augmentations, methods such as SimCLR and MoCo are able to reach quality on par with supervised approaches. However, this invariance may be harmful to solving some downstream tasks which depend on traits affected by augmentations used during pretraining, such as color. In this paper, we propose to foster sensitivity to such characteristics in the representation space by modifying the projector network, a common component of self-supervised architectures. Specifically, we supplement the projector with information about augmentations applied to images. In order for the projector to take advantage of this auxiliary conditioning when solving the SSL task, the feature extractor learns to preserve the augmentation information in its representations. Our approach, coined Conditional Augmentation-aware Self-supervised Learning (CASSLE), is directly applicable to typical joint-embedding SSL methods regardless of their objective functions. Moreover, it does not require major changes in the network architecture or prior knowledge of downstream tasks. In addition to an analysis of sensitivity towards different data augmentations, we conduct a series of experiments, which show that CASSLE improves over various SSL methods, reaching state-of-the-art performance in multiple downstream tasks.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP Approved no
Call Number (up) Admin @ si @ PPZ2023 Serial 3971
Permanent link to this record