Publicacions CVC -- Query Results

<< 1 2 3 4 5 6 7 8 9 >>

Details

Records
Author	Jorge Charco; Angel Sappa; Boris X. Vintimilla; Henry Velesaca
Title	Camera pose estimation in multi-view environments: From virtual scenarios to the real world			Type	Journal Article
Year	2021	Publication	Image and Vision Computing	Abbreviated Journal	IVC
Volume	110	Issue		Pages	104182
Keywords
Abstract	This paper presents a domain adaptation strategy to efficiently train network architectures for estimating the relative camera pose in multi-view scenarios. The network architectures are fed by a pair of simultaneously acquired images, hence in order to improve the accuracy of the solutions, and due to the lack of large datasets with pairs of overlapped images, a domain adaptation strategy is proposed. The domain adaptation strategy consists on transferring the knowledge learned from synthetic images to real-world scenarios. For this, the networks are firstly trained using pairs of synthetic images, which are captured at the same time by a pair of cameras in a virtual environment; and then, the learned weights of the networks are transferred to the real-world case, where the networks are retrained with a few real images. Different virtual 3D scenarios are generated to evaluate the relationship between the accuracy on the result and the similarity between virtual and real scenarios—similarity on both geometry of the objects contained in the scene as well as relative pose between camera and objects in the scene. Experimental results and comparisons are provided showing that the accuracy of all the evaluated networks for estimating the camera pose improves when the proposed domain adaptation strategy is used, highlighting the importance on the similarity between virtual-real scenarios.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MSIAU; 600.130; 600.122			Approved	no
Call Number	Admin @ si @ CSV2021			Serial	3577
Permanent link to this record



Author	Ricardo Dario Perez Principi; Cristina Palmero; Julio C. S. Jacques Junior; Sergio Escalera
Title	On the Effect of Observed Subject Biases in Apparent Personality Analysis from Audio-visual Signals			Type	Journal Article
Year	2021	Publication	IEEE Transactions on Affective Computing	Abbreviated Journal	TAC
Volume	12	Issue	3	Pages	607-621
Keywords
Abstract	Personality perception is implicitly biased due to many subjective factors, such as cultural, social, contextual, gender and appearance. Approaches developed for automatic personality perception are not expected to predict the real personality of the target, but the personality external observers attributed to it. Hence, they have to deal with human bias, inherently transferred to the training data. However, bias analysis in personality computing is an almost unexplored area. In this work, we study different possible sources of bias affecting personality perception, including emotions from facial expressions, attractiveness, age, gender, and ethnicity, as well as their influence on prediction ability for apparent personality estimation. To this end, we propose a multi-modal deep neural network that combines raw audio and visual information alongside predictions of attribute-specific models to regress apparent personality. We also analyse spatio-temporal aggregation schemes and the effect of different time intervals on first impressions. We base our study on the ChaLearn First Impressions dataset, consisting of one-person conversational videos. Our model shows state-of-the-art results regressing apparent personality based on the Big-Five model. Furthermore, given the interpretability nature of our network design, we provide an incremental analysis on the impact of each possible source of bias on final network predictions.
Address	1 July-Sept. 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA; no proj			Approved	no
Call Number	Admin @ si @ PPJ2019			Serial	3312
Permanent link to this record



Author	Parichehr Behjati Ardakani; Pau Rodriguez; Armin Mehri; Isabelle Hupont; Carles Fernandez; Jordi Gonzalez
Title	OverNet: Lightweight Multi-Scale Super-Resolution with Overscaling Network			Type	Conference Article
Year	2021	Publication	IEEE Winter Conference on Applications of Computer Vision	Abbreviated Journal
Volume		Issue		Pages	2693-2702
Keywords
Abstract	Super-resolution (SR) has achieved great success due to the development of deep convolutional neural networks (CNNs). However, as the depth and width of the networks increase, CNN-based SR methods have been faced with the challenge of computational complexity in practice. More- over, most SR methods train a dedicated model for each target resolution, losing generality and increasing memory requirements. To address these limitations we introduce OverNet, a deep but lightweight convolutional network to solve SISR at arbitrary scale factors with a single model. We make the following contributions: first, we introduce a lightweight feature extractor that enforces efficient reuse of information through a novel recursive structure of skip and dense connections. Second, to maximize the performance of the feature extractor, we propose a model agnostic reconstruction module that generates accurate high-resolution images from overscaled feature maps obtained from any SR architecture. Third, we introduce a multi-scale loss function to achieve generalization across scales. Experiments show that our proposal outperforms previous state-of-the-art approaches in standard benchmarks, while maintaining relatively low computation and memory requirements.
Address	Virtual; January 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WACV
Notes	ISE; 600.119; 600.098			Approved	no
Call Number	Admin @ si @ BRM2021			Serial	3512
Permanent link to this record



Author	Lei Kang; Pau Riba; Mauricio Villegas; Alicia Fornes; Marçal Rusiñol
Title	Candidate Fusion: Integrating Language Modelling into a Sequence-to-Sequence Handwritten Word Recognition Architecture			Type	Journal Article
Year	2021	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	112	Issue		Pages	107790
Keywords
Abstract	Sequence-to-sequence models have recently become very popular for tackling handwritten word recognition problems. However, how to effectively integrate an external language model into such recognizer is still a challenging problem. The main challenge faced when training a language model is to deal with the language model corpus which is usually different to the one used for training the handwritten word recognition system. Thus, the bias between both word corpora leads to incorrectness on the transcriptions, providing similar or even worse performances on the recognition task. In this work, we introduce Candidate Fusion, a novel way to integrate an external language model to a sequence-to-sequence architecture. Moreover, it provides suggestions from an external language knowledge, as a new input to the sequence-to-sequence recognizer. Hence, Candidate Fusion provides two improvements. On the one hand, the sequence-to-sequence recognizer has the flexibility not only to combine the information from itself and the language model, but also to choose the importance of the information provided by the language model. On the other hand, the external language model has the ability to adapt itself to the training corpus and even learn the most commonly errors produced from the recognizer. Finally, by conducting comprehensive experiments, the Candidate Fusion proves to outperform the state-of-the-art language models for handwritten word recognition tasks.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.140; 601.302; 601.312; 600.121			Approved	no
Call Number	Admin @ si @ KRV2021			Serial	3343
Permanent link to this record



Author	Arka Ujjal Dey; Suman Ghosh; Ernest Valveny; Gaurav Harit
Title	Beyond Visual Semantics: Exploring the Role of Scene Text in Image Understanding			Type	Journal Article
Year	2021	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
Volume	149	Issue		Pages	164-171
Keywords
Abstract	Images with visual and scene text content are ubiquitous in everyday life. However, current image interpretation systems are mostly limited to using only the visual features, neglecting to leverage the scene text content. In this paper, we propose to jointly use scene text and visual channels for robust semantic interpretation of images. We do not only extract and encode visual and scene text cues, but also model their interplay to generate a contextual joint embedding with richer semantics. The contextual embedding thus generated is applied to retrieval and classification tasks on multimedia images, with scene text content, to demonstrate its effectiveness. In the retrieval framework, we augment our learned text-visual semantic representation with scene text cues, to mitigate vocabulary misses that may have occurred during the semantic embedding. To deal with irrelevant or erroneous recognition of scene text, we also apply query-based attention to our text channel. We show how the multi-channel approach, involving visual semantics and scene text, improves upon state of the art.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ DGV2021			Serial	3364
Permanent link to this record



Author	Manisha Das; Deep Gupta; Petia Radeva; Ashwini M. Bakde
Title	Optimized CT-MR neurological image fusion framework using biologically inspired spiking neural model in hybrid ℓ1 - ℓ0 layer decomposition domain			Type	Journal Article
Year	2021	Publication	Biomedical Signal Processing and Control	Abbreviated Journal	BSPC
Volume	68	Issue		Pages	102535
Keywords
Abstract	Medical image fusion plays an important role in the clinical diagnosis of several critical neurological diseases by merging complementary information available in multimodal images. In this paper, a novel CT-MR neurological image fusion framework is proposed using an optimized biologically inspired feedforward neural model in two-scale hybrid ℓ1 − ℓ0 decomposition domain using gray wolf optimization to preserve the structural as well as texture information present in source CT and MR images. Initially, the source images are subjected to two-scale ℓ1 − ℓ0 decomposition with optimized parameters, giving a scale-1 detail layer, a scale-2 detail layer and a scale-2 base layer. Two detail layers at scale-1 and 2 are fused using an optimized biologically inspired neural model and weighted average scheme based on local energy and modified spatial frequency to maximize the preservation of edges and local textures, respectively, while the scale-2 base layer gets fused using choose max rule to preserve the background information. To optimize the hyper-parameters of hybrid ℓ1 − ℓ0 decomposition and biologically inspired neural model, a fitness function is evaluated based on spatial frequency and edge index of the resultant fused image obtained by adding all the fused components. The fusion performance is analyzed by conducting extensive experiments on different CT-MR neurological images. Experimental results indicate that the proposed method provides better-fused images and outperforms the other state-of-the-art fusion methods in both visual and quantitative assessments.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; no proj			Approved	no
Call Number	Admin @ si @ DGR2021b			Serial	3636
Permanent link to this record



Author	Andreea Glavan; Alina Matei; Petia Radeva; Estefania Talavera
Title	Does our social life influence our nutritional behaviour? Understanding nutritional habits from egocentric photo-streams			Type	Journal Article
Year	2021	Publication	Expert Systems with Applications	Abbreviated Journal	ESWA
Volume	171	Issue		Pages	114506
Keywords
Abstract	Nutrition and social interactions are both key aspects of the daily lives of humans. In this work, we propose a system to evaluate the influence of social interaction in the nutritional habits of a person from a first-person perspective. In order to detect the routine of an individual, we construct a nutritional behaviour pattern discovery model, which outputs routines over a number of days. Our method evaluates similarity of routines with respect to visited food-related scenes over the collected days, making use of Dynamic Time Warping, as well as considering social engagement and its correlation with food-related activities. The nutritional and social descriptors of the collected days are evaluated and encoded using an LSTM Autoencoder. Later, the obtained latent space is clustered to find similar days unaffected by outliers using the Isolation Forest method. Moreover, we introduce a new score metric to evaluate the performance of the proposed algorithm. We validate our method on 104 days and more than 100 k egocentric images gathered by 7 users. Several different visualizations are evaluated for the understanding of the findings. Our results demonstrate good performance and applicability of our proposed model for social-related nutritional behaviour understanding. At the end, relevant applications of the model are discussed by analysing the discovered routine of particular individuals.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; no proj			Approved	no
Call Number	Admin @ si @ GMR2021			Serial	3634
Permanent link to this record



Author	Mohamed Ali Souibgui; Alicia Fornes; Y.Kessentini; C.Tudor
Title	A Few-shot Learning Approach for Historical Encoded Manuscript Recognition			Type	Conference Article
Year	2021	Publication	25th International Conference on Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages	5413-5420
Keywords
Abstract	Encoded (or ciphered) manuscripts are a special type of historical documents that contain encrypted text. The automatic recognition of this kind of documents is challenging because: 1) the cipher alphabet changes from one document to another, 2) there is a lack of annotated corpus for training and 3) touching symbols make the symbol segmentation difficult and complex. To overcome these difficulties, we propose a novel method for handwritten ciphers recognition based on few-shot object detection. Our method first detects all symbols of a given alphabet in a line image, and then a decoding step maps the symbol similarity scores to the final sequence of transcribed symbols. By training on synthetic data, we show that the proposed architecture is able to recognize handwritten ciphers with unseen alphabets. In addition, if few labeled pages with the same alphabet are used for fine tuning, our method surpasses existing unsupervised and supervised HTR methods for ciphers recognition.
Address	Virtual; January 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICPR
Notes	DAG; 600.121; 600.140			Approved	no
Call Number	Admin @ si @ SFK2021			Serial	3449
Permanent link to this record



Author	Yaxing Wang; Abel Gonzalez-Garcia; Luis Herranz; Joost Van de Weijer
Title	Controlling biases and diversity in diverse image-to-image translation			Type	Journal Article
Year	2021	Publication	Computer Vision and Image Understanding	Abbreviated Journal	CVIU
Volume	202	Issue		Pages	103082
Keywords
Abstract	JCR 2019 Q2, IF=3.121 The task of unpaired image-to-image translation is highly challenging due to the lack of explicit cross-domain pairs of instances. We consider here diverse image translation (DIT), an even more challenging setting in which an image can have multiple plausible translations. This is normally achieved by explicitly disentangling content and style in the latent representation and sampling different styles codes while maintaining the image content. Despite the success of current DIT models, they are prone to suffer from bias. In this paper, we study the problem of bias in image-to-image translation. Biased datasets may add undesired changes (e.g. change gender or race in face images) to the output translations as a consequence of the particular underlying visual distribution in the target domain. In order to alleviate the effects of this problem we propose the use of semantic constraints that enforce the preservation of desired image properties. Our proposed model is a step towards unbiased diverse image-to-image translation (UDIT), and results in less unwanted changes in the translated images while still performing the wanted transformation. Experiments on several heavily biased datasets show the effectiveness of the proposed techniques in different domains such as faces, objects, and scenes.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; 600.141; 600.109; 600.147			Approved	no
Call Number	Admin @ si @ WGH2021			Serial	3464
Permanent link to this record



Author	Akhil Gurram; Ahmet Faruk Tuna; Fengyi Shen; Onay Urfalioglu; Antonio Lopez
Title	Monocular Depth Estimation through Virtual-world Supervision and Real-world SfM Self-Supervision			Type	Journal Article
Year	2021	Publication	IEEE Transactions on Intelligent Transportation Systems	Abbreviated Journal	TITS
Volume	23	Issue	8	Pages	12738-12751
Keywords
Abstract	Depth information is essential for on-board perception in autonomous driving and driver assistance. Monocular depth estimation (MDE) is very appealing since it allows for appearance and depth being on direct pixelwise correspondence without further calibration. Best MDE models are based on Convolutional Neural Networks (CNNs) trained in a supervised manner, i.e., assuming pixelwise ground truth (GT). Usually, this GT is acquired at training time through a calibrated multi-modal suite of sensors. However, also using only a monocular system at training time is cheaper and more scalable. This is possible by relying on structure-from-motion (SfM) principles to generate self-supervision. Nevertheless, problems of camouflaged objects, visibility changes, static-camera intervals, textureless areas, and scale ambiguity, diminish the usefulness of such self-supervision. In this paper, we perform monocular depth estimation by virtual-world supervision (MonoDEVS) and real-world SfM self-supervision. We compensate the SfM self-supervision limitations by leveraging virtual-world images with accurate semantic and depth supervision and addressing the virtual-to-real domain gap. Our MonoDEVSNet outperforms previous MDE CNNs trained on monocular and even stereo sequences.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.118			Approved	no
Call Number	Admin @ si @ GTS2021			Serial	3598
Permanent link to this record



Author	Diego Porres
Title	Discriminator Synthesis: On reusing the other half of Generative Adversarial Networks			Type	Conference Article
Year	2021	Publication	Machine Learning for Creativity and Design, Neurips Workshop	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Generative Adversarial Networks have long since revolutionized the world of computer vision and, tied to it, the world of art. Arduous efforts have gone into fully utilizing and stabilizing training so that outputs of the Generator network have the highest possible fidelity, but little has gone into using the Discriminator after training is complete. In this work, we propose to use the latter and show a way to use the features it has learned from the training dataset to both alter an image and generate one from scratch. We name this method Discriminator Dreaming, and the full code can be found at this https URL.
Address	Virtual; December 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	NEURIPSW
Notes	ADAS; 601.365			Approved	no
Call Number	Admin @ si @ Por2021			Serial	3597
Permanent link to this record



Author	Andres Mafla; Sounak Dey; Ali Furkan Biten; Lluis Gomez; Dimosthenis Karatzas
Title	Multi-modal reasoning graph for scene-text based fine-grained image classification and retrieval			Type	Conference Article
Year	2021	Publication	IEEE Winter Conference on Applications of Computer Vision	Abbreviated Journal
Volume		Issue		Pages	4022-4032
Keywords
Abstract
Address	Virtual; January 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WACV
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ MDB2021			Serial	3491
Permanent link to this record



Author	Andres Mafla; Rafael S. Rezende; Lluis Gomez; Diana Larlus; Dimosthenis Karatzas
Title	StacMR: Scene-Text Aware Cross-Modal Retrieval			Type	Conference Article
Year	2021	Publication	IEEE Winter Conference on Applications of Computer Vision	Abbreviated Journal
Volume		Issue		Pages	2219-2229
Keywords
Abstract
Address	Virtual; January 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WACV
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ MRG2021a			Serial	3492
Permanent link to this record



Author	Andres Mafla; Ruben Tito; Sounak Dey; Lluis Gomez; Marçal Rusiñol; Ernest Valveny; Dimosthenis Karatzas
Title	Real-time Lexicon-free Scene Text Retrieval			Type	Journal Article
Year	2021	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	110	Issue		Pages	107656
Keywords
Abstract	In this work, we address the task of scene text retrieval: given a text query, the system returns all images containing the queried text. The proposed model uses a single shot CNN architecture that predicts bounding boxes and builds a compact representation of spotted words. In this way, this problem can be modeled as a nearest neighbor search of the textual representation of a query over the outputs of the CNN collected from the totality of an image database. Our experiments demonstrate that the proposed model outperforms previous state-of-the-art, while offering a significant increase in processing speed and unmatched expressiveness with samples never seen at training time. Several experiments to assess the generalization capability of the model are conducted in a multilingual dataset, as well as an application of real-time text spotting in videos.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.121; 600.129; 601.338			Approved	no
Call Number	Admin @ si @ MTD2021			Serial	3493
Permanent link to this record



Author	Minesh Mathew; Dimosthenis Karatzas; C.V. Jawahar
Title	DocVQA: A Dataset for VQA on Document Images			Type	Conference Article
Year	2021	Publication	IEEE Winter Conference on Applications of Computer Vision	Abbreviated Journal
Volume		Issue		Pages	2200-2209
Keywords
Abstract	We present a new dataset for Visual Question Answering (VQA) on document images called DocVQA. The dataset consists of 50,000 questions defined on 12,000+ document images. Detailed analysis of the dataset in comparison with similar datasets for VQA and reading comprehension is presented. We report several baseline results by adopting existing VQA and reading comprehension models. Although the existing models perform reasonably well on certain types of questions, there is large performance gap compared to human performance (94.36% accuracy). The models need to improve specifically on questions where understanding structure of the document is crucial. The dataset, code and leaderboard are available at docvqa. org
Address	Virtual; January 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WACV
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ MKJ2021			Serial	3498
Permanent link to this record