Publicacions CVC -- Query Results

<< 1 2 3 4 5 6 7 8 9 10 >> [11–20]

Details

Records
Author	Santiago Segui; Michal Drozdzal; Guillem Pascual; Petia Radeva; Carolina Malagelada; Fernando Azpiroz; Jordi Vitria
Title	Generic Feature Learning for Wireless Capsule Endoscopy Analysis			Type	Journal Article
Year	2016	Publication	Computers in Biology and Medicine	Abbreviated Journal	CBM
Volume	79	Issue		Pages	163-172
Keywords	Wireless capsule endoscopy; Deep learning; Feature learning; Motility analysis
Abstract	The interpretation and analysis of wireless capsule endoscopy (WCE) recordings is a complex task which requires sophisticated computer aided decision (CAD) systems to help physicians with video screening and, finally, with the diagnosis. Most CAD systems used in capsule endoscopy share a common system design, but use very different image and video representations. As a result, each time a new clinical application of WCE appears, a new CAD system has to be designed from the scratch. This makes the design of new CAD systems very time consuming. Therefore, in this paper we introduce a system for small intestine motility characterization, based on Deep Convolutional Neural Networks, which circumvents the laborious step of designing specific features for individual motility events. Experimental results show the superiority of the learned features over alternative classifiers constructed using state-of-the-art handcrafted features. In particular, it reaches a mean classification accuracy of 96% for six intestinal motility events, outperforming the other classifiers by a large margin (a 14% relative performance increase).
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	OR; MILAB;MV;			Approved	no
Call Number	Admin @ si @ SDP2016			Serial	2836
Permanent link to this record



Author	Mikkel Thogersen; Sergio Escalera; Jordi Gonzalez; Thomas B. Moeslund
Title	Segmentation of RGB-D Indoor scenes by Stacking Random Forests and Conditional Random Fields			Type	Journal Article
Year	2016	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
Volume	80	Issue		Pages	208–215
Keywords
Abstract	This paper proposes a technique for RGB-D scene segmentation using Multi-class Multi-scale Stacked Sequential Learning (MMSSL) paradigm. Following recent trends in state-of-the-art, a base classifier uses an initial SLIC segmentation to obtain superpixels which provide a diminution of data while retaining object boundaries. A series of color and depth features are extracted from the superpixels, and are used in a Conditional Random Field (CRF) to predict superpixel labels. Furthermore, a Random Forest (RF) classifier using random offset features is also used as an input to the CRF, acting as an initial prediction. As a stacked classifier, another Random Forest is used acting on a spatial multi-scale decomposition of the CRF confidence map to correct the erroneous labels assigned by the previous classifier. The model is tested on the popular NYU-v2 dataset. The approach shows that simple multi-modal features with the power of the MMSSL paradigm can achieve better performance than state of the art results on the same dataset.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA; ISE;MILAB; 600.098; 600.119			Approved	no
Call Number	Admin @ si @ TEG2016			Serial	2843
Permanent link to this record



Author	Sergio Escalera; Jordi Gonzalez; Xavier Baro; Jamie Shotton
Title	Guest Editor Introduction to the Special Issue on Multimodal Human Pose Recovery and Behavior Analysis			Type	Journal Article
Year	2016	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
Volume	28	Issue		Pages	1489 - 1491
Keywords
Abstract	The sixteen papers in this special section focus on human pose recovery and behavior analysis (HuPBA). This is one of the most challenging topics in computer vision, pattern analysis, and machine learning. It is of critical importance for application areas that include gaming, computer interaction, human robot interaction, security, commerce, assistive technologies and rehabilitation, sports, sign language recognition, and driver assistance technology, to mention just a few. In essence, HuPBA requires dealing with the articulated nature of the human body, changes in appearance due to clothing, and the inherent problems of clutter scenes, such as background artifacts, occlusions, and illumination changes. These papers represent the most recent research in this field, including new methods considering still images, image sequences, depth data, stereo vision, 3D vision, audio, and IMUs, among others.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA; ISE;MV;			Approved	no
Call Number	Admin @ si @			Serial	2851
Permanent link to this record



Author	Pau Riba; Josep Llados; Alicia Fornes; Anjan Dutta
Title	Large-scale graph indexing using binary embeddings of node contexts for information spotting in document image databases			Type	Journal Article
Year	2017	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
Volume	87	Issue		Pages	203-211
Keywords
Abstract	Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations. However, retrieving a query graph from a large dataset of graphs implies a high computational complexity. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. With this aim, in this paper we propose a graph indexation formalism applied to visual retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Then, each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in different real scenarios such as handwritten word spotting in images of historical documents or symbol spotting in architectural floor plans.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.097; 602.006; 603.053; 600.121			Approved	no
Call Number	RLF2017b			Serial	2873
Permanent link to this record



Author	Lluis Gomez; Dimosthenis Karatzas
Title	TextProposals: a Text‐specific Selective Search Algorithm for Word Spotting in the Wild			Type	Journal Article
Year	2017	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	70	Issue		Pages	60-74
Keywords
Abstract	Motivated by the success of powerful while expensive techniques to recognize words in a holistic way (Goel et al., 2013; Almazán et al., 2014; Jaderberg et al., 2016) object proposals techniques emerge as an alternative to the traditional text detectors. In this paper we introduce a novel object proposals method that is specifically designed for text. We rely on a similarity based region grouping algorithm that generates a hierarchy of word hypotheses. Over the nodes of this hierarchy it is possible to apply a holistic word recognition method in an efficient way. Our experiments demonstrate that the presented method is superior in its ability of producing good quality word proposals when compared with class-independent algorithms. We show impressive recall rates with a few thousand proposals in different standard benchmarks, including focused or incidental text datasets, and multi-language scenarios. Moreover, the combination of our object proposals with existing whole-word recognizers (Almazán et al., 2014; Jaderberg et al., 2016) shows competitive performance in end-to-end word spotting, and, in some benchmarks, outperforms previously published results. Concretely, in the challenging ICDAR2015 Incidental Text dataset, we overcome in more than 10% F-score the best-performing method in the last ICDAR Robust Reading Competition (Karatzas, 2015). Source code of the complete end-to-end system is available at https://github.com/lluisgomez/TextProposals.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.084; 601.197; 600.121; 600.129			Approved	no
Call Number	Admin @ si @ GoK2017			Serial	2886
Permanent link to this record



Author	Lluis Gomez; Anguelos Nicolaou; Dimosthenis Karatzas
Title	Improving patch‐based scene text script identification with ensembles of conjoined networks			Type	Journal Article
Year	2017	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	67	Issue		Pages	85-96
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.084; 600.121; 600.129			Approved	no
Call Number	Admin @ si @ GNK2017			Serial	2887
Permanent link to this record



Author	Miguel Oliveira; Victor Santos; Angel Sappa; P. Dias; A. Moreira
Title	Incremental texture mapping for autonomous driving			Type	Journal Article
Year	2016	Publication	Robotics and Autonomous Systems	Abbreviated Journal	RAS
Volume	84	Issue		Pages	113-128
Keywords	Scene reconstruction; Autonomous driving; Texture mapping
Abstract	Autonomous vehicles have a large number of on-board sensors, not only for providing coverage all around the vehicle, but also to ensure multi-modality in the observation of the scene. Because of this, it is not trivial to come up with a single, unique representation that feeds from the data given by all these sensors. We propose an algorithm which is capable of mapping texture collected from vision based sensors onto a geometric description of the scenario constructed from data provided by 3D sensors. The algorithm uses a constrained Delaunay triangulation to produce a mesh which is updated using a specially devised sequence of operations. These enforce a partial configuration of the mesh that avoids bad quality textures and ensures that there are no gaps in the texture. Results show that this algorithm is capable of producing fine quality textures.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.086			Approved	no
Call Number	Admin @ si @ OSS2016b			Serial	2912
Permanent link to this record



Author	Pau Rodriguez; Guillem Cucurull; Jordi Gonzalez; Josep M. Gonfaus; Kamal Nasrollahi; Thomas B. Moeslund; Xavier Roca
Title	Deep Pain: Exploiting Long Short-Term Memory Networks for Facial Expression Classification			Type	Journal Article
Year	2017	Publication	IEEE Transactions on cybernetics	Abbreviated Journal	Cyber
Volume		Issue		Pages	1-11
Keywords
Abstract	Pain is an unpleasant feeling that has been shown to be an important factor for the recovery of patients. Since this is costly in human resources and difficult to do objectively, there is the need for automatic systems to measure it. In this paper, contrary to current state-of-the-art techniques in pain assessment, which are based on facial features only, we suggest that the performance can be enhanced by feeding the raw frames to deep learning models, outperforming the latest state-of-the-art results while also directly facing the problem of imbalanced data. As a baseline, our approach first uses convolutional neural networks (CNNs) to learn facial features from VGG_Faces, which are then linked to a long short-term memory to exploit the temporal relation between video frames. We further compare the performances of using the so popular schema based on the canonically normalized appearance versus taking into account the whole image. As a result, we outperform current state-of-the-art area under the curve performance in the UNBC-McMaster Shoulder Pain Expression Archive Database. In addition, to evaluate the generalization properties of our proposed methodology on facial motion recognition, we also report competitive results in the Cohn Kanade+ facial expression database.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE; 600.119; 600.098			Approved	no
Call Number	Admin @ si @ RCG2017a			Serial	2926
Permanent link to this record



Author	Anjan Dutta; Josep Llados; Horst Bunke; Umapada Pal
Title	Product graph-based higher order contextual similarities for inexact subgraph matching			Type	Journal Article
Year	2018	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	76	Issue		Pages	596-611
Keywords
Abstract	Many algorithms formulate graph matching as an optimization of an objective function of pairwise quantification of nodes and edges of two graphs to be matched. Pairwise measurements usually consider local attributes but disregard contextual information involved in graph structures. We address this issue by proposing contextual similarities between pairs of nodes. This is done by considering the tensor product graph (TPG) of two graphs to be matched, where each node is an ordered pair of nodes of the operand graphs. Contextual similarities between a pair of nodes are computed by accumulating weighted walks (normalized pairwise similarities) terminating at the corresponding paired node in TPG. Once the contextual similarities are obtained, we formulate subgraph matching as a node and edge selection problem in TPG. We use contextual similarities to construct an objective function and optimize it with a linear programming approach. Since random walk formulation through TPG takes into account higher order information, it is not a surprise that we obtain more reliable similarities and better discrimination among the nodes and edges. Experimental results shown on synthetic as well as real benchmarks illustrate that higher order contextual similarities increase discriminating power and allow one to find approximate solutions to the subgraph matching problem.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 602.167; 600.097; 600.121			Approved	no
Call Number	Admin @ si @ DLB2018			Serial	3083
Permanent link to this record



Author	David Vazquez; Jorge Bernal; F. Javier Sanchez; Gloria Fernandez Esparrach; Antonio Lopez; Adriana Romero; Michal Drozdzal; Aaron Courville
Title	A Benchmark for Endoluminal Scene Segmentation of Colonoscopy Images			Type	Journal Article
Year	2017	Publication	Journal of Healthcare Engineering	Abbreviated Journal	JHCE
Volume		Issue		Pages	2040-2295
Keywords	Colonoscopy images; Deep Learning; Semantic Segmentation
Abstract	Colorectal cancer (CRC) is the third cause of cancer death world-wide. Currently, the standard approach to reduce CRC-related mortality is to perform regular screening in search for polyps and colonoscopy is the screening tool of choice. The main limitations of this screening procedure are polyp miss- rate and inability to perform visual assessment of polyp malignancy. These drawbacks can be reduced by designing Decision Support Systems (DSS) aim- ing to help clinicians in the different stages of the procedure by providing endoluminal scene segmentation. Thus, in this paper, we introduce an extended benchmark of colonoscopy image segmentation, with the hope of establishing a new strong benchmark for colonoscopy image analysis research. The proposed dataset consists of 4 relevant classes to inspect the endolumninal scene, tar- geting different clinical needs. Together with the dataset and taking advantage of advances in semantic segmentation literature, we provide new baselines by training standard fully convolutional networks (FCN). We perform a compar- ative study to show that FCN significantly outperform, without any further post-processing, prior results in endoluminal scene segmentation, especially with respect to polyp segmentation and localization.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; MV; 600.075; 600.085; 600.076; 601.281; 600.118			Approved	no
Call Number	VBS2017b			Serial	2940
Permanent link to this record



Author	Marta Diez-Ferrer; Debora Gil; Elena Carreño; Susana Padrones; Samantha Aso; Vanesa Vicens; Noelia Cubero de Frutos; Rosa Lopez Lisbona; Carles Sanchez; Agnes Borras; Antoni Rosell
Title	Positive Airway Pressure-Enhanced CT to Improve Virtual Bronchoscopic Navigation			Type	Journal Article
Year	2017	Publication	European Respiratory Journal	Abbreviated Journal	ERJ
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	IAM			Approved	no
Call Number	Admin @ si @ DGC2017b			Serial	3632
Permanent link to this record



Author	Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades
Title	Sparse representation over learned dictionary for symbol recognition			Type	Journal Article
Year	2016	Publication	Signal Processing	Abbreviated Journal	SP
Volume	125	Issue		Pages	36-47
Keywords	Symbol Recognition; Sparse Representation; Learned Dictionary; Shape Context; Interest Points
Abstract	In this paper we propose an original sparse vector model for symbol retrieval task. More specically, we apply the K-SVD algorithm for learning a visual dictionary based on symbol descriptors locally computed around interest points. Results on benchmark datasets show that the obtained sparse representation is competitive related to state-of-the-art methods. Moreover, our sparse representation is invariant to rotation and scale transforms and also robust to degraded images and distorted symbols. Thereby, the learned visual dictionary is able to represent instances of unseen classes of symbols.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.061; 600.077			Approved	no
Call Number	Admin @ si @ DTR2016			Serial	2946
Permanent link to this record



Author	Joan Serrat; Felipe Lumbreras; Francisco Blanco; Manuel Valiente; Montserrat Lopez-Mesas
Title	myStone: A system for automatic kidney stone classification			Type	Journal Article
Year	2017	Publication	Expert Systems with Applications	Abbreviated Journal	ESA
Volume	89	Issue		Pages	41-51
Keywords	Kidney stone; Optical device; Computer vision; Image classification
Abstract	Kidney stone formation is a common disease and the incidence rate is constantly increasing worldwide. It has been shown that the classification of kidney stones can lead to an important reduction of the recurrence rate. The classification of kidney stones by human experts on the basis of certain visual color and texture features is one of the most employed techniques. However, the knowledge of how to analyze kidney stones is not widespread, and the experts learn only after being trained on a large number of samples of the different classes. In this paper we describe a new device specifically designed for capturing images of expelled kidney stones, and a method to learn and apply the experts knowledge with regard to their classification. We show that with off the shelf components, a carefully selected set of features and a state of the art classifier it is possible to automate this difficult task to a good degree. We report results on a collection of 454 kidney stones, achieving an overall accuracy of 63% for a set of eight classes covering almost all of the kidney stones taxonomy. Moreover, for more than 80% of samples the real class is the first or the second most probable class according to the system, being then the patient recommendations for the two top classes similar. This is the first attempt towards the automatic visual classification of kidney stones, and based on the current results we foresee better accuracies with the increase of the dataset size.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; MSIAU; 603.046; 600.122; 600.118			Approved	no
Call Number	Admin @ si @ SLB2017			Serial	3026
Permanent link to this record



Author	Pau Rodriguez; Guillem Cucurull; Josep M. Gonfaus; Xavier Roca; Jordi Gonzalez
Title	Age and gender recognition in the wild with deep attention			Type	Journal Article
Year	2017	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	72	Issue		Pages	563-571
Keywords	Age recognition; Gender recognition; Deep neural networks; Attention mechanisms
Abstract	Face analysis in images in the wild still pose a challenge for automatic age and gender recognition tasks, mainly due to their high variability in resolution, deformation, and occlusion. Although the performance has highly increased thanks to Convolutional Neural Networks (CNNs), it is still far from optimal when compared to other image recognition tasks, mainly because of the high sensitiveness of CNNs to facial variations. In this paper, inspired by biology and the recent success of attention mechanisms on visual question answering and fine-grained recognition, we propose a novel feedforward attention mechanism that is able to discover the most informative and reliable parts of a given face for improving age and gender classification. In particular, given a downsampled facial image, the proposed model is trained based on a novel end-to-end learning framework to extract the most discriminative patches from the original high-resolution image. Experimental validation on the standard Adience, Images of Groups, and MORPH II benchmarks show that including attention mechanisms enhances the performance of CNNs in terms of robustness and accuracy.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE; 600.098; 602.133; 600.119			Approved	no
Call Number	Admin @ si @ RCG2017b			Serial	2962
Permanent link to this record



Author	Parichehr Behjati Ardakani; Pau Rodriguez; Carles Fernandez; Armin Mehri; Xavier Roca; Seiichi Ozawa; Jordi Gonzalez
Title	Frequency-based Enhancement Network for Efficient Super-Resolution			Type	Journal Article
Year	2022	Publication	IEEE Access	Abbreviated Journal	ACCESS
Volume	10	Issue		Pages	57383-57397
Keywords	Deep learning; Frequency-based methods; Lightweight architectures; Single image super-resolution
Abstract	Recently, deep convolutional neural networks (CNNs) have provided outstanding performance in single image super-resolution (SISR). Despite their remarkable performance, the lack of high-frequency information in the recovered images remains a core problem. Moreover, as the networks increase in depth and width, deep CNN-based SR methods are faced with the challenge of computational complexity in practice. A promising and under-explored solution is to adapt the amount of compute based on the different frequency bands of the input. To this end, we present a novel Frequency-based Enhancement Block (FEB) which explicitly enhances the information of high frequencies while forwarding low-frequencies to the output. In particular, this block efficiently decomposes features into low- and high-frequency and assigns more computation to high-frequency ones. Thus, it can help the network generate more discriminative representations by explicitly recovering finer details. Our FEB design is simple and generic and can be used as a direct replacement of commonly used SR blocks with no need to change network architectures. We experimentally show that when replacing SR blocks with FEB we consistently improve the reconstruction error, while reducing the number of parameters in the model. Moreover, we propose a lightweight SR model — Frequency-based Enhancement Network (FENet) — based on FEB that matches the performance of larger models. Extensive experiments demonstrate that our proposal performs favorably against the state-of-the-art SR algorithms in terms of visual quality, memory footprint, and inference time. The code is available at https://github.com/pbehjatii/FENet
Address	18 May 2022
Corporate Author				Thesis
Publisher	IEEE	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE			Approved	no
Call Number	Admin @ si @ BRF2022a			Serial	3747
Permanent link to this record