Author Meysam Madadi; Hugo Bertiche; Sergio Escalera
  Title SMPLR: Deep learning based SMPL reverse for 3D human pose and shape recovery Type Journal Article
  Year 2020 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 106 Issue Pages 107472  
  Keywords Deep learning; 3D Human pose; Body shape; SMPL; Denoising autoencoder; Volumetric stack hourglass  
  Abstract In this paper we propose to embed SMPL within a deep-based model to accurately estimate 3D pose and shape from a still RGB image. We use CNN-based 3D joint predictions as an intermediate representation to regress SMPL pose and shape parameters. Later, 3D joints are reconstructed again in the SMPL output. This module can be seen as an autoencoder where the encoder is a deep neural network and the decoder is the SMPL model. We refer to this as SMPL reverse (SMPLR). By implementing SMPLR as an encoder-decoder we avoid the need for complex constraints on pose and shape. Furthermore, given that in-the-wild datasets usually lack accurate 3D annotations, it is desirable to lift 2D joints to 3D without pairing 3D annotations with RGB images. Therefore, we also propose a denoising autoencoder (DAE) module between CNN and SMPLR, able to lift 2D joints to 3D and partially recover from structured error. We evaluate our method on SURREAL and Human3.6M datasets, showing improvement over SMPL-based state-of-the-art alternatives by about 4 and 12 mm, respectively.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no proj Approved no  
  Call Number Admin @ si @ MBE2020 Serial 3439  

Author Alejandro Cartas; Petia Radeva; Mariella Dimiccoli
  Title Activities of Daily Living Monitoring via a Wearable Camera: Toward Real-World Applications Type Journal Article
  Year 2020 Publication IEEE Access Abbreviated Journal ACCESS  
  Volume 8 Issue Pages 77344 - 77363  
  Keywords  
  Abstract Activity recognition from wearable photo-cameras is crucial for lifestyle characterization and health monitoring. However, to enable its widespread use in real-world applications, a high level of generalization needs to be ensured on unseen users. Currently, state-of-the-art methods have been tested only on relatively small datasets consisting of data collected by a few users that are partially seen during training. In this paper, we built a new egocentric dataset acquired by 15 people through a wearable photo-camera and used it to test the generalization capabilities of several state-of-the-art methods for egocentric activity recognition on unseen users and daily image sequences. In addition, we propose several variants to state-of-the-art deep learning architectures, and we show that it is possible to achieve 79.87% accuracy on users unseen during training. Furthermore, to show that the proposed dataset and approach can be useful in real-world applications, where data can be acquired by different wearable cameras and labeled data are scarcely available, we employed a domain adaptation strategy on two egocentric activity recognition benchmark datasets. These experiments show that the model learned with our dataset can easily be transferred to other domains with a very small amount of labeled data. Taken together, these results show that activity recognition from wearable photo-cameras is mature enough to be tested in real-world applications.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ CRD2020 Serial 3436  

Author Estefania Talavera; Carolin Wuerich; Nicolai Petkov; Petia Radeva
  Title Topic modelling for routine discovery from egocentric photo-streams Type Journal Article
  Year 2020 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 104 Issue Pages 107330  
  Keywords Routine; Egocentric vision; Lifestyle; Behaviour analysis; Topic modelling  
  Abstract Developing tools to understand and visualize lifestyle is of high interest when addressing the improvement of habits and well-being of people. Routine, defined as the usual things that a person does daily, helps describe an individual’s lifestyle. With this paper, we are the first ones to address the development of novel tools for automatic discovery of routine days of an individual from his/her egocentric images. In the proposed model, sequences of images are first characterized by semantic labels detected by pre-trained CNNs. Then, these features are organized in temporal-semantic documents to later be embedded into a topic model space. Finally, Dynamic-Time-Warping and Spectral-Clustering methods are used for final day routine/non-routine discrimination. Moreover, we introduce a new EgoRoutine-dataset, a collection of 104 egocentric days with more than 100,000 images recorded by 7 users. Results show that routine can be discovered and behavioural patterns can be observed.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ TWP2020 Serial 3435  

Author Cristhian A. Aguilera-Carrasco; Cristhian Aguilera; Cristobal A. Navarro; Angel Sappa
  Title Fast CNN Stereo Depth Estimation through Embedded GPU Devices Type Journal Article
  Year 2020 Publication Sensors Abbreviated Journal SENS  
  Volume 20 Issue 11 Pages 3249  
  Keywords stereo matching; deep learning; embedded GPU  
  Abstract Current CNN-based stereo depth estimation models can barely run under real-time constraints on embedded graphic processing unit (GPU) devices. Moreover, state-of-the-art evaluations usually do not consider model optimization techniques, so the current potential of embedded GPU devices remains unknown. In this work, we evaluate two state-of-the-art models on three different embedded GPU devices, with and without optimization methods, presenting performance results that illustrate the actual capabilities of embedded GPU devices for stereo depth estimation. More importantly, based on our evaluation, we propose the use of a U-Net like architecture for postprocessing the cost-volume, instead of a typical sequence of 3D convolutions, drastically increasing the runtime speed of current models. In our experiments, we achieve real-time inference speed, in the range of 5–32 ms, for 1216 × 368 input stereo images on the Jetson TX2, Jetson Xavier, and Jetson Nano embedded devices.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MSIAU; 600.122 Approved no  
  Call Number Admin @ si @ AAN2020 Serial 3428  

Author Zhengying Liu; Zhen Xu; Sergio Escalera; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Adrien Pavao; Sebastien Treguer; Wei-Wei Tu
  Title Towards automated computer vision: analysis of the AutoCV challenges 2019 Type Journal Article
  Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 135 Issue Pages 196-203  
  Keywords Computer vision; AutoML; Deep learning  
  Abstract We present the results of recent challenges in Automated Computer Vision (AutoCV, renamed here for clarity AutoCV1 and AutoCV2, 2019), which are part of a series of challenges on Automated Deep Learning (AutoDL). These two competitions aim at searching for fully automated solutions for classification tasks in computer vision, with an emphasis on any-time performance. The first competition was limited to image classification while the second one included both images and videos. Our design required participants to submit their code on a challenge platform for blind testing on five datasets, both for training and testing, without any human intervention whatsoever. Winning solutions adopted deep learning techniques based on already published architectures, such as AutoAugment, MobileNet and ResNet, to reach state-of-the-art performance in the time budget of the challenge (only 20 minutes of GPU time). The novel contributions include strategies to deliver good preliminary results at any time during the learning process, such that a method can be stopped early and still deliver good performance. This feature is key for the adoption of such techniques by data analysts desiring to rapidly obtain preliminary results on large datasets and to speed up the development process. The soundness of our design was verified in several aspects: (1) Little overfitting of the on-line leaderboard providing feedback on 5 development datasets was observed, compared to the final blind testing on the 5 (separate) final test datasets, suggesting that winning solutions might generalize to other computer vision classification tasks; (2) Error bars on the winners’ performance allow us to say with confidence that they performed significantly better than the baseline solutions we provided; (3) The ranking of participants according to the any-time metric we designed, namely the Area under the Learning Curve, was different from that of the fixed-time metric, i.e. AUC at the end of the fixed time budget. We released all winning solutions under open-source licenses. At the end of the AutoDL challenge series, all data of the challenge will be made publicly available, thus providing a collection of uniformly formatted datasets, which can serve to conduct further research, particularly on meta-learning.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no proj Approved no  
  Call Number Admin @ si @ LXE2020 Serial 3427  

Author Yaxing Wang; Luis Herranz; Joost Van de Weijer
  Title Mix and match networks: multi-domain alignment for unpaired image-to-image translation Type Journal Article
  Year 2020 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
  Volume 128 Issue Pages 2849–2872  
  Keywords  
  Abstract This paper addresses the problem of inferring unseen cross-modal image-to-image translations between multiple modalities. We assume that only some of the pairwise translations have been seen (i.e. trained) and infer the remaining unseen translations (where training pairs are not available). We propose mix and match networks, an approach where multiple encoders and decoders are aligned in such a way that the desired translation can be obtained by simply cascading the source encoder and the target decoder, even when they have not interacted during the training stage (i.e. unseen). The main challenge lies in the alignment of the latent representations at the bottlenecks of encoder-decoder pairs. We propose an architecture with several tools to encourage alignment, including autoencoders and robust side information and latent consistency losses. We show the benefits of our approach in terms of effectiveness and scalability compared with other pairwise image-to-image translation approaches. We also propose zero-pair cross-modal image translation, a challenging setting where the objective is inferring semantic segmentation from depth (and vice-versa) without explicit segmentation-depth pairs, and only from two (disjoint) segmentation-RGB and depth-RGB training sets. We observe that a certain part of the shared information between unseen modalities might not be reachable, so we further propose a variant that leverages pseudo-pairs, which allows us to exploit this shared information between the unseen modalities.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.109; 600.106; 600.141; 600.120 Approved no  
  Call Number Admin @ si @ WHW2020 Serial 3424  

Author Pau Rodriguez; Diego Velazquez; Guillem Cucurull; Josep M. Gonfaus; Xavier Roca; Jordi Gonzalez
  Title Pay attention to the activations: a modular attention mechanism for fine-grained image recognition Type Journal Article
  Year 2020 Publication IEEE Transactions on Multimedia Abbreviated Journal TMM  
  Volume 22 Issue 2 Pages 502-514  
  Keywords  
  Abstract Fine-grained image recognition is central to many multimedia tasks such as search, retrieval, and captioning. Unfortunately, these tasks are still challenging since the appearance of samples of the same class can be more different than those from different classes. This issue is mainly due to changes in deformation, pose, and the presence of clutter. In the literature, attention has been one of the most successful strategies to handle the aforementioned problems. Attention has been typically implemented in neural networks by selecting the most informative regions of the image that improve classification. In contrast, in this paper, attention is not applied at the image level but to the convolutional feature activations. In essence, with our approach, the neural model learns to attend to lower-level feature activations without requiring part annotations and uses those activations to update and rectify the output likelihood distribution. The proposed mechanism is modular, architecture-independent, and efficient in terms of both parameters and computation required. Experiments demonstrate that well-known networks such as wide residual networks and ResNeXt, when augmented with our approach, systematically improve their classification accuracy and become more robust to changes in deformation and pose and to the presence of clutter. As a result, our proposal reaches state-of-the-art classification accuracies in CIFAR-10, the Adience gender recognition task, Stanford Dogs, and UEC-Food100 while obtaining competitive performance in ImageNet, CIFAR-100, CUB200 Birds, and Stanford Cars. In addition, we analyze the different components of our model, showing that the proposed attention modules succeed in finding the most discriminative regions of the image. Finally, as a proof of concept, we demonstrate that with only local predictions, an augmented neural network can successfully classify an image before reaching any fully connected layer, thus reducing the amount of computation by up to 10%.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE; 600.119; 600.098 Approved no  
  Call Number Admin @ si @ RVC2020a Serial 3417  

Author Yunan Li; Jun Wan; Qiguang Miao; Sergio Escalera; Huijuan Fang; Huizhou Chen; Xiangda Qi; Guodong Guo
  Title CR-Net: A Deep Classification-Regression Network for Multimodal Apparent Personality Analysis Type Journal Article
  Year 2020 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
  Volume 128 Issue Pages 2763–2780  
  Keywords  
  Abstract First impressions strongly influence social interactions, having a high impact on personal and professional life. In this paper, we present a deep Classification-Regression Network (CR-Net) for analyzing the Big Five personality problem and further assisting in job interview recommendation in a first impressions setup. The setup is based on the ChaLearn First Impressions dataset, including multimodal data with video, audio, and text converted from the corresponding audio data, where each person is talking in front of a camera. In order to give a comprehensive prediction, we analyze the videos from both the entire scene (including the person’s motions and background) and the face of the person. Our CR-Net first performs personality trait classification and applies a regression later, which can obtain accurate predictions for both personality traits and interview recommendation. Furthermore, we present a new loss function called Bell Loss to address inaccurate predictions caused by the regression-to-the-mean problem. Extensive experiments on the First Impressions dataset show the effectiveness of our proposed network, outperforming the state-of-the-art.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no menciona Approved no  
  Call Number Admin @ si @ LWM2020 Serial 3413  

Author Razieh Rastgoo; Kourosh Kiani; Sergio Escalera
  Title Hand sign language recognition using multi-view hand skeleton Type Journal Article
  Year 2020 Publication Expert Systems With Applications Abbreviated Journal ESWA  
  Volume 150 Issue Pages 113336  
  Keywords Multi-view hand skeleton; Hand sign language recognition; 3DCNN; Hand pose estimation; RGB video; Hand action recognition  
  Abstract Hand sign language recognition from video is a challenging research area in computer vision, whose performance is affected by hand occlusion, fast hand movement, illumination changes, or background complexity, just to mention a few. In recent years, deep learning approaches have achieved state-of-the-art results in the field, though previous challenges are not completely solved. In this work, we propose a novel deep learning-based pipeline architecture for efficient automatic hand sign language recognition using Single Shot Detector (SSD), 2D Convolutional Neural Network (2DCNN), 3D Convolutional Neural Network (3DCNN), and Long Short-Term Memory (LSTM) from RGB input videos. We use a CNN-based model which estimates the 3D hand keypoints from 2D input frames. After that, we connect these estimated keypoints to build the hand skeleton by using the midpoint algorithm. In order to obtain a more discriminative representation of hands, we project the 3D hand skeleton into three-view surface images. We further employ the heatmap image of detected keypoints as input for refinement in a stacked fashion. We apply 3DCNNs on the stacked features of the hand, including pixel level, multi-view hand skeleton, and heatmap features, to extract discriminant local spatio-temporal features from these stacked inputs. The outputs of the 3DCNNs are fused and fed to an LSTM to model long-term dynamics of hand sign gestures. Analyzing 2DCNN vs. 3DCNN using different numbers of stacked inputs into the network, we demonstrate that 3DCNNs better capture the spatio-temporal dynamics of hands. To the best of our knowledge, this is the first time that this multi-modal and multi-view set of hand skeleton features is applied for hand sign language recognition. Furthermore, we present a new large-scale hand sign language dataset, namely RKS-PERSIANSIGN, including 10,000 RGB videos of 100 Persian sign words. Evaluation results of the proposed model on three datasets, NYU, First-Person, and RKS-PERSIANSIGN, indicate that our model outperforms state-of-the-art models in hand sign language recognition, hand pose estimation, and hand action recognition.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no proj Approved no  
  Call Number Admin @ si @ RKE2020a Serial 3411  

Author Margarita Torre; Beatriz Remeseiro; Petia Radeva; Fernando Martinez
  Title DeepNEM: Deep Network Energy-Minimization for Agricultural Field Segmentation Type Journal Article
  Year 2020 Publication IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing Abbreviated Journal JSTAEOR  
  Volume 13 Issue Pages 726-737  
  Keywords  
  Abstract One of the main characteristics of agricultural fields is that the appearance of different crops and their growth status, in an aerial image, is varied, and has a wide range of radiometric values and high level of variability. The extraction of these fields and their monitoring are activities that require a high level of human intervention. In this article, we propose a novel automatic algorithm, named deep network energy-minimization (DeepNEM), to extract agricultural fields in aerial images. The model-guided process selects the most relevant image clues extracted by a deep network, completes them and finally generates regions that represent the agricultural fields under a minimization scheme. DeepNEM has been tested over a broad range of fields in terms of size, shape, and content. Different measures were used to compare DeepNEM with other methods, and to prove that it represents an improved approach to achieve a high-quality segmentation of agricultural fields. Furthermore, this article also presents a new public dataset composed of 1200 images with their parcel boundary annotations.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB Approved no  
  Call Number Admin @ si @ TRR2020 Serial 3410  

Author Khalid El Asnaoui; Petia Radeva
  Title Automatically Assess Day Similarity Using Visual Lifelogs Type Journal Article
  Year 2020 Publication International Journal of Intelligent Systems Abbreviated Journal IJIS  
  Volume 29 Issue Pages 298–310  
  Keywords  
  Abstract Today, we witness the appearance of many lifelogging cameras that are able to capture the life of a person wearing the camera and which produce a large number of images every day. Automatically characterizing the experience and extracting patterns of behavior of individuals from this huge collection of unlabeled and unstructured egocentric data present major challenges and require novel and efficient algorithmic solutions. The main goal of this work is to propose a new method to automatically assess day similarity from the lifelogging images of a person. We propose a technique to measure the similarity between images based on the Swain’s distance and generalize it to detect the similarity between daily visual data. For this purpose, we apply dynamic time warping (DTW) combined with the Swain’s distance for final day similarity estimation. For validation, we apply our technique on the Egocentric Dataset of University of Barcelona (EDUB) of 4912 daily images acquired by four persons with preliminary encouraging results.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number AsR2020 Serial 3409  

Author Hugo Jair Escalante; Heysem Kaya; Albert Ali Salah; Sergio Escalera; Yagmur Gucluturk; Umut Guçlu; Xavier Baro; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Stephane Ayache; Evelyne Viegas; Furkan Gurpinar; Achmadnoer Sukma Wicaksana; Cynthia Liem; Marcel A. J. Van Gerven; Rob Van Lier
  Title Modeling, Recognizing, and Explaining Apparent Personality from Videos Type Journal Article
  Year 2022 Publication IEEE Transactions on Affective Computing Abbreviated Journal TAC  
  Volume 13 Issue 2 Pages 894-911  
  Keywords  
  Abstract Explainability and interpretability are two critical aspects of decision support systems. Despite their importance, it is only recently that researchers are starting to explore these aspects. This paper provides an introduction to explainability and interpretability in the context of apparent personality recognition. To the best of our knowledge, this is the first effort in this direction. We describe a challenge we organized on explainability in first impressions analysis from video. We analyze in detail the newly introduced data set, evaluation protocol, proposed solutions and summarize the results of the challenge. We investigate the issue of bias in detail. Finally, derived from our study, we outline research opportunities that we foresee will be relevant in this area in the near future.  
  Address 1 April-June 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no menciona Approved no  
  Call Number Admin @ si @ EKS2022 Serial 3406  

Author Gabriel Villalonga; Joost Van de Weijer; Antonio Lopez
  Title Recognizing new classes with synthetic data in the loop: application to traffic sign recognition Type Journal Article
  Year 2020 Publication Sensors Abbreviated Journal SENS  
  Volume 20 Issue 3 Pages 583  
  Keywords  
  Abstract On-board vision systems may need to increase the number of classes that can be recognized in a relatively short period. For instance, a traffic sign recognition system may suddenly be required to recognize new signs. Since collecting and annotating samples of such new classes may need more time than we wish, especially for uncommon signs, we propose a method to generate these samples by combining synthetic images and Generative Adversarial Network (GAN) technology. In particular, the GAN is trained on synthetic and real-world samples from known classes to perform synthetic-to-real domain adaptation, but applied to synthetic samples of the new classes. Using the Tsinghua dataset with a synthetic counterpart, SYNTHIA-TS, we have run an extensive set of experiments. The results show that the proposed method is indeed effective, provided that we use a proper Convolutional Neural Network (CNN) to perform the traffic sign recognition (classification) task as well as a proper GAN to transform the synthetic images. Here, a ResNet101-based classifier and domain adaptation based on CycleGAN performed extremely well for a ratio of ∼1/4 for new/known classes; even for more challenging ratios such as ∼4/1, the results are also very positive.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; ADAS; 600.118; 600.120 Approved no  
  Call Number Admin @ si @ VWL2020 Serial 3405  

Author Cristina Sanchez Montes; Jorge Bernal; Ana Garcia Rodriguez; Henry Cordova; Gloria Fernandez Esparrach
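  Title Revisión de métodos computacionales de detección y clasificación de pólipos en imagen de colonoscopia [Review of computational methods for the detection and classification of polyps in colonoscopy images] Type Journal Article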
  Year 2020 Publication Gastroenterología y Hepatología Abbreviated Journal GH  
  Volume 43 Issue 4 Pages 222-232  
  Keywords  
  Abstract Computer-aided diagnosis (CAD) is a tool with great potential to help endoscopists in the tasks of detecting and histologically classifying colorectal polyps. In recent years, different technologies have been described and their potential utility has been increasingly evidenced, which has generated great expectations among scientific societies. However, most of these works are retrospective and use images of different quality and characteristics which are analysed off line. This review aims to familiarise gastroenterologists with computational methods and the particularities of endoscopic imaging, which have an impact on image processing analysis. Finally, the publicly available image databases, needed to compare and confirm the results obtained with different methods, are presented.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MV; Approved no  
  Call Number Admin @ si @ SBG2020 Serial 3404  

Author Ana Garcia Rodriguez; Jorge Bernal; F. Javier Sanchez; Henry Cordova; Rodrigo Garces Duran; Cristina Rodriguez de Miguel; Gloria Fernandez Esparrach
  Title Polyp fingerprint: automatic recognition of colorectal polyps’ unique features Type Journal Article
  Year 2020 Publication Surgical Endoscopy and other Interventional Techniques Abbreviated Journal SEND  
  Volume 34 Issue 4 Pages 1887-1889  
  Keywords  
  Abstract BACKGROUND:
Content-based image retrieval (CBIR) is an application of machine learning used to retrieve images by similarity on the basis of features. Our objective was to develop a CBIR system that could identify images containing the same polyp ('polyp fingerprint').

METHODS:
A machine learning technique called Bag of Words was used to describe each endoscopic image containing a polyp in a unique way. The system was tested with 243 white light images belonging to 99 different polyps (for each polyp there were at least two images representing it in two different temporal moments). Images were acquired in routine colonoscopies at Hospital Clínic using high-definition Olympus endoscopes. The method provided for each image the closest match within the dataset.

RESULTS:
The system matched another image of the same polyp in 221/243 cases (91%). No differences were observed in the number of correct matches according to Paris classification (protruded: 90.7% vs. non-protruded: 91.3%) and size (< 10 mm: 91.6% vs. > 10 mm: 90%).

CONCLUSIONS:
A CBIR system can accurately match two images containing the same polyp, which could be a helpful aid for polyp image recognition.

KEYWORDS:
Artificial intelligence; Colorectal polyps; Content-based image retrieval
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MV; no menciona Approved no  
  Call Number Admin @ si @ Serial 3403  