Records | |||||
---|---|---|---|---|---|
Author | Jorge Bernal; Fernando Vilariño; F. Javier Sanchez | ||||
Title | Towards Intelligent Systems for Colonoscopy | Type | Book Chapter | ||
Year | 2011 | Publication | Colonoscopy | Abbreviated Journal | |
Volume | 1 | Issue | Pages | 257-282 | |
Keywords | |||||
Abstract |
In this chapter we present tools that can be used to build intelligent systems for colonoscopy. The idea is to add significant value to the colonoscopy procedure by using methods based on computer vision and artificial intelligence. Intelligent systems are being used to assist in other medical interventions. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Intech | Place of Publication | Editor | Paul Miskovitz | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-953-307-568-6 | Medium | ||
Area | 800 | Expedition | Conference | ||
Notes | MV;SIAI | Approved | no | ||
Call Number | IAM @ iam @ BVS2011 | Serial | 1697 | ||
Permanent link to this record | |||||
Author | Svebor Karaman; Giuseppe Lisanti; Andrew Bagdanov; Alberto del Bimbo | ||||
Title | From re-identification to identity inference: Labeling consistency by local similarity constraints | Type | Book Chapter | ||
Year | 2014 | Publication | Person Re-Identification | Abbreviated Journal | |
Volume | 2 | Issue | Pages | 287-307 | |
Keywords | re-identification; Identity inference; Conditional random fields; Video surveillance | ||||
Abstract |
In this chapter, we introduce the problem of identity inference as a generalization of person re-identification. It is most appropriate to distinguish identity inference from re-identification in situations where a large number of observations must be identified without knowing a priori that groups of test images represent the same individual. The standard single- and multishot person re-identification settings common in the literature are special cases of our formulation. We present an approach to solving identity inference by modeling it as a labeling problem in a Conditional Random Field (CRF). The CRF model ensures that the final labeling gives similar labels to detections that are similar in feature space. Experimental results are given on the ETHZ, i-LIDS and CAVIAR datasets. Our approach yields state-of-the-art performance for multishot re-identification, and our results on the more general identity inference problem demonstrate that we are able to infer the identity of very many examples even with very few labeled images in the gallery. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer London | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2191-6586 | ISBN | 978-1-4471-6295-7 | Medium | |
Area | Expedition | Conference | |||
Notes | LAMP; 600.079 | Approved | no | ||
Call Number | Admin @ si @KLB2014b | Serial | 2521 | ||
Permanent link to this record | |||||
Author | Cesar de Souza | ||||
Title | Action Recognition in Videos: Data-efficient approaches for supervised learning of human action classification models for video | Type | Book Whole | ||
Year | 2018 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract |
In this dissertation, we explore different ways to perform human action recognition in video clips. We focus on data efficiency, proposing new approaches that alleviate the need for laborious and time-consuming manual data annotation. In the first part of this dissertation, we start by analyzing previous state-of-the-art models, comparing their differences and similarities in order to pinpoint where their real strengths come from. Leveraging this information, we then proceed to boost the classification accuracy of shallow models to levels that rival deep neural networks. We introduce hybrid video classification architectures based on carefully designed unsupervised representations of handcrafted spatiotemporal features classified by supervised deep networks. We show in our experiments that our hybrid model combines the best of both worlds: it is data efficient (trained on 150 to 10,000 short clips) and yet improves significantly on the state of the art, including deep models trained on millions of manually labeled images and videos. In the second part of this research, we investigate the generation of synthetic training data for action recognition, as it has recently shown promising results for a variety of other computer vision tasks. We propose an interpretable parametric generative model of human action videos that relies on procedural generation and other computer graphics techniques of modern game engines. We generate a diverse, realistic, and physically plausible dataset of human action videos, called PHAV for “Procedural Human Action Videos”. It contains a total of 39,982 videos, with more than 1,000 examples for each action of 35 categories. Our approach is not limited to existing motion capture sequences, and we procedurally define 14 synthetic actions. We then introduce deep multi-task representation learning architectures to mix synthetic and real videos, even if the action categories differ. Our experiments on the UCF-101 and HMDB-51 benchmarks suggest that combining our large set of synthetic videos with small real-world datasets can boost recognition performance, outperforming fine-tuning state-of-the-art unsupervised generative models of videos. | ||||
Address | April 2018 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Antonio Lopez;Naila Murray | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.118 | Approved | no | ||
Call Number | Admin @ si @ Sou2018 | Serial | 3127 | ||
Permanent link to this record | |||||
Author | Mikhail Mozerov; Fei Yang; Joost Van de Weijer | ||||
Title | Sparse Data Interpolation Using the Geodesic Distance Affinity Space | Type | Journal Article | ||
Year | 2019 | Publication | IEEE Signal Processing Letters | Abbreviated Journal | SPL |
Volume | 26 | Issue | 6 | Pages | 943-947 |
Keywords | |||||
Abstract |
In this letter, we adapt the geodesic distance-based recursive filter to the sparse data interpolation problem. The proposed technique is general and can be easily applied to any kind of sparse data. We demonstrate its superiority over other interpolation techniques in three experiments for qualitative and quantitative evaluation. In addition, we compare our method with the popular interpolation algorithm presented in the paper on EpicFlow optical flow, which is intuitively motivated by a similar geodesic distance principle. The comparison shows that our algorithm is more accurate and considerably faster than the EpicFlow interpolation technique. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.120 | Approved | no | ||
Call Number | Admin @ si @ MYW2019 | Serial | 3261 | ||
Permanent link to this record | |||||
Author | Fatemeh Noroozi; Marina Marjanovic; Angelina Njegus; Sergio Escalera; Gholamreza Anbarjafari | ||||
Title | Fusion of Classifier Predictions for Audio-Visual Emotion Recognition | Type | Conference Article | ||
Year | 2016 | Publication | 23rd International Conference on Pattern Recognition Workshops | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract |
This paper presents a novel multimodal emotion recognition system based on the analysis of audio and visual cues. MFCC-based features are extracted from the audio channel and facial landmark geometric relations are computed from visual data. Both sets of features are learnt separately using state-of-the-art classifiers. In addition, we summarise each emotion video into a reduced set of key-frames, which are learnt in order to visually discriminate emotions by means of a Convolutional Neural Network. Finally, confidence outputs of all classifiers from all modalities are used to define a new feature space to be learnt for final emotion prediction, in a late fusion/stacking fashion. Experiments on the eNTERFACE’05 database show significant performance improvements of our proposed system in comparison to state-of-the-art approaches. | ||||
Address | Cancun; Mexico; December 2016 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPRW | ||
Notes | HuPBA;MILAB; | Approved | no | ||
Call Number | Admin @ si @ NMN2016 | Serial | 2839 | ||
Permanent link to this record | |||||
Author | Dena Bazazian; Dimosthenis Karatzas; Andrew Bagdanov | ||||
Title | Word Spotting in Scene Images based on Character Recognition | Type | Conference Article | ||
Year | 2018 | Publication | IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 1872-1874 | ||
Keywords | |||||
Abstract |
In this paper we address the problem of unconstrained Word Spotting in scene images. We train a Fully Convolutional Network to produce heatmaps of all the character classes. Then, we employ the Text Proposals approach and, via a rectangle classifier, detect the most likely rectangle for each query word based on the character attribute maps. We evaluate the proposed method on ICDAR2015 and show that it is capable of identifying and recognizing query words in natural scene images. | ||||
Address | Salt Lake City; USA; June 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | DAG; 600.129; 600.121 | Approved | no | ||
Call Number | BKB2018a | Serial | 3179 | ||
Permanent link to this record | |||||
Author | Hongxing Gao; Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados; Tomokazu Sato; Masakazu Iwamura; Koichi Kise | ||||
Title | Key-region detection for document images -applications to administrative document retrieval | Type | Conference Article | ||
Year | 2013 | Publication | 12th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 230-234 | ||
Keywords | |||||
Abstract |
In this paper we argue that a key-region detector designed to take into account the special characteristics of document images can result in the detection of fewer and more meaningful key-regions. We propose a fast key-region detector able to capture aspects of the structural information of the document, and demonstrate its efficiency by comparing against standard detectors in an administrative document retrieval scenario. We show that using the proposed detector results in a smaller number of detected key-regions and higher performance without any drop in speed compared to standard state-of-the-art detectors. | ||||
Address | Washington; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1520-5363 | ISBN | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.056; 600.045 | Approved | no | ||
Call Number | Admin @ si @ GRK2013b | Serial | 2293 | ||
Permanent link to this record | |||||
Author | Marçal Rusiñol; J. Chazalon; Jean-Marc Ogier; Josep Llados | ||||
Title | A Comparative Study of Local Detectors and Descriptors for Mobile Document Classification | Type | Conference Article | ||
Year | 2015 | Publication | 13th International Conference on Document Analysis and Recognition ICDAR2015 | Abbreviated Journal | |
Volume | Issue | Pages | 596-600 | ||
Keywords | |||||
Abstract |
In this paper we conduct a comparative study of local key-point detectors and local descriptors for the specific task of mobile document classification. A classification architecture based on direct matching of local descriptors is used as a baseline for the comparative study. A set of four different key-point detectors and four different local descriptors are tested in all the possible combinations. The experiments are conducted on a database consisting of 30 model documents acquired on 6 different backgrounds, totaling more than 36,000 test images. | ||||
Address | Nancy; France; August 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.084; 600.61; 601.223; 600.077 | Approved | no | ||
Call Number | Admin @ si @ RCO2015 | Serial | 2684 | ||
Permanent link to this record | |||||
Author | Kai Wang; Xialei Liu; Andrew Bagdanov; Luis Herranz; Shangling Jui; Joost Van de Weijer | ||||
Title | Incremental Meta-Learning via Episodic Replay Distillation for Few-Shot Image Recognition | Type | Conference Article | ||
Year | 2022 | Publication | CVPR 2022 Workshop on Continual Learning (CLVision, 3rd Edition) | Abbreviated Journal | |
Volume | Issue | Pages | 3728-3738 | ||
Keywords | Training; Computer vision; Image recognition; Upper bound; Conferences; Pattern recognition; Task analysis | ||||
Abstract |
In this paper we consider the problem of incremental meta-learning in which classes are presented incrementally in discrete tasks. We propose Episodic Replay Distillation (ERD), which mixes classes from the current task with exemplars from previous tasks when sampling episodes for meta-learning. To allow the training to benefit from as large a variety of classes as possible, which leads to more generalizable feature representations, we propose the cross-task meta loss. Furthermore, we propose episodic replay distillation that also exploits exemplars for improved knowledge distillation. Experiments on four datasets demonstrate that ERD surpasses the state-of-the-art. In particular, on the more challenging one-shot, long task sequence scenarios, we reduce the gap between Incremental Meta-Learning and the joint-training upper bound from 3.5% / 10.1% / 13.4% / 11.7% with the current state-of-the-art to 2.6% / 2.9% / 5.0% / 0.2% with our method on Tiered-ImageNet / Mini-ImageNet / CIFAR100 / CUB, respectively. | ||||
Address | New Orleans, USA; 20 June 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | LAMP; 600.147 | Approved | no | ||
Call Number | Admin @ si @ WLB2022 | Serial | 3686 | ||
Permanent link to this record | |||||
Author | Francisco Alvaro; Francisco Cruz; Joan Andreu Sanchez; Oriol Ramos Terrades; Jose Miguel Benedi | ||||
Title | Structure Detection and Segmentation of Documents Using 2D Stochastic Context-Free Grammars | Type | Journal Article | ||
Year | 2015 | Publication | Neurocomputing | Abbreviated Journal | NEUCOM |
Volume | 150 | Issue | A | Pages | 147-154 |
Keywords | document image analysis; stochastic context-free grammars; text classification features | ||||
Abstract |
In this paper we define a bidimensional extension of Stochastic Context-Free Grammars for structure detection and segmentation of images of documents. Two sets of text classification features are used to perform an initial classification of each zone of the page. Then, the document segmentation is obtained as the most likely hypothesis according to a stochastic grammar. We used a dataset of historical marriage license books to validate this approach. We also tested several inference algorithms for Probabilistic Graphical Models and the results showed that the proposed grammatical model outperformed the other methods. Furthermore, grammars also provide the document structure along with its segmentation. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 601.158; 600.077; 600.061 | Approved | no | ||
Call Number | Admin @ si @ ACS2015 | Serial | 2531 | ||
Permanent link to this record | |||||
Author | Jon Almazan; Alicia Fornes; Ernest Valveny | ||||
Title | A Deformable HOG-based Shape Descriptor | Type | Conference Article | ||
Year | 2013 | Publication | 12th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1022-1026 | ||
Keywords | |||||
Abstract |
In this paper we deal with the problem of recognizing handwritten shapes. We present a new deformable feature extraction method that adapts to the shape to be described, dealing in this way with the variability introduced in the handwriting domain. It consists of a selection of the regions that best define the shape to be described, followed by the computation of histograms of oriented gradients-based features over these points. Our results significantly outperform other descriptors in the literature for the task of hand-drawn shape recognition and handwritten word retrieval. | ||||
Address | Washington; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1520-5363 | ISBN | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ AFV2013 | Serial | 2326 | ||
Permanent link to this record | |||||
Author | Francisco Alvaro; Francisco Cruz; Joan Andreu Sanchez; Oriol Ramos Terrades; Jose Miguel Benedi | ||||
Title | Page Segmentation of Structured Documents Using 2D Stochastic Context-Free Grammars | Type | Conference Article | ||
Year | 2013 | Publication | 6th Iberian Conference on Pattern Recognition and Image Analysis | Abbreviated Journal | |
Volume | 7887 | Issue | Pages | 133-140 | |
Keywords | |||||
Abstract |
In this paper we define a bidimensional extension of Stochastic Context-Free Grammars for page segmentation of structured documents. Two sets of text classification features are used to perform an initial classification of each zone of the page. Then, the page segmentation is obtained as the most likely hypothesis according to a grammar. This approach is compared to Conditional Random Fields and results show significant improvements in several cases. Furthermore, grammars provide a detailed segmentation that allowed a semantic evaluation which also validates this model. | ||||
Address | Madeira; Portugal; June 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-38627-5 | Medium | |
Area | Expedition | Conference | IbPRIA | ||
Notes | DAG; 605.203 | Approved | no | ||
Call Number | Admin @ si @ ACS2013 | Serial | 2328 | ||
Permanent link to this record | |||||
Author | Jon Almazan; Alicia Fornes; Ernest Valveny | ||||
Title | A non-rigid appearance model for shape description and recognition | Type | Journal Article | ||
Year | 2012 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 45 | Issue | 9 | Pages | 3105–3113 |
Keywords | Shape recognition; Deformable models; Shape modeling; Hand-drawn recognition | ||||
Abstract |
In this paper we describe a framework to learn a model of shape variability in a set of patterns. The framework is based on the Active Appearance Model (AAM) and allows shape deformations to be combined with appearance variability. We have used two modifications of the Blurred Shape Model (BSM) descriptor as basic shape and appearance features to learn the model. These modifications overcome the rigidity of the original BSM, adapting it to the deformations of the shape to be represented. We have applied this framework to representation and classification of handwritten digits and symbols. We show that results of the proposed methodology outperform the original BSM approach. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0031-3203 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | DAG @ dag @ AFV2012 | Serial | 1982 | ||
Permanent link to this record | |||||
Author | Murad Al Haj; Andrew Bagdanov; Jordi Gonzalez; Xavier Roca | ||||
Title | Reactive object tracking with a single PTZ camera | Type | Conference Article | ||
Year | 2010 | Publication | 20th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1690–1693 | ||
Keywords | |||||
Abstract |
In this paper we describe a novel approach to reactive tracking of moving targets with a pan-tilt-zoom camera. The approach uses an extended Kalman filter to jointly track the object position in the real world, its velocity in 3D and the camera intrinsics, in addition to the rate of change of these parameters. The filter outputs are used as inputs to PID controllers which continuously adjust the camera motion in order to reactively track the object at a constant image velocity while simultaneously maintaining a desirable target scale in the image plane. We provide experimental results on simulated and real tracking sequences to show how our tracker is able to accurately estimate both 3D object position and camera intrinsics with very high precision over a wide range of focal lengths. | ||||
Address | Istanbul (Turkey) | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1051-4651 | ISBN | 978-1-4244-7542-1 | Medium | |
Area | Expedition | Conference | ICPR | ||
Notes | ISE | Approved | no | ||
Call Number | DAG @ dag @ ABG2010 | Serial | 1418 | ||
Permanent link to this record | |||||
Author | Svebor Karaman; Giuseppe Lisanti; Andrew Bagdanov; Alberto del Bimbo | ||||
Title | Leveraging local neighborhood topology for large scale person re-identification | Type | Journal Article | ||
Year | 2014 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 47 | Issue | 12 | Pages | 3767–3778 |
Keywords | Re-identification; Conditional random field; Semi-supervised; ETHZ; CAVIAR; 3DPeS; CMV100 | ||||
Abstract |
In this paper we describe a semi-supervised approach to person re-identification that combines discriminative models of person identity with a Conditional Random Field (CRF) to exploit the local manifold approximation induced by the nearest neighbor graph in feature space. The linear discriminative models learned on few gallery images provide coarse separation of probe images into identities, while a graph topology defined by distances between all person images in feature space leverages local support for label propagation in the CRF. We evaluate our approach using multiple scenarios on several publicly available datasets, where the number of identities varies from 28 to 191 and the number of images ranges between 1003 and 36 171. We demonstrate that the discriminative model and the CRF are complementary and that the combination of both leads to significant improvement over state-of-the-art approaches. We further demonstrate how the performance of our approach improves with increasing test data and also with increasing amounts of additional unlabeled data. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 601.240; 600.079 | Approved | no | ||
Call Number | Admin @ si @ KLB2014a | Serial | 2522 | ||
Permanent link to this record |