Home | [11–20] << 21 22 23 24 25 26 27 28 29 30 >> [31–40] |
Records | |||||
---|---|---|---|---|---|
Author | Xialei Liu; Marc Masana; Luis Herranz; Joost Van de Weijer; Antonio Lopez; Andrew Bagdanov | ||||
Title | Rotate your Networks: Better Weight Consolidation and Less Catastrophic Forgetting | Type | Conference Article | ||
Year | 2018 | Publication | 24th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 2262-2268 | ||
Keywords | |||||
Abstract | In this paper we propose an approach to avoiding catastrophic forgetting in sequential task learning scenarios. Our technique is based on a network reparameterization that approximately diagonalizes the Fisher Information Matrix of the network parameters. This reparameterization takes the form of
a factorized rotation of parameter space which, when used in conjunction with Elastic Weight Consolidation (which assumes a diagonal Fisher Information Matrix), leads to significantly better performance on lifelong learning of sequential tasks. Experimental results on the MNIST, CIFAR-100, CUB-200 and Stanford-40 datasets demonstrate that we significantly improve the results of standard elastic weight consolidation, and that we obtain competitive results when compared to the state-of-the-art in lifelong learning without forgetting. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | LAMP; ADAS; 601.305; 601.109; 600.124; 600.106; 602.200; 600.120; 600.118 | Approved | no | ||
Call Number | Admin @ si @ LMH2018 | Serial | 3160 | ||
Permanent link to this record | |||||
Author | Sounak Dey; Anjan Dutta; Suman Ghosh; Ernest Valveny; Josep Llados; Umapada Pal | ||||
Title | Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using Text and Sketch | Type | Conference Article | ||
Year | 2018 | Publication | 24th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 916 - 921 | ||
Keywords | |||||
Abstract | In this work we introduce a cross modal image retrieval system that allows both text and sketch as input modalities for the query. A cross-modal deep network architecture is formulated to jointly model the sketch and text input modalities as well as the the image output modality, learning a common embedding between text and images and between sketches and images. In addition, an attention model is used to selectively focus the attention on the different objects of the image, allowing for retrieval with multiple objects in the query. Experiments show that the proposed method performs the best in both single and multiple object image retrieval in standard datasets. | ||||
Address | Beijing; China; August 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; 602.167; 602.168; 600.097; 600.084; 600.121; 600.129 | Approved | no | ||
Call Number | Admin @ si @ DDG2018b | Serial | 3152 | ||
Permanent link to this record | |||||
Author | Pau Riba; Andreas Fischer; Josep Llados; Alicia Fornes | ||||
Title | Learning Graph Distances with Message Passing Neural Networks | Type | Conference Article | ||
Year | 2018 | Publication | 24th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 2239-2244 | ||
Keywords | ★Best Paper Award★ | ||||
Abstract | Graph representations have been widely used in pattern recognition thanks to their powerful representation formalism and rich theoretical background. A number of error-tolerant graph matching algorithms such as graph edit distance have been proposed for computing a distance between two labelled graphs. However, they typically suffer from a high
computational complexity, which makes it difficult to apply these matching algorithms in a real scenario. In this paper, we propose an efficient graph distance based on the emerging field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure and learns a metric with a siamese network approach. The performance of the proposed graph distance is validated in two application cases, graph classification and graph retrieval of handwritten words, and shows a promising performance when compared with (approximate) graph edit distance benchmarks. |
||||
Address | Beijing; China; August 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; 600.097; 603.057; 601.302; 600.121 | Approved | no | ||
Call Number | Admin @ si @ RFL2018 | Serial | 3168 | ||
Permanent link to this record | |||||
Author | Gema Rotger; Felipe Lumbreras; Francesc Moreno-Noguer; Antonio Agudo | ||||
Title | 2D-to-3D Facial Expression Transfer | Type | Conference Article | ||
Year | 2018 | Publication | 24th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 2008 - 2013 | ||
Keywords | |||||
Abstract | Automatically changing the expression and physical features of a face from an input image is a topic that has been traditionally tackled in a 2D domain. In this paper, we bring this problem to 3D and propose a framework that given an
input RGB video of a human face under a neutral expression, initially computes his/her 3D shape and then performs a transfer to a new and potentially non-observed expression. For this purpose, we parameterize the rest shape –obtained from standard factorization approaches over the input video– using a triangular mesh which is further clustered into larger macro-segments. The expression transfer problem is then posed as a direct mapping between this shape and a source shape, such as the blend shapes of an off-the-shelf 3D dataset of human facial expressions. The mapping is resolved to be geometrically consistent between 3D models by requiring points in specific regions to map on semantic equivalent regions. We validate the approach on several synthetic and real examples of input faces that largely differ from the source shapes, yielding very realistic expression transfers even in cases with topology changes, such as a synthetic video sequence of a single-eyed cyclops. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | ADAS; 600.086; 600.130; 600.118 | Approved | no | ||
Call Number | Admin @ si @ RLM2018 | Serial | 3232 | ||
Permanent link to this record | |||||
Author | Lu Yu; Yongmei Cheng; Joost Van de Weijer | ||||
Title | Weakly Supervised Domain-Specific Color Naming Based on Attention | Type | Conference Article | ||
Year | 2018 | Publication | 24th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 3019 - 3024 | ||
Keywords | |||||
Abstract | The majority of existing color naming methods focuses on the eleven basic color terms of the English language. However, in many applications, different sets of color names are used for the accurate description of objects. Labeling data to learn these domain-specific color names is an expensive and laborious task. Therefore, in this article we aim to learn color names from weakly labeled data. For this purpose, we add an attention branch to the color naming network. The attention branch is used to modulate the pixel-wise color naming predictions of the network. In experiments, we illustrate that the attention branch correctly identifies the relevant regions. Furthermore, we show that our method obtains state-of-the-art results for pixel-wise and image-wise classification on the EBAY dataset and is able to learn color names for various domains. | ||||
Address | Beijing; August 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | LAMP; 600.109; 602.200; 600.120 | Approved | no | ||
Call Number | Admin @ si @ YCW2018 | Serial | 3243 | ||
Permanent link to this record | |||||
Author | Jiaolong Xu; Sebastian Ramos;David Vazquez; Antonio Lopez | ||||
Title | Cost-sensitive Structured SVM for Multi-category Domain Adaptation | Type | Conference Article | ||
Year | 2014 | Publication | 22nd International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 3886 - 3891 | ||
Keywords | Domain Adaptation; Pedestrian Detection | ||||
Abstract | Domain adaptation addresses the problem of accuracy drop that a classifier may suffer when the training data (source domain) and the testing data (target domain) are drawn from different distributions. In this work, we focus on domain adaptation for structured SVM (SSVM). We propose a cost-sensitive domain adaptation method for SSVM, namely COSS-SSVM. In particular, during the re-training of an adapted classifier based on target and source data, the idea that we explore consists in introducing a non-zero cost even for correctly classified source domain samples. Eventually, we aim to learn a more targetoriented classifier by not rewarding (zero loss) properly classified source-domain training samples. We assess the effectiveness of COSS-SSVM on multi-category object recognition. | ||||
Address | Stockholm; Sweden; August 2014 | ||||
Corporate Author | Thesis | ||||
Publisher | IEEE | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1051-4651 | ISBN | Medium | ||
Area | Expedition | Conference | ICPR | ||
Notes | ADAS; 600.057; 600.054; 601.217; 600.076 | Approved | no | ||
Call Number | ADAS @ adas @ XRV2014a | Serial | 2434 | ||
Permanent link to this record | |||||
Author | Francisco Cruz; Oriol Ramos Terrades | ||||
Title | EM-Based Layout Analysis Method for Structured Documents | Type | Conference Article | ||
Year | 2014 | Publication | 22nd International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 315-320 | ||
Keywords | |||||
Abstract | In this paper we present a method to perform layout analysis in structured documents. We proposed an EM-based algorithm to fit a set of Gaussian mixtures to the different regions according to the logical distribution along the page. After the convergence, we estimate the final shape of the regions according
to the parameters computed for each component of the mixture. We evaluated our method in the task of record detection in a collection of historical structured documents and performed a comparison with other previous works in this task. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1051-4651 | ISBN | Medium | ||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; 602.006; 600.061; 600.077 | Approved | no | ||
Call Number | Admin @ si @ CrR2014 | Serial | 2530 | ||
Permanent link to this record | |||||
Author | Lluis Gomez; Dimosthenis Karatzas | ||||
Title | MSER-based Real-Time Text Detection and Tracking | Type | Conference Article | ||
Year | 2014 | Publication | 22nd International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 3110 - 3115 | ||
Keywords | |||||
Abstract | We present a hybrid algorithm for detection and tracking of text in natural scenes that goes beyond the fulldetection approaches in terms of time performance optimization.
A state-of-the-art scene text detection module based on Maximally Stable Extremal Regions (MSER) is used to detect text asynchronously, while on a separate thread detected text objects are tracked by MSER propagation. The cooperation of these two modules yields real time video processing at high frame rates even on low-resource devices. |
||||
Address | Stockholm; August 2014 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1051-4651 | ISBN | Medium | ||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; 600.056; 601.158; 601.197; 600.077 | Approved | no | ||
Call Number | Admin @ si @ GoK2014a | Serial | 2492 | ||
Permanent link to this record | |||||
Author | P. Wang; V. Eglin; C. Garcia; C. Largeron; Josep Llados; Alicia Fornes | ||||
Title | A Coarse-to-Fine Word Spotting Approach for Historical Handwritten Documents Based on Graph Embedding and Graph Edit Distance | Type | Conference Article | ||
Year | 2014 | Publication | 22nd International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 3074 - 3079 | ||
Keywords | word spotting; coarse-to-fine mechamism; graphbased representation; graph embedding; graph edit distance | ||||
Abstract | Effective information retrieval on handwritten document images has always been a challenging task, especially historical ones. In the paper, we propose a coarse-to-fine handwritten word spotting approach based on graph representation. The presented model comprises both the topological and morphological signatures of the handwriting. Skeleton-based graphs with the Shape Context labelled vertexes are established for connected components. Each word image is represented as a sequence of graphs. Aiming at developing a practical and efficient word spotting approach for large-scale historical handwritten documents, a fast and coarse comparison is first applied to prune the regions that are not similar to the query based on the graph embedding methodology. Afterwards, the query and regions of interest are compared by graph edit distance based on the Dynamic Time Warping alignment. The proposed approach is evaluated on a public dataset containing 50 pages of historical marriage license records. The results show that the proposed approach achieves a compromise between efficiency and accuracy. | ||||
Address | Stockholm; Sweden; August 2014 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1051-4651 | ISBN | Medium | ||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; 600.061; 602.006; 600.077 | Approved | no | ||
Call Number | Admin @ si @ WEG2014a | Serial | 2515 | ||
Permanent link to this record | |||||
Author | David Fernandez; Jon Almazan; Nuria Cirera; Alicia Fornes; Josep Llados | ||||
Title | BH2M: the Barcelona Historical Handwritten Marriages database | Type | Conference Article | ||
Year | 2014 | Publication | 22nd International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 256 - 261 | ||
Keywords | |||||
Abstract | This paper presents an image database of historical handwritten marriages records stored in the archives of Barcelona cathedral, and the corresponding meta-data addressed to evaluate the performance of document analysis algorithms. The contribution of this paper is twofold. First, it presents a complete ground truth which covers the whole pipeline of handwriting
recognition research, from layout analysis to recognition and understanding. Second, it is the first dataset in the emerging area of genealogical document analysis, where documents are manuscripts pseudo-structured with specific lexicons and the interest is beyond pure transcriptions but context dependent. |
||||
Address | Creete Island; Grecia; September 2014 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1051-4651 | ISBN | Medium | ||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; 600.056; 600.061; 602.006; 600.077 | Approved | no | ||
Call Number | Admin @ si @ FAC2014 | Serial | 2461 | ||
Permanent link to this record | |||||
Author | Hongxing Gao; Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados | ||||
Title | Embedding Document Structure to Bag-of-Words through Pair-wise Stable Key-regions | Type | Conference Article | ||
Year | 2014 | Publication | 22nd International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 2903 - 2908 | ||
Keywords | |||||
Abstract | Since the document structure carries valuable discriminative information, plenty of efforts have been made for extracting and understanding document structure among which layout analysis approaches are the most commonly used. In this paper, Distance Transform based MSER (DTMSER) is employed to efficiently extract the document structure as a dendrogram of key-regions which roughly correspond to structural elements such as characters, words and paragraphs. Inspired by the Bag
of Words (BoW) framework, we propose an efficient method for structural document matching by representing the document image as a histogram of key-region pairs encoding structural relationships. Applied to the scenario of document image retrieval, experimental results demonstrate a remarkable improvement when comparing the proposed method with typical BoW and pyramidal BoW methods. |
||||
Address | Stockholm; Sweden; August 2014 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; 600.056; 600.061; 600.077 | Approved | no | ||
Call Number | Admin @ si @ GRK2014b | Serial | 2497 | ||
Permanent link to this record | |||||
Author | Fahad Shahbaz Khan; Joost Van de Weijer; Andrew Bagdanov; Michael Felsberg | ||||
Title | Scale Coding Bag-of-Words for Action Recognition | Type | Conference Article | ||
Year | 2014 | Publication | 22nd International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1514-1519 | ||
Keywords | |||||
Abstract | Recognizing human actions in still images is a challenging problem in computer vision due to significant amount of scale, illumination and pose variation. Given the bounding box of a person both at training and test time, the task is to classify the action associated with each bounding box in an image.
Most state-of-the-art methods use the bag-of-words paradigm for action recognition. The bag-of-words framework employing a dense multi-scale grid sampling strategy is the de facto standard for feature detection. This results in a scale invariant image representation where all the features at multiple-scales are binned in a single histogram. We argue that such a scale invariant strategy is sub-optimal since it ignores the multi-scale information available with each bounding box of a person. This paper investigates alternative approaches to scale coding for action recognition in still images. We encode multi-scale information explicitly in three different histograms for small, medium and large scale visual-words. Our first approach exploits multi-scale information with respect to the image size. In our second approach, we encode multi-scale information relative to the size of the bounding box of a person instance. In each approach, the multi-scale histograms are then concatenated into a single representation for action classification. We validate our approaches on the Willow dataset which contains seven action categories: interacting with computer, photography, playing music, riding bike, riding horse, running and walking. Our results clearly suggest that the proposed scale coding approaches outperform the conventional scale invariant technique. Moreover, we show that our approach obtains promising results compared to more complex state-of-the-art methods. |
||||
Address | Stockholm; August 2014 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | CIC; LAMP; 601.240; 600.074; 600.079 | Approved | no | ||
Call Number | Admin @ si @ KWB2014 | Serial | 2450 | ||
Permanent link to this record | |||||
Author | Mohammad Ali Bagheri; Gang Hu; Qigang Gao; Sergio Escalera | ||||
Title | A Framework of Multi-Classifier Fusion for Human Action Recognition | Type | Conference Article | ||
Year | 2014 | Publication | 22nd International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1260 - 1265 | ||
Keywords | |||||
Abstract | The performance of different action-recognition methods using skeleton joint locations have been recently studied by several computer vision researchers. However, the potential improvement in classification through classifier fusion by ensemble-based methods has remained unattended. In this work, we evaluate the performance of an ensemble of five action learning techniques, each performing the recognition task from a different perspective. The underlying rationale of the fusion approach is that different learners employ varying structures of input descriptors/features to be trained. These varying structures cannot be attached and used by a single learner. In addition, combining the outputs of several learners can reduce the risk of an unfortunate selection of a poorly performing learner. This leads to having a more robust and general-applicable framework. Also, we propose two simple, yet effective, action description techniques. In order to improve the recognition performance, a powerful combination strategy is utilized based on the Dempster-Shafer theory, which can effectively make use of diversity of base learners trained on different sources of information. The recognition results of the individual classifiers are compared with those obtained from fusing the classifiers' output, showing advanced performance of the proposed methodology. | ||||
Address | Stockholm; Sweden; August 2014 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1051-4651 | ISBN | Medium | ||
Area | Expedition | Conference | ICPR | ||
Notes | HuPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ BHG2014 | Serial | 2446 | ||
Permanent link to this record | |||||
Author | Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera | ||||
Title | Generic Subclass Ensemble: A Novel Approach to Ensemble Classification | Type | Conference Article | ||
Year | 2014 | Publication | 22nd International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1254 - 1259 | ||
Keywords | |||||
Abstract | Multiple classifier systems, also known as classifier ensembles, have received great attention in recent years because of their improved classification accuracy in different applications. In this paper, we propose a new general approach to ensemble classification, named generic subclass ensemble, in which each base classifier is trained with data belonging to a subset of classes, and thus discriminates among a subset of target categories. The ensemble classifiers are then fused using a combination rule. The proposed approach differs from existing methods that manipulate the target attribute, since in our approach individual classification problems are not restricted to two-class problems. We perform a series of experiments to evaluate the efficiency of the generic subclass approach on a set of benchmark datasets. Experimental results with multilayer perceptrons show that the proposed approach presents a viable alternative to the most commonly used ensemble classification approaches. | ||||
Address | Stockholm; August 2014 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1051-4651 | ISBN | Medium | ||
Area | Expedition | Conference | ICPR | ||
Notes | HuPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ BGE2014b | Serial | 2445 | ||
Permanent link to this record | |||||
Author | Fadi Dornaika; Bogdan Raducanu | ||||
Title | Person-specific face shape estimation under varying head pose from single snapshots | Type | Conference Article | ||
Year | 2010 | Publication | 20th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 3496–3499 | ||
Keywords | |||||
Abstract | This paper presents a new method for person-specific face shape estimation under varying head pose of a previously unseen person from a single image. We describe a featureless approach based on a deformable 3D model and a learned face subspace. The proposed approach is based on maximizing a likelihood measure associated with a learned face subspace, which is carried out by a stochastic and genetic optimizer. We conducted the experiments on a subset of Honda Video Database showing the feasibility and robustness of the proposed approach. For this reason, our approach could lend itself nicely to complex frameworks involving 3D face tracking and face gesture recognition in monocular videos. | ||||
Address | Istanbul, Turkey | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1051-4651 | ISBN | 978-1-4244-7542-1 | Medium | |
Area | Expedition | Conference | ICPR | ||
Notes | OR;MV | Approved | no | ||
Call Number | BCNPCL @ bcnpcl @ DoR2010b | Serial | 1361 | ||
Permanent link to this record |