|   | 
Details
   web
Records
Author Pau Riba; Josep Llados; Alicia Fornes
Title Hierarchical graphs for coarse-to-fine error tolerant matching Type Journal Article
Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 134 Issue Pages 116-124
Keywords Hierarchical graph representation; Coarse-to-fine graph matching; Graph-based retrieval
Abstract During the last years, graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their ability to capture both structural and appearance-based information. Thus, they provide a greater representational power than classical statistical frameworks. However, graph-based representations leads to high computational complexities usually dealt by graph embeddings or approximated matching techniques. Despite their representational power, they are very sensitive to noise and small variations of the input image. With the aim to cope with the time complexity and the variability present in the generated graphs, in this paper we propose to construct a novel hierarchical graph representation. Graph clustering techniques adapted from social media analysis have been used in order to contract a graph at different abstraction levels while keeping information about the topology. Abstract nodes attributes summarise information about the contracted graph partition. For the proposed representations, a coarse-to-fine matching technique is defined. Hence, small graphs are used as a filtering before more accurate matching methods are applied. This approach has been validated in real scenarios such as classification of colour images or retrieval of handwritten words (i.e. word spotting).
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.097; 601.302; 603.057; 600.140; 600.121 Approved no
Call Number (down) Admin @ si @ RLF2020 Serial 3349
Permanent link to this record
 

 
Author Pau Riba; Josep Llados; Alicia Fornes
Title Error-tolerant coarse-to-fine matching model for hierarchical graphs Type Conference Article
Year 2017 Publication 11th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition Abbreviated Journal
Volume 10310 Issue Pages 107-117
Keywords Graph matching; Hierarchical graph; Graph-based representation; Coarse-to-fine matching
Abstract Graph-based representations are effective tools to capture structural information from visual elements. However, retrieving a query graph from a large database of graphs implies a high computational complexity. Moreover, these representations are very sensitive to noise or small changes. In this work, a novel hierarchical graph representation is designed. Using graph clustering techniques adapted from graph-based social media analysis, we propose to generate a hierarchy able to deal with different levels of abstraction while keeping information about the topology. For the proposed representations, a coarse-to-fine matching method is defined. These approaches are validated using real scenarios such as classification of colour images and handwritten word spotting.
Address Anacapri; Italy; May 2017
Corporate Author Thesis
Publisher Springer International Publishing Place of Publication Editor Pasquale Foggia; Cheng-Lin Liu; Mario Vento
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference GbRPR
Notes DAG; 600.097; 601.302; 600.121 Approved no
Call Number (down) Admin @ si @ RLF2017a Serial 2951
Permanent link to this record
 

 
Author Pau Riba; Josep Llados; Alicia Fornes
Title Handwritten Word Spotting by Inexact Matching of Grapheme Graphs Type Conference Article
Year 2015 Publication 13th International Conference on Document Analysis and Recognition ICDAR2015 Abbreviated Journal
Volume Issue Pages 781 - 785
Keywords
Abstract This paper presents a graph-based word spotting for handwritten documents. Contrary to most word spotting techniques, which use statistical representations, we propose a structural representation suitable to be robust to the inherent deformations of handwriting. Attributed graphs are constructed using a part-based approach. Graphemes extracted from shape convexities are used as stable units of handwriting, and are associated to graph nodes. Then, spatial relations between them determine graph edges. Spotting is defined in terms of an error-tolerant graph matching using bipartite-graph matching algorithm. To make the method usable in large datasets, a graph indexing approach that makes use of binary embeddings is used as preprocessing. Historical documents are used as experimental framework. The approach is comparable to statistical ones in terms of time and memory requirements, especially when dealing with large document collections.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.077; 600.061; 602.006 Approved no
Call Number (down) Admin @ si @ RLF2015b Serial 2642
Permanent link to this record
 

 
Author Pau Riba; Josep Llados; Alicia Fornes; Anjan Dutta
Title Large-scale Graph Indexing using Binary Embeddings of Node Contexts Type Conference Article
Year 2015 Publication 10th IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition Abbreviated Journal
Volume 9069 Issue Pages 208-217
Keywords Graph matching; Graph indexing; Application in document analysis; Word spotting; Binary embedding
Abstract Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations in terms of feature vectors. Retrieving a query graph from a large dataset of graphs has the drawback of the high computational complexity required to compare the query and the target graphs. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. In this paper we propose a fast indexation formalism for graph retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Hence, each attribute counts the length of a walk of order k originated in a vertex with label l. Each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in a handwritten word spotting scenario in images of historical documents.
Address Beijing; China; May 2015
Corporate Author Thesis
Publisher Springer International Publishing Place of Publication Editor C.-L.Liu; B.Luo; W.G.Kropatsch; J.Cheng
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-319-18223-0 Medium
Area Expedition Conference GbRPR
Notes DAG; 600.061; 602.006; 600.077 Approved no
Call Number (down) Admin @ si @ RLF2015a Serial 2618
Permanent link to this record
 

 
Author Arnau Ramisa; Ramon Lopez de Mantaras; Ricardo Toledo
Title Comparing Combinations of Feature Regions for Panoramic VSLAM Type Conference Article
Year 2007 Publication 4th International Conference on Informatics in Control, Automation and Robotics Abbreviated Journal
Volume Issue Pages 292–297
Keywords
Abstract
Address Angers (France)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICINCO
Notes RV;ADAS Approved no
Call Number (down) Admin @ si @ RLA2007 Serial 900
Permanent link to this record
 

 
Author Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Matthieu Molinier; Jorma Laaksonen
Title Binary patterns encoded convolutional neural networks for texture recognition and remote sensing scene classification Type Journal Article
Year 2018 Publication ISPRS Journal of Photogrammetry and Remote Sensing Abbreviated Journal ISPRS J
Volume 138 Issue Pages 74-85
Keywords Remote sensing; Deep learning; Scene classification; Local Binary Patterns; Texture analysis
Abstract Designing discriminative powerful texture features robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distribution of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Local Binary Patterns (LBP) encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit LBP based texture information provide complementary information to the standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories and the recently introduced large scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to standard RGB deep model of the same network architecture. Our late fusion TEX-Net architecture always improves the overall performance compared to the standard RGB network on both recognition problems. Furthermore, our final combination leads to consistent improvement over the state-of-the-art for remote sensing scene
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.109; 600.106; 600.120 Approved no
Call Number (down) Admin @ si @ RKW2018 Serial 3158
Permanent link to this record
 

 
Author Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Jorma Laaksonen
Title Top-Down Deep Appearance Attention for Action Recognition Type Conference Article
Year 2017 Publication 20th Scandinavian Conference on Image Analysis Abbreviated Journal
Volume 10269 Issue Pages 297-309
Keywords Action recognition; CNNs; Feature fusion
Abstract Recognizing human actions in videos is a challenging problem in computer vision. Recently, convolutional neural network based deep features have shown promising results for action recognition. In this paper, we investigate the problem of fusing deep appearance and motion cues for action recognition. We propose a video representation which combines deep appearance and motion based local convolutional features within the bag-of-deep-features framework. Firstly, dense deep appearance and motion based local convolutional features are extracted from spatial (RGB) and temporal (flow) networks, respectively. Both visual cues are processed in parallel by constructing separate visual vocabularies for appearance and motion. A category-specific appearance map is then learned to modulate the weights of the deep motion features. The proposed representation is discriminative and binds the deep local convolutional features to their spatial locations. Experiments are performed on two challenging datasets: JHMDB dataset with 21 action classes and ACT dataset with 43 categories. The results clearly demonstrate that our approach outperforms both standard approaches of early and late feature fusion. Further, our approach is only employing action labels and without exploiting body part information, but achieves competitive performance compared to the state-of-the-art deep features based approaches.
Address Tromso; June 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference SCIA
Notes LAMP; 600.109; 600.068; 600.120 Approved no
Call Number (down) Admin @ si @ RKW2017b Serial 3039
Permanent link to this record
 

 
Author Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Jorma Laaksonen
Title Tex-Nets: Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition Type Conference Article
Year 2017 Publication 19th International Conference on Multimodal Interaction Abbreviated Journal
Volume Issue Pages
Keywords Convolutional Neural Networks; Texture Recognition; Local Binary Paterns
Abstract Recognizing materials and textures in realistic imaging conditions is a challenging computer vision problem. For many years, local features based orderless representations were a dominant approach for texture recognition. Recently deep local features, extracted from the intermediate layers of a Convolutional Neural Network (CNN), are used as filter banks. These dense local descriptors from a deep model, when encoded with Fisher Vectors, have shown to provide excellent results for texture recognition. The CNN models, employed in such approaches, take RGB patches as input and train on a large amount of labeled images. We show that CNN models, which we call TEX-Nets, trained using mapped coded images with explicit texture information provide complementary information to the standard deep models trained on RGB patches. We further investigate two deep architectures, namely early and late fusion, to combine the texture and color information. Experiments on benchmark texture datasets clearly demonstrate that TEX-Nets provide complementary information to standard RGB deep network. Our approach provides a large gain of 4.8%, 3.5%, 2.6% and 4.1% respectively in accuracy on the DTD, KTH-TIPS-2a, KTH-TIPS-2b and Texture-10 datasets, compared to the standard RGB network of the same architecture. Further, our final combination leads to consistent improvements over the state-of-the-art on all four datasets.
Address Glasgow; Scothland; November 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ACM
Notes LAMP; 600.109; 600.068; 600.120 Approved no
Call Number (down) Admin @ si @ RKW2017 Serial 3038
Permanent link to this record
 

 
Author Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Jorma Laaksonen
Title Combining Holistic and Part-based Deep Representations for Computational Painting Categorization Type Conference Article
Year 2016 Publication 6th International Conference on Multimedia Retrieval Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Automatic analysis of visual art, such as paintings, is a challenging inter-disciplinary research problem. Conventional approaches only rely on global scene characteristics by encoding holistic information for computational painting categorization.We argue that such approaches are sub-optimal and that discriminative common visual structures provide complementary information for painting classification. We present an approach that encodes both the global scene layout and discriminative latent common structures for computational painting categorization. The region of interests are automatically extracted, without any manual part labeling, by training class-specific deformable part-based models. Both holistic and region-of-interests are then described using multi-scale dense convolutional features. These features are pooled separately using Fisher vector encoding and concatenated afterwards in a single image representation. Experiments are performed on a challenging dataset with 91 different painters and 13 diverse painting styles. Our approach outperforms the standard method, which only employs the global scene characteristics. Furthermore, our method achieves state-of-the-art results outperforming a recent multi-scale deep features based approach [11] by 6.4% and 3.8% respectively on artist and style classification.
Address New York; USA; June 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICMR
Notes LAMP; 600.068; 600.079;ADAS Approved no
Call Number (down) Admin @ si @ RKW2016 Serial 2763
Permanent link to this record
 

 
Author Christophe Rigaud; Dimosthenis Karatzas; Joost Van de Weijer; Jean-Christophe Burie; Jean-Marc Ogier
Title Automatic text localisation in scanned comic books Type Conference Article
Year 2013 Publication Proceedings of the International Conference on Computer Vision Theory and Applications Abbreviated Journal
Volume Issue Pages 814-819
Keywords Text localization; comics; text/graphic separation; complex background; unstructured document
Abstract Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent document understanding enable direct content-based search as opposed to metadata only search (e.g. album title or author name). Few studies have been done in this direction. In this work we detail a novel approach for the automatic text localization in scanned comics book pages, an essential step towards a fully automatic comics book understanding. We focus on speech text as it is semantically important and represents the majority of the text present in comics. The approach is compared with existing methods of text localization found in the literature and results are presented.
Address Barcelona; February 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference VISAPP
Notes DAG; CIC; 600.056 Approved no
Call Number (down) Admin @ si @ RKW2013b Serial 2261
Permanent link to this record
 

 
Author Christophe Rigaud; Dimosthenis Karatzas; Joost Van de Weijer; Jean-Christophe Burie; Jean-Marc Ogier
Title An active contour model for speech balloon detection in comics Type Conference Article
Year 2013 Publication 12th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages 1240-1244
Keywords
Abstract Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent comic book understanding would enable a variety of new applications, including content-based retrieval and content retargeting. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. Few studies have been done in this direction. In this work we detail a novel approach for closed and non-closed speech balloon localization in scanned comic book pages, an essential step towards a fully automatic comic book understanding. The approach is compared with existing methods for closed balloon localization found in the literature and results are presented.
Address washington; USA; August 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1520-5363 ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; CIC; 600.056 Approved no
Call Number (down) Admin @ si @ RKW2013a Serial 2260
Permanent link to this record
 

 
Author Huamin Ren; Nattiya Kanhabua; Andreas Mogelmose; Weifeng Liu; Kaustubh Kulkarni; Sergio Escalera; Xavier Baro; Thomas B. Moeslund
Title Back-dropout Transfer Learning for Action Recognition Type Journal Article
Year 2018 Publication IET Computer Vision Abbreviated Journal IETCV
Volume 12 Issue 4 Pages 484-491
Keywords Learning (artificial intelligence); Pattern Recognition
Abstract Transfer learning aims at adapting a model learned from source dataset to target dataset. It is a beneficial approach especially when annotating on the target dataset is expensive or infeasible. Transfer learning has demonstrated its powerful learning capabilities in various vision tasks. Despite transfer learning being a promising approach, it is still an open question how to adapt the model learned from the source dataset to the target dataset. One big challenge is to prevent the impact of category bias on classification performance. Dataset bias exists when two images from the same category, but from different datasets, are not classified as the same. To address this problem, a transfer learning algorithm has been proposed, called negative back-dropout transfer learning (NB-TL), which utilizes images that have been misclassified and further performs back-dropout strategy on them to penalize errors. Experimental results demonstrate the effectiveness of the proposed algorithm. In particular, the authors evaluate the performance of the proposed NB-TL algorithm on UCF 101 action recognition dataset, achieving 88.9% recognition rate.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number (down) Admin @ si @ RKM2018 Serial 3071
Permanent link to this record
 

 
Author Juanjo Rubio; Takahiro Kashiwa; Teera Laiteerapong; Wenlong Deng; Kohei Nagai; Sergio Escalera; Kotaro Nakayama; Yutaka Matsuo; Helmut Prendinger
Title Multi-class structural damage segmentation using fully convolutional networks Type Journal Article
Year 2019 Publication Computers in Industry Abbreviated Journal COMPUTIND
Volume 112 Issue Pages 103121
Keywords Bridge damage detection; Deep learning; Semantic segmentation
Abstract Structural Health Monitoring (SHM) has benefited from computer vision and more recently, Deep Learning approaches, to accurately estimate the state of deterioration of infrastructure. In our work, we test Fully Convolutional Networks (FCNs) with a dataset of deck areas of bridges for damage segmentation. We create a dataset for delamination and rebar exposure that has been collected from inspection records of bridges in Niigata Prefecture, Japan. The dataset consists of 734 images with three labels per image, which makes it the largest dataset of images of bridge deck damage. This data allows us to estimate the performance of our method based on regions of agreement, which emulates the uncertainty of in-field inspections. We demonstrate the practicality of FCNs to perform automated semantic segmentation of surface damages. Our model achieves a mean accuracy of 89.7% for delamination and 78.4% for rebar exposure, and a weighted F1 score of 81.9%.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number (down) Admin @ si @ RKL2019 Serial 3315
Permanent link to this record
 

 
Author Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados
Title Spotting Graphical Symbols in Camera-Acquired Documents in Real Time Type Book Chapter
Year 2014 Publication Graphics Recognition. Current Trends and Challenges Abbreviated Journal
Volume 8746 Issue Pages 3-10
Keywords
Abstract In this paper we present a system devoted to spot graphical symbols in camera-acquired document images. The system is based on the extraction and further matching of ORB compact local features computed over interest key-points. Then, the FLANN indexing framework based on approximate nearest neighbor search allows to efficiently match local descriptors between the captured scene and the graphical models. Finally, the RANSAC algorithm is used in order to compute the homography between the spotted symbol and its appearance in the document image. The proposed approach is efficient and is able to work in real time.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor Bart Lamiroy; Jean-Marc Ogier
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-662-44853-3 Medium
Area Expedition Conference
Notes DAG; 600.045; 600.055; 600.061; 600.077 Approved no
Call Number (down) Admin @ si @ RKL2014 Serial 2700
Permanent link to this record
 

 
Author Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados
Title Spotting Graphical Symbols in Camera-Acquired Documents in Real Time Type Conference Article
Year 2013 Publication 10th IAPR International Workshop on Graphics Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract In this paper we present a system devoted to spot graphical symbols in camera-acquired document images. The system is based on the extraction and further matching of ORB compact local features computed over interest key-points. Then, the FLANN indexing framework based on approximate nearest neighbor search allows to efficiently match local descriptors between the captured scene and the graphical models. Finally, the RANSAC algorithm is used in order to compute the homography between the spotted symbol and its appearance in the document image. The proposed approach is efficient and is able to work in real time.
Address Bethlehem; PA; USA; August 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference GREC
Notes DAG; 600.045; 600.055; 600.061; 602.101 Approved no
Call Number (down) Admin @ si @ RKL2013 Serial 2347
Permanent link to this record