Records
Author Marçal Rusiñol; Josep Llados
Title A Performance Evaluation Protocol for Symbol Spotting Systems in Terms of Recognition and Location Indices Type Journal Article
Year 2009 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR
Volume 12 Issue 2 Pages 83-96
Keywords Performance evaluation; Symbol Spotting; Graphics Recognition
Abstract (down) Symbol spotting systems are intended to retrieve regions of interest from a document image database where the queried symbol is likely to be found. They shall have the ability to recognize and locate graphical symbols in a single step. In this paper, we present a set of measures to evaluate the performance of a symbol spotting system in terms of recognition abilities, location accuracy and scalability. We show that the proposed measures make it possible to determine the weaknesses and strengths of different methods. In particular, we have tested a symbol spotting method based on a set of four different off-the-shelf shape descriptors.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1433-2833 ISBN Medium
Area Expedition Conference
Notes DAG Approved no
Call Number DAG @ dag @ RuL2009a Serial 1166
Permanent link to this record
 

 
Author Sergio Escalera; Alicia Fornes; Oriol Pujol; Alberto Escudero; Petia Radeva
Title Circular Blurred Shape Model for Symbol Spotting in Documents Type Conference Article
Year 2009 Publication 16th IEEE International Conference on Image Processing Abbreviated Journal
Volume Issue Pages 1985-1988
Keywords
Abstract (down) The symbol spotting problem requires feature extraction strategies able to generalize from training samples and to localize the target object while discarding most of the image. In the case of document analysis, symbol spotting techniques have to deal with a high variability of symbols' appearance. In this paper, we propose the Circular Blurred Shape Model descriptor. Feature extraction is performed capturing the spatial arrangement of significant object characteristics in a correlogram structure. Shape information from objects is shared among correlogram regions, being tolerant to the irregular deformations. Descriptors are learnt using a cascade of classifiers and AdaBoost as the base classifier. Finally, symbol spotting is performed by means of a windowing strategy using the learnt cascade over plan and old musical score documents. Spotting and multi-class categorization results show better performance compared with the state-of-the-art descriptors.
Address Cairo, Egypt
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4244-5653-6 Medium
Area Expedition Conference ICIP
Notes MILAB;HuPBA;DAG Approved no
Call Number BCNPCL @ bcnpcl @ EFP2009b Serial 1184
Permanent link to this record
 

 
Author Klaus Broelemann; Anjan Dutta; Xiaoyi Jiang; Josep Llados
Title Hierarchical graph representation for symbol spotting in graphical document images Type Conference Article
Year 2012 Publication Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop Abbreviated Journal
Volume 7626 Issue Pages 529-538
Keywords
Abstract (down) Symbol spotting can be defined as locating a given query symbol in a large collection of graphical documents. In this paper we present a hierarchical graph representation for symbols. This representation allows graph matching methods to deal with low-level vectorization errors and, thus, to perform robust symbol spotting. To show the potential of this approach, we conduct an experiment with the SESYD dataset.
Address Miyajima-Itsukushima, Hiroshima
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-34165-6 Medium
Area Expedition Conference SSPR&SPR
Notes DAG Approved no
Call Number Admin @ si @ BDJ2012 Serial 2126
Permanent link to this record
 

 
Author Jose A. Garcia; David Masip; Valerio Sbragaglia; Jacopo Aguzzi
Title Using ORB, BoW and SVM to identificate and track tagged Norway lobster Nephrops Norvegicus (L.) Type Conference Article
Year 2016 Publication 3rd International Conference on Maritime Technology and Engineering Abbreviated Journal
Volume Issue Pages
Keywords
Abstract (down) Sustainable capture policies of many species strongly depend on the understanding of their social behaviour. Nevertheless, the analysis of emergent behaviour in marine species poses several challenges. Usually animals are captured and observed in tanks, and their behaviour is inferred from their dynamics and interactions. Therefore, researchers must deal with thousands of hours of video data. Without loss of generality, this paper proposes a computer vision approach to identify and track a specific species, the Norway lobster, Nephrops norvegicus. We propose an identification scheme where animals are marked using black and white tags with a geometric shape in the center (holed triangle, filled triangle, holed circle and filled circle). Using a massive labelled dataset, we extract local features based on the ORB descriptor. These features are a posteriori clustered, and we construct a Bag of Visual Words feature vector per animal. This approximation yields invariance to rotation and translation. An SVM classifier achieves generalization results above 99%. In a second contribution, we will make the code and training data publicly available.
Address Lisboa; Portugal; July 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MARTECH
Notes OR;MV; Approved no
Call Number Admin @ si @ GMS2016b Serial 2817
Permanent link to this record
 

 
Author Diego Velazquez; Pau Rodriguez; Josep M. Gonfaus; Xavier Roca; Jordi Gonzalez
Title A Closer Look at Embedding Propagation for Manifold Smoothing Type Journal Article
Year 2022 Publication Journal of Machine Learning Research Abbreviated Journal JMLR
Volume 23 Issue 252 Pages 1-27
Keywords Regularization; semi-supervised learning; self-supervised learning; adversarial robustness; few-shot classification
Abstract (down) Supervised training of neural networks requires a large amount of manually annotated data and the resulting networks tend to be sensitive to out-of-distribution (OOD) data. Self- and semi-supervised training schemes reduce the amount of annotated data required during the training process. However, OOD generalization remains a major challenge for most methods. Strategies that promote smoother decision boundaries play an important role in out-of-distribution generalization. For example, embedding propagation (EP) for manifold smoothing has recently been shown to considerably improve the OOD performance for few-shot classification. EP achieves smoother class manifolds by building a graph from sample embeddings and propagating information through the nodes in an unsupervised manner. In this work, we extend the original EP paper providing additional evidence and experiments showing that it attains smoother class embedding manifolds and improves results in settings beyond few-shot classification. Concretely, we show that EP improves the robustness of neural networks against multiple adversarial attacks as well as semi- and self-supervised learning performance.
Address 9/2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes Approved no
Call Number Admin @ si @ VRG2022 Serial 3762
Permanent link to this record
 

 
Author Antonio Lopez; Jiaolong Xu; Jose Luis Gomez; David Vazquez; German Ros
Title From Virtual to Real World Visual Perception using Domain Adaptation -- The DPM as Example Type Book Chapter
Year 2017 Publication Domain Adaptation in Computer Vision Applications Abbreviated Journal
Volume Issue 13 Pages 243-258
Keywords Domain Adaptation
Abstract (down) Supervised learning tends to produce more accurate classifiers than unsupervised learning in general. This implies that training data is preferred with annotations. When addressing visual perception challenges, such as localizing certain object classes within an image, the learning of the involved classifiers turns out to be a practical bottleneck. The reason is that, at least, we have to frame object examples with bounding boxes in thousands of images. A priori, the more complex the model is regarding its number of parameters, the more annotated examples are required. This annotation task is performed by human oracles, which ends up in inaccuracies and errors in the annotations (aka ground truth) since the task is inherently very cumbersome and sometimes ambiguous. As an alternative we have pioneered the use of virtual worlds for collecting such annotations automatically and with high precision. However, since the models learned with virtual data must operate in the real world, we still need to perform domain adaptation (DA). In this chapter we revisit the DA of a deformable part-based model (DPM) as an exemplifying case of virtual-to-real-world DA. As a use case, we address the challenge of vehicle detection for driver assistance, using different publicly available virtual-world data. While doing so, we investigate questions such as: how does the domain gap behave due to virtual-vs-real data with respect to dominant object appearance per domain, as well as the role of photo-realism in the virtual world.
Address
Corporate Author Thesis
Publisher Springer Place of Publication Editor Gabriela Csurka
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.085; 601.223; 600.076; 600.118 Approved no
Call Number ADAS @ adas @ LXG2017 Serial 2872
Permanent link to this record
 

 
Author Parichehr Behjati Ardakani; Pau Rodriguez; Armin Mehri; Isabelle Hupont; Carles Fernandez; Jordi Gonzalez
Title OverNet: Lightweight Multi-Scale Super-Resolution with Overscaling Network Type Conference Article
Year 2021 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 2693-2702
Keywords
Abstract (down) Super-resolution (SR) has achieved great success due to the development of deep convolutional neural networks (CNNs). However, as the depth and width of the networks increase, CNN-based SR methods have been faced with the challenge of computational complexity in practice. Moreover, most SR methods train a dedicated model for each target resolution, losing generality and increasing memory requirements. To address these limitations we introduce OverNet, a deep but lightweight convolutional network to solve SISR at arbitrary scale factors with a single model. We make the following contributions: first, we introduce a lightweight feature extractor that enforces efficient reuse of information through a novel recursive structure of skip and dense connections. Second, to maximize the performance of the feature extractor, we propose a model agnostic reconstruction module that generates accurate high-resolution images from overscaled feature maps obtained from any SR architecture. Third, we introduce a multi-scale loss function to achieve generalization across scales. Experiments show that our proposal outperforms previous state-of-the-art approaches in standard benchmarks, while maintaining relatively low computation and memory requirements.
Address Virtual; January 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WACV
Notes ISE; 600.119; 600.098 Approved no
Call Number Admin @ si @ BRM2021 Serial 3512
Permanent link to this record
 

 
Author Antonio Lopez; Atsushi Imiya; Tomas Pajdla; Jose Manuel Alvarez
Title Computer Vision in Vehicle Technology: Land, Sea & Air Type Book Whole
Year 2017 Publication Abbreviated Journal
Volume Issue Pages 161-163
Keywords
Abstract (down) This chapter examines different vision-based commercial solutions for real-life problems related to vehicles. It is worth mentioning the recent astonishing performance of deep convolutional neural networks (DCNNs) in difficult visual tasks such as image classification, object recognition/localization/detection, and semantic segmentation. In fact, different DCNN architectures are already being explored for low-level tasks such as optical flow and disparity computation, and higher level ones such as place recognition.
Address
Corporate Author Thesis
Publisher John Wiley & Sons, Ltd Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-118-86807-2 Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ LIP2017a Serial 2937
Permanent link to this record
 

 
Author Katerine Diaz; Francesc J. Ferri; W. Diaz
Title Incremental Generalized Discriminative Common Vectors for Image Classification Type Journal Article
Year 2015 Publication IEEE Transactions on Neural Networks and Learning Systems Abbreviated Journal TNNLS
Volume 26 Issue 8 Pages 1761-1775
Keywords
Abstract (down) Subspace-based methods have become popular due to their ability to appropriately represent complex data in such a way that both dimensionality is reduced and discriminativeness is enhanced. Several recent works have concentrated on the discriminative common vector (DCV) method and other closely related algorithms also based on the concept of null space. In this paper, we present a generalized incremental formulation of the DCV methods, which allows the update of a given model by considering the addition of new examples even from unseen classes. Having efficient incremental formulations of well-behaved batch algorithms allows us to conveniently adapt previously trained classifiers without the need of recomputing them from scratch. The proposed generalized incremental method has been empirically validated in different case studies from different application domains (faces, objects, and handwritten digits) considering several different scenarios in which new data are continuously added at different rates starting from an initial model.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 2162-237X ISBN Medium
Area Expedition Conference
Notes ADAS; 600.076 Approved no
Call Number Admin @ si @ DFD2015 Serial 2547
Permanent link to this record
 

 
Author Wenwen Yu; Chengquan Zhang; Haoyu Cao; Wei Hua; Bohan Li; Huang Chen; Mingyu Liu; Mingrui Chen; Jianfeng Kuang; Mengjun Cheng; Yuning Du; Shikun Feng; Xiaoguang Hu; Pengyuan Lyu; Kun Yao; Yuechen Yu; Yuliang Liu; Wanxiang Che; Errui Ding; Cheng-Lin Liu; Jiebo Luo; Shuicheng Yan; Min Zhang; Dimosthenis Karatzas; Xing Sun; Jingdong Wang; Xiang Bai
Title ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images Type Conference Article
Year 2023 Publication 17th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 14188 Issue Pages 536–552
Keywords
Abstract (down) Structured text extraction is one of the most valuable and challenging application directions in the field of Document AI. However, the scenarios of past benchmarks are limited, and the corresponding evaluation protocols usually focus on the submodules of the structured text extraction scheme. In order to eliminate these problems, we organized the ICDAR 2023 competition on Structured text extraction from Visually-Rich Document images (SVRD). We set up two tracks for SVRD including Track 1: HUST-CELL and Track 2: Baidu-FEST, where HUST-CELL aims to evaluate the end-to-end performance of Complex Entity Linking and Labeling, and Baidu-FEST focuses on evaluating the performance and generalization of Zero-shot/Few-shot Structured Text extraction from an end-to-end perspective. Compared to the current document benchmarks, the two tracks of our competition benchmark greatly enrich the scenarios and contain more than 50 types of visually-rich document images (mainly from actual enterprise applications). The competition opened on 30th December, 2022 and closed on 24th March, 2023. There were 35 participants and 91 valid submissions for Track 1, and 15 participants and 26 valid submissions for Track 2. In this report we present the motivation, competition datasets, task definition, evaluation protocol, and submission summaries. According to the performance of the submissions, we believe there is still a large gap in the expected information extraction performance for complex and zero-shot scenarios. It is hoped that this competition will attract many researchers in the field of CV and NLP, and bring some new thoughts to the field of Document AI.
Address San Jose; CA; USA; August 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number Admin @ si @ YZC2023 Serial 3896
Permanent link to this record
 

 
Author Jaume Gibert; Ernest Valveny; Horst Bunke
Title Graph of Words Embedding for Molecular Structure-Activity Relationship Analysis Type Conference Article
Year 2010 Publication 15th Iberoamerican Congress on Pattern Recognition Abbreviated Journal
Volume 6419 Issue Pages 30–37
Keywords
Abstract (down) Structure-Activity relationship analysis aims at discovering chemical activity of molecular compounds based on their structure. In this article we make use of a particular graph representation of molecules and propose a new graph embedding procedure to solve the problem of structure-activity relationship analysis. The embedding is essentially an arrangement of a molecule in the form of a vector by considering frequencies of appearing atoms and frequencies of covalent bonds between them. Results on two benchmark databases show the effectiveness of the proposed technique in terms of recognition accuracy while avoiding high operational costs in the transformation.
Address Sao Paulo, Brazil
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-16686-0 Medium
Area Expedition Conference CIARP
Notes DAG Approved no
Call Number DAG @ dag @ GVB2010 Serial 1462
Permanent link to this record
 

 
Author Muhammad Muzzamil Luqman; Jean-Yves Ramel; Josep Llados; Thierry Brouard
Title Fuzzy Multilevel Graph Embedding Type Journal Article
Year 2013 Publication Pattern Recognition Abbreviated Journal PR
Volume 46 Issue 2 Pages 551-565
Keywords Pattern recognition; Graphics recognition; Graph clustering; Graph classification; Explicit graph embedding; Fuzzy logic
Abstract (down) Structural pattern recognition approaches offer the most expressive, convenient, powerful but computationally expensive representations of underlying relational information. To benefit from mature, less expensive and efficient state-of-the-art machine learning models of statistical pattern recognition they must be mapped to a low-dimensional vector space. Our method of explicit graph embedding bridges the gap between structural and statistical pattern recognition. We extract the topological, structural and attribute information from a graph and encode numeric details by fuzzy histograms and symbolic details by crisp histograms. The histograms are concatenated to achieve a simple and straightforward embedding of a graph into a low-dimensional numeric feature vector. Experimentation on standard public graph datasets shows that our method outperforms the state-of-the-art methods of graph embedding for richly attributed graphs.
Address
Corporate Author Thesis
Publisher Elsevier Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0031-3203 ISBN Medium
Area Expedition Conference
Notes DAG; 600.042; 600.045; 605.203 Approved no
Call Number Admin @ si @ LRL2013a Serial 2270
Permanent link to this record
 

 
Author Wenlong Deng; Yongli Mou; Takahiro Kashiwa; Sergio Escalera; Kohei Nagai; Kotaro Nakayama; Yutaka Matsuo; Helmut Prendinger
Title Vision based Pixel-level Bridge Structural Damage Detection Using a Link ASPP Network Type Journal Article
Year 2020 Publication Automation in Construction Abbreviated Journal AC
Volume 110 Issue Pages 102973
Keywords Semantic image segmentation; Deep learning
Abstract (down) Structural Health Monitoring (SHM) has greatly benefited from computer vision. Recently, deep learning approaches are widely used to accurately estimate the state of deterioration of infrastructure. In this work, we focus on the problem of bridge surface structural damage detection, such as delamination and rebar exposure. It is well known that the quality of a deep learning model is highly dependent on the quality of the training dataset. Bridge damage detection, our application domain, has the following main challenges: (i) labeling the damages requires knowledgeable civil engineering professionals, which makes it difficult to collect a large annotated dataset; (ii) the damage area could be very small, whereas the background area is large, which creates an unbalanced training environment; (iii) due to the difficulty to exactly determine the extension of the damage, there is often a variation among different labelers who perform pixel-wise labeling. In this paper, we propose a novel model for bridge structural damage detection to address the first two challenges. This paper follows the idea of an atrous spatial pyramid pooling (ASPP) module that is designed as a novel network for bridge damage detection. Further, we introduce the weight balanced Intersection over Union (IoU) loss function to achieve accurate segmentation on a highly unbalanced small dataset. The experimental results show that (i) the IoU loss function improves the overall performance of damage detection, as compared to cross entropy loss or focal loss, and (ii) the proposed model has a better ability to detect a minority class than other light segmentation networks.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ DMK2020 Serial 3314
Permanent link to this record
 

 
Author Juan Jose Rubio; Takahiro Kashiwa; Teera Laiteerapong; Wenlong Deng; Kohei Nagai; Sergio Escalera; Kotaro Nakayama; Yutaka Matsuo; Helmut Prendinger
Title Multi-class structural damage segmentation using fully convolutional networks Type Journal Article
Year 2019 Publication Computers in Industry Abbreviated Journal COMPUTIND
Volume 112 Issue Pages 103121
Keywords Bridge damage detection; Deep learning; Semantic segmentation
Abstract (down) Structural Health Monitoring (SHM) has benefited from computer vision and more recently, Deep Learning approaches, to accurately estimate the state of deterioration of infrastructure. In our work, we test Fully Convolutional Networks (FCNs) with a dataset of deck areas of bridges for damage segmentation. We create a dataset for delamination and rebar exposure that has been collected from inspection records of bridges in Niigata Prefecture, Japan. The dataset consists of 734 images with three labels per image, which makes it the largest dataset of images of bridge deck damage. This data allows us to estimate the performance of our method based on regions of agreement, which emulates the uncertainty of in-field inspections. We demonstrate the practicality of FCNs to perform automated semantic segmentation of surface damages. Our model achieves a mean accuracy of 89.7% for delamination and 78.4% for rebar exposure, and a weighted F1 score of 81.9%.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ RKL2019 Serial 3315
Permanent link to this record
 

 
Author German Barquero; Sergio Escalera; Cristina Palmero
Title BeLFusion: Latent Diffusion for Behavior-Driven Human Motion Prediction Type Conference Article
Year 2023 Publication IEEE/CVF International Conference on Computer Vision (ICCV) Workshops Abbreviated Journal
Volume Issue Pages 2317-2327
Keywords
Abstract (down) Stochastic human motion prediction (HMP) has generally been tackled with generative adversarial networks and variational autoencoders. Most prior works aim at predicting highly diverse movements in terms of the skeleton joints’ dispersion. This has led to methods predicting fast and motion-divergent movements, which are often unrealistic and incoherent with past motion. Such methods also neglect contexts that need to anticipate diverse low-range behaviors, or actions, with subtle joint displacements. To address these issues, we present BeLFusion, a model that, for the first time, leverages latent diffusion models in HMP to sample from a latent space where behavior is disentangled from pose and motion. As a result, diversity is encouraged from a behavioral perspective. Thanks to our behavior coupler’s ability to transfer sampled behavior to ongoing motion, BeLFusion’s predictions display a variety of behaviors that are significantly more realistic than the state of the art. To support it, we introduce two metrics, the Area of the Cumulative Motion Distribution, and the Average Pairwise Distance Error, which are correlated to our definition of realism according to a qualitative study with 126 participants. Finally, we prove BeLFusion’s generalization power in a new cross-dataset scenario for stochastic HMP.
Address 2-6 October 2023. Paris (France)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes HUPBA; no menciona Approved no
Call Number Admin @ si @ BEP2023 Serial 3829
Permanent link to this record