|   | 
Details
   web
Records
Author Suman Ghosh; Ernest Valveny
Title R-PHOC: Segmentation-Free Word Spotting using CNN Type Conference Article
Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue (down) Pages
Keywords Convolutional neural network; Image segmentation; Artificial neural network; Nearest neighbor search
Abstract arXiv:1707.01294
This paper proposes a region based convolutional neural network for segmentation-free word spotting. Our network takes as input an image and a set of word candidate bound- ing boxes and embeds all bounding boxes into an embedding space, where word spotting can be casted as a simple nearest neighbour search between the query representation and each of the candidate bounding boxes. We make use of PHOC embedding as it has previously achieved significant success in segmentation- based word spotting. Word candidates are generated using a simple procedure based on grouping connected components using some spatial constraints. Experiments show that R-PHOC which operates on images directly can improve the current state-of- the-art in the standard GW dataset and performs as good as PHOCNET in some cases designed for segmentation based word spotting.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ GhV2017a Serial 3079
Permanent link to this record
 

 
Author Suman Ghosh; Ernest Valveny
Title Visual attention models for scene text recognition Type Conference Article
Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue (down) Pages
Keywords
Abstract arXiv:1706.01487
In this paper we propose an approach to lexicon-free recognition of text in scene images. Our approach relies on a LSTM-based soft visual attention model learned from convolutional features. A set of feature vectors are derived from an intermediate convolutional layer corresponding to different areas of the image. This permits encoding of spatial information into the image representation. In this way, the framework is able to learn how to selectively focus on different parts of the image. At every time step the recognizer emits one character using a weighted combination of the convolutional feature vectors according to the learned attention model. Training can be done end-to-end using only word level annotations. In addition, we show that modifying the beam search algorithm by integrating an explicit language model leads to significantly better recognition results. We validate the performance of our approach on standard SVT and ICDAR'03 scene text datasets, showing state-of-the-art performance in unconstrained text recognition.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ GhV2017b Serial 3080
Permanent link to this record
 

 
Author Konstantia Georgouli; Katerine Diaz; Jesus Martinez del Rincon; Anastasios Koidis
Title Building generic, easily-updatable chemometric models with harmonisation and augmentation features: The case of FTIR vegetable oils classification Type Conference Article
Year 2017 Publication 3rd Ιnternational Conference Metrology Promoting Standardization and Harmonization in Food and Nutrition Abbreviated Journal
Volume Issue (down) Pages
Keywords
Abstract
Address Thessaloniki; Greece; October 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IMEKOFOODS
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ GDM2017 Serial 3081
Permanent link to this record
 

 
Author Sounak Dey; Anjan Dutta; Juan Ignacio Toledo; Suman Ghosh; Josep Llados; Umapada Pal
Title SigNet: Convolutional Siamese Network for Writer Independent Offline Signature Verification Type Miscellaneous
Year 2018 Publication Arxiv Abbreviated Journal
Volume Issue (down) Pages
Keywords
Abstract Offline signature verification is one of the most challenging tasks in biometrics and document forensics. Unlike other verification problems, it needs to model minute but critical details between genuine and forged signatures, because a skilled falsification might often resembles the real signature with small deformation. This verification task is even harder in writer independent scenarios which is undeniably fiscal for realistic cases. In this paper, we model an offline writer independent signature verification task with a convolutional Siamese network. Siamese networks are twin networks with shared weights, which can be trained to learn a feature space where similar observations are placed in proximity. This is achieved by exposing the network to a pair of similar and dissimilar observations and minimizing the Euclidean distance between similar pairs while simultaneously maximizing it between dissimilar pairs. Experiments conducted on cross-domain datasets emphasize the capability of our network to model forgery in different languages (scripts) and handwriting styles. Moreover, our designed Siamese network, named SigNet, exceeds the state-of-the-art results on most of the benchmark signature datasets, which paves the way for further research in this direction.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.097; 600.121 Approved no
Call Number Admin @ si @ DDT2018 Serial 3085
Permanent link to this record
 

 
Author Lluis Pere de las Heras; Oriol Ramos Terrades; Josep Llados
Title Ontology-Based Understanding of Architectural Drawings Type Book Chapter
Year 2017 Publication International Workshop on Graphics Recognition. GREC 2015.Graphic Recognition. Current Trends and Challenges Abbreviated Journal
Volume 9657 Issue (down) Pages 75-85
Keywords Graphics recognition; Floor plan analysi; Domain ontology
Abstract In this paper we present a knowledge base of architectural documents aiming at improving existing methods of floor plan classification and understanding. It consists of an ontological definition of the domain and the inclusion of real instances coming from both, automatically interpreted and manually labeled documents. The knowledge base has proven to be an effective tool to structure our knowledge and to easily maintain and upgrade it. Moreover, it is an appropriate means to automatically check the consistency of relational data and a convenient complement of hard-coded knowledge interpretation systems.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ HRL2017 Serial 3086
Permanent link to this record
 

 
Author Dena Bazazian; Dimosthenis Karatzas; Andrew Bagdanov
Title Soft-PHOC Descriptor for End-to-End Word Spotting in Egocentric Scene Images Type Conference Article
Year 2018 Publication International Workshop on Egocentric Perception, Interaction and Computing at ECCV Abbreviated Journal
Volume Issue (down) Pages
Keywords
Abstract Word spotting in natural scene images has many applications in scene understanding and visual assistance. We propose Soft-PHOC, an intermediate representation of images based on character probability maps. Our representation extends the concept of the Pyramidal Histogram Of Characters (PHOC) by exploiting Fully Convolutional Networks to derive a pixel-wise mapping of the character distribution within candidate word regions. We show how to use our descriptors for word spotting tasks in egocentric camera streams through an efficient text line proposal algorithm. This is based on the Hough Transform over character attribute maps followed by scoring using Dynamic Time Warping (DTW). We evaluate our results on ICDAR 2015 Challenge 4 dataset of incidental scene text captured by an egocentric camera.
Address Munich; Alemanya; September 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCVW
Notes DAG; 600.129; 600.121; Approved no
Call Number Admin @ si @ BKB2018b Serial 3174
Permanent link to this record
 

 
Author Jorge Bernal; Aymeric Histace; Marc Masana; Quentin Angermann; Cristina Sanchez Montes; Cristina Rodriguez de Miguel; Maroua Hammami; Ana Garcia Rodriguez; Henry Cordova; Olivier Romain; Gloria Fernandez Esparrach; Xavier Dray; F. Javier Sanchez
Title Polyp Detection Benchmark in Colonoscopy Videos using GTCreator: A Novel Fully Configurable Tool for Easy and Fast Annotation of Image Databases Type Conference Article
Year 2018 Publication 32nd International Congress and Exhibition on Computer Assisted Radiology & Surgery Abbreviated Journal
Volume Issue (down) Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CARS
Notes ISE; MV; 600.119 Approved no
Call Number Admin @ si @ BHM2018 Serial 3089
Permanent link to this record
 

 
Author Katerine Diaz; Francesc J. Ferri; Aura Hernandez-Sabate
Title An overview of incremental feature extraction methods based on linear subspaces Type Journal Article
Year 2018 Publication Knowledge-Based Systems Abbreviated Journal KBS
Volume 145 Issue (down) Pages 219-235
Keywords
Abstract With the massive explosion of machine learning in our day-to-day life, incremental and adaptive learning has become a major topic, crucial to keep up-to-date and improve classification models and their corresponding feature extraction processes. This paper presents a categorized overview of incremental feature extraction based on linear subspace methods which aim at incorporating new information to the already acquired knowledge without accessing previous data. Specifically, this paper focuses on those linear dimensionality reduction methods with orthogonal matrix constraints based on global loss function, due to the extensive use of their batch approaches versus other linear alternatives. Thus, we cover the approaches derived from Principal Components Analysis, Linear Discriminative Analysis and Discriminative Common Vector methods. For each basic method, its incremental approaches are differentiated according to the subspace model and matrix decomposition involved in the updating process. Besides this categorization, several updating strategies are distinguished according to the amount of data used to update and to the fact of considering a static or dynamic number of classes. Moreover, the specific role of the size/dimension ratio in each method is considered. Finally, computational complexity, experimental setup and the accuracy rates according to published results are compiled and analyzed, and an empirical evaluation is done to compare the best approach of each kind.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0950-7051 ISBN Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ DFH2018 Serial 3090
Permanent link to this record
 

 
Author Katerine Diaz; Jesus Martinez del Rincon; Aura Hernandez-Sabate; Debora Gil
Title Continuous head pose estimation using manifold subspace embedding and multivariate regression Type Journal Article
Year 2018 Publication IEEE Access Abbreviated Journal ACCESS
Volume 6 Issue (down) Pages 18325 - 18334
Keywords Head Pose estimation; HOG features; Generalized Discriminative Common Vectors; B-splines; Multiple linear regression
Abstract In this paper, a continuous head pose estimation system is proposed to estimate yaw and pitch head angles from raw facial images. Our approach is based on manifold learningbased methods, due to their promising generalization properties shown for face modelling from images. The method combines histograms of oriented gradients, generalized discriminative common vectors and continuous local regression to achieve successful performance. Our proposal was tested on multiple standard face datasets, as well as in a realistic scenario. Results show a considerable performance improvement and a higher consistence of our model in comparison with other state-of-art methods, with angular errors varying between 9 and 17 degrees.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 2169-3536 ISBN Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ DMH2018b Serial 3091
Permanent link to this record
 

 
Author Albert Berenguel; Oriol Ramos Terrades; Josep Llados; Cristina Cañero
Title Evaluation of Texture Descriptors for Validation of Counterfeit Documents Type Conference Article
Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue (down) Pages 1237-1242
Keywords
Abstract This paper describes an exhaustive comparative analysis and evaluation of different existing texture descriptor algorithms to differentiate between genuine and counterfeit documents. We include in our experiments different categories of algorithms and compare them in different scenarios with several counterfeit datasets, comprising banknotes and identity documents. Computational time in the extraction of each descriptor is important because the final objective is to use it in a real industrial scenario. HoG and CNN based descriptors stands out statistically over the rest in terms of the F1-score/time ratio performance.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 2379-2140 ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.061; 601.269; 600.097; 600.121 Approved no
Call Number Admin @ si @ BRL2017 Serial 3092
Permanent link to this record
 

 
Author Mohamed Ilyes Lakhal; Hakan Cevikalp; Sergio Escalera
Title CRN: End-to-end Convolutional Recurrent Network Structure Applied to Vehicle Classification Type Conference Article
Year 2018 Publication 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications Abbreviated Journal
Volume 5 Issue (down) Pages 137-144
Keywords Vehicle Classification; Deep Learning; End-to-end Learning
Abstract Vehicle type classification is considered to be a central part of Intelligent Traffic Systems. In the recent years, deep learning methods have emerged in as being the state-of-the-art in many computer vision tasks. In this paper, we present a novel yet simple deep learning framework for the vehicle type classification problem. We propose an end-to-end trainable system, that combines convolution neural network for feature extraction and recurrent neural network as a classifier. The recurrent network structure is used to handle various types of feature inputs, and at the same time allows to produce a single or a set of class predictions. In order to assess the effectiveness of our solution, we have conducted a set of experiments in two public datasets, obtaining state of the art results. In addition, we also report results on the newly released MIO-TCD dataset.
Address Funchal; Madeira; Portugal; January 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference VISAPP
Notes HUPBA Approved no
Call Number Admin @ si @ LCE2018a Serial 3094
Permanent link to this record
 

 
Author Hugo Jair Escalante; Heysem Kaya; Albert Ali Salah; Sergio Escalera; Yagmur Gucluturk; Umut Guclu; Xavier Baro; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Stephane Ayache; Evelyne Viegas; Furkan Gurpinar; Achmadnoer Sukma Wicaksana; Cynthia C. S. Liem; Marcel A. J. van Gerven; Rob van Lier
Title Explaining First Impressions: Modeling, Recognizing, and Explaining Apparent Personality from Videos Type Miscellaneous
Year 2018 Publication Arxiv Abbreviated Journal
Volume Issue (down) Pages
Keywords
Abstract Explainability and interpretability are two critical aspects of decision support systems. Within computer vision, they are critical in certain tasks related to human behavior analysis such as in health care applications. Despite their importance, it is only recently that researchers are starting to explore these aspects. This paper provides an introduction to explainability and interpretability in the context of computer vision with an emphasis on looking at people tasks. Specifically, we review and study those mechanisms in the context of first impressions analysis. To the best of our knowledge, this is the first effort in this direction. Additionally, we describe a challenge we organized on explainability in first impressions analysis from video. We analyze in detail the newly introduced data set, the evaluation protocol, and summarize the results of the challenge. Finally, derived from our study, we outline research opportunities that we foresee will be decisive in the near future for the development of the explainable computer vision field.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA Approved no
Call Number Admin @ si @ JKS2018 Serial 3095
Permanent link to this record
 

 
Author Sangheeta Roy; Palaiahnakote Shivakumara; Namita Jain; Vijeta Khare; Anjan Dutta; Umapada Pal; Tong Lu
Title Rough-Fuzzy based Scene Categorization for Text Detection and Recognition in Video Type Journal Article
Year 2018 Publication Pattern Recognition Abbreviated Journal PR
Volume 80 Issue (down) Pages 64-82
Keywords Rough set; Fuzzy set; Video categorization; Scene image classification; Video text detection; Video text recognition
Abstract Scene image or video understanding is a challenging task especially when number of video types increases drastically with high variations in background and foreground. This paper proposes a new method for categorizing scene videos into different classes, namely, Animation, Outlet, Sports, e-Learning, Medical, Weather, Defense, Economics, Animal Planet and Technology, for the performance improvement of text detection and recognition, which is an effective approach for scene image or video understanding. For this purpose, at first, we present a new combination of rough and fuzzy concept to study irregular shapes of edge components in input scene videos, which helps to classify edge components into several groups. Next, the proposed method explores gradient direction information of each pixel in each edge component group to extract stroke based features by dividing each group into several intra and inter planes. We further extract correlation and covariance features to encode semantic features located inside planes or between planes. Features of intra and inter planes of groups are then concatenated to get a feature matrix. Finally, the feature matrix is verified with temporal frames and fed to a neural network for categorization. Experimental results show that the proposed method outperforms the existing state-of-the-art methods, at the same time, the performances of text detection and recognition methods are also improved significantly due to categorization.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.097; 600.121 Approved no
Call Number Admin @ si @ RSJ2018 Serial 3096
Permanent link to this record
 

 
Author ChunYang; Xu Cheng Yin; Hong Yu; Dimosthenis Karatzas; Yu Cao
Title ICDAR2017 Robust Reading Challenge on Text Extraction from Biomedical Literature Figures (DeTEXT) Type Conference Article
Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue (down) Pages 1444-1447
Keywords
Abstract Hundreds of millions of figures are available in the biomedical literature, representing important biomedical experimental evidence. Since text is a rich source of information in figures, automatically extracting such text may assist in the task of mining figure information and understanding biomedical documents. Unlike images in the open domain, biomedical figures present a variety of unique challenges. For example, biomedical figures typically have complex layouts, small font sizes, short text, specific text, complex symbols and irregular text arrangements. This paper presents the final results of the ICDAR 2017 Competition on Text Extraction from Biomedical Literature Figures (ICDAR2017 DeTEXT Competition), which aims at extracting (detecting and recognizing) text from biomedical literature figures. Similar to text extraction from scene images and web pictures, ICDAR2017 DeTEXT Competition includes three major tasks, i.e., text detection, cropped word recognition and end-to-end text recognition. Here, we describe in detail the data set, tasks, evaluation protocols and participants of this competition, and report the performance of the participating methods.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-5386-3586-5 Medium
Area Expedition Conference ICDAR
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ YCY2017 Serial 3098
Permanent link to this record
 

 
Author Ivet Rafegas
Title Color in Visual Recognition: from flat to deep representations and some biological parallelisms Type Book Whole
Year 2017 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue (down) Pages
Keywords
Abstract Visual recognition is one of the main problems in computer vision that attempts to solve image understanding by deciding what objects are in images. This problem can be computationally solved by using relevant sets of visual features, such as edges, corners, color or more complex object parts. This thesis contributes to how color features have to be represented for recognition tasks.

Image features can be extracted following two different approaches. A first approach is defining handcrafted descriptors of images which is then followed by a learning scheme to classify the content (named flat schemes in Kruger et al. (2013). In this approach, perceptual considerations are habitually used to define efficient color features. Here we propose a new flat color descriptor based on the extension of color channels to boost the representation of spatio-chromatic contrast that surpasses state-of-the-art approaches. However, flat schemes present a lack of generality far away from the capabilities of biological systems. A second approach proposes evolving these flat schemes into a hierarchical process, like in the visual cortex. This includes an automatic process to learn optimal features. These deep schemes, and more specifically Convolutional Neural Networks (CNNs), have shown an impressive performance to solve various vision problems. However, there is a lack of understanding about the internal representation obtained, as a result of automatic learning. In this thesis we propose a new methodology to explore the internal representation of trained CNNs by defining the Neuron Feature as a visualization of the intrinsic features encoded in each individual neuron. Additionally, and inspired by physiological techniques, we propose to compute different neuron selectivity indexes (e.g., color, class, orientation or symmetry, amongst others) to label and classify the full CNN neuron population to understand learned representations.

Finally, using the proposed methodology, we show an in-depth study on how color is represented on a specific CNN, trained for object recognition, that competes with primate representational abilities (Cadieu et al (2014)). We found several parallelisms with biological visual systems: (a) a significant number of color selectivity neurons throughout all the layers; (b) an opponent and low frequency representation of color oriented edges and a higher sampling of frequency selectivity in brightness than in color in 1st layer like in V1; (c) a higher sampling of color hue in the second layer aligned to observed hue maps in V2; (d) a strong color and shape entanglement in all layers from basic features in shallower layers (V1 and V2) to object and background shapes in deeper layers (V4 and IT); and (e) a strong correlation between neuron color selectivities and color dataset bias.
Address November 2017
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Maria Vanrell
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-945373-7-0 Medium
Area Expedition Conference
Notes CIC Approved no
Call Number Admin @ si @ Raf2017 Serial 3100
Permanent link to this record