|   | 
Details
   web
Records
Author David Masip; Alexander Todorov; Jordi Vitria
Title The Role of Facial Regions in Evaluating Social Dime Type Conference Article
Year 2012 Publication 12th European Conference on Computer Vision – Workshops and Demonstrations Abbreviated Journal
Volume 7584 Issue II Pages 210-219
Keywords Workshops and Demonstrations
Abstract Facial trait judgments are an important information cue for people. Recent works in the Psychology field have stated the basis of face evaluation, defining a set of traits that we evaluate from faces (e.g. dominance, trustworthiness, aggressiveness, attractiveness, threatening or intelligence among others). We rapidly infer information from others faces, usually after a short period of time (< 1000ms) we perceive a certain degree of dominance or trustworthiness of another person from the face. Although these perceptions are not necessarily accurate, they influence many important social outcomes (such as the results of the elections or the court decisions). This topic has also attracted the attention of Computer Vision scientists, and recently a computational model to automatically predict trait evaluations from faces has been proposed. These systems try to mimic the human perception by means of applying machine learning classifiers to a set of labeled data. In this paper we perform an experimental study on the specific facial features that trigger the social inferences. Using previous results from the literature, we propose to use simple similarity maps to evaluate which regions of the face influence the most the trait inferences. The correlation analysis is performed using only appearance, and the results from the experiments suggest that each trait is correlated with specific facial characteristics.
Address Florence, Italy
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor (down) Andrea Fusiello, Vittorio Murino, Rita Cucchiara
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-33867-0 Medium
Area Expedition Conference ECCVW
Notes OR;MV Approved no
Call Number Admin @ si @ MTV2012 Serial 2171
Permanent link to this record
 

 
Author Patricia Marquez;Debora Gil;Aura Hernandez-Sabate
Title A Complete Confidence Framework for Optical Flow Type Conference Article
Year 2012 Publication 12th European Conference on Computer Vision – Workshops and Demonstrations Abbreviated Journal
Volume 7584 Issue 2 Pages 124-133
Keywords Optical flow, confidence measures, sparsification plots, error prediction plots
Abstract Medial representations are powerful tools for describing and parameterizing the volumetric shape of anatomical structures. Existing methods show excellent results when applied to 2D objects, but their quality drops across dimensions. This paper contributes to the computation of medial manifolds in two aspects. First, we provide a standard scheme for the computation of medial manifolds that avoid degenerated medial axis segments; second, we introduce an energy based method which performs independently of the dimension. We evaluate quantitatively the performance of our method with respect to existing approaches, by applying them to synthetic shapes of known medial geometry. Finally, we show results on shape representation of multiple abdominal organs, exploring the use of medial manifolds for the representation of multi-organ relations.
Address
Corporate Author Thesis
Publisher Springer-Verlag Place of Publication Florence, Italy, October 7-13, 2012 Editor (down) Andrea Fusiello, Vittorio Murino ,Rita Cucchiara
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN 978-3-642-33867-0 Medium
Area Expedition Conference ECCVW
Notes IAM;ADAS; Approved no
Call Number IAM @ iam @ MGH2012b Serial 1991
Permanent link to this record
 

 
Author Jon Almazan; Lluis Gomez; Suman Ghosh; Ernest Valveny; Dimosthenis Karatzas
Title WATTS: A common representation of word images and strings using embedded attributes for text recognition and retrieval Type Book Chapter
Year 2020 Publication Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Springer Place of Publication Editor (down) Analysis”, K. Alahari; C.V. Jawahar
Language Summary Language Original Title
Series Editor Series Title Series on Advances in Computer Vision and Pattern Recognition Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ AGG2020 Serial 3496
Permanent link to this record
 

 
Author Mohamed Ali Souibgui
Title Document Image Enhancement and Recognition in Low Resource Scenarios: Application to Ciphers and Handwritten Text Type Book Whole
Year 2022 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract In this thesis, we propose different contributions with the goal of enhancing and recognizing historical handwritten document images, especially the ones with rare scripts, such as cipher documents.
In the first part, some effective end-to-end models for Document Image Enhancement (DIE) using deep learning models were presented. First, Generative Adversarial Networks (cGAN) for different tasks (document clean-up, binarization, deblurring, and watermark removal) were explored. Next, we further improve the results by recovering the degraded document images into a clean and readable form by integrating a text recognizer into the cGAN model to promote the generated document image to be more readable. Afterward, we present a new encoder-decoder architecture based on vision transformers to enhance both machine-printed and handwritten document images, in an end-to-end fashion.
The second part of the thesis addresses Handwritten Text Recognition (HTR) in low resource scenarios, i.e. when only few labeled training data is available. We propose novel methods for recognizing ciphers with rare scripts. First, a few-shot object detection based method was proposed. Then, we incorporate a progressive learning strategy that automatically assignspseudo-labels to a set of unlabeled data to reduce the human labor of annotating few pages while maintaining the good performance of the model. Secondly, a data generation technique based on Bayesian Program Learning (BPL) is proposed to overcome the lack of data in such rare scripts. Thirdly, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE). This latter self-supervised model is designed to tackle two tasks, text recognition and document image enhancement. The proposed model does not exhibit limitations of previous state-of-the-art methods based on contrastive losses, while at the same time, it requires substantially fewer data samples to converge.
In the third part of the thesis, we analyze, from the user perspective, the usage of HTR systems in low resource scenarios. This contrasts with the usual research on HTR, which often focuses on technical aspects only and rarely devotes efforts on implementing software tools for scholars in Humanities.
Address
Corporate Author Thesis Ph.D. thesis
Publisher IMPRIMA Place of Publication Editor (down) Alicia Fornes;Yousri Kessentini
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-124793-8-6 Medium
Area Expedition Conference
Notes DAG Approved no
Call Number Admin @ si @ Sou2022 Serial 3757
Permanent link to this record
 

 
Author Manuel Carbonell
Title Neural Information Extraction from Semi-structured Documents A Type Book Whole
Year 2020 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Sectors as fintech, legaltech or insurance process an inflow of millions of forms, invoices, id documents, claims or similar every day. Together with these, historical archives provide gigantic amounts of digitized documents containing useful information that needs to be stored in machine encoded text with a meaningful structure. This procedure, known as information extraction (IE) comprises the steps of localizing and recognizing text, identifying named entities contained in it and optionally finding relationships among its elements. In this work we explore multi-task neural models at image and graph level to solve all steps in a unified way. While doing so we find benefits and limitations of these end-to-end approaches in comparison with sequential separate methods. More specifically, we first propose a method to produce textual as well as semantic labels with a unified model from handwritten text line images. We do so with the use of a convolutional recurrent neural model trained with connectionist temporal classification to predict the textual as well as semantic information encoded in the images. Secondly, motivated by the success of this approach we investigate the unification of the localization and recognition tasks of handwritten text in full pages with an end-to-end model, observing benefits in doing so. Having two models that tackle information extraction subsequent task pairs in an end-to-end to end manner, we lastly contribute with a method to put them all together in a single neural network to solve the whole information extraction pipeline in a unified way. Doing so we observe some benefits and some limitations in the approach, suggesting that in certain cases it is beneficial to train specialized models that excel at a single challenging task of the information extraction process, as it can be the recognition of named entities or the extraction of relationships between them. For this reason we lastly study the use of the recently arrived graph neural network architectures for the semantic tasks of the information extraction process, which are recognition of named entities and relation extraction, achieving promising results on the relation extraction part.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor (down) Alicia Fornes;Mauricio Villegas;Josep Llados
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-122714-1-6 Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ Car20 Serial 3483
Permanent link to this record
 

 
Author Lei Kang
Title Robust Handwritten Text Recognition in Scarce Labeling Scenarios: Disentanglement, Adaptation and Generation Type Book Whole
Year 2020 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Handwritten documents are not only preserved in historical archives but also widely used in administrative documents such as cheques and claims. With the rise of the deep learning era, many state-of-the-art approaches have achieved good performance on specific datasets for Handwritten Text Recognition (HTR). However, it is still challenging to solve real use cases because of the varied handwriting styles across different writers and the limited labeled data. Thus, both explorin a more robust handwriting recognition architectures and proposing methods to diminish the gap between the source and target data in an unsupervised way are
demanded.
In this thesis, firstly, we explore novel architectures for HTR, from Sequence-to-Sequence (Seq2Seq) method with attention mechanism to non-recurrent Transformer-based method. Secondly, we focus on diminishing the performance gap between source and target data in an unsupervised way. Finally, we propose a group of generative methods for handwritten text images, which could be utilized to increase the training set to obtain a more robust recognizer. In addition, by simply modifying the generative method and joining it with a recognizer, we end up with an effective disentanglement method to distill textual content from handwriting styles so as to achieve a generalized recognition performance.
We outperform state-of-the-art HTR performances in the experimental results among different scientific and industrial datasets, which prove the effectiveness of the proposed methods. To the best of our knowledge, the non-recurrent recognizer and the disentanglement method are the first contributions in the handwriting recognition field. Furthermore, we have outlined the potential research lines, which would be interesting to explore in the future.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor (down) Alicia Fornes;Marçal Rusiñol;Mauricio Villegas
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-122714-0-9 Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ Kan20 Serial 3482
Permanent link to this record
 

 
Author Juan Ignacio Toledo
Title Information Extraction from Heterogeneous Handwritten Documents Type Book Whole
Year 2019 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract In this thesis we explore information Extraction from totally or partially handwritten documents. Basically we are dealing with two different application scenarios. The first scenario are modern highly structured documents like forms. In this kind of documents, the semantic information is encoded in different fields with a pre-defined location in the document, therefore, information extraction becomes roughly equivalent to transcription. The second application scenario are loosely structured totally handwritten documents, besides transcribing them, we need to assign a semantic label, from a set of known values to the handwritten words.
In both scenarios, transcription is an important part of the information extraction. For that reason in this thesis we present two methods based on Neural Networks, to transcribe handwritten text.In order to tackle the challenge of loosely structured documents, we have produced a benchmark, consisting of a dataset, a defined set of tasks and a metric, that was presented to the community as an international competition. Also, we propose different models based on Convolutional and Recurrent neural networks that are able to transcribe and assign different semantic labels to each handwritten words, that is, able to perform Information Extraction.
Address July 2019
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor (down) Alicia Fornes;Josep Llados
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-948531-7-3 Medium
Area Expedition Conference
Notes DAG; 600.140; 600.121 Approved no
Call Number Admin @ si @ Tol2019 Serial 3389
Permanent link to this record
 

 
Author Jorge Bernal
Title Polyp Localization and Segmentation in Colonoscopy Images by Means of a Model of Appearance for Polyps Type Journal Article
Year 2014 Publication Electronic Letters on Computer Vision and Image Analysis Abbreviated Journal ELCVIA
Volume 13 Issue 2 Pages 9-10
Keywords Colonoscopy; polyp localization; polyp segmentation; Eye-tracking
Abstract Colorectal cancer is the fourth most common cause of cancer death worldwide and its survival rate depends on the stage in which it is detected on hence the necessity for an early colon screening. There are several screening techniques but colonoscopy is still nowadays the gold standard, although it has some drawbacks such as the miss rate. Our contribution, in the field of intelligent systems for colonoscopy, aims at providing a polyp localization and a polyp segmentation system based on a model of appearance for polyps. To develop both methods we define a model of appearance for polyps, which describes a polyp as enclosed by intensity valleys. The novelty of our contribution resides on the fact that we include in our model aspects of the image formation and we also consider the presence of other elements from the endoluminal scene such as specular highlights and blood vessels, which have an impact on the performance of our methods. In order to develop our polyp localization method we accumulate valley information in order to generate energy maps, which are also used to guide the polyp segmentation. Our methods achieve promising results in polyp localization and segmentation. As we want to explore the usability of our methods we present a comparative analysis between physicians fixations obtained via an eye tracking device and our polyp localization method. The results show that our method is indistinguishable to novice physicians although it is far from expert physicians.
Address
Corporate Author Thesis
Publisher Place of Publication Editor (down) Alicia Fornes; Volkmar Frinken
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MV Approved no
Call Number Admin @ si @ Ber2014 Serial 2487
Permanent link to this record
 

 
Author Arnau Baro
Title Reading Music Systems: From Deep Optical Music Recognition to Contextual Methods Type Book Whole
Year 2022 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The transcription of sheet music into some machine-readable format can be carried out manually. However, the complexity of music notation inevitably leads to burdensome software for music score editing, which makes the whole process
very time-consuming and prone to errors. Consequently, automatic transcription
systems for musical documents represent interesting tools.
Document analysis is the subject that deals with the extraction and processing
of documents through image and pattern recognition. It is a branch of computer
vision. Taking music scores as source, the field devoted to address this task is
known as Optical Music Recognition (OMR). Typically, an OMR system takes an
image of a music score and automatically extracts its content into some symbolic
structure such as MEI or MusicXML.
In this dissertation, we have investigated different methods for recognizing a
single staff section (e.g. scores for violin, flute, etc.), much in the same way as most text recognition research focuses on recognizing words appearing in a given line image. These methods are based in two different methodologies. On the one hand, we present two methods based on Recurrent Neural Networks, in particular, the
Long Short-Term Memory Neural Network. On the other hand, a method based on Sequence to Sequence models is detailed.
Music context is needed to improve the OMR results, just like language models
and dictionaries help in handwriting recognition. For example, syntactical rules
and grammars could be easily defined to cope with the ambiguities in the rhythm.
In music theory, for example, the time signature defines the amount of beats per
bar unit. Thus, in the second part of this dissertation, different methodologies
have been investigated to improve the OMR recognition. We have explored three
different methods: (a) a graphic tree-structure representation, Dendrograms, that
joins, at each level, its primitives following a set of rules, (b) the incorporation of Language Models to model the probability of a sequence of tokens, and (c) graph neural networks to analyze the music scores to avoid meaningless relationships between music primitives.
Finally, to train all these methodologies, and given the method-specificity of
the datasets in the literature, we have created four different music datasets. Two of them are synthetic with a modern or old handwritten appearance, whereas the
other two are real handwritten scores, being one of them modern and the other
old.
Address
Corporate Author Thesis Ph.D. thesis
Publisher IMPRIMA Place of Publication Editor (down) Alicia Fornes
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-124793-8-6 Medium
Area Expedition Conference
Notes DAG; Approved no
Call Number Admin @ si @ Bar2022 Serial 3754
Permanent link to this record
 

 
Author Nataliya Shapovalova; Carles Fernandez; Xavier Roca; Jordi Gonzalez
Title Semantics of Human Behavior in Image Sequences Type Book Chapter
Year 2011 Publication Computer Analysis of Human Behavior Abbreviated Journal
Volume Issue 7 Pages 151-182
Keywords
Abstract Human behavior is contextualized and understanding the scene of an action is crucial for giving proper semantics to behavior. In this chapter we present a novel approach for scene understanding. The emphasis of this work is on the particular case of Human Event Understanding. We introduce a new taxonomy to organize the different semantic levels of the Human Event Understanding framework proposed. Such a framework particularly contributes to the scene understanding domain by (i) extracting behavioral patterns from the integrative analysis of spatial, temporal, and contextual evidence and (ii) integrative analysis of bottom-up and top-down approaches in Human Event Understanding. We will explore how the information about interactions between humans and their environment influences the performance of activity recognition, and how this can be extrapolated to the temporal domain in order to extract higher inferences from human events observed in sequences of images.
Address
Corporate Author Thesis
Publisher Springer London Place of Publication Editor (down) Albert Ali Salah;
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-0-85729-993-2 Medium
Area Expedition Conference
Notes ISE Approved no
Call Number Admin @ si @ SFR2011 Serial 1810
Permanent link to this record
 

 
Author Jaume Garcia; Debora Gil; A.Bajo; M.J.Ledesma-Carbayo; C.SantaMarta
Title Influence of the temporal resolution on the quantification of displacement fields in cardiac magnetic resonance tagged images Type Conference Article
Year 2008 Publication Proc. Computers in Cardiology Abbreviated Journal
Volume 35 Issue Pages 785-788
Keywords
Abstract It is difficult to acquire tagged cardiac MR images with a high temporal and spatial resolution using clinical MR scanners. However, if such images are used for quantifying scores based on motion, it is essential a resolution as high as possible. This paper explores the influence of the temporal resolution of a tagged series on the quantification of myocardial dynamic parameters. To such purpose we have designed a SPAMM (Spatial Modulation of Magnetization) sequence allowing acquisition of sequences at simple and double temporal resolution. Sequences are processed to compute myocardial motion by an automatic technique based on the tracking of the harmonic phase of tagged images (the Harmonic Phase Flow, HPF). The results have been compared to manual tracking of myocardial tags. The error in displacement fields for double resolution sequences reduces 17%.
Address
Corporate Author Thesis
Publisher Place of Publication Editor (down) Alan Murray
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM Approved no
Call Number IAM @ iam @ GGB2008 Serial 1508
Permanent link to this record
 

 
Author Debora Gil; Petia Radeva
Title Inhibition of False Landmarks Type Book Chapter
Year 2004 Publication Recent Advances in Artificial Intelligence Research and Development Abbreviated Journal
Volume Issue Pages 233-244
Keywords
Abstract We argue that a corner detector should be based on the degree of continuity of the tangent vector to the image level sets, work on the image domain and need no assumptions on neither the image local structure nor the particular geometry of the corner/junction. An operator measuring the degree of differentiability of the projection matrix on the image gradient fulfills the above requirements. Its high sensitivity to changes in vector directions makes it suitable for landmark location in real images prone to need smoothing to reduce the impact of noise. Because using smoothing kernels leads to corner misplacement, we suggest an alternative fake response remover based on the receptive field inhibition of spurious details. The combination of both orientation discontinuity detection and noise inhibition produce our Inhibition Orientation Energy (IOE) landmark locator.
Address
Corporate Author Thesis
Publisher IOS Press Place of Publication Barcelona (Spain) Editor (down) al, J.V. et
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM;MILAB Approved no
Call Number IAM @ iam @ GiR2004a Serial 1533
Permanent link to this record
 

 
Author Thierry Brouard; A. Delaplace; Muhammad Muzzamil Luqman; H. Cardot; Jean-Yves Ramel
Title Design of Evolutionary Methods Applied to the Learning of Bayesian Nerwork Structures Type Book Chapter
Year 2010 Publication Bayesian Network Abbreviated Journal
Volume Issue Pages 13-37
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Sciyo Place of Publication Editor (down) Ahmed Rebai
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-953-307-124-4 Medium
Area Expedition Conference
Notes Approved no
Call Number Admin @ si @ BDL2010 Serial 1461
Permanent link to this record
 

 
Author Patricia Suarez; Angel Sappa; Boris X. Vintimilla
Title Deep learning-based vegetation index estimation Type Book Chapter
Year 2021 Publication Generative Adversarial Networks for Image-to-Image Translation Abbreviated Journal
Volume Issue Pages 205-234
Keywords
Abstract Chapter 9
Address
Corporate Author Thesis
Publisher Elsevier Place of Publication Editor (down) A.Solanki; A.Nayyar; M.Naved
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MSIAU; 600.122 Approved no
Call Number Admin @ si @ SSV2021a Serial 3578
Permanent link to this record
 

 
Author Susana Alvarez; Anna Salvatella; Maria Vanrell; Xavier Otazu
Title 3D Texton Spaces for color-texture retrieval Type Conference Article
Year 2010 Publication 7th International Conference on Image Analysis and Recognition Abbreviated Journal
Volume 6111 Issue Pages 354–363
Keywords
Abstract Color and texture are visual cues of different nature, their integration in an useful visual descriptor is not an easy problem. One way to combine both features is to compute spatial texture descriptors independently on each color channel. Another way is to do the integration at the descriptor level. In this case the problem of normalizing both cues arises. In this paper we solve the latest problem by fusing color and texture through distances in texton spaces. Textons are the attributes of image blobs and they are responsible for texture discrimination as defined in Julesz’s Texton theory. We describe them in two low-dimensional and uniform spaces, namely, shape and color. The dissimilarity between color texture images is computed by combining the distances in these two spaces. Following this approach, we propose our TCD descriptor which outperforms current state of art methods in the two different approaches mentioned above, early combination with LBP and late combination with MPEG-7. This is done on an image retrieval experiment over a highly diverse texture dataset from Corel.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor (down) A.C. Campilho and M.S. Kamel
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-13771-6 Medium
Area Expedition Conference ICIAR
Notes CIC Approved no
Call Number CAT @ cat @ ASV2010a Serial 1325
Permanent link to this record