Author Alicia Fornes; Josep Llados; Gemma Sanchez
Title Primitive Segmentation in Old Handwritten Music Scores Type Miscellaneous
Year 2005 Publication 6th IAPR International Workshop on Graphics Recognition (GREC 2005) Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Hong Kong, Hong Kong SAR (China)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG Approved no
Call Number DAG @ dag @ FLS2005a Serial 584
Permanent link to this record
 

 
Author Alicia Fornes; Josep Llados; Gemma Sanchez
Title Primitive Segmentation in Old Handwritten Music Scores Type Book Chapter
Year 2006 Publication Graphics Recognition: Ten Years Review and Future Perspectives, W. Liu, J. Llados (Eds.), LNCS 3926: 288–299 Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG Approved no
Call Number DAG @ dag @ FLS2006 Serial 697
Permanent link to this record
 

 
Author Angel Sappa; Niki Aifanti; Sotiris Malassiotis; Michael G. Strintzis
Title Prior Knowledge Based Motion Model Representation Type Journal
Year 2005 Publication Electronic Letters on Computer Vision and Image Analysis, Special Issue on Articulated Motion & Deformable Objects, 5(3):55–67 (Electronic Letters: IF: 1.016) Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes Approved no
Call Number ADAS @ adas @ SAM2005b Serial 539
Permanent link to this record
 

 
Author Angel Sappa; Niki Aifanti; Sotiris Malassiotis; Michael G. Strintzis
Title Prior Knowledge Based Motion Model Representation Type Book Chapter
Year 2009 Publication Progress in Computer Vision and Image Analysis Abbreviated Journal
Volume 16 Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor Horst Bunke; JuanJose Villanueva; Gemma Sanchez
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number ADAS @ adas @ SAM2009 Serial 1235
Permanent link to this record
 

 
Author Alfons Juan-Ciscar; Gemma Sanchez
Title PRIS 2008. Pattern Recognition in Information Systems. Proceedings of the 8th International Workshop on Pattern Recognition in Information Systems – PRIS 2008, in conjunction with ICEIS 2008 Type Book Whole
Year 2008 Publication Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Barcelona (Spain)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG Approved no
Call Number DAG @ dag @ JuS2008 Serial 1054
Permanent link to this record
 

 
Author Ruben Tito; Khanh Nguyen; Marlon Tobaben; Raouf Kerkouche; Mohamed Ali Souibgui; Kangsoo Jung; Lei Kang; Ernest Valveny; Antti Honkela; Mario Fritz; Dimosthenis Karatzas
Title Privacy-Aware Document Visual Question Answering Type Miscellaneous
Year 2023 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Document Visual Question Answering (DocVQA) is a fast-growing branch of document understanding. Despite the fact that documents contain sensitive or copyrighted information, none of the current DocVQA methods offers strong privacy guarantees.
In this work, we explore privacy in the domain of DocVQA for the first time. We highlight privacy issues in state-of-the-art multi-modal LLMs used for DocVQA, and explore possible solutions.
Specifically, we focus on the invoice processing use case as a realistic, widely used scenario for document understanding, and propose a large-scale DocVQA dataset comprising invoice documents and associated questions and answers. We employ a federated learning scheme that reflects the real-life distribution of documents in different businesses, and we explore the use case where the ID of the invoice issuer is the sensitive information to be protected.
We demonstrate that non-private models tend to memorise, a behaviour that can lead to exposing private information. We then evaluate baseline training schemes employing federated learning and differential privacy in this multi-modal scenario, where the sensitive information might be exposed through either of the two input modalities: vision (document image) or language (OCR tokens).
Finally, we design an attack exploiting the memorisation effect of the model, and demonstrate its effectiveness in probing different DocVQA models.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG Approved no
Call Number Admin @ si @ PNT2023 Serial 4012
Permanent link to this record
 

 
Author Mohammad N. S. Jahromi; Pau Buch Cardona; Egils Avots; Kamal Nasrollahi; Sergio Escalera; Thomas B. Moeslund; Gholamreza Anbarjafari
Title Privacy-Constrained Biometric System for Non-cooperative Users Type Journal Article
Year 2019 Publication Entropy Abbreviated Journal ENTROPY
Volume 21 Issue 11 Pages 1033
Keywords biometric recognition; multimodal-based human identification; privacy; deep learning
Abstract With the consolidation of the new data protection regulation paradigm for each individual within the European Union (EU), major biometric technologies are now confronted with many concerns related to user privacy in biometric deployments. When an individual's biometrics are disclosed, sensitive personal information, such as financial or health data, is at high risk of being misused or compromised. This issue can escalate considerably in scenarios involving non-cooperative users, such as elderly people residing in care homes, who are unable to interact conveniently and securely with the biometric system. The primary goal of this study is to design a novel database to investigate the problem of automatic people recognition under privacy constraints. To do so, the collected dataset contains the subject's hand and foot traits and excludes the face biometrics of individuals in order to protect their privacy. We carried out extensive simulations using different baseline methods, including deep learning. Simulation results show that, with the spatial features extracted from the subject sequence in individual hand or foot videos, state-of-the-art deep models provide promising recognition performance.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ NBA2019 Serial 3313
Permanent link to this record
 

 
Author Ferran Diego
Title Probabilistic Alignment of Video Sequences Recorded by Moving Cameras Type Book Whole
Year 2011 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Video alignment consists of integrating multiple video sequences recorded independently into a single video sequence. This means registering them both in time (synchronizing frames) and in space (image registration) so that the two video sequences can be fused or compared pixel-wise. In spite of being relatively unknown, many applications today may benefit from the availability of robust and efficient video alignment methods. For instance, video surveillance requires integrating video sequences of the same scene recorded at different times in order to detect changes. The problem of aligning videos has been addressed before, but in the relatively simple cases of fixed or rigidly attached cameras and simultaneous acquisition. In addition, most works rely on restrictive assumptions that reduce the difficulty of the problem, such as linear time correspondence or knowledge of the complete trajectories of corresponding scene points on the images; to some extent, these assumptions limit the practical applicability of the solutions developed until now. In this thesis, we focus on the challenging problem of aligning sequences recorded at different times from independent moving cameras following similar but not coincident trajectories. More precisely, this thesis covers four studies that advance the state of the art in video alignment. First, we focus on analyzing and developing a probabilistic framework for video alignment, that is, a principled way to integrate multiple observations and prior information. In this way, two different approaches are presented to exploit the combination of several purely visual features (image intensities, visual words and dense motion field descriptor) and global positioning system (GPS) information. Second, we focus on reformulating the problem into a single alignment framework, since previous works on video alignment adopt a divide-and-conquer strategy, i.e., first solve the synchronization, and then register corresponding frames. This also generalizes the 'classic' case of fixed geometric transform and linear time mapping. Third, we focus on directly exploiting the time domain of the video sequences in order to avoid exhaustive cross-frame search. This provides relevant information used for learning the temporal mapping between pairs of video sequences. Finally, we focus on adapting these methods to the on-line setting for road detection and vehicle geolocation. The qualitative and quantitative results presented in this thesis on a variety of real-world pairs of video sequences show that the proposed method is robust to varying imaging conditions, different image content (e.g., incoming and outgoing vehicles), variations in camera velocity, and different scenarios (indoor and outdoor), going beyond the state of the art. Moreover, the on-line video alignment has been successfully applied to road detection and vehicle geolocation, achieving promising results.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Joan Serrat
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number Admin @ si @ Die2011 Serial 1787
Permanent link to this record
 

 
Author Xavier Baro
Title Probabilistic Darwin Machines: A New Approach to Develop Evolutionary Object Detection Type Book Whole
Year 2009 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Ever since computers were invented, we have wondered whether they might perform some everyday human tasks. One of the most studied, yet still least understood, problems is the capacity to learn from our experiences and to generalize the knowledge that we acquire. One task that people perform unconsciously, and that has attracted growing interest across scientific areas from the beginning, is known as pattern recognition. Creating models that represent the world around us helps us to recognize objects in our environment, predict situations, and identify behaviors. All this information allows us to adapt ourselves and to interact with our environment. The capacity of individuals to adapt to their environment has been related to the number of patterns that they are capable of identifying.

This thesis approaches the pattern recognition problem from a Computer Vision point of view, taking one of the most paradigmatic and widespread approaches to object detection as its starting point. After studying this approach, two weak points are identified: the first concerns the description of the objects, and the second is a limitation of the learning algorithm, which hampers the use of the best descriptors.

In order to address the learning limitations, we introduce evolutionary computation techniques to the classical object detection approach.

After testing the classical evolutionary approaches, such as genetic algorithms, we develop a new learning algorithm based on Probabilistic Darwin Machines, which is better adapted to the learning problem. Once the learning limitation is overcome, we introduce a new feature set, which maintains the benefits of the classical feature set while adding the ability to describe non-localities. This combination of evolutionary learning algorithm and features is tested on different public data sets, outperforming the results obtained by the classical approach.
Address Barcelona (Spain)
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Jordi Vitria
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes OR;HuPBA;MV Approved no
Call Number BCNPCL @ bcnpcl @ Bar2009 Serial 1262
Permanent link to this record
 

 
Author Francisco Cruz
Title Probabilistic Graphical Models for Document Analysis Type Book Whole
Year 2016 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The latest advances in digitization techniques have fostered interest in creating digital copies of collections of documents. Digitized documents permit easy maintenance, lossless storage, and efficient transmission and information retrieval. This situation has opened a new market niche for systems able to automatically extract and analyze the information contained in these collections, especially in the business domain.

Due to the great variety of document types, this is not a trivial task. For instance, the automatic extraction of numerical data from invoices differs substantially from text recognition in historical documents. However, in order to extract the information of interest, it is always necessary to identify the area of the document where it is located. In the area of Document Analysis we refer to this process as layout analysis, which aims at identifying and categorizing the different entities that compose the document, such as text regions, pictures, text lines, or tables, among others. To perform this task it is usually necessary to incorporate prior knowledge about the task into the analysis process, which can be modeled by defining a set of contextual relations between the different entities of the document. The use of context has proven to be useful to reinforce the recognition process and improve the results on many computer vision tasks. It raises two fundamental questions: what kind of contextual information is appropriate for a given task, and how to incorporate this information into the models.

In this thesis we study several ways to incorporate contextual information into the task of document layout analysis, and into the particular case of handwritten text line segmentation. We focus on the study of Probabilistic Graphical Models and other mechanisms for this purpose, and propose several solutions to these problems. First, we present a method for layout analysis based on Conditional Random Fields. With this model we encode local contextual relations between variables, such as pair-wise constraints. In addition, we encode a set of structural relations between different classes of regions at the feature level. Second, we present a method based on 2D-Probabilistic Context-free Grammars to encode structural and hierarchical relations. We perform a comparative study between Probabilistic Graphical Models and this syntactic approach. Third, we propose a method for structured documents based on Bayesian Networks to represent the document structure, and an algorithm based on Expectation-Maximization to find the best configuration of the page. We perform a thorough evaluation of the proposed methods on two particular collections of documents: a historical collection composed of ancient structured documents, and a collection of contemporary documents. In addition, we present a general method for the task of handwritten text line segmentation. We define a probabilistic framework where we combine the EM algorithm with variational approaches for computing inference and parameter learning on a Markov Random Field. We evaluate our method on several collections of documents, including a general dataset of annotated administrative documents. Results demonstrate the applicability of our method to real problems, and the contribution of contextual information to this kind of problem.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Oriol Ramos Terrades
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-945373-2-5 Medium
Area Expedition Conference
Notes DAG Approved no
Call Number Admin @ si @ Cru2016 Serial 2861
Permanent link to this record
 

 
Author Dani Rowe
Title Probabilistic Image-based Tracking in Complex Human Environments Type Report
Year 2005 Publication CVC Technical Report #92 Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address CVC (UAB)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes Approved no
Call Number Admin @ si @ Row2005 Serial 601
Permanent link to this record
 

 
Author Dani Rowe; Ignasi Rius; Jordi Gonzalez; Xavier Roca; Juan J. Villanueva
Title Probabilistic Image-Based Tracking: Improving Particle Filtering Type Book Chapter
Year 2005 Publication Pattern Recognition and Image Analysis (IbPRIA 2005), LNCS 3522: 85–92 Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Estoril (Portugal)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE Approved no
Call Number ISE @ ise @ RRG2005a Serial 545
Permanent link to this record
 

 
Author X. Orriols; Ricardo Toledo; X. Binefa; Petia Radeva; Jordi Vitria; Juan J. Villanueva
Title Probabilistic Saliency Approach for Elongated Structure Detection using Deformable Models Type Conference Article
Year 2000 Publication 15th International Conference on Pattern Recognition Abbreviated Journal
Volume 3 Issue Pages 1006-1009
Keywords
Abstract
Address Barcelona.
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes OR;MILAB;ADAS;MV Approved no
Call Number BCNPCL @ bcnpcl @ OTB2000 Serial 224
Permanent link to this record
 

 
Author Antonio Hernandez; Miguel Angel Bautista; Xavier Perez Sala; Victor Ponce; Sergio Escalera; Xavier Baro; Oriol Pujol; Cecilio Angulo
Title Probability-based Dynamic Time Warping and Bag-of-Visual-and-Depth-Words for Human Gesture Recognition in RGB-D Type Journal Article
Year 2014 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 50 Issue 1 Pages 112-121
Keywords RGB-D; Bag-of-Words; Dynamic Time Warping; Human Gesture Recognition
Abstract We present a methodology to address the problem of human gesture segmentation and recognition in video and depth image sequences. A Bag-of-Visual-and-Depth-Words (BoVDW) model is introduced as an extension of the Bag-of-Visual-Words (BoVW) model. State-of-the-art RGB and depth features, including a newly proposed depth descriptor, are analysed and combined in a late fusion form. The method is integrated in a Human Gesture Recognition pipeline, together with a novel probability-based Dynamic Time Warping (PDTW) algorithm which is used to perform prior segmentation of idle gestures. The proposed DTW variant uses samples of the same gesture category to build a Gaussian Mixture Model driven probabilistic model of that gesture class. Results of the whole Human Gesture Recognition pipeline in a public data set show better performance in comparison to both standard BoVW model and DTW approach.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA;MV; 605.203 Approved no
Call Number Admin @ si @ HBP2014 Serial 2353
Permanent link to this record
 

 
Author Miguel Angel Bautista; Antonio Hernandez; Victor Ponce; Xavier Perez Sala; Xavier Baro; Oriol Pujol; Cecilio Angulo; Sergio Escalera
Title Probability-based Dynamic Time Warping for Gesture Recognition on RGB-D data Type Conference Article
Year 2012 Publication 21st International Conference on Pattern Recognition International Workshop on Depth Image Analysis Abbreviated Journal
Volume 7854 Issue Pages 126-135
Keywords
Abstract Dynamic Time Warping (DTW) is commonly used in gesture recognition tasks in order to tackle the temporal length variability of gestures. In the DTW framework, a set of gesture patterns are compared one by one to a possibly infinite test sequence, and a query gesture category is recognized if a warping cost below a certain threshold is found within the test sequence. Nevertheless, taking either one single sample per gesture category or a set of isolated samples may not encode the variability of that gesture category. In this paper, a probability-based DTW for gesture recognition is proposed. Different samples of the same gesture pattern obtained from RGB-Depth data are used to build a Gaussian-based probabilistic model of the gesture. Finally, the cost of DTW has been adapted according to the new model. The proposed approach is tested in a challenging scenario, showing better performance of the probability-based DTW in comparison to state-of-the-art approaches for gesture recognition on RGB-D data.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-40302-6 Medium
Area Expedition Conference WDIA
Notes MILAB; OR;HuPBA;MV Approved no
Call Number Admin @ si @ BHP2012 Serial 2120
Permanent link to this record