|   | 
Details
   web
Records
Author Juan Ramon Terven Salinas; Joaquin Salas; Bogdan Raducanu
Title Robust Head Gestures Recognition for Assistive Technology Type Book Chapter
Year 2014 Publication Pattern Recognition Abbreviated Journal
Volume 8495 Issue Pages (down) 152-161
Keywords
Abstract This paper presents a system capable of recognizing six head gestures: nodding, shaking, turning right, turning left, looking up, and looking down. The main difference of our system compared to other methods is that the Hidden Markov Models presented in this paper, are fully connected and consider all possible states in any given order, providing the following advantages to the system: (1) allows unconstrained movement of the head and (2) it can be easily integrated into a wearable device (e.g. glasses, neck-hung devices), in which case it can robustly recognize gestures in the presence of ego-motion. Experimental results show that this approach outperforms common methods that use restricted HMMs for each gesture.
Address
Corporate Author Thesis
Publisher Springer International Publishing Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-319-07490-0 Medium
Area Expedition Conference
Notes LAMP; Approved no
Call Number Admin @ si @ TSR2014b Serial 2505
Permanent link to this record
 

 
Author Josep Llados; Ernest Valveny; Enric Marti
Title Symbol Recognition in Document Image Analysis: Methods and Challenges Type Journal Article
Year 2000 Publication Recent Research Developments in Pattern Recognition, Transworld Research Network, Abbreviated Journal
Volume 1 Issue Pages (down) 151–178.
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 81-86846-61-1 Medium
Area Expedition Conference
Notes DAG;IAM Approved no
Call Number IAM @ iam @ LVM2000 Serial 1575
Permanent link to this record
 

 
Author Nataliya Shapovalova; Carles Fernandez; Xavier Roca; Jordi Gonzalez
Title Semantics of Human Behavior in Image Sequences Type Book Chapter
Year 2011 Publication Computer Analysis of Human Behavior Abbreviated Journal
Volume Issue 7 Pages (down) 151-182
Keywords
Abstract Human behavior is contextualized and understanding the scene of an action is crucial for giving proper semantics to behavior. In this chapter we present a novel approach for scene understanding. The emphasis of this work is on the particular case of Human Event Understanding. We introduce a new taxonomy to organize the different semantic levels of the Human Event Understanding framework proposed. Such a framework particularly contributes to the scene understanding domain by (i) extracting behavioral patterns from the integrative analysis of spatial, temporal, and contextual evidence and (ii) integrative analysis of bottom-up and top-down approaches in Human Event Understanding. We will explore how the information about interactions between humans and their environment influences the performance of activity recognition, and how this can be extrapolated to the temporal domain in order to extract higher inferences from human events observed in sequences of images.
Address
Corporate Author Thesis
Publisher Springer London Place of Publication Editor Albert Ali Salah;
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-0-85729-993-2 Medium
Area Expedition Conference
Notes ISE Approved no
Call Number Admin @ si @ SFR2011 Serial 1810
Permanent link to this record
 

 
Author Debora Gil; Oriol Ramos Terrades; Elisa Minchole; Carles Sanchez; Noelia Cubero de Frutos; Marta Diez-Ferrer; Rosa Maria Ortiz; Antoni Rosell
Title Classification of Confocal Endomicroscopy Patterns for Diagnosis of Lung Cancer Type Conference Article
Year 2017 Publication 6th Workshop on Clinical Image-based Procedures: Translational Research in Medical Imaging Abbreviated Journal
Volume 10550 Issue Pages (down) 151-159
Keywords
Abstract Confocal Laser Endomicroscopy (CLE) is an emerging imaging technique that allows the in-vivo acquisition of cell patterns of potentially malignant lesions. Such patterns could discriminate between inflammatory and neoplastic lesions and, thus, serve as a first in-vivo biopsy to discard cases that do not actually require a cell biopsy.

The goal of this work is to explore whether CLE images obtained during videobronchoscopy contain enough visual information to discriminate between benign and malign peripheral lesions for lung cancer diagnosis. To do so, we have performed a pilot comparative study with 12 patients (6 adenocarcinoma and 6 benign-inflammatory) using 2 different methods for CLE pattern analysis: visual analysis by 3 experts and a novel methodology that uses graph methods to find patterns in pre-trained feature spaces. Our preliminary results indicate that although visual analysis can only achieve a 60.2% of accuracy, the accuracy of the proposed unsupervised image pattern classification raises to 84.6%.

We conclude that CLE images visual information allow in-vivo detection of neoplastic lesions and graph structural analysis applied to deep-learning feature spaces can achieve competitive results.
Address Quebec; Canada; September 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CLIP
Notes IAM; 600.096; 600.075; 600.145 Approved no
Call Number Admin @ si @ GRM2017 Serial 2957
Permanent link to this record
 

 
Author Josep Llados; Jaime Lopez-Krahe; Enric Marti
Title A system to understand hand-drawn floor plans using subgraph isomorphism and Hough transform Type Book Chapter
Year 1997 Publication Machine Vision and Applications Abbreviated Journal
Volume 10 Issue 3 Pages (down) 150-158
Keywords Line drawings – Hough transform – Graph matching – CAD systems – Graphics recognition
Abstract Presently, man-machine interface development is a widespread research activity. A system to understand hand drawn architectural drawings in a CAD environment is presented in this paper. To understand a document, we have to identify its building elements and their structural properties. An attributed graph structure is chosen as a symbolic representation of the input document and the patterns to recognize in it. An inexact subgraph isomorphism procedure using relaxation labeling techniques is performed. In this paper we focus on how to speed up the matching. There is a building element, the walls, characterized by a hatching pattern. Using a straight line Hough transform (SLHT)-based method, we recognize this pattern, characterized by parallel straight lines, and remove from the input graph the edges belonging to this pattern. The isomorphism is then applied to the remainder of the input graph. When all the building elements have been recognized, the document is redrawn, correcting the inaccurate strokes obtained from a hand-drawn input.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG;IAM Approved no
Call Number IAM @ iam @ LLM1997a Serial 1566
Permanent link to this record
 

 
Author Anders Hast; Alicia Fornes
Title A Segmentation-free Handwritten Word Spotting Approach by Relaxed Feature Matching Type Conference Article
Year 2016 Publication 12th IAPR Workshop on Document Analysis Systems Abbreviated Journal
Volume Issue Pages (down) 150-155
Keywords
Abstract The automatic recognition of historical handwritten documents is still considered challenging task. For this reason, word spotting emerges as a good alternative for making the information contained in these documents available to the user. Word spotting is defined as the task of retrieving all instances of the query word in a document collection, becoming a useful tool for information retrieval. In this paper we propose a segmentation-free word spotting approach able to deal with large document collections. Our method is inspired on feature matching algorithms that have been applied to image matching and retrieval. Since handwritten words have different shape, there is no exact transformation to be obtained. However, the sufficient degree of relaxation is achieved by using a Fourier based descriptor and an alternative approach to RANSAC called PUMA. The proposed approach is evaluated on historical marriage records, achieving promising results.
Address Santorini; Greece; April 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference DAS
Notes DAG; 602.006; 600.061; 600.077; 600.097 Approved no
Call Number HaF2016 Serial 2753
Permanent link to this record
 

 
Author Muhammad Muzzamil Luqman; Thierry Brouard; Jean-Yves Ramel; Josep Llados
Title Recherche de sous-graphes par encapsulation floue des cliques d'ordre 2: Application à la localisation de contenu dans les images de documents graphiques Type Conference Article
Year 2012 Publication Colloque International Francophone sur l'Écrit et le Document Abbreviated Journal
Volume Issue Pages (down) 149-162
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CIFED
Notes DAG Approved no
Call Number Admin @ si @ LBR2012 Serial 2382
Permanent link to this record
 

 
Author Francesco Ciompi; Oriol Pujol; Oriol Rodriguez-Leor; Angel Serrano; J. Mauri; Petia Radeva
Title On in-vitro and in-vivo IVUS data fusion Type Conference Article
Year 2009 Publication 12th International Conference of the Catalan Association for Artificial Intelligence Abbreviated Journal
Volume 202 Issue Pages (down) 147-156
Keywords
Abstract The design and the validation of an automatic plaque characterization technique based on Intravascular Ultrasound (IVUS) usually requires a data ground-truth. The histological analysis of post-mortem coronary arteries is commonly assumed as the state-of-the-art process for the extraction of a reliable data-set of atherosclerotic plaques. Unfortunately, the amount of data provided by this technique is usually few, due to the difficulties in collecting post-mortem cases and phenomena of tissue spoiling during histological analysis. In this paper we tackle the process of fusing in-vivo and in-vitro IVUS data starting with the analysis of recently proposed approaches for the creation of an enhanced IVUS data-set; furthermore, we propose a new approach, named pLDS, based on semi-supervised learning with a data selection criterion. The enhanced data-set obtained by each one of the analyzed approaches is used to train a classifier for tissue characterization purposes. Finally, the discriminative power of each classifier is quantitatively assessed and compared by classifying a data-set of validated in-vitro IVUS data.
Address Cardona (Spain)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-60750-061-2 Medium
Area Expedition Conference CCIA
Notes MILAB;HuPBA Approved no
Call Number BCNPCL @ bcnpcl @ CPR2009d Serial 1204
Permanent link to this record
 

 
Author Francisco Alvaro; Francisco Cruz; Joan Andreu Sanchez; Oriol Ramos Terrades; Jose Miguel Benedi
Title Structure Detection and Segmentation of Documents Using 2D Stochastic Context-Free Grammars Type Journal Article
Year 2015 Publication Neurocomputing Abbreviated Journal NEUCOM
Volume 150 Issue A Pages (down) 147-154
Keywords document image analysis; stochastic context-free grammars; text classi cation features
Abstract In this paper we de ne a bidimensional extension of Stochastic Context-Free Grammars for structure detection and segmentation of images of documents.
Two sets of text classi cation features are used to perform an initial classi cation of each zone of the page. Then, the document segmentation is obtained as the most likely hypothesis according to a stochastic grammar. We used a dataset of historical marriage license books to validate this approach. We also tested several inference algorithms for Probabilistic Graphical Models
and the results showed that the proposed grammatical model outperformed
the other methods. Furthermore, grammars also provide the document structure
along with its segmentation.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 601.158; 600.077; 600.061 Approved no
Call Number Admin @ si @ ACS2015 Serial 2531
Permanent link to this record
 

 
Author Stepan Simsa; Milan Sulc; Michal Uricar; Yash Patel; Ahmed Hamdi; Matej Kocian; Matyas Skalicky; Jiri Matas; Antoine Doucet; Mickael Coustaty; Dimosthenis Karatzas
Title DocILE Benchmark for Document Information Localization and Extraction Type Conference Article
Year 2023 Publication 17th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 14188 Issue Pages (down) 147–166
Keywords Document AI; Information Extraction; Line Item Recognition; Business Documents; Intelligent Document Processing
Abstract This paper introduces the DocILE benchmark with the largest dataset of business documents for the tasks of Key Information Localization and Extraction and Line Item Recognition. It contains 6.7k annotated business documents, 100k synthetically generated documents, and nearly 1M unlabeled documents for unsupervised pre-training. The dataset has been built with knowledge of domain- and task-specific aspects, resulting in the following key features: (i) annotations in 55 classes, which surpasses the granularity of previously published key information extraction datasets by a large margin; (ii) Line Item Recognition represents a highly practical information extraction task, where key information has to be assigned to items in a table; (iii) documents come from numerous layouts and the test set includes zero- and few-shot cases as well as layouts commonly seen in the training set. The benchmark comes with several baselines, including RoBERTa, LayoutLMv3 and DETR-based Table Transformer; applied to both tasks of the DocILE benchmark, with results shared in this paper, offering a quick starting point for future work. The dataset, baselines and supplementary material are available at https://github.com/rossumai/docile.
Address San Jose; CA; USA; August 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number Admin @ si @ SSU2023 Serial 3903
Permanent link to this record
 

 
Author Naveen Onkarappa; Angel Sappa
Title Space Variant Representations for Mobile Platform Vision Applications Type Conference Article
Year 2011 Publication 14th International Conference on Computer Analysis of Images and Patterns Abbreviated Journal
Volume 6855 Issue II Pages (down) 146-154
Keywords
Abstract The log-polar space variant representation, motivated by biological vision, has been widely studied in the literature. Its data reduction and invariance properties made it useful in many vision applications. However, due to its nature, it fails in preserving features in the periphery. In the current work, as an attempt to overcome this problem, we propose a novel space-variant representation. It is evaluated and proved to be better than the log-polar representation in preserving the peripheral information, crucial for on-board mobile vision applications. The evaluation is performed by comparing log-polar and the proposed representation once they are used for estimating dense optical flow.
Address Seville, Spain
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor P. Real, D. Diaz, H. Molina, A. Berciano, W. Kropatsch
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-23677-8 Medium
Area Expedition Conference CAIP
Notes ADAS Approved no
Call Number NaS2011; ADAS @ adas @ Serial 1686
Permanent link to this record
 

 
Author Maedeh Aghaei; Mariella Dimiccoli; Petia Radeva
Title Multi-face tracking by extended bag-of-tracklets in egocentric photo-streams Type Journal Article
Year 2016 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU
Volume 149 Issue Pages (down) 146-156
Keywords
Abstract Wearable cameras offer a hands-free way to record egocentric images of daily experiences, where social events are of special interest. The first step towards detection of social events is to track the appearance of multiple persons involved in them. In this paper, we propose a novel method to find correspondences of multiple faces in low temporal resolution egocentric videos acquired through a wearable camera. This kind of photo-stream imposes additional challenges to the multi-tracking problem with respect to conventional videos. Due to the free motion of the camera and to its low temporal resolution, abrupt changes in the field of view, in illumination condition and in the target location are highly frequent. To overcome such difficulties, we propose a multi-face tracking method that generates a set of tracklets through finding correspondences along the whole sequence for each detected face and takes advantage of the tracklets redundancy to deal with unreliable ones. Similar tracklets are grouped into the so called extended bag-of-tracklets (eBoT), which is aimed to correspond to a specific person. Finally, a prototype tracklet is extracted for each eBoT, where the occurred occlusions are estimated by relying on a new measure of confidence. We validated our approach over an extensive dataset of egocentric photo-streams and compared it to state of the art methods, demonstrating its effectiveness and robustness.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; Approved no
Call Number Admin @ si @ ADR2016b Serial 2742
Permanent link to this record
 

 
Author Alex Falcon; Swathikiran Sudhakaran; Giuseppe Serra; Sergio Escalera; Oswald Lanz
Title Relevance-based Margin for Contrastively-trained Video Retrieval Models Type Conference Article
Year 2022 Publication ICMR '22: Proceedings of the 2022 International Conference on Multimedia Retrieval Abbreviated Journal
Volume Issue Pages (down) 146-157
Keywords
Abstract Video retrieval using natural language queries has attracted increasing interest due to its relevance in real-world applications, from intelligent access in private media galleries to web-scale video search. Learning the cross-similarity of video and text in a joint embedding space is the dominant approach. To do so, a contrastive loss is usually employed because it organizes the embedding space by putting similar items close and dissimilar items far. This framework leads to competitive recall rates, as they solely focus on the rank of the groundtruth items. Yet, assessing the quality of the ranking list is of utmost importance when considering intelligent retrieval systems, since multiple items may share similar semantics, hence a high relevance. Moreover, the aforementioned framework uses a fixed margin to separate similar and dissimilar items, treating all non-groundtruth items as equally irrelevant. In this paper we propose to use a variable margin: we argue that varying the margin used during training based on how much relevant an item is to a given query, i.e. a relevance-based margin, easily improves the quality of the ranking lists measured through nDCG and mAP. We demonstrate the advantages of our technique using different models on EPIC-Kitchens-100 and YouCook2. We show that even if we carefully tuned the fixed margin, our technique (which does not have the margin as a hyper-parameter) would still achieve better performance. Finally, extensive ablation studies and qualitative analysis support the robustness of our approach. Code will be released at \urlhttps://github.com/aranciokov/RelevanceMargin-ICMR22.
Address Newwark, NJ, USA, 27 June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICMR
Notes HuPBA; no menciona Approved no
Call Number Admin @ si @ FSS2022 Serial 3808
Permanent link to this record
 

 
Author Javier Marin; David Geronimo; David Vazquez; Antonio Lopez
Title Pedestrian Detection: Exploring Virtual Worlds Type Book Chapter
Year 2012 Publication Handbook of Pattern Recognition: Methods and Application Abbreviated Journal
Volume 5 Issue Pages (down) 145-162
Keywords Virtual worlds; Pedestrian Detection; Domain Adaptation
Abstract Handbook of pattern recognition will include contributions from university educators and active research experts. This Handbook is intended to serve as a basic reference on methods and applications of pattern recognition. The primary aim of this handbook is providing the community of pattern recognition with a readable, easy to understand resource that covers introductory, intermediate and advanced topics with equal clarity. Therefore, the Handbook of pattern recognition can serve equally well as reference resource and as classroom textbook. Contributions cover all methods, techniques and applications of pattern recognition. A tentative list of relevant topics might include: 1- Statistical, structural, syntactic pattern recognition. 2- Neural networks, machine learning, data mining. 3- Discrete geometry, algebraic, graph-based techniques for pattern recognition. 4- Face recognition, Signal analysis, image coding and processing, shape and texture analysis. 5- Document processing, text and graphics recognition, digital libraries. 6- Speech recognition, music analysis, multimedia systems. 7- Natural language analysis, information retrieval. 8- Biometrics, biomedical pattern analysis and information systems. 9- Other scientific, engineering, social and economical applications of pattern recognition. 10- Special hardware architectures, software packages for pattern recognition.
Address
Corporate Author Thesis
Publisher iConcept Press Place of Publication Editor
Language English Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-477554-82-1 Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number ADAS @ adas @ MGV2012 Serial 1979
Permanent link to this record
 

 
Author Md. Mostafa Kamal Sarker; Syeda Furruka Banu; Hatem A. Rashwan; Mohamed Abdel-Nasser; Vivek Kumar Singh; Sylvie Chambon; Petia Radeva; Domenec Puig
Title Food Places Classification in Egocentric Images Using Siamese Neural Networks Type Conference Article
Year 2019 Publication 22nd International Conference of the Catalan Association of Artificial Intelligence Abbreviated Journal
Volume Issue Pages (down) 145-151
Keywords
Abstract Wearable cameras are become more popular in recent years for capturing the unscripted moments of the first-person that help to analyze the users lifestyle. In this work, we aim to recognize the places related to food in egocentric images during a day to identify the daily food patterns of the first-person. Thus, this system can assist to improve their eating behavior to protect users against food-related diseases. In this paper, we use Siamese Neural Networks to learn the similarity between images from corresponding inputs for one-shot food places classification. We tested our proposed method with ‘MiniEgoFoodPlaces’ with 15 food related places. The proposed Siamese Neural Networks model with MobileNet achieved an overall classification accuracy of 76.74% and 77.53% on the validation and test sets of the “MiniEgoFoodPlaces” dataset, respectively outperforming with the base models, such as ResNet50, InceptionV3, and InceptionResNetV2.
Address Illes Balears; October 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CCIA
Notes MILAB; no proj Approved no
Call Number Admin @ si @ SBR2019 Serial 3368
Permanent link to this record