|   | 
Details
   web
Records
Author Marçal Rusiñol; Lluis Pere de las Heras; Oriol Ramos Terrades
Title Flowchart Recognition for Non-Textual Information Retrieval in Patent Search Type Journal Article
Year 2014 Publication Information Retrieval Abbreviated Journal IR
Volume (down) 17 Issue 5-6 Pages 545-562
Keywords Flowchart recognition; Patent documents; Text/graphics separation; Raster-to-vector conversion; Symbol recognition
Abstract Relatively little research has been done on the topic of patent image retrieval and in general in most of the approaches the retrieval is performed in terms of a similarity measure between the query image and the images in the corpus. However, systems aimed at overcoming the semantic gap between the visual description of patent images and their conveyed concepts would be very helpful for patent professionals. In this paper we present a flowchart recognition method aimed at achieving a structured representation of flowchart images that can be further queried semantically. The proposed method was submitted to the CLEF-IP 2012 flowchart recognition task. We report the obtained results on this dataset.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1386-4564 ISBN Medium
Area Expedition Conference
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ RHR2013 Serial 2342
Permanent link to this record
 

 
Author David Fernandez; Josep Llados; Alicia Fornes
Title A graph-based approach for segmenting touching lines in historical handwritten documents Type Journal Article
Year 2014 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR
Volume (down) 17 Issue 3 Pages 293-312
Keywords Text line segmentation; Handwritten documents; Document image processing; Historical document analysis
Abstract Text line segmentation in handwritten documents is an important task in the recognition of historical documents. Handwritten document images contain text lines with multiple orientations, touching and overlapping characters between consecutive text lines and different document structures, making line segmentation a difficult task. In this paper, we present a new approach for handwritten text line segmentation solving the problems of touching components, curvilinear text lines and horizontally overlapping components. The proposed algorithm formulates line segmentation as finding the central path in the area between two consecutive lines. This is solved as a graph traversal problem. A graph is constructed using the skeleton of the image. Then, a path-finding algorithm is used to find the optimum path between text lines. The proposed algorithm has been evaluated on a comprehensive dataset consisting of five databases: ICDAR2009, ICDAR2013, UMD, the George Washington and the Barcelona Marriages Database. The proposed method outperforms the state-of-the-art considering the different types and difficulties of the benchmarking data.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1433-2833 ISBN Medium
Area Expedition Conference
Notes DAG; 600.056; 600.061; 602.006; 600.077 Approved no
Call Number Admin @ si @ FLF2014 Serial 2459
Permanent link to this record
 

 
Author Marçal Rusiñol; Volkmar Frinken; Dimosthenis Karatzas; Andrew Bagdanov; Josep Llados
Title Multimodal page classification in administrative document image streams Type Journal Article
Year 2014 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR
Volume (down) 17 Issue 4 Pages 331-341
Keywords Digital mail room; Multimodal page classification; Visual and textual document description
Abstract In this paper, we present a page classification application in a banking workflow. The proposed architecture represents administrative document images by merging visual and textual descriptions. The visual description is based on a hierarchical representation of the pixel intensity distribution. The textual description uses latent semantic analysis to represent document content as a mixture of topics. Several off-the-shelf classifiers and different strategies for combining visual and textual cues have been evaluated. A final step uses an n-gram model of the page stream allowing a finer-grained classification of pages. The proposed method has been tested in a real large-scale environment and we report results on a dataset of 70,000 pages.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1433-2833 ISBN Medium
Area Expedition Conference
Notes DAG; LAMP; 600.056; 600.061; 601.240; 601.223; 600.077; 600.079 Approved no
Call Number Admin @ si @ RFK2014 Serial 2523
Permanent link to this record
 

 
Author Sergio Escalera; Vassilis Athitsos; Isabelle Guyon
Title Challenges in multimodal gesture recognition Type Journal Article
Year 2016 Publication Journal of Machine Learning Research Abbreviated Journal JMLR
Volume (down) 17 Issue Pages 1-54
Keywords Gesture Recognition; Time Series Analysis; Multimodal Data Analysis; Computer Vision; Pattern Recognition; Wearable sensors; Infrared Cameras; KinectTM
Abstract This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011-2015. We began right at the start of the KinectTMrevolution when inexpensive infrared cameras providing image depth recordings became available. We published papers using this technology and other more conventional methods, including regular video cameras, to record data, thus providing a good overview of uses of machine learning and computer vision using multimodal data in this area of application. Notably, we organized a series of challenges and made available several datasets we recorded for that purpose, including tens of thousands
of videos, which are available to conduct further research. We also overview recent state of the art works on gesture recognition based on a proposed taxonomy for gesture recognition, discussing challenges and future lines of research.
Address
Corporate Author Thesis
Publisher Place of Publication Editor Zhuowen Tu
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA;MILAB; Approved no
Call Number Admin @ si @ EAG2016 Serial 2764
Permanent link to this record
 

 
Author Arash Akbarinia; Karl R. Gegenfurtner
Title Metameric Mismatching in Natural and Artificial Reflectances Type Journal Article
Year 2017 Publication Journal of Vision Abbreviated Journal JV
Volume (down) 17 Issue 10 Pages 390-390
Keywords Metamer; colour perception; spectral discrimination; photoreceptors
Abstract The human visual system and most digital cameras sample the continuous spectral power distribution through three classes of receptors. This implies that two distinct spectral reflectances can result in identical tristimulus values under one illuminant and differ under another – the problem of metamer mismatching. It is still debated how frequent this issue arises in the real world, using naturally occurring reflectance functions and common illuminants.

We gathered more than ten thousand spectral reflectance samples from various sources, covering a wide range of environments (e.g., flowers, plants, Munsell chips) and evaluated their responses under a number of natural and artificial source of lights. For each pair of reflectance functions, we estimated the perceived difference using the CIE-defined distance ΔE2000 metric in Lab color space.

The degree of metamer mismatching depended on the lower threshold value l when two samples would be considered to lead to equal sensor excitations (ΔE < l), and on the higher threshold value h when they would be considered different. For example, for l=h=1, we found that 43.129 comparisons out of a total of 6×107 pairs would be considered metameric (1 in 104). For l=1 and h=5, this number reduced to 705 metameric pairs (2 in 106). Extreme metamers, for instance l=1 and h=10, were rare (22 pairs or 6 in 108), as were instances where the two members of a metameric pair would be assigned to different color categories. Not unexpectedly, we observed variations among different reflectance databases and illuminant spectra with more frequency under artificial illuminants than natural ones.

Overall, our numbers are not very different from those obtained earlier (Foster et al, JOSA A, 2006). However, our results also show that the degree of metamerism is typically not very strong and that category switches hardly ever occur.
Address Florida, USA; May 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes NEUROBIT; no menciona Approved no
Call Number Admin @ si @ AkG2017 Serial 2899
Permanent link to this record
 

 
Author Cristhian A. Aguilera-Carrasco; Angel Sappa; Cristhian Aguilera; Ricardo Toledo
Title Cross-Spectral Local Descriptors via Quadruplet Network Type Journal Article
Year 2017 Publication Sensors Abbreviated Journal SENS
Volume (down) 17 Issue 4 Pages 873
Keywords
Abstract This paper presents a novel CNN-based architecture, referred to as Q-Net, to learn local feature descriptors that are useful for matching image patches from two different spectral bands. Given correctly matched and non-matching cross-spectral image pairs, a quadruplet network is trained to map input image patches to a common Euclidean space, regardless of the input spectral band. Our approach is inspired by the recent success of triplet networks in the visible spectrum, but adapted for cross-spectral scenarios, where, for each matching pair, there are always two possible non-matching patches: one for each spectrum. Experimental evaluations on a public cross-spectral VIS-NIR dataset shows that the proposed approach improves the state-of-the-art. Moreover, the proposed technique can also be used in mono-spectral settings, obtaining a similar performance to triplet network descriptors, but requiring less training data.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.086; 600.118 Approved no
Call Number Admin @ si @ ASA2017 Serial 2914
Permanent link to this record
 

 
Author Zhijie Fang; David Vazquez; Antonio Lopez
Title On-Board Detection of Pedestrian Intentions Type Journal Article
Year 2017 Publication Sensors Abbreviated Journal SENS
Volume (down) 17 Issue 10 Pages 2193
Keywords pedestrian intention; ADAS; self-driving
Abstract Avoiding vehicle-to-pedestrian crashes is a critical requirement for nowadays advanced driver assistant systems (ADAS) and future self-driving vehicles. Accordingly, detecting pedestrians from raw sensor data has a history of more than 15 years of research, with vision playing a central role.
During the last years, deep learning has boosted the accuracy of image-based pedestrian detectors.
However, detection is just the first step towards answering the core question, namely is the vehicle going to crash with a pedestrian provided preventive actions are not taken? Therefore, knowing as soon as possible if a detected pedestrian has the intention of crossing the road ahead of the vehicle is
essential for performing safe and comfortable maneuvers that prevent a crash. However, compared to pedestrian detection, there is relatively little literature on detecting pedestrian intentions. This paper aims to contribute along this line by presenting a new vision-based approach which analyzes the
pose of a pedestrian along several frames to determine if he or she is going to enter the road or not. We present experiments showing 750 ms of anticipation for pedestrians crossing the road, which at a typical urban driving speed of 50 km/h can provide 15 additional meters (compared to a pure pedestrian detector) for vehicle automatic reactions or to warn the driver. Moreover, in contrast with state-of-the-art methods, our approach is monocular, neither requiring stereo nor optical flow information.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.085; 600.076; 601.223; 600.116; 600.118 Approved no
Call Number Admin @ si @ FVL2017 Serial 2983
Permanent link to this record
 

 
Author Penny Tarling; Mauricio Cantor; Albert Clapes; Sergio Escalera
Title Deep learning with self-supervision and uncertainty regularization to count fish in underwater images Type Journal Article
Year 2022 Publication PloS One Abbreviated Journal Plos
Volume (down) 17 Issue 5 Pages e0267759
Keywords
Abstract Effective conservation actions require effective population monitoring. However, accurately counting animals in the wild to inform conservation decision-making is difficult. Monitoring populations through image sampling has made data collection cheaper, wide-reaching and less intrusive but created a need to process and analyse this data efficiently. Counting animals from such data is challenging, particularly when densely packed in noisy images. Attempting this manually is slow and expensive, while traditional computer vision methods are limited in their generalisability. Deep learning is the state-of-the-art method for many computer vision tasks, but it has yet to be properly explored to count animals. To this end, we employ deep learning, with a density-based regression approach, to count fish in low-resolution sonar images. We introduce a large dataset of sonar videos, deployed to record wild Lebranche mullet schools (Mugil liza), with a subset of 500 labelled images. We utilise abundant unlabelled data in a self-supervised task to improve the supervised counting task. For the first time in this context, by introducing uncertainty quantification, we improve model training and provide an accompanying measure of prediction uncertainty for more informed biological decision-making. Finally, we demonstrate the generalisability of our proposed counting framework through testing it on a recent benchmark dataset of high-resolution annotated underwater images from varying habitats (DeepFish). From experiments on both contrasting datasets, we demonstrate our network outperforms the few other deep learning models implemented for solving this task. By providing an open-source framework along with training data, our study puts forth an efficient deep learning template for crowd counting aquatic animals thereby contributing effective methods to assess natural populations from the ever-increasing visual data.
Address
Corporate Author Thesis
Publisher Public Library of Science Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA Approved no
Call Number Admin @ si @ TCC2022 Serial 3743
Permanent link to this record
 

 
Author Ajian Liu; Chenxu Zhao; Zitong Yu; Jun Wan; Anyang Su; Xing Liu; Zichang Tan; Sergio Escalera; Junliang Xing; Yanyan Liang; Guodong Guo; Zhen Lei; Stan Z. Li; Shenshen Du
Title Contrastive Context-Aware Learning for 3D High-Fidelity Mask Face Presentation Attack Detection Type Journal Article
Year 2022 Publication IEEE Transactions on Information Forensics and Security Abbreviated Journal TIForensicSEC
Volume (down) 17 Issue Pages 2497 - 2507
Keywords
Abstract Face presentation attack detection (PAD) is essential to secure face recognition systems primarily from high-fidelity mask attacks. Most existing 3D mask PAD benchmarks suffer from several drawbacks: 1) a limited number of mask identities, types of sensors, and a total number of videos; 2) low-fidelity quality of facial masks. Basic deep models and remote photoplethysmography (rPPG) methods achieved acceptable performance on these benchmarks but still far from the needs of practical scenarios. To bridge the gap to real-world applications, we introduce a large-scale Hi gh- Fi delity Mask dataset, namely HiFiMask . Specifically, a total amount of 54,600 videos are recorded from 75 subjects with 225 realistic masks by 7 new kinds of sensors. Along with the dataset, we propose a novel C ontrastive C ontext-aware L earning (CCL) framework. CCL is a new training methodology for supervised PAD tasks, which is able to learn by leveraging rich contexts accurately (e.g., subjects, mask material and lighting) among pairs of live faces and high-fidelity mask attacks. Extensive experimental evaluations on HiFiMask and three additional 3D mask datasets demonstrate the effectiveness of our method. The codes and dataset will be released soon.
Address
Corporate Author Thesis
Publisher IEEE Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA Approved no
Call Number Admin @ si @ LZY2022 Serial 3778
Permanent link to this record
 

 
Author Juan Borrego-Carazo; Carles Sanchez; David Castells; Jordi Carrabina; Debora Gil
Title A benchmark for the evaluation of computational methods for bronchoscopic navigation Type Journal Article
Year 2022 Publication International Journal of Computer Assisted Radiology and Surgery Abbreviated Journal IJCARS
Volume (down) 17 Issue 1 Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM Approved no
Call Number Admin @ si @ BSC2022 Serial 3832
Permanent link to this record
 

 
Author Antoni Rosell; Sonia Baeza; S. Garcia-Reina; JL. Mate; Ignasi Guasch; I. Nogueira; I. Garcia-Olive; Guillermo Torres; Carles Sanchez; Debora Gil
Title EP01.05-001 Radiomics to Increase the Effectiveness of Lung Cancer Screening Programs. Radiolung Preliminary Results Type Journal Article
Year 2022 Publication Journal of Thoracic Oncology Abbreviated Journal JTO
Volume (down) 17 Issue 9 Pages S182
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM Approved no
Call Number Admin @ si @ RBG2022b Serial 3834
Permanent link to this record
 

 
Author Olivier Penacchio; Xavier Otazu; Arnold J Wilkings; Sara M. Haigh
Title A mechanistic account of visual discomfort Type Journal Article
Year 2023 Publication Frontiers in Neuroscience Abbreviated Journal FN
Volume (down) 17 Issue Pages
Keywords
Abstract Much of the neural machinery of the early visual cortex, from the extraction of local orientations to contextual modulations through lateral interactions, is thought to have developed to provide a sparse encoding of contour in natural scenes, allowing the brain to process efficiently most of the visual scenes we are exposed to. Certain visual stimuli, however, cause visual stress, a set of adverse effects ranging from simple discomfort to migraine attacks, and epileptic seizures in the extreme, all phenomena linked with an excessive metabolic demand. The theory of efficient coding suggests a link between excessive metabolic demand and images that deviate from natural statistics. Yet, the mechanisms linking energy demand and image spatial content in discomfort remain elusive. Here, we used theories of visual coding that link image spatial structure and brain activation to characterize the response to images observers reported as uncomfortable in a biologically based neurodynamic model of the early visual cortex that included excitatory and inhibitory layers to implement contextual influences. We found three clear markers of aversive images: a larger overall activation in the model, a less sparse response, and a more unbalanced distribution of activity across spatial orientations. When the ratio of excitation over inhibition was increased in the model, a phenomenon hypothesised to underlie interindividual differences in susceptibility to visual discomfort, the three markers of discomfort progressively shifted toward values typical of the response to uncomfortable stimuli. Overall, these findings propose a unifying mechanistic explanation for why there are differences between images and between observers, suggesting how visual input and idiosyncratic hyperexcitability give rise to abnormal brain responses that result in visual stress.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes NEUROBIT Approved no
Call Number Admin @ si @ POW2023 Serial 3886
Permanent link to this record
 

 
Author Yuhua Luo; Francisco Jose Perales; Juan J. Villanueva
Title An automatic Rotoscopy System for Human Motion Based on a Biomedical Graphical Model. Type Journal Article
Year 1992 Publication Computer & Graphics Abbreviated Journal
Volume (down) 16 Issue 4 Pages 355-362
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes Approved no
Call Number ISE @ ise @ LPV1992 Serial 249
Permanent link to this record
 

 
Author A. Pujol; Juan J. Villanueva
Title A supervised Modification of the Hausdorff distance for visual shape classification Type Journal
Year 2002 Publication International Journal of Pattern Recognition and Artificial Intelligence Abbreviated Journal
Volume (down) 16 Issue 3 Pages 349-359
Keywords
Abstract (IF: 0.359)
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE Approved no
Call Number PuV2002 Serial 273
Permanent link to this record
 

 
Author V. Kober; Mikhail Mozerov; J. Alvarez-Borrego; I.A. Ovseyevich
Title Adaptive Correlation Filters for Pattern Recognition Type Journal
Year 2006 Publication Pattern Recognition and Image Analysis Abbreviated Journal
Volume (down) 16 Issue 3 Pages 425-431
Keywords Pattern recognition, Correlation filters, A adaptive filters
Abstract Adaptive correlation filters based on synthetic discriminant functions (SDFs) for reliable pattern recognition are proposed. A given value of discrimination capability can be achieved by adapting a SDF filter to the input scene. This can be done by iterative training. Computer simulation results obtained with the proposed filters are compared with those of various correlation filters in terms of recognition performance.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ISE Approved no
Call Number ISE @ ise @ KMA2006a Serial 673
Permanent link to this record