Publicacions CVC -- Query Results

[51–60] << 61 62 63 64 65 66 67 68 69 70 >> [71–80]

Details

Records
Author	Marçal Rusiñol; Lluis Pere de las Heras; Oriol Ramos Terrades
Title	Flowchart Recognition for Non-Textual Information Retrieval in Patent Search			Type	Journal Article
Year	2014	Publication	Information Retrieval	Abbreviated Journal	IR
Volume	17	Issue	5-6	Pages	545-562
Keywords	Flowchart recognition; Patent documents; Text/graphics separation; Raster-to-vector conversion; Symbol recognition
Abstract	Relatively little research has been done on the topic of patent image retrieval and in general in most of the approaches the retrieval is performed in terms of a similarity measure between the query image and the images in the corpus. However, systems aimed at overcoming the semantic gap between the visual description of patent images and their conveyed concepts would be very helpful for patent professionals. In this paper we present a flowchart recognition method aimed at achieving a structured representation of flowchart images that can be further queried semantically. The proposed method was submitted to the CLEF-IP 2012 flowchart recognition task. We report the obtained results on this dataset.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1386-4564	ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.077			Approved	no
Call Number	Admin @ si @ RHR2013			Serial	2342
Permanent link to this record



Author	David Fernandez; Josep Llados; Alicia Fornes
Title	A graph-based approach for segmenting touching lines in historical handwritten documents			Type	Journal Article
Year	2014	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
Volume	17	Issue	3	Pages	293-312
Keywords	Text line segmentation; Handwritten documents; Document image processing; Historical document analysis
Abstract	Text line segmentation in handwritten documents is an important task in the recognition of historical documents. Handwritten document images contain text lines with multiple orientations, touching and overlapping characters between consecutive text lines and different document structures, making line segmentation a difficult task. In this paper, we present a new approach for handwritten text line segmentation solving the problems of touching components, curvilinear text lines and horizontally overlapping components. The proposed algorithm formulates line segmentation as finding the central path in the area between two consecutive lines. This is solved as a graph traversal problem. A graph is constructed using the skeleton of the image. Then, a path-finding algorithm is used to find the optimum path between text lines. The proposed algorithm has been evaluated on a comprehensive dataset consisting of five databases: ICDAR2009, ICDAR2013, UMD, the George Washington and the Barcelona Marriages Database. The proposed method outperforms the state-of-the-art considering the different types and difficulties of the benchmarking data.
Address
Corporate Author				Thesis
Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1433-2833	ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.056; 600.061; 602.006; 600.077			Approved	no
Call Number	Admin @ si @ FLF2014			Serial	2459
Permanent link to this record



Author	Marçal Rusiñol; Volkmar Frinken; Dimosthenis Karatzas; Andrew Bagdanov; Josep Llados
Title	Multimodal page classification in administrative document image streams			Type	Journal Article
Year	2014	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
Volume	17	Issue	4	Pages	331-341
Keywords	Digital mail room; Multimodal page classification; Visual and textual document description
Abstract	In this paper, we present a page classification application in a banking workflow. The proposed architecture represents administrative document images by merging visual and textual descriptions. The visual description is based on a hierarchical representation of the pixel intensity distribution. The textual description uses latent semantic analysis to represent document content as a mixture of topics. Several off-the-shelf classifiers and different strategies for combining visual and textual cues have been evaluated. A final step uses an n-gram model of the page stream allowing a finer-grained classification of pages. The proposed method has been tested in a real large-scale environment and we report results on a dataset of 70,000 pages.
Address
Corporate Author				Thesis
Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1433-2833	ISBN		Medium
Area		Expedition		Conference
Notes	DAG; LAMP; 600.056; 600.061; 601.240; 601.223; 600.077; 600.079			Approved	no
Call Number	Admin @ si @ RFK2014			Serial	2523
Permanent link to this record



Author	Sergio Escalera; Vassilis Athitsos; Isabelle Guyon
Title	Challenges in multimodal gesture recognition			Type	Journal Article
Year	2016	Publication	Journal of Machine Learning Research	Abbreviated Journal	JMLR
Volume	17	Issue		Pages	1-54
Keywords	Gesture Recognition; Time Series Analysis; Multimodal Data Analysis; Computer Vision; Pattern Recognition; Wearable sensors; Infrared Cameras; KinectTM
Abstract	This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011-2015. We began right at the start of the KinectTMrevolution when inexpensive infrared cameras providing image depth recordings became available. We published papers using this technology and other more conventional methods, including regular video cameras, to record data, thus providing a good overview of uses of machine learning and computer vision using multimodal data in this area of application. Notably, we organized a series of challenges and made available several datasets we recorded for that purpose, including tens of thousands of videos, which are available to conduct further research. We also overview recent state of the art works on gesture recognition based on a proposed taxonomy for gesture recognition, discussing challenges and future lines of research.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor	Zhuowen Tu
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA;MILAB;			Approved	no
Call Number	Admin @ si @ EAG2016			Serial	2764
Permanent link to this record



Author	Arash Akbarinia; Karl R. Gegenfurtner
Title	Metameric Mismatching in Natural and Artificial Reflectances			Type	Journal Article
Year	2017	Publication	Journal of Vision	Abbreviated Journal	JV
Volume	17	Issue	10	Pages	390-390
Keywords	Metamer; colour perception; spectral discrimination; photoreceptors
Abstract	The human visual system and most digital cameras sample the continuous spectral power distribution through three classes of receptors. This implies that two distinct spectral reflectances can result in identical tristimulus values under one illuminant and differ under another – the problem of metamer mismatching. It is still debated how frequent this issue arises in the real world, using naturally occurring reflectance functions and common illuminants. We gathered more than ten thousand spectral reflectance samples from various sources, covering a wide range of environments (e.g., flowers, plants, Munsell chips) and evaluated their responses under a number of natural and artificial source of lights. For each pair of reflectance functions, we estimated the perceived difference using the CIE-defined distance ΔE2000 metric in Lab color space. The degree of metamer mismatching depended on the lower threshold value l when two samples would be considered to lead to equal sensor excitations (ΔE < l), and on the higher threshold value h when they would be considered different. For example, for l=h=1, we found that 43.129 comparisons out of a total of 6×107 pairs would be considered metameric (1 in 104). For l=1 and h=5, this number reduced to 705 metameric pairs (2 in 106). Extreme metamers, for instance l=1 and h=10, were rare (22 pairs or 6 in 108), as were instances where the two members of a metameric pair would be assigned to different color categories. Not unexpectedly, we observed variations among different reflectance databases and illuminant spectra with more frequency under artificial illuminants than natural ones. Overall, our numbers are not very different from those obtained earlier (Foster et al, JOSA A, 2006). However, our results also show that the degree of metamerism is typically not very strong and that category switches hardly ever occur.
Address	Florida, USA; May 2017
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	NEUROBIT; no menciona			Approved	no
Call Number	Admin @ si @ AkG2017			Serial	2899
Permanent link to this record



Author	Cristhian A. Aguilera-Carrasco; Angel Sappa; Cristhian Aguilera; Ricardo Toledo
Title	Cross-Spectral Local Descriptors via Quadruplet Network			Type	Journal Article
Year	2017	Publication	Sensors	Abbreviated Journal	SENS
Volume	17	Issue	4	Pages	873
Keywords
Abstract	This paper presents a novel CNN-based architecture, referred to as Q-Net, to learn local feature descriptors that are useful for matching image patches from two different spectral bands. Given correctly matched and non-matching cross-spectral image pairs, a quadruplet network is trained to map input image patches to a common Euclidean space, regardless of the input spectral band. Our approach is inspired by the recent success of triplet networks in the visible spectrum, but adapted for cross-spectral scenarios, where, for each matching pair, there are always two possible non-matching patches: one for each spectrum. Experimental evaluations on a public cross-spectral VIS-NIR dataset shows that the proposed approach improves the state-of-the-art. Moreover, the proposed technique can also be used in mono-spectral settings, obtaining a similar performance to triplet network descriptors, but requiring less training data.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.086; 600.118			Approved	no
Call Number	Admin @ si @ ASA2017			Serial	2914
Permanent link to this record



Author	Zhijie Fang; David Vazquez; Antonio Lopez
Title	On-Board Detection of Pedestrian Intentions			Type	Journal Article
Year	2017	Publication	Sensors	Abbreviated Journal	SENS
Volume	17	Issue	10	Pages	2193
Keywords	pedestrian intention; ADAS; self-driving
Abstract	Avoiding vehicle-to-pedestrian crashes is a critical requirement for nowadays advanced driver assistant systems (ADAS) and future self-driving vehicles. Accordingly, detecting pedestrians from raw sensor data has a history of more than 15 years of research, with vision playing a central role. During the last years, deep learning has boosted the accuracy of image-based pedestrian detectors. However, detection is just the first step towards answering the core question, namely is the vehicle going to crash with a pedestrian provided preventive actions are not taken? Therefore, knowing as soon as possible if a detected pedestrian has the intention of crossing the road ahead of the vehicle is essential for performing safe and comfortable maneuvers that prevent a crash. However, compared to pedestrian detection, there is relatively little literature on detecting pedestrian intentions. This paper aims to contribute along this line by presenting a new vision-based approach which analyzes the pose of a pedestrian along several frames to determine if he or she is going to enter the road or not. We present experiments showing 750 ms of anticipation for pedestrians crossing the road, which at a typical urban driving speed of 50 km/h can provide 15 additional meters (compared to a pure pedestrian detector) for vehicle automatic reactions or to warn the driver. Moreover, in contrast with state-of-the-art methods, our approach is monocular, neither requiring stereo nor optical flow information.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.085; 600.076; 601.223; 600.116; 600.118			Approved	no
Call Number	Admin @ si @ FVL2017			Serial	2983
Permanent link to this record



Author	Penny Tarling; Mauricio Cantor; Albert Clapes; Sergio Escalera
Title	Deep learning with self-supervision and uncertainty regularization to count fish in underwater images			Type	Journal Article
Year	2022	Publication	PloS One	Abbreviated Journal	Plos
Volume	17	Issue	5	Pages	e0267759
Keywords
Abstract	Effective conservation actions require effective population monitoring. However, accurately counting animals in the wild to inform conservation decision-making is difficult. Monitoring populations through image sampling has made data collection cheaper, wide-reaching and less intrusive but created a need to process and analyse this data efficiently. Counting animals from such data is challenging, particularly when densely packed in noisy images. Attempting this manually is slow and expensive, while traditional computer vision methods are limited in their generalisability. Deep learning is the state-of-the-art method for many computer vision tasks, but it has yet to be properly explored to count animals. To this end, we employ deep learning, with a density-based regression approach, to count fish in low-resolution sonar images. We introduce a large dataset of sonar videos, deployed to record wild Lebranche mullet schools (Mugil liza), with a subset of 500 labelled images. We utilise abundant unlabelled data in a self-supervised task to improve the supervised counting task. For the first time in this context, by introducing uncertainty quantification, we improve model training and provide an accompanying measure of prediction uncertainty for more informed biological decision-making. Finally, we demonstrate the generalisability of our proposed counting framework through testing it on a recent benchmark dataset of high-resolution annotated underwater images from varying habitats (DeepFish). From experiments on both contrasting datasets, we demonstrate our network outperforms the few other deep learning models implemented for solving this task. By providing an open-source framework along with training data, our study puts forth an efficient deep learning template for crowd counting aquatic animals thereby contributing effective methods to assess natural populations from the ever-increasing visual data.
Address
Corporate Author				Thesis
Publisher	Public Library of Science	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA			Approved	no
Call Number	Admin @ si @ TCC2022			Serial	3743
Permanent link to this record



Author	Ajian Liu; Chenxu Zhao; Zitong Yu; Jun Wan; Anyang Su; Xing Liu; Zichang Tan; Sergio Escalera; Junliang Xing; Yanyan Liang; Guodong Guo; Zhen Lei; Stan Z. Li; Shenshen Du
Title	Contrastive Context-Aware Learning for 3D High-Fidelity Mask Face Presentation Attack Detection			Type	Journal Article
Year	2022	Publication	IEEE Transactions on Information Forensics and Security	Abbreviated Journal	TIForensicSEC
Volume	17	Issue		Pages	2497 - 2507
Keywords
Abstract	Face presentation attack detection (PAD) is essential to secure face recognition systems primarily from high-fidelity mask attacks. Most existing 3D mask PAD benchmarks suffer from several drawbacks: 1) a limited number of mask identities, types of sensors, and a total number of videos; 2) low-fidelity quality of facial masks. Basic deep models and remote photoplethysmography (rPPG) methods achieved acceptable performance on these benchmarks but still far from the needs of practical scenarios. To bridge the gap to real-world applications, we introduce a large-scale Hi gh- Fi delity Mask dataset, namely HiFiMask . Specifically, a total amount of 54,600 videos are recorded from 75 subjects with 225 realistic masks by 7 new kinds of sensors. Along with the dataset, we propose a novel C ontrastive C ontext-aware L earning (CCL) framework. CCL is a new training methodology for supervised PAD tasks, which is able to learn by leveraging rich contexts accurately (e.g., subjects, mask material and lighting) among pairs of live faces and high-fidelity mask attacks. Extensive experimental evaluations on HiFiMask and three additional 3D mask datasets demonstrate the effectiveness of our method. The codes and dataset will be released soon.
Address
Corporate Author				Thesis
Publisher	IEEE	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA			Approved	no
Call Number	Admin @ si @ LZY2022			Serial	3778
Permanent link to this record



Author	Juan Borrego-Carazo; Carles Sanchez; David Castells; Jordi Carrabina; Debora Gil
Title	A benchmark for the evaluation of computational methods for bronchoscopic navigation			Type	Journal Article
Year	2022	Publication	International Journal of Computer Assisted Radiology and Surgery	Abbreviated Journal	IJCARS
Volume	17	Issue	1	Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	IAM			Approved	no
Call Number	Admin @ si @ BSC2022			Serial	3832
Permanent link to this record



Author	Antoni Rosell; Sonia Baeza; S. Garcia-Reina; JL. Mate; Ignasi Guasch; I. Nogueira; I. Garcia-Olive; Guillermo Torres; Carles Sanchez; Debora Gil
Title	EP01.05-001 Radiomics to Increase the Effectiveness of Lung Cancer Screening Programs. Radiolung Preliminary Results			Type	Journal Article
Year	2022	Publication	Journal of Thoracic Oncology	Abbreviated Journal	JTO
Volume	17	Issue	9	Pages	S182
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	IAM			Approved	no
Call Number	Admin @ si @ RBG2022b			Serial	3834
Permanent link to this record



Author	Olivier Penacchio; Xavier Otazu; Arnold J Wilkings; Sara M. Haigh
Title	A mechanistic account of visual discomfort			Type	Journal Article
Year	2023	Publication	Frontiers in Neuroscience	Abbreviated Journal	FN
Volume	17	Issue		Pages
Keywords
Abstract	Much of the neural machinery of the early visual cortex, from the extraction of local orientations to contextual modulations through lateral interactions, is thought to have developed to provide a sparse encoding of contour in natural scenes, allowing the brain to process efficiently most of the visual scenes we are exposed to. Certain visual stimuli, however, cause visual stress, a set of adverse effects ranging from simple discomfort to migraine attacks, and epileptic seizures in the extreme, all phenomena linked with an excessive metabolic demand. The theory of efficient coding suggests a link between excessive metabolic demand and images that deviate from natural statistics. Yet, the mechanisms linking energy demand and image spatial content in discomfort remain elusive. Here, we used theories of visual coding that link image spatial structure and brain activation to characterize the response to images observers reported as uncomfortable in a biologically based neurodynamic model of the early visual cortex that included excitatory and inhibitory layers to implement contextual influences. We found three clear markers of aversive images: a larger overall activation in the model, a less sparse response, and a more unbalanced distribution of activity across spatial orientations. When the ratio of excitation over inhibition was increased in the model, a phenomenon hypothesised to underlie interindividual differences in susceptibility to visual discomfort, the three markers of discomfort progressively shifted toward values typical of the response to uncomfortable stimuli. Overall, these findings propose a unifying mechanistic explanation for why there are differences between images and between observers, suggesting how visual input and idiosyncratic hyperexcitability give rise to abnormal brain responses that result in visual stress.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	NEUROBIT			Approved	no
Call Number	Admin @ si @ POW2023			Serial	3886
Permanent link to this record



Author	Yuhua Luo; Francisco Jose Perales; Juan J. Villanueva
Title	An automatic Rotoscopy System for Human Motion Based on a Biomedical Graphical Model.			Type	Journal Article
Year	1992	Publication	Computer & Graphics	Abbreviated Journal
Volume	16	Issue	4	Pages	355-362
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes				Approved	no
Call Number	ISE @ ise @ LPV1992			Serial	249
Permanent link to this record



Author	A. Pujol; Juan J. Villanueva
Title	A supervised Modification of the Hausdorff distance for visual shape classification			Type	Journal
Year	2002	Publication	International Journal of Pattern Recognition and Artificial Intelligence	Abbreviated Journal
Volume	16	Issue	3	Pages	349-359
Keywords
Abstract	(IF: 0.359)
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE			Approved	no
Call Number	PuV2002			Serial	273
Permanent link to this record



Author	V. Kober; Mikhail Mozerov; J. Alvarez-Borrego; I.A. Ovseyevich
Title	Adaptive Correlation Filters for Pattern Recognition			Type	Journal
Year	2006	Publication	Pattern Recognition and Image Analysis	Abbreviated Journal
Volume	16	Issue	3	Pages	425-431
Keywords	Pattern recognition, Correlation filters, A adaptive filters
Abstract	Adaptive correlation filters based on synthetic discriminant functions (SDFs) for reliable pattern recognition are proposed. A given value of discrimination capability can be achieved by adapting a SDF filter to the input scene. This can be done by iterative training. Computer simulation results obtained with the proposed filters are compared with those of various correlation filters in terms of recognition performance.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE			Approved	no
Call Number	ISE @ ise @ KMA2006a			Serial	673
Permanent link to this record