|
Records |
Links |
|
Author |
Fernando Barrera; Felipe Lumbreras; Angel Sappa |
|
|
Title |
Evaluation of Similarity Functions in Multimodal Stereo |
Type |
Conference Article |
|
Year |
2012 |
Publication |
9th International Conference on Image Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
7324 |
Issue |
I |
Pages |
320-329 |
|
|
Keywords |
Aveiro, Portugal |
|
|
Abstract |
This paper presents an evaluation framework for multimodal stereo matching, which allows the performance of four similarity functions to be compared. Additionally, it presents details of a multimodal stereo head that supplies thermal infrared and color images, as well as aspects of its calibration and rectification. The pipeline includes a novel method for the disparity selection, which is suitable for evaluating the similarity functions. Finally, a benchmark for comparing different initializations of the proposed framework is presented. Similarity functions are based on mutual information, gradient orientation and scale space representations. Their evaluation is performed using two metrics: i) disparity error, and ii) number of correct matches on planar regions. In addition to the proposed evaluation, the current paper also shows that 3D sparse representations can be recovered from such a multimodal stereo head. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-31294-6 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICIAR |
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
BLS2012a |
Serial |
2014 |
|
Permanent link to this record |
|
|
|
|
Author |
Diego Cheda; Daniel Ponsa; Antonio Lopez |
|
|
Title |
Monocular Depth-based Background Estimation |
Type |
Conference Article |
|
Year |
2012 |
Publication |
7th International Conference on Computer Vision Theory and Applications |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
323-328 |
|
|
Keywords |
|
|
|
Abstract |
In this paper, we address the problem of reconstructing the background of a scene from a video sequence with occluding objects. The images are taken by hand-held cameras. Our method composes the background by selecting the appropriate pixels from previously aligned input images. To do that, we minimize a cost function that penalizes the deviations from the following assumptions: background represents objects whose distance to the camera is maximal, and background objects are stationary. Distance information is roughly obtained by a supervised learning approach that allows us to distinguish between close and distant image regions. Moving foreground objects are filtered out by using stationariness and motion boundary constancy measurements. The cost function is minimized by a graph cuts method. We demonstrate the applicability of our approach to recover an occlusion-free background in a set of sequences. |
|
|
Address |
Roma |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
VISAPP |
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ CPL2012b; ADAS @ adas @ cpl2012e |
Serial |
2012 |
|
Permanent link to this record |
|
|
|
|
Author |
R. Valenti; N. Sebe; Theo Gevers |
|
|
Title |
What are you looking at? Improving Visual Gaze Estimation by Saliency |
Type |
Journal Article |
|
Year |
2012 |
Publication |
International Journal of Computer Vision |
Abbreviated Journal |
IJCV |
|
|
Volume |
98 |
Issue |
3 |
Pages |
324-334 |
|
|
Keywords |
|
|
|
Abstract |
Impact factor 2010: 5.15
Impact factor 2011/12?: 5.36
In this paper we present a novel mechanism to obtain enhanced gaze estimation for subjects looking at a scene or an image. The system makes use of prior knowledge about the scene (e.g. an image on a computer screen), to define a probability map of the scene the subject is gazing at, in order to find the most probable location. The proposed system helps in correcting the fixations which are erroneously estimated by the gaze estimation device by employing a saliency framework to adjust the resulting gaze point vector. The system is tested on three scenarios: using eye tracking data, enhancing a low accuracy webcam based eye tracker, and using a head pose tracker. The correlation between the subjects in the commercial eye tracking data is improved by an average of 13.91%. The correlation on the low accuracy eye gaze tracker is improved by 59.85%, and for the head pose tracker we obtain an improvement of 10.23%. These results show the potential of the system as a way to enhance and self-calibrate different visual gaze estimation systems. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0920-5691 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ALTRES;ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ VSG2012 |
Serial |
1848 |
|
Permanent link to this record |
|
|
|
|
Author |
Bogdan Raducanu; Fadi Dornaika |
|
|
Title |
Out-of-Sample Embedding by Sparse Representation |
Type |
Conference Article |
|
Year |
2012 |
Publication |
Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop |
Abbreviated Journal |
|
|
|
Volume |
7626 |
Issue |
|
Pages |
336-344 |
|
|
Keywords |
|
|
|
Abstract |
A critical aspect of non-linear dimensionality reduction techniques is represented by the construction of the adjacency graph. The difficulty resides in finding the optimal parameters, a process which, in general, is heuristically driven. Recently, sparse representation has been proposed as a non-parametric solution to overcome this problem. In this paper, we demonstrate that this approach not only serves for the graph construction, but also represents an efficient and accurate alternative for out-of-sample embedding. Considering for a case study the Laplacian Eigenmaps, we applied our method to the face recognition problem. Experimental results conducted on some challenging datasets confirmed the robustness of our approach and its superiority when compared to existing techniques. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-34165-6 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
SSPR&SPR |
|
|
Notes |
OR;MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ RaD2012c |
Serial |
2175 |
|
Permanent link to this record |
|
|
|
|
Author |
Jordi Roca; Maria Vanrell; C. Alejandro Parraga |
|
|
Title |
What is constant in colour constancy? |
Type |
Conference Article |
|
Year |
2012 |
Publication |
6th European Conference on Colour in Graphics, Imaging and Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
337-343 |
|
|
Keywords |
|
|
|
Abstract |
Color constancy refers to the ability of the human visual system to stabilize the color appearance of surfaces under an illuminant change. In this work we studied how the interrelations among nine colors are perceived under illuminant changes, particularly whether they remain stable across 10 different conditions (5 illuminants and 2 backgrounds). To do so we have used a paradigm that measures several colors under an immersive state of adaptation. From our measures we defined a perceptual structure descriptor that is up to 87% stable over all conditions, suggesting that color category features could be used to predict color constancy. This is in agreement with previous results on the stability of border categories [1,2] and with computational color constancy algorithms [3] for estimating the scene illuminant. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
9781622767014 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CGIV |
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
RVP2012 |
Serial |
2189 |
|
Permanent link to this record |
|
|
|
|
Author |
Alberto Hidalgo; Ferran Poveda; Enric Marti; Debora Gil; Albert Andaluz; Francesc Carreras; Manuel Ballester |
|
|
Title |
Evidence of continuous helical structure of the cardiac ventricular anatomy assessed by diffusion tensor imaging magnetic resonance multiresolution tractography |
Type |
Journal Article |
|
Year |
2012 |
Publication |
European Radiology |
Abbreviated Journal |
ECR |
|
|
Volume |
3 |
Issue |
1 |
Pages |
361-362 |
|
|
Keywords |
|
|
|
Abstract |
Deep understanding of myocardial structure linking morphology and function of the heart would unravel crucial knowledge for medical and surgical clinical procedures and studies. Diffusion tensor MRI provides a discrete measurement of the 3D arrangement of myocardial fibres by the observation of local anisotropic diffusion of water molecules in biological tissues. In this work, we present a multi-scale visualisation technique based on DT-MRI streamlining capable of uncovering additional properties of the architectural organisation of the heart. Methods and Materials: We selected the Johns Hopkins University (JHU) Canine Heart Dataset, where the long axis cardiac plane is aligned with the scanner’s Z-axis. Their equipment included a 4-element phased array coil at 1.5 T. For DTI acquisition, a 3D-FSE sequence was applied. We used 200 seeds for full-scale tractography, while we applied a MIP mapping technique for simplified tractographic reconstruction. In this case, we reduced each DTI 3D volume dimensions by order-two magnitude before streamlining.
Our simplified tractographic reconstruction method keeps the main geometric features of fibres, allowing for an easier identification of their global morphological disposition, including the ventricular basal ring. Moreover, we noticed a clearly visible helical disposition of the myocardial fibres, in line with the helical myocardial band ventricular structure described by Torrent-Guasp. Finally, our simplified visualisation with single tracts identifies the main segments of the helical ventricular architecture.
DT-MRI makes possible the identification of a continuous helical architecture of the myocardial fibres, which validates Torrent-Guasp’s helical myocardial band ventricular anatomical model. |
|
|
Address |
Vienna, Austria |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Link |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1869-4101 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
IAM |
Approved |
no |
|
|
Call Number |
IAM @ iam @ HPM2012 |
Serial |
1858 |
|
Permanent link to this record |
|
|
|
|
Author |
Ferran Diego; G.D. Evangelidis; Joan Serrat |
|
|
Title |
Night-time outdoor surveillance by mobile cameras |
Type |
Conference Article |
|
Year |
2012 |
Publication |
1st International Conference on Pattern Recognition Applications and Methods |
Abbreviated Journal |
|
|
|
Volume |
2 |
Issue |
|
Pages |
365-371 |
|
|
Keywords |
|
|
|
Abstract |
This paper addresses the problem of video surveillance by mobile cameras. We present a method that allows online change detection in night-time outdoor surveillance. Because of the camera movement, background frames are not available and must be “localized” in former sequences and registered with the current frames. To this end, we propose a Frame Localization And Registration (FLAR) approach that solves the problem efficiently. Frames of former sequences define a database which is queried by current frames in turn. To quickly retrieve nearest neighbors, the database is indexed through a visual dictionary method based on the SURF descriptor. Furthermore, frame localization benefits from a temporal filter that exploits the temporal coherence of videos. Next, the recently proposed ECC alignment scheme is used to spatially register the synchronized frames. Finally, change detection methods are applied to the aligned frames in order to mark suspicious areas. Experiments with real night sequences recorded by in-vehicle cameras demonstrate the performance of the proposed method and verify its efficiency and effectiveness against other methods. |
|
|
Address |
Algarve, Portugal |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPRAM |
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ DES2012 |
Serial |
2035 |
|
Permanent link to this record |
|
|
|
|
Author |
Ekaterina Zaytseva; Santiago Segui; Jordi Vitria |
|
|
Title |
Sketchable Histograms of Oriented Gradients for Object Detection |
Type |
Conference Article |
|
Year |
2012 |
Publication |
17th Iberoamerican Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
7441 |
Issue |
|
Pages |
374-381 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we investigate a new representation approach for visual object recognition. The new representation, called sketchable-HoG, extends the classical histogram of oriented gradients (HoG) feature by adding two different aspects: the stability of the majority orientation and the continuity of gradient orientations. In this way, the sketchable-HoG locally characterizes the complexity of an object model and introduces global structure information while still keeping simplicity, compactness and robustness. We evaluated the proposed image descriptor on the publicly available Caltech 101 dataset. The obtained results outperform the classical HoG descriptor as well as other descriptors reported in the literature. |
|
|
Address |
Buenos Aires, Argentina |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-33274-6 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CIARP |
|
|
Notes |
OR; MILAB;MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ ZSV2012 |
Serial |
2048 |
|
Permanent link to this record |
|
|
|
|
Author |
Jose Manuel Alvarez; Theo Gevers; Y. LeCun; Antonio Lopez |
|
|
Title |
Road Scene Segmentation from a Single Image |
Type |
Conference Article |
|
Year |
2012 |
Publication |
12th European Conference on Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
7578 |
Issue |
VII |
Pages |
376-389 |
|
|
Keywords |
road detection |
|
|
Abstract |
Road scene segmentation is important in computer vision for different applications such as autonomous driving and pedestrian detection. Recovering the 3D structure of road scenes provides relevant contextual information to improve their understanding.
In this paper, we use a convolutional neural network based algorithm to learn features from noisy labels to recover the 3D scene layout of a road image. The novelty of the algorithm relies on generating training labels by applying an algorithm trained on a general image dataset to classify on-board images. Further, we propose a novel texture descriptor based on a learned color plane fusion to obtain maximal uniformity in road areas. Finally, acquired (off-line) and current (on-line) information are combined to detect road areas in single images.
From quantitative and qualitative experiments, conducted on publicly available datasets, it is concluded that convolutional neural networks are suitable for learning 3D scene layout from noisy labels and provide a relative improvement of 7% compared to the baseline. Furthermore, combining color planes provides a statistical description of road areas that exhibits maximal uniformity and provides a relative improvement of 8% compared to the baseline. Finally, the improvement is even bigger when acquired and current information from a single image are combined. |
|
|
Address |
Florence, Italy |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-33785-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECCV |
|
|
Notes |
ADAS;ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ AGL2012; ADAS @ adas @ agl2012a |
Serial |
2022 |
|
Permanent link to this record |
|
|
|
|
Author |
Fadi Dornaika; Alireza Bosaghzadeh; Bogdan Raducanu |
|
|
Title |
LSDA Solution Schemes for Modelless 3D Head Pose Estimation |
Type |
Conference Article |
|
Year |
2012 |
Publication |
IEEE Workshop on the Applications of Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
393-398 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Breckenridge; USA |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACV |
|
|
Notes |
OR;MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ DBR2012 |
Serial |
1889 |
|
Permanent link to this record |
|
|
|
|
Author |
Bhaskar Chakraborty; Michael Holte; Thomas B. Moeslund; Jordi Gonzalez |
|
|
Title |
Selective Spatio-Temporal Interest Points |
Type |
Journal Article |
|
Year |
2012 |
Publication |
Computer Vision and Image Understanding |
Abbreviated Journal |
CVIU |
|
|
Volume |
116 |
Issue |
3 |
Pages |
396-410 |
|
|
Keywords |
|
|
|
Abstract |
Recent progress in the field of human action recognition points towards the use of Spatio-Temporal Interest Points (STIPs) for local descriptor-based recognition strategies. In this paper, we present a novel approach for robust and selective STIP detection, by applying surround suppression combined with local and temporal constraints. This new method is significantly different from existing STIP detection techniques and improves the performance by detecting more repeatable, stable and distinctive STIPs for human actors, while suppressing unwanted background STIPs. For action representation we use a bag-of-video words (BoV) model of local N-jet features to build a vocabulary of visual-words. To this end, we introduce a novel vocabulary building strategy by combining spatial pyramid and vocabulary compression techniques, resulting in improved performance and efficiency. Action class specific Support Vector Machine (SVM) classifiers are trained for categorization of human actions. A comprehensive set of experiments on popular benchmark datasets (KTH and Weizmann), more challenging datasets of complex scenes with background clutter and camera motion (CVC and CMU), movie and YouTube video clips (Hollywood 2 and YouTube), and complex scenes with multiple actors (MSR I and Multi-KTH), validates our approach and shows state-of-the-art performance. Due to the unavailability of ground truth action annotation data for the Multi-KTH dataset, we introduce an actor specific spatio-temporal clustering of STIPs to address the problem of automatic action annotation of multiple simultaneous actors. Additionally, we perform cross-data action recognition by training on source datasets (KTH and Weizmann) and testing on completely different and more challenging target datasets (CVC, CMU, MSR I and Multi-KTH). This documents the robustness of our proposed approach in the realistic scenario, using separate training and test datasets, which in general has been a shortcoming in the performance evaluation of human action recognition techniques. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1077-3142 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ CHM2012 |
Serial |
1806 |
|
Permanent link to this record |
|
|
|
|
Author |
Diego Cheda; Daniel Ponsa; Antonio Lopez |
|
|
Title |
Monocular Egomotion Estimation based on Image Matching |
Type |
Conference Article |
|
Year |
2012 |
Publication |
1st International Conference on Pattern Recognition Applications and Methods |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
425-430 |
|
|
Keywords |
SLAM |
|
|
Abstract |
|
|
|
Address |
Portugal |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPRAM |
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ CPL2012a;; ADAS @ adas @ |
Serial |
2011 |
|
Permanent link to this record |
|
|
|
|
Author |
Fernando Barrera; Felipe Lumbreras; Angel Sappa |
|
|
Title |
Multimodal Stereo Vision System: 3D Data Extraction and Algorithm Evaluation |
Type |
Journal Article |
|
Year |
2012 |
Publication |
IEEE Journal of Selected Topics in Signal Processing |
Abbreviated Journal |
J-STSP |
|
|
Volume |
6 |
Issue |
5 |
Pages |
437-446 |
|
|
Keywords |
|
|
|
Abstract |
This paper proposes an imaging system for computing sparse depth maps from multispectral images. A special stereo head consisting of an infrared and a color camera defines the proposed multimodal acquisition system. The cameras are rigidly attached so that their image planes are parallel. Details about the calibration and image rectification procedure are provided. Sparse disparity maps are obtained by the combined use of mutual information enriched with gradient information. The proposed approach is evaluated using a Receiver Operating Characteristics curve. Furthermore, a multispectral dataset, color and infrared images, together with their corresponding ground truth disparity maps, is generated and used as a test bed. Experimental results in real outdoor scenarios are provided showing its viability and that the proposed approach is not restricted to a specific domain. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1932-4553 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ BLS2012b |
Serial |
2155 |
|
Permanent link to this record |
|
|
|
|
Author |
Jon Almazan; David Fernandez; Alicia Fornes; Josep Llados; Ernest Valveny |
|
|
Title |
A Coarse-to-Fine Approach for Handwritten Word Spotting in Large Scale Historical Documents Collection |
Type |
Conference Article |
|
Year |
2012 |
Publication |
13th International Conference on Frontiers in Handwriting Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
453-458 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we propose an approach for word spotting in handwritten document images. We state the problem from a focused retrieval perspective, i.e. locating instances of a query word in a large scale dataset of digitized manuscripts. We combine two approaches, namely one based on word segmentation and another one segmentation-free. The first approach uses a hashing strategy to coarsely prune word images that are unlikely to be instances of the query word. This process is fast but has a low precision due to the errors introduced in the segmentation step. The regions containing candidate words are sent to the second process based on a state-of-the-art technique from the visual object detection field. This discriminative model represents the appearance of the query word and computes a similarity score. In this way we propose a coarse-to-fine approach achieving a compromise between efficiency and accuracy. The validation of the model is shown using a collection of old handwritten manuscripts. We observe a substantial improvement in precision over the previously proposed method, with a low increase in computational cost. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-4673-2262-1 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ AFF2012 |
Serial |
1983 |
|
Permanent link to this record |
|
|
|
|
Author |
Bogdan Raducanu; Fadi Dornaika |
|
|
Title |
Appearance-based Face Recognition Using A Supervised Manifold Learning Framework |
Type |
Conference Article |
|
Year |
2012 |
Publication |
IEEE Workshop on the Applications of Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
465-470 |
|
|
Keywords |
|
|
|
Abstract |
Many natural image sets, depicting objects whose appearance is changing due to motion, pose or light variations, can be considered samples of a low-dimension nonlinear manifold embedded in the high-dimensional observation space (the space of all possible images). The main contribution of our work is represented by a Supervised Laplacian Eigenmaps (S-LE) algorithm, which exploits the class label information for mapping the original data in the embedded space. Our proposed approach benefits from two important properties: i) it is discriminative, and ii) it adaptively selects the neighbors of a sample without using any predefined neighborhood size. Experiments were conducted on four face databases and the results demonstrate that the proposed algorithm significantly outperforms many linear and non-linear embedding techniques. Although we've focused on the face recognition problem, the proposed approach could also be extended to other categories of objects characterized by large variance in their appearance. |
|
|
Address |
Breckenridge; CO; USA |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
IEEE Xplore |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1550-5790 |
ISBN |
978-1-4673-0233-3 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACV |
|
|
Notes |
OR;MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ RaD2012d |
Serial |
1890 |
|
Permanent link to this record |