|
Gabriel Villalonga, Sebastian Ramos, German Ros, David Vazquez, & Antonio Lopez. (2014). 3D Pedestrian Detection via Random Forest.
Abstract: Our demo focuses on showing the extraordinary performance of our novel 3D pedestrian detector along with its simplicity and real-time capabilities. This detector has been designed for autonomous driving applications, but it can also be applied in other scenarios that cover both outdoor and indoor applications.
Our pedestrian detector is based on the combination of a random forest classifier with HOG-LBP features and the inclusion of a preprocessing stage based on 3D scene information in order to precisely determine the image regions where the detector should search for pedestrians. This approach results in a highly accurate system that runs in real time, as required by many computer vision and robotics applications.
Keywords: Pedestrian Detection
|
|
|
Pau Riba, Jon Almazan, Alicia Fornes, David Fernandez, Ernest Valveny, & Josep Llados. (2014). e-Crowds: a mobile platform for browsing and searching in historical demography-related manuscripts. In 14th International Conference on Frontiers in Handwriting Recognition (pp. 228–233).
Abstract: This paper presents a prototype system running on portable devices for browsing and word searching through historical handwritten document collections. The platform adapts the paradigm of eBook reading, where the narrative is not necessarily sequential, but centered on the user actions. The novelty is to replace digitally born books by digitized historical manuscripts of marriage licenses, so document analysis tasks are required in the browser. With an active reading paradigm, the user can cast queries of people's names, so he/she can implicitly follow genealogical links. In addition, the system allows combined searches: the user can refine a search by adding more words to search. As a second contribution, the retrieval functionality involves as a core technology a word spotting module with a unified approach, which allows combined query searches, and also two input modalities: query-by-example and query-by-string.
|
|
|
C. Alejandro Parraga, Jordi Roca, Dimosthenis Karatzas, & Sophie Wuerger. (2014). Limitations of visual gamma corrections in LCD displays. Dis - Displays, 35(5), 227–239.
Abstract: A method for estimating the non-linear gamma transfer function of liquid-crystal displays (LCDs) without the need of a photometric measurement device was described by Xiao et al. (2011) [1]. It relies on observers' judgments of visual luminance by presenting eight half-tone patterns with luminances from 1/9 to 8/9 of the maximum value of each colour channel. These half-tone patterns were distributed over the screen along both the vertical and horizontal viewing axes. We conducted a series of photometric and psychophysical measurements (consisting of the simultaneous presentation of half-tone patterns in each trial) to evaluate whether the angular dependency of the light generated by three different LCD technologies would bias the results of these gamma transfer function estimations. Our results show that there are significant differences between the gamma transfer functions measured and produced by observers at different viewing angles. We suggest appropriate modifications to the Xiao et al. paradigm to counterbalance these artefacts, which also have the advantage of shortening the amount of time spent in collecting the psychophysical measurements.
Keywords: Display calibration; Psychophysics; Perceptual; Visual gamma correction; Luminance matching; Observer-based calibration
|
|
|
Jorge Bernal, Fernando Vilariño, F. Javier Sanchez, M. Arnold, Anarta Ghosh, & Gerard Lacey. (2014). Experts vs Novices: Applying Eye-tracking Methodologies in Colonoscopy Video Screening for Polyp Search. In 2014 Symposium on Eye Tracking Research and Applications (pp. 223–226).
Abstract: We present in this paper a novel study aiming at identifying the differences in visual search patterns between physicians of diverse levels of expertise during the screening of colonoscopy videos. Physicians were clustered into two groups (experts and novices) according to the number of procedures performed, and fixations were captured by an eye-tracker device during the task of polyp search in different video sequences. These fixations were integrated into heat maps, one for each cluster. The obtained maps were validated over a ground truth consisting of a mask of the polyp, and the comparison between experts and novices was performed by using metrics such as reaction time, dwelling time and energy concentration ratio. Experimental results show a statistically significant difference between experts and novices, and the obtained maps prove to be a useful tool for the characterisation of the behaviour of each group.
|
|
|
Lluis Pere de las Heras, Ahmed Sheraz, Marcus Liwicki, Ernest Valveny, & Gemma Sanchez. (2014). Statistical Segmentation and Structural Recognition for Floor Plan Interpretation. IJDAR - International Journal on Document Analysis and Recognition, 17(3), 221–237.
Abstract: A generic method for floor plan analysis and interpretation is presented in this article. The method, which is mainly inspired by the way engineers draw and interpret floor plans, applies two recognition steps in a bottom-up manner. First, basic building blocks, i.e., walls, doors, and windows, are detected using a statistical patch-based segmentation approach. Second, a graph is generated, and structural pattern recognition techniques are applied to further locate the main entities, i.e., rooms of the building. The proposed approach is able to analyze any type of floor plan regardless of the notation used. We have evaluated our method on different publicly available datasets of real architectural floor plans with different notations. The overall detection and recognition accuracy is about 95%, which is significantly better than any other state-of-the-art method. Our approach is generic enough that it could be easily adapted to the recognition and interpretation of any other printed machine-generated structured documents.
|
|
|
Ariel Amato, Felipe Lumbreras, & Angel Sappa. (2014). A General-purpose Crowdsourcing Platform for Mobile Devices. In 9th International Conference on Computer Vision Theory and Applications (Vol. 3, pp. 211–215).
Abstract: This paper presents details of a general-purpose, on-demand micro-task platform based on the crowdsourcing philosophy. This platform was specifically developed for mobile devices in order to exploit the strengths of such devices, namely: i) massivity, ii) ubiquity and iii) embedded sensors. The combined use of mobile platforms and the crowdsourcing model makes it possible to tackle tasks ranging from the simplest to the most complex. User experience is the highlighted feature of this platform (for both task-proposer and task-solver). Tools appropriate to a specific task are provided to a task-solver in order to perform his/her job in a simpler, faster and more appealing way. Moreover, a task can be easily submitted by just selecting predefined templates, which cover a wide range of possible applications. Examples of its usage in computer vision and computer games are provided, illustrating the potential of the platform.
Keywords: Crowdsourcing Platform; Mobile Crowdsourcing
|
|
|
P. Wang, V. Eglin, C. Garcia, C. Largeron, Josep Llados, & Alicia Fornes. (2014). A Novel Learning-free Word Spotting Approach Based on Graph Representation. In 11th IAPR International Workshop on Document Analysis and Systems (pp. 207–211).
Abstract: Effective information retrieval on handwritten document images has always been a challenging task. In this paper, we propose a novel handwritten word spotting approach based on graph representation. The presented model comprises both topological and morphological signatures of handwriting. Skeleton-based graphs with Shape Context labelled vertices are established for connected components. Each word image is represented as a sequence of graphs. In order to be robust to handwriting variations, an exhaustive merging process based on the DTW alignment result is introduced in the similarity measure between word images. To reduce the computational complexity, an approximate graph edit distance approach using bipartite matching is employed for graph matching. The experiments on the George Washington dataset and the marriage records from the Barcelona Cathedral dataset demonstrate that the proposed approach outperforms the state-of-the-art structural methods.
|
|
|
Alicia Fornes, V.C. Kieu, M. Visani, N. Journet, & Anjan Dutta. (2014). The ICDAR/GREC 2013 Music Scores Competition: Staff Removal. In B. Lamiroy, & J.-M. Ogier (Eds.), Graphics Recognition. Current Trends and Challenges (Vol. 8746, pp. 207–220). LNCS. Springer Berlin Heidelberg.
Abstract: The first competition on music scores that was organized at ICDAR and GREC in 2011 awoke the interest of researchers, who participated in both staff removal and writer identification tasks. In this second edition, we focus on the staff removal task and simulate a real case scenario concerning old and degraded music scores. For this purpose, we have generated a new set of semi-synthetic images using two degradation models that we previously introduced: local noise and 3D distortions. In this extended paper we provide a detailed description of the dataset, degradation models, evaluation metrics, the participants' methods and the obtained results that could not be presented in the ICDAR and GREC proceedings due to page limitations.
Keywords: Competition; Graphics recognition; Music scores; Writer identification; Staff removal
|
|
|
R. Clariso, David Masip, & A. Rius. (2014). Student projects empowering mobile learning in higher education. RUSC - Revista de Universidad y Sociedad del Conocimiento, 192–207.
|
|
|
Marçal Rusiñol, J. Chazalon, & Jean-Marc Ogier. (2014). Combining Focus Measure Operators to Predict OCR Accuracy in Mobile-Captured Document Images. In 11th IAPR International Workshop on Document Analysis and Systems (pp. 181–185).
Abstract: Mobile document image acquisition is a new trend raising serious issues in business document processing workflows. Such a digitization procedure is unreliable, and introduces many distortions which must be detected as soon as possible, on the mobile device, to avoid paying data transmission fees and losing information due to the inability to re-capture later a document with temporary availability. In this context, out-of-focus blur is a major issue: users have no direct control over it, and it seriously degrades OCR recognition. In this paper, we concentrate on the estimation of focus quality, to ensure a sufficient legibility of a document image for OCR processing. We propose two contributions to improve OCR accuracy prediction for mobile-captured document images. First, we present 24 focus measures, never tested on document images, which are fast to compute and require no training. Second, we show that a combination of those measures enables state-of-the-art performance regarding the correlation with OCR accuracy. The resulting approach is fast, robust, and easy to implement in a mobile device. Experiments are performed on a public dataset, and precise details about image processing are given.
|
|
|
David Roche, Debora Gil, & Jesus Giraldo. (2014). Mathematical modeling of G protein-coupled receptor function: What can we learn from empirical and mechanistic models? In G Protein-Coupled Receptors – Modeling and Simulation Advances in Experimental Medicine and Biology (Vol. 796, pp. 159–181). Springer Netherlands.
Abstract: Empirical and mechanistic models differ in their approaches to the analysis of pharmacological effect. Whereas the parameters of the former are not physical constants, those of the latter embody the nature, often complex, of biology. Empirical models are exclusively used for curve fitting, merely to characterize the shape of the E/[A] curves. Mechanistic models, on the contrary, enable the examination of mechanistic hypotheses by parameter simulation. Regrettably, the many parameters that mechanistic models may include can make curve fitting very difficult, and thus pose a challenge for computational method development. In the present study some empirical and mechanistic models are shown, and the connections which may appear between them in a number of cases are analyzed from the curves they yield. It may be concluded that systematic and careful curve shape analysis can be extremely useful for the understanding of receptor function, ligand classification and drug discovery, thus providing a common language for the communication between pharmacologists and medicinal chemists.
Keywords: β-arrestin; biased agonism; curve fitting; empirical modeling; evolutionary algorithm; functional selectivity; G protein; GPCR; Hill coefficient; intrinsic efficacy; inverse agonism; mathematical modeling; mechanistic modeling; operational model; parameter optimization; receptor dimer; receptor oligomerization; receptor constitutive activity; signal transduction; two-state model
|
|
|
Thanh Ha Do, Salvatore Tabbone, & Oriol Ramos Terrades. (2014). Spotting Symbol Using Sparsity over Learned Dictionary of Local Descriptors. In 11th IAPR International Workshop on Document Analysis and Systems (pp. 156–160).
Abstract: This paper proposes a new approach to spot symbols in graphical documents using sparse representations. More specifically, a dictionary is learned from a training database of local descriptors defined over the documents. Following their sparse representations, interest points sharing similar properties are used to define interest regions. Using an original adaptation of information retrieval techniques, a vector model for interest regions and for a query symbol is built based on its sparsity in a visual vocabulary where the visual words are columns in the learned dictionary. The matching process is performed by comparing the similarity between vector models. Evaluation on SESYD datasets demonstrates that our method is promising.
|
|
|
Juan Ramon Terven Salinas, Joaquin Salas, & Bogdan Raducanu. (2014). Robust Head Gestures Recognition for Assistive Technology. In Pattern Recognition (Vol. 8495, pp. 152–161). LNCS. Springer International Publishing.
Abstract: This paper presents a system capable of recognizing six head gestures: nodding, shaking, turning right, turning left, looking up, and looking down. The main difference of our system compared to other methods is that the Hidden Markov Models presented in this paper are fully connected and consider all possible states in any given order, providing the following advantages to the system: (1) it allows unconstrained movement of the head and (2) it can be easily integrated into a wearable device (e.g. glasses, neck-hung devices), in which case it can robustly recognize gestures in the presence of ego-motion. Experimental results show that this approach outperforms common methods that use restricted HMMs for each gesture.
|
|
|
Oualid M. Benkarim, Petia Radeva, & Laura Igual. (2014). Label Consistent Multiclass Discriminative Dictionary Learning for MRI Segmentation. In 8th Conference on Articulated Motion and Deformable Objects (Vol. 8563, pp. 138–147). LNCS. Springer International Publishing.
Abstract: The automatic segmentation of multiple subcortical structures in brain Magnetic Resonance Images (MRI) still remains a challenging task. In this paper, we address this problem using sparse representation and discriminative dictionary learning, which have shown promising results in compression, image denoising and recently in MRI segmentation. Particularly, we use multiclass dictionaries learned from a set of brain atlases to simultaneously segment multiple subcortical structures.
We also constrain dictionary atoms to be specialized in one given class using label consistent K-SVD, which can alleviate the bias produced by unbalanced libraries, present when dealing with small structures. The proposed method is compared with other state-of-the-art approaches for the segmentation of the Basal Ganglia of 35 subjects of a public dataset.
The promising results of the segmentation method show the efficiency of the multiclass discriminative dictionary learning algorithms in MRI segmentation problems.
Keywords: MRI segmentation; sparse representation; discriminative dictionary learning; multiclass classification
|
|
|
Naveen Onkarappa, & Angel Sappa. (2014). Speed and Texture: An Empirical Study on Optical-Flow Accuracy in ADAS Scenarios. TITS - IEEE Transactions on Intelligent Transportation Systems, 15(1), 136–147.
IF: 3.064
Abstract: Increasing mobility in everyday life has led to the concern for the safety of automotives and human life. Computer vision has become a valuable tool for developing driver assistance applications that target such a concern. Many such vision-based assisting systems rely on motion estimation, where optical flow has shown its potential. A variational formulation of optical flow that achieves a dense flow field involves a data term and regularization terms. Depending on the image sequence, the regularization has to appropriately be weighted for better accuracy of the flow field. Because a vehicle can be driven in different kinds of environments, roads, and speeds, optical-flow estimation has to be accurately computed in all such scenarios. In this paper, we first present the polar representation of optical flow, which is quite suitable for driving scenarios due to the possibility that it offers to independently update regularization factors in different directional components. Then, we study the influence of vehicle speed and scene texture on optical-flow accuracy. Furthermore, we analyze the relationships of these specific characteristics on a driving scenario (vehicle speed and road texture) with the regularization weights in optical flow for better accuracy. As required by the work in this paper, we have generated several synthetic sequences along with ground-truth flow fields.
|
|