Juan Ignacio Toledo, Sebastian Sudholt, Alicia Fornes, Jordi Cucurull, A. Fink, & Josep Llados. (2016). Handwritten Word Image Categorization with Convolutional Neural Networks and Spatial Pyramid Pooling. In Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR) (Vol. 10029, pp. 543–552). LNCS. Springer International Publishing.
Abstract: The extraction of relevant information from historical document collections is one of the key steps in order to make these documents available for access and searches. The usual approach combines transcription and grammars in order to extract semantically meaningful entities. In this paper, we describe a new method to obtain word categories directly from non-preprocessed handwritten word images. The method can be used to directly extract information, being an alternative to the transcription. Thus it can be used as a first step in any kind of syntactical analysis. The approach is based on Convolutional Neural Networks with a Spatial Pyramid Pooling layer to deal with the different shapes of the input images. We performed the experiments on a historical marriage record dataset, obtaining promising results.
Keywords: Document image analysis; Word image categorization; Convolutional neural networks; Named entity detection
|
Antoni Gurgui, Debora Gil, Enric Marti, & Vicente Grau. (2016). Left-Ventricle Basal Region Constrained Parametric Mapping to Unitary Domain. In 7th International Workshop on Statistical Atlases & Computational Modelling of the Heart (Vol. 10124, pp. 163–171). LNCS.
Abstract: Due to its complex geometry, the basal ring is often omitted when putting different heart geometries into correspondence. In this paper, we present the first results on a new mapping of the left ventricle basal rings onto a normalized coordinate system using a fold-over free approach to the solution to the Laplacian. To guarantee correspondences between different basal rings, we imposed some internal constrained positions at anatomical landmarks in the normalized coordinate system. To prevent internal fold-overs, constraints are handled by cutting the volume into regions defined by anatomical features and mapping each piece of the volume separately. Initial results presented in this paper indicate that our method is able to handle internal constrains without introducing fold-overs and thus guarantees one-to-one mappings between different basal ring geometries.
Keywords: Laplacian; Constrained maps; Parameterization; Basal ring
|
Carles Sanchez, Debora Gil, Jorge Bernal, F. Javier Sanchez, Marta Diez-Ferrer, & Antoni Rosell. (2016). Navigation Path Retrieval from Videobronchoscopy using Bronchial Branches. In 19th International Conference on Medical Image Computing and Computer Assisted Intervention Workshops (Vol. 9401, pp. 62–70). LNCS.
Abstract: Bronchoscopy biopsy can be used to diagnose lung cancer without risking complications of other interventions like transthoracic needle aspiration. During bronchoscopy, the clinician has to navigate through the bronchial tree to the target lesion. A main drawback is the difficulty to check whether the exploration is following the correct path. The usual guidance using fluoroscopy implies repeated radiation of the clinician, while alternative systems (like electromagnetic navigation) require specific equipment that increases intervention costs. We propose to compute the navigated path using anatomical landmarks extracted from the sole analysis of videobronchoscopy images. Such landmarks allow matching the current exploration to the path previously planned on a CT to indicate clinician whether the planning is being correctly followed or not. We present a feasibility study of our landmark based CT-video matching using bronchoscopic videos simulated on a virtual bronchoscopy interactive interface.
Keywords: Bronchoscopy navigation; Lumen center; Brochial branches; Navigation path; Videobronchoscopy
|
Juan A. Carvajal Ayala, Dennis Romero, & Angel Sappa. (2016). Fine-tuning based deep convolutional networks for lepidopterous genus recognition. In 21st Ibero American Congress on Pattern Recognition (pp. 467–475). LNCS.
Abstract: This paper describes an image classification approach oriented to identify specimens of lepidopterous insects at Ecuadorian ecological reserves. This work seeks to contribute to studies in the area of biology about genus of butterflies and also to facilitate the registration of unrecognized specimens. The proposed approach is based on the fine-tuning of three widely used pre-trained Convolutional Neural Networks (CNNs). This strategy is intended to overcome the reduced number of labeled images. Experimental results with a dataset labeled by expert biologists is presented, reaching a recognition accuracy above 92%.
|
Marco Bellantonio, Mohammad A. Haque, Pau Rodriguez, Kamal Nasrollahi, Taisi Telve, Sergio Escalera, et al. (2016). Spatio-Temporal Pain Recognition in CNN-based Super-Resolved Facial Images. In 23rd International Conference on Pattern Recognition (Vol. 10165). LNCS.
Abstract: Automatic pain detection is a long expected solution to a prevalent medical problem of pain management. This is more relevant when the subject of pain is young children or patients with limited ability to communicate about their pain experience. Computer vision-based analysis of facial pain expression provides a way of efficient pain detection. When deep machine learning methods came into the scene, automatic pain detection exhibited even better performance. In this paper, we figured out three important factors to exploit in automatic pain detection: spatial information available regarding to pain in each of the facial video frames, temporal axis information regarding to pain expression pattern in a subject video sequence, and variation of face resolution. We employed a combination of convolutional neural network and recurrent neural network to setup a deep hybrid pain detection framework that is able to exploit both spatial and temporal pain information from facial video. In order to analyze the effect of different facial resolutions, we introduce a super-resolution algorithm to generate facial video frames with different resolution setups. We investigated the performance on the publicly available UNBC-McMaster Shoulder Pain database. As a contribution, the paper provides novel and important information regarding to the performance of a hybrid deep learning framework for pain detection in facial images of different resolution.
|
Iiris Lusi, Sergio Escalera, & Gholamreza Anbarjafari. (2016). Human Head Pose Estimation on SASE database using Random Hough Regression Forests. In 23rd International Conference on Pattern Recognition Workshops (Vol. 10165). LNCS.
Abstract: In recent years head pose estimation has become an important task in face analysis scenarios. Given the availability of high resolution 3D sensors, the design of a high resolution head pose database would be beneficial for the community. In this paper, Random Hough Forests are used to estimate 3D head pose and location on a new 3D head database, SASE, which represents the baseline performance on the new data for an upcoming international head pose estimation competition. The data in SASE is acquired with a Microsoft Kinect 2 camera, including the RGB and depth information of 50 subjects with a large sample of head poses, allowing us to test methods for real-life scenarios. We briefly review the database while showing baseline head pose estimation results based on Random Hough Forests.
|