|
Partha Pratim Roy, Josep Llados, & Umapada Pal. (2009). A Complete System for Detection and Recognition of Text in Graphical Documents using Background Information. In 5th International Conference on Computer Vision Theory and Applications.
|
|
|
Hongxing Gao, Marçal Rusiñol, Dimosthenis Karatzas, Apostolos Antonacopoulos, & Josep Llados. (2013). An interactive appearance-based document retrieval system for historical newspapers. In Proceedings of the International Conference on Computer Vision Theory and Applications (pp. 84–87).
Abstract: In this paper we present a retrieval-based application aimed at assisting a user to semi-automatically segment an incoming flow of historical newspaper images by automatically detecting a particular type of page based on its appearance. A visual descriptor is used to assess page similarity, while a relevance feedback process allows the results to be refined iteratively. The application is tested on a large dataset of digitised historic newspapers.
|
|
|
Diego Cheda, Daniel Ponsa, & Antonio Lopez. (2012). Monocular Depth-based Background Estimation. In 7th International Conference on Computer Vision Theory and Applications (pp. 323–328).
Abstract: In this paper, we address the problem of reconstructing the background of a scene from a video sequence with occluding objects. The images are taken by hand-held cameras. Our method composes the background by selecting the appropriate pixels from previously aligned input images. To do that, we minimize a cost function that penalizes the deviations from the following assumptions: background represents objects whose distance to the camera is maximal, and background objects are stationary. Distance information is roughly obtained by a supervised learning approach that allows us to distinguish between close and distant image regions. Moving foreground objects are filtered out by using stationariness and motion boundary constancy measurements. The cost function is minimized by a graph cuts method. We demonstrate the applicability of our approach to recover an occlusion-free background in a set of sequences.
|
|
|
Christophe Rigaud, Dimosthenis Karatzas, Joost Van de Weijer, Jean-Christophe Burie, & Jean-Marc Ogier. (2013). Automatic text localisation in scanned comic books. In Proceedings of the International Conference on Computer Vision Theory and Applications (pp. 814–819).
Abstract: Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent document understanding enables direct content-based search as opposed to metadata-only search (e.g. album title or author name). Few studies have been done in this direction. In this work we detail a novel approach for automatic text localization in scanned comic book pages, an essential step towards fully automatic comic book understanding. We focus on speech text as it is semantically important and represents the majority of the text present in comics. The approach is compared with existing methods of text localization found in the literature and results are presented.
Keywords: Text localization; comics; text/graphic separation; complex background; unstructured document
|
|
|
Carles Sanchez, Debora Gil, Antoni Rosell, Albert Andaluz, & F. Javier Sanchez. (2013). Segmentation of Tracheal Rings in Videobronchoscopy combining Geometry and Appearance. In Sebastiano Battiato and José Braz (Eds.), Proceedings of the International Conference on Computer Vision Theory and Applications (Vol. 1, pp. 153–161). LNCS. Portugal: SciTePress.
Abstract: Videobronchoscopy is a medical imaging technique that allows interactive navigation inside the respiratory pathways and minimally invasive interventions. Tracheal procedures are ordinary interventions that require measurement of the percentage of obstructed pathway for injury (stenosis) assessment. Visual assessment of stenosis in videobronchoscopic sequences requires considerable expertise in tracheal anatomy and is prone to human error. Accurate detection of tracheal rings is the basis for automated estimation of the size of a stenosed trachea. Processing of videobronchoscopic images acquired in the operating room is a challenging task due to the wide range of artifacts and acquisition conditions. We present a geometric-appearance model of tracheal rings for their detection in videobronchoscopic videos. Experiments on sequences acquired in the operating room show a performance close to inter-observer variability.
Keywords: Video-bronchoscopy, tracheal ring segmentation, trachea geometric and appearance model
|
|
|
Pedro Martins, Carlo Gatta, & Paulo Carvalho. (2012). Feature-driven Maximally Stable Extremal Regions. In 7th International Conference on Computer Vision Theory and Applications (pp. 490–497).
|
|
|
Patricia Marquez, Debora Gil, R. Mester, & Aura Hernandez-Sabate. (2014). Local Analysis of Confidence Measures for Optical Flow Quality Evaluation. In 9th International Conference on Computer Vision Theory and Applications (Vol. 3, pp. 450–457).
Abstract: Optical Flow (OF) techniques that cope with the complexity of real sequences have been developed in recent years. Even when using the most appropriate technique for a specific problem, at some points the output flow might fail to achieve the minimum error required by the system. Confidence measures computed from either the input data or the OF output should discard those points where OF is not accurate enough for further use. It follows that evaluating the capability of a confidence measure to bound OF error is as important as its definition. In this paper we analyze different confidence measures and point out their advantages and limitations for use in real-world settings. We also explore their agreement with current tools for evaluating confidence measure performance.
Keywords: Optical Flow; Confidence Measure; Performance Evaluation.
|
|
|
Q. Xue, Laura Igual, A. Berenguel, M. Guerrieri, & L. Garrido. (2014). Active Contour Segmentation with Affine Coordinate-Based Parametrization. In 9th International Conference on Computer Vision Theory and Applications (Vol. 1, pp. 5–14).
Abstract: In this paper, we present a new framework for image segmentation based on parametrized active contours. The contour and the points of the image space are parametrized using a reduced set of control points that must form a closed polygon in two-dimensional problems and a closed surface in three-dimensional problems. By moving the control points, the active contour evolves. We use mean value coordinates as the parametrization tool for the interface, which allows any point of the space, inside or outside the closed polygon or surface, to be parametrized. Region-based energies such as the one proposed by Chan and Vese can be easily implemented in both two- and three-dimensional segmentation problems. We show the usefulness of our approach with several experiments.
Keywords: Active Contours; Affine Coordinates; Mean Value Coordinates
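To illustrate the parametrization mentioned in the abstract above, here is a minimal sketch of 2D mean value coordinates using the standard tangent half-angle formula. It is not the authors' implementation: the function names `mean_value_coordinates` and `reconstruct` are hypothetical, and the query point is assumed not to lie on a vertex or an edge of the polygon.

```python
import math


def mean_value_coordinates(polygon, point):
    """Mean value coordinates of `point` w.r.t. a closed 2D polygon.

    polygon: list of (x, y) control points in order.
    point:   (x, y) query point, assumed not on a vertex or an edge.
    """
    n = len(polygon)
    # Vectors from the query point to each control point, and their lengths.
    d = [(vx - point[0], vy - point[1]) for vx, vy in polygon]
    r = [math.hypot(dx, dy) for dx, dy in d]

    # tan(alpha_i / 2), where alpha_i is the signed angle at the query
    # point between control point i and control point i + 1.
    half_tan = []
    for i in range(n):
        j = (i + 1) % n
        cross = d[i][0] * d[j][1] - d[i][1] * d[j][0]
        dot = d[i][0] * d[j][0] + d[i][1] * d[j][1]
        half_tan.append(math.tan(math.atan2(cross, dot) / 2.0))

    # w_i = (tan(alpha_{i-1} / 2) + tan(alpha_i / 2)) / |v_i - point|
    weights = [(half_tan[i - 1] + half_tan[i]) / r[i] for i in range(n)]
    total = sum(weights)
    return [w / total for w in weights]


def reconstruct(polygon, coords):
    """Position given by fixed coordinates and (possibly moved) control points."""
    x = sum(c * v[0] for c, v in zip(coords, polygon))
    y = sum(c * v[1] for c, v in zip(coords, polygon))
    return (x, y)


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
coords = mean_value_coordinates(square, (0.3, 0.6))
# Moving one control point moves every parametrized point accordingly.
moved = [(0.0, 0.0), (1.0, 0.0), (1.5, 1.5), (0.0, 1.0)]
new_position = reconstruct(moved, coords)
```

The coordinates are computed once and kept fixed. Moving the control points and calling `reconstruct` again then gives the deformed position of each parametrized point, which is the sense in which the contour evolves with the control points.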
|
|
|
P. Ricaurte, C. Chilan, Cristhian A. Aguilera-Carrasco, Boris X. Vintimilla, & Angel Sappa. (2014). Performance Evaluation of Feature Point Descriptors in the Infrared Domain. In 9th International Conference on Computer Vision Theory and Applications (Vol. 1, pp. 545–550).
Abstract: This paper presents a comparative evaluation of classical feature point descriptors when they are used in the long-wave infrared spectral band. Robustness to changes in rotation, scaling, blur, and additive noise is evaluated using a state-of-the-art framework. Statistical results using an outdoor image data set are presented together with a discussion of the differences with respect to the results obtained when images from the visible spectrum are considered.
Keywords: Infrared Imaging; Feature Point Descriptors
|
|
|
Naveen Onkarappa, Cristhian A. Aguilera-Carrasco, Boris X. Vintimilla, & Angel Sappa. (2014). Cross-spectral Stereo Correspondence using Dense Flow Fields. In 9th International Conference on Computer Vision Theory and Applications (Vol. 3, pp. 613–617).
Abstract: This manuscript addresses the cross-spectral stereo correspondence problem. It proposes the usage of a dense flow field based representation instead of the original cross-spectral images, which have a low correlation. In this way, working in the flow field space, classical cost functions can be used as similarity measures. Preliminary experimental results on urban environments have been obtained showing the validity of the proposed approach.
Keywords: Cross-spectral Stereo Correspondence; Dense Optical Flow; Infrared and Visible Spectrum
|
|
|
Ariel Amato, Felipe Lumbreras, & Angel Sappa. (2014). A General-purpose Crowdsourcing Platform for Mobile Devices. In 9th International Conference on Computer Vision Theory and Applications (Vol. 3, pp. 211–215).
Abstract: This paper presents details of a general-purpose, on-demand micro-task platform based on the crowdsourcing philosophy. This platform was specifically developed for mobile devices in order to exploit the strengths of such devices, namely: i) massivity, ii) ubiquity and iii) embedded sensors. The combined use of mobile platforms and the crowdsourcing model makes it possible to tackle tasks ranging from the simplest to the most complex. User experience is the highlighted feature of this platform (for both the task-proposer and the task-solver). Tools suited to the specific task are provided to the task-solver so that he/she can perform the job in a simpler, faster and more appealing way. Moreover, a task can be easily submitted by just selecting predefined templates, which cover a wide range of possible applications. Examples of its usage in computer vision and computer games are provided, illustrating the potential of the platform.
Keywords: Crowdsourcing Platform; Mobile Crowdsourcing
|
|
|
Antoni Gurgui, Debora Gil, & Enric Marti. (2015). Laplacian Unitary Domain for Texture Morphing. In Proceedings of the 10th International Conference on Computer Vision Theory and Applications VISIGRAPP2015 (Vol. 1, pp. 693–699). SciTePress.
Abstract: Deformation of expressive textures is the gateway to realistic computer synthesis of expressions. Owing to their good mathematical properties and flexible formulation on irregular meshes, most texture mappings rely on solutions to the Laplacian in Cartesian space. In the context of facial expression morphing, this approximation can be seen from the opposite point of view by neglecting the metric. In this paper, we use the properties of the Laplacian on manifolds to present a novel approach to warping expressive facial images in order to generate a morphing between them.
Keywords: Facial; metamorphosis; Laplacian Morphing
|
|
|
Carles Sanchez, Antonio Esteban Lansaque, Agnes Borras, Marta Diez-Ferrer, Antoni Rosell, & Debora Gil. (2017). Towards a Videobronchoscopy Localization System from Airway Centre Tracking. In 12th International Conference on Computer Vision Theory and Applications (pp. 352–359).
Abstract: Bronchoscopists use fluoroscopy to guide flexible bronchoscopy to the lesion to be biopsied without any kind of incision. Since fluoroscopy is an imaging technique based on X-rays, the risk of developmental problems and cancer is increased in subjects exposed to it, so minimizing radiation is crucial. Alternative guiding systems such as electromagnetic navigation require specific equipment, increase the cost of the clinical procedure and still require fluoroscopy. In this paper we propose an image-based guiding system based on the extraction of airway centres from intra-operative videos. Such anatomical landmarks are matched to the airway centreline extracted from a pre-planned CT to indicate the best path to the nodule. We present a feasibility study of our navigation system using simulated bronchoscopic videos and a multi-expert validation of landmark extraction in 3 intra-operative ultrathin explorations.
Keywords: Video-bronchoscopy; Lung cancer diagnosis; Airway lumen detection; Region tracking; Guided bronchoscopy navigation
|
|
|
Mohamed Ilyes Lakhal, Hakan Cevikalp, & Sergio Escalera. (2018). CRN: End-to-end Convolutional Recurrent Network Structure Applied to Vehicle Classification. In 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (Vol. 5, pp. 137–144).
Abstract: Vehicle type classification is considered to be a central part of Intelligent Traffic Systems. In recent years, deep learning methods have emerged as the state of the art in many computer vision tasks. In this paper, we present a novel yet simple deep learning framework for the vehicle type classification problem. We propose an end-to-end trainable system that combines a convolutional neural network for feature extraction and a recurrent neural network as a classifier. The recurrent network structure is used to handle various types of feature inputs, and at the same time allows either a single class prediction or a set of class predictions to be produced. In order to assess the effectiveness of our solution, we have conducted a set of experiments on two public datasets, obtaining state-of-the-art results. In addition, we also report results on the newly released MIO-TCD dataset.
Keywords: Vehicle Classification; Deep Learning; End-to-end Learning
|
|
|
Rafael E. Rivadeneira, Angel Sappa, & Boris X. Vintimilla. (2020). Thermal Image Super-resolution: A Novel Architecture and Dataset. In 15th International Conference on Computer Vision Theory and Applications (pp. 111–119).
Abstract: This paper proposes a novel CycleGAN architecture for thermal image super-resolution, together with a large dataset consisting of thermal images at different resolutions. The dataset has been acquired using three thermal cameras with different resolutions, which acquire images of the same scenario at the same time. The thermal cameras are mounted on a rig that minimizes the baseline distance in order to ease the registration problem.
The proposed architecture is based on ResNet6 as the Generator and PatchGAN as the Discriminator. The proposed unsupervised super-resolution training (CycleGAN) is possible thanks to the aforementioned thermal images, that is, images of the same scenario at different resolutions. The proposed approach is evaluated on the dataset and compared with classical bicubic interpolation. The dataset and the network are available.
|
|