|
Marçal Rusiñol, Dimosthenis Karatzas, Andrew Bagdanov, & Josep Llados. (2012). Multipage Document Retrieval by Textual and Visual Representations. In 21st International Conference on Pattern Recognition (pp. 521–524).
Abstract: In this paper we present a multipage administrative document image retrieval system based on textual and visual representations of document pages. Individual pages are represented by textual or visual information using a bag-of-words framework. Different fusion strategies are evaluated which allow the system to perform multipage document retrieval on the basis of a single page retrieval system. Results are reported on a large dataset of document images sampled from a banking workflow.
|
|
|
Volkmar Frinken, Markus Baumgartner, Andreas Fischer, & Horst Bunke. (2012). Semi-Supervised Learning for Cursive Handwriting Recognition using Keyword Spotting. In 13th International Conference on Frontiers in Handwriting Recognition (pp. 49–54).
Abstract: State-of-the-art handwriting recognition systems are learning-based systems that require large sets of training data. Creating this training data, and consequently a well-performing recognition system, therefore requires a substantial amount of human work. This effort can be reduced with semi-supervised learning, which also uses unlabeled text lines for training. Current approaches estimate the correct transcription of the unlabeled data via handwriting recognition, which is not only computationally very demanding but also requires a good model of the target language. In this paper, we propose a different approach that makes use of keyword spotting, which is significantly faster and does not need any language model. In a set of experiments we demonstrate its superiority over existing approaches.
|
|
|
Joost Van de Weijer, Robert Benavente, Maria Vanrell, Cordelia Schmid, Ramon Baldrich, Jacob Verbeek, et al. (2012). Color Naming. In Theo Gevers, Arjan Gijsenij, Joost Van de Weijer, & Jan-Mark Geusebroek (Eds.), Color in Computer Vision: Fundamentals and Applications (pp. 287–317). John Wiley & Sons, Ltd.
|
|
|
Sergio Escalera, Josep Moya, Laura Igual, Veronica Violant, & Maria Teresa Anguera. (2012). Análisis Comportamental Automatizado de TDAH: la Influencia de la Variable Motivación. In IPSI – Cosmocaixa, Jornadas "Empremtes del present, efectes en la psicoanàlisi, la cultura i la societat".
|
|
|
Angel Sappa, & George A. Triantafyllid. (2012). Computer Graphics and Imaging.
|
|
|
Theo Gevers, Arjan Gijsenij, Joost Van de Weijer, & J.M. Geusebroek. (2012). Color in Computer Vision: Fundamentals and Applications. The Wiley-IS&T Series in Imaging Science and Technology.
|
|
|
Michal Drozdzal, Petia Radeva, Santiago Segui, Laura Igual, Carolina Malagelada, Fernando Azpiroz, et al. (2012). System and method for automatic detection of in vivo contraction video sequences.
Publication date: 2012/3/8
|
|
|
Marçal Rusiñol, Lluis Pere de las Heras, Joan Mas, Oriol Ramos Terrades, Dimosthenis Karatzas, Anjan Dutta, et al. (2012). CVC-UAB's participation in the Flowchart Recognition Task of CLEF-IP 2012. In Conference and Labs of the Evaluation Forum.
|
|
|
Shida Beigpour. (2013). Illumination and object reflectance modeling (Joost Van de Weijer, & Ernest Valveny, Eds.). Ph.D. thesis, Ediciones Graficas Rey.
Abstract: More realistic and accurate models of scene illumination and object reflectance can greatly improve the quality of many computer vision and computer graphics tasks. Using such models, a deeper knowledge of the interaction of light with object surfaces can be established, which proves crucial to a variety of computer vision applications. In the current work, we investigate the various existing approaches to illumination and reflectance modeling and analyze their shortcomings in capturing the complexity of real-world scenes. Based on this analysis, we propose improvements to different aspects of reflectance and illumination estimation in order to model real-world scenes more realistically in the presence of complex lighting phenomena (i.e., multiple illuminants, interreflections and shadows). Moreover, we captured our own multi-illuminant dataset, which consists of complex scenes and illumination conditions both outdoors and in laboratory conditions. In addition, we investigate the use of synthetic data to facilitate the construction of datasets and improve the process of obtaining ground-truth information.
|
|
|
Christophe Rigaud, Dimosthenis Karatzas, Joost Van de Weijer, Jean-Christophe Burie, & Jean-Marc Ogier. (2013). Automatic text localisation in scanned comic books. In Proceedings of the International Conference on Computer Vision Theory and Applications (pp. 814–819).
Abstract: Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent document understanding enables direct content-based search as opposed to metadata-only search (e.g. album title or author name). Few studies have been done in this direction. In this work we detail a novel approach for automatic text localization in scanned comic book pages, an essential step towards fully automatic comic book understanding. We focus on speech text as it is semantically important and represents the majority of the text present in comics. The approach is compared with existing methods of text localization found in the literature and results are presented.
Keywords: Text localization; comics; text/graphic separation; complex background; unstructured document
|
|
|
Laura Igual, & Xavier Baro. (2013). Experiencia de aprendizaje de programación basada en proyectos. Simposio-Taller Estrategias y herramientas para el aprendizaje y la evaluación.
|
|
|
S. Grau, Anna Puig, Sergio Escalera, Maria Salamo, & Oscar Amoros. (2013). Efficient complementary viewpoint selection in volume rendering. In 21st WSCG Conference on Computer Graphics.
Abstract: A major goal of visualization is to appropriately express knowledge of scientific data. Gathering the visual information contained in volume data often requires a lot of expertise from the final user to set up the parameters of the visualization. One way of alleviating this problem is to provide the position of inner structures with different viewpoint locations to enhance the perception and construction of the mental image. To this end, traditional illustrations use two or three different views of the regions of interest. Similarly, with the aim of helping users easily place a good viewpoint location, this paper proposes an automatic and interactive method that locates different complementary viewpoints from a reference camera in volume datasets. Specifically, the proposed method combines the quantity of information each camera provides for each structure and the shape similarity of the projections of the remaining viewpoints based on Dynamic Time Warping. The selected complementary viewpoints allow a better understanding of the focused structure in several applications. Thus, the user interactively receives feedback based on several viewpoints that helps them understand the visual information. A live-user evaluation on different data sets shows good convergence to useful complementary viewpoints.
Keywords: Dual camera; Visualization; Interactive Interfaces; Dynamic Time Warping.
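Note: the abstract above names Dynamic Time Warping (DTW) as the basis of its shape-similarity term but does not say how projections are encoded as sequences. The following is only a minimal sketch of the textbook DTW distance, assuming each projection has already been reduced to a 1D numeric signature (a hypothetical encoding, not taken from the paper); `dtw_distance` is an illustrative name.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping between two 1D numeric sequences.

    Uses absolute difference as the local cost. This is a generic DTW,
    not the paper's actual similarity measure.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Under this sketch, two signatures that differ only by local stretching (for example `[1, 2, 3]` and `[1, 2, 2, 3]`) have distance zero, which is the property that makes DTW attractive for comparing shapes sampled at different rates.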
|
|
|
Vitaliy Konovalov, Albert Clapes, & Sergio Escalera. (2013). Automatic Hand Detection in RGB-Depth Data Sequences. In 16th Catalan Conference on Artificial Intelligence (pp. 91–100). LNCS.
Abstract: Detecting hands in multi-modal RGB-Depth visual data has become a challenging Computer Vision problem with several applications of interest. This task involves dealing with changes in illumination, viewpoint variations, the articulated nature of the human body, the high flexibility of the wrist articulation, and the deformability of the hand itself. In this work, we propose an accurate and efficient automatic hand detection scheme to be applied in Human-Computer Interaction (HCI) applications in which the user is seated at the desk and, thus, only the upper body is visible. Our main hypothesis is that hand landmarks remain at a nearly constant geodesic distance from an automatically located anatomical reference point.
In a given frame, the human body is first segmented in the depth image. Then, a graph representation of the body is built in which the geodesic paths are computed from the reference point. The dense optical flow vectors on the corresponding RGB image are used to reduce ambiguities in the connectivity of the geodesic paths, allowing false edges that interconnect different body parts to be eliminated. Finally, we are able to detect the position of both hands based on invariant geodesic distances and optical flow within the body region, without involving costly learning procedures.
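Note: the abstract above relies on geodesic distances computed over a graph of the segmented body. The following is only a minimal sketch of that general idea, assuming a binary body mask, 4-connected pixels and unit edge weights (all simplifications that are not taken from the paper, which builds its graph from depth data and prunes edges with optical flow); `geodesic_distances` is an illustrative name.

```python
import heapq


def geodesic_distances(mask, start):
    """Dijkstra over a 4-connected grid restricted to truthy mask cells.

    mask  : list of rows of 0/1 values (1 = body pixel); hypothetical input
    start : (row, col) of the reference point, assumed to lie inside the mask
    Returns a dict mapping each reachable (row, col) to its path length.
    """
    rows, cols = len(mask), len(mask[0])
    dist = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale queue entry, a shorter path was already found
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc]:
                nd = d + 1.0  # unit weight per step (an assumption)
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist
```

On a U-shaped mask such as `[[1, 0, 1], [1, 0, 1], [1, 1, 1]]`, the two tips are two pixels apart in the image plane but six steps apart along the shape, which illustrates why a distance measured through the body is less sensitive to pose than a straight-line distance.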
|
|
|
Santiago Segui, Michal Drozdzal, Ekaterina Zaytseva, Carolina Malagelada, Fernando Azpiroz, Petia Radeva, et al. (2013). A new image centrality descriptor for wrinkle frame detection in WCE videos. In 13th IAPR Conference on Machine Vision Applications.
Abstract: Small bowel motility dysfunctions are a widespread functional disorder characterized by abdominal pain and altered bowel habits in the absence of specific and unique organic pathology. Current methods of diagnosis are complex and can only be conducted at some highly specialized referral centers. Wireless Video Capsule Endoscopy (WCE) could be an interesting diagnostic alternative that presents excellent clinical advantages, since it is non-invasive and can be conducted by non-specialists. The purpose of this work is to present a new method for the detection of wrinkle frames in WCE, a critical characteristic for detecting one of the main motility events: contractions. The method goes beyond the use of one of the classical image features, the Histogram
|
|
|
Xavier Baro, David Masip, Elena Planas, & Julia Minguillon. (2013). PeLP: Plataforma para el Aprendizaje de Lenguajes de Programación.
|
|