|
Santiago Segui, Michal Drozdzal, Ekaterina Zaytseva, Fernando Azpiroz, Petia Radeva, & Jordi Vitria. (2014). Detection of wrinkle frames in endoluminal videos using betweenness centrality measures for images. TITB - IEEE Transactions on Information Technology in Biomedicine, 18(6), 1831–1838.
Abstract: Intestinal contractions are one of the most important events to diagnose motility pathologies of the small intestine. When visualized by wireless capsule endoscopy (WCE), the sequence of frames that represents a contraction is characterized by a clear wrinkle structure in the central frames that corresponds to the folding of the intestinal wall. In this paper we present a new method to robustly detect wrinkle frames in full WCE videos by using a new mid-level image descriptor that is based on a centrality measure proposed for graphs. We present an extended validation, carried out in a very large database, that shows that the proposed method achieves state of the art performance for this task.
Keywords: Wireless Capsule Endoscopy; Small Bowel Motility Dysfunction; Contraction Detection; Structured Prediction; Betweenness Centrality
|
|
|
Joost Van de Weijer, Cordelia Schmid, Jakob Verbeek, & Diane Larlus. (2009). Learning Color Names for Real-World Applications. TIP - IEEE Transactions on Image Processing, 18(7), 1512–1524.
Abstract: Color names are required in real-world applications such as image retrieval and image annotation. Traditionally, they are learned from a collection of labelled color chips. These color chips are labelled with color names within a well-defined experimental setup by human test subjects. However naming colors in real-world images differs significantly from this experimental setting. In this paper, we investigate how color names learned from color chips compare to color names learned from real-world images. To avoid hand labelling real-world images with color names we use Google Image to collect a data set. Due to limitations of Google Image this data set contains a substantial quantity of wrongly labelled data. We propose several variants of the PLSA model to learn color names from this noisy data. Experimental results show that color names learned from real-world images significantly outperform color names learned from labelled color chips for both image retrieval and image annotation.
|
|
|
Josep Llados, Horst Bunke, & Enric Marti. (1997). Finding rotational symmetries by cyclic string matching. PRL - Pattern recognition letters, 18(14), 1435–1442.
Abstract: Symmetry is an important shape feature. In this paper, a simple and fast method to detect perfect and distorted rotational symmetries of 2D objects is described. The boundary of a shape is polygonally approximated and represented as a string. Rotational symmetries are found by cyclic string matching between two identical copies of the shape string. The set of minimum cost edit sequences that transform the shape string to a cyclically shifted version of itself define the rotational symmetry and its order. Finally, a modification of the algorithm is proposed to detect reflectional symmetries. Some experimental results are presented to show the reliability of the proposed algorithm.
Keywords: Rotational symmetry; Reflectional symmetry; String matching
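Note: the cyclic matching idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the boundary is already encoded as a string of symbols and handles only perfect symmetries by exact comparison, whereas the paper uses minimum cost edit sequences to tolerate distortion. The function name and the symbol encoding are hypothetical.

```python
def rotational_symmetry_order(shape: str) -> int:
    """Order of exact rotational symmetry of a cyclic boundary string.

    Simplifying assumption: symbols must match exactly (no edit costs),
    so only perfect symmetries are detected.
    """
    n = len(shape)
    if n == 0:
        return 1
    # Concatenating two copies exposes every cyclic shift as a substring.
    doubled = shape + shape
    # The smallest positive shift k that maps the string onto itself
    # always divides n, so the symmetry order is n // k.
    for k in range(1, n + 1):
        if doubled[k:k + n] == shape:
            return n // k
    return 1


if __name__ == "__main__":
    # Hypothetical encodings: each symbol stands for one polygon edge type.
    print(rotational_symmetry_order("abab"))  # two-fold symmetry
    print(rotational_symmetry_order("abcd"))  # no symmetry beyond identity
```

A shift of k = n always matches, so a string with no symmetry yields order 1.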
|
|
|
Eloi Puertas, Sergio Escalera, & Oriol Pujol. (2015). Generalized Multi-scale Stacked Sequential Learning for Multi-class Classification. PAA - Pattern Analysis and Applications, 18(2), 247–261.
Abstract: In many classification problems, neighbor data labels have inherent sequential relationships. Sequential learning algorithms take benefit of these relationships in order to improve generalization. In this paper, we revise the multi-scale sequential learning approach (MSSL) for applying it in the multi-class case (MMSSL). We introduce the error-correcting output codes framework in the MSSL classifiers and propose a formulation for calculating confidence maps from the margins of the base classifiers. In addition, we propose a MMSSL compression approach which reduces the number of features in the extended data set without a loss in performance. The proposed methods are tested on several databases, showing significant performance improvement compared to classical approaches.
Keywords: Stacked sequential learning; Multi-scale; Error-correcting output codes (ECOC); Contextual classification
|
|
|
David Roche, Debora Gil, & Jesus Giraldo. (2013). Multiple active receptor conformation, agonist efficacy and maximum effect of the system: the conformation-based operational model of agonism. DDT - Drug Discovery Today, 18(7-8), 365–371.
Abstract: The operational model of agonism assumes that the maximum effect a particular receptor system can achieve (the Em parameter) is fixed. Em estimates are above but close to the asymptotic maximum effects of endogenous agonists. The concept of Em is contradicted by superagonists and those positive allosteric modulators that significantly increase the maximum effect of endogenous agonists. An extension of the operational model is proposed that assumes that the Em parameter does not necessarily have a single value for a receptor system but has multiple values associated to multiple active receptor conformations. The model provides a mechanistic link between active receptor conformation and agonist efficacy, which can be useful for the analysis of agonist response under different receptor scenarios.
|
|
|
Mohammad Ali Bagheri, Qigang Gao, & Sergio Escalera. (2015). Combining Local and Global Learners in the Pairwise Multiclass Classification. PAA - Pattern Analysis and Applications, 18(4), 845–860.
Abstract: Pairwise classification is a well-known class binarization technique that converts a multiclass problem into a number of two-class problems, one problem for each pair of classes. However, in the pairwise technique, nuisance votes of many irrelevant classifiers may result in a wrong class prediction. To overcome this problem, a simple, but efficient method is proposed and evaluated in this paper. The proposed method is based on excluding some classes and focusing on the most probable classes in the neighborhood space, named Local Crossing Off (LCO). This procedure is performed by employing a modified version of standard K-nearest neighbor and large margin nearest neighbor algorithms. The LCO method takes advantage of nearest neighbor classification algorithm because of its local learning behavior as well as the global behavior of powerful binary classifiers to discriminate between two classes. Combining these two properties in the proposed LCO technique will avoid the weaknesses of each method and will increase the efficiency of the whole classification system. On several benchmark datasets of varying size and difficulty, we found that the LCO approach leads to significant improvements using different base learners. The experimental results show that the proposed technique not only achieves better classification accuracy in comparison to other standard approaches, but also is computationally more efficient for tackling classification problems which have a relatively large number of target classes.
Keywords: Multiclass classification; Pairwise approach; One-versus-one
|
|
|
Lluis Pere de las Heras, Oriol Ramos Terrades, Sergi Robles, & Gemma Sanchez. (2015). CVC-FP and SGT: a new database for structural floor plan analysis and its groundtruthing tool. IJDAR - International Journal on Document Analysis and Recognition, 18(1), 15–30.
Abstract: Recent results on structured learning methods have shown the impact of structural information in a wide range of pattern recognition tasks. In the field of document image analysis, there is a long experience on structural methods for the analysis and information extraction of multiple types of documents. Yet, the lack of conveniently annotated and free access databases has not benefited the progress in some areas such as technical drawing understanding. In this paper, we present a floor plan database, named CVC-FP, that is annotated for the architectural objects and their structural relations. To construct this database, we have implemented a groundtruthing tool, the SGT tool, that allows to make specific this sort of information in a natural manner. This tool has been made for general purpose groundtruthing: It allows to define own object classes and properties, multiple labeling options are possible, grants the cooperative work, and provides user and version control. We finally have collected some of the recent work on floor plan interpretation and present a quantitative benchmark for this database. Both CVC-FP database and the SGT tool are freely released to the research community to ease comparisons between methods and boost reproducible research.
|
|
|
Christophe Rigaud, Clement Guerin, Dimosthenis Karatzas, Jean-Christophe Burie, & Jean-Marc Ogier. (2015). Knowledge-driven understanding of images in comic books. IJDAR - International Journal on Document Analysis and Recognition, 18(3), 199–221.
Abstract: Document analysis is an active field of research, which can attain a complete understanding of the semantics of a given document. One example of the document understanding process is enabling a computer to identify the key elements of a comic book story and arrange them according to a predefined domain knowledge. In this study, we propose a knowledge-driven system that can interact with bottom-up and top-down information to progressively understand the content of a document. We model the comic book’s and the image processing domains knowledge for information consistency analysis. In addition, different image processing methods are improved or developed to extract panels, balloons, tails, texts, comic characters and their semantic relations in an unsupervised way.
Keywords: Document Understanding; comics analysis; expert system
|
|
|
David Aldavert, Marçal Rusiñol, Ricardo Toledo, & Josep Llados. (2015). A Study of Bag-of-Visual-Words Representations for Handwritten Keyword Spotting. IJDAR - International Journal on Document Analysis and Recognition, 18(3), 223–234.
Abstract: The Bag-of-Visual-Words (BoVW) framework has gained popularity among the document image analysis community, specifically as a representation of handwritten words for recognition or spotting purposes. Although in the computer vision field the BoVW method has been greatly improved, most of the approaches in the document image analysis domain still rely on the basic implementation of the BoVW method disregarding such latest refinements. In this paper, we present a review of those improvements and its application to the keyword spotting task. We thoroughly evaluate their impact against a baseline system in the well-known George Washington dataset and compare the obtained results against nine state-of-the-art keyword spotting methods. In addition, we also compare both the baseline and improved systems with the methods presented at the Handwritten Keyword Spotting Competition 2014.
Keywords: Bag-of-Visual-Words; Keyword spotting; Handwritten documents; Performance evaluation
|
|
|
Mark Philip Philipsen, Jacob Velling Dueholm, Anders Jorgensen, Sergio Escalera, & Thomas B. Moeslund. (2018). Organ Segmentation in Poultry Viscera Using RGB-D. SENS - Sensors, 18(1), 117.
Abstract: We present a pattern recognition framework for semantic segmentation of visual structures, that is, multi-class labelling at pixel level, and apply it to the task of segmenting organs in the eviscerated viscera from slaughtered poultry in RGB-D images. This is a step towards replacing the current strenuous manual inspection at poultry processing plants. Features are extracted from feature maps such as activation maps from a convolutional neural network (CNN). A random forest classifier assigns class probabilities, which are further refined by utilizing context in a conditional random field. The presented method is compatible with both 2D and 3D features, which allows us to explore the value of adding 3D and CNN-derived features. The dataset consists of 604 RGB-D images showing 151 unique sets of eviscerated viscera from four different perspectives. A mean Jaccard index of 78.11% is achieved across the four classes of organs by using features derived from 2D, 3D and a CNN, compared to 74.28% using only basic 2D image features.
Keywords: semantic segmentation; RGB-D; random forest; conditional random field; 2D; 3D; CNN
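Note: the mean Jaccard index reported in the abstract above is the per-class intersection over union, averaged over classes. The sketch below shows that computation on flat label lists; the function name, the input layout, and the choice to skip classes absent from both inputs are assumptions for illustration, not details taken from the paper.

```python
from typing import Hashable, Iterable, Sequence


def mean_jaccard(pred: Sequence[Hashable],
                 truth: Sequence[Hashable],
                 classes: Iterable[Hashable]) -> float:
    """Average per-class Jaccard index (intersection over union).

    pred and truth are per-pixel labels flattened into equal-length sequences.
    Classes that appear in neither sequence are skipped (an assumption).
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    scores = []
    for c in classes:
        p = {i for i, label in enumerate(pred) if label == c}
        t = {i for i, label in enumerate(truth) if label == c}
        union = p | t
        if not union:
            continue  # class absent from both inputs
        scores.append(len(p & t) / len(union))
    if not scores:
        raise ValueError("none of the classes occur in the inputs")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy example with two classes and four pixels.
    print(mean_jaccard([0, 0, 1, 1], [0, 1, 1, 1], classes=[0, 1]))
```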
|
|
|
Xavier Soria, Angel Sappa, & Riad I. Hammoud. (2018). Wide-Band Color Imagery Restoration for RGB-NIR Single Sensor Images. SENS - Sensors, 18(7), 2059.
Abstract: Multi-spectral RGB-NIR sensors have become ubiquitous in recent years. These sensors allow the visible and near-infrared spectral bands of a given scene to be captured at the same time. With such cameras, the acquired imagery has a compromised RGB color representation due to near-infrared bands (700–1100 nm) cross-talking with the visible bands (400–700 nm). This paper proposes two deep learning-based architectures to recover the full RGB color images, thus removing the NIR information from the visible bands. The proposed approaches directly restore the high-resolution RGB image by means of convolutional neural networks. They are evaluated with several outdoor images; both architectures reach a similar performance when evaluated in different scenarios and using different similarity metrics. Both of them improve the state of the art approaches.
Keywords: RGB-NIR sensor; multispectral imaging; deep learning; CNNs
|
|
|
Xim Cerda-Company, Xavier Otazu, Nilai Sallent, & C. Alejandro Parraga. (2018). The effect of luminance differences on color assimilation. JV - Journal of Vision, 18(11), 10.
Abstract: The color appearance of a surface depends on the color of its surroundings (inducers). When the perceived color shifts towards that of the surroundings, the effect is called “color assimilation” and when it shifts away from the surroundings it is called “color contrast.” There is also evidence that the phenomenon depends on the spatial configuration of the inducer, e.g., uniform surrounds tend to induce color contrast and striped surrounds tend to induce color assimilation. However, previous work found that striped surrounds under certain conditions do not induce color assimilation but induce color contrast (or do not induce anything at all), suggesting that luminance differences and high spatial frequencies could be key factors in color assimilation. Here we present a new psychophysical study of color assimilation where we assessed the contribution of luminance differences (between the target and its surround) present in striped stimuli. Our results show that luminance differences are key factors in color assimilation for stimuli varying along the s axis of MacLeod-Boynton color space, but not for stimuli varying along the l axis. This asymmetry suggests that koniocellular neural mechanisms responsible for color assimilation only contribute when there is a luminance difference, supporting the idea that mutual-inhibition has a major role in color induction.
|
|
|
Cristhian A. Aguilera-Carrasco, C. Aguilera, & Angel Sappa. (2018). Melamine Faced Panels Defect Classification beyond the Visible Spectrum. SENS - Sensors, 18(11), 1–10.
Abstract: In this work, we explore the use of images from different spectral bands to classify defects in melamine faced panels, which could appear through the production process. Through experimental evaluation, we evaluate the use of images from the visible (VS), near-infrared (NIR), and long wavelength infrared (LWIR), to classify the defects using a feature descriptor learning approach together with a support vector machine classifier. Two descriptors were evaluated, Extended Local Binary Patterns (E-LBP) and SURF using a Bag of Words (BoW) representation. The evaluation was carried out with an image set obtained during this work, which contained five different defect categories that currently occur in the industry. Results show that using images from beyond the visual spectrum helps to improve classification performance in contrast with a single visible spectrum solution.
Keywords: industrial application; infrared; machine learning
|
|
|
Francisco Blanco, Felipe Lumbreras, Joan Serrat, Roswitha Siener, Silvia Serranti, Giuseppe Bonifazi, et al. (2014). Taking advantage of Hyperspectral Imaging classification of urinary stones against conventional IR Spectroscopy. JBiO - Journal of Biomedical Optics, 19(12), 126004–1 - 126004–9.
Abstract: The analysis of urinary stones is mandatory for the best management of the disease after the stone passage in order to prevent further stone episodes. Thus the use of an appropriate methodology for an individualized stone analysis becomes a key factor for giving the patient the most suitable treatment. A recently developed hyperspectral imaging methodology, based on pixel-to-pixel analysis of near-infrared spectral images, is compared to the reference technique in stone analysis, infrared (IR) spectroscopy. The developed classification model yields >90% correct classification rate when compared to IR and is able to precisely locate stone components within the structure of the stone with a 15 µm resolution. Due to the little sample pretreatment, low analysis time, good performance of the model, and the automation of the measurements, they become analyst independent; this methodology can be considered to become a routine analysis for clinical laboratories.
|
|
|
Lluis Gomez, & Dimosthenis Karatzas. (2016). A fast hierarchical method for multi‐script and arbitrary oriented scene text extraction. IJDAR - International Journal on Document Analysis and Recognition, 19(4), 335–349.
Abstract: Typography and layout lead to the hierarchical organisation of text in words, text lines, paragraphs. This inherent structure is a key property of text in any script and language, which has nonetheless been minimally leveraged by existing text detection methods. This paper addresses the problem of text segmentation in natural scenes from a hierarchical perspective. Contrary to existing methods, we make explicit use of text structure, aiming directly to the detection of region groupings corresponding to text within a hierarchy produced by an agglomerative similarity clustering process over individual regions. We propose an optimal way to construct such a hierarchy introducing a feature space designed to produce text group hypotheses with high recall and a novel stopping rule combining a discriminative classifier and a probabilistic measure of group meaningfulness based in perceptual organization. Results obtained over four standard datasets, covering text in variable orientations and different languages, demonstrate that our algorithm, while being trained in a single mixed dataset, outperforms state of the art methods in unconstrained scenarios.
Keywords: scene text; segmentation; detection; hierarchical grouping; perceptual organisation
|
|