J. Nuñez, Xavier Otazu, & M.T. Merino. (2005). A Multiresolution-Based Method for the Determination of the Relative Resolution between Images. First Application to Remote Sensing and Medical Images. International Journal of Imaging Systems and Technology, 15(5), 225–235 (IF: 0.439).
|
|
Xavier Boix, Josep M. Gonfaus, Joost Van de Weijer, Andrew Bagdanov, Joan Serrat, & Jordi Gonzalez. (2012). Harmony Potentials: Fusing Global and Local Scale for Semantic Image Segmentation. IJCV - International Journal of Computer Vision, 96(1), 83–102.
Abstract: Hierarchical Conditional Random Field (HCRF) models have been successfully applied to a number of image labeling problems, including image segmentation. However, existing HCRF models of image segmentation do not allow multiple classes to be assigned to a single region, which limits their ability to incorporate contextual information across multiple scales. At higher scales in the image, this representation yields an oversimplified model since multiple classes can be reasonably expected to appear within large regions. This simplified model particularly limits the impact of information at higher scales. Since class-label information at these scales is usually more reliable than at lower, noisier scales, neglecting this information is undesirable. To address these issues, we propose a new consistency potential for image labeling problems, which we call the harmony potential. It can encode any possible combination of labels, penalizing only unlikely combinations of classes. We also propose an effective sampling strategy over this expanded label set that renders tractable the underlying optimization problem. Our approach obtains state-of-the-art results on two challenging, standard benchmark datasets for semantic image segmentation: PASCAL VOC 2010, and MSRC-21.
|
|
Fahad Shahbaz Khan, Joost Van de Weijer, & Maria Vanrell. (2012). Modulating Shape Features by Color Attention for Object Recognition. IJCV - International Journal of Computer Vision, 98(1), 49–64.
Abstract: Bag-of-words based image representation is a successful approach for object recognition. Generally, the subsequent stages of the process (feature detection, feature description, vocabulary construction and image representation) are performed independently of the intended object classes to be detected. In such a framework, it was found that the combination of different image cues, such as shape and color, often obtains below-expected results. This paper presents a novel method for recognizing object categories when using multiple cues by separately processing the shape and color cues and combining them by modulating the shape features by category-specific color attention. Color is used to compute bottom-up and top-down attention maps. Subsequently, these color attention maps are used to modulate the weights of the shape features. In regions with higher attention, shape features are given more weight than in regions with low attention. We compare our approach with existing methods that combine color and shape cues on five data sets containing varied importance of both cues, namely, Soccer (color predominance), Flower (color and shape parity), PASCAL VOC 2007 and 2009 (shape predominance) and Caltech-101 (color co-interference). The experiments clearly demonstrate that in all five data sets our proposed framework significantly outperforms existing methods for combining color and shape information.
|
|
Fahad Shahbaz Khan, Muhammad Anwer Rao, Joost Van de Weijer, Andrew Bagdanov, Antonio Lopez, & Michael Felsberg. (2013). Coloring Action Recognition in Still Images. IJCV - International Journal of Computer Vision, 105(3), 205–221.
Abstract: In this article we investigate the problem of human action recognition in static images. By action recognition we mean a class of problems which includes both action classification and action detection (i.e. simultaneous localization and classification). Bag-of-words image representations yield promising results for action classification, and deformable part models perform very well in object detection. The representations for action recognition typically use only shape cues and ignore color information. Inspired by the recent success of color in image classification and object detection, we investigate the potential of color for action classification and detection in static images. We perform a comprehensive evaluation of color descriptors and fusion approaches for action recognition. Experiments were conducted on the three datasets most used for benchmarking action recognition in still images: Willow, PASCAL VOC 2010 and Stanford-40. Our experiments demonstrate that incorporating color information considerably improves recognition performance, and that a descriptor based on color names outperforms pure color descriptors. Our experiments demonstrate that late fusion of color and shape information outperforms other approaches on action recognition. Finally, we show that the different color–shape fusion approaches result in complementary information and combining them yields state-of-the-art performance for action classification.
|
|
Felipe Lumbreras, Xavier Roca, Daniel Ponsa, Robert Benavente, Judit Martinez, Silvia Sanchez, et al. (2001). Visual Inspection of Safety Belts. In International Conference on Quality Control by Artificial Vision (Vol. 2, pp. 526–531).
|
|
Joost Van de Weijer, & Shida Beigpour. (2011). The Dichromatic Reflection Model: Future Research Directions and Applications. In José L. and B. Mestetskiy (Ed.), International Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications. SciTePress.
Abstract: The dichromatic reflection model (DRM) predicts that color distributions form a parallelogram in color space, whose shape is defined by the body reflectance and the illuminant color. In this paper we review the assumptions which led to the DRM and briefly recall two of its main application domains: color image segmentation and photometric invariant feature computation. After having introduced the model we discuss several limitations of the theory, especially those which arise when working on real-world uncalibrated images. In addition, we summarize recent extensions of the model which make it possible to handle more complicated light interactions. Finally, we suggest some future research directions which would further extend its applicability.
Keywords: dblp
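As a rough illustration of the parallelogram mentioned in the abstract above, the sketch below uses the commonly cited form of the dichromatic reflection model, in which an observed color is a weighted sum of a body color and a specular color taken to equal the illuminant color. The function name, the example colors, and the restriction of both weights to [0, 1] are assumptions made for illustration; they are not taken from the paper.

```python
def dichromatic_color(body_color, illuminant_color, m_body, m_spec):
    """Return m_body * body_color + m_spec * illuminant_color per channel.

    Hypothetical helper: m_body and m_spec are the geometry-dependent
    weights of the body and specular terms, assumed here to lie in [0, 1].
    """
    return tuple(
        m_body * b + m_spec * s
        for b, s in zip(body_color, illuminant_color)
    )


# Example values (assumed): a reddish material under a white illuminant.
body = (0.8, 0.2, 0.1)
light = (1.0, 1.0, 1.0)

# The four corners of the parallelogram spanned by the two color vectors.
corners = [
    dichromatic_color(body, light, mb, ms)
    for mb in (0.0, 1.0)
    for ms in (0.0, 1.0)
]
```

Every color produced with weights inside [0, 1] lies within the parallelogram whose corners are listed in `corners`, which is the shape the abstract refers to.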
|
|
Maria Vanrell, & Jordi Vitria. (1997). Optimal 3x3 decomposable disks for morphological transformations. Image and Vision Computing, 15(2), 845–854.
|
|
Arjan Gijsenij, Theo Gevers, & Joost Van de Weijer. (2011). Computational Color Constancy: Survey and Experiments. TIP - IEEE Transactions on Image Processing, 20(9), 2475–2489.
Abstract: Computational color constancy is a fundamental prerequisite for many computer vision applications. This paper presents a survey of many recent developments and state-of-the-art methods. Several criteria are proposed that are used to assess the approaches. A taxonomy of existing algorithms is proposed and methods are separated into three groups: static methods, gamut-based methods and learning-based methods. Further, the experimental setup is discussed including an overview of publicly available data sets. Finally, various freely available methods, of which some are considered to be state-of-the-art, are evaluated on two data sets.
Keywords: computational color constancy;computer vision application;gamut-based method;learning-based method;static method;colour vision;computer vision;image colour analysis;learning (artificial intelligence);lighting
|
|
Javier Vazquez, Maria Vanrell, Ramon Baldrich, & Francesc Tous. (2012). Color Constancy by Category Correlation. TIP - IEEE Transactions on Image Processing, 21(4), 1997–2007.
Abstract: Finding color representations which are stable to illuminant changes is still an open problem in computer vision. Until now most approaches have been based on physical constraints or statistical assumptions derived from the scene, while very little attention has been paid to the effects that selected illuminants have on the final color image representation. The novelty of this work is to propose perceptual constraints that are computed on the corrected images. We define the category hypothesis, which weights the set of feasible illuminants according to their ability to map the corrected image onto specific colors. Here we choose these colors as the universal color categories related to basic linguistic terms which have been psychophysically measured. These color categories encode natural color statistics, and their relevance across different cultures is indicated by the fact that they have received a common color name. From this category hypothesis we propose a fast implementation that allows the sampling of a large set of illuminants. Experiments prove that our method rivals current state-of-the-art performance without the need for training algorithmic parameters. Additionally, the method can be used as a framework to insert top-down information from other sources, thus opening further research directions in solving for color constancy.
|
|
Shida Beigpour, Christian Riess, Joost Van de Weijer, & Elli Angelopoulou. (2014). Multi-Illuminant Estimation with Conditional Random Fields. TIP - IEEE Transactions on Image Processing, 23(1), 83–95.
Abstract: Most existing color constancy algorithms assume uniform illumination. However, in real-world scenes, this is not often the case. Thus, we propose a novel framework for estimating the colors of multiple illuminants and their spatial distribution in the scene. We formulate this problem as an energy minimization task within a conditional random field over a set of local illuminant estimates. In order to quantitatively evaluate the proposed method, we created a novel data set of two-dominant-illuminant images comprised of laboratory, indoor, and outdoor scenes. Unlike prior work, our database includes accurate pixel-wise ground truth illuminant information. The performance of our method is evaluated on multiple data sets. Experimental results show that our framework clearly outperforms single illuminant estimators as well as a recently proposed multi-illuminant estimation approach.
Keywords: color constancy; CRF; multi-illuminant
|
|
Fahad Shahbaz Khan, Joost Van de Weijer, Muhammad Anwer Rao, Michael Felsberg, & Carlo Gatta. (2014). Semantic Pyramids for Gender and Action Recognition. TIP - IEEE Transactions on Image Processing, 23(8), 3633–3645.
Abstract: Person description is a challenging problem in computer vision. We investigated two major aspects of person description: 1) gender and 2) action recognition in still images. Most state-of-the-art approaches for gender and action recognition rely on the description of a single body part, such as face or full-body. However, relying on a single body part is suboptimal due to significant variations in scale, viewpoint, and pose in real-world images. This paper proposes a semantic pyramid approach for pose normalization. Our approach is fully automatic and based on combining information from full-body, upper-body, and face regions for gender and action recognition in still images. The proposed approach does not require any annotations for upper-body and face of a person. Instead, we rely on pretrained state-of-the-art upper-body and face detectors to automatically extract semantic information of a person. Given multiple bounding boxes from each body part detector, we then propose a simple method to select the best candidate bounding box, which is used for feature extraction. Finally, the extracted features from the full-body, upper-body, and face regions are combined into a single representation for classification. To validate the proposed approach for gender recognition, experiments are performed on three large data sets namely: 1) human attribute; 2) head-shoulder; and 3) proxemics. For action recognition, we perform experiments on four data sets most used for benchmarking action recognition in still images: 1) Sports; 2) Willow; 3) PASCAL VOC 2010; and 4) Stanford-40. Our experiments clearly demonstrate that the proposed approach, despite its simplicity, outperforms state-of-the-art methods for gender and action recognition.
|
|
J. Nuñez, O. Fors, Xavier Otazu, Vicenç Pala, Roman Arbiol, & M.T. Merino. (2006). A Wavelet-Based Method for the Determination of the Relative Resolution Between Remotely Sensed Images. IEEE Transactions on Geoscience and Remote Sensing, 44(9), 2539–2548.
|
|
Xavier Otazu, M. Gonzalez-Audicana, O. Fors, & J. Nuñez. (2005). Introduction of Sensor Spectral Response Into Image Fusion Methods. Application to Wavelet-Based Methods. IEEE Transactions on Geoscience and Remote Sensing, 43(10), 2376–2385 (IF: 1.627).
|
|
David Geronimo, Joan Serrat, Antonio Lopez, & Ramon Baldrich. (2013). Traffic sign recognition for computer vision project-based learning. T-EDUC - IEEE Transactions on Education, 56(3), 364–371.
Abstract: This paper presents a graduate course project on computer vision. The aim of the project is to detect and recognize traffic signs in video sequences recorded by an on-board vehicle camera. This is a demanding problem, given that traffic sign recognition is one of the most challenging problems for driving assistance systems. Equally, it is motivating for the students given that it is a real-life problem. Furthermore, it gives them the opportunity to appreciate the difficulty of real-world vision problems and to assess the extent to which this problem can be solved by modern computer vision and pattern classification techniques taught in the classroom. The learning objectives of the course are introduced, as are the constraints imposed on its design, such as the diversity of students' background and the amount of time they and their instructors dedicate to the course. The paper also describes the course contents, schedule, and how the project-based learning approach is applied. The outcomes of the course are discussed, including both the students' marks and their personal feedback.
Keywords: traffic signs
|
|
Eduard Vazquez, Ramon Baldrich, Joost Van de Weijer, & Maria Vanrell. (2011). Describing Reflectances for Colour Segmentation Robust to Shadows, Highlights and Textures. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5), 917–930.
Abstract: The segmentation of a single material reflectance is a challenging problem due to the considerable variation in image measurements caused by the geometry of the object, shadows, and specularities. The combination of these effects has been modeled by the dichromatic reflection model. However, the application of the model to real-world images is limited due to unknown acquisition parameters and compression artifacts. In this paper, we present a robust model for the shape of a single material reflectance in histogram space. The method is based on a multilocal creaseness analysis of the histogram which results in a set of ridges representing the material reflectances. The segmentation method derived from these ridges is robust to both shadow, shading and specularities, and texture in real-world images. We further complete the method by incorporating prior knowledge from image statistics, and incorporate spatial coherence by using multiscale color contrast information. Results obtained show that our method clearly outperforms state-of-the-art segmentation methods on a widely used segmentation benchmark, having as a main characteristic its excellent performance in the presence of shadows and highlights at low computational cost.
|
|
Naila Murray, Maria Vanrell, Xavier Otazu, & C. Alejandro Parraga. (2013). Low-level SpatioChromatic Grouping for Saliency Estimation. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 35(11), 2810–2816.
Abstract: We propose a saliency model termed SIM (saliency by induction mechanisms), which is based on a low-level spatiochromatic model that has successfully predicted chromatic induction phenomena. In so doing, we hypothesize that the low-level visual mechanisms that enhance or suppress image detail are also responsible for making some image regions more salient. Moreover, SIM adds geometrical grouplets to enhance complex low-level features such as corners, and suppress relatively simpler features such as edges. Since our model has been fitted on psychophysical chromatic induction data, it is largely nonparametric. SIM outperforms state-of-the-art methods in predicting eye fixations on two datasets and using two metrics.
|
|
Arjan Gijsenij, Theo Gevers, & Joost Van de Weijer. (2012). Improving Color Constancy by Photometric Edge Weighting. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 34(5), 918–929.
Abstract: Edge-based color constancy methods make use of image derivatives to estimate the illuminant. However, different edge types exist in real-world images such as material, shadow and highlight edges. These different edge types may have a distinctive influence on the performance of the illuminant estimation. Therefore, in this paper, an extensive analysis is provided of different edge types on the performance of edge-based color constancy methods. First, an edge-based taxonomy is presented classifying edge types based on their photometric properties (e.g. material, shadow-geometry and highlights). Then, a performance evaluation of edge-based color constancy is provided using these different edge types. From this performance evaluation it is derived that specular and shadow edge types are more valuable than material edges for the estimation of the illuminant. To this end, the (iterative) weighted Grey-Edge algorithm is proposed in which these edge types are more emphasized for the estimation of the illuminant. Images that are recorded under controlled circumstances demonstrate that the proposed iterative weighted Grey-Edge algorithm based on highlights reduces the median angular error by approximately 25%. In an uncontrolled environment, improvements in angular error up to 11% are obtained with respect to regular edge-based color constancy.
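The abstract above reports results as a median angular error. A minimal sketch of that metric, assuming the usual definition (the angle between the estimated and ground-truth illuminant vectors in RGB space), might look as follows. The function names and the sample values are illustrative assumptions, not data or code from the paper.

```python
import math
import statistics


def angular_error_degrees(estimated, ground_truth):
    """Angle in degrees between two illuminant color vectors."""
    dot = sum(e * g for e, g in zip(estimated, ground_truth))
    norm_e = math.sqrt(sum(e * e for e in estimated))
    norm_g = math.sqrt(sum(g * g for g in ground_truth))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_e * norm_g)))
    return math.degrees(math.acos(cosine))


def median_angular_error(estimates, ground_truths):
    """Median of the per-image angular errors over a data set."""
    errors = [
        angular_error_degrees(e, g)
        for e, g in zip(estimates, ground_truths)
    ]
    return statistics.median(errors)


# Made-up example: three estimates compared against a white illuminant.
example_estimates = [(1.0, 1.0, 1.0), (1.0, 0.9, 0.8), (0.7, 1.0, 1.0)]
example_truths = [(1.0, 1.0, 1.0)] * 3
example_median = median_angular_error(example_estimates, example_truths)
```

Because the error depends only on the direction of the vectors, scaling an estimate by a constant does not change it, which is why the metric is insensitive to overall image brightness.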
|
|
Jaime Moreno, & Xavier Otazu. (2011). Image compression algorithm based on Hilbert scanning of embedded quadTrees: an introduction of the Hi-SET coder. In IEEE International Conference on Multimedia and Expo (pp. 1–6).
Abstract: In this work we present an effective and computationally simple algorithm for image compression based on Hilbert Scanning of Embedded quadTrees (Hi-SET). It allows an image to be represented as an embedded bitstream along a fractal function. Embedding is an important feature of modern image compression algorithms; in this regard, Salomon [1, pg. 614] notes that another feature, and perhaps a unique one, is that of achieving the best quality for the number of bits input by the decoder at any point during the decoding. Hi-SET also possesses this latter feature. Furthermore, the coder is based on a quadtree partition strategy that, when applied to image transformation structures such as the discrete cosine or wavelet transform, yields an energy clustering both in frequency and space. The coding algorithm is composed of three general steps, using just a list of significant pixels. The implementation of the proposed coder is developed for gray-scale and color image compression. Hi-SET compressed images are, on average, 6.20 dB better than the ones obtained by other compression techniques based on Hilbert scanning. Moreover, Hi-SET improves the image quality by 1.39 dB and 1.00 dB in gray-scale and color compression, respectively, when compared with the JPEG2000 coder.
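To give a feel for the "fractal function" mentioned in the abstract above, the sketch below enumerates the pixels of a square grid in Hilbert-curve order using the standard index-to-coordinate conversion. It illustrates Hilbert scanning only and is not the Hi-SET coder; the function names are hypothetical, and the grid side is assumed to be a power of two.

```python
def hilbert_index_to_xy(side, index):
    """Map a position along the Hilbert curve to (x, y) grid coordinates.

    `side` is the grid side length and is assumed to be a power of two.
    """
    x = y = 0
    t = index
    s = 1
    while s < side:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        # Rotate or flip the quadrant so sub-curves join end to end.
        if ry == 0:
            if rx == 1:
                x = s - 1 - x
                y = s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def hilbert_scan(side):
    """List every grid cell of a side x side image in Hilbert order."""
    return [hilbert_index_to_xy(side, i) for i in range(side * side)]


scan_2x2 = hilbert_scan(2)
scan_8x8 = hilbert_scan(8)
```

Consecutive cells in the scan are always neighbors in the image, so a one-dimensional pass along the curve keeps spatially close pixels close together. That locality is the general reason Hilbert scans are attractive for image coding.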
|
|