|
Fernando Barrera, Felipe Lumbreras, & Angel Sappa. (2012). Multimodal Stereo Vision System: 3D Data Extraction and Algorithm Evaluation. J-STSP - IEEE Journal of Selected Topics in Signal Processing, 6(5), 437–446.
Abstract: This paper proposes an imaging system for computing sparse depth maps from multispectral images. A special stereo head consisting of an infrared and a color camera defines the proposed multimodal acquisition system. The cameras are rigidly attached so that their image planes are parallel. Details about the calibration and image rectification procedure are provided. Sparse disparity maps are obtained by the combined use of mutual information enriched with gradient information. The proposed approach is evaluated using a Receiver Operating Characteristics curve. Furthermore, a multispectral dataset, color and infrared images, together with their corresponding ground truth disparity maps, is generated and used as a test bed. Experimental results in real outdoor scenarios are provided showing its viability and that the proposed approach is not restricted to a specific domain.
|
|
|
Oriol Pujol, & Petia Radeva. (2004). Texture Segmentation by Statistical Deformable Models. IJIG - International Journal of Image and Graphics, 433–452.
Abstract: Deformable models have received much popularity due to their ability to include high-level knowledge on the application domain into low-level image processing. Still, most proposed active contour models do not sufficiently profit from the application information and they are too generalized, leading to non-optimal final results of segmentation, tracking or 3D reconstruction processes. In this paper we propose a new deformable model defined in a statistical framework to segment objects of natural scenes. We perform a supervised learning of local appearance of the textured objects and construct a feature space using a set of co-occurrence matrix measures. Linear Discriminant Analysis allows us to obtain an optimal reduced feature space where a mixture model is applied to construct a likelihood map. Instead of using a heuristic potential field, our active model is deformed on a regularized version of the likelihood map in order to segment objects characterized by the same texture pattern. Different tests on synthetic images, natural scene and medical images show the advantages of our statistic deformable model.
Keywords: Texture segmentation, parametric active contours, statistic snakes
|
|
|
Naila Murray, Maria Vanrell, Xavier Otazu, & C. Alejandro Parraga. (2011). Saliency Estimation Using a Non-Parametric Low-Level Vision Model. In IEEE conference on Computer Vision and Pattern Recognition (pp. 433–440).
Abstract: Many successful models for predicting attention in a scene involve three main steps: convolution with a set of filters, a center-surround mechanism and spatial pooling to construct a saliency map. However, integrating spatial information and justifying the choice of various parameter values remain open problems. In this paper we show that an efficient model of color appearance in human vision, which contains a principled selection of parameters as well as an innate spatial pooling mechanism, can be generalized to obtain a saliency model that outperforms state-of-the-art models. Scale integration is achieved by an inverse wavelet transform over the set of scale-weighted center-surround responses. The scale-weighting function (termed ECSF) has been optimized to better replicate psychophysical data on color appearance, and the appropriate sizes of the center-surround inhibition windows have been determined by training a Gaussian Mixture Model on eye-fixation data, thus avoiding ad-hoc parameter selection. Additionally, we conclude that the extension of a color appearance model to saliency estimation adds to the evidence for a common low-level visual front-end for different visual tasks.
Keywords: Gaussian mixture model;ad hoc parameter selection;center-surround inhibition windows;center-surround mechanism;color appearance model;convolution;eye-fixation data;human vision;innate spatial pooling mechanism;inverse wavelet transform;low-level visual front-end;nonparametric low-level vision model;saliency estimation;saliency map;scale integration;scale-weighted center-surround response;scale-weighting function;visual task;Gaussian processes;biology;biology computing;colour vision;computer vision;visual perception;wavelet transforms
|
|
|
Luis Herranz, Shuqiang Jiang, & Ruihan Xu. (2017). Modeling Restaurant Context for Food Recognition. TMM - IEEE Transactions on Multimedia, 19(2), 430–440.
Abstract: Food photos are widely used in food logs for diet monitoring and in social networks to share social and gastronomic experiences. A large number of these images are taken in restaurants. Dish recognition in general is very challenging, due to different cuisines, cooking styles, and the intrinsic difficulty of modeling food from its visual appearance. However, contextual knowledge can be crucial to improve recognition in such scenario. In particular, geocontext has been widely exploited for outdoor landmark recognition. Similarly, we exploit knowledge about menus and location of restaurants and test images. We first adapt a framework based on discarding unlikely categories located far from the test image. Then, we reformulate the problem using a probabilistic model connecting dishes, restaurants, and locations. We apply that model in three different tasks: dish recognition, restaurant recognition, and location refinement. Experiments on six datasets show that by integrating multiple evidences (visual, location, and external knowledge) our system can boost the performance in all tasks.
|
|
|
Fadi Dornaika, & Angel Sappa. (2008). Evaluation of an Appearance-based 3D Face Tracker using Dense 3D Data. Machine Vision and Applications, 427–441.
|
|
|
Debora Gil, Aura Hernandez-Sabate, Mireia Burnat, Steven Jansen, & Jordi Martinez-Vilalta. (2009). Structure-Preserving Smoothing of Biomedical Images. In 13th International Conference on Computer Analysis of Images and Patterns (Vol. 5702, pp. 427–434). LNCS. Springer Berlin Heidelberg.
Abstract: Smoothing of biomedical images should preserve gray-level transitions between adjacent tissues, while restoring contours consistent with anatomical structures. Anisotropic diffusion operators are based on image appearance discontinuities (either local or contextual) and might fail at weak inter-tissue transitions. Meanwhile, the output of block-wise and morphological operations is prone to present a block structure due to the shape and size of the considered pixel neighborhood. In this contribution, we use differential geometry concepts to define a diffusion operator that restricts to image consistent level-sets. In this manner, the final state is a non-uniform intensity image presenting homogeneous inter-tissue transitions along anatomical structures, while smoothing intra-structure texture. Experiments on different types of medical images (magnetic resonance, computerized tomography) illustrate its benefit on a further process (such as segmentation) of images.
Keywords: non-linear smoothing; differential geometry; anatomical structures segmentation; cardiac magnetic resonance; computerized tomography.
|
|
|
Marcel P. Lucassen, Theo Gevers, & Arjan Gijsenij. (2011). Texture Affects Color Emotion. CRA - Color Research & Applications, 36(6), 426–436.
Abstract: Several studies have recorded color emotions in subjects viewing uniform color (UC) samples. We conduct an experiment to measure and model how these color emotions change when texture is added to the color samples. Using a computer monitor, our subjects arrange samples along four scales: warm–cool, masculine–feminine, hard–soft, and heavy–light. Three sample types of increasing visual complexity are used: UC, grayscale textures, and color textures (CTs). To assess the intraobserver variability, the experiment is repeated after 1 week. Our results show that texture fully determines the responses on the Hard-Soft scale, and plays a role of decreasing weight for the masculine–feminine, heavy–light, and warm–cool scales. Using some 25,000 observer responses, we derive color emotion functions that predict the group-averaged scale responses from the samples' color and texture parameters. For UC samples, the accuracy of our functions is significantly higher (average R2 = 0.88) than that of previously reported functions applied to our data. The functions derived for CT samples have an accuracy of R2 = 0.80. We conclude that when textured samples are used in color emotion studies, the psychological responses may be strongly affected by texture. © 2010 Wiley Periodicals, Inc. Col Res Appl, 2010
Keywords: color;texture;color emotion;observer variability;ranking
|
|
|
V. Kober, Mikhail Mozerov, J. Alvarez-Borrego, & I.A. Ovseyevich. (2006). Adaptive Correlation Filters for Pattern Recognition. Pattern Recognition and Image Analysis, 425–431.
Abstract: Adaptive correlation filters based on synthetic discriminant functions (SDFs) for reliable pattern recognition are proposed. A given value of discrimination capability can be achieved by adapting a SDF filter to the input scene. This can be done by iterative training. Computer simulation results obtained with the proposed filters are compared with those of various correlation filters in terms of recognition performance.
Keywords: Pattern recognition, Correlation filters, A adaptive filters
|
|
|
Diego Cheda, Daniel Ponsa, & Antonio Lopez. (2012). Monocular Egomotion Estimation based on Image Matching. In 1st International Conference on Pattern Recognition Applications and Methods (pp. 425–430).
|
|
|
Md. Mostafa Kamal Sarker, Hatem A. Rashwan, Hatem A. Rashwan, Estefania Talavera, Syeda Furruka Banu, Petia Radeva, et al. (2018). MACNet: Multi-scale Atrous Convolution Networks for Food Places Classification in Egocentric Photo-streams. In European Conference on Computer Vision workshops (pp. 423–433). LCNS.
Abstract: First-person (wearable) camera continually captures unscripted interactions of the camera user with objects, people, and scenes reflecting his personal and relational tendencies. One of the preferences of people is their interaction with food events. The regulation of food intake and its duration has a great importance to protect against diseases. Consequently, this work aims to develop a smart model that is able to determine the recurrences of a person on food places during a day. This model is based on a deep end-to-end model for automatic food places recognition by analyzing egocentric photo-streams. In this paper, we apply multi-scale Atrous convolution networks to extract the key features related to food places of the input images. The proposed model is evaluated on an in-house private dataset called “EgoFoodPlaces”. Experimental results shows promising results of food places classification recognition in egocentric photo-streams.
|
|
|
Gemma Rotger, Francesc Moreno-Noguer, Felipe Lumbreras, & Antonio Agudo. (2019). Single view facial hair 3D reconstruction. In 9th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 11867, pp. 423–436). LNCS.
Abstract: n this work, we introduce a novel energy-based framework that addresses the challenging problem of 3D reconstruction of facial hair from a single RGB image. To this end, we identify hair pixels over the image via texture analysis and then determine individual hair fibers that are modeled by means of a parametric hair model based on 3D helixes. We propose to minimize an energy composed of several terms, in order to adapt the hair parameters that better fit the image detections. The final hairs respond to the resulting fibers after a post-processing step where we encourage further realism. The resulting approach generates realistic facial hair fibers from solely an RGB image without assuming any training data nor user interaction. We provide an experimental evaluation on real-world pictures where several facial hair styles and image conditions are observed, showing consistent results and establishing a comparison with respect to competing approaches.
Keywords: 3D Vision; Shape Reconstruction; Facial Hair Modeling
|
|
|
David Guillamet, & Jordi Vitria. (2000). A Comparison of Local versus Global Color Histograms for Object Recognition. In 15 th International Conference on Pattern Recognition (Vol. 2, pp. 422–425).
|
|
|
Victoria Ruiz, Angel Sanchez, Jose F. Velez, & Bogdan Raducanu. (2019). Automatic Image-Based Waste Classification. In International Work-Conference on the Interplay Between Natural and Artificial Computation. From Bioinspired Systems and Biomedical Applications to Machine Learning (Vol. 11487, 422–431). LNCS.
Abstract: The management of solid waste in large urban environments has become a complex problem due to increasing amount of waste generated every day by citizens and companies. Current Computer Vision and Deep Learning techniques can help in the automatic detection and classification of waste types for further recycling tasks. In this work, we use the TrashNet dataset to train and compare different deep learning architectures for automatic classification of garbage types. In particular, several Convolutional Neural Networks (CNN) architectures were compared: VGG, Inception and ResNet. The best classification results were obtained using a combined Inception-ResNet model that achieved 88.6% of accuracy. These are the best results obtained with the considered dataset.
Keywords: Computer Vision; Deep learning; Convolutional neural networks; Waste classification
|
|
|
Francesc Tous, Agnes Borras, Robert Benavente, Ramon Baldrich, Maria Vanrell, & Josep Llados. (2002). Textual Descriptions for Browsing People by Visual Apperance. In Lecture Notes in Artificial Intelligence (Vol. 2504, pp. 419–429). Springer Verlag.
Abstract: This paper presents a first approach to build colour and structural descriptors for information retrieval on a people database. Queries are formulated in terms of their appearance that allows to seek people wearing specific clothes of a given colour name or texture. Descriptors are automatically computed by following three essential steps. A colour naming labelling from pixel properties. A region seg- mentation step based on colour properties of pixels combined with edge information. And a high level step that models the region arrangements in order to build clothes structure. Results are tested on large set of images from real scenes taken at the entrance desk of a building
|
|
|
David Geronimo, Antonio Lopez, Daniel Ponsa, & Angel Sappa. (2007). Haar Wavelets and Edge Orientation Histograms for On-Board Pedestrian Detection. In J. Marti et al. (Ed.), 3rd Iberian Conference on Pattern Recognition and Image Analysis, LNCS 4477 (Vol. 1, 418–425).
Keywords: Pedestrian detection
|
|