Farshad Nourbakhsh. (2009). Colour logo recognition (Vol. 145). Master's thesis, , Bellaterra, Barcelona.
|
Hany Salah Eldeen. (2009). Colour Naming in Context through a Perceptual Model (Vol. 130). Master's thesis, , Bellaterra, Barcelona.
|
David Augusto Rojas. (2009). Colouring Local Feature Detection for Matching (Vol. 133). Master's thesis, , Bellaterra, Barcelona.
|
Xavier Boix, Josep M. Gonfaus, Fahad Shahbaz Khan, Joost Van de Weijer, Andrew Bagdanov, Marco Pedersoli, et al. (2009). Combining local and global bag-of-word representations for semantic segmentation. In Workshop on The PASCAL Visual Object Classes Challenge.
|
Salim Jouili, Salvatore Tabbone, & Ernest Valveny. (2009). Comparing Graph Similarity Measures for Graphical Recognition. In 8th IAPR International Workshop on Graphics Recognition. LNCS. Springer.
Abstract: In this paper we evaluate four graph distance measures. The analysis is performed for document retrieval tasks. For this aim, different kind of documents are used including line drawings (symbols), ancient documents (ornamental letters), shapes and trademark-logos. The experimental results show that the performance of each graph distance measure depends on the kind of data and the graph representation technique.
|
L.Tarazon, D. Perez, N. Serrano, V. Alabau, Oriol Ramos Terrades, A. Sanchis, et al. (2009). Confidence Measures for Error Correction in Interactive Transcription of Handwritten Text. In 15th International Conference on Image Analysis and Processing (Vol. 5716, pp. 567–574). LNCS. Springer Berlin Heidelberg.
Abstract: An effective approach to transcribe old text documents is to follow an interactive-predictive paradigm in which both, the system is guided by the human supervisor, and the supervisor is assisted by the system to complete the transcription task as efficiently as possible. In this paper, we focus on a particular system prototype called GIDOC, which can be seen as a first attempt to provide user-friendly, integrated support for interactive-predictive page layout analysis, text line detection and handwritten text transcription. More specifically, we focus on the handwriting recognition part of GIDOC, for which we propose the use of confidence measures to guide the human supervisor in locating possible system errors and deciding how to proceed. Empirical results are reported on two datasets showing that a word error rate not larger than a 10% can be achieved by only checking the 32% of words that are recognised with less confidence.
|
Mehdi Mirza-Mohammadi, Sergio Escalera, & Petia Radeva. (2009). Contextual-Guided Bag-of-Visual-Words Model for Multi-class Object Categorization. In 13th International Conference on Computer Analysis of Images and Patterns (Vol. 5702, 748–756). LNCS. Springer Berlin Heidelberg.
Abstract: Bag-of-words model (BOW) is inspired by the text classification problem, where a document is represented by an unsorted set of contained words. Analogously, in the object categorization problem, an image is represented by an unsorted set of discrete visual words (BOVW). In these models, relations among visual words are performed after dictionary construction. However, close object regions can have far descriptions in the feature space, being grouped as different visual words. In this paper, we present a method for considering geometrical information of visual words in the dictionary construction step. Object interest regions are obtained by means of the Harris-Affine detector and then described using the SIFT descriptor. Afterward, a contextual-space and a feature-space are defined, and a merging process is used to fuse feature words based on their proximity in the contextual-space. Moreover, we use the Error Correcting Output Codes framework to learn the new dictionary in order to perform multi-class classification. Results show significant classification improvements when spatial information is taken into account in the dictionary construction step.
|
Agnes Borras. (2009). Contributions to the Content-Based Image Retrieval Using Pictorial Queries (Josep Llados, Ed.). Ph.D. thesis, Ediciones Graficas Rey, Bellaterra.
Abstract: The broad access to digital cameras, personal computers and Internet, has lead to the generation of large volumes of data in digital form. If we want an effective usage of this huge amount of data, we need automatic tools to allow the retrieval of relevant information. Image data is a particular type of information that requires specific techniques of description and indexing. The computer vision field that studies these kind of techniques is called Content-Based Image Retrieval (CBIR). Instead of using text-based descriptions, a system of CBIR deals on properties that are inherent in the images themselves. Hence, the feature-based description provides a universal via of image expression in contrast with the more than 6000 languages spoken in the world.
Nowadays, the CBIR is a dynamic focus of research that has derived in important applications for many professional groups. The potential fields of application can be such diverse as: the medical domain, the crime prevention, the protection of the intel- lectual property, the journalism, the graphic design, the web search, the preservation of cultural heritage, etc.
The definition on the role of the user is a key point in the development of a CBIR application. The user is in charge to formulate the queries from which the images are retrieved. We have centered our attention on the image retrieval techniques that use queries based on pictorial information. We have identified a taxonomy composed by four main query paradigms: query-by-selection, query-by-iconic-composition, query- by-sketch and query-by-paint. Each one of these paradigms allows a different degree of user expressivity. From a simple image selection, to a complete painting of the query, the user takes control of the input in the CBIR system.
Along the chapters of this thesis we have analyzed the influence that each query paradigm imposes in the internal operations of a CBIR system. Moreover, we have proposed a set of contributions that we have exemplified in the context of a final application.
|
Agnes Borras, & Josep Llados. (2009). Corest: A measure of color and space stability to detect salient regions according to human criteria. In 5th International Conference on Computer Vision Theory and Applications (pp. 204–209).
|
Ivan Huerta, Michael Holte, Thomas B. Moeslund, & Jordi Gonzalez. (2009). Detection and Removal of Chromatic Moving Shadows in Surveillance Scenarios. In 12th International Conference on Computer Vision (pp. 1499–1506).
Abstract: Segmentation in the surveillance domain has to deal with shadows to avoid distortions when detecting moving objects. Most segmentation approaches dealing with shadow detection are typically restricted to penumbra shadows. Therefore, such techniques cannot cope well with umbra shadows. Consequently, umbra shadows are usually detected as part of moving objects. In this paper we present a novel technique based on gradient and colour models for separating chromatic moving cast shadows from detected moving objects. Firstly, both a chromatic invariant colour cone model and an invariant gradient model are built to perform automatic segmentation while detecting potential shadows. In a second step, regions corresponding to potential shadows are grouped by considering “a bluish effect” and an edge partitioning. Lastly, (i) temporal similarities between textures and (ii) spatial similarities between chrominance angle and brightness distortions are analysed for all potential shadow regions in order to finally identify umbra shadows. Unlike other approaches, our method does not make any a-priori assumptions about camera location, surface geometries, surface textures, shapes and types of shadows, objects, and background. Experimental results show the performance and accuracy of our approach in different shadowed materials and illumination conditions.
|
Fernando Vilariño, Panagiota Spyridonos, Petia Radeva, Jordi Vitria, Fernando Azpiroz, & Juan Malagelada. (2009). Device, system and method for measurement and analysis of contractile activity.
Abstract: A method and system for determining intestinal dysfunction condition are provided by classifying and analyzing image frames captured in-vivo. The method and system also relate to the detection of contractile activity in intestinal tracts, to automatic detection of video image frames taken in the gastrointestinal tract including contractile activity, and more particularly to measurement and analysis of contractile activity of the GI tract based on image intensity of in vivo image data.
|
Sergio Escalera, R. M. Martinez, Jordi Vitria, Petia Radeva, & Maria Teresa Anguera. (2009). Dominance Detection in Face-to-face Conversations. In 2nd IEEE Workshop on CVPR for Human communicative Behavior analysis (97–102).
Abstract: Dominance is referred to the level of influence a person has in a conversation. Dominance is an important research area in social psychology, but the problem of its automatic estimation is a very recent topic in the contexts of social and wearable computing. In this paper, we focus on dominance detection from visual cues. We estimate the correlation among observers by categorizing the dominant people in a set of face-to-face conversations. Different dominance indicators from gestural communication are defined, manually annotated, and compared to the observers opinion. Moreover, the considered indicators are automatically extracted from video sequences and learnt by using binary classifiers. Results from the three analysis shows a high correlation and allows the categorization of dominant people in public discussion video sequences.
|
Francesco Ciompi, Oriol Pujol, E Fernandez-Nofrerias, J. Mauri, & Petia Radeva. (2009). ECOC Random Fields for Lumen Segmentation in Radial Artery IVUS Sequences. In 12th International Conference on Medical Image and Computer Assisted Intervention (Vol. 5762). LNCS. Springer Berlin Heidelberg.
Abstract: The measure of lumen volume on radial arteries can be used to evaluate the vessel response to different vasodilators. In this paper, we present a framework for automatic lumen segmentation in longitudinal cut images of radial artery from Intravascular ultrasound sequences. The segmentation is tackled as a classification problem where the contextual information is exploited by means of Conditional Random Fields (CRFs). A multi-class classification framework is proposed, and inference is achieved by combining binary CRFs according to the Error-Correcting-Output-Code technique. The results are validated against manually segmented sequences. Finally, the method is compared with other state-of-the-art classifiers.
|
Angel Sappa, & Mohammad Rouhani. (2009). Efficient Distance Estimation for Fitting Implicit Quadric Surfaces. In 16th IEEE International Conference on Image Processing (3521–3524).
Abstract: This paper presents a novel approach for estimating the shortest Euclidean distance from a given point to the corresponding implicit quadric fitting surface. It first estimates the orthogonal orientation to the surface from the given point; then the shortest distance is directly estimated by intersecting the implicit surface with a line passing through the given point according to the estimated orthogonal orientation. The proposed orthogonal distance estimation is easily obtained without increasing computational complexity; hence it can be used in error minimization surface fitting frameworks. Comparisons of the proposed metric with previous approaches are provided to show both improvements in CPU time as well as in the accuracy of the obtained results. Surfaces fitted by using the proposed geometric distance estimation and state of the art metrics are presented to show the viability of the proposed approach.
|
David Aldavert, Ricardo Toledo, Arnau Ramisa, & Ramon Lopez de Mantaras. (2009). Efficient Object Pixel-Level Categorization using Bag of Features: Advances in Visual Computing. In 5th International Symposium on Visual Computing (Vol. 5875, 44–55). Springer Berlin Heidelberg.
Abstract: In this paper we present a pixel-level object categorization method suitable to be applied under real-time constraints. Since pixels are categorized using a bag of features scheme, the major bottleneck of such an approach would be the feature pooling in local histograms of visual words. Therefore, we propose to bypass this time-consuming step and directly obtain the score from a linear Support Vector Machine classifier. This is achieved by creating an integral image of the components of the SVM which can readily obtain the classification score for any image sub-window with only 10 additions and 2 products, regardless of its size. Besides, we evaluated the performance of two efficient feature quantization methods: the Hierarchical K-Means and the Extremely Randomized Forest. All experiments have been done in the Graz02 database, showing comparable, or even better results to related work with a lower computational cost.
|