Agnes Borras. (2009). Contributions to the Content-Based Image Retrieval Using Pictorial Queries (Josep Llados, Ed.). Ph.D. thesis, Ediciones Graficas Rey, Bellaterra.
Abstract: The broad access to digital cameras, personal computers and Internet, has lead to the generation of large volumes of data in digital form. If we want an effective usage of this huge amount of data, we need automatic tools to allow the retrieval of relevant information. Image data is a particular type of information that requires specific techniques of description and indexing. The computer vision field that studies these kind of techniques is called Content-Based Image Retrieval (CBIR). Instead of using text-based descriptions, a system of CBIR deals on properties that are inherent in the images themselves. Hence, the feature-based description provides a universal via of image expression in contrast with the more than 6000 languages spoken in the world.
Nowadays, the CBIR is a dynamic focus of research that has derived in important applications for many professional groups. The potential fields of application can be such diverse as: the medical domain, the crime prevention, the protection of the intel- lectual property, the journalism, the graphic design, the web search, the preservation of cultural heritage, etc.
The definition on the role of the user is a key point in the development of a CBIR application. The user is in charge to formulate the queries from which the images are retrieved. We have centered our attention on the image retrieval techniques that use queries based on pictorial information. We have identified a taxonomy composed by four main query paradigms: query-by-selection, query-by-iconic-composition, query- by-sketch and query-by-paint. Each one of these paradigms allows a different degree of user expressivity. From a simple image selection, to a complete painting of the query, the user takes control of the input in the CBIR system.
Along the chapters of this thesis we have analyzed the influence that each query paradigm imposes in the internal operations of a CBIR system. Moreover, we have proposed a set of contributions that we have exemplified in the context of a final application.
|
Daniel Ponsa, & Antonio Lopez. (2009). Seguimiento Visual de Contornos Computerizado.
|
Ferran Diego, Daniel Ponsa, Joan Serrat, & Antonio Lopez. (2009). Video alignment for automotive applications.
Keywords: video alignment
|
Jose Manuel Alvarez, & Antonio Lopez. (2009). Model-based road detection using shadowless features and on-line learning.
|
Xavier Boix, Josep M. Gonfaus, Fahad Shahbaz Khan, Joost Van de Weijer, Andrew Bagdanov, Marco Pedersoli, et al. (2009). Combining local and global bag-of-word representations for semantic segmentation. In Workshop on The PASCAL Visual Object Classes Challenge.
|
David Geronimo. (2010). A Global Approach to Vision-Based Pedestrian Detection for Advanced Driver Assistance Systems (Antonio Lopez, Krystian Mikolajczyk, Jaume Amores, Dariu M. Gavrila, Oriol Pujol, & Felipe Lumbreras, Eds.). Ph.D. thesis, Ediciones Graficas Rey, .
Abstract: At the beginning of the 21th century, traffic accidents have become a major problem not only for developed countries but also for emerging ones. As in other scientific areas in which Artificial Intelligence is becoming a key actor, advanced driver assistance systems, and concretely pedestrian protection systems based on Computer Vision, are becoming a strong topic of research aimed at improving the safety of pedestrians. However, the challenge is of considerable complexity due to the varying appearance of humans (e.g., clothes, size, aspect ratio, shape, etc.), the dynamic nature of on-board systems and the unstructured moving environments that urban scenarios represent. In addition, the required performance is demanding both in terms of computational time and detection rates. In this thesis, instead of focusing on improving specific tasks as it is frequent in the literature, we present a global approach to the problem. Such a global overview starts by the proposal of a generic architecture to be used as a framework both to review the literature and to organize the studied techniques along the thesis. We then focus the research on tasks such as foreground segmentation, object classification and refinement following a general viewpoint and exploring aspects that are not usually analyzed. In order to perform the experiments, we also present a novel pedestrian dataset that consists of three subsets, each one addressed to the evaluation of a different specific task in the system. The results presented in this thesis not only end with a proposal of a pedestrian detection system but also go one step beyond by pointing out new insights, formalizing existing and proposed algorithms, introducing new techniques and evaluating their performance, which we hope will provide new foundations for future research in the area.
|
Mario Rojas, David Masip, A. Todorov, & Jordi Vitria. (2010). Automatic Point-based Facial Trait Judgments Evaluation. In 23rd IEEE Conference on Computer Vision and Pattern Recognition (2715–2720).
Abstract: Humans constantly evaluate the personalities of other people using their faces. Facial trait judgments have been studied in the psychological field, and have been determined to influence important social outcomes of our lives, such as elections outcomes and social relationships. Recent work on textual descriptions of faces has shown that trait judgments are highly correlated. Further, behavioral studies suggest that two orthogonal dimensions, valence and dominance, can describe the basis of the human judgments from faces. In this paper, we used a corpus of behavioral data of judgments on different trait dimensions to automatically learn a trait predictor from facial pixel images. We study whether trait evaluations performed by humans can be learned using machine learning classifiers, and used later in automatic evaluations of new facial images. The experiments performed using local point-based descriptors show promising results in the evaluation of the main traits.
|
Santiago Segui, Laura Igual, & Jordi Vitria. (2010). Weighted Bagging for Graph based One-Class Classifiers. In 9th International Workshop on Multiple Classifier Systems (Vol. 5997, pp. 1–10). LNCS. Springer Berlin Heidelberg.
Abstract: Most conventional learning algorithms require both positive and negative training data for achieving accurate classification results. However, the problem of learning classifiers from only positive data arises in many applications where negative data are too costly, difficult to obtain, or not available at all. Minimum Spanning Tree Class Descriptor (MSTCD) was presented as a method that achieves better accuracies than other one-class classifiers in high dimensional data. However, the presence of outliers in the target class severely harms the performance of this classifier. In this paper we propose two bagging strategies for MSTCD that reduce the influence of outliers in training data. We show the improved performance on both real and artificially contaminated data.
|
Partha Pratim Roy, Umapada Pal, & Josep Llados. (2010). Seal Object Detection in Document Images using GHT of Local Component Shapes. In 10th ACM Symposium On Applied Computing (23–27).
Abstract: Due to noise, overlapped text/signature and multi-oriented nature, seal (stamp) object detection involves a difficult challenge. This paper deals with automatic detection of seal from documents with cluttered background. Here, a seal object is characterized by scale and rotation invariant spatial feature descriptors (distance and angular position) computed from recognition result of individual connected components (characters). Recognition of multi-scale and multi-oriented component is done using Support Vector Machine classifier. Generalized Hough Transform (GHT) is used to detect the seal and a voting is casted for finding possible location of the seal object in a document based on these spatial feature descriptor of components pairs. The peak of votes in GHT accumulator validates the hypothesis to locate the seal object in a document. Experimental results show that, the method is efficient to locate seal instance of arbitrary shape and orientation in documents.
|
Marçal Rusiñol, & Josep Llados. (2010). Symbol Spotting in Digital Libraries:Focused Retrieval over Graphic-rich Document Collections. Springer.
Abstract: The specific problem of symbol recognition in graphical documents requires additional techniques to those developed for character recognition. The most well-known obstacle is the so-called Sayre paradox: Correct recognition requires good segmentation, yet improvement in segmentation is achieved using information provided by the recognition process. This dilemma can be avoided by techniques that identify sets of regions containing useful information. Such symbol-spotting methods allow the detection of symbols in maps or technical drawings without having to fully segment or fully recognize the entire content.
This unique text/reference provides a complete, integrated and large-scale solution to the challenge of designing a robust symbol-spotting method for collections of graphic-rich documents. The book examines a number of features and descriptors, from basic photometric descriptors commonly used in computer vision techniques to those specific to graphical shapes, presenting a methodology which can be used in a wide variety of applications. Additionally, readers are supplied with an insight into the problem of performance evaluation of spotting methods. Some very basic knowledge of pattern recognition, document image analysis and graphics recognition is assumed.
Keywords: Focused Retrieval , Graphical Pattern Indexation,Graphics Recognition ,Pattern Recognition , Performance Evaluation , Symbol Description ,Symbol Spotting
|
Muhammad Muzzamil Luqman, Thierry Brouard, Jean-Yves Ramel, & Josep Llados. (2010). Vers une approche foue of encapsulation de graphes: application a la reconnaissance de symboles. In Colloque International Francophone sur l'Écrit et le Document (pp. 169–184).
Abstract: We present a new methodology for symbol recognition, by employing a structural approach for representing visual associations in symbols and a statistical classifier for recognition. A graphic symbol is vectorized, its topological and geometrical details are encoded by an attributed relational graph and a signature is computed for it. Data adapted fuzzy intervals have been introduced for addressing the sensitivity of structural representations to noise. The joint probability distribution of signatures is encoded by a Bayesian network, which serves as a mechanism for pruning irrelevant features and choosing a subset of interesting features from structural signatures of underlying symbol set, and is deployed in a supervised learning scenario for recognizing query symbols. Experimental results on pre-segmented 2D linear architectural and electronic symbols from GREC databases are presented.
Keywords: Fuzzy interval; Graph embedding; Bayesian network; Symbol recognition
|
Jaume Amores. (2010). Vocabulary-based Approaches for Multiple-Instance Data: a Comparative Study. In 20th International Conference on Pattern Recognition (4246–4250).
Abstract: Multiple Instance Learning (MIL) has become a hot topic and many different algorithms have been proposed in the last years. Despite this fact, there is a lack of comparative studies that shed light into the characteristics of the different methods and their behavior in different scenarios. In this paper we provide such an analysis. We include methods from different families, and pay special attention to vocabulary-based approaches, a new family of methods that has not received much attention in the MIL literature. The empirical comparison includes seven databases from four heterogeneous domains, implementations of eight popular MIL methods, and a study of the behavior under synthetic conditions. Based on this analysis, we show that, with an appropriate implementation, vocabulary-based approaches outperform other MIL methods in most of the cases, showing in general a more consistent performance.
|
Josep M. Gonfaus, Xavier Boix, Joost Van de Weijer, Andrew Bagdanov, Joan Serrat, & Jordi Gonzalez. (2010). Harmony Potentials for Joint Classification and Segmentation. In 23rd IEEE Conference on Computer Vision and Pattern Recognition (3280–3287).
Abstract: Hierarchical conditional random fields have been successfully applied to object segmentation. One reason is their ability to incorporate contextual information at different scales. However, these models do not allow multiple labels to be assigned to a single node. At higher scales in the image, this yields an oversimplified model, since multiple classes can be reasonable expected to appear within one region. This simplified model especially limits the impact that observations at larger scales may have on the CRF model. Neglecting the information at larger scales is undesirable since class-label estimates based on these scales are more reliable than at smaller, noisier scales. To address this problem, we propose a new potential, called harmony potential, which can encode any possible combination of class labels. We propose an effective sampling strategy that renders tractable the underlying optimization problem. Results show that our approach obtains state-of-the-art results on two challenging datasets: Pascal VOC 2009 and MSRC-21.
|
Naila Murray, & Eduard Vazquez. (2010). Lacuna Restoration: How to choose a neutral colour? In Proceedings of The CREATE 2010 Conference (248–252).
Abstract: Painting restoration which involves filling in material loss (called lacuna) is a complex process. Several standard techniques exist to tackle lacuna restoration,
and this article focuses on those techniques that employ a “neutral” colour to mask the defect. Restoration experts often disagree on the choice of such a colour and in fact, the concept of a neutral colour is controversial. We posit that a neutral colour is one that attracts relatively little visual attention for a specific lacuna. We conducted an eye tracking experiment to compare two common neutral
colour selection methods, specifically the most common local colour and the mean local colour. Results obtained demonstrate that the most common local colour triggers less visual attention in general. Notwithstanding, we have observed instances in which the most common colour triggers a significant amount of attention when subjects spent time resolving their confusion about whether or not a lacuna was part of the painting.
|
Marta Teres, & Eduard Vazquez. (2010). Museums, spaces and museographical resources. Current state and proposals for a multidisciplinary framework to open new perspectives. In Proceedings of The CREATE 2010 Conference (319–323).
Abstract: Two of the main aims of a museum are to communicate its heritage and to make enjoy its visitors. This communication can be done through the pieces itself and the museographical resources but also through the building, the interior design, the light and the colour. Art museums, in opposition with other museums, lack on the application of these additional resources. Such a work necessarily requires a multidisciplinary point of view for a holistic vision of all what a museum implies and to use all its potential as a tool of knowledge and culture for all the visitors.
|