Fahad Shahbaz Khan, Joost Van de Weijer, & Maria Vanrell. (2009). Top-Down Color Attention for Object Recognition. In 12th International Conference on Computer Vision (pp. 979–986).
Abstract: Generally the bag-of-words based image representation follows a bottom-up paradigm. The subsequent stages of the process: feature detection, feature description, vocabulary construction and image representation are performed independent of the intentioned object classes to be detected. In such a framework, combining multiple cues such as shape and color often provides below-expected results. This paper presents a novel method for recognizing object categories when using multiple cues by separating the shape and color cue. Color is used to guide attention by means of a top-down category-specific attention map. The color attention map is then further deployed to modulate the shape features by taking more features from regions within an image that are likely to contain an object instance. This procedure leads to a category-specific image histogram representation for each category. Furthermore, we argue that the method combines the advantages of both early and late fusion. We compare our approach with existing methods that combine color and shape cues on three data sets containing varied importance of both cues, namely, Soccer ( color predominance), Flower (color and shape parity), and PASCAL VOC Challenge 2007 (shape predominance). The experiments clearly demonstrate that in all three data sets our proposed framework significantly outperforms the state-of-the-art methods for combining color and shape information.
|
Arjan Gijsenij, Theo Gevers, & Joost Van de Weijer. (2009). Physics-based Edge Evaluation for Improved Color Constancy. In 22nd IEEE Conference on Computer Vision and Pattern Recognition (581 – 588).
Abstract: Edge-based color constancy makes use of image derivatives to estimate the illuminant. However, different edge types exist in real-world images such as shadow, geometry, material and highlight edges. These different edge types may have a distinctive influence on the performance of the illuminant estimation.
|
Jose Manuel Alvarez, Ferran Diego, Joan Serrat, & Antonio Lopez. (2009). Automatic Ground-truthing using video registration for on-board detection algorithms. In 16th IEEE International Conference on Image Processing (pp. 4389–4392).
Abstract: Ground-truth data is essential for the objective evaluation of object detection methods in computer vision. Many works claim their method is robust but they support it with experiments which are not quantitatively assessed with regard some ground-truth. This is one of the main obstacles to properly evaluate and compare such methods. One of the main reasons is that creating an extensive and representative ground-truth is very time consuming, specially in the case of video sequences, where thousands of frames have to be labelled. Could such a ground-truth be generated, at least in part, automatically? Though it may seem a contradictory question, we show that this is possible for the case of video sequences recorded from a moving camera. The key idea is transferring existing frame segmentations from a reference sequence into another video sequence recorded at a different time on the same track, possibly under a different ambient lighting. We have carried out experiments on several video sequence pairs and quantitatively assessed the precision of the transformed ground-truth, which prove that our approach is not only feasible but also quite accurate.
|
Nicola Bellotto, Eric Sommerlade, Ben Benfold, Charles Bibby, I. Reid, Daniel Roth, et al. (2009). A Distributed Camera System for Multi-Resolution Surveillance. In 3rd ACM/IEEE International Conference on Distributed Smart Cameras.
Abstract: We describe an architecture for a multi-camera, multi-resolution surveillance system. The aim is to support a set of distributed static and pan-tilt-zoom (PTZ) cameras and visual tracking algorithms, together with a central supervisor unit. Each camera (and possibly pan-tilt device) has a dedicated process and processor. Asynchronous interprocess communications and archiving of data are achieved in a simple and effective way via a central repository, implemented using an SQL database. Visual tracking data from static views are stored dynamically into tables in the database via client calls to the SQL server. A supervisor process running on the SQL server determines if active zoom cameras should be dispatched to observe a particular target, and this message is effected via writing demands into another database table. We show results from a real implementation of the system comprising one static camera overviewing the environment under consideration and a PTZ camera operating under closed-loop velocity control, which uses a fast and robust level-set-based region tracker. Experiments demonstrate the effectiveness of our approach and its feasibility to multi-camera systems for intelligent surveillance.
Keywords: 10.1109/ICDSC.2009.5289413
|
Mikhail Mozerov, Ariel Amato, & Xavier Roca. (2009). Occlusion Handling in Trinocular Stereo using Composite Disparity Space Image. In 19th International Conference on Computer Graphics and Vision (69–73).
Abstract: In this paper we propose a method that smartly improves occlusion handling in stereo matching using trinocular stereo. The main idea is based on the assumption that any occluded region in a matched stereo pair (middle-left images) in general is not occluded in the opposite matched pair (middle-right images). Then two disparity space images (DSI) can be merged in one composite DSI. The proposed integration differs from the known approach that uses a cumulative cost. A dense disparity map is obtained with a global optimization algorithm using the proposed composite DSI. The experimental results are evaluated on the Middlebury data set, showing high performance of the proposed algorithm especially in the occluded regions. One of the top positions in the rank of the Middlebury website confirms the performance of our method to be competitive with the best stereo matching.
|
Ivan Huerta, Michael Holte, Thomas B. Moeslund, & Jordi Gonzalez. (2009). Detection and Removal of Chromatic Moving Shadows in Surveillance Scenarios. In 12th International Conference on Computer Vision (pp. 1499–1506).
Abstract: Segmentation in the surveillance domain has to deal with shadows to avoid distortions when detecting moving objects. Most segmentation approaches dealing with shadow detection are typically restricted to penumbra shadows. Therefore, such techniques cannot cope well with umbra shadows. Consequently, umbra shadows are usually detected as part of moving objects. In this paper we present a novel technique based on gradient and colour models for separating chromatic moving cast shadows from detected moving objects. Firstly, both a chromatic invariant colour cone model and an invariant gradient model are built to perform automatic segmentation while detecting potential shadows. In a second step, regions corresponding to potential shadows are grouped by considering “a bluish effect” and an edge partitioning. Lastly, (i) temporal similarities between textures and (ii) spatial similarities between chrominance angle and brightness distortions are analysed for all potential shadow regions in order to finally identify umbra shadows. Unlike other approaches, our method does not make any a-priori assumptions about camera location, surface geometries, surface textures, shapes and types of shadows, objects, and background. Experimental results show the performance and accuracy of our approach in different shadowed materials and illumination conditions.
|
D. Jayagopi, Bogdan Raducanu, & D. Gatica-Perez. (2009). Characterizing conversational group dynamics using nonverbal behaviour. In 10th IEEE International Conference on Multimedia and Expo (370–373).
Abstract: This paper addresses the novel problem of characterizing conversational group dynamics. It is well documented in social psychology that depending on the objectives a group, the dynamics are different. For example, a competitive meeting has a different objective from that of a collaborative meeting. We propose a method to characterize group dynamics based on the joint description of a group members' aggregated acoustical nonverbal behaviour to classify two meeting datasets (one being cooperative-type and the other being competitive-type). We use 4.5 hours of real behavioural multi-party data and show that our methodology can achieve a classification rate of upto 100%.
|
Ricard Coll, Alicia Fornes, & Josep Llados. (2009). Graphological Analysis of Handwritten Text Documents for Human Resources Recruitment. In 10th International Conference on Document Analysis and Recognition (1081–1085).
Abstract: The use of graphology in recruitment processes has become a popular tool in many human resources companies. This paper presents a model that links features from handwritten images to a number of personality characteristics used to measure applicant aptitudes for the job in a particular hiring scenario. In particular we propose a model of measuring active personality and leadership of the writer. Graphological features that define such a profile are measured in terms of document and script attributes like layout configuration, letter size, shape, slant and skew angle of lines, etc. After the extraction, data is classified using a neural network. An experimental framework with real samples has been constructed to illustrate the performance of the approach.
|
Alicia Fornes, Josep Llados, Gemma Sanchez, & Horst Bunke. (2009). Symbol-independent writer identification in old handwritten music scores. In In proceedings of 8th IAPR International Workshop on Graphics Recognition (186–197). Springer Berlin Heidelberg.
|
Alicia Fornes, Josep Llados, Gemma Sanchez, & Horst Bunke. (2009). On the use of textural features for writer identification in old handwritten music scores. In 10th International Conference on Document Analysis and Recognition (pp. 996–1000).
Abstract: Writer identification consists in determining the writer of a piece of handwriting from a set of writers. In this paper we present a system for writer identification in old handwritten music scores which uses only music notation to determine the author. The steps of the proposed system are the following. First of all, the music sheet is preprocessed for obtaining a music score without the staff lines. Afterwards, four different methods for generating texture images from music symbols are applied. Every approach uses a different spatial variation when combining the music symbols to generate the textures. Finally, Gabor filters and Grey-scale Co-ocurrence matrices are used to obtain the features. The classification is performed using a k-NN classifier based on Euclidean distance. The proposed method has been tested on a database of old music scores from the 17th to 19th centuries, achieving encouraging identification rates.
|
Agnes Borras, & Josep Llados. (2009). Corest: A measure of color and space stability to detect salient regions according to human criteria. In 5th International Conference on Computer Vision Theory and Applications (pp. 204–209).
|
Sergio Escalera, R. M. Martinez, Jordi Vitria, Petia Radeva, & Maria Teresa Anguera. (2009). Dominance Detection in Face-to-face Conversations. In 2nd IEEE Workshop on CVPR for Human communicative Behavior analysis (97–102).
Abstract: Dominance is referred to the level of influence a person has in a conversation. Dominance is an important research area in social psychology, but the problem of its automatic estimation is a very recent topic in the contexts of social and wearable computing. In this paper, we focus on dominance detection from visual cues. We estimate the correlation among observers by categorizing the dominant people in a set of face-to-face conversations. Different dominance indicators from gestural communication are defined, manually annotated, and compared to the observers opinion. Moreover, the considered indicators are automatically extracted from video sequences and learnt by using binary classifiers. Results from the three analysis shows a high correlation and allows the categorization of dominant people in public discussion video sequences.
|
Salim Jouili, Salvatore Tabbone, & Ernest Valveny. (2009). Evaluation of graph matching measures for documents retrieval. In In proceedings of 8th IAPR International Workshop on Graphics Recognition (13–21).
Abstract: In this paper we evaluate four graph distance measures. The analysis is performed for document retrieval tasks. For this aim, different kind of documents are used which include line drawings (symbols), ancient documents (ornamental letters), shapes and trademark-logos. The experimental results show that the performance of each grahp distance measure depends on the kind of data and the graph representation technique.
Keywords: Graph Matching; Graph retrieval; structural representation; Performance Evaluation
|
Angel Sappa, & Mohammad Rouhani. (2009). Efficient Distance Estimation for Fitting Implicit Quadric Surfaces. In 16th IEEE International Conference on Image Processing (3521–3524).
Abstract: This paper presents a novel approach for estimating the shortest Euclidean distance from a given point to the corresponding implicit quadric fitting surface. It first estimates the orthogonal orientation to the surface from the given point; then the shortest distance is directly estimated by intersecting the implicit surface with a line passing through the given point according to the estimated orthogonal orientation. The proposed orthogonal distance estimation is easily obtained without increasing computational complexity; hence it can be used in error minimization surface fitting frameworks. Comparisons of the proposed metric with previous approaches are provided to show both improvements in CPU time as well as in the accuracy of the obtained results. Surfaces fitted by using the proposed geometric distance estimation and state of the art metrics are presented to show the viability of the proposed approach.
|
Gemma Roig, Xavier Boix, & Fernando De la Torre. (2009). Optimal Feature Selection for Subspace Image Matching. In 2nd IEEE International Workshop on Subspace Methods in conjunction.
Abstract: Image matching has been a central research topic in computer vision over the last decades. Typical approaches to correspondence involve matching feature points between images. In this paper, we present a novel problem for establishing correspondences between a sparse set of image features and a previously learned subspace model. We formulate the matching task as an energy minimization, and jointly optimize over all possible feature assignments and parameters of the subspace model. This problem is in general NP-hard. We propose a convex relaxation approximation, and develop two optimization strategies: naïve gradient-descent and quadratic programming. Alternatively, we reformulate the optimization criterion as a sparse eigenvalue problem, and solve it using a recently proposed backward greedy algorithm. Experimental results on facial feature detection show that the quadratic programming solution provides better selection mechanism for relevant features.
|