|
Shiqi Yang, Kai Wang, Luis Herranz, & Joost Van de Weijer. (2021). On Implicit Attribute Localization for Generalized Zero-Shot Learning. IEEE Signal Processing Letters, 28, 872–876.
Abstract: Zero-shot learning (ZSL) aims to discriminate images from unseen classes by exploiting relations to seen classes via their attribute-based descriptions. Since attributes are often related to specific parts of objects, many recent works focus on discovering discriminative regions. However, these methods usually require additional complex part detection modules or attention mechanisms. In this paper, 1) we show that common ZSL backbones (without explicit attention nor part detection) can implicitly localize attributes, yet this property is not exploited. 2) Exploiting it, we then propose SELAR, a simple method that further encourages attribute localization, surprisingly achieving very competitive generalized ZSL (GZSL) performance when compared with more complex state-of-the-art methods. Our findings provide useful insight for designing future GZSL methods, and SELAR provides an easy to implement yet strong baseline.
|
|
|
Nataliya Shapovalova. (2010). On Importance of Interaction and Context (Vol. 155). Master's thesis.
|
|
|
Nataliya Shapovalova, Wenjuan Gong, Marco Pedersoli, Xavier Roca, & Jordi Gonzalez. (2011). On Importance of Interactions and Context in Human Action Recognition. In J. Vitria, J. M. S., & M. Hernandez (Eds.), 5th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 6669, pp. 58–66). LNCS. Springer Berlin Heidelberg.
Abstract: This paper is focused on the automatic recognition of human events in static images. Popular techniques use knowledge of the human pose for inferring the action, and the most recent approaches tend to combine pose information with either knowledge of the scene or of the objects with which the human interacts. Our approach makes a step forward in this direction by combining the human pose with the scene in which the human is placed, together with the spatial relationships between humans and objects. Based on standard, simple descriptors like HOG and SIFT, recognition performance is enhanced when these three types of knowledge are taken into account. Results obtained in the PASCAL 2010 Action Recognition Dataset demonstrate that our technique reaches state-of-the-art results using simple descriptors and classifiers.
|
|
|
Francesco Ciompi, Oriol Pujol, Oriol Rodriguez-Leor, Angel Serrano, J. Mauri, & Petia Radeva. (2009). On in-vitro and in-vivo IVUS data fusion. In 12th International Conference of the Catalan Association for Artificial Intelligence (Vol. 202, pp. 147–156).
Abstract: The design and the validation of an automatic plaque characterization technique based on Intravascular Ultrasound (IVUS) usually requires a data ground-truth. The histological analysis of post-mortem coronary arteries is commonly assumed as the state-of-the-art process for the extraction of a reliable data-set of atherosclerotic plaques. Unfortunately, the amount of data provided by this technique is usually small, due to the difficulties in collecting post-mortem cases and phenomena of tissue spoiling during histological analysis. In this paper we tackle the process of fusing in-vivo and in-vitro IVUS data starting with the analysis of recently proposed approaches for the creation of an enhanced IVUS data-set; furthermore, we propose a new approach, named pLDS, based on semi-supervised learning with a data selection criterion. The enhanced data-set obtained by each one of the analyzed approaches is used to train a classifier for tissue characterization purposes. Finally, the discriminative power of each classifier is quantitatively assessed and compared by classifying a data-set of validated in-vitro IVUS data.
|
|
|
David Fernandez, Josep Llados, Alicia Fornes, & R. Manmatha. (2012). On Influence of Line Segmentation in Efficient Word Segmentation in Old Manuscripts. In 13th International Conference on Frontiers in Handwriting Recognition (pp. 763–768).
Abstract: The objective of this work is to show the importance of a good line segmentation to obtain better results in the segmentation of words of historical documents. We have used the approach developed by Manmatha and Rothfeder [1] to segment words in old handwritten documents. In their work the lines of the documents are extracted using projections. In this work, we have developed an approach to segment lines more efficiently. The new line segmentation algorithm handles skewed, touching and noisy lines, so it significantly improves word segmentation. Experiments using Spanish documents from the Marriages Database of the Barcelona Cathedral show that this approach reduces the error rate by more than 20%.
Keywords: document image processing;handwritten character recognition;history;image segmentation;Spanish document;historical document;line segmentation;old handwritten document;old manuscript;word segmentation;Bifurcation;Dynamic programming;Handwriting recognition;Image segmentation;Measurement;Noise;Skeleton;Segmentation;document analysis;document and text processing;handwriting analysis;heuristics;path-finding
|
|
|
Felipe Codevilla, Antonio Lopez, Vladlen Koltun, & Alexey Dosovitskiy. (2018). On Offline Evaluation of Vision-based Driving Models. In 15th European Conference on Computer Vision (Vol. 11219, pp. 246–262). LNCS.
Abstract: Autonomous driving models should ideally be evaluated by deploying them on a fleet of physical vehicles in the real world. Unfortunately, this approach is not practical for the vast majority of researchers. An attractive alternative is to evaluate models offline, on a pre-collected validation dataset with ground truth annotation. In this paper, we investigate the relation between various online and offline metrics for evaluation of autonomous driving models. We find that offline prediction error is not necessarily correlated with driving quality, and two models with identical prediction error can differ dramatically in their driving performance. We show that the correlation of offline evaluation with driving quality can be significantly improved by selecting an appropriate validation dataset and suitable offline metrics.
Keywords: Autonomous driving; deep learning
|
|
|
Murad Al Haj, Jordi Gonzalez, & Larry S. Davis. (2012). On Partial Least Squares in Head Pose Estimation: How to simultaneously deal with misalignment. In 25th IEEE Conference on Computer Vision and Pattern Recognition (pp. 2602–2609). IEEE Xplore.
Abstract: Head pose estimation is a critical problem in many computer vision applications. These include human computer interaction, video surveillance, face and expression recognition. In most prior work on head pose estimation, the positions of the faces on which the pose is to be estimated are specified manually. Therefore, the results are reported without studying the effect of misalignment. We propose a method based on partial least squares (PLS) regression to estimate pose and solve the alignment problem simultaneously. The contributions of this paper are two-fold: 1) we show that the kernel version of PLS (kPLS) achieves better than state-of-the-art results on the estimation problem and 2) we develop a technique to reduce misalignment based on the learned PLS factors.
|
|
|
Dani Rowe, Jordi Gonzalez, Ivan Huerta, & Juan J. Villanueva. (2007). On Reasoning over Tracking Events. In 15th Scandinavian Conference on Image Analysis (Vol. 4522, pp. 502–511). LNCS.
|
|
|
Joan Serrat, Antonio Lopez, & David Lloret. (2000). On ridges and valleys. In 15th International Conference on Pattern Recognition (Vol. 4, pp. 59–66).
|
|
|
Anna Salvatella. (2001). On texture description.
|
|
|
Oriol Pujol, & Petia Radeva. (2005). On the assessment of texture descriptors in intravascular ultrasound images: A boosting approach to a feasible plaque classification. In Plaque Imaging: Pixel to Molecular Level, IOS Press, J. Suri et al. (Eds.), 113: 276–299, ISBN: 1–58603–516–9.
|
|
|
Oriol Ramos Terrades, Ernest Valveny, & Salvatore Tabbone. (2007). On the Combination of Ridgelets Descriptors for Symbol Recognition. In Seventh IAPR International Workshop on Graphics Recognition (pp. 18–20).
|
|
|
Oriol Ramos Terrades, Ernest Valveny, & Salvatore Tabbone. (2008). On the Combination of Ridgelets Descriptors for Symbol Recognition. In Graphics Recognition: Recent Advances and New Opportunities, W. Liu, J. Llados, J.M. Ogier, LNCS 5046:104–113.
|
|
|
Pedro Martins, Paulo Carvalho, & Carlo Gatta. (2016). On the completeness of feature-driven maximally stable extremal regions. PRL - Pattern Recognition Letters, 74, 9–16.
Abstract: By definition, local image features provide a compact representation of the image in which most of the image information is preserved. This capability offered by local features has been overlooked, despite being relevant in many application scenarios. In this paper, we analyze and discuss the performance of feature-driven Maximally Stable Extremal Regions (MSER) in terms of the coverage of informative image parts (completeness). This type of feature results from an MSER extraction on saliency maps in which features related to object boundaries or even symmetry axes are highlighted. These maps are intended to be suitable domains for MSER detection, allowing this detector to provide a better coverage of informative image parts. Our experimental results, which were based on a large-scale evaluation, show that feature-driven MSER have relatively high completeness values and provide more complete sets than a traditional MSER detection even when sets of similar cardinality are considered.
Keywords: Local features; Completeness; Maximally Stable Extremal Regions
|
|
|
Jaume Gibert, Ernest Valveny, Horst Bunke, & Alicia Fornes. (2012). On the Correlation of Graph Edit Distance and L1 Distance in the Attribute Statistics Embedding Space. In Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop (Vol. 7626, pp. 135–143). LNCS. Springer-Verlag, Berlin.
Abstract: Graph embeddings in vector spaces aim at assigning a pattern vector to every graph so that the problems of graph classification and clustering can be solved by using data processing algorithms originally developed for statistical feature vectors. An important requirement graph features should fulfil is that they reproduce as much as possible the properties among objects in the graph domain. In particular, it is usually desired that distances between pairs of graphs in the graph domain closely resemble those between their corresponding vectorial representations. In this work, we analyse relations between the edit distance in the graph domain and the L1 distance of the attribute statistics based embedding, for which good classification performance has been reported on various datasets. We show that there is actually a high correlation between the two kinds of distances provided that the corresponding parameter values that account for balancing the weight between node and edge based features are properly selected.
|
|