Home | << 1 2 >> |
Records | |||||
---|---|---|---|---|---|
Author | Lorenzo Seidenari; Giuseppe Serra; Andrew Bagdanov; Alberto del Bimbo | ||||
Title | Local pyramidal descriptors for image recognition | Type | Journal Article | ||
Year | 2014 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 36 | Issue | 5 | Pages | 1033 - 1040 |
Keywords | Object categorization; local features; kernel methods | ||||
Abstract | In this paper we present a novel method to improve the flexibility of descriptor matching for image recognition by using local multiresolution
pyramids in feature space. We propose that image patches be represented at multiple levels of descriptor detail and that these levels be defined in terms of local spatial pooling resolution. Preserving multiple levels of detail in local descriptors is a way of hedging one’s bets on which levels will most relevant for matching during learning and recognition. We introduce the Pyramid SIFT (P-SIFT) descriptor and show that its use in four state-of-the-art image recognition pipelines improves accuracy and yields state-of-the-art results. Our technique is applicable independently of spatial pyramid matching and we show that spatial pyramids can be combined with local pyramids to obtain further improvement.We achieve state-of-the-art results on Caltech-101 (80.1%) and Caltech-256 (52.6%) when compared to other approaches based on SIFT features over intensity images. Our technique is efficient and is extremely easy to integrate into image recognition pipelines. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0162-8828 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | LAMP; 600.079 | Approved | no | ||
Call Number | Admin @ si @ SSB2014 | Serial | 2524 | ||
Permanent link to this record | |||||
Author | Oriol Pujol; David Masip | ||||
Title | Geometry-Based Ensembles: Toward a Structural Characterization of the Classification Boundary | Type | Journal Article | ||
Year | 2009 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 31 | Issue | 6 | Pages | 1140–1146 |
Keywords | |||||
Abstract | This article introduces a novel binary discriminative learning technique based on the approximation of the non-linear decision boundary by a piece-wise linear smooth additive model. The decision border is geometrically defined by means of the characterizing boundary points – points that belong to the optimal boundary under a certain notion of robustness. Based on these points, a set of locally robust linear classifiers is defined and assembled by means of a Tikhonov regularized optimization procedure in an additive model to create a final lambda-smooth decision rule. As a result, a very simple and robust classifier with a strong geometrical meaning and non-linear behavior is obtained. The simplicity of the method allows its extension to cope with some of nowadays machine learning challenges, such as online learning, large scale learning or parallelization, with linear computational complexity. We validate our approach on the UCI database. Finally, we apply our technique in online and large scale scenarios, and in six real life computer vision and pattern recognition problems: gender recognition, intravascular ultrasound tissue classification, speed traffic sign detection, Chagas' disease severity detection, clef classification and action recognition using a 3D accelerometer data. The results are promising and this paper opens a line of research that deserves further attention | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | OR;HuPBA;MV | Approved | no | ||
Call Number | BCNPCL @ bcnpcl @ PuM2009 | Serial | 1252 | ||
Permanent link to this record | |||||
Author | Carlo Gatta; Francesco Ciompi | ||||
Title | Stacked Sequential Scale-Space Taylor Context | Type | Journal Article | ||
Year | 2014 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 36 | Issue | 8 | Pages | 1694-1700 |
Keywords | |||||
Abstract | We analyze sequential image labeling methods that sample the posterior label field in order to gather contextual information. We propose an effective method that extracts local Taylor coefficients from the posterior at different scales. Results show that our proposal outperforms state-of-the-art methods on MSRC-21, CAMVID, eTRIMS8 and KAIST2 data sets. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0162-8828 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | LAMP; MILAB; 601.160; 600.079 | Approved | no | ||
Call Number | Admin @ si @ GaC2014 | Serial | 2466 | ||
Permanent link to this record | |||||
Author | G. Lisanti; I. Masi; Andrew Bagdanov; Alberto del Bimbo | ||||
Title | Person Re-identification by Iterative Re-weighted Sparse Ranking | Type | Journal Article | ||
Year | 2015 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 37 | Issue | 8 | Pages | 1629 - 1642 |
Keywords | |||||
Abstract | In this paper we introduce a method for person re-identification based on discriminative, sparse basis expansions of targets in terms of a labeled gallery of known individuals. We propose an iterative extension to sparse discriminative classifiers capable of ranking many candidate targets. The approach makes use of soft- and hard- re-weighting to redistribute energy among the most relevant contributing elements and to ensure that the best candidates are ranked at each iteration. Our approach also leverages a novel visual descriptor which we show to be discriminative while remaining robust to pose and illumination variations. An extensive comparative evaluation is given demonstrating that our approach achieves state-of-the-art performance on single- and multi-shot person re-identification scenarios on the VIPeR, i-LIDS, ETHZ, and CAVIAR4REID datasets. The combination of our descriptor and iterative sparse basis expansion improves state-of-the-art rank-1 performance by six percentage points on VIPeR and by 20 on CAVIAR4REID compared to other methods with a single gallery image per person. With multiple gallery and probe images per person our approach improves by 17 percentage points the state-of-the-art on i-LIDS and by 72 on CAVIAR4REID at rank-1. The approach is also quite efficient, capable of single-shot person re-identification over galleries containing hundreds of individuals at about 30 re-identifications per second. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0162-8828 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | LAMP; 601.240; 600.079 | Approved | no | ||
Call Number | Admin @ si @ LMB2015 | Serial | 2557 | ||
Permanent link to this record | |||||
Author | Adriana Romero; Petia Radeva; Carlo Gatta | ||||
Title | Meta-parameter free unsupervised sparse feature learning | Type | Journal Article | ||
Year | 2015 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 37 | Issue | 8 | Pages | 1716-1722 |
Keywords | |||||
Abstract | We propose a meta-parameter free, off-the-shelf, simple and fast unsupervised feature learning algorithm, which exploits a new way of optimizing for sparsity. Experiments on CIFAR-10, STL- 10 and UCMerced show that the method achieves the state-of-theart performance, providing discriminative features that generalize well. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; 600.068; 600.079; 601.160 | Approved | no | ||
Call Number | Admin @ si @ RRG2014b | Serial | 2594 | ||
Permanent link to this record | |||||
Author | Ciprian Corneanu; Marc Oliu; Jeffrey F. Cohn; Sergio Escalera | ||||
Title | Survey on RGB, 3D, Thermal, and Multimodal Approaches for Facial Expression Recognition: History | Type | Journal Article | ||
Year | 2016 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 28 | Issue | 8 | Pages | 1548-1568 |
Keywords | Facial expression; affect; emotion recognition; RGB; 3D; thermal; multimodal | ||||
Abstract | Facial expressions are an important way through which humans interact socially. Building a system capable of automatically recognizing facial expressions from images and video has been an intense field of study in recent years. Interpreting such expressions remains challenging and much research is needed about the way they relate to human affect. This paper presents a general overview of automatic RGB, 3D, thermal and multimodal facial expression analysis. We define a new taxonomy for the field, encompassing all steps from face detection to facial expression recognition, and describe and classify the state of the art methods accordingly. We also present the important datasets and the bench-marking of most influential methods. We conclude with a general discussion about trends, important questions and future lines of research. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA;MILAB; | Approved | no | ||
Call Number | Admin @ si @ COC2016 | Serial | 2718 | ||
Permanent link to this record | |||||
Author | Xialei Liu; Joost Van de Weijer; Andrew Bagdanov | ||||
Title | Exploiting Unlabeled Data in CNNs by Self-Supervised Learning to Rank | Type | Journal Article | ||
Year | 2019 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 41 | Issue | 8 | Pages | 1862-1878 |
Keywords | Task analysis;Training;Image quality;Visualization;Uncertainty;Labeling;Neural networks;Learning from rankings;image quality assessment;crowd counting;active learning | ||||
Abstract | For many applications the collection of labeled data is expensive laborious. Exploitation of unlabeled data during training is thus a long pursued objective of machine learning. Self-supervised learning addresses this by positing an auxiliary task (different, but related to the supervised task) for which data is abundantly available. In this paper, we show how ranking can be used as a proxy task for some regression problems. As another contribution, we propose an efficient backpropagation technique for Siamese networks which prevents the redundant computation introduced by the multi-branch network architecture. We apply our framework to two regression problems: Image Quality Assessment (IQA) and Crowd Counting. For both we show how to automatically generate ranked image sets from unlabeled data. Our results show that networks trained to regress to the ground truth targets for labeled data and to simultaneously learn to rank unlabeled data obtain significantly better, state-of-the-art results for both IQA and crowd counting. In addition, we show that measuring network uncertainty on the self-supervised proxy task is a good measure of informativeness of unlabeled data. This can be used to drive an algorithm for active learning and we show that this reduces labeling effort by up to 50 percent. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.109; 600.106; 600.120 | Approved | no | ||
Call Number | LWB2019 | Serial | 3267 | ||
Permanent link to this record | |||||
Author | Oriol Ramos Terrades; Ernest Valveny; Salvatore Tabbone | ||||
Title | Optimal Classifier Fusion in a Non-Bayesian Probabilistic Framework | Type | Journal Article | ||
Year | 2009 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 31 | Issue | 9 | Pages | 1630–1644 |
Keywords | |||||
Abstract | The combination of the output of classifiers has been one of the strategies used to improve classification rates in general purpose classification systems. Some of the most common approaches can be explained using the Bayes' formula. In this paper, we tackle the problem of the combination of classifiers using a non-Bayesian probabilistic framework. This approach permits us to derive two linear combination rules that minimize misclassification rates under some constraints on the distribution of classifiers. In order to show the validity of this approach we have compared it with other popular combination rules from a theoretical viewpoint using a synthetic data set, and experimentally using two standard databases: the MNIST handwritten digit database and the GREC symbol database. Results on the synthetic data set show the validity of the theoretical approach. Indeed, results on real data show that the proposed methods outperform other common combination schemes. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0162-8828 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | DAG @ dag @ RVT2009 | Serial | 1220 | ||
Permanent link to this record | |||||
Author | Arash Akbarinia; C. Alejandro Parraga | ||||
Title | Colour Constancy Beyond the Classical Receptive Field | Type | Journal Article | ||
Year | 2018 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 40 | Issue | 9 | Pages | 2081 - 2094 |
Keywords | |||||
Abstract | The problem of removing illuminant variations to preserve the colours of objects (colour constancy) has already been solved by the human brain using mechanisms that rely largely on centre-surround computations of local contrast. In this paper we adopt some of these biological solutions described by long known physiological findings into a simple, fully automatic, functional model (termed Adaptive Surround Modulation or ASM). In ASM, the size of a visual neuron's receptive field (RF) as well as the relationship with its surround varies according to the local contrast within the stimulus, which in turn determines the nature of the centre-surround normalisation of cortical neurons higher up in the processing chain. We modelled colour constancy by means of two overlapping asymmetric Gaussian kernels whose sizes are adapted based on the contrast of the surround pixels, resembling the change of RF size. We simulated the contrast-dependent surround modulation by weighting the contribution of each Gaussian according to the centre-surround contrast. In the end, we obtained an estimation of the illuminant from the set of the most activated RFs' outputs. Our results on three single-illuminant and one multi-illuminant benchmark datasets show that ASM is highly competitive against the state-of-the-art and it even outperforms learning-based algorithms in one case. Moreover, the robustness of our model is more tangible if we consider that our results were obtained using the same parameters for all datasets, that is, mimicking how the human visual system operates. These results might provide an insight on how dynamical adaptation mechanisms contribute to make object's colours appear constant to us. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | NEUROBIT; 600.068; 600.072 | Approved | no | ||
Call Number | Admin @ si @ AkP2018a | Serial | 2990 | ||
Permanent link to this record | |||||
Author | Zhengying Liu; Adrien Pavao; Zhen Xu; Sergio Escalera; Fabio Ferreira; Isabelle Guyon; Sirui Hong; Frank Hutter; Rongrong Ji; Julio C. S. Jacques Junior; Ge Li; Marius Lindauer; Zhipeng Luo; Meysam Madadi; Thomas Nierhoff; Kangning Niu; Chunguang Pan; Danny Stoll; Sebastien Treguer; Jin Wang; Peng Wang; Chenglin Wu; Youcheng Xiong; Arber Zela; Yang Zhang | ||||
Title | Winning Solutions and Post-Challenge Analyses of the ChaLearn AutoDL Challenge 2019 | Type | Journal Article | ||
Year | 2021 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 43 | Issue | 9 | Pages | 3108 - 3125 |
Keywords | |||||
Abstract | This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series, which helped sorting out a profusion of AutoML solutions for Deep Learning (DL) that had been introduced in a variety of settings, but lacked fair comparisons. All input data modalities (time series, images, videos, text, tabular) were formatted as tensors and all tasks were multi-label classification problems. Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly. In this setting, DL methods dominated, though popular Neural Architecture Search (NAS) was impractical. Solutions relied on fine-tuned pre-trained networks, with architectures matching data modality. Post-challenge tests did not reveal improvements beyond the imposed time limit. While no component is particularly original or novel, a high level modular organization emerged featuring a “meta-learner”, “data ingestor”, “model selector”, “model/learner”, and “evaluator”. This modularity enabled ablation studies, which revealed the importance of (off-platform) meta-learning, ensembling, and efficient data management. Experiments on heterogeneous module combinations further confirm the (local) optimality of the winning solutions. Our challenge legacy includes an ever-lasting benchmark (http://autodl.chalearn.org), the open-sourced code of the winners, and a free “AutoDL self-service.” | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ LPX2021 | Serial | 3587 | ||
Permanent link to this record | |||||
Author | Swathikiran Sudhakaran; Sergio Escalera; Oswald Lanz | ||||
Title | Gate-Shift-Fuse for Video Action Recognition | Type | Journal Article | ||
Year | 2023 | Publication | IEEE Transactions on Pattern Analysis and Machine Intelligence | Abbreviated Journal | TPAMI |
Volume | 45 | Issue | 9 | Pages | 10913-10928 |
Keywords | Action Recognition; Video Classification; Spatial Gating; Channel Fusion | ||||
Abstract | Convolutional Neural Networks are the de facto models for image recognition. However 3D CNNs, the straight forward extension of 2D CNNs for video recognition, have not achieved the same success on standard action recognition benchmarks. One of the main reasons for this reduced performance of 3D CNNs is the increased computational complexity requiring large scale annotated datasets to train them in scale. 3D kernel factorization approaches have been proposed to reduce the complexity of 3D CNNs. Existing kernel factorization approaches follow hand-designed and hard-wired techniques. In this paper we propose Gate-Shift-Fuse (GSF), a novel spatio-temporal feature extraction module which controls interactions in spatio-temporal decomposition and learns to adaptively route features through time and combine them in a data dependent manner. GSF leverages grouped spatial gating to decompose input tensor and channel weighting to fuse the decomposed tensors. GSF can be inserted into existing 2D CNNs to convert them into an efficient and high performing spatio-temporal feature extractor, with negligible parameter and compute overhead. We perform an extensive analysis of GSF using two popular 2D CNN families and achieve state-of-the-art or competitive performance on five standard action recognition benchmarks. | ||||
Address | 1 Sept. 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; no menciona | Approved | no | ||
Call Number | Admin @ si @ SEL2023 | Serial | 3814 | ||
Permanent link to this record |