|
Mohammad Ali Bagheri, Qigang Gao, Sergio Escalera, Albert Clapes, Kamal Nasrollahi, Michael Holte, et al. (2015). Keep it Accurate and Diverse: Enhancing Action Recognition Performance by Ensemble Learning. In IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (pp. 22–29).
Abstract: The performance of different action recognition techniques has recently been studied by several computer vision researchers. However, the potential improvement in classification through classifier fusion by ensemble-based methods has remained unattended. In this work, we evaluate the performance of an ensemble of action learning techniques, each performing the recognition task from a different perspective.
The underlying idea is that instead of aiming for a very sophisticated and powerful representation/learning technique, we can learn action categories using a set of relatively simple and diverse classifiers, each trained with a different feature set. In addition, combining the outputs of several learners can reduce the risk of an unfortunate selection of a learner on an unseen action recognition scenario.
This leads to having a more robust and general-applicable framework. In order to improve the recognition performance, a powerful combination strategy is utilized based on the Dempster-Shafer theory, which can effectively make use of diversity of base learners trained on different sources of information. The recognition results of the individual classifiers are compared with those obtained from fusing the classifiers’ output, showing enhanced performance of the proposed methodology.
|
|
|
Aitor Alvarez-Gila, Joost Van de Weijer, Yaxing Wang, & Estibaliz Garrote. (2022). MVMO: A Multi-Object Dataset for Wide Baseline Multi-View Semantic Segmentation. In 29th IEEE International Conference on Image Processing.
Abstract: We present MVMO (Multi-View, Multi-Object dataset): a synthetic dataset of 116,000 scenes containing randomly placed objects of 10 distinct classes and captured from 25 camera locations in the upper hemisphere. MVMO comprises photorealistic, path-traced image renders, together with semantic segmentation ground truth for every view. Unlike existing multi-view datasets, MVMO features wide baselines between cameras and high density of objects, which lead to large disparities, heavy occlusions and view-dependent object appearance. Single view semantic segmentation is hindered by self and inter-object occlusions that could benefit from additional viewpoints. Therefore, we expect that MVMO will propel research in multi-view semantic segmentation and cross-view semantic transfer. We also provide baselines that show that new research is needed in such fields to exploit the complementary information of multi-view setups.
Keywords: multi-view; cross-view; semantic segmentation; synthetic dataset
|
|
|
Ahmed M. A. Salih, Ilaria Boscolo Galazzo, Federica Cruciani, Lorenza Brusini, & Petia Radeva. (2022). Investigating Explainable Artificial Intelligence for MRI-based Classification of Dementia: a New Stability Criterion for Explainable Methods. In 29th IEEE International Conference on Image Processing.
Abstract: Individuals diagnosed with Mild Cognitive Impairment (MCI) have shown an increased risk of developing Alzheimer’s Disease (AD). As such, early identification of dementia represents a key prognostic element, though hampered by complex disease patterns. Increasing efforts have focused on Machine Learning (ML) to build accurate classification models relying on a multitude of clinical/imaging variables. However, ML itself does not provide sensible explanations related to the model mechanism and feature contribution. Explainable Artificial Intelligence (XAI) represents the enabling technology in this framework, allowing to understand ML outcomes and derive human-understandable explanations. In this study, we aimed at exploring ML combined with MRI-based features and XAI to solve this classification problem and interpret the outcome. In particular, we propose a new method to assess the robustness of feature rankings provided by XAI methods, especially when multicollinearity exists. Our findings indicate that our method was able to disentangle the list of the informative features underlying dementia, with important implications for aiding personalized monitoring plans.
Keywords: Image processing; Stability criteria; Machine learning; Robustness; Alzheimer's disease; Monitoring
|
|
|
Chengyi Zou, Shuai Wan, Marta Mrak, Marc Gorriz Blanch, Luis Herranz, & Tiannan Ji. (2022). Towards Lightweight Neural Network-based Chroma Intra Prediction for Video Coding. In 29th IEEE International Conference on Image Processing.
Abstract: In video compression the luma channel can be useful for predicting chroma channels (Cb, Cr), as has been demonstrated with the Cross-Component Linear Model (CCLM) used in Versatile Video Coding (VVC) standard. More recently, it has been shown that neural networks can even better capture the relationship among different channels. In this paper, a new attention-based neural network is proposed for cross-component intra prediction. With the goal to simplify neural network design, the new framework consists of four branches: boundary branch and luma branch for extracting features from reference samples, attention branch for fusing the first two branches, and prediction branch for computing the predicted chroma samples. The proposed scheme is integrated into VVC test model together with one additional binary block-level syntax flag which indicates whether a given block makes use of the proposed method. Experimental results demonstrate 0.31%/2.36%/2.00% BD-rate reductions on Y/Cb/Cr components, respectively, on top of the VVC Test Model (VTM) 7.0 which uses CCLM.
Keywords: Video coding; Quantization (signal); Computational modeling; Neural networks; Predictive models; Video compression; Syntactics
|
|
|
Mohammad Rouhani, & Angel Sappa. (2009). A Novel Approach to Geometric Fitting of Implicit Quadrics. In 8th International Conference on Advanced Concepts for Intelligent Vision Systems (Vol. 5807, pp. 121–132). LNCS. Springer Berlin Heidelberg.
Abstract: This paper presents a novel approach for estimating the geometric distance from a given point to the corresponding implicit quadric curve/surface. The proposed estimation is based on the height of a tetrahedron, which is used as a coarse but reliable estimation of the real distance. The estimated distance is then used for finding the best set of quadric parameters, by means of the Levenberg-Marquardt algorithm, which is a common framework in other geometric fitting approaches. Comparisons of the proposed approach with previous ones are provided to show both improvements in CPU time as well as in the accuracy of the obtained results.
|
|
|
Thanh Ha Do, Salvatore Tabbone, & Oriol Ramos Terrades. (2012). Noise suppression over bi-level graphical documents using a sparse representation. In Colloque International Francophone sur l'Écrit et le Document.
|
|
|
J.M. Sanchez, & X. Binefa. (1999). Automatic digital TV commercial recognition.
|
|
|
Felipe Lumbreras, Ramon Baldrich, Maria Vanrell, Joan Serrat, & Juan J. Villanueva. (1999). Multiresolution colour texture representations for tile classification.
|
|
|
Daniel Ponsa, A.F. Sole, Antonio Lopez, Cristina Cañero, Petia Radeva, & Jordi Vitria. (1999). Regularized EM.
|
|
|
David Guillamet, & Jordi Vitria. (1999). Using Eigenspace analysis of color distributions for object recognition.
|
|
|
A. Pujol, Felipe Lumbreras, Javier Varona, & Juan J. Villanueva. (1999). Template matching through invariant eigenspace projection.
|
|
|
Josep Llados, Felipe Lumbreras, & Javier Varona. (1999). A multidocument platform for automatic reading of identity cards.
|
|
|
A.F. Sole, Antonio Lopez, Cristina Cañero, Petia Radeva, & J. Saludes. (1999). Crease enhancement diffusion.
|
|
|
Javier Varona, A. Pujol, & Juan J. Villanueva. (1999). Visual tracking in application domains.
|
|
|
Xavier Roca, Jordi Vitria, Maria Vanrell, & Juan J. Villanueva. (1999). Visual behaviours for binocular navigation with autonomous systems.
|
|