|
Arnau Ramisa, Shrihari Vasudevan, David Aldavert, Ricardo Toledo and Ramon Lopez de Mantaras. 2009. Evaluation of the SIFT Object Recognition Method in Mobile Robots: Frontiers in Artificial Intelligence and Applications. 12th International Conference of the Catalan Association for Artificial Intelligence.9–18.
Abstract: General object recognition in mobile robots is of primary importance in order to enhance the representation of the environment that robots will use for their reasoning processes. Therefore, we contribute reduce this gap by evaluating the SIFT Object Recognition method in a challenging dataset, focusing on issues relevant to mobile robotics. Resistance of the method to the robotics working conditions was found, but it was limited mainly to well-textured objects.
|
|
|
Idoia Ruiz and Joan Serrat. 2020. Rank-based ordinal classification. 25th International Conference on Pattern Recognition.8069–8076.
Abstract: Differently from the regular classification task, in ordinal classification there is an order in the classes. As a consequence not all classification errors matter the same: a predicted class close to the groundtruth one is better than predicting a farther away class. To account for this, most previous works employ loss functions based on the absolute difference between the predicted and groundtruth class labels. We argue that there are many cases in ordinal classification where label values are arbitrary (for instance 1. . . C, being C the number of classes) and thus such loss functions may not be the best choice. We instead propose a network architecture that produces not a single class prediction but an ordered vector, or ranking, of all the possible classes from most to least likely. This is thanks to a loss function that compares groundtruth and predicted rankings of these class labels, not the labels themselves. Another advantage of this new formulation is that we can enforce consistency in the predictions, namely, predicted rankings come from some unimodal vector of scores with mode at the groundtruth class. We compare with the state of the art ordinal classification methods, showing
that ours attains equal or better performance, as measured by common ordinal classification metrics, on three benchmark datasets. Furthermore, it is also suitable for a new task on image aesthetics assessment, i.e. most voted score prediction. Finally, we also apply it to building damage assessment from satellite images, providing an analysis of its performance depending on the degree of imbalance of the dataset.
|
|
|
Arnau Ramisa, Adriana Tapus, Ramon Lopez de Mantaras and Ricardo Toledo. 2008. Mobile Robot Localization using Panoramic Vision and Combination of Feature Region Detectors. IEEE International Conference on Robotics and Automation,.538–543.
|
|
|
German Ros, Angel Sappa, Daniel Ponsa and Antonio Lopez. 2012. Visual SLAM for Driverless Cars: A Brief Survey. IEEE Workshop on Navigation, Perception, Accurate Positioning and Mapping for Intelligent Vehicles.
|
|
|
Jose Carlos Rubio, Joan Serrat and Antonio Lopez. 2012. Video Co-segmentation. 11th Asian Conference on Computer Vision. Springer Berlin Heidelberg, 13–24. (LNCS.)
Abstract: Segmentation of a single image is in general a highly underconstrained problem. A frequent approach to solve it is to somehow provide prior knowledge or constraints on how the objects of interest look like (in terms of their shape, size, color, location or structure). Image co-segmentation trades the need for such knowledge for something much easier to obtain, namely, additional images showing the object from other viewpoints. Now the segmentation problem is posed as one of differentiating the similar object regions in all the images from the more varying background. In this paper, for the first time, we extend this approach to video segmentation: given two or more video sequences showing the same object (or objects belonging to the same class) moving in a similar manner, we aim to outline its region in all the frames. In addition, the method works in an unsupervised manner, by learning to segment at testing time. We compare favorably with two state-of-the-art methods on video segmentation and report results on benchmark videos.
|
|
|
Jose Carlos Rubio, Joan Serrat and Antonio Lopez. 2012. Multiple target tracking and identity linking under split, merge and occlusion of targets and observations. 1st International Conference on Pattern Recognition Applications and Methods.
|
|
|
Jose Carlos Rubio, Joan Serrat and Antonio Lopez. 2012. Unsupervised co-segmentation through region matching. 25th IEEE Conference on Computer Vision and Pattern Recognition. IEEE Xplore, 749–756.
Abstract: Co-segmentation is defined as jointly partitioning multiple images depicting the same or similar object, into foreground and background. Our method consists of a multiple-scale multiple-image generative model, which jointly estimates the foreground and background appearance distributions from several images, in a non-supervised manner. In contrast to other co-segmentation methods, our approach does not require the images to have similar foregrounds and different backgrounds to function properly. Region matching is applied to exploit inter-image information by establishing correspondences between the common objects that appear in the scene. Moreover, computing many-to-many associations of regions allow further applications, like recognition of object parts across images. We report results on iCoseg, a challenging dataset that presents extreme variability in camera viewpoint, illumination and object deformations and poses. We also show that our method is robust against large intra-class variability in the MSRC database.
|
|
|
Jose Carlos Rubio, Joan Serrat, Antonio Lopez and N. Paragios. 2012. Image Contextual Representation and Matching through Hierarchies and Higher Order Graphs. 21st International Conference on Pattern Recognition.2664–2667.
Abstract: We present a region matching algorithm which establishes correspondences between regions from two segmented images. An abstract graph-based representation conceals the image in a hierarchical graph, exploiting the scene properties at two levels. First, the similarity and spatial consistency of the image semantic objects is encoded in a graph of commute times. Second, the cluttered regions of the semantic objects are represented with a shape descriptor. Many-to-many matching of regions is specially challenging due to the instability of the segmentation under slight image changes, and we explicitly handle it through high order potentials. We demonstrate the matching approach applied to images of world famous buildings, captured under different conditions, showing the robustness of our method to large variations in illumination and viewpoint.
|
|
|
Idoia Ruiz, Lorenzo Porzi, Samuel Rota Bulo, Peter Kontschieder and Joan Serrat. 2021. Weakly Supervised Multi-Object Tracking and Segmentation. IEEE Winter Conference on Applications of Computer Vision Workshops.125–133.
Abstract: We introduce the problem of weakly supervised MultiObject Tracking and Segmentation, i.e. joint weakly supervised instance segmentation and multi-object tracking, in which we do not provide any kind of mask annotation.
To address it, we design a novel synergistic training strategy by taking advantage of multi-task learning, i.e. classification and tracking tasks guide the training of the unsupervised instance segmentation. For that purpose, we extract weak foreground localization information, provided by
Grad-CAM heatmaps, to generate a partial ground truth to learn from. Additionally, RGB image level information is employed to refine the mask prediction at the edges of the
objects. We evaluate our method on KITTI MOTS, the most representative benchmark for this task, reducing the performance gap on the MOTSP metric between the fully supervised and weakly supervised approach to just 12% and 12.7 % for cars and pedestrians, respectively.
|
|
|
Mohammad Rouhani and Angel Sappa. 2012. Non-Rigid Shape Registration: A Single Linear Least Squares Framework. 12th European Conference on Computer Vision. Springer Berlin Heidelberg, 264–277. (LNCS.)
Abstract: This paper proposes a non-rigid registration formulation capturing both global and local deformations in a single framework. This formulation is based on a quadratic estimation of the registration distance together with a quadratic regularization term. Hence, the optimal transformation parameters are easily obtained by solving a liner system of equations, which guarantee a fast convergence. Experimental results with challenging 2D and 3D shapes are presented to show the validity of the proposed framework. Furthermore, comparisons with the most relevant approaches are provided.
|
|