|
J. Mauri and 14 others. 2000. Avaluació del Conjunt Stent/Artèria mitjançant ecografia intracoronària: lentorn informàtic. Congrés de la Societat Catalana de Cardiologia..
|
|
|
Ariel Amato, Felipe Lumbreras and Angel Sappa. 2014. A General-purpose Crowdsourcing Platform for Mobile Devices. 9th International Conference on Computer Vision Theory and Applications.211–215.
Abstract: This paper presents details of a general purpose micro-task on-demand platform based on the crowdsourcing philosophy. This platform was specifically developed for mobile devices in order to exploit the strengths of such devices; namely: i) massivity, ii) ubiquity and iii) embedded sensors. The combined use of mobile platforms and the crowdsourcing model allows to tackle from the simplest to the most complex tasks. Users experience is the highlighted feature of this platform (this fact is extended to both task-proposer and tasksolver). Proper tools according with a specific task are provided to a task-solver in order to perform his/her job in a simpler, faster and appealing way. Moreover, a task can be easily submitted by just selecting predefined templates, which cover a wide range of possible applications. Examples of its usage in computer vision and computer games are provided illustrating the potentiality of the platform.
Keywords: Crowdsourcing Platform; Mobile Crowdsourcing
|
|
|
Fahad Shahbaz Khan, Muhammad Anwer Rao, Joost Van de Weijer, Michael Felsberg and J.Laaksonen. 2015. Deep semantic pyramids for human attributes and action recognition. Image Analysis, Proceedings of 19th Scandinavian Conference , SCIA 2015. Springer International Publishing, 341–353.
Abstract: Describing persons and their actions is a challenging problem due to variations in pose, scale and viewpoint in real-world images. Recently, semantic pyramids approach [1] for pose normalization has shown to provide excellent results for gender and action recognition. The performance of semantic pyramids approach relies on robust image description and is therefore limited due to the use of shallow local features. In the context of object recognition [2] and object detection [3], convolutional neural networks (CNNs) or deep features have shown to improve the performance over the conventional shallow features.
We propose deep semantic pyramids for human attributes and action recognition. The method works by constructing spatial pyramids based on CNNs of different part locations. These pyramids are then combined to obtain a single semantic representation. We validate our approach on the Berkeley and 27 Human Attributes datasets for attributes classification. For action recognition, we perform experiments on two challenging datasets: Willow and PASCAL VOC 2010. The proposed deep semantic pyramids provide a significant gain of 17.2%, 13.9%, 24.3% and 22.6% compared to the standard shallow semantic pyramids on Berkeley, 27 Human Attributes, Willow and PASCAL VOC 2010 datasets respectively. Our results also show that deep semantic pyramids outperform conventional CNNs based on the full bounding box of the person. Finally, we compare our approach with state-of-the-art methods and show a gain in performance compared to best methods in literature.
Keywords: Action recognition; Human attributes; Semantic pyramids
|
|
|
Muhammad Anwer Rao, Fahad Shahbaz Khan, Joost Van de Weijer and Jorma Laaksonen. 2016. Combining Holistic and Part-based Deep Representations for Computational Painting Categorization. 6th International Conference on Multimedia Retrieval.
Abstract: Automatic analysis of visual art, such as paintings, is a challenging inter-disciplinary research problem. Conventional approaches only rely on global scene characteristics by encoding holistic information for computational painting categorization.We argue that such approaches are sub-optimal and that discriminative common visual structures provide complementary information for painting classification. We present an approach that encodes both the global scene layout and discriminative latent common structures for computational painting categorization. The region of interests are automatically extracted, without any manual part labeling, by training class-specific deformable part-based models. Both holistic and region-of-interests are then described using multi-scale dense convolutional features. These features are pooled separately using Fisher vector encoding and concatenated afterwards in a single image representation. Experiments are performed on a challenging dataset with 91 different painters and 13 diverse painting styles. Our approach outperforms the standard method, which only employs the global scene characteristics. Furthermore, our method achieves state-of-the-art results outperforming a recent multi-scale deep features based approach [11] by 6.4% and 3.8% respectively on artist and style classification.
|
|
|
Guim Perarnau, Joost Van de Weijer, Bogdan Raducanu and Jose Manuel Alvarez. 2016. Invertible conditional gans for image editing. 30th Annual Conference on Neural Information Processing Systems Worshops.
Abstract: Generative Adversarial Networks (GANs) have recently demonstrated to successfully approximate complex data distributions. A relevant extension of this model is conditional GANs (cGANs), where the introduction of external information allows to determine specific representations of the generated images. In this work, we evaluate encoders to inverse the mapping of a cGAN, i.e., mapping a real image into a latent space and a conditional representation. This allows, for example, to reconstruct and modify real images of faces conditioning on arbitrary attributes.
Additionally, we evaluate the design of cGANs. The combination of an encoder
with a cGAN, which we call Invertible cGAN (IcGAN), enables to re-generate real
images with deterministic complex modifications.
|
|
|
Javad Zolfaghari Bengar and 7 others. 2019. Temporal Coherence for Active Learning in Videos. IEEE International Conference on Computer Vision Workshops.914–923.
Abstract: Autonomous driving systems require huge amounts of data to train. Manual annotation of this data is time-consuming and prohibitively expensive since it involves human resources. Therefore, active learning emerged as an alternative to ease this effort and to make data annotation more manageable. In this paper, we introduce a novel active learning approach for object detection in videos by exploiting temporal coherence. Our active learning criterion is based on the estimated number of errors in terms of false positives and false negatives. The detections obtained by the object detector are used to define the nodes of a graph and tracked forward and backward to temporally link the nodes. Minimizing an energy function defined on this graphical model provides estimates of both false positives and false negatives. Additionally, we introduce a synthetic video dataset, called SYNTHIA-AL, specially designed to evaluate active learning for video object detection in road scenes. Finally, we show that our approach outperforms active learning baselines tested on two datasets.
|
|
|
Marc Masana, Idoia Ruiz, Joan Serrat, Joost Van de Weijer and Antonio Lopez. 2018. Metric Learning for Novelty and Anomaly Detection. 29th British Machine Vision Conference.
Abstract: When neural networks process images which do not resemble the distribution seen during training, so called out-of-distribution images, they often make wrong predictions, and do so too confidently. The capability to detect out-of-distribution images is therefore crucial for many real-world applications. We divide out-of-distribution detection between novelty detection ---images of classes which are not in the training set but are related to those---, and anomaly detection ---images with classes which are unrelated to the training set. By related we mean they contain the same type of objects, like digits in MNIST and SVHN. Most existing work has focused on anomaly detection, and has addressed this problem considering networks trained with the cross-entropy loss. Differently from them, we propose to use metric learning which does not have the drawback of the softmax layer (inherent to cross-entropy methods), which forces the network to divide its prediction power over the learned classes. We perform extensive experiments and evaluate both novelty and anomaly detection, even in a relevant application such as traffic sign recognition, obtaining comparable or better results than previous works.
|
|
|
Xialei Liu, Marc Masana, Luis Herranz, Joost Van de Weijer, Antonio Lopez and Andrew Bagdanov. 2018. Rotate your Networks: Better Weight Consolidation and Less Catastrophic Forgetting. 24th International Conference on Pattern Recognition.2262–2268.
Abstract: In this paper we propose an approach to avoiding catastrophic forgetting in sequential task learning scenarios. Our technique is based on a network reparameterization that approximately diagonalizes the Fisher Information Matrix of the network parameters. This reparameterization takes the form of
a factorized rotation of parameter space which, when used in conjunction with Elastic Weight Consolidation (which assumes a diagonal Fisher Information Matrix), leads to significantly better performance on lifelong learning of sequential tasks. Experimental results on the MNIST, CIFAR-100, CUB-200 and
Stanford-40 datasets demonstrate that we significantly improve the results of standard elastic weight consolidation, and that we obtain competitive results when compared to the state-of-the-art in lifelong learning without forgetting.
|
|
|
Simon Jégou, Michal Drozdzal, David Vazquez, Adriana Romero and Yoshua Bengio. 2017. The One Hundred Layers Tiramisu: Fully Convolutional DenseNets for Semantic Segmentation. IEEE Conference on Computer Vision and Pattern Recognition Workshops.
Abstract: State-of-the-art approaches for semantic image segmentation are built on Convolutional Neural Networks (CNNs). The typical segmentation architecture is composed of (a) a downsampling path responsible for extracting coarse semantic features, followed by (b) an upsampling path trained to recover the input image resolution at the output of the model and, optionally, (c) a post-processing module (e.g. Conditional Random Fields) to refine the model predictions.
Recently, a new CNN architecture, Densely Connected Convolutional Networks (DenseNets), has shown excellent results on image classification tasks. The idea of DenseNets is based on the observation that if each layer is directly connected to every other layer in a feed-forward fashion then the network will be more accurate and easier to train.
In this paper, we extend DenseNets to deal with the problem of semantic segmentation. We achieve state-of-the-art results on urban scene benchmark datasets such as CamVid and Gatech, without any further post-processing module nor pretraining. Moreover, due to smart construction of the model, our approach has much less parameters than currently published best entries for these datasets.
Keywords: Semantic Segmentation
|
|
|
Cristina Cañero and 16 others. 1999. Optimal Stent Implantation: Three-dimensional Evaluation of the Mutual Position of Stent and Vessel via Intracoronary Ecography. Proceedings of International Conference on Computer in Cardiology (CIC´99).
Abstract: We present a new automatic technique to visualize and quantify the mutual position between the stent and the vessel wall by considering their three-dimensional reconstruction. Two deformable generalized cylinders adapt to the image features in all IVUS planes corresponding to the vessel wall and the stent in order to reconstruct the boundaries of the stent and the vessel in space. The image features that characterize the stent and the vessel wall are determined in terms of edge and ridge image detectors taking into account the gray level of the image pixels. We show that the 30 reconstruction by deformable cylinders is accurate and robust due to the spatial data coherence in the considered volumetric IVUS image. The main clinic utility of the stent and vessel reconstruction by deformable’ cylinders consists of its possibility to visualize and to assess the optimal stent introduction.
|
|