Records
Author Marc Masana; Xialei Liu; Bartlomiej Twardowski; Mikel Menta; Andrew Bagdanov; Joost Van de Weijer
Title Class-incremental learning: survey and performance evaluation Type Journal Article
Year 2022 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume Issue Pages
Keywords
Abstract For future learning systems, incremental learning is desirable because it allows for: efficient resource usage by eliminating the need to retrain from scratch at the arrival of new data; reduced memory usage by preventing or limiting the amount of data required to be stored -- also important when privacy limitations are imposed; and learning that more closely resembles human learning. The main challenge for incremental learning is catastrophic forgetting, which refers to the precipitous drop in performance on previously learned tasks after learning a new one. Incremental learning of deep neural networks has seen explosive growth in recent years. Initial work focused on task-incremental learning, where a task-ID is provided at inference time. Recently, we have seen a shift towards class-incremental learning, where the learner must classify at inference time between all classes seen in previous tasks without recourse to a task-ID. In this paper, we provide a complete survey of existing methods for incremental learning, and in particular we perform an extensive experimental evaluation on twelve class-incremental methods. We consider several new experimental scenarios, including a comparison of class-incremental methods on multiple large-scale datasets, an investigation into small and large domain shifts, and a comparison on various network architectures.
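The distinction between the two regimes surveyed above is mechanical enough to show in code. A minimal sketch in Python (toy logits and task groupings are assumptions, not from the paper): task-incremental inference restricts the argmax to the classes of the given task, while class-incremental inference must pick among all classes seen so far.

import numpy as np

def task_il_predict(logits, task_classes, task_id):
    # Task-incremental: the task-ID restricts the candidate classes.
    classes = task_classes[task_id]
    return int(classes[np.argmax(logits[classes])])

def class_il_predict(logits):
    # Class-incremental: all classes seen so far compete.
    return int(np.argmax(logits))

# Toy example: two tasks of two classes each (hypothetical numbers).
task_classes = {0: np.array([0, 1]), 1: np.array([2, 3])}
logits = np.array([0.1, 0.4, 0.3, 0.2])
print(task_il_predict(logits, task_classes, task_id=1))  # restricted to {2, 3} -> predicts 2
print(class_il_predict(logits))                          # unrestricted -> predicts 1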
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ MLT2022 Serial 3538
Permanent link to this record
 

 
Author Marc Masana; Tinne Tuytelaars; Joost Van de Weijer
Title Ternary Feature Masks: zero-forgetting for task-incremental learning Type Conference Article
Year 2021 Publication 34th IEEE Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 3565-3574
Keywords
Abstract We propose an approach to continual learning without any forgetting for the task-aware regime, where the task label is known at inference. By using ternary masks we can upgrade a model to new tasks, reusing knowledge from previous tasks while not forgetting anything about them. Using masks prevents both catastrophic forgetting and backward transfer. We argue -- and show experimentally -- that avoiding the former largely compensates for the lack of the latter, which is rarely observed in practice. In contrast to earlier works, our masks are applied to the features (activations) of each layer instead of the weights. This considerably reduces the number of mask parameters for each new task, by more than three orders of magnitude for most networks. The encoding of the ternary masks into two bits per feature adds very little overhead to the network, avoiding scalability issues. To allow already learned features to adapt to the current task without changing the behavior of these features for previous tasks, we introduce task-specific feature normalization. Extensive experiments on several fine-grained datasets and ImageNet show that our method outperforms the current state of the art while reducing memory overhead in comparison to weight-based approaches.
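A minimal sketch in Python of the feature-masking idea, as an illustration only: the mask values and the exact semantics of the three states are assumptions, and the task-specific normalization of the paper is reduced here to a per-feature rescaling.

import numpy as np

OFF, REUSE, ADAPT = 0, 1, 2  # ternary states, storable in two bits per feature

def masked_layer(x, W, b, mask, adapt_scale):
    # Linear layer whose output features are gated by a per-feature ternary mask.
    h = x @ W + b
    out = np.zeros_like(h)
    out[:, mask == REUSE] = h[:, mask == REUSE]                               # reuse frozen features as-is
    out[:, mask == ADAPT] = h[:, mask == ADAPT] * adapt_scale[mask == ADAPT]  # task-specific rescaling
    return out                                                                # OFF features stay at zero

# Toy usage: 4 output features, one mask per task.
rng = np.random.default_rng(0)
x, W, b = rng.normal(size=(2, 3)), rng.normal(size=(3, 4)), np.zeros(4)
mask_task1 = np.array([REUSE, ADAPT, OFF, REUSE])
print(masked_layer(x, W, b, mask_task1, adapt_scale=np.ones(4) * 1.5).shape)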
Address Virtual; June 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ MTW2021 Serial 3565
Permanent link to this record
 

 
Author Marc Masana; Joost Van de Weijer; Luis Herranz; Andrew Bagdanov; Jose Manuel Alvarez
Title Domain-adaptive deep network compression Type Conference Article
Year 2017 Publication 17th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Deep Neural Networks trained on large datasets can be easily transferred to new domains with far fewer labeled examples by a process called fine-tuning. This has the advantage that representations learned in the large source domain can be exploited on smaller target domains. However, networks designed to be optimal for the source task are often prohibitively large for the target task. In this work we address the compression of networks after domain transfer.
We focus on compression algorithms based on low-rank matrix decomposition. Existing methods base compression solely on learned network weights and ignore the statistics of network activations. We show that domain transfer leads to large shifts in network activations and that it is desirable to take this into account when compressing.
We demonstrate that considering activation statistics when compressing weights leads to a rank-constrained regression problem with a closed-form solution. Because our method takes into account the target domain, it can remove redundancy in the weights more effectively. Experiments show that our Domain Adaptive Low Rank (DALR) method significantly outperforms existing low-rank compression techniques. With our approach, the fc6 layer of VGG19 can be compressed more than 4x more than with truncated SVD alone, with only a minor or no loss in accuracy. When applied to domain-transferred networks it allows for compression down to only 5-20% of the original number of parameters with only a minor drop in performance.
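A minimal sketch in Python of the general idea of activation-aware low-rank compression, under the assumption that the activation matrix has full column rank; it contrasts weight-only truncated SVD with fitting a rank-k factorization to the layer's responses on stand-in target-domain activations. This illustrates the rank-constrained regression viewpoint, not the paper's exact DALR derivation.

import numpy as np

def truncated_svd_compress(W, k):
    # Plain weight-only compression: rank-k truncated SVD of W.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

def activation_aware_compress(X, W, k):
    # Rank-constrained regression: minimize ||X W - X W_hat|| with rank(W_hat) <= k.
    Y = X @ W                                   # responses on target-domain activations
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    Yk = (U[:, :k] * s[:k]) @ Vt[:k]            # best rank-k approximation of the responses
    W_hat, *_ = np.linalg.lstsq(X, Yk, rcond=None)
    return W_hat

rng = np.random.default_rng(0)
X, W, k = rng.normal(size=(500, 64)), rng.normal(size=(64, 32)), 8
for W_hat in (truncated_svd_compress(W, k), activation_aware_compress(X, W, k)):
    print(np.linalg.norm(X @ W - X @ W_hat))    # the activation-aware fit gives the smaller error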
Address Venice; Italy; October 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV
Notes LAMP; 601.305; 600.106; 600.120 Approved no
Call Number Admin @ si @ Serial 3034
Permanent link to this record
 

 
Author Marc Masana; Joost Van de Weijer; Andrew Bagdanov
Title On-the-fly Network pruning for object detection Type Conference Article
Year 2016 Publication International conference on learning representations Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Object detection with deep neural networks is often performed by passing a few thousand candidate bounding boxes through a deep neural network for each image. These bounding boxes are highly correlated since they originate from the same image. In this paper we investigate how to exploit feature occurrence at the image scale to prune the neural network which is subsequently applied to all bounding boxes. We show that removing units which have near-zero activation in the image allows us to significantly reduce the number of parameters in the network. Results on the PASCAL 2007 Object Detection Challenge demonstrate that up to 40% of units in some fully-connected layers can be entirely eliminated with little change in the detection result.
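A minimal sketch in Python of the pruning criterion described above (toy layer sizes and threshold are assumptions, not the authors' implementation): units of a fully connected layer whose image-level activation is near zero are dropped before the per-box passes. In a full network the corresponding input rows of the following layer would be dropped as well.

import numpy as np

def prune_fc_layer(W, b, image_activation, threshold=1e-3):
    # Keep only output units whose image-level activation exceeds the threshold.
    keep = np.abs(image_activation) > threshold
    return W[:, keep], b[keep], keep

rng = np.random.default_rng(0)
W, b = rng.normal(size=(1024, 1024)), rng.normal(size=1024)
image_act = np.maximum(rng.normal(size=1024), 0)    # ReLU output computed once on the whole image
W_small, b_small, keep = prune_fc_layer(W, b, image_act)
print(f"kept {keep.sum()} of {keep.size} units")    # the candidate boxes now pass through the smaller layer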
Address Puerto Rico; May 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICLR
Notes LAMP; 600.068; 600.106; 600.079 Approved no
Call Number Admin @ si @ MWB2016 Serial 2758
Permanent link to this record
 

 
Author Marc Masana; Idoia Ruiz; Joan Serrat; Joost Van de Weijer; Antonio Lopez
Title Metric Learning for Novelty and Anomaly Detection Type Conference Article
Year 2018 Publication 29th British Machine Vision Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract When neural networks process images which do not resemble the distribution seen during training, so-called out-of-distribution images, they often make wrong predictions, and do so too confidently. The capability to detect out-of-distribution images is therefore crucial for many real-world applications. We divide out-of-distribution detection between novelty detection (images of classes which are not in the training set but are related to those) and anomaly detection (images of classes which are unrelated to the training set). By related we mean they contain the same type of objects, like digits in MNIST and SVHN. Most existing work has focused on anomaly detection, and has addressed this problem considering networks trained with the cross-entropy loss. Differently from them, we propose to use metric learning, which does not have the drawback of the softmax layer (inherent to cross-entropy methods), which forces the network to divide its prediction power over the learned classes. We perform extensive experiments and evaluate both novelty and anomaly detection, even in a relevant application such as traffic sign recognition, obtaining comparable or better results than previous works.
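A minimal sketch in Python of how detection can work once an embedding has been learned (the embedding itself, the centroid scoring and the threshold are assumptions; this illustrates the general scheme, not the paper's model): a sample is scored by its distance to the nearest class centroid in the embedding space.

import numpy as np

def class_centroids(embeddings, labels):
    # One centroid per in-distribution class.
    return {c: embeddings[labels == c].mean(axis=0) for c in np.unique(labels)}

def ood_score(z, centroids):
    # Distance to the closest class centroid; a large distance suggests out-of-distribution.
    return min(np.linalg.norm(z - mu) for mu in centroids.values())

rng = np.random.default_rng(0)
train_z = np.concatenate([rng.normal(0, 1, (100, 16)), rng.normal(5, 1, (100, 16))])
train_y = np.array([0] * 100 + [1] * 100)
centroids = class_centroids(train_z, train_y)
in_dist, out_dist = rng.normal(0, 1, 16), rng.normal(20, 1, 16)
print(ood_score(in_dist, centroids) < ood_score(out_dist, centroids))  # True: the OOD sample scores higher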
Address Newcastle; UK; September 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference BMVC
Notes LAMP; ADAS; 601.305; 600.124; 600.106; 602.200; 600.120; 600.118 Approved no
Call Number Admin @ si @ MRS2018 Serial 3156
Permanent link to this record
 

 
Author Marc Masana; Bartlomiej Twardowski; Joost Van de Weijer
Title On Class Orderings for Incremental Learning Type Conference Article
Year 2020 Publication ICML Workshop on Continual Learning Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The influence of class orderings in the evaluation of incremental learning has received very little attention. In this paper, we investigate the impact of class orderings for incrementally learned classifiers. We propose a method to compute various orderings for a dataset. The orderings are derived by simulated annealing optimization from the confusion matrix and reflect different incremental learning scenarios, including maximally and minimally confusing tasks. We evaluate a wide range of state-of-the-art incremental learning methods on the proposed orderings. Results show that orderings can have a significant impact on performance and the ranking of the methods.
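A minimal sketch in Python of the kind of search described above, with a hypothetical cost (total confusion between classes placed in the same task) and annealing schedule; the paper's exact objective and schedule may differ. Minimizing the cost yields a minimally confusing set of tasks; maximizing it would give the opposite scenario.

import numpy as np

def ordering_cost(order, confusion, task_size):
    # Sum of pairwise confusion inside each task of the given ordering.
    cost = 0.0
    for t in range(0, len(order), task_size):
        task = order[t:t + task_size]
        cost += confusion[np.ix_(task, task)].sum()
    return cost

def anneal(confusion, task_size, steps=5000, t0=1.0, rng=np.random.default_rng(0)):
    order = rng.permutation(confusion.shape[0])
    cost = ordering_cost(order, confusion, task_size)
    for step in range(steps):
        i, j = rng.integers(len(order), size=2)
        cand = order.copy()
        cand[i], cand[j] = cand[j], cand[i]                           # swap two classes
        c = ordering_cost(cand, confusion, task_size)
        temp = t0 * (1 - step / steps) + 1e-8
        if c < cost or rng.random() < np.exp((cost - c) / temp):      # accept improvements, sometimes worse moves
            order, cost = cand, c
    return order, cost

conf = np.abs(np.random.default_rng(1).normal(size=(10, 10)))         # stand-in confusion matrix
print(anneal(conf, task_size=2))                                      # a minimally confusing class ordering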
Address Virtual; July 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICMLW
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ MTW2020 Serial 3505
Permanent link to this record
 

 
Author Marc Masana
Title Lifelong Learning of Neural Networks: Detecting Novelty and Adapting to New Domains without Forgetting Type Book Whole
Year 2020 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Computer vision has gone through considerable changes in the last decade as neural networks have come into common use. As available computational capabilities have grown, neural networks have achieved breakthroughs in many computer vision tasks, and have even surpassed human performance in others. With accuracy being so high, focus has shifted to other issues and challenges. One research direction that saw a notable increase in interest is lifelong learning systems. Such systems should be capable of efficiently performing tasks, identifying and learning new ones, and should moreover be able to deploy smaller versions of themselves which are experts on specific tasks. In this thesis, we contribute to research on lifelong learning and address the compression and adaptation of networks to small target domains, the incremental learning of networks faced with a variety of tasks, and finally the detection of out-of-distribution samples at inference time.

We explore how knowledge can be transferred from large pretrained models to more task-specific networks capable of running on smaller devices by extracting the most relevant information. Using a pretrained model provides more robust representations and a more stable initialization when learning a smaller task, which leads to higher performance and is known as domain adaptation. However, those models are too large for certain applications that need to be deployed on devices with limited memory and computational capacity. In this thesis we show that, after performing domain adaptation, some learned activations barely contribute to the predictions of the model. Therefore, we propose to apply network compression based on low-rank matrix decomposition using the activation statistics. This results in a significant reduction of the model size and the computational cost.

Like human intelligence, machine intelligence aims to have the ability to learn and remember knowledge. However, when a trained neural network is presented with a new task to learn, it ends up forgetting previous ones. This is known as catastrophic forgetting and its avoidance is studied in continual learning. The work presented in this thesis extensively surveys continual learning techniques and presents an approach to avoid catastrophic forgetting in sequential task learning scenarios. Our technique is based on using ternary masks in order to update a network to new tasks, reusing the knowledge of previous ones while not forgetting anything about them. In contrast to earlier work, our masks are applied to the activations of each layer instead of the weights. This considerably reduces the number of parameters to be added for each new task. Furthermore, the analysis of a wide range of work on incremental learning without access to the task-ID provides insight into current state-of-the-art approaches that focus on avoiding catastrophic forgetting by using regularization, rehearsing previous tasks from a small memory, or compensating for the task-recency bias.

Neural networks trained with a cross-entropy loss force the outputs of the model to tend toward a one-hot encoded vector. This leads to models being overly confident when presented with images or classes that were not present in the training distribution. The capacity of a system to be aware of the boundaries of the learned tasks and to identify anomalies or classes which have not been learned yet is key to lifelong learning and autonomous systems. In this thesis, we present a metric learning approach to out-of-distribution detection that learns the task at hand on an embedding space.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Joost Van de Weijer; Andrew Bagdanov
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-121011-9-5 Medium
Area Expedition Conference
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ Mas20 Serial 3481
Permanent link to this record
 

 
Author Marc Castello; Jordi Gonzalez; Ariel Amato; Pau Baiget; Carles Fernandez; Josep M. Gonfaus; Ramon Mollineda; Marco Pedersoli; Nicolas Perez de la Blanca; Xavier Roca
Title Exploiting Multimodal Interaction Techniques for Video-Surveillance Type Book Chapter
Year 2013 Publication Multimodal Interaction in Image and Video Applications Intelligent Systems Reference Library Abbreviated Journal
Volume 48 Issue 8 Pages 135-151
Keywords
Abstract In this paper we present an example of a video surveillance application that exploits Multimodal Interactive (MI) technologies. The main objective of the so-called VID-Hum prototype was to develop a cognitive artificial system for both the detection and description of a particular set of human behaviours arising from real-world events. The main procedure of the prototype described in this chapter entails: (i) adaptation, since the system adapts itself to the most common behaviours (qualitative data) inferred from tracking (quantitative data), thus being able to recognize abnormal behaviours; (ii) feedback, since an advanced interface based on Natural Language understanding allows end-users to communicate with the prototype by means of conceptual sentences; and (iii) multimodality, since a virtual avatar has been designed to describe what is happening in the scene, based on those textual interpretations generated by the prototype. Thus, the MI methodology has provided an adequate framework for all these cooperating processes.
Address
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1868-4394 ISBN 978-3-642-35931-6 Medium
Area Expedition Conference
Notes ISE; 605.203; 600.049 Approved no
Call Number CGA2013 Serial 2222
Permanent link to this record
 

 
Author Marc Bolaños; R. Mestre; Estefania Talavera; Xavier Giro; Petia Radeva
Title Visual Summary of Egocentric Photostreams by Representative Keyframes Type Conference Article
Year 2015 Publication IEEE International Conference on Multimedia and Expo ICMEW2015 Abbreviated Journal
Volume Issue Pages 1-6
Keywords egocentric; lifelogging; summarization; keyframes
Abstract Building a visual summary from an egocentric photostream captured by a lifelogging wearable camera is of high interest for different applications (e.g. memory reinforcement). In this paper, we propose a new summarization method based on keyframe selection that uses visual features extracted by means of a convolutional neural network. Our method applies unsupervised clustering to divide the photostream into events, and finally extracts the most relevant keyframe for each event. We assess the results with a blind-taste test in which a group of 20 people rated the quality of the summaries.
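A minimal sketch in Python of the pipeline described above, with k-means as a stand-in for the event clustering (the paper's clustering choice and features are not reproduced here): frame-level CNN features are clustered into events and, per event, the frame closest to the event centroid is kept.

import numpy as np
from sklearn.cluster import KMeans

def summarize(features, n_events):
    # Cluster photos into events, then pick one representative keyframe per event.
    km = KMeans(n_clusters=n_events, n_init=10, random_state=0).fit(features)
    keyframes = []
    for e in range(n_events):
        idx = np.where(km.labels_ == e)[0]
        dists = np.linalg.norm(features[idx] - km.cluster_centers_[e], axis=1)
        keyframes.append(int(idx[np.argmin(dists)]))          # frame closest to the event centroid
    return sorted(keyframes)

feats = np.random.default_rng(0).normal(size=(300, 128))       # stand-in CNN features, one row per photo
print(summarize(feats, n_events=5))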
Address Torino; Italy; July 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4799-7079-7 Medium
Area Expedition Conference ICME
Notes MILAB Approved no
Call Number Admin @ si @ BMT2015 Serial 2638
Permanent link to this record
 

 
Author Marc Bolaños; Petia Radeva
Title Simultaneous Food Localization and Recognition Type Conference Article
Year 2016 Publication 23rd International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract (CoRR abs/1604.07953) The development of automatic nutrition diaries, which would allow us to objectively keep track of everything we eat, could enable a whole new world of possibilities for people concerned about their nutrition patterns. With this purpose, in this paper we propose the first method for simultaneous food localization and recognition. Our method is based on two main steps: first, producing a food activation map on the input image (i.e. a heat map of probabilities) to generate bounding box proposals and, second, recognizing each of the food types or food-related objects present in each bounding box. We demonstrate that our proposal, compared to the most similar problem nowadays, object localization, is able to obtain high precision and reasonable recall levels with only a few bounding boxes. Furthermore, we show that it is applicable to both conventional and egocentric images.
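A minimal sketch in Python of the two-stage idea (the toy activation map, threshold and connected-component proposals are stand-ins; the paper's networks are not modeled): the thresholded food activation map yields bounding box proposals, and each crop would then be passed to a food recognizer.

import numpy as np
from scipy import ndimage

def proposals_from_activation_map(act_map, threshold=0.5):
    # Connected components of the thresholded heat map, returned as (y0, y1, x0, x1) boxes.
    labeled, n = ndimage.label(act_map > threshold)
    boxes = []
    for sl in ndimage.find_objects(labeled):
        boxes.append((sl[0].start, sl[0].stop, sl[1].start, sl[1].stop))
    return boxes

act = np.zeros((64, 64))
act[10:20, 10:30] = 0.9                                   # toy blob of high food probability
act[40:50, 45:60] = 0.8
for box in proposals_from_activation_map(act):
    print(box)                                            # each crop would go to the second-stage recognizer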
Address Cancun; Mexico; December 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes MILAB; no proj Approved no
Call Number Admin @ si @ BoR2016 Serial 2834
Permanent link to this record
 

 
Author Marc Bolaños; Mariella Dimiccoli; Petia Radeva
Title Towards Storytelling from Visual Lifelogging: An Overview Type Journal Article
Year 2017 Publication IEEE Transactions on Human-Machine Systems Abbreviated Journal THMS
Volume 47 Issue 1 Pages 77 - 90
Keywords
Abstract Visual lifelogging consists of acquiring images that capture the daily experiences of the user by wearing a camera over a long period of time. The pictures taken offer considerable potential for knowledge mining concerning how people live their lives; hence, they open up new opportunities for many potential applications in fields including healthcare, security, leisure and the quantified self. However, automatically building a story from a huge collection of unstructured egocentric data presents major challenges. This paper provides a thorough review of advances made so far in egocentric data analysis, and in view of the current state of the art, indicates new lines of research to move us towards storytelling from visual lifelogging.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; 601.235 Approved no
Call Number Admin @ si @ BDR2017 Serial 2712
Permanent link to this record
 

 
Author Marc Bolaños; Maite Garolera; Petia Radeva
Title Video Segmentation of Life-Logging Videos Type Conference Article
Year 2014 Publication 8th Conference on Articulated Motion and Deformable Objects Abbreviated Journal
Volume 8563 Issue Pages 1-9
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference AMDO
Notes MILAB Approved no
Call Number Admin @ si @ BGR2014 Serial 2558
Permanent link to this record
 

 
Author Marc Bolaños; Maite Garolera; Petia Radeva
Title Object Discovery using CNN Features in Egocentric Videos Type Conference Article
Year 2015 Publication Pattern Recognition and Image Analysis, Proceedings of the 7th Iberian Conference, IbPRIA 2015 Abbreviated Journal
Volume 9117 Issue Pages 67-74
Keywords Object discovery; Egocentric videos; Lifelogging; CNN
Abstract Lifelogging devices based on photo/video are spreading faster every day. This growth can bring great benefits if methods are developed to extract meaningful information about the user wearing the device and his/her environment. In this paper, we propose a semi-supervised strategy for easily discovering objects relevant to the person wearing a first-person camera. Applied to the egocentric video sequence acquired by the camera, it uses both the appearance extracted by means of a deep convolutional neural network and an object refill methodology that allows objects to be discovered even when they appear in only a small number of images in the collection. We validate our method on a sequence of 1000 egocentric daily images and obtain results with an F-measure of 0.5, 0.17 higher than the state-of-the-art approach.
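A minimal sketch in Python of one generic semi-supervised step consistent with the abstract (the paper's "object refill" methodology is not modeled, and the propagation rule here is an assumption): labels given to a few annotated frames are propagated to the rest of the sequence via the nearest labelled exemplar in CNN feature space.

import numpy as np

def propagate_labels(features, labelled_idx, labelled_names):
    # Assign to every frame the label of its closest labelled exemplar.
    out = []
    for f in features:
        d = np.linalg.norm(features[labelled_idx] - f, axis=1)
        out.append(labelled_names[int(np.argmin(d))])
    return out

rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 1, (50, 32)), rng.normal(6, 1, (50, 32))])   # two toy "objects"
labels = propagate_labels(feats, labelled_idx=[0, 50], labelled_names=["mug", "laptop"])
print(labels[:3], labels[50:53])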
Address Santiago de Compostela; Spain; June 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-319-19389-2 Medium
Area Expedition Conference IbPRIA
Notes MILAB Approved no
Call Number Admin @ si @ BGR2015 Serial 2596
Permanent link to this record
 

 
Author Marc Bolaños; Maite Garolera; Petia Radeva
Title Active labeling application applied to food-related object recognition Type Conference Article
Year 2013 Publication 5th International Workshop on Multimedia for Cooking & Eating Activities Abbreviated Journal
Volume Issue Pages 45-50
Keywords
Abstract Every day, lifelogging devices, available for recording different aspects of our daily life, increase in number, quality and functions, just like the multiple applications that we give to them. Applying wearable devices to analyse the nutritional habits of people is a challenging application based on acquiring and analyzing life records over long periods of time. However, to extract the information of interest related to the eating patterns of people, we need automatic methods to process large amounts of lifelogging data (e.g. recognition of food-related objects). Creating a rich set of manually labeled samples to train the algorithms is slow, tedious and subjective. To address this problem, we propose a novel method in the framework of Active Labeling for constructing a training set of thousands of images. Inspired by the hierarchical sampling method for active learning [6], we propose an Active forest that organizes the data hierarchically for easy and fast labeling. Moreover, introducing a classifier into the hierarchical structures, as well as transforming the feature space for better data clustering, additionally improves the algorithm. Our method is successfully used to label 89,700 food-related objects and achieves a significant reduction in expert labelling time.
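A minimal sketch in Python of generic hierarchical bulk labelling in the spirit of the abstract (this is not the paper's Active forest; the clustering choice and the one-decision-per-group protocol are assumptions): the unlabelled pool is organized hierarchically so the expert can label whole groups at once.

import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def bulk_label(features, n_groups, label_group):
    # Hierarchically organize the pool, then ask for one expert decision per group.
    Z = linkage(features, method="ward")
    groups = fcluster(Z, t=n_groups, criterion="maxclust")
    labels = np.empty(len(features), dtype=object)
    for g in np.unique(groups):
        labels[groups == g] = label_group(np.where(groups == g)[0])   # one label propagated to the whole group
    return labels

rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 1, (60, 16)), rng.normal(8, 1, (60, 16))])
print(bulk_label(feats, 2, label_group=lambda idx: "food" if idx[0] < 60 else "non-food")[:3])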
Address Barcelona; October 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ACM-CEA
Notes MILAB Approved no
Call Number Admin @ si @ BGR2013b Serial 2637
Permanent link to this record
 

 
Author Marc Bolaños; Alvaro Peris; Francisco Casacuberta; Sergi Solera; Petia Radeva
Title Egocentric video description based on temporally-linked sequences Type Journal Article
Year 2018 Publication Journal of Visual Communication and Image Representation Abbreviated Journal JVCIR
Volume 50 Issue Pages 205-216
Keywords egocentric vision; video description; deep learning; multi-modal learning
Abstract Egocentric vision consists in acquiring images along the day from a first-person point of view using wearable cameras. The automatic analysis of this information makes it possible to discover daily patterns for improving the quality of life of the user. A natural topic that arises in egocentric vision is storytelling, that is, how to understand and tell the story lying behind the pictures. In this paper, we tackle storytelling as an egocentric sequence description problem. We propose a novel methodology that exploits information from temporally neighboring events, matching precisely the nature of egocentric sequences. Furthermore, we present a new method for multimodal data fusion consisting of a multi-input attention recurrent network. We also release the EDUB-SegDesc dataset. This is the first dataset for egocentric image sequence description, consisting of 1,339 events with 3,991 descriptions, from 55 days acquired by 11 people. Finally, we prove that our proposal outperforms classical attentional encoder-decoder methods for video description.
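A minimal, drastically simplified sketch in Python (PyTorch) of the temporal-linking idea only; the attention mechanism and multimodal fusion of the paper are not modeled, and all dimensions are arbitrary. The decoder for the current event is initialized with the recurrent state produced while describing the previous event, so consecutive events share context.

import torch
import torch.nn as nn

class LinkedDecoder(nn.Module):
    def __init__(self, feat_dim=128, vocab=1000, hidden=256):
        super().__init__()
        self.proj = nn.Linear(feat_dim, hidden)          # event visual features -> decoder inputs
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, event_feats, prev_state=None):
        x = self.proj(event_feats)                       # (batch, frames, hidden)
        y, state = self.gru(x, prev_state)               # state carried over from the previous event
        return self.out(y), state

model = LinkedDecoder()
events = [torch.randn(1, 10, 128), torch.randn(1, 12, 128)]  # two consecutive events
state = None
for ev in events:
    logits, state = model(ev, state)                     # the state links event t-1 to event t
print(logits.shape)                                      # (1, 12, 1000)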
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no proj Approved no
Call Number Admin @ si @ BPC2018 Serial 3109
Permanent link to this record