|   | 
Details
   web
Records
Author Patricia Marquez
Title A Confidence Framework for the Assessment of Optical Flow Performance Type (up) Book Whole
Year 2015 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Optical Flow (OF) is the input of a wide range of decision support systems such as car driver assistance, UAV guiding or medical diagnose. In these real situations, the absence of ground truth forces to assess OF quality using quantities computed from either sequences or the computed optical flow itself. These quantities are generally known as Confidence Measures, CM. Even if we have a proper confidence measure we still need a way to evaluate its ability to discard pixels with an OF prone to have a large error. Current approaches only provide a descriptive evaluation of the CM performance but such approaches are not capable to fairly compare different confidence measures and optical flow algorithms. Thus, it is of prime importance to define a framework and a general road map for the evaluation of optical flow performance.

This thesis provides a framework able to decide which pairs “ optical flow – confidence measure” (OF-CM) are best suited for optical flow error bounding given a confidence level determined by a decision support system. To design this framework we cover the following points:

Descriptive scores. As a first step, we summarize and analyze the sources of inaccuracies in the output of optical flow algorithms. Second, we present several descriptive plots that visually assess CM capabilities for OF error bounding. In addition to the descriptive plots, given a plot representing OF-CM capabilities to bound the error, we provide a numeric score that categorizes the plot according to its decreasing profile, that is, a score assessing CM performance.
Statistical framework. We provide a comparison framework that assesses the best suited OF-CM pair for error bounding that uses a two stage cascade process. First of all we assess the predictive value of the confidence measures by means of a descriptive plot. Then, for a sample of descriptive plots computed over training frames, we obtain a generic curve that will be used for sequences with no ground truth. As a second step, we evaluate the obtained general curve and its capabilities to really reflect the predictive value of a confidence measure using the variability across train frames by means of ANOVA.

The presented framework has shown its potential in the application on clinical decision support systems. In particular, we have analyzed the impact of the different image artifacts such as noise and decay to the output of optical flow in a cardiac diagnose system and we have improved the navigation inside the bronchial tree on bronchoscopy.
Address July 2015
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Debora Gil;Aura Hernandez
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-943427-2-1 Medium
Area Expedition Conference
Notes IAM; 600.075 Approved no
Call Number Admin @ si @ Mar2015 Serial 2687
Permanent link to this record
 

 
Author Marc Serra
Title Modeling, estimation and evaluation of intrinsic images considering color information Type (up) Book Whole
Year 2015 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Image values are the result of a combination of visual information coming from multiple sources. Recovering information from the multiple factors thatproduced an image seems a hard and ill-posed problem. However, it is important to observe that humans develop the ability to interpret images and recognize and isolate specific physical properties of the scene.

Images describing a single physical characteristic of an scene are called intrinsic images. These images would benefit most computer vision tasks which are often affected by the multiple complex effects that are usually found in natural images (e.g. cast shadows, specularities, interreflections...).

In this thesis we analyze the problem of intrinsic image estimation from different perspectives, including the theoretical formulation of the problem, the visual cues that can be used to estimate the intrinsic components and the evaluation mechanisms of the problem.
Address September 2015
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Robert Benavente;Olivier Penacchio
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-943427-4-5 Medium
Area Expedition Conference
Notes CIC; 600.074 Approved no
Call Number Admin @ si @ Ser2015 Serial 2688
Permanent link to this record
 

 
Author Alejandro Gonzalez Alzate
Title Multi-modal Pedestrian Detection Type (up) Book Whole
Year 2015 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Pedestrian detection continues to be an extremely challenging problem in real scenarios, in which situations like illumination changes, noisy images, unexpected objects, uncontrolled scenarios and variant appearance of objects occur constantly. All these problems force the development of more robust detectors for relevant applications like vision-based autonomous vehicles, intelligent surveillance, and pedestrian tracking for behavior analysis. Most reliable vision-based pedestrian detectors base their decision on features extracted using a single sensor capturing complementary features, e.g., appearance, and texture. These features usually are extracted from the current frame, ignoring temporal information, or including it in a post process step e.g., tracking or temporal coherence. Taking into account these issues we formulate the following question: can we generate more robust pedestrian detectors by introducing new information sources in the feature extraction step?
In order to answer this question we develop different approaches for introducing new information sources to well-known pedestrian detectors. We start by the inclusion of temporal information following the Stacked Sequential Learning (SSL) paradigm which suggests that information extracted from the neighboring samples in a sequence can improve the accuracy of a base classifier.
We then focus on the inclusion of complementary information from different sensors like 3D point clouds (LIDAR – depth), far infrared images (FIR), or disparity maps (stereo pair cameras). For this end we develop a multi-modal framework in which information from different sensors is used for increasing detection accuracy (by increasing information redundancy). Finally we propose a multi-view pedestrian detector, this multi-view approach splits the detection problem in n sub-problems.
Each sub-problem will detect objects in a given specific view reducing in that way the variability problem faced when a single detectors is used for the whole problem. We show that these approaches obtain competitive results with other state-of-the-art methods but instead of design new features, we reuse existing ones boosting their performance.
Address November 2015
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor David Vazquez;Antonio Lopez;
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-943427-7-6 Medium
Area Expedition Conference
Notes ADAS; 600.076 Approved no
Call Number Admin @ si @ Gon2015 Serial 2706
Permanent link to this record
 

 
Author Adriana Romero
Title Assisting the training of deep neural networks with applications to computer vision Type (up) Book Whole
Year 2015 Publication PhD Thesis, Universitat de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Deep learning has recently been enjoying an increasing popularity due to its success in solving challenging tasks. In particular, deep learning has proven to be effective in a large variety of computer vision tasks, such as image classification, object recognition and image parsing. Contrary to previous research, which required engineered feature representations, designed by experts, in order to succeed, deep learning attempts to learn representation hierarchies automatically from data. More recently, the trend has been to go deeper with representation hierarchies.
Learning (very) deep representation hierarchies is a challenging task, which
involves the optimization of highly non-convex functions. Therefore, the search
for algorithms to ease the learning of (very) deep representation hierarchies from data is extensive and ongoing.
In this thesis, we tackle the challenging problem of easing the learning of (very) deep representation hierarchies. We present a hyper-parameter free, off-the-shelf, simple and fast unsupervised algorithm to discover hidden structure from the input data by enforcing a very strong form of sparsity. We study the applicability and potential of the algorithm to learn representations of varying depth in a handful of applications and domains, highlighting the ability of the algorithm to provide discriminative feature representations that are able to achieve top performance.
Yet, while emphasizing the great value of unsupervised learning methods when
labeled data is scarce, the recent industrial success of deep learning has revolved around supervised learning. Supervised learning is currently the focus of many recent research advances, which have shown to excel at many computer vision tasks. Top performing systems often involve very large and deep models, which are not well suited for applications with time or memory limitations. More in line with the current trends, we engage in making top performing models more efficient, by designing very deep and thin models. Since training such very deep models still appears to be a challenging task, we introduce a novel algorithm that guides the training of very thin and deep models by hinting their intermediate representations.
Very deep and thin models trained by the proposed algorithm end up extracting feature representations that are comparable or even better performing
than the ones extracted by large state-of-the-art models, while compellingly
reducing the time and memory consumption of the model.
Address October 2015
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Carlo Gatta;Petia Radeva
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB Approved no
Call Number Admin @ si @ Rom2015 Serial 2707
Permanent link to this record
 

 
Author Sergio Vera
Title Anatomic Registration based on Medial Axis Parametrizations Type (up) Book Whole
Year 2015 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Image registration has been for many years the gold standard method to bring two images into correspondence. It has been used extensively in the eld of medical imaging in order to put images of di erent patients into a common overlapping spatial position. However, medical image registration is a slow, iterative optimization process, where many variables and prone to fall into the pit traps local minima.
A coordinate system parameterizing the interior of organs is a powerful tool for a systematic localization of injured tissue. If the same coordinate values are assigned to speci c anatomical sites, parameterizations ensure integration of data across different medical image modalities. Harmonic mappings have been used to produce parametric meshes over the surface of anatomical shapes, given their ability to set values at speci c locations through boundary conditions. However, most of the existing implementations in medical imaging restrict to either anatomical surfaces, or the depth coordinate with boundary conditions is given at discrete sites of limited geometric diversity.
The medial surface of the shape can be used to provide a continuous basis for the de nition of a depth coordinate. However, given that di erent methods for generation of medial surfaces generate di erent manifolds, not all of them are equally suited to be the basis of radial coordinate for a parameterization. It would be desirable that the medial surface will be smooth, and robust to surface shape noise, with low number of spurious branches or surfaces.
In this thesis we present methods for computation of smooth medial manifolds and apply them to the generation of for anatomical volumetric parameterization that extends current harmonic parameterizations to the interior anatomy using information provided by the volume medial surface. This reference system sets a solid base for creating anatomical models of the anatomical shapes, and allows comparing several patients in a common framework of reference.
Address November 2015
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Debora Gil;Miguel Angel Gonzalez Ballester
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-943427-8-3 Medium
Area Expedition Conference
Notes IAM; 600.075 Approved no
Call Number Admin @ si @ Ver2015 Serial 2708
Permanent link to this record
 

 
Author Joan M. Nuñez
Title Vascular Pattern Characterization in Colonoscopy Images Type (up) Book Whole
Year 2015 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Colorectal cancer is the third most common cancer worldwide and the second most common malignant tumor in Europe. Screening tests have shown to be very e ective in increasing the survival rates since they allow an early detection of polyps. Among the di erent screening techniques, colonoscopy is considered the gold standard although clinical studies mention several problems that have an impact in the quality of the procedure. The navigation through the rectum and colon track can be challenging for the physicians which can increase polyp miss rates. The thorough visualization of the colon track must be ensured so that
the chances of missing lesions are minimized. The visual analysis of colonoscopy images can provide important information to the physicians and support their navigation during the procedure.
Blood vessels and their branching patterns can provide descriptive power to potentially develop biometric markers. Anatomical markers based on blood vessel patterns could be used to identify a particular scene in colonoscopy videos and to support endoscope navigation by generating a sequence of ordered scenes through the di erent colon sections. By verifying the presence of vascular content in the endoluminal scene it is also possible to certify a proper
inspection of the colon mucosa and to improve polyp localization. Considering the potential uses of blood vessel description, this contribution studies the characterization of the vascular content and the analysis of the descriptive power of its branching patterns.
Blood vessel characterization in colonoscopy images is shown to be a challenging task. The endoluminal scene is conformed by several elements whose similar characteristics hinder the development of particular models for each of them. To overcome such diculties we propose the use of the blood vessel branching characteristics as key features for pattern description. We present a model to characterize junctions in binary patterns. The implementation
of the junction model allows us to develop a junction localization method. We
created two data sets including manually labeled vessel information as well as manual ground truths of two types of keypoint landmarks: junctions and endpoints. The proposed method outperforms the available algorithms in the literature in experiments in both, our newly created colon vessel data set, and in DRIVE retinal fundus image data set. In the latter case, we created a manual ground truth of junction coordinates. Since we want to explore the descriptive potential of junctions and vessels, we propose a graph-based approach to
create anatomical markers. In the context of polyp localization, we present a new method to inhibit the in uence of blood vessels in the extraction valley-pro le information. The results show that our methodology decreases vessel in
uence, increases polyp information and leads to an improvement in state-of-the-art polyp localization performance. We also propose a polyp-speci c segmentation method that outperforms other general and speci c approaches.
Address November 2015
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Fernando Vilariño
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-943427-6-9 Medium
Area Expedition Conference
Notes MV Approved no
Call Number Admin @ si @ Nuñ2015 Serial 2709
Permanent link to this record
 

 
Author Alejandro Gonzalez Alzate; Sebastian Ramos; David Vazquez; Antonio Lopez; Jaume Amores
Title Spatiotemporal Stacked Sequential Learning for Pedestrian Detection Type (up) Conference Article
Year 2015 Publication Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference , ibPRIA 2015 Abbreviated Journal
Volume Issue Pages 3-12
Keywords SSL; Pedestrian Detection
Abstract Pedestrian classifiers decide which image windows contain a pedestrian. In practice, such classifiers provide a relatively high response at neighbor windows overlapping a pedestrian, while the responses around potential false positives are expected to be lower. An analogous reasoning applies for image sequences. If there is a pedestrian located within a frame, the same pedestrian is expected to appear close to the same location in neighbor frames. Therefore, such a location has chances of receiving high classification scores during several frames, while false positives are expected to be more spurious. In this paper we propose to exploit such correlations for improving the accuracy of base pedestrian classifiers. In particular, we propose to use two-stage classifiers which not only rely on the image descriptors required by the base classifiers but also on the response of such base classifiers in a given spatiotemporal neighborhood. More specifically, we train pedestrian classifiers using a stacked sequential learning (SSL) paradigm. We use a new pedestrian dataset we have acquired from a car to evaluate our proposal at different frame rates. We also test on a well known dataset: Caltech. The obtained results show that our SSL proposal boosts detection accuracy significantly with a minimal impact on the computational cost. Interestingly, SSL improves more the accuracy at the most dangerous situations, i.e. when a pedestrian is close to the camera.
Address Santiago de Compostela; España; June 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area ACDC Expedition Conference IbPRIA
Notes ADAS; 600.057; 600.054; 600.076 Approved no
Call Number GRV2015; ADAS @ adas @ GRV2015 Serial 2454
Permanent link to this record
 

 
Author German Ros; Sebastian Ramos; Manuel Granados; Amir Bakhtiary; David Vazquez; Antonio Lopez
Title Vision-based Offline-Online Perception Paradigm for Autonomous Driving Type (up) Conference Article
Year 2015 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 231 - 238
Keywords Autonomous Driving; Scene Understanding; SLAM; Semantic Segmentation
Abstract Autonomous driving is a key factor for future mobility. Properly perceiving the environment of the vehicles is essential for a safe driving, which requires computing accurate geometric and semantic information in real-time. In this paper, we challenge state-of-the-art computer vision algorithms for building a perception system for autonomous driving. An inherent drawback in the computation of visual semantics is the trade-off between accuracy and computational cost. We propose to circumvent this problem by following an offline-online strategy. During the offline stage dense 3D semantic maps are created. In the online stage the current driving area is recognized in the maps via a re-localization process, which allows to retrieve the pre-computed accurate semantics and 3D geometry in realtime. Then, detecting the dynamic obstacles we obtain a rich understanding of the current scene. We evaluate quantitatively our proposal in the KITTI dataset and discuss the related open challenges for the computer vision community.
Address Hawaii; January 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area ACDC Expedition Conference WACV
Notes ADAS; 600.076 Approved no
Call Number ADAS @ adas @ RRG2015 Serial 2499
Permanent link to this record
 

 
Author Alejandro Gonzalez Alzate; Gabriel Villalonga; Jiaolong Xu; David Vazquez; Jaume Amores; Antonio Lopez
Title Multiview Random Forest of Local Experts Combining RGB and LIDAR data for Pedestrian Detection Type (up) Conference Article
Year 2015 Publication IEEE Intelligent Vehicles Symposium IV2015 Abbreviated Journal
Volume Issue Pages 356-361
Keywords Pedestrian Detection
Abstract Despite recent significant advances, pedestrian detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage upon multiple cues, multiple imaging modalities and a strong multi-view classifier that accounts for different pedestrian views and poses. In this paper we provide an extensive evaluation that gives insight into how each of these aspects (multi-cue, multimodality and strong multi-view classifier) affect performance both individually and when integrated together. In the multimodality component we explore the fusion of RGB and depth maps obtained by high-definition LIDAR, a type of modality that is only recently starting to receive attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the performance, the fusion of visible spectrum and depth information allows to boost the accuracy by a much larger margin. The resulting detector not only ranks among the top best performers in the challenging KITTI benchmark, but it is built upon very simple blocks that are easy to implement and computationally efficient. These simple blocks can be easily replaced with more sophisticated ones recently proposed, such as the use of convolutional neural networks for feature representation, to further improve the accuracy.
Address Seoul; Corea; June 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area ACDC Expedition Conference IV
Notes ADAS; 600.076; 600.057; 600.054 Approved no
Call Number ADAS @ adas @ GVX2015 Serial 2625
Permanent link to this record
 

 
Author Alejandro Gonzalez Alzate; Gabriel Villalonga; German Ros; David Vazquez; Antonio Lopez
Title 3D-Guided Multiscale Sliding Window for Pedestrian Detection Type (up) Conference Article
Year 2015 Publication Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference , ibPRIA 2015 Abbreviated Journal
Volume 9117 Issue Pages 560-568
Keywords Pedestrian Detection
Abstract The most relevant modules of a pedestrian detector are the candidate generation and the candidate classification. The former aims at presenting image windows to the latter so that they are classified as containing a pedestrian or not. Much attention has being paid to the classification module, while candidate generation has mainly relied on (multiscale) sliding window pyramid. However, candidate generation is critical for achieving real-time. In this paper we assume a context of autonomous driving based on stereo vision. Accordingly, we evaluate the effect of taking into account the 3D information (derived from the stereo) in order to prune the hundred of thousands windows per image generated by classical pyramidal sliding window. For our study we use a multimodal (RGB, disparity) and multi-descriptor (HOG, LBP, HOG+LBP) holistic ensemble based on linear SVM. Evaluation on data from the challenging KITTI benchmark suite shows the effectiveness of using 3D information to dramatically reduce the number of candidate windows, even improving the overall pedestrian detection accuracy.
Address Santiago de Compostela; España; June 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area ACDC Expedition Conference IbPRIA
Notes ADAS; 600.076; 600.057; 600.054 Approved no
Call Number ADAS @ adas @ GVR2015 Serial 2585
Permanent link to this record
 

 
Author Joost Van de Weijer; Fahad Shahbaz Khan
Title An Overview of Color Name Applications in Computer Vision Type (up) Conference Article
Year 2015 Publication Computational Color Imaging Workshop Abbreviated Journal
Volume Issue Pages
Keywords color features; color names; object recognition
Abstract In this article we provide an overview of color name applications in computer vision. Color names are linguistic labels which humans use to communicate color. Computational color naming learns a mapping from pixels values to color names. In recent years color names have been applied to a wide variety of computer vision applications, including image classification, object recognition, texture classification, visual tracking and action recognition. Here we provide an overview of these results which show that in general color names outperform photometric invariants as a color representation.
Address Saint Etienne; France; March 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CCIW
Notes LAMP; 600.079; 600.068 Approved no
Call Number Admin @ si @ WeK2015 Serial 2586
Permanent link to this record
 

 
Author Sergio Escalera; Jordi Gonzalez; Xavier Baro; Pablo Pardo; Junior Fabian; Marc Oliu; Hugo Jair Escalante; Ivan Huerta; Isabelle Guyon
Title ChaLearn Looking at People 2015 new competitions: Age Estimation and Cultural Event Recognition Type (up) Conference Article
Year 2015 Publication IEEE International Joint Conference on Neural Networks IJCNN2015 Abbreviated Journal
Volume Issue Pages 1-8
Keywords
Abstract Following previous series on Looking at People (LAP) challenges [1], [2], [3], in 2015 ChaLearn runs two new competitions within the field of Looking at People: age and cultural event recognition in still images. We propose thefirst crowdsourcing application to collect and label data about apparent
age of people instead of the real age. In terms of cultural event recognition, tens of categories have to be recognized. This involves scene understanding and human analysis. This paper summarizes both challenges and data, providing some initial baselines. The results of the first round of the competition were presented at ChaLearn LAP 2015 IJCNN special session on computer vision and robotics http://www.dtic.ua.es/∼jgarcia/IJCNN2015.
Details of the ChaLearn LAP competitions can be found at http://gesture.chalearn.org/.
Address Killarney; Ireland; July 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IJCNN
Notes HuPBA; ISE; 600.063; 600.078;MV Approved no
Call Number Admin @ si @ EGB2015 Serial 2591
Permanent link to this record
 

 
Author Adriana Romero; Nicolas Ballas; Samira Ebrahimi Kahou; Antoine Chassang; Carlo Gatta; Yoshua Bengio
Title FitNets: Hints for Thin Deep Nets Type (up) Conference Article
Year 2015 Publication 3rd International Conference on Learning Representations ICLR2015 Abbreviated Journal
Volume Issue Pages
Keywords Computer Science ; Learning; Computer Science ;Neural and Evolutionary Computing
Abstract While depth tends to improve network performances, it also makes gradient-based training more difficult since deeper networks tend to be more non-linear. The recently proposed knowledge distillation approach is aimed at obtaining small and fast-to-execute models, and it has shown that a student network could imitate the soft output of a larger teacher network or ensemble of networks. In this paper, we extend this idea to allow the training of a student that is deeper and thinner than the teacher, using not only the outputs but also the intermediate representations learned by the teacher as hints to improve the training process and final performance of the student. Because the student intermediate hidden layer will generally be smaller than the teacher's intermediate hidden layer, additional parameters are introduced to map the student hidden layer to the prediction of the teacher hidden layer. This allows one to train deeper students that can generalize better or run faster, a trade-off that is controlled by the chosen student capacity. For example, on CIFAR-10, a deep student network with almost 10.4 times less parameters outperforms a larger, state-of-the-art teacher network.
Address San Diego; CA; May 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICLR
Notes MILAB Approved no
Call Number Admin @ si @ RBK2015 Serial 2593
Permanent link to this record
 

 
Author Marc Bolaños; Maite Garolera; Petia Radeva
Title Object Discovery using CNN Features in Egocentric Videos Type (up) Conference Article
Year 2015 Publication Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference , ibPRIA 2015 Abbreviated Journal
Volume 9117 Issue Pages 67-74
Keywords Object discovery; Egocentric videos; Lifelogging; CNN
Abstract Lifelogging devices based on photo/video are spreading faster everyday. This growth can represent great benefits to develop methods for extraction of meaningful information about the user wearing the device and his/her environment. In this paper, we propose a semi-supervised strategy for easily discovering objects relevant to the person wearing a first-person camera. The egocentric video sequence acquired by the camera, uses both the appearance extracted by means of a deep convolutional neural network and an object refill methodology that allow to discover objects even in case of small amount of object appearance in the collection of images. We validate our method on a sequence of 1000 egocentric daily images and obtain results with an F-measure of 0.5, 0.17 better than the state of the art approach.
Address Santiago de Compostela; España; June 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-319-19389-2 Medium
Area Expedition Conference IbPRIA
Notes MILAB Approved no
Call Number Admin @ si @ BGR2015 Serial 2596
Permanent link to this record
 

 
Author Estefania Talavera; Mariella Dimiccoli; Marc Bolaños; Maedeh Aghaei; Petia Radeva
Title R-clustering for egocentric video segmentation Type (up) Conference Article
Year 2015 Publication Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference , ibPRIA 2015 Abbreviated Journal
Volume 9117 Issue Pages 327-336
Keywords Temporal video segmentation; Egocentric videos; Clustering
Abstract In this paper, we present a new method for egocentric video temporal segmentation based on integrating a statistical mean change detector and agglomerative clustering(AC) within an energy-minimization framework. Given the tendency of most AC methods to oversegment video sequences when clustering their frames, we combine the clustering with a concept drift detection technique (ADWIN) that has rigorous guarantee of performances. ADWIN serves as a statistical upper bound for the clustering-based video segmentation. We integrate both techniques in an energy-minimization framework that serves to disambiguate the decision of both techniques and to complete the segmentation taking into account the temporal continuity of video frames descriptors. We present experiments over egocentric sets of more than 13.000 images acquired with different wearable cameras, showing that our method outperforms state-of-the-art clustering methods.
Address Santiago de Compostela; España; June 2015
Corporate Author Thesis
Publisher Springer International Publishing Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-319-19389-2 Medium
Area Expedition Conference IbPRIA
Notes MILAB Approved no
Call Number Admin @ si @ TDB2015 Serial 2597
Permanent link to this record