Records | |||||
---|---|---|---|---|---|
Author | Ivan Huerta; Michael Holte; Thomas B. Moeslund; Jordi Gonzalez | ||||
Title | Chromatic shadow detection and tracking for moving foreground segmentation | Type | Journal Article | ||
Year | 2015 | Publication | Image and Vision Computing | Abbreviated Journal | IMAVIS |
Volume | 41 | Issue | Pages | 42-53 | |
Keywords | Detecting moving objects; Chromatic shadow detection; Temporal local gradient; Spatial and Temporal brightness and angle distortions; Shadow tracking | ||||
Abstract | Advanced segmentation techniques in the surveillance domain deal with shadows to avoid distortions when detecting moving objects. Most approaches for shadow detection are still typically restricted to penumbra shadows and cannot cope well with umbra shadows. Consequently, umbra shadow regions are usually detected as part of moving objects, thus affecting the performance of the final detection. In this paper we address the detection of both penumbra and umbra shadow regions. First, a novel bottom-up approach is presented based on gradient and colour models, which successfully discriminates between chromatic moving cast shadow regions and those regions detected as moving objects. In essence, those regions corresponding to potential shadows are detected based on edge partitioning and colour statistics. Subsequently (i) temporal similarities between textures and (ii) spatial similarities between chrominance angle and brightness distortions are analysed for each potential shadow region for detecting the umbra shadow regions. Our second contribution refines even further the segmentation results: a tracking-based top-down approach increases the performance of our bottom-up chromatic shadow detection algorithm by properly correcting non-detected shadows.
To do so, a combination of motion filters in a data association framework exploits the temporal consistency between objects and shadows to increase the shadow detection rate. Experimental results exceed current state-of-the-art in shadow accuracy for multiple well-known surveillance image databases which contain different shadowed materials and illumination conditions. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE; 600.078; 600.063 | Approved | no | ||
Call Number | Admin @ si @ HHM2015 | Serial | 2703 | ||
Permanent link to this record | |||||
Author | Josep M. Gonfaus; Marco Pedersoli; Jordi Gonzalez; Andrea Vedaldi; Xavier Roca | ||||
Title | Factorized appearances for object detection | Type | Journal Article | ||
Year | 2015 | Publication | Computer Vision and Image Understanding | Abbreviated Journal | CVIU |
Volume | 138 | Issue | Pages | 92–101 | |
Keywords | Object recognition; Deformable part models; Learning and sharing parts; Discovering discriminative parts | ||||
Abstract | Deformable object models capture variations in an object’s appearance that can be represented as image deformations. Other effects such as out-of-plane rotations, three-dimensional articulations, and self-occlusions are often captured by considering a mixture of deformable models, one per object aspect. A more scalable approach is representing instead the variations at the level of the object parts, applying the concept of a mixture locally. Combining a few part variations can in fact cheaply generate a large number of global appearances.
A limited version of this idea was proposed by Yang and Ramanan [1], for human pose detection. In this paper we apply it to the task of generic object category detection and extend it in several ways. First, we propose a model for the relationship between part appearances more general than the tree of Yang and Ramanan [1], which is more suitable for generic categories. Second, we treat part locations as well as their appearance as latent variables so that training does not need part annotations but only the object bounding boxes. Third, we modify the weakly-supervised learning of Felzenszwalb et al. and Girshick et al. [2], [3] to handle a significantly more complex latent structure. Our model is evaluated on standard object detection benchmarks and is found to improve over existing approaches, yielding state-of-the-art results for several object categories. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE; 600.063; 600.078 | Approved | no | ||
Call Number | Admin @ si @ GPG2015 | Serial | 2705 | ||
Permanent link to this record | |||||
Author | David Sanchez-Mendoza; David Masip; Agata Lapedriza | ||||
Title | Emotion recognition from mid-level features | Type | Journal Article | ||
Year | 2015 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 67 | Issue | Part 1 | Pages | 66–74 |
Keywords | Facial expression; Emotion recognition; Action units; Computer vision | ||||
Abstract | In this paper we present a study on the use of Action Units as mid-level features for automatically recognizing basic and subtle emotions. We propose a representation model based on mid-level facial muscular movement features. We encode these movements dynamically using the Facial Action Coding System, and propose to use these intermediate features based on Action Units (AUs) to classify emotions. AU activations are detected by fusing a set of spatiotemporal geometric and appearance features. The algorithm is validated in two applications: (i) the recognition of 7 basic emotions using the publicly available Cohn-Kanade database, and (ii) the inference of subtle emotional cues in the Newscast database. In this second scenario, we consider emotions that are perceived cumulatively in longer periods of time. In particular, we automatically classify whether video shots from public News TV channels refer to Good or Bad news. To deal with the different video lengths we propose a Histogram of Action Units and compute it using a sliding window strategy on the frame sequences. Our approach achieves accuracies close to human perception. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier B.V. | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0167-8655 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | OR; MV | Approved | no | ||
Call Number | Admin @ si @ SML2015 | Serial | 2746 | ||
Permanent link to this record | |||||
Author | Joan M. Nuñez; Jorge Bernal; F. Javier Sanchez; Fernando Vilariño | ||||
Title | Growing Algorithm for Intersection Detection (GRAID) in branching patterns | Type | Journal Article | ||
Year | 2015 | Publication | Machine Vision and Applications | Abbreviated Journal | MVAP |
Volume | 26 | Issue | 2 | Pages | 387-400 |
Keywords | Bifurcation; Crossroad; Intersection; Retina; Vessel | ||||
Abstract | Analysis of branching structures represents a very important task in fields such as medical diagnosis, road detection or biometrics. Detecting intersection landmarks becomes crucial when capturing the structure of a branching pattern. We present a very simple geometrical model to describe intersections in branching structures based on two conditions: Bounded Tangency condition (BT) and Shortest Branch (SB) condition. The proposed model precisely sets a geometrical characterization of intersections and allows us to introduce a new unsupervised operator for intersection extraction. We propose an implementation that handles the consequences of digital domain operation that, unlike existing approaches, is not restricted to a particular scale and does not require the computation of the thinned pattern. The new proposal, as well as other existing approaches in the bibliography, are evaluated in a common framework for the first time. The performance analysis is based on two manually segmented image data sets: DRIVE retinal image database and COLON-VESSEL data set, a newly created data set of vascular content in colonoscopy frames. We have created an intersection landmark ground truth for each data set besides comparing our method in the only existing ground truth. Quantitative results confirm that we are able to outperform state-of-the-art performance levels with the advantage that neither training nor parameter tuning is needed. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | SIAI | Approved | no | ||
Call Number | Admin @ si @ MBS2015 | Serial | 2777 | ||
Permanent link to this record | |||||
Author | G. Thorvaldsen; Joana Maria Pujadas-Mora; T. Andersen; L. Eikvil; Josep Llados; Alicia Fornes; Anna Cabre | ||||
Title | A Tale of two Transcriptions | Type | Journal | ||
Year | 2015 | Publication | Historical Life Course Studies | Abbreviated Journal | |
Volume | 2 | Issue | Pages | 1-19 | |
Keywords | Nominative Sources; Census; Vital Records; Computer Vision; Optical Character Recognition; Word Spotting | ||||
Abstract | non-indexed
This article explains how two projects implement semi-automated transcription routines: for census sheets in Norway and marriage protocols from Barcelona. The Spanish system was created to transcribe the marriage license books from 1451 to 1905 for the Barcelona area; one of the world’s longest series of preserved vital records. Thus, in the Project “Five Centuries of Marriages” (5CofM) at the Autonomous University of Barcelona’s Center for Demographic Studies, the Barcelona Historical Marriage Database has been built. More than 600,000 records were transcribed by 150 transcribers working online. The Norwegian material is cross-sectional as it is the 1891 census, recorded on one sheet per person. This format and the underlining of keywords for several variables made it more feasible to semi-automate data entry than when many persons are listed on the same page. While Optical Character Recognition (OCR) for printed text is scientifically mature, computer vision research is now focused on more difficult problems such as handwriting recognition. In the marriage project, document analysis methods have been proposed to automatically recognize the marriage licenses. Fully automatic recognition is still a challenge, but some promising results have been obtained. In Spain, Norway and elsewhere the source material is available as scanned pictures on the Internet, opening up the possibility for further international cooperation concerning automating the transcription of historic source materials. Like what is being done in projects to digitize printed materials, the optimal solution is likely to be a combination of manual transcription and machine-assisted recognition also for hand-written sources. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2352-6343 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; 600.077; 602.006 | Approved | no | ||
Call Number | Admin @ si @ TPA2015 | Serial | 2582 | ||
Permanent link to this record | |||||
Author | Enric Marti; J. Roncaries; Debora Gil; Aura Hernandez-Sabate; Antoni Gurgui; Ferran Poveda | ||||
Title | PBL On Line: A proposal for the organization, part-time monitoring and assessment of PBL group activities | Type | Journal | ||
Year | 2015 | Publication | Journal of Technology and Science Education | Abbreviated Journal | JOTSE |
Volume | 5 | Issue | 2 | Pages | 87-96 |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; ADAS; 600.076; 600.075 | Approved | no | ||
Call Number | Admin @ si @ MRG2015 | Serial | 2608 | ||
Permanent link to this record | |||||
Author | Carles Sanchez; Oriol Ramos Terrades; Patricia Marquez; Enric Marti; J. Roncaries; Debora Gil | ||||
Title | Automatic evaluation of practices in Moodle for Self Learning in Engineering | Type | Journal | ||
Year | 2015 | Publication | Journal of Technology and Science Education | Abbreviated Journal | JOTSE |
Volume | 5 | Issue | 2 | Pages | 97-106 |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; DAG; 600.075; 600.077 | Approved | no | ||
Call Number | Admin @ si @ SRM2015 | Serial | 2610 | ||
Permanent link to this record | |||||
Author | Aura Hernandez-Sabate; Meritxell Joanpere; Nuria Gorgorio; Lluis Albarracin | ||||
Title | Mathematics learning opportunities when playing a Tower Defense Game | Type | Journal | ||
Year | 2015 | Publication | International Journal of Serious Games | Abbreviated Journal | IJSG |
Volume | 2 | Issue | 4 | Pages | 57-71 |
Keywords | Tower Defense game; learning opportunities; mathematics; problem solving; game design | ||||
Abstract | A qualitative research study is presented herein with the purpose of identifying mathematics learning opportunities in students between 10 and 12 years old while playing a commercial version of a Tower Defense game. These learning opportunities are understood as mathematicisable moments of the game and involve the establishment of relationships between the game and mathematical problem solving. Based on the analysis of these mathematicisable moments, we conclude that the game can promote problem-solving processes and learning opportunities that can be associated with different mathematical contents that appear in mathematics curricula, though it seems that a teacher or new game elements might be needed to facilitate the processes. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.076 | Approved | no | ||
Call Number | Admin @ si @ HJG2015 | Serial | 2730 | ||
Permanent link to this record | |||||
Author | Alejandro Gonzalez Alzate; Sebastian Ramos; David Vazquez; Antonio Lopez; Jaume Amores | ||||
Title | Spatiotemporal Stacked Sequential Learning for Pedestrian Detection | Type | Conference Article | ||
Year | 2015 | Publication | Pattern Recognition and Image Analysis, Proceedings of the 7th Iberian Conference, IbPRIA 2015 | Abbreviated Journal | |
Volume | Issue | Pages | 3-12 | ||
Keywords | SSL; Pedestrian Detection | ||||
Abstract | Pedestrian classifiers decide which image windows contain a pedestrian. In practice, such classifiers provide a relatively high response at neighbor windows overlapping a pedestrian, while the responses around potential false positives are expected to be lower. An analogous reasoning applies for image sequences. If there is a pedestrian located within a frame, the same pedestrian is expected to appear close to the same location in neighbor frames. Therefore, such a location has chances of receiving high classification scores during several frames, while false positives are expected to be more spurious. In this paper we propose to exploit such correlations for improving the accuracy of base pedestrian classifiers. In particular, we propose to use two-stage classifiers which not only rely on the image descriptors required by the base classifiers but also on the response of such base classifiers in a given spatiotemporal neighborhood. More specifically, we train pedestrian classifiers using a stacked sequential learning (SSL) paradigm. We use a new pedestrian dataset we have acquired from a car to evaluate our proposal at different frame rates. We also test on a well-known dataset: Caltech. The obtained results show that our SSL proposal boosts detection accuracy significantly with a minimal impact on the computational cost. Interestingly, SSL improves accuracy most in the most dangerous situations, i.e. when a pedestrian is close to the camera. | ||||
Address | Santiago de Compostela; Spain; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | ACDC | Expedition | Conference | IbPRIA | |
Notes | ADAS; 600.057; 600.054; 600.076 | Approved | no | ||
Call Number | GRV2015; ADAS @ adas @ GRV2015 | Serial | 2454 | ||
Permanent link to this record | |||||
Author | German Ros; Sebastian Ramos; Manuel Granados; Amir Bakhtiary; David Vazquez; Antonio Lopez | ||||
Title | Vision-based Offline-Online Perception Paradigm for Autonomous Driving | Type | Conference Article | ||
Year | 2015 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 231-238 | ||
Keywords | Autonomous Driving; Scene Understanding; SLAM; Semantic Segmentation | ||||
Abstract | Autonomous driving is a key factor for future mobility. Properly perceiving the environment of the vehicles is essential for safe driving, which requires computing accurate geometric and semantic information in real time. In this paper, we challenge state-of-the-art computer vision algorithms for building a perception system for autonomous driving. An inherent drawback in the computation of visual semantics is the trade-off between accuracy and computational cost. We propose to circumvent this problem by following an offline-online strategy. During the offline stage dense 3D semantic maps are created. In the online stage the current driving area is recognized in the maps via a re-localization process, which allows retrieving the pre-computed accurate semantics and 3D geometry in real time. Then, by detecting the dynamic obstacles, we obtain a rich understanding of the current scene. We quantitatively evaluate our proposal on the KITTI dataset and discuss the related open challenges for the computer vision community. | ||||
Address | Hawaii; January 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | ACDC | Expedition | Conference | WACV | |
Notes | ADAS; 600.076 | Approved | no | ||
Call Number | ADAS @ adas @ RRG2015 | Serial | 2499 | ||
Permanent link to this record | |||||
Author | Alejandro Gonzalez Alzate; Gabriel Villalonga; Jiaolong Xu; David Vazquez; Jaume Amores; Antonio Lopez | ||||
Title | Multiview Random Forest of Local Experts Combining RGB and LIDAR data for Pedestrian Detection | Type | Conference Article | ||
Year | 2015 | Publication | IEEE Intelligent Vehicles Symposium IV2015 | Abbreviated Journal | |
Volume | Issue | Pages | 356-361 | ||
Keywords | Pedestrian Detection | ||||
Abstract | Despite recent significant advances, pedestrian detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage multiple cues, multiple imaging modalities and a strong multi-view classifier that accounts for different pedestrian views and poses. In this paper we provide an extensive evaluation that gives insight into how each of these aspects (multi-cue, multimodality and strong multi-view classifier) affects performance both individually and when integrated together. In the multimodality component we explore the fusion of RGB and depth maps obtained by high-definition LIDAR, a type of modality that is only recently starting to receive attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the performance, the fusion of visible spectrum and depth information boosts the accuracy by a much larger margin. The resulting detector not only ranks among the top performers in the challenging KITTI benchmark, but it is built upon very simple blocks that are easy to implement and computationally efficient. These simple blocks can be easily replaced with more sophisticated ones recently proposed, such as the use of convolutional neural networks for feature representation, to further improve the accuracy. | ||||
Address | Seoul; Korea; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | ACDC | Expedition | Conference | IV | |
Notes | ADAS; 600.076; 600.057; 600.054 | Approved | no | ||
Call Number | ADAS @ adas @ GVX2015 | Serial | 2625 | ||
Permanent link to this record | |||||
Author | Alejandro Gonzalez Alzate; Gabriel Villalonga; German Ros; David Vazquez; Antonio Lopez | ||||
Title | 3D-Guided Multiscale Sliding Window for Pedestrian Detection | Type | Conference Article | ||
Year | 2015 | Publication | Pattern Recognition and Image Analysis, Proceedings of the 7th Iberian Conference, IbPRIA 2015 | Abbreviated Journal | |
Volume | 9117 | Issue | Pages | 560-568 | |
Keywords | Pedestrian Detection | ||||
Abstract | The most relevant modules of a pedestrian detector are the candidate generation and the candidate classification. The former aims at presenting image windows to the latter so that they are classified as containing a pedestrian or not. Much attention has been paid to the classification module, while candidate generation has mainly relied on a (multiscale) sliding window pyramid. However, candidate generation is critical for achieving real-time performance. In this paper we assume a context of autonomous driving based on stereo vision. Accordingly, we evaluate the effect of taking into account the 3D information (derived from the stereo) in order to prune the hundreds of thousands of windows per image generated by the classical pyramidal sliding window. For our study we use a multimodal (RGB, disparity) and multi-descriptor (HOG, LBP, HOG+LBP) holistic ensemble based on linear SVM. Evaluation on data from the challenging KITTI benchmark suite shows the effectiveness of using 3D information to dramatically reduce the number of candidate windows, even improving the overall pedestrian detection accuracy. | ||||
Address | Santiago de Compostela; Spain; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | ACDC | Expedition | Conference | IbPRIA | |
Notes | ADAS; 600.076; 600.057; 600.054 | Approved | no | ||
Call Number | ADAS @ adas @ GVR2015 | Serial | 2585 | ||
Permanent link to this record | |||||
Author | Joost Van de Weijer; Fahad Shahbaz Khan | ||||
Title | An Overview of Color Name Applications in Computer Vision | Type | Conference Article | ||
Year | 2015 | Publication | Computational Color Imaging Workshop | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | color features; color names; object recognition | ||||
Abstract | In this article we provide an overview of color name applications in computer vision. Color names are linguistic labels which humans use to communicate color. Computational color naming learns a mapping from pixels values to color names. In recent years color names have been applied to a wide variety of computer vision applications, including image classification, object recognition, texture classification, visual tracking and action recognition. Here we provide an overview of these results which show that in general color names outperform photometric invariants as a color representation. | ||||
Address | Saint Etienne; France; March 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CCIW | ||
Notes | LAMP; 600.079; 600.068 | Approved | no | ||
Call Number | Admin @ si @ WeK2015 | Serial | 2586 | ||
Permanent link to this record | |||||
Author | Sergio Escalera; Jordi Gonzalez; Xavier Baro; Pablo Pardo; Junior Fabian; Marc Oliu; Hugo Jair Escalante; Ivan Huerta; Isabelle Guyon | ||||
Title | ChaLearn Looking at People 2015 new competitions: Age Estimation and Cultural Event Recognition | Type | Conference Article | ||
Year | 2015 | Publication | IEEE International Joint Conference on Neural Networks IJCNN2015 | Abbreviated Journal | |
Volume | Issue | Pages | 1-8 | ||
Keywords | |||||
Abstract | Following previous series on Looking at People (LAP) challenges [1], [2], [3], in 2015 ChaLearn runs two new competitions within the field of Looking at People: age and cultural event recognition in still images. We propose the first crowdsourcing application to collect and label data about apparent age of people instead of the real age. In terms of cultural event recognition, tens of categories have to be recognized. This involves scene understanding and human analysis. This paper summarizes both challenges and data, providing some initial baselines. The results of the first round of the competition were presented at ChaLearn LAP 2015 IJCNN special session on computer vision and robotics http://www.dtic.ua.es/∼jgarcia/IJCNN2015. Details of the ChaLearn LAP competitions can be found at http://gesture.chalearn.org/. |
||||
Address | Killarney; Ireland; July 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IJCNN | ||
Notes | HuPBA; ISE; 600.063; 600.078; MV | Approved | no | ||
Call Number | Admin @ si @ EGB2015 | Serial | 2591 | ||
Permanent link to this record | |||||
Author | Adriana Romero; Nicolas Ballas; Samira Ebrahimi Kahou; Antoine Chassang; Carlo Gatta; Yoshua Bengio | ||||
Title | FitNets: Hints for Thin Deep Nets | Type | Conference Article | ||
Year | 2015 | Publication | 3rd International Conference on Learning Representations ICLR2015 | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Computer Science; Learning; Computer Science; Neural and Evolutionary Computing | ||||
Abstract | While depth tends to improve network performances, it also makes gradient-based training more difficult since deeper networks tend to be more non-linear. The recently proposed knowledge distillation approach is aimed at obtaining small and fast-to-execute models, and it has shown that a student network could imitate the soft output of a larger teacher network or ensemble of networks. In this paper, we extend this idea to allow the training of a student that is deeper and thinner than the teacher, using not only the outputs but also the intermediate representations learned by the teacher as hints to improve the training process and final performance of the student. Because the student intermediate hidden layer will generally be smaller than the teacher's intermediate hidden layer, additional parameters are introduced to map the student hidden layer to the prediction of the teacher hidden layer. This allows one to train deeper students that can generalize better or run faster, a trade-off that is controlled by the chosen student capacity. For example, on CIFAR-10, a deep student network with almost 10.4 times fewer parameters outperforms a larger, state-of-the-art teacher network. | ||||
Address | San Diego; CA; May 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICLR | ||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ RBK2015 | Serial | 2593 | ||
Permanent link to this record |