toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Dimosthenis Karatzas; V. Poulain d'Andecy; Marçal Rusiñol edit   pdf
doi  openurl
  Title Human-Document Interaction – a new frontier for document image analysis Type Conference Article
  Year 2016 Publication 12th IAPR Workshop on Document Analysis Systems Abbreviated Journal  
  Volume (down) Issue Pages 369-374  
  Keywords  
  Abstract All indications show that paper documents will not cede in favour of their digital counterparts, but will instead be used increasingly in conjunction with digital information. An open challenge is how to seamlessly link the physical with the digital – how to continue taking advantage of the important affordances of paper, without missing out on digital functionality. This paper
presents the authors’ experience with developing systems for Human-Document Interaction based on augmented document interfaces and examines new challenges and opportunities arising for the document image analysis field in this area. The system presented combines state of the art camera-based document
image analysis techniques with a range of complementary tech-nologies to offer fluid Human-Document Interaction. Both fixed and nomadic setups are discussed that have gone through user testing in real-life environments, and use cases are presented that span the spectrum from business to educational application
 
  Address Santorini; Greece; April 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.084; 600.077 Approved no  
  Call Number KPR2016 Serial 2756  
Permanent link to this record
 

 
Author Q. Bao; Marçal Rusiñol; M.Coustaty; Muhammad Muzzamil Luqman; C.D. Tran; Jean-Marc Ogier edit   pdf
doi  openurl
  Title Delaunay triangulation-based features for Camera-based document image retrieval system Type Conference Article
  Year 2016 Publication 12th IAPR Workshop on Document Analysis Systems Abbreviated Journal  
  Volume (down) Issue Pages 1-6  
  Keywords Camera-based Document Image Retrieval; Delaunay Triangulation; Feature descriptors; Indexing  
  Abstract In this paper, we propose a new feature vector, named DElaunay TRIangulation-based Features (DETRIF), for real-time camera-based document image retrieval. DETRIF is computed based on the geometrical constraints from each pair of adjacency triangles in delaunay triangulation which is constructed from centroids of connected components. Besides, we employ a hashing-based indexing system in order to evaluate the performance of DETRIF and to compare it with other systems such as LLAH and SRIF. The experimentation is carried out on two datasets comprising of 400 heterogeneous-content complex linguistic map images (huge size, 9800 X 11768 pixels resolution)and 700 textual document images.  
  Address Santorini; Greece; April 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference DAS  
  Notes DAG; 600.061; 600.084; 600.077 Approved no  
  Call Number Admin @ si @ BRC2016 Serial 2757  
Permanent link to this record
 

 
Author Marc Masana; Joost Van de Weijer; Andrew Bagdanov edit   pdf
openurl 
  Title On-the-fly Network pruning for object detection Type Conference Article
  Year 2016 Publication International conference on learning representations Abbreviated Journal  
  Volume (down) Issue Pages  
  Keywords  
  Abstract Object detection with deep neural networks is often performed by passing a few
thousand candidate bounding boxes through a deep neural network for each image.
These bounding boxes are highly correlated since they originate from the same
image. In this paper we investigate how to exploit feature occurrence at the image scale to prune the neural network which is subsequently applied to all bounding boxes. We show that removing units which have near-zero activation in the image allows us to significantly reduce the number of parameters in the network. Results on the PASCAL 2007 Object Detection Challenge demonstrate that up to 40% of units in some fully-connected layers can be entirely eliminated with little change in the detection result.
 
  Address Puerto Rico; May 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICLR  
  Notes LAMP; 600.068; 600.106; 600.079 Approved no  
  Call Number Admin @ si @MWB2016 Serial 2758  
Permanent link to this record
 

 
Author Ozan Caglayan; Walid Aransa; Yaxing Wang; Marc Masana; Mercedes Garcıa-Martinez; Fethi Bougares; Loic Barrault; Joost Van de Weijer edit   pdf
openurl 
  Title Does Multimodality Help Human and Machine for Translation and Image Captioning? Type Conference Article
  Year 2016 Publication 1st conference on machine translation Abbreviated Journal  
  Volume (down) Issue Pages  
  Keywords  
  Abstract This paper presents the systems developed by LIUM and CVC for the WMT16 Multimodal Machine Translation challenge. We explored various comparative methods, namely phrase-based systems and attentional recurrent neural networks models trained using monomodal or multimodal data. We also performed a human evaluation in order to estimate theusefulness of multimodal data for human machine translation and image description generation. Our systems obtained the best results for both tasks according to the automatic evaluation metrics BLEU and METEOR.  
  Address Berlin; Germany; August 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WMT  
  Notes LAMP; 600.106 ; 600.068 Approved no  
  Call Number Admin @ si @ CAW2016 Serial 2761  
Permanent link to this record
 

 
Author Esteve Cervantes; Long Long Yu; Andrew Bagdanov; Marc Masana; Joost Van de Weijer edit   pdf
openurl 
  Title Hierarchical Part Detection with Deep Neural Networks Type Conference Article
  Year 2016 Publication 23rd IEEE International Conference on Image Processing Abbreviated Journal  
  Volume (down) Issue Pages  
  Keywords Object Recognition; Part Detection; Convolutional Neural Networks  
  Abstract Part detection is an important aspect of object recognition. Most approaches apply object proposals to generate hundreds of possible part bounding box candidates which are then evaluated by part classifiers. Recently several methods have investigated directly regressing to a limited set of bounding boxes from deep neural network representation. However, for object parts such methods may be unfeasible due to their relatively small size with respect to the image. We propose a hierarchical method for object and part detection. In a single network we first detect the object and then regress to part location proposals based only on the feature representation inside the object. Experiments show that our hierarchical approach outperforms a network which directly regresses the part locations. We also show that our approach obtains part detection accuracy comparable or better than state-of-the-art on the CUB-200 bird and Fashionista clothing item datasets with only a fraction of the number of part proposals.  
  Address Phoenix; Arizona; USA; September 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICIP  
  Notes LAMP; 600.106 Approved no  
  Call Number Admin @ si @ CLB2016 Serial 2762  
Permanent link to this record
 

 
Author Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Jorma Laaksonen edit   pdf
doi  openurl
  Title Combining Holistic and Part-based Deep Representations for Computational Painting Categorization Type Conference Article
  Year 2016 Publication 6th International Conference on Multimedia Retrieval Abbreviated Journal  
  Volume (down) Issue Pages  
  Keywords  
  Abstract Automatic analysis of visual art, such as paintings, is a challenging inter-disciplinary research problem. Conventional approaches only rely on global scene characteristics by encoding holistic information for computational painting categorization.We argue that such approaches are sub-optimal and that discriminative common visual structures provide complementary information for painting classification. We present an approach that encodes both the global scene layout and discriminative latent common structures for computational painting categorization. The region of interests are automatically extracted, without any manual part labeling, by training class-specific deformable part-based models. Both holistic and region-of-interests are then described using multi-scale dense convolutional features. These features are pooled separately using Fisher vector encoding and concatenated afterwards in a single image representation. Experiments are performed on a challenging dataset with 91 different painters and 13 diverse painting styles. Our approach outperforms the standard method, which only employs the global scene characteristics. Furthermore, our method achieves state-of-the-art results outperforming a recent multi-scale deep features based approach [11] by 6.4% and 3.8% respectively on artist and style classification.  
  Address New York; USA; June 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICMR  
  Notes LAMP; 600.068; 600.079;ADAS Approved no  
  Call Number Admin @ si @ RKW2016 Serial 2763  
Permanent link to this record
 

 
Author Isabelle Guyon; Imad Chaabane; Hugo Jair Escalante; Sergio Escalera; Damir Jajetic; James Robert Lloyd; Nuria Macia; Bisakha Ray; Lukasz Romaszko; Michele Sebag; Alexander Statnikov; Sebastien Treguer; Evelyne Viegas edit  openurl
  Title A brief Review of the ChaLearn AutoML Challenge: Any-time Any-dataset Learning without Human Intervention Type Conference Article
  Year 2016 Publication AutoML Workshop Abbreviated Journal  
  Volume (down) Issue 1 Pages 1-8  
  Keywords AutoML Challenge; machine learning; model selection; meta-learning; repre- sentation learning; active learning  
  Abstract The ChaLearn AutoML Challenge team conducted a large scale evaluation of fully automatic, black-box learning machines for feature-based classification and regression problems. The test bed was composed of 30 data sets from a wide variety of application domains and ranged across different types of complexity. Over six rounds, participants succeeded in delivering AutoML software capable of being trained and tested without human intervention. Although improvements can still be made to close the gap between human-tweaked and AutoML models, this competition contributes to the development of fully automated environments by challenging practitioners to solve problems under specific constraints and sharing their approaches; the platform will remain available for post-challenge submissions at http://codalab.org/AutoML.  
  Address New York; USA; June 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICML  
  Notes HuPBA;MILAB Approved no  
  Call Number Admin @ si @ GCE2016 Serial 2769  
Permanent link to this record
 

 
Author Jun Wan; Yibing Zhao; Shuai Zhou; Isabelle Guyon; Sergio Escalera edit   pdf
doi  openurl
  Title ChaLearn Looking at People RGB-D Isolated and Continuous Datasets for Gesture Recognition Type Conference Article
  Year 2016 Publication 29th IEEE Conference on Computer Vision and Pattern Recognition Worshops Abbreviated Journal  
  Volume (down) Issue Pages  
  Keywords  
  Abstract In this paper, we present two large video multi-modal datasets for RGB and RGB-D gesture recognition: the ChaLearn LAP RGB-D Isolated Gesture Dataset (IsoGD)and the Continuous Gesture Dataset (ConGD). Both datasets are derived from the ChaLearn Gesture Dataset
(CGD) that has a total of more than 50000 gestures for the “one-shot-learning” competition. To increase the potential of the old dataset, we designed new well curated datasets composed of 249 gesture labels, and including 47933 gestures manually labeled the begin and end frames in sequences.Using these datasets we will open two competitions
on the CodaLab platform so that researchers can test and compare their methods for “user independent” gesture recognition. The first challenge is designed for gesture spotting
and recognition in continuous sequences of gestures while the second one is designed for gesture classification from segmented data. The baseline method based on the bag of visual words model is also presented.
 
  Address Las Vegas; USA; July 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes HuPBA;MILAB; Approved no  
  Call Number Admin @ si @ WZZ2016 Serial 2771  
Permanent link to this record
 

 
Author Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera edit   pdf
doi  openurl
  Title Support Vector Machines with Time Series Distance Kernels for Action Classification Type Conference Article
  Year 2016 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume (down) Issue Pages 1-7  
  Keywords  
  Abstract Despite the outperformance of Support Vector Machine (SVM) on many practical classification problems, the algorithm is not directly applicable to multi-dimensional trajectories having different lengths. In this paper, a new class of SVM that is applicable to trajectory classification, such as action recognition, is developed by incorporating two efficient time-series distances measures into the kernel function.
Dynamic Time Warping and Longest Common Subsequence distance measures along with their derivatives are
employed as the SVM kernel. In addition, the pairwise proximity learning strategy is utilized in order to make use of non-positive semi-definite kernels in the SVM formulation. The proposed method is employed for a challenging classification problem: action recognition by depth cameras using only skeleton data; and evaluated on three benchmark action datasets. Experimental results demonstrate the outperformance of our methodology compared to the state-ofthe-art on the considered datasets.
 
  Address Lake Placid; NY (USA); March 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes HuPBA;MILAB; Approved no  
  Call Number Admin @ si @ BGE2016a Serial 2773  
Permanent link to this record
 

 
Author Daniel Hernandez; Alejandro Chacon; Antonio Espinosa; David Vazquez; Juan Carlos Moure; Antonio Lopez edit   pdf
openurl 
  Title Stereo Matching using SGM on the GPU Type Report
  Year 2016 Publication Programming and Tuning Massively Parallel Systems Abbreviated Journal PUMPS  
  Volume (down) Issue Pages  
  Keywords CUDA; Stereo; Autonomous Vehicle  
  Abstract Dense, robust and real-time computation of depth information from stereo-camera systems is a computationally demanding requirement for robotics, advanced driver assistance systems (ADAS) and autonomous vehicles. Semi-Global Matching (SGM) is a widely used algorithm that propagates consistency constraints along several paths across the image. This work presents a real-time system producing reliable disparity estimation results on the new embedded energy efficient GPU devices. Our design runs on a Tegra X1 at 42 frames per second (fps) for an image size of 640x480, 128 disparity levels, and using 4 path directions for the SGM method.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference PUMPS  
  Notes ADAS; 600.085; 600.087; 600.076 Approved no  
  Call Number ADAS @ adas @ HCE2016b Serial 2776  
Permanent link to this record
 

 
Author Maria Oliver; Gloria Haro; Mariella Dimiccoli; Baptiste Mazin; Coloma Ballester edit   pdf
openurl 
  Title A computational model of amodal completion Type Conference Article
  Year 2016 Publication SIAM Conference on Imaging Science Abbreviated Journal  
  Volume (down) Issue Pages  
  Keywords  
  Abstract This paper presents a computational model to recover the most likely interpretation of the 3D scene structure from a planar image, where some objects may occlude others. The estimated scene interpretation is obtained by integrating some global and local cues and provides both the complete disoccluded objects that form the scene and their ordering according to depth. Our method first computes several distal scenes which are compatible with the proximal planar image. To compute these different hypothesized scenes, we propose a perceptually inspired object disocclusion method, which works by minimizing the Euler's elastica as well as by incorporating the relatability of partially occluded contours and the convexity of the disoccluded objects. Then, to estimate the preferred scene we rely on a Bayesian model and define probabilities taking into account the global complexity of the objects in the hypothesized scenes as well as the effort of bringing these objects in their relative position in the planar image, which is also measured by an Euler's elastica-based quantity. The model is illustrated with numerical experiments on, both, synthetic and real images showing the ability of our model to reconstruct the occluded objects and the preferred perceptual order among them. We also present results on images of the Berkeley dataset with provided figure-ground ground-truth labeling.  
  Address Albuquerque; New Mexico; USA; May 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IS  
  Notes MILAB; 601.235 Approved no  
  Call Number Admin @ si @OHD2016a Serial 2788  
Permanent link to this record
 

 
Author G. de Oliveira; A. Cartas; Marc Bolaños; Mariella Dimiccoli; Xavier Giro; Petia Radeva edit   pdf
openurl 
  Title LEMoRe: A Lifelog Engine for Moments Retrieval at the NTCIR-Lifelog LSAT Task Type Conference Article
  Year 2016 Publication 12th NTCIR Conference on Evaluation of Information Access Technologies Abbreviated Journal  
  Volume (down) Issue Pages  
  Keywords  
  Abstract Semantic image retrieval from large amounts of egocentric visual data requires to leverage powerful techniques for filling in the semantic gap. This paper introduces LEMoRe, a Lifelog Engine for Moments Retrieval, developed in the context of the Lifelog Semantic Access Task (LSAT) of the the NTCIR-12 challenge and discusses its performance variation on different trials. LEMoRe integrates classical image descriptors with high-level semantic concepts extracted by Convolutional Neural Networks (CNN), powered by a graphic user interface that uses natural language processing. Although this is just a first attempt towards interactive image retrieval from large egocentric datasets and there is a large room for improvement of the system components and the user interface, the structure of the system itself and the way the single components cooperate are very promising.  
  Address Tokyo; Japan; June 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference NTCIR  
  Notes MILAB; Approved no  
  Call Number Admin @ si @OCB2016 Serial 2789  
Permanent link to this record
 

 
Author G. de Oliveira; Mariella Dimiccoli; Petia Radeva edit  openurl
  Title Egocentric Image Retrieval With Deep Convolutional Neural Networks Type Conference Article
  Year 2016 Publication 19th International Conference of the Catalan Association for Artificial Intelligence Abbreviated Journal  
  Volume (down) Issue Pages 71-76  
  Keywords  
  Abstract  
  Address Barcelona; Spain; October 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CCIA  
  Notes MILAB Approved no  
  Call Number Admin @ si @ODR2016 Serial 2790  
Permanent link to this record
 

 
Author Maedeh Aghaei; Mariella Dimiccoli; Petia Radeva edit   pdf
openurl 
  Title With whom do I interact with? Social interaction detection in egocentric photo-streams Type Conference Article
  Year 2016 Publication 23rd International Conference on Pattern Recognition Abbreviated Journal  
  Volume (down) Issue Pages  
  Keywords  
  Abstract Given a user wearing a low frame rate wearable camera during a day, this work aims to automatically detect the moments when the user gets engaged into a social interaction solely by reviewing the automatically captured photos by the worn camera. The proposed method, inspired by the sociological concept of F-formation, exploits distance and orientation of the appearing individuals -with respect to the user- in the scene from a bird-view perspective. As a result, the interaction pattern over the sequence can be understood as a two-dimensional time series that corresponds to the temporal evolution of the distance and orientation features over time. A Long-Short Term Memory-based Recurrent Neural Network is then trained to classify each time series. Experimental evaluation over a dataset of 30.000 images has shown promising results on the proposed method for social interaction detection in egocentric photo-streams.  
  Address Cancun; Mexico; December 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes MILAB Approved no  
  Call Number Admin @ si @ADR2016a Serial 2791  
Permanent link to this record
 

 
Author Y. Patel; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas edit   pdf
openurl 
  Title Dynamic Lexicon Generation for Natural Scene Images Type Conference Article
  Year 2016 Publication 14th European Conference on Computer Vision Workshops Abbreviated Journal  
  Volume (down) Issue Pages 395-410  
  Keywords scene text; photo OCR; scene understanding; lexicon generation; topic modeling; CNN  
  Abstract Many scene text understanding methods approach the endtoend recognition problem from a word-spotting perspective and take huge bene t from using small per-image lexicons. Such customized lexicons are normally assumed as given and their source is rarely discussed.
In this paper we propose a method that generates contextualized lexicons
for scene images using only visual information. For this, we exploit
the correlation between visual and textual information in a dataset consisting
of images and textual content associated with them. Using the topic modeling framework to discover a set of latent topics in such a dataset allows us to re-rank a xed dictionary in a way that prioritizes the words that are more likely to appear in a given image. Moreover, we train a CNN that is able to reproduce those word rankings but using only the image raw pixels as input. We demonstrate that the quality of the automatically obtained custom lexicons is superior to a generic frequency-based baseline.
 
  Address Amsterdam; The Netherlands; October 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes DAG; 600.084 Approved no  
  Call Number Admin @ si @ PGR2016 Serial 2825  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: