toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Sounak Dey; Anjan Dutta; Suman Ghosh; Ernest Valveny; Josep Llados; Umapada Pal edit   pdf
doi  openurl
  Title Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using Text and Sketch Type Conference Article
  Year 2018 Publication 24th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 916 - 921  
  Keywords  
  Abstract In this work we introduce a cross modal image retrieval system that allows both text and sketch as input modalities for the query. A cross-modal deep network architecture is formulated to jointly model the sketch and text input modalities as well as the the image output modality, learning a common embedding between text and images and between sketches and images. In addition, an attention model is used to selectively focus the attention on the different objects of the image, allowing for retrieval with multiple objects in the query. Experiments show that the proposed method performs the best in both single and multiple object image retrieval in standard datasets.  
  Address Beijing; China; August 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; 602.167; 602.168; 600.097; 600.084; 600.121; 600.129 Approved no  
  Call Number (down) Admin @ si @ DDG2018b Serial 3152  
Permanent link to this record
 

 
Author Sounak Dey; Anjan Dutta; Suman Ghosh; Ernest Valveny; Josep Llados edit   pdf
openurl 
  Title Aligning Salient Objects to Queries: A Multi-modal and Multi-object Image Retrieval Framework Type Conference Article
  Year 2018 Publication 14th Asian Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract In this paper we propose an approach for multi-modal image retrieval in multi-labelled images. A multi-modal deep network architecture is formulated to jointly model sketches and text as input query modalities into a common embedding space, which is then further aligned with the image feature space. Our architecture also relies on a salient object detection through a supervised LSTM-based visual attention model learned from convolutional features. Both the alignment between the queries and the image and the supervision of the attention on the images are obtained by generalizing the Hungarian Algorithm using different loss functions. This permits encoding the object-based features and its alignment with the query irrespective of the availability of the co-occurrence of different objects in the training set. We validate the performance of our approach on standard single/multi-object datasets, showing state-of-the art performance in every dataset.  
  Address Perth; Australia; December 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ACCV  
  Notes DAG; 600.097; 600.121; 600.129 Approved no  
  Call Number (down) Admin @ si @ DDG2018a Serial 3151  
Permanent link to this record
 

 
Author Anastasios Doulamis; Nikolaos Doulamis; Marco Bertini; Jordi Gonzalez; Thomas B. Moeslund edit   pdf
url  openurl
  Title Introduction to the Special Issue on the Analysis and Retrieval of Events/Actions and Workflows in Video Streams Type Journal Article
  Year 2016 Publication Multimedia Tools and Applications Abbreviated Journal MTAP  
  Volume 75 Issue 22 Pages 14985-14990  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE; HUPBA Approved no  
  Call Number (down) Admin @ si @ DDB2016 Serial 2934  
Permanent link to this record
 

 
Author Anastasios Doulamis; Nikolaos Doulamis; Marco Bertini; Jordi Gonzalez; Thomas B. Moeslund edit   pdf
openurl 
  Title Analysis and Retrieval of Tracked Events and Motion in Imagery Streams Type Miscellaneous
  Year 2013 Publication ACM/IEEE international workshop on Analysis and retrieval of tracked events and motion in imagery stream Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Barcelona; October 2013  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number (down) Admin @ si @ DDB2013 Serial 2372  
Permanent link to this record
 

 
Author Mariella Dimiccoli; Marc Bolaños; Estefania Talavera; Maedeh Aghaei; Stavri G. Nikolov; Petia Radeva edit   pdf
url  doi
openurl 
  Title SR-Clustering: Semantic Regularized Clustering for Egocentric Photo Streams Segmentation Type Journal Article
  Year 2017 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU  
  Volume 155 Issue Pages 55-69  
  Keywords  
  Abstract While wearable cameras are becoming increasingly popular, locating relevant information in large unstructured collections of egocentric images is still a tedious and time consuming processes. This paper addresses the problem of organizing egocentric photo streams acquired by a wearable camera into semantically meaningful segments. First, contextual and semantic information is extracted for each image by employing a Convolutional Neural Networks approach. Later, by integrating language processing, a vocabulary of concepts is defined in a semantic space. Finally, by exploiting the temporal coherence in photo streams, images which share contextual and semantic attributes are grouped together. The resulting temporal segmentation is particularly suited for further analysis, ranging from activity and event recognition to semantic indexing and summarization. Experiments over egocentric sets of nearly 17,000 images, show that the proposed approach outperforms state-of-the-art methods.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; 601.235 Approved no  
  Call Number (down) Admin @ si @ DBT2017 Serial 2714  
Permanent link to this record
 

 
Author Ilke Demir; Dena Bazazian; Adriana Romero; Viktoriia Sharmanska; Lyne P. Tchapmi edit   pdf
doi  openurl
  Title WiCV 2018: The Fourth Women In Computer Vision Workshop Type Conference Article
  Year 2018 Publication 4th Women in Computer Vision Workshop Abbreviated Journal  
  Volume Issue Pages 1941-19412  
  Keywords Conferences; Computer vision; Industries; Object recognition; Engineering profession; Collaboration; Machine learning  
  Abstract We present WiCV 2018 – Women in Computer Vision Workshop to increase the visibility and inclusion of women researchers in computer vision field, organized in conjunction with CVPR 2018. Computer vision and machine learning have made incredible progress over the past years, yet the number of female researchers is still low both in academia and industry. WiCV is organized to raise visibility of female researchers, to increase the collaboration,
and to provide mentorship and give opportunities to femaleidentifying junior researchers in the field. In its fourth year, we are proud to present the changes and improvements over the past years, summary of statistics for presenters and attendees, followed by expectations from future generations.
 
  Address Salt Lake City; USA; June 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WiCV  
  Notes DAG; 600.121; 600.129 Approved no  
  Call Number (down) Admin @ si @ DBR2018 Serial 3222  
Permanent link to this record
 

 
Author Fadi Dornaika; Alireza Bosaghzadeh; Bogdan Raducanu edit   pdf
doi  isbn
openurl 
  Title Efficient Graph Construction for Label Propagation based Multi-observation Face Recognition Type Conference Article
  Year 2013 Publication Human Behavior Understanding 4th International Workshop Abbreviated Journal  
  Volume 8212 Issue Pages 124-135  
  Keywords  
  Abstract Workshop on Human Behavior Understanding
Human-machine interaction is a hot topic nowadays in the communities of multimedia and computer vision. In this context, face recognition algorithms (used as primary cue for a person’s identity assessment) work well under controlled conditions but degrade significantly when tested in real-world environments. Recently, graph-based label propagation for multi-observation face recognition was proposed. However, the associated graphs were constructed in an ad-hoc manner (e.g., using the KNN graph) that cannot adapt optimally to the data. In this paper, we propose a novel approach for efficient and adaptive graph construction that can be used for multi-observation face recognition as well as for other recognition problems. Experimental results performed on Honda video face database, show a distinct advantage of the proposed method over the standard graph construction methods.
 
  Address Barcelona  
  Corporate Author Thesis  
  Publisher Springer International Publishing Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-319-02713-5 Medium  
  Area Expedition Conference HBU  
  Notes OR;MV Approved no  
  Call Number (down) Admin @ si @ DBR2013 Serial 2315  
Permanent link to this record
 

 
Author Fadi Dornaika; Alireza Bosaghzadeh; Bogdan Raducanu edit   pdf
openurl 
  Title LSDA Solution Schemes for Modelless 3D Head Pose Estimation Type Conference Article
  Year 2012 Publication IEEE Workshop on the Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 393-398  
  Keywords  
  Abstract  
  Address Breckenridge; USA;  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes OR;MV Approved no  
  Call Number (down) Admin @ si @ DBR2012 Serial 1889  
Permanent link to this record
 

 
Author Alloy Das; Sanket Biswas; Umapada Pal; Josep Llados edit   pdf
url  openurl
  Title Diving into the Depths of Spotting Text in Multi-Domain Noisy Scenes Type Conference Article
  Year 2024 Publication IEEE International Conference on Robotics and Automation in PACIFICO Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract When used in a real-world noisy environment, the capacity to generalize to multiple domains is essential for any autonomous scene text spotting system. However, existing state-of-the-art methods employ pretraining and fine-tuning strategies on natural scene datasets, which do not exploit the feature interaction across other complex domains. In this work, we explore and investigate the problem of domain-agnostic scene text spotting, i.e., training a model on multi-domain source data such that it can directly generalize to target domains rather than being specialized for a specific domain or scenario. In this regard, we present the community a text spotting validation benchmark called Under-Water Text (UWT) for noisy underwater scenes to establish an important case study. Moreover, we also design an efficient super-resolution based end-to-end transformer baseline called DA-TextSpotter which achieves comparable or superior performance over existing text spotting architectures for both regular and arbitrary-shaped scene text spotting benchmarks in terms of both accuracy and model efficiency. The dataset, code and pre-trained models will be released upon acceptance.  
  Address Yokohama; Japan; May 2024  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICRA  
  Notes DAG Approved no  
  Call Number (down) Admin @ si @ DBP2024 Serial 3979  
Permanent link to this record
 

 
Author Alloy Das; Sanket Biswas; Ayan Banerjee; Josep Llados; Umapada Pal; Saumik Bhattacharya edit   pdf
url  openurl
  Title Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance Type Conference Article
  Year 2024 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 718-728  
  Keywords  
  Abstract The adaptation capability to a wide range of domains is crucial for scene text spotting models when deployed to real-world conditions. However, existing state-of-the-art (SOTA) approaches usually incorporate scene text detection and recognition simply by pretraining on natural scene text datasets, which do not directly exploit the intermediate feature representations between multiple domains. Here, we investigate the problem of domain-adaptive scene text spotting, i.e., training a model on multi-domain source data such that it can directly adapt to target domains rather than being specialized for a specific domain or scenario. Further, we investigate a transformer baseline called Swin-TESTR to focus on solving scene-text spotting for both regular and arbitrary-shaped scene text along with an exhaustive evaluation. The results clearly demonstrate the potential of intermediate representations to achieve significant performance on text spotting benchmarks across multiple domains (e.g. language, synth-to-real, and documents). both in terms of accuracy and efficiency.  
  Address Waikoloa; Hawai; USA; January 2024  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes DAG Approved no  
  Call Number (down) Admin @ si @ DBB2024 Serial 3986  
Permanent link to this record
 

 
Author Maria del Camp Davesa edit  openurl
  Title Human action categorization in image sequences Type Report
  Year 2011 Publication CVC Technical Report Abbreviated Journal  
  Volume 169 Issue Pages  
  Keywords  
  Abstract  
  Address Bellaterra (Spain)  
  Corporate Author Computer Vision Center Thesis Master's thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes CiC;CIC Approved no  
  Call Number (down) Admin @ si @ Dav2011 Serial 1934  
Permanent link to this record
 

 
Author Fadi Dornaika; Jose Manuel Alvarez; Angel Sappa; Antonio Lopez edit   pdf
doi  openurl
  Title A New Framework for Stereo Sensor Pose through Road Segmentation and Registration Type Journal Article
  Year 2011 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS  
  Volume 12 Issue 4 Pages 954-966  
  Keywords road detection  
  Abstract This paper proposes a new framework for real-time estimation of the onboard stereo head's position and orientation relative to the road surface, which is required for any advanced driver-assistance application. This framework can be used with all road types: highways, urban, etc. Unlike existing works that rely on feature extraction in either the image domain or 3-D space, we propose a framework that directly estimates the unknown parameters from the stream of stereo pairs' brightness. The proposed approach consists of two stages that are invoked for every stereo frame. The first stage segments the road region in one monocular view. The second stage estimates the camera pose using a featureless registration between the segmented monocular road region and the other view in the stereo pair. This paper has two main contributions. The first contribution combines a road segmentation algorithm with a registration technique to estimate the online stereo camera pose. The second contribution solves the registration using a featureless method, which is carried out using two different optimization techniques: 1) the differential evolution algorithm and 2) the Levenberg-Marquardt (LM) algorithm. We provide experiments and evaluations of performance. The results presented show the validity of our proposed framework.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1524-9050 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number (down) Admin @ si @ DAS2011; ADAS @ adas @ das2011a Serial 1833  
Permanent link to this record
 

 
Author Fadi Dornaika; A.Assoum; Bogdan Raducanu edit   pdf
doi  isbn
openurl 
  Title Automatic Dimensionality Estimation for Manifold Learning through Optimal Feature Selection Type Conference Article
  Year 2012 Publication Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop Abbreviated Journal  
  Volume 7626 Issue Pages 575-583  
  Keywords  
  Abstract A very important aspect in manifold learning is represented by automatic estimation of the intrinsic dimensionality. Unfortunately, this problem has received few attention in the literature of manifold learning. In this paper, we argue that feature selection paradigm can be used to the problem of automatic dimensionality estimation. Besides this, it also leads to improved recognition rates. Our approach for optimal feature selection is based on a Genetic Algorithm. As a case study for manifold learning, we have considered Laplacian Eigenmaps (LE) and Locally Linear Embedding (LLE). The effectiveness of the proposed framework was tested on the face recognition problem. Extensive experiments carried out on ORL, UMIST, Yale, and Extended Yale face data sets confirmed our hypothesis.  
  Address  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-34165-6 Medium  
  Area Expedition Conference SSPR&SPR  
  Notes OR;MV Approved no  
  Call Number (down) Admin @ si @ DAR2012 Serial 2174  
Permanent link to this record
 

 
Author Clementine Decamps; Alexis Arnaud; Florent Petitprez; Mira Ayadi; Aurelia Baures; Lucile Armenoult; Sergio Escalera; Isabelle Guyon; Remy Nicolle; Richard Tomasini; Aurelien de Reynies; Jerome Cros; Yuna Blum; Magali Richard edit   pdf
url  openurl
  Title DECONbench: a benchmarking platform dedicated to deconvolution methods for tumor heterogeneity quantification Type Journal Article
  Year 2021 Publication BMC Bioinformatics Abbreviated Journal  
  Volume 22 Issue Pages 473  
  Keywords  
  Abstract Quantification of tumor heterogeneity is essential to better understand cancer progression and to adapt therapeutic treatments to patient specificities. Bioinformatic tools to assess the different cell populations from single-omic datasets as bulk transcriptome or methylome samples have been recently developed, including reference-based and reference-free methods. Improved methods using multi-omic datasets are yet to be developed in the future and the community would need systematic tools to perform a comparative evaluation of these algorithms on controlled data.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no proj Approved no  
  Call Number (down) Admin @ si @ DAP2021 Serial 3650  
Permanent link to this record
 

 
Author Franck Davoine; Fadi Dornaika edit  openurl
  Title Head and facial animation tracking using appearance-adaptive models and particle filters Type Book Chapter
  Year 2005 Publication V. Pavlovic and T.S. Huang (editors), Real–Time Vision for Human–Computer Interaction Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Springer-Verlag  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes Approved no  
  Call Number (down) Admin @ si @ DaD2005 Serial 599  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: