toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Josep Famadas; Meysam Madadi; Cristina Palmero; Sergio Escalera edit   pdf
url  openurl
  Title Generative Video Face Reenactment by AUs and Gaze Regularization Type Conference Article
  Year 2020 Publication (up) 15th IEEE International Conference on Automatic Face and Gesture Recognition Abbreviated Journal  
  Volume Issue Pages 444-451  
  Keywords  
  Abstract In this work, we propose an encoder-decoder-like architecture to perform face reenactment in image sequences. Our goal is to transfer the training subject identity to a given test subject. We regularize face reenactment by facial action unit intensity and 3D gaze vector regression. This way, we enforce the network to transfer subtle facial expressions and eye dynamics, providing a more lifelike result. The proposed encoder-decoder receives as input the previous sequence frame stacked to the current frame image of facial landmarks. Thus, the generated frames benefit from appearance and geometry, while keeping temporal coherence for the generated sequence. At test stage, a new target subject with the facial performance of the source subject and the appearance of the training subject is reenacted. Principal component analysis is applied to project the test subject geometry to the closest training subject geometry before reenactment. Evaluation of our proposal shows faster convergence, and more accurate and realistic results in comparison to other architectures without action units and gaze regularization.  
  Address Virtual; November 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference FG  
  Notes HUPBA Approved no  
  Call Number Admin @ si @ FMP2020 Serial 3517  
Permanent link to this record
 

 
Author Fares Alnajar; Theo Gevers; Roberto Valenti; Sennay Ghebreab edit   pdf
doi  openurl
  Title Calibration-free Gaze Estimation using Human Gaze Patterns Type Conference Article
  Year 2013 Publication (up) 15th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 137-144  
  Keywords  
  Abstract We present a novel method to auto-calibrate gaze estimators based on gaze patterns obtained from other viewers. Our method is based on the observation that the gaze patterns of humans are indicative of where a new viewer will look at [12]. When a new viewer is looking at a stimulus, we first estimate a topology of gaze points (initial gaze points). Next, these points are transformed so that they match the gaze patterns of other humans to find the correct gaze points. In a flexible uncalibrated setup with a web camera and no chin rest, the proposed method was tested on ten subjects and ten images. The method estimates the gaze points after looking at a stimulus for a few seconds with an average accuracy of 4.3 im. Although the reported performance is lower than what could be achieved with dedicated hardware or calibrated setup, the proposed method still provides a sufficient accuracy to trace the viewer attention. This is promising considering the fact that auto-calibration is done in a flexible setup , without the use of a chin rest, and based only on a few seconds of gaze initialization data. To the best of our knowledge, this is the first work to use human gaze patterns in order to auto-calibrate gaze estimators.  
  Address Sydney  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCV  
  Notes ALTRES;ISE Approved no  
  Call Number Admin @ si @ AGV2013 Serial 2365  
Permanent link to this record
 

 
Author Hamdi Dibeklioglu; Albert Ali Salah; Theo Gevers edit   pdf
doi  openurl
  Title Like Father, Like Son: Facial Expression Dynamics for Kinship Verification Type Conference Article
  Year 2013 Publication (up) 15th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 1497-1504  
  Keywords  
  Abstract Kinship verification from facial appearance is a difficult problem. This paper explores the possibility of employing facial expression dynamics in this problem. By using features that describe facial dynamics and spatio-temporal appearance over smile expressions, we show that it is possible to improve the state of the art in this problem, and verify that it is indeed possible to recognize kinship by resemblance of facial expressions. The proposed method is tested on different kin relationships. On the average, 72.89% verification accuracy is achieved on spontaneous smiles.  
  Address Sydney  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCV  
  Notes ALTRES;ISE Approved no  
  Call Number Admin @ si @ DSG2013 Serial 2366  
Permanent link to this record
 

 
Author Javier Marin; David Vazquez; Antonio Lopez; Jaume Amores; Bastian Leibe edit   pdf
doi  openurl
  Title Random Forests of Local Experts for Pedestrian Detection Type Conference Article
  Year 2013 Publication (up) 15th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 2592 - 2599  
  Keywords ADAS; Random Forest; Pedestrian Detection  
  Abstract Pedestrian detection is one of the most challenging tasks in computer vision, and has received a lot of attention in the last years. Recently, some authors have shown the advantages of using combinations of part/patch-based detectors in order to cope with the large variability of poses and the existence of partial occlusions. In this paper, we propose a pedestrian detection method that efficiently combines multiple local experts by means of a Random Forest ensemble. The proposed method works with rich block-based representations such as HOG and LBP, in such a way that the same features are reused by the multiple local experts, so that no extra computational cost is needed with respect to a holistic method. Furthermore, we demonstrate how to integrate the proposed approach with a cascaded architecture in order to achieve not only high accuracy but also an acceptable efficiency. In particular, the resulting detector operates at five frames per second using a laptop machine. We tested the proposed method with well-known challenging datasets such as Caltech, ETH, Daimler, and INRIA. The method proposed in this work consistently ranks among the top performers in all the datasets, being either the best method or having a small difference with the best one.  
  Address Sydney; Australia; December 2013  
  Corporate Author Thesis  
  Publisher IEEE Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1550-5499 ISBN Medium  
  Area Expedition Conference ICCV  
  Notes ADAS; 600.057; 600.054 Approved no  
  Call Number ADAS @ adas @ MVL2013 Serial 2333  
Permanent link to this record
 

 
Author Jon Almazan; Albert Gordo; Alicia Fornes; Ernest Valveny edit   pdf
doi  openurl
  Title Handwritten Word Spotting with Corrected Attributes Type Conference Article
  Year 2013 Publication (up) 15th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 1017-1024  
  Keywords  
  Abstract We propose an approach to multi-writer word spotting, where the goal is to find a query word in a dataset comprised of document images. We propose an attributes-based approach that leads to a low-dimensional, fixed-length representation of the word images that is fast to compute and, especially, fast to compare. This approach naturally leads to an unified representation of word images and strings, which seamlessly allows one to indistinctly perform query-by-example, where the query is an image, and query-by-string, where the query is a string. We also propose a calibration scheme to correct the attributes scores based on Canonical Correlation Analysis that greatly improves the results on a challenging dataset. We test our approach on two public datasets showing state-of-the-art results.  
  Address Sydney; Australia; December 2013  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1550-5499 ISBN Medium  
  Area Expedition Conference ICCV  
  Notes DAG Approved no  
  Call Number Admin @ si @ AGF2013 Serial 2327  
Permanent link to this record
 

 
Author Gemma Roig; Xavier Boix; R. de Nijs; Sebastian Ramos; K. Kühnlenz; Luc Van Gool edit   pdf
doi  openurl
  Title Active MAP Inference in CRFs for Efficient Semantic Segmentation Type Conference Article
  Year 2013 Publication (up) 15th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 2312 - 2319  
  Keywords Semantic Segmentation  
  Abstract Most MAP inference algorithms for CRFs optimize an energy function knowing all the potentials. In this paper, we focus on CRFs where the computational cost of instantiating the potentials is orders of magnitude higher than MAP inference. This is often the case in semantic image segmentation, where most potentials are instantiated by slow classifiers fed with costly features. We introduce Active MAP inference 1) to on-the-fly select a subset of potentials to be instantiated in the energy function, leaving the rest of the parameters of the potentials unknown, and 2) to estimate the MAP labeling from such incomplete energy function. Results for semantic segmentation benchmarks, namely PASCAL VOC 2010 [5] and MSRC-21 [19], show that Active MAP inference achieves similar levels of accuracy but with major efficiency gains.  
  Address Sydney; Australia; December 2013  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1550-5499 ISBN Medium  
  Area Expedition Conference ICCV  
  Notes ADAS; 600.057 Approved no  
  Call Number ADAS @ adas @ RBN2013 Serial 2377  
Permanent link to this record
 

 
Author Fahad Shahbaz Khan; Joost Van de Weijer; Sadiq Ali; Michael Felsberg edit   pdf
doi  isbn
openurl 
  Title Evaluating the impact of color on texture recognition Type Conference Article
  Year 2013 Publication (up) 15th International Conference on Computer Analysis of Images and Patterns Abbreviated Journal  
  Volume 8047 Issue Pages 154-162  
  Keywords Color; Texture; image representation  
  Abstract State-of-the-art texture descriptors typically operate on grey scale images while ignoring color information. A common way to obtain a joint color-texture representation is to combine the two visual cues at the pixel level. However, such an approach provides sub-optimal results for texture categorisation task.
In this paper we investigate how to optimally exploit color information for texture recognition. We evaluate a variety of color descriptors, popular in image classification, for texture categorisation. In addition we analyze different fusion approaches to combine color and texture cues. Experiments are conducted on the challenging scenes and 10 class texture datasets. Our experiments clearly suggest that in all cases color names provide the best performance. Late fusion is the best strategy to combine color and texture. By selecting the best color descriptor with optimal fusion strategy provides a gain of 5% to 8% compared to texture alone on scenes and texture datasets.
 
  Address York; UK; August 2013  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-40260-9 Medium  
  Area Expedition Conference CAIP  
  Notes CIC; 600.048 Approved no  
  Call Number Admin @ si @ KWA2013 Serial 2263  
Permanent link to this record
 

 
Author Naveen Onkarappa; Angel Sappa edit  doi
isbn  openurl
  Title Laplacian Derivative based Regularization for Optical Flow Estimation in Driving Scenario Type Conference Article
  Year 2013 Publication (up) 15th International Conference on Computer Analysis of Images and Patterns Abbreviated Journal  
  Volume 8048 Issue Pages 483-490  
  Keywords Optical flow; regularization; Driver Assistance Systems; Performance Evaluation  
  Abstract Existing state of the art optical flow approaches, which are evaluated on standard datasets such as Middlebury, not necessarily have a similar performance when evaluated on driving scenarios. This drop on performance is due to several challenges arising on real scenarios during driving. Towards this direction, in this paper, we propose a modification to the regularization term in a variational optical flow formulation, that notably improves the results, specially in driving scenarios. The proposed modification consists on using the Laplacian derivatives of flow components in the regularization term instead of gradients of flow components. We show the improvements in results on a standard real image sequences dataset (KITTI).  
  Address York; UK; August 2013  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-40245-6 Medium  
  Area Expedition Conference CAIP  
  Notes ADAS; 600.055; 601.215 Approved no  
  Call Number Admin @ si @ OnS2013b Serial 2244  
Permanent link to this record
 

 
Author Marcelo D. Pistarelli; Angel Sappa; Ricardo Toledo edit  doi
isbn  openurl
  Title Multispectral Stereo Image Correspondence Type Conference Article
  Year 2013 Publication (up) 15th International Conference on Computer Analysis of Images and Patterns Abbreviated Journal  
  Volume 8048 Issue Pages 217-224  
  Keywords  
  Abstract This paper presents a novel multispectral stereo image correspondence approach. It is evaluated using a stereo rig constructed with a visible spectrum camera and a long wave infrared spectrum camera. The novelty of the proposed approach lies on the usage of Hough space as a correspondence search domain. In this way it avoids searching for correspondence in the original multispectral image domains, where information is low correlated, and a common domain is used. The proposed approach is intended to be used in outdoor urban scenarios, where images contain large amount of edges. These edges are used as distinctive characteristics for the matching in the Hough space. Experimental results are provided showing the validity of the proposed approach.  
  Address York; uk; August 2013  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-40245-6 Medium  
  Area Expedition Conference CAIP  
  Notes ADAS; 600.055 Approved no  
  Call Number Admin @ si @ PST2013 Serial 2561  
Permanent link to this record
 

 
Author Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla edit   pdf
openurl 
  Title Thermal Image Super-resolution: A Novel Architecture and Dataset Type Conference Article
  Year 2020 Publication (up) 15th International Conference on Computer Vision Theory and Applications Abbreviated Journal  
  Volume Issue Pages 111-119  
  Keywords  
  Abstract This paper proposes a novel CycleGAN architecture for thermal image super-resolution, together with a large dataset consisting of thermal images at different resolutions. The dataset has been acquired using three thermal cameras at different resolutions, which acquire images from the same scenario at the same time. The thermal cameras are mounted in rig trying to minimize the baseline distance to make easier the registration problem.
The proposed architecture is based on ResNet6 as a Generator and PatchGAN as Discriminator. The novelty on the proposed unsupervised super-resolution training (CycleGAN) is possible due to the existence of aforementioned thermal images—images of the same scenario with different resolutions. The proposed approach is evaluated in the dataset and compared with classical bicubic interpolation. The dataset and the network are available.
 
  Address Valletta; Malta; February 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference VISAPP  
  Notes MSIAU; 600.130; 600.122 Approved no  
  Call Number Admin @ si @ RSV2020 Serial 3432  
Permanent link to this record
 

 
Author Jorge Charco; Angel Sappa; Boris X. Vintimilla; Henry Velesaca edit   pdf
doi  openurl
  Title Transfer Learning from Synthetic Data in the Camera Pose Estimation Problem Type Conference Article
  Year 2020 Publication (up) 15th International Conference on Computer Vision Theory and Applications Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This paper presents a novel Siamese network architecture, as a variant of Resnet-50, to estimate the relative camera pose on multi-view environments. In order to improve the performance of the proposed model a transfer learning strategy, based on synthetic images obtained from a virtual-world, is considered. The transfer learning consists of first training the network using pairs of images from the virtual-world scenario
considering different conditions (i.e., weather, illumination, objects, buildings, etc.); then, the learned weight
of the network are transferred to the real case, where images from real-world scenarios are considered. Experimental results and comparisons with the state of the art show both, improvements on the relative pose estimation accuracy using the proposed model, as well as further improvements when the transfer learning strategy (synthetic-world data transfer learning real-world data) is considered to tackle the limitation on the
training due to the reduced number of pairs of real-images on most of the public data sets.
 
  Address Valletta; Malta; February 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference VISAPP  
  Notes MSIAU; 600.130; 601.349; 600.122 Approved no  
  Call Number Admin @ si @ CSV2020 Serial 3433  
Permanent link to this record
 

 
Author Raul Gomez; Ali Furkan Biten; Lluis Gomez; Jaume Gibert; Marçal Rusiñol; Dimosthenis Karatzas edit   pdf
url  doi
openurl 
  Title Selective Style Transfer for Text Type Conference Article
  Year 2019 Publication (up) 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 805-812  
  Keywords transfer; text style transfer; data augmentation; scene text detection  
  Abstract This paper explores the possibilities of image style transfer applied to text maintaining the original transcriptions. Results on different text domains (scene text, machine printed text and handwritten text) and cross-modal results demonstrate that this is feasible, and open different research lines. Furthermore, two architectures for selective style transfer, which means
transferring style to only desired image pixels, are proposed. Finally, scene text selective style transfer is evaluated as a data augmentation technique to expand scene text detection datasets, resulting in a boost of text detectors performance. Our implementation of the described models is publicly available.
 
  Address Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.129; 600.135; 601.338; 601.310; 600.121 Approved no  
  Call Number GBG2019 Serial 3265  
Permanent link to this record
 

 
Author Ali Furkan Biten; Ruben Tito; Andres Mafla; Lluis Gomez; Marçal Rusiñol; M. Mathew; C.V. Jawahar; Ernest Valveny; Dimosthenis Karatzas edit   pdf
url  doi
openurl 
  Title ICDAR 2019 Competition on Scene Text Visual Question Answering Type Conference Article
  Year 2019 Publication (up) 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1563-1570  
  Keywords  
  Abstract This paper presents final results of ICDAR 2019 Scene Text Visual Question Answering competition (ST-VQA). ST-VQA introduces an important aspect that is not addressed by any Visual Question Answering system up to date, namely the incorporation of scene text to answer questions asked about an image. The competition introduces a new dataset comprising 23,038 images annotated with 31,791 question / answer pairs where the answer is always grounded on text instances present in the image. The images are taken from 7 different public computer vision datasets, covering a wide range of scenarios. The competition was structured in three tasks of increasing difficulty, that require reading the text in a scene and understanding it in the context of the scene, to correctly answer a given question. A novel evaluation metric is presented, which elegantly assesses both key capabilities expected from an optimal model: text recognition and image understanding. A detailed analysis of results from different participants is showcased, which provides insight into the current capabilities of VQA systems that can read. We firmly believe the dataset proposed in this challenge will be an important milestone to consider towards a path of more robust and general models that can exploit scene text to achieve holistic image understanding.  
  Address Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.129; 601.338; 600.121 Approved no  
  Call Number Admin @ si @ BTM2019c Serial 3286  
Permanent link to this record
 

 
Author Rui Zhang; Yongsheng Zhou; Qianyi Jiang; Qi Song; Nan Li; Kai Zhou; Lei Wang; Dong Wang; Minghui Liao; Mingkun Yang; Xiang Bai; Baoguang Shi; Dimosthenis Karatzas; Shijian Lu; CV Jawahar edit   pdf
url  doi
openurl 
  Title ICDAR 2019 Robust Reading Challenge on Reading Chinese Text on Signboard Type Conference Article
  Year 2019 Publication (up) 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 1577-1581  
  Keywords  
  Abstract Chinese scene text reading is one of the most challenging problems in computer vision and has attracted great interest. Different from English text, Chinese has more than 6000 commonly used characters and Chinesecharacters can be arranged in various layouts with numerous fonts. The Chinese signboards in street view are a good choice for Chinese scene text images since they have different backgrounds, fonts and layouts. We organized a competition called ICDAR2019-ReCTS, which mainly focuses on reading Chinese text on signboard. This report presents the final results of the competition. A large-scale dataset of 25,000 annotated signboard images, in which all the text lines and characters are annotated with locations and transcriptions, were released. Four tasks, namely character recognition, text line recognition, text line detection and end-to-end recognition were set up. Besides, considering the Chinese text ambiguity issue, we proposed a multi ground truth (multi-GT) evaluation method to make evaluation fairer. The competition started on March 1, 2019 and ended on April 30, 2019. 262 submissions from 46 teams are received. Most of the participants come from universities, research institutes, and tech companies in China. There are also some participants from the United States, Australia, Singapore, and Korea. 21 teams submit results for Task 1, 23 teams submit results for Task 2, 24 teams submit results for Task 3, and 13 teams submit results for Task 4.  
  Address Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.129; 600.121 Approved no  
  Call Number Admin @ si @ LZZ2019 Serial 3335  
Permanent link to this record
 

 
Author Mohammed Al Rawi; Ernest Valveny; Dimosthenis Karatzas edit   pdf
url  doi
openurl 
  Title Can One Deep Learning Model Learn Script-Independent Multilingual Word-Spotting? Type Conference Article
  Year 2019 Publication (up) 15th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages 260-267  
  Keywords  
  Abstract Word spotting has gained increased attention lately as it can be used to extract textual information from handwritten documents and scene-text images. Current word spotting approaches are designed to work on a single language and/or script. Building intelligent models that learn script-independent multilingual word-spotting is challenging due to the large variability of multilingual alphabets and symbols. We used ResNet-152 and the Pyramidal Histogram of Characters (PHOC) embedding to build a one-model script-independent multilingual word-spotting and we tested it on Latin, Arabic, and Bangla (Indian) languages. The one-model we propose performs on par with the multi-model language-specific word-spotting system, and thus, reduces the number of models needed for each script and/or language.  
  Address Sydney; Australia; September 2019  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.129; 600.121 Approved no  
  Call Number Admin @ si @ RVK2019 Serial 3337  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: