|   | 
Details
   web
Records
Author Lei Kang; Pau Riba; Yaxing Wang; Marçal Rusiñol; Alicia Fornes; Mauricio Villegas
Title GANwriting: Content-Conditioned Generation of Styled Handwritten Word Images Type (up) Conference Article
Year 2020 Publication 16th European Conference on Computer Vision Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Although current image generation methods have reached impressive quality levels, they are still unable to produce plausible yet diverse images of handwritten words. On the contrary, when writing by hand, a great variability is observed across different writers, and even when analyzing words scribbled by the same individual, involuntary variations are conspicuous. In this work, we take a step closer to producing realistic and varied artificially rendered handwritten words. We propose a novel method that is able to produce credible handwritten word images by conditioning the generative process with both calligraphic style features and textual content. Our generator is guided by three complementary learning objectives: to produce realistic images, to imitate a certain handwriting style and to convey a specific textual content. Our model is unconstrained to any predefined vocabulary, being able to render whatever input word. Given a sample writer, it is also able to mimic its calligraphic features in a few-shot setup. We significantly advance over prior art and demonstrate with qualitative, quantitative and human-based evaluations the realistic aspect of our synthetically produced images.
Address Virtual; August 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCV
Notes DAG; 600.140; 600.121; 600.129 Approved no
Call Number Admin @ si @ KPW2020 Serial 3426
Permanent link to this record
 

 
Author Henry Velesaca; Steven Araujo; Patricia Suarez; Angel Sanchez; Angel Sappa
Title Off-the-Shelf Based System for Urban Environment Video Analytics Type (up) Conference Article
Year 2020 Publication 27th International Conference on Systems, Signals and Image Processing Abbreviated Journal
Volume Issue Pages
Keywords greenhouse gases; carbon footprint; object detection; object tracking; website framework; off-the-shelf video analytics
Abstract This paper presents the design and implementation details of a system build-up by using off-the-shelf algorithms for urban video analytics. The system allows the connection to
public video surveillance camera networks to obtain the necessary information to generate statistics from urban scenarios (e.g., amount of vehicles, type of cars, direction, numbers of persons, etc.). The obtained information could be used not only for traffic management but also to estimate the carbon footprint of urban scenarios. As a case study, a university campus is selected to evaluate the performance of the proposed system. The system is implemented in a modular way so that it is being used as a testbed to evaluate different algorithms. Implementation results are provided showing the validity and utility of the proposed approach.
Address Virtual IWSSIP
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IWSSIP
Notes MSIAU; 600.130; 601.349; 600.122 Approved no
Call Number Admin @ si @ VAS2020 Serial 3429
Permanent link to this record
 

 
Author Henry Velesaca; Raul Mira; Patricia Suarez; Christian X. Larrea; Angel Sappa
Title Deep Learning Based Corn Kernel Classification Type (up) Conference Article
Year 2020 Publication 1st International Workshop and Prize Challenge on Agriculture-Vision: Challenges & Opportunities for Computer Vision in Agriculture Abbreviated Journal
Volume Issue Pages
Keywords
Abstract This paper presents a full pipeline to classify sample sets of corn kernels. The proposed approach follows a segmentation-classification scheme. The image segmentation is performed through a well known deep learningbased approach, the Mask R-CNN architecture, while the classification is performed hrough a novel-lightweight network specially designed for this task—good corn kernel, defective corn kernel and impurity categories are considered. As a second contribution, a carefully annotated multitouching corn kernel dataset has been generated. This dataset has been used for training the segmentation and the classification modules. Quantitative evaluations have been
performed and comparisons with other approaches are provided showing improvements with the proposed pipeline.
Address Virtual CVPR
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes MSIAU; 600.130; 600.122 Approved no
Call Number Admin @ si @ VMS2020 Serial 3430
Permanent link to this record
 

 
Author Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla
Title Thermal Image Super-resolution: A Novel Architecture and Dataset Type (up) Conference Article
Year 2020 Publication 15th International Conference on Computer Vision Theory and Applications Abbreviated Journal
Volume Issue Pages 111-119
Keywords
Abstract This paper proposes a novel CycleGAN architecture for thermal image super-resolution, together with a large dataset consisting of thermal images at different resolutions. The dataset has been acquired using three thermal cameras at different resolutions, which acquire images from the same scenario at the same time. The thermal cameras are mounted in rig trying to minimize the baseline distance to make easier the registration problem.
The proposed architecture is based on ResNet6 as a Generator and PatchGAN as Discriminator. The novelty on the proposed unsupervised super-resolution training (CycleGAN) is possible due to the existence of aforementioned thermal images—images of the same scenario with different resolutions. The proposed approach is evaluated in the dataset and compared with classical bicubic interpolation. The dataset and the network are available.
Address Valletta; Malta; February 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference VISAPP
Notes MSIAU; 600.130; 600.122 Approved no
Call Number Admin @ si @ RSV2020 Serial 3432
Permanent link to this record
 

 
Author Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla; Lin Guo; Jiankun Hou; Armin Mehri; Parichehr Behjati Ardakani; Heena Patel; Vishal Chudasama; Kalpesh Prajapati; Kishor P. Upla; Raghavendra Ramachandra; Kiran Raja; Christoph Busch; Feras Almasri; Olivier Debeir; Sabari Nathan; Priya Kansal; Nolan Gutierrez; Bardia Mojra; William J. Beksi
Title Thermal Image Super-Resolution Challenge – PBVS 2020 Type (up) Conference Article
Year 2020 Publication 16h IEEE Workshop on Perception Beyond the Visible Spectrum Abbreviated Journal
Volume Issue Pages
Keywords
Abstract This paper summarizes the top contributions to the first challenge on thermal image super-resolution (TISR), which was organized as part of the Perception Beyond the Visible Spectrum (PBVS) 2020 workshop. In this challenge, a novel thermal image dataset is considered together with state-of-the-art approaches evaluated under a common framework. The dataset used in the challenge consists of 1021 thermal images, obtained from three distinct thermal cameras at different resolutions (low-resolution, mid-resolution, and high-resolution), resulting in a total of 3063 thermal images. From each resolution, 951 images are used for training and 50 for testing while the 20 remaining images are used for two proposed evaluations. The first evaluation consists of downsampling the low-resolution, mid-resolution, and high-resolution thermal images by x2, x3 and x4 respectively, and comparing their super-resolution results with the corresponding ground truth images. The second evaluation is comprised of obtaining the x2 super-resolution from a given mid-resolution thermal image and comparing it with the corresponding semi-registered high-resolution thermal image. Out of 51 registered participants, 6 teams reached the final validation phase.
Address Virtual CVPR
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes MSIAU; ISE; 600.119; 600.122 Approved no
Call Number Admin @ si @ RSV2020 Serial 3431
Permanent link to this record
 

 
Author Jorge Charco; Angel Sappa; Boris X. Vintimilla; Henry Velesaca
Title Transfer Learning from Synthetic Data in the Camera Pose Estimation Problem Type (up) Conference Article
Year 2020 Publication 15th International Conference on Computer Vision Theory and Applications Abbreviated Journal
Volume Issue Pages
Keywords
Abstract This paper presents a novel Siamese network architecture, as a variant of Resnet-50, to estimate the relative camera pose on multi-view environments. In order to improve the performance of the proposed model a transfer learning strategy, based on synthetic images obtained from a virtual-world, is considered. The transfer learning consists of first training the network using pairs of images from the virtual-world scenario
considering different conditions (i.e., weather, illumination, objects, buildings, etc.); then, the learned weight
of the network are transferred to the real case, where images from real-world scenarios are considered. Experimental results and comparisons with the state of the art show both, improvements on the relative pose estimation accuracy using the proposed model, as well as further improvements when the transfer learning strategy (synthetic-world data transfer learning real-world data) is considered to tackle the limitation on the
training due to the reduced number of pairs of real-images on most of the public data sets.
Address Valletta; Malta; February 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference VISAPP
Notes MSIAU; 600.130; 601.349; 600.122 Approved no
Call Number Admin @ si @ CSV2020 Serial 3433
Permanent link to this record
 

 
Author Xavier Soria; Edgar Riba; Angel Sappa
Title Dense Extreme Inception Network: Towards a Robust CNN Model for Edge Detection Type (up) Conference Article
Year 2020 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages
Keywords
Abstract This paper proposes a Deep Learning based edge detector, which is inspired on both HED (Holistically-Nested Edge Detection) and Xception networks. The proposed approach generates thin edge-maps that are plausible for human eyes; it can be used in any edge detection task without previous training or fine tuning process. As a second contribution, a large dataset with carefully annotated edges has been generated. This dataset has been used for training the proposed approach as well the state-of-the-art algorithms for comparisons. Quantitative and qualitative evaluations have been performed on different benchmarks showing improvements with the proposed method when F-measure of ODS and OIS are considered.
Address Aspen; USA; March 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WACV
Notes MSIAU; 600.130; 601.349; 600.122 Approved no
Call Number Admin @ si @ SRS2020 Serial 3434
Permanent link to this record
 

 
Author Ciprian Corneanu; Sergio Escalera; Aleix M. Martinez
Title Computing the Testing Error Without a Testing Set Type (up) Conference Article
Year 2020 Publication 33rd IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Oral. Paper award nominee.
Deep Neural Networks (DNNs) have revolutionized computer vision. We now have DNNs that achieve top (performance) results in many problems, including object recognition, facial expression analysis, and semantic segmentation, to name but a few. The design of the DNNs that achieve top results is, however, non-trivial and mostly done by trailand-error. That is, typically, researchers will derive many DNN architectures (i.e., topologies) and then test them on multiple datasets. However, there are no guarantees that the selected DNN will perform well in the real world. One can use a testing set to estimate the performance gap between the training and testing sets, but avoiding overfitting-to-thetesting-data is almost impossible. Using a sequestered testing dataset may address this problem, but this requires a constant update of the dataset, a very expensive venture. Here, we derive an algorithm to estimate the performance gap between training and testing that does not require any testing dataset. Specifically, we derive a number of persistent topology measures that identify when a DNN is learning to generalize to unseen samples. This allows us to compute the DNN’s testing error on unseen samples, even when we do not have access to them. We provide extensive experimental validation on multiple networks and datasets to demonstrate the feasibility of the proposed approach.
Address Virtual CVPR
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ CEM2020 Serial 3437
Permanent link to this record
 

 
Author Swathikiran Sudhakaran; Sergio Escalera; Oswald Lanz
Title Gate-Shift Networks for Video Action Recognition Type (up) Conference Article
Year 2020 Publication 33rd IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Deep 3D CNNs for video action recognition are designed to learn powerful representations in the joint spatio-temporal feature space. In practice however, because of the large number of parameters and computations involved, they may under-perform in the lack of sufficiently large datasets for training them at scale. In this paper we introduce spatial gating in spatial-temporal decomposition of 3D kernels. We implement this concept with Gate-Shift Module (GSM). GSM is lightweight and turns a 2D-CNN into a highly efficient spatio-temporal feature extractor. With GSM plugged in, a 2D-CNN learns to adaptively route features through time and combine them, at almost no additional parameters and computational overhead. We perform an extensive evaluation of the proposed module to study its effectiveness in video action recognition, achieving state-of-the-art results on Something Something-V1 and Diving48 datasets, and obtaining competitive results on EPIC-Kitchens with far less model complexity.
Address Virtual CVPR
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes HuPBA; no proj Approved no
Call Number Admin @ si @ SEL2020 Serial 3438
Permanent link to this record
 

 
Author Eduardo Aguilar; Bhalaji Nagarajan; Rupali Khatun; Marc Bolaños; Petia Radeva
Title Uncertainty Modeling and Deep Learning Applied to Food Image Analysis Type (up) Conference Article
Year 2020 Publication 13th International Joint Conference on Biomedical Engineering Systems and Technologies Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Recently, computer vision approaches specially assisted by deep learning techniques have shown unexpected advancements that practically solve problems that never have been imagined to be automatized like face recognition or automated driving. However, food image recognition has received a little effort in the Computer Vision community. In this project, we review the field of food image analysis and focus on how to combine with two challenging research lines: deep learning and uncertainty modeling. After discussing our methodology to advance in this direction, we comment potential research, social and economic impact of the research on food image analysis.
Address Villetta; Malta; February 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference BIODEVICES
Notes MILAB Approved no
Call Number Admin @ si @ ANK2020 Serial 3526
Permanent link to this record
 

 
Author Mohamed Ali Souibgui; Y.Kessentini; Alicia Fornes
Title A conditional GAN based approach for distorted camera captured documents recovery Type (up) Conference Article
Year 2020 Publication 4th Mediterranean Conference on Pattern Recognition and Artificial Intelligence Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Virtual; December 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MedPRAI
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ SKF2020 Serial 3450
Permanent link to this record
 

 
Author Fernando Vilariño
Title Unveiling the Social Impact of AI Type (up) Conference Article
Year 2020 Publication Workshop at Digital Living Lab Days Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address September 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MV; DAG; 600.121; 600.140;SIAI Approved no
Call Number Admin @ si @ Vil2020 Serial 3459
Permanent link to this record
 

 
Author Hassan Ahmed Sial; Ramon Baldrich; Maria Vanrell; Dimitris Samaras
Title Light Direction and Color Estimation from Single Image with Deep Regression Type (up) Conference Article
Year 2020 Publication London Imaging Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract We present a method to estimate the direction and color of the scene light source from a single image. Our method is based on two main ideas: (a) we use a new synthetic dataset with strong shadow effects with similar constraints to the SID dataset; (b) we define a deep architecture trained on the mentioned dataset to estimate the direction and color of the scene light source. Apart from showing good performance on synthetic images, we additionally propose a preliminary procedure to obtain light positions of the Multi-Illumination dataset, and, in this way, we also prove that our trained model achieves good performance when it is applied to real scenes.
Address Virtual; September 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference LIM
Notes CIC; 600.118; 600.140; Approved no
Call Number Admin @ si @ SBV2020 Serial 3460
Permanent link to this record
 

 
Author Sagnik Das; Hassan Ahmed Sial; Ke Ma; Ramon Baldrich; Maria Vanrell; Dimitris Samaras
Title Intrinsic Decomposition of Document Images In-the-Wild Type (up) Conference Article
Year 2020 Publication 31st British Machine Vision Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Automatic document content processing is affected by artifacts caused by the shape
of the paper, non-uniform and diverse color of lighting conditions. Fully-supervised
methods on real data are impossible due to the large amount of data needed. Hence, the
current state of the art deep learning models are trained on fully or partially synthetic images. However, document shadow or shading removal results still suffer because: (a) prior methods rely on uniformity of local color statistics, which limit their application on real-scenarios with complex document shapes and textures and; (b) synthetic or hybrid datasets with non-realistic, simulated lighting conditions are used to train the models. In this paper we tackle these problems with our two main contributions. First, a physically constrained learning-based method that directly estimates document reflectance based on intrinsic image formation which generalizes to challenging illumination conditions. Second, a new dataset that clearly improves previous synthetic ones, by adding a large range of realistic shading and diverse multi-illuminant conditions, uniquely customized to deal with documents in-the-wild. The proposed architecture works in two steps. First, a white balancing module neutralizes the color of the illumination on the input image. Based on the proposed multi-illuminant dataset we achieve a good white-balancing in really difficult conditions. Second, the shading separation module accurately disentangles the shading and paper material in a self-supervised manner where only the synthetic texture is used as a weak training signal (obviating the need for very costly ground truth with disentangled versions of shading and reflectance). The proposed approach leads to significant generalization of document reflectance estimation in real scenes with challenging illumination. We extensively evaluate on the real benchmark datasets available for intrinsic image decomposition and document shadow removal tasks. Our reflectance estimation scheme, when used as a pre-processing step of an OCR pipeline, shows a 21% improvement of character error rate (CER), thus, proving the practical applicability. The data and code will be available at: https://github.com/cvlab-stonybrook/DocIIW.
Address Virtual; September 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference BMVC
Notes CIC; 600.087; 600.140; 600.118 Approved no
Call Number Admin @ si @ DSM2020 Serial 3461
Permanent link to this record
 

 
Author Xinhang Song; Haitao Zeng; Sixian Zhang; Luis Herranz; Shuqiang Jiang
Title Generalized Zero-shot Learning with Multi-source Semantic Embeddings for Scene Recognition Type (up) Conference Article
Year 2020 Publication 28th ACM International Conference on Multimedia Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Recognizing visual categories from semantic descriptions is a promising way to extend the capability of a visual classifier beyond the concepts represented in the training data (i.e. seen categories). This problem is addressed by (generalized) zero-shot learning methods (GZSL), which leverage semantic descriptions that connect them to seen categories (e.g. label embedding, attributes). Conventional GZSL are designed mostly for object recognition. In this paper we focus on zero-shot scene recognition, a more challenging setting with hundreds of categories where their differences can be subtle and often localized in certain objects or regions. Conventional GZSL representations are not rich enough to capture these local discriminative differences. Addressing these limitations, we propose a feature generation framework with two novel components: 1) multiple sources of semantic information (i.e. attributes, word embeddings and descriptions), 2) region descriptions that can enhance scene discrimination. To generate synthetic visual features we propose a two-step generative approach, where local descriptions are sampled and used as conditions to generate visual features. The generated features are then aggregated and used together with real features to train a joint classifier. In order to evaluate the proposed method, we introduce a new dataset for zero-shot scene recognition with multi-semantic annotations. Experimental results on the proposed dataset and SUN Attribute dataset illustrate the effectiveness of the proposed method.
Address Virtual; October 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ACM
Notes LAMP; 600.141; 600.120 Approved no
Call Number Admin @ si @ SZZ2020 Serial 3465
Permanent link to this record