Publicacions CVC -- Query Results


Details

Records
Author	Lei Kang; Pau Riba; Marçal Rusiñol; Alicia Fornes; Mauricio Villegas
Title	Distilling Content from Style for Handwritten Word Recognition			Type	Conference Article
Year	2020	Publication	17th International Conference on Frontiers in Handwriting Recognition	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Despite the latest transcription accuracies reached using deep neural network architectures, handwritten text recognition still remains a challenging problem, mainly because of the large inter-writer style variability. Both augmenting the training set with artificial samples using synthetic fonts and writer adaptation techniques have been proposed to yield more generic approaches aimed at dodging style unevenness. In this work, we take a step closer to learning style-independent features from handwritten word images. We propose a novel method that is able to disentangle the content and style aspects of input images by jointly optimizing a generative process and a handwritten word recognizer. The generator is aimed at transferring writing style features from one sample to another in an image-to-image translation approach, thus leading to learned content-centric features that are independent of writing style attributes. Our proposed recognition model is then able to leverage such writer-agnostic features to reach better recognition performance. We advance over prior training strategies and demonstrate with qualitative and quantitative evaluations the performance of both the generative process and the recognition efficiency on the IAM dataset.
Address	Virtual ICFHR; September 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICFHR
Notes	DAG; 600.129; 600.140; 600.121			Approved	no
Call Number	Admin @ si @ KRR2020			Serial	3425



Author	Lei Kang; Pau Riba; Yaxing Wang; Marçal Rusiñol; Alicia Fornes; Mauricio Villegas
Title	GANwriting: Content-Conditioned Generation of Styled Handwritten Word Images			Type	Conference Article
Year	2020	Publication	16th European Conference on Computer Vision	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Although current image generation methods have reached impressive quality levels, they are still unable to produce plausible yet diverse images of handwritten words. On the contrary, when writing by hand, a great variability is observed across different writers, and even when analyzing words scribbled by the same individual, involuntary variations are conspicuous. In this work, we take a step closer to producing realistic and varied artificially rendered handwritten words. We propose a novel method that is able to produce credible handwritten word images by conditioning the generative process with both calligraphic style features and textual content. Our generator is guided by three complementary learning objectives: to produce realistic images, to imitate a certain handwriting style and to convey a specific textual content. Our model is unconstrained to any predefined vocabulary, being able to render whatever input word. Given a sample writer, it is also able to mimic its calligraphic features in a few-shot setup. We significantly advance over prior art and demonstrate with qualitative, quantitative and human-based evaluations the realistic aspect of our synthetically produced images.
Address	Virtual; August 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECCV
Notes	DAG; 600.140; 600.121; 600.129			Approved	no
Call Number	Admin @ si @ KPW2020			Serial	3426



Author	Henry Velesaca; Steven Araujo; Patricia Suarez; Angel Sanchez; Angel Sappa
Title	Off-the-Shelf Based System for Urban Environment Video Analytics			Type	Conference Article
Year	2020	Publication	27th International Conference on Systems, Signals and Image Processing	Abbreviated Journal
Volume		Issue		Pages
Keywords	greenhouse gases; carbon footprint; object detection; object tracking; website framework; off-the-shelf video analytics
Abstract	This paper presents the design and implementation details of a system built using off-the-shelf algorithms for urban video analytics. The system connects to public video surveillance camera networks to obtain the information needed to generate statistics from urban scenarios (e.g., number of vehicles, type of cars, direction, number of persons, etc.). The obtained information could be used not only for traffic management but also to estimate the carbon footprint of urban scenarios. As a case study, a university campus is selected to evaluate the performance of the proposed system. The system is implemented in a modular way so that it can be used as a testbed to evaluate different algorithms. Implementation results are provided showing the validity and utility of the proposed approach.
Address	Virtual IWSSIP
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	IWSSIP
Notes	MSIAU; 600.130; 601.349; 600.122			Approved	no
Call Number	Admin @ si @ VAS2020			Serial	3429



Author	Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla
Title	Thermal Image Super-resolution: A Novel Architecture and Dataset			Type	Conference Article
Year	2020	Publication	15th International Conference on Computer Vision Theory and Applications	Abbreviated Journal
Volume		Issue		Pages	111-119
Keywords
Abstract	This paper proposes a novel CycleGAN architecture for thermal image super-resolution, together with a large dataset consisting of thermal images at different resolutions. The dataset has been acquired using three thermal cameras at different resolutions, which acquire images from the same scenario at the same time. The thermal cameras are mounted on a rig that minimizes the baseline distance in order to ease the registration problem. The proposed architecture is based on ResNet6 as the Generator and PatchGAN as the Discriminator. The proposed unsupervised super-resolution training (CycleGAN) is possible thanks to the aforementioned thermal images, which capture the same scenario at different resolutions. The proposed approach is evaluated on the dataset and compared with classical bicubic interpolation. The dataset and the network are available.
Address	Valletta; Malta; February 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	VISAPP
Notes	MSIAU; 600.130; 600.122			Approved	no
Call Number	Admin @ si @ RSV2020			Serial	3432



Author	Ciprian Corneanu; Sergio Escalera; Aleix M. Martinez
Title	Computing the Testing Error Without a Testing Set			Type	Conference Article
Year	2020	Publication	33rd IEEE Conference on Computer Vision and Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Oral. Paper award nominee. Deep Neural Networks (DNNs) have revolutionized computer vision. We now have DNNs that achieve top (performance) results in many problems, including object recognition, facial expression analysis, and semantic segmentation, to name but a few. The design of the DNNs that achieve top results is, however, non-trivial and mostly done by trial-and-error. That is, typically, researchers will derive many DNN architectures (i.e., topologies) and then test them on multiple datasets. However, there are no guarantees that the selected DNN will perform well in the real world. One can use a testing set to estimate the performance gap between the training and testing sets, but avoiding overfitting-to-the-testing-data is almost impossible. Using a sequestered testing dataset may address this problem, but this requires a constant update of the dataset, a very expensive venture. Here, we derive an algorithm to estimate the performance gap between training and testing that does not require any testing dataset. Specifically, we derive a number of persistent topology measures that identify when a DNN is learning to generalize to unseen samples. This allows us to compute the DNN’s testing error on unseen samples, even when we do not have access to them. We provide extensive experimental validation on multiple networks and datasets to demonstrate the feasibility of the proposed approach.
Address	Virtual CVPR
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CVPR
Notes	HuPBA; no proj			Approved	no
Call Number	Admin @ si @ CEM2020			Serial	3437



Author	Swathikiran Sudhakaran; Sergio Escalera; Oswald Lanz
Title	Gate-Shift Networks for Video Action Recognition			Type	Conference Article
Year	2020	Publication	33rd IEEE Conference on Computer Vision and Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Deep 3D CNNs for video action recognition are designed to learn powerful representations in the joint spatio-temporal feature space. In practice, however, because of the large number of parameters and computations involved, they may under-perform in the absence of sufficiently large datasets for training them at scale. In this paper we introduce spatial gating in the spatial-temporal decomposition of 3D kernels. We implement this concept with the Gate-Shift Module (GSM). GSM is lightweight and turns a 2D-CNN into a highly efficient spatio-temporal feature extractor. With GSM plugged in, a 2D-CNN learns to adaptively route features through time and combine them, with almost no additional parameters or computational overhead. We perform an extensive evaluation of the proposed module to study its effectiveness in video action recognition, achieving state-of-the-art results on the Something-Something-V1 and Diving48 datasets, and obtaining competitive results on EPIC-Kitchens with far less model complexity.
Address	Virtual CVPR
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CVPR
Notes	HuPBA; no proj			Approved	no
Call Number	Admin @ si @ SEL2020			Serial	3438



Author	Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z. Li
Title	Multi-modal Face Presentation Attack Detection			Type	Book Whole
Year	2020	Publication	Synthesis Lectures on Computer Vision	Abbreviated Journal
Volume	13	Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA			Approved	no
Call Number	Admin @ si @ WGE2020			Serial	3440



Author	Mohamed Ali Souibgui; Y. Kessentini; Alicia Fornes
Title	A conditional GAN based approach for distorted camera captured documents recovery			Type	Conference Article
Year	2020	Publication	4th Mediterranean Conference on Pattern Recognition and Artificial Intelligence	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Virtual; December 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	MedPRAI
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ SKF2020			Serial	3450



Author	Manuel Carbonell; Alicia Fornes; Mauricio Villegas; Josep Llados
Title	A Neural Model for Text Localization, Transcription and Named Entity Recognition in Full Pages			Type	Journal Article
Year	2020	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
Volume	136	Issue		Pages	219-227
Keywords
Abstract	In recent years, the consolidation of deep neural network architectures for information extraction in document images has brought big improvements in the performance of each of the tasks involved in this process, consisting of text localization, transcription, and named entity recognition. However, this process is traditionally performed with separate methods for each task. In this work we propose an end-to-end model that combines a one-stage object detection network with branches for the recognition of text and named entities, respectively, in a way that shared features can be learned simultaneously from the training error of each of the tasks. By doing so, the model jointly performs handwritten text detection, transcription, and named entity recognition at page level with a single feed-forward step. We exhaustively evaluate our approach on different datasets, discussing its advantages and limitations compared to sequential approaches. The results show that the model is capable of benefiting from shared features by simultaneously solving interdependent tasks.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.140; 601.311; 600.121			Approved	no
Call Number	Admin @ si @ CFV2020			Serial	3451



Author	Fernando Vilariño
Title	Unveiling the Social Impact of AI			Type	Conference Article
Year	2020	Publication	Workshop at Digital Living Lab Days Conference	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	September 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MV; DAG; 600.121; 600.140; SIAI			Approved	no
Call Number	Admin @ si @ Vil2020			Serial	3459



Author	Hassan Ahmed Sial; Ramon Baldrich; Maria Vanrell; Dimitris Samaras
Title	Light Direction and Color Estimation from Single Image with Deep Regression			Type	Conference Article
Year	2020	Publication	London Imaging Conference	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	We present a method to estimate the direction and color of the scene light source from a single image. Our method is based on two main ideas: (a) we use a new synthetic dataset with strong shadow effects with similar constraints to the SID dataset; (b) we define a deep architecture trained on the mentioned dataset to estimate the direction and color of the scene light source. Apart from showing good performance on synthetic images, we additionally propose a preliminary procedure to obtain light positions of the Multi-Illumination dataset, and, in this way, we also prove that our trained model achieves good performance when it is applied to real scenes.
Address	Virtual; September 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	LIM
Notes	CIC; 600.118; 600.140			Approved	no
Call Number	Admin @ si @ SBV2020			Serial	3460



Author	Sagnik Das; Hassan Ahmed Sial; Ke Ma; Ramon Baldrich; Maria Vanrell; Dimitris Samaras
Title	Intrinsic Decomposition of Document Images In-the-Wild			Type	Conference Article
Year	2020	Publication	31st British Machine Vision Conference	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Automatic document content processing is affected by artifacts caused by the shape of the paper and by non-uniform, diversely colored lighting conditions. Fully-supervised methods on real data are impossible due to the large amount of data needed. Hence, the current state-of-the-art deep learning models are trained on fully or partially synthetic images. However, document shadow or shading removal results still suffer because: (a) prior methods rely on uniformity of local color statistics, which limits their application to real scenarios with complex document shapes and textures; and (b) synthetic or hybrid datasets with non-realistic, simulated lighting conditions are used to train the models. In this paper we tackle these problems with our two main contributions. First, a physically constrained learning-based method that directly estimates document reflectance based on intrinsic image formation, which generalizes to challenging illumination conditions. Second, a new dataset that clearly improves on previous synthetic ones by adding a large range of realistic shading and diverse multi-illuminant conditions, uniquely customized to deal with documents in-the-wild. The proposed architecture works in two steps. First, a white balancing module neutralizes the color of the illumination on the input image. Based on the proposed multi-illuminant dataset we achieve good white balancing in really difficult conditions. Second, the shading separation module accurately disentangles the shading and paper material in a self-supervised manner where only the synthetic texture is used as a weak training signal (obviating the need for very costly ground truth with disentangled versions of shading and reflectance). The proposed approach leads to significant generalization of document reflectance estimation in real scenes with challenging illumination. We extensively evaluate on the real benchmark datasets available for intrinsic image decomposition and document shadow removal tasks. Our reflectance estimation scheme, when used as a pre-processing step of an OCR pipeline, shows a 21% improvement in character error rate (CER), thus proving its practical applicability. The data and code will be available at: https://github.com/cvlab-stonybrook/DocIIW.
Address	Virtual; September 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	BMVC
Notes	CIC; 600.087; 600.140; 600.118			Approved	no
Call Number	Admin @ si @ DSM2020			Serial	3461



Author	Kai Wang; Luis Herranz; Anjan Dutta; Joost Van de Weijer
Title	Bookworm continual learning: beyond zero-shot learning and continual learning			Type	Conference Article
Year	2020	Publication	Workshop TASK-CV 2020	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	We propose bookworm continual learning (BCL), a flexible setting where unseen classes can be inferred via a semantic model, and the visual model can be updated continually. Thus BCL generalizes both continual learning (CL) and zero-shot learning (ZSL). We also propose the bidirectional imagination (BImag) framework to address BCL, in which features of both past and future classes are generated. We observe that conditioning the feature generator on attributes can actually harm the continual learning ability, and propose two variants (joint class-attribute conditioning and asymmetric generation) to alleviate this problem.
Address	Virtual; August 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECCVW
Notes	LAMP; 600.141; 600.120			Approved	no
Call Number	Admin @ si @ WHD2020			Serial	3466



Author	Debora Gil; Guillermo Torres
Title	A multi-shape loss function with adaptive class balancing for the segmentation of lung structures			Type	Conference Article
Year	2020	Publication	34th International Congress and Exhibition on Computer Assisted Radiology & Surgery	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Virtual; June 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CARS
Notes	IAM; 600.139; 600.145			Approved	no
Call Number	Admin @ si @ GiT2020			Serial	3472



Author	Debora Gil; Oriol Ramos Terrades; Raquel Perez
Title	Topological Radiomics (TOPiomics): Early Detection of Genetic Abnormalities in Cancer Treatment Evolution			Type	Conference Article
Year	2020	Publication	Women in Geometry and Topology	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Barcelona; September 2019
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	IAM; DAG; 600.139; 600.145; 600.121			Approved	no
Call Number	Admin @ si @ GRP2020			Serial	3473