Author A. Martinez
  Title Disseny d'agents autonoms. Type Miscellaneous
  Year 1994 Publication Graduating Project Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis Master's thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes Approved no  
  Call Number Admin @ si @ Mar1994 Serial 236  
 

 
Author Jordi Vitria
  Title Disseny de sistemes (intel.ligents) de visio. Type Miscellaneous
  Year 1996 Publication Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes OR;MV Approved no  
  Call Number BCNPCL @ bcnpcl @ Vit1996a Serial 88  
 

 
Author Yu Jie; Jaume Amores; N. Sebe; Petia Radeva; Tian Qi
  Title Distance Learning for Similarity Estimation Type Journal
  Year 2008 Publication IEEE Trans. on Pattern Analysis and Machine Intelligence Abbreviated Journal  
  Volume 30 Issue 3 Pages 451–462  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS;MILAB Approved no  
  Call Number ADAS @ adas @ JAS2008 Serial 961  
 

 
Author Lei Kang; Pau Riba; Marçal Rusiñol; Alicia Fornes; Mauricio Villegas
  Title Distilling Content from Style for Handwritten Word Recognition Type Conference Article
  Year 2020 Publication 17th International Conference on Frontiers in Handwriting Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Despite the latest transcription accuracies reached using deep neural network architectures, handwritten text recognition still remains a challenging problem, mainly because of the large inter-writer style variability. Both augmenting the training set with artificial samples using synthetic fonts and writer adaptation techniques have been proposed to yield more generic approaches aimed at dodging style unevenness. In this work, we take a step closer to learning style-independent features from handwritten word images. We propose a novel method that is able to disentangle the content and style aspects of input images by jointly optimizing a generative process and a handwritten word recognizer. The generator is aimed at transferring writing style features from one sample to another in an image-to-image translation approach, thus leading to learned content-centric features that are independent of writing style attributes. Our proposed recognition model is then able to leverage such writer-agnostic features to reach better recognition performance. We advance over prior training strategies and demonstrate with qualitative and quantitative evaluations the performance of both the generative process and the recognition efficiency on the IAM dataset.
 
  Address Virtual ICFHR; September 2020  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICFHR  
  Notes DAG; 600.129; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ KRR2020 Serial 3425  
 

 
Author Yaxing Wang; Joost Van de Weijer; Lu Yu; Shangling Jui
  Title Distilling GANs with Style-Mixed Triplets for X2I Translation with Limited Data Type Conference Article
  Year 2022 Publication 10th International Conference on Learning Representations Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Conditional image synthesis is an integral part of many X2I translation systems, including image-to-image, text-to-image and audio-to-image translation systems. Training these large systems generally requires huge amounts of training data. Therefore, we investigate knowledge distillation to transfer knowledge from a high-quality unconditioned generative model (e.g., StyleGAN) to conditioned synthetic image generation modules in a variety of systems. To initialize the conditional and reference branch (from an unconditional GAN) we exploit the style mixing characteristics of high-quality GANs to generate an infinite supply of style-mixed triplets to perform the knowledge distillation. Extensive experimental results in a number of image generation tasks (i.e., image-to-image, semantic segmentation-to-image, text-to-image and audio-to-image) demonstrate qualitatively and quantitatively that our method successfully transfers knowledge to the synthetic image generation modules, resulting in more realistic images than previous methods as confirmed by a significant drop in the FID.
 
  Address Virtual  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICLR  
  Notes LAMP; 600.147 Approved no  
  Call Number Admin @ si @ WWY2022 Serial 3791  
 

 
Author Pau Riba
  Title Distilling Structure from Imagery: Graph-based Models for the Interpretation of Document Images Type Book Whole
  Year 2020 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract From its early stages, the community of Pattern Recognition and Computer Vision has considered the importance of leveraging structural information when understanding images. Usually, graphs have been proposed as a suitable model to represent this kind of information due to their flexibility and representational power, which allow them to codify both the components, objects, or entities and their pairwise relationships. Even though graphs have been successfully applied to a huge variety of tasks, as a result of their symbolic and relational nature, graphs have always suffered from some limitations compared to statistical approaches. Indeed, some trivial mathematical operations do not have an equivalent in the graph domain. For instance, at the core of many pattern recognition applications, there is a need to compare two objects. This operation, which is trivial when considering feature vectors defined in \(\mathbb{R}^n\), is not properly defined for graphs.

In this thesis, we have investigated the importance of structural information from two perspectives: the traditional graph-based methods and the new advances in Geometric Deep Learning. On the one hand, we explore the problem of defining a graph representation and how to deal with it in a large-scale and noisy scenario. On the other hand, Graph Neural Networks are proposed first to redefine Graph Edit Distance methodologies as a metric learning problem, and second, to apply them in a real use case scenario for the detection of repetitive patterns which define tables in invoice documents. As an experimental framework, we have validated the different methodological contributions in the domain of Document Image Analysis and Recognition.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Josep Llados;Alicia Fornes  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-121011-6-4 Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ Rib20 Serial 3478  
 

 
Author Sudeep Katakol; Basem Elbarashy; Luis Herranz; Joost Van de Weijer; Antonio Lopez
  Title Distributed Learning and Inference with Compressed Images Type Journal Article
  Year 2021 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 30 Issue Pages 3069-3083  
  Keywords  
  Abstract Modern computer vision requires processing large amounts of data, both while training the model and/or during inference, once the model is deployed. Scenarios where images are captured and processed in physically separated locations are increasingly common (e.g. autonomous vehicles, cloud computing). In addition, many devices suffer from limited resources to store or transmit data (e.g. storage space, channel capacity). In these scenarios, lossy image compression plays a crucial role to effectively increase the number of images collected under such constraints. However, lossy compression entails some undesired degradation of the data that may harm the performance of the downstream analysis task at hand, since important semantic information may be lost in the process. Moreover, we may only have compressed images at training time but are able to use original images at inference time, or vice versa, and in such a case, the downstream model suffers from covariate shift. In this paper, we analyze this phenomenon, with a special focus on vision-based perception for autonomous driving as a paradigmatic scenario. We see that loss of semantic information and covariate shift do indeed exist, resulting in a drop in performance that depends on the compression rate. In order to address the problem, we propose dataset restoration, based on image restoration with generative adversarial networks (GANs). Our method is agnostic to both the particular image compression method and the downstream task; and has the advantage of not adding additional cost to the deployed models, which is particularly important in resource-limited devices. The presented experiments focus on semantic segmentation as a challenging use case, cover a broad range of compression rates and diverse datasets, and show how our method is able to significantly alleviate the negative effects of compression on the downstream visual task.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; ADAS; 600.120; 600.118 Approved no  
  Call Number Admin @ si @ KEH2021 Serial 3543  
 

 
Author Eduard Vazquez
  Title Distribution Characterization using Topological Features. Application to Colour Image Processing Type Report
  Year 2007 Publication CVC Technical Report #107 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address CVC (UAB)  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes Approved no  
  Call Number Admin @ si @ Vaz2007a Serial 823  
 

 
Author Eduard Vazquez
  Title Distribution Characterization using Topological Features. Application to Colour Image Processing Type Report
  Year 2007 Publication CVC Technical Report #107 Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis Master's thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes Approved no  
  Call Number Admin @ si @ Vaz2009 Serial 1254  
 

 
Author Ariel Amato; Angel Sappa; Alicia Fornes; Felipe Lumbreras; Josep Llados
  Title Divide and Conquer: Atomizing and Parallelizing A Task in A Mobile Crowdsourcing Platform Type Conference Article
  Year 2013 Publication 2nd International ACM Workshop on Crowdsourcing for Multimedia Abbreviated Journal  
  Volume Issue Pages 21-22  
  Keywords  
  Abstract In this paper we present some conclusions about the advantages of having an efficient task formulation when a crowdsourcing platform is used. In particular we show how the task atomization and distribution can help to obtain results in an efficient way. Our proposal is based on a recursive splitting of the original task into a set of smaller and simpler tasks. As a result both more accurate and faster solutions are obtained. Our evaluation is performed on a set of ancient documents that need to be digitized.  
  Address Barcelona; October 2013  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-4503-2396-3 Medium  
  Area Expedition Conference CrowdMM  
  Notes ADAS; ISE; DAG; 600.054; 600.055; 600.045; 600.061; 602.006 Approved no  
  Call Number Admin @ si @ SLA2013 Serial 2335  
 

 
Author Alloy Das; Sanket Biswas; Umapada Pal; Josep Llados
  Title Diving into the Depths of Spotting Text in Multi-Domain Noisy Scenes Type Conference Article
  Year 2024 Publication IEEE International Conference on Robotics and Automation in PACIFICO Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract When used in a real-world noisy environment, the capacity to generalize to multiple domains is essential for any autonomous scene text spotting system. However, existing state-of-the-art methods employ pretraining and fine-tuning strategies on natural scene datasets, which do not exploit the feature interaction across other complex domains. In this work, we explore and investigate the problem of domain-agnostic scene text spotting, i.e., training a model on multi-domain source data such that it can directly generalize to target domains rather than being specialized for a specific domain or scenario. In this regard, we present to the community a text spotting validation benchmark called Under-Water Text (UWT) for noisy underwater scenes to establish an important case study. Moreover, we also design an efficient super-resolution based end-to-end transformer baseline called DA-TextSpotter which achieves comparable or superior performance over existing text spotting architectures for both regular and arbitrary-shaped scene text spotting benchmarks in terms of both accuracy and model efficiency. The dataset, code and pre-trained models will be released upon acceptance.  
  Address Yokohama; Japan; May 2024  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICRA  
  Notes DAG Approved no  
  Call Number Admin @ si @ DBP2024 Serial 3979  
 

 
Author C. Alejandro Parraga; Jordi Roca; Maria Vanrell
  Title Do Basic Colors Influence Chromatic Adaptation? Type Journal Article
  Year 2011 Publication Journal of Vision Abbreviated Journal VSS  
  Volume 11 Issue 11 Pages 85  
  Keywords  
  Abstract Color constancy (the ability to perceive colors relatively stable under different illuminants) is the result of several mechanisms spread across different neural levels and responding to several visual scene cues. It is usually measured by estimating the perceived color of a grey patch under an illuminant change. In this work, we hypothesize whether chromatic adaptation (without a reference white or grey) could be driven by certain colors, specifically those corresponding to the universal color terms proposed by Berlin and Kay (1969). To this end we have developed a new psychophysical paradigm in which subjects adjust the color of a test patch (in CIELab space) to match their memory of the best example of a given color chosen from the universal terms list (grey, red, green, blue, yellow, purple, pink, orange and brown). The test patch is embedded inside a Mondrian image and presented on a calibrated CRT screen inside a dark cabin. All subjects were trained to “recall” their most exemplary colors reliably from memory and asked to always produce the same basic colors when required under several adaptation conditions. These include achromatic and colored Mondrian backgrounds, under a simulated D65 illuminant and several colored illuminants. A set of basic colors were measured for each subject under neutral conditions (achromatic background and D65 illuminant) and used as “reference” for the rest of the experiment. The colors adjusted by the subjects in each adaptation condition were compared to the reference colors under the corresponding illuminant and a “constancy index” was obtained for each of them. Our results show that for some colors the constancy index was better than for grey. The set of best adapted colors in each condition were common to a majority of subjects and were dependent on the chromaticity of the illuminant and the chromatic background considered.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1534-7362 ISBN Medium  
  Area Expedition Conference  
  Notes CIC Approved no  
  Call Number Admin @ si @ PRV2011 Serial 1759  
 

 
Author Adriana Romero; Carlo Gatta
  Title Do We Really Need All These Neurons? Type Conference Article
  Year 2013 Publication 6th Iberian Conference on Pattern Recognition and Image Analysis Abbreviated Journal  
  Volume 7887 Issue Pages 460–467  
  Keywords Restricted Boltzmann Machine; hidden units; unsupervised learning; classification  
  Abstract Restricted Boltzmann Machines (RBMs) are generative neural networks that have received much attention recently. In particular, choosing the appropriate number of hidden units is important as it might hinder their representative power. According to the literature, RBMs require numerous hidden units to approximate any distribution properly. In this paper, we present an experiment to determine whether such a number of hidden units is required in a classification context. We then propose an incremental algorithm that trains RBMs reusing the previously trained parameters, using a trade-off measure to determine the appropriate number of hidden units. Results on the MNIST and OCR letters databases show that using a number of hidden units one order of magnitude smaller than the literature estimate suffices to achieve similar performance. Moreover, the proposed algorithm allows estimating the required number of hidden units without the need to train many RBMs from scratch.  
  Address Madeira; Portugal; June 2013  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-38627-5 Medium  
  Area Expedition Conference IbPRIA  
  Notes MILAB; 600.046 Approved no  
  Call Number Admin @ si @ RoG2013 Serial 2311  
 

 
Author Andrea Gemelli; Sanket Biswas; Enrico Civitelli; Josep Llados; Simone Marinai
  Title Doc2Graph: A Task Agnostic Document Understanding Framework Based on Graph Neural Networks Type Conference Article
  Year 2022 Publication 17th European Conference on Computer Vision Workshops Abbreviated Journal  
  Volume 13804 Issue Pages 329–344  
  Keywords  
  Abstract Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis. The application of Graph Neural Networks (GNNs) has become crucial in various document-related tasks since they can unravel important structural patterns, fundamental in key information extraction processes. Previous works in the literature propose task-driven models and do not take into account the full power of graphs. We propose Doc2Graph, a task-agnostic document understanding framework based on a GNN model, to solve different tasks given different types of documents. We evaluated our approach on two challenging datasets for key information extraction in form understanding, invoice layout analysis and table detection.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-031-25068-2 Medium  
  Area Expedition Conference ECCV-TiE  
  Notes DAG; 600.162; 600.140; 110.312 Approved no  
  Call Number Admin @ si @ GBC2022 Serial 3795  
 

 
Author Mohamed Ali Souibgui; Sanket Biswas; Sana Khamekhem Jemni; Yousri Kessentini; Alicia Fornes; Josep Llados; Umapada Pal
  Title DocEnTr: An End-to-End Document Image Enhancement Transformer Type Conference Article
  Year 2022 Publication 26th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 1699-1705  
  Keywords Degradation; Head; Optical character recognition; Self-supervised learning; Benchmark testing; Transformers; Magnetic heads  
  Abstract Document images can be affected by many degradation scenarios, which cause recognition and processing difficulties. In this age of digitization, it is important to denoise them for proper usage. To address this challenge, we present a new encoder-decoder architecture based on vision transformers to enhance both machine-printed and handwritten document images, in an end-to-end fashion. The encoder operates directly on the pixel patches with their positional information without the use of any convolutional layers, while the decoder reconstructs a clean image from the encoded patches. The experiments conducted show the superiority of the proposed model over state-of-the-art methods on several DIBCO benchmarks. Code and models will be publicly available at: https://github.com/dali92002/DocEnTR  
  Address August 21-25, 2022, Montréal, Québec  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no  
  Call Number Admin @ si @ SBJ2022 Serial 3730  