toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Javad Zolfaghari Bengar; Bogdan Raducanu; Joost Van de Weijer edit  url
openurl 
  Title When Deep Learners Change Their Mind: Learning Dynamics for Active Learning Type Conference Article
  Year (up) 2021 Publication 19th International Conference on Computer Analysis of Images and Patterns Abbreviated Journal  
  Volume 13052 Issue 1 Pages 403-413  
  Keywords  
  Abstract Active learning aims to select samples to be annotated that yield the largest performance improvement for the learning algorithm. Many methods approach this problem by measuring the informativeness of samples and do this based on the certainty of the network predictions for samples. However, it is well-known that neural networks are overly confident about their prediction and are therefore an untrustworthy source to assess sample informativeness. In this paper, we propose a new informativeness-based active learning method. Our measure is derived from the learning dynamics of a neural network. More precisely we track the label assignment of the unlabeled data pool during the training of the algorithm. We capture the learning dynamics with a metric called label-dispersion, which is low when the network consistently assigns the same label to the sample during the training of the network and high when the assigned label changes frequently. We show that label-dispersion is a promising predictor of the uncertainty of the network, and show on two benchmark datasets that an active learning algorithm based on label-dispersion obtains excellent results.  
  Address September 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CAIP  
  Notes LAMP; Approved no  
  Call Number Admin @ si @ ZRV2021 Serial 3673  
Permanent link to this record
 

 
Author Pau Riba; Sounak Dey; Ali Furkan Biten; Josep Llados edit   pdf
openurl 
  Title Localizing Infinity-shaped fishes: Sketch-guided object localization in the wild Type Miscellaneous
  Year (up) 2021 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract This work investigates the problem of sketch-guided object localization (SGOL), where human sketches are used as queries to conduct the object localization in natural images. In this cross-modal setting, we first contribute with a tough-to-beat baseline that without any specific SGOL training is able to outperform the previous works on a fixed set of classes. The baseline is useful to analyze the performance of SGOL approaches based on available simple yet powerful methods. We advance prior arts by proposing a sketch-conditioned DETR (DEtection TRansformer) architecture which avoids a hard classification and alleviates the domain gap between sketches and images to localize object instances. Although the main goal of SGOL is focused on object detection, we explored its natural extension to sketch-guided instance segmentation. This novel task allows to move towards identifying the objects at pixel level, which is of key importance in several applications. We experimentally demonstrate that our model and its variants significantly advance over previous state-of-the-art results. All training and testing code of our model will be released to facilitate future researchhttps://github.com/priba/sgol_wild.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ RDB2021 Serial 3674  
Permanent link to this record
 

 
Author Albert Suso; Pau Riba; Oriol Ramos Terrades; Josep Llados edit  url
openurl 
  Title A Self-supervised Inverse Graphics Approach for Sketch Parametrization Type Conference Article
  Year (up) 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume 12916 Issue Pages 28-42  
  Keywords  
  Abstract The study of neural generative models of handwritten text and human sketches is a hot topic in the computer vision field. The landmark SketchRNN provided a breakthrough by sequentially generating sketches as a sequence of waypoints, and more recent articles have managed to generate fully vector sketches by coding the strokes as Bézier curves. However, the previous attempts with this approach need them all a ground truth consisting in the sequence of points that make up each stroke, which seriously limits the datasets the model is able to train in. In this work, we present a self-supervised end-to-end inverse graphics approach that learns to embed each image to its best fit of Bézier curves. The self-supervised nature of the training process allows us to train the model in a wider range of datasets, but also to perform better after-training predictions by applying an overfitting process on the input binary image. We report qualitative an quantitative evaluations on the MNIST and the Quick, Draw! datasets.  
  Address Lausanne; Suissa; September 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ SRR2021 Serial 3675  
Permanent link to this record
 

 
Author Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal edit   pdf
url  doi
openurl 
  Title Graph-Based Deep Generative Modelling for Document Layout Generation Type Conference Article
  Year (up) 2021 Publication 16th International Conference on Document Analysis and Recognition Abbreviated Journal  
  Volume 12917 Issue Pages 525-537  
  Keywords  
  Abstract One of the major prerequisites for any deep learning approach is the availability of large-scale training data. When dealing with scanned document images in real world scenarios, the principal information of its content is stored in the layout itself. In this work, we have proposed an automated deep generative model using Graph Neural Networks (GNNs) to generate synthetic data with highly variable and plausible document layouts that can be used to train document interpretation systems, in this case, specially in digital mailroom applications. It is also the first graph-based approach for document layout generation task experimented on administrative document images, in this case, invoices.  
  Address Lausanne; Suissa; September 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121; 600.140; 110.312 Approved no  
  Call Number Admin @ si @ BRL2021 Serial 3676  
Permanent link to this record
 

 
Author Josep Llados edit  openurl
  Title The 5G of Document Intelligence Type Conference Article
  Year (up) 2021 Publication 3rd Workshop on Future of Document Analysis and Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Lausanne; Suissa; September 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference FDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ Serial 3677  
Permanent link to this record
 

 
Author AN Ruchai; VI Kober; KA Dorofeev; VN Karnaukhov; Mikhail Mozerov edit  url
doi  openurl
  Title Classification of breast abnormalities using a deep convolutional neural network and transfer learning Type Journal Article
  Year (up) 2021 Publication Journal of Communications Technology and Electronics Abbreviated Journal  
  Volume 66 Issue 6 Pages 778–783  
  Keywords  
  Abstract A new algorithm for classification of breast pathologies in digital mammography using a convolutional neural network and transfer learning is proposed. The following pretrained neural networks were chosen: MobileNetV2, InceptionResNetV2, Xception, and ResNetV2. All mammographic images were pre-processed to improve classification reliability. Transfer training was carried out using additional data augmentation and fine-tuning. The performance of the proposed algorithm for classification of breast pathologies in terms of accuracy on real data is discussed and compared with that of state-of-the-art algorithms on the available MIAS database.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; Approved no  
  Call Number Admin @ si @ RKD2022 Serial 3680  
Permanent link to this record
 

 
Author Shun Yao; Fei Yang; Yongmei Cheng; Mikhail Mozerov edit   pdf
url  doi
openurl 
  Title 3D Shapes Local Geometry Codes Learning with SDF Type Conference Article
  Year (up) 2021 Publication International Conference on Computer Vision Workshops Abbreviated Journal  
  Volume Issue Pages 2110-2117  
  Keywords  
  Abstract A signed distance function (SDF) as the 3D shape description is one of the most effective approaches to represent 3D geometry for rendering and reconstruction. Our work is inspired by the state-of-the-art method DeepSDF [17] that learns and analyzes the 3D shape as the iso-surface of its shell and this method has shown promising results especially in the 3D shape reconstruction and compression domain. In this paper, we consider the degeneration problem of reconstruction coming from the capacity decrease of the DeepSDF model, which approximates the SDF with a neural network and a single latent code. We propose Local Geometry Code Learning (LGCL), a model that improves the original DeepSDF results by learning from a local shape geometry of the full 3D shape. We add an extra graph neural network to split the single transmittable latent code into a set of local latent codes distributed on the 3D shape. Mentioned latent codes are used to approximate the SDF in their local regions, which will alleviate the complexity of the approximation compared to the original DeepSDF. Furthermore, we introduce a new geometric loss function to facilitate the training of these local latent codes. Note that other local shape adjusting methods use the 3D voxel representation, which in turn is a problem highly difficult to solve or even is insolvable. In contrast, our architecture is based on graph processing implicitly and performs the learning regression process directly in the latent code space, thus make the proposed architecture more flexible and also simple for realization. Our experiments on 3D shape reconstruction demonstrate that our LGCL method can keep more details with a significantly smaller size of the SDF decoder and outperforms considerably the original DeepSDF method under the most important quantitative metrics.  
  Address VIRTUAL; October 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCVW  
  Notes LAMP Approved no  
  Call Number Admin @ si @ YYC2021 Serial 3681  
Permanent link to this record
 

 
Author Shiqi Yang; Yaxing Wang; Joost Van de Weijer; Luis Herranz; Shangling Jui edit   pdf
url  openurl
  Title Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation Type Conference Article
  Year (up) 2021 Publication Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021) Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Domain adaptation (DA) aims to alleviate the domain shift between source domain and target domain. Most DA methods require access to the source data, but often that is not possible (e.g. due to data privacy or intellectual property). In this paper, we address the challenging source-free domain adaptation (SFDA) problem, where the source pretrained model is adapted to the target domain in the absence of source data. Our method is based on the observation that target data, which might no longer align with the source domain classifier, still forms clear clusters. We capture this intrinsic structure by defining local affinity of the target data, and encourage label consistency among data with high local affinity. We observe that higher affinity should be assigned to reciprocal neighbors, and propose a self regularization loss to decrease the negative impact of noisy neighbors. Furthermore, to aggregate information with more context, we consider expanded neighborhoods with small affinity values. In the experimental results we verify that the inherent structure of the target features is an important source of information for domain adaptation. We demonstrate that this local structure can be efficiently captured by considering the local neighbors, the reciprocal neighbors, and the expanded neighborhood. Finally, we achieve state-of-the-art performance on several 2D image and 3D point cloud recognition datasets. Code is available in https://github.com/Albert0147/SFDA_neighbors.  
  Address Online; December 7-10, 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference NIPS  
  Notes LAMP; 600.147; 600.141 Approved no  
  Call Number Admin @ si @ Serial 3691  
Permanent link to this record
 

 
Author Jose Elias Yauri; Aura Hernandez-Sabate; Pau Folch; Debora Gil edit  doi
openurl 
  Title Mental Workload Detection Based on EEG Analysis Type Conference Article
  Year (up) 2021 Publication Artificial Intelligent Research and Development. Proceedings 23rd International Conference of the Catalan Association for Artificial Intelligence. Abbreviated Journal  
  Volume 339 Issue Pages 268-277  
  Keywords Cognitive states; Mental workload; EEG analysis; Neural Networks.  
  Abstract The study of mental workload becomes essential for human work efficiency, health conditions and to avoid accidents, since workload compromises both performance and awareness. Although workload has been widely studied using several physiological measures, minimising the sensor network as much as possible remains both a challenge and a requirement.
Electroencephalogram (EEG) signals have shown a high correlation to specific cognitive and mental states like workload. However, there is not enough evidence in the literature to validate how well models generalize in case of new subjects performing tasks of a workload similar to the ones included during model’s training.
In this paper we propose a binary neural network to classify EEG features across different mental workloads. Two workloads, low and medium, are induced using two variants of the N-Back Test. The proposed model was validated in a dataset collected from 16 subjects and shown a high level of generalization capability: model reported an average recall of 81.81% in a leave-one-out subject evaluation.
 
  Address Virtual; October 20-22 2021  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CCIA  
  Notes IAM; 600.139; 600.118; 600.145 Approved no  
  Call Number Admin @ si @ Serial 3723  
Permanent link to this record
 

 
Author Trevor Canham; Javier Vazquez; D Long; Richard F. Murray; Michael S Brown edit   pdf
openurl 
  Title Noise Prism: A Novel Multispectral Visualization Technique Type Journal Article
  Year (up) 2021 Publication 31st Color and Imaging Conference Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract A novel technique for visualizing multispectral images is proposed. Inspired by how prisms work, our method spreads spectral information over a chromatic noise pattern. This is accomplished by populating the pattern with pixels representing each measurement band at a count proportional to its measured intensity. The method is advantageous because it allows for lightweight encoding and visualization of spectral information
while maintaining the color appearance of the stimulus. A four alternative forced choice (4AFC) experiment was conducted to validate the method’s information-carrying capacity in displaying metameric stimuli of varying colors and spectral basis functions. The scores ranged from 100% to 20% (less than chance given the 4AFC task), with many conditions falling somewhere in between at statistically significant intervals. Using this data, color and texture difference metrics can be evaluated and optimized to predict the legibility of the visualization technique.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CIC  
  Notes MACO; CIC Approved no  
  Call Number Admin @ si @ CVL2021 Serial 4000  
Permanent link to this record
 

 
Author Parichehr Behjati Ardakani; Pau Rodriguez; Carles Fernandez; Armin Mehri; Xavier Roca; Seiichi Ozawa; Jordi Gonzalez edit  doi
openurl 
  Title Frequency-based Enhancement Network for Efficient Super-Resolution Type Journal Article
  Year (up) 2022 Publication IEEE Access Abbreviated Journal ACCESS  
  Volume 10 Issue Pages 57383-57397  
  Keywords Deep learning; Frequency-based methods; Lightweight architectures; Single image super-resolution  
  Abstract Recently, deep convolutional neural networks (CNNs) have provided outstanding performance in single image super-resolution (SISR). Despite their remarkable performance, the lack of high-frequency information in the recovered images remains a core problem. Moreover, as the networks increase in depth and width, deep CNN-based SR methods are faced with the challenge of computational complexity in practice. A promising and under-explored solution is to adapt the amount of compute based on the different frequency bands of the input. To this end, we present a novel Frequency-based Enhancement Block (FEB) which explicitly enhances the information of high frequencies while forwarding low-frequencies to the output. In particular, this block efficiently decomposes features into low- and high-frequency and assigns more computation to high-frequency ones. Thus, it can help the network generate more discriminative representations by explicitly recovering finer details. Our FEB design is simple and generic and can be used as a direct replacement of commonly used SR blocks with no need to change network architectures. We experimentally show that when replacing SR blocks with FEB we consistently improve the reconstruction error, while reducing the number of parameters in the model. Moreover, we propose a lightweight SR model — Frequency-based Enhancement Network (FENet) — based on FEB that matches the performance of larger models. Extensive experiments demonstrate that our proposal performs favorably against the state-of-the-art SR algorithms in terms of visual quality, memory footprint, and inference time. The code is available at https://github.com/pbehjatii/FENet  
  Address 18 May 2022  
  Corporate Author Thesis  
  Publisher IEEE Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ BRF2022a Serial 3747  
Permanent link to this record
 

 
Author Giacomo Magnifico; Beata Megyesi; Mohamed Ali Souibgui; Jialuo Chen; Alicia Fornes edit   pdf
url  openurl
  Title Lost in Transcription of Graphic Signs in Ciphers Type Conference Article
  Year (up) 2022 Publication International Conference on Historical Cryptology (HistoCrypt 2022) Abbreviated Journal  
  Volume Issue Pages 153-158  
  Keywords transcription of ciphers; hand-written text recognition of symbols; graphic signs  
  Abstract Hand-written Text Recognition techniques with the aim to automatically identify and transcribe hand-written text have been applied to historical sources including ciphers. In this paper, we compare the performance of two machine learning architectures, an unsupervised method based on clustering and a deep learning method with few-shot learning. Both models are tested on seen and unseen data from historical ciphers with different symbol sets consisting of various types of graphic signs. We compare the models and highlight their differences in performance, with their advantages and shortcomings.  
  Address Amsterdam, Netherlands, June 20-22, 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference HystoCrypt  
  Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no  
  Call Number Admin @ si @ MBS2022 Serial 3731  
Permanent link to this record
 

 
Author Pau Riba; Lutz Goldmann; Oriol Ramos Terrades; Diede Rusticus; Alicia Fornes; Josep Llados edit  doi
openurl 
  Title Table detection in business document images by message passing networks Type Journal Article
  Year (up) 2022 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 127 Issue Pages 108641  
  Keywords  
  Abstract Tabular structures in business documents offer a complementary dimension to the raw textual data. For instance, there is information about the relationships among pieces of information. Nowadays, digital mailroom applications have become a key service for workflow automation. Therefore, the detection and interpretation of tables is crucial. With the recent advances in information extraction, table detection and recognition has gained interest in document image analysis, in particular, with the absence of rule lines and unknown information about rows and columns. However, business documents usually contain sensitive contents limiting the amount of public benchmarking datasets. In this paper, we propose a graph-based approach for detecting tables in document images which do not require the raw content of the document. Hence, the sensitive content can be previously removed and, instead of using the raw image or textual content, we propose a purely structural approach to keep sensitive data anonymous. Our framework uses graph neural networks (GNNs) to describe the local repetitive structures that constitute a table. In particular, our main application domain are business documents. We have carefully validated our approach in two invoice datasets and a modern document benchmark. Our experiments demonstrate that tables can be detected by purely structural approaches.  
  Address July 2022  
  Corporate Author Thesis  
  Publisher Elsevier Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.162; 600.121 Approved no  
  Call Number Admin @ si @ RGR2022 Serial 3729  
Permanent link to this record
 

 
Author Hugo Jair Escalante; Heysem Kaya; Albert Ali Salah; Sergio Escalera; Yagmur Gucluturk; Umut Guçlu; Xavier Baro; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Stephane Ayache; Evelyne Viegas; Furkan Gurpinar; Achmadnoer Sukma Wicaksana; Cynthia Liem; Marcel A. J. Van Gerven; Rob Van Lier edit   pdf
url  doi
openurl 
  Title Modeling, Recognizing, and Explaining Apparent Personality from Videos Type Journal Article
  Year (up) 2022 Publication IEEE Transactions on Affective Computing Abbreviated Journal TAC  
  Volume 13 Issue 2 Pages 894-911  
  Keywords  
  Abstract Explainability and interpretability are two critical aspects of decision support systems. Despite their importance, it is only recently that researchers are starting to explore these aspects. This paper provides an introduction to explainability and interpretability in the context of apparent personality recognition. To the best of our knowledge, this is the first effort in this direction. We describe a challenge we organized on explainability in first impressions analysis from video. We analyze in detail the newly introduced data set, evaluation protocol, proposed solutions and summarize the results of the challenge. We investigate the issue of bias in detail. Finally, derived from our study, we outline research opportunities that we foresee will be relevant in this area in the near future.  
  Address 1 April-June 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA; no menciona Approved no  
  Call Number Admin @ si @ EKS2022 Serial 3406  
Permanent link to this record
 

 
Author Mohamed Ali Souibgui; Y.Kessentini edit   pdf
url  doi
openurl 
  Title DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement Type Journal Article
  Year (up) 2022 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI  
  Volume 44 Issue 3 Pages 1180-1191  
  Keywords  
  Abstract Documents often exhibit various forms of degradation, which make it hard to be read and substantially deteriorate the performance of an OCR system. In this paper, we propose an effective end-to-end framework named Document Enhancement Generative Adversarial Networks (DE-GAN) that uses the conditional GANs (cGANs) to restore severely degraded document images. To the best of our knowledge, this practice has not been studied within the context of generative adversarial deep networks. We demonstrate that, in different tasks (document clean up, binarization, deblurring and watermark removal), DE-GAN can produce an enhanced version of the degraded document with a high quality. In addition, our approach provides consistent improvements compared to state-of-the-art methods over the widely used DIBCO 2013, DIBCO 2017 and H-DIBCO 2018 datasets, proving its ability to restore a degraded document image to its ideal condition. The obtained results on a wide variety of degradation reveal the flexibility of the proposed model to be exploited in other document enhancement problems.  
  Address 1 March 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 602.230; 600.121; 600.140 Approved no  
  Call Number Admin @ si @ SoK2022 Serial 3454  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: