|   | 
Details
   web
Records
Author Mohamed Ali Souibgui; Sanket Biswas; Sana Khamekhem Jemni; Yousri Kessentini; Alicia Fornes; Josep Llados; Umapada Pal
Title DocEnTr: An End-to-End Document Image Enhancement Transformer Type Conference Article
Year 2022 Publication 26th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 1699-1705
Keywords Degradation; Head; Optical character recognition; Self-supervised learning; Benchmark testing; Transformers; Magnetic heads
Abstract Document images can be affected by many degradation scenarios, which cause recognition and processing difficulties. In this age of digitization, it is important to denoise them for proper usage. To address this challenge, we present a new encoder-decoder architecture based on vision transformers to enhance both machine-printed and handwritten document images, in an end-to-end fashion. The encoder operates directly on the pixel patches with their positional information without the use of any convolutional layers, while the decoder reconstructs a clean image from the encoded patches. Conducted experiments show a superiority of the proposed model compared to the state-of the-art methods on several DIBCO benchmarks. Code and models will be publicly available at: https://github.com/dali92002/DocEnTR
Address August 21-25, 2022 , Montréal Québec
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) ICPR
Notes DAG; 600.121; 600.162; 602.230; 600.140 Approved no
Call Number Admin @ si @ SBJ2022 Serial 3730
Permanent link to this record
 

 
Author Carlos Boned Riera; Oriol Ramos Terrades
Title Discriminative Neural Variational Model for Unbalanced Classification Tasks in Knowledge Graph Type Conference Article
Year 2022 Publication 26th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 2186-2191
Keywords Measurement; Couplings; Semantics; Ear; Benchmark testing; Data models; Pattern recognition
Abstract Nowadays the paradigm of link discovery problems has shown significant improvements on Knowledge Graphs. However, method performances are harmed by the unbalanced nature of this classification problem, since many methods are easily biased to not find proper links. In this paper we present a discriminative neural variational auto-encoder model, called DNVAE from now on, in which we have introduced latent variables to serve as embedding vectors. As a result, the learnt generative model approximate better the underlying distribution and, at the same time, it better differentiate the type of relations in the knowledge graph. We have evaluated this approach on benchmark knowledge graph and Census records. Results in this last data set are quite impressive since we reach the highest possible score in the evaluation metrics. However, further experiments are still needed to deeper evaluate the performance of the method in more challenging tasks.
Address Montreal; Quebec; Canada; August 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) ICPR
Notes DAG; 600.121; 600.162 Approved no
Call Number Admin @ si @ BoR2022 Serial 3741
Permanent link to this record
 

 
Author Vacit Oguz Yazici; Joost Van de Weijer; Longlong Yu
Title Visual Transformers with Primal Object Queries for Multi-Label Image Classification Type Conference Article
Year 2022 Publication 26th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Multi-label image classification is about predicting a set of class labels that can be considered as orderless sequential data. Transformers process the sequential data as a whole, therefore they are inherently good at set prediction. The first vision-based transformer model, which was proposed for the object detection task introduced the concept of object queries. Object queries are learnable positional encodings that are used by attention modules in decoder layers to decode the object classes or bounding boxes using the region of interests in an image. However, inputting the same set of object queries to different decoder layers hinders the training: it results in lower performance and delays convergence. In this paper, we propose the usage of primal object queries that are only provided at the start of the transformer decoder stack. In addition, we improve the mixup technique proposed for multi-label classification. The proposed transformer model with primal object queries improves the state-of-the-art class wise F1 metric by 2.1% and 1.8%; and speeds up the convergence by 79.0% and 38.6% on MS-COCO and NUS-WIDE datasets respectively.
Address Montreal; Quebec; Canada; August 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) ICPR
Notes LAMP; 600.147; 601.309 Approved no
Call Number Admin @ si @ YWY2022 Serial 3786
Permanent link to this record
 

 
Author Ayan Banerjee; Palaiahnakote Shivakumara; Parikshit Acharya; Umapada Pal; Josep Llados
Title TWD: A New Deep E2E Model for Text Watermark Detection in Video Images Type Conference Article
Year 2022 Publication 26th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages
Keywords Deep learning; U-Net; FCENet; Scene text detection; Video text detection; Watermark text detection
Abstract Text watermark detection in video images is challenging because text watermark characteristics are different from caption and scene texts in the video images. Developing a successful model for detecting text watermark, caption, and scene texts is an open challenge. This study aims at developing a new Deep End-to-End model for Text Watermark Detection (TWD), caption and scene text in video images. To standardize non-uniform contrast, quality, and resolution, we explore the U-Net3+ model for enhancing poor quality text without affecting high-quality text. Similarly, to address the challenges of arbitrary orientation, text shapes and complex background, we explore Stacked Hourglass Encoded Fourier Contour Embedding Network (SFCENet) by feeding the output of the U-Net3+ model as input. Furthermore, the proposed work integrates enhancement and detection models as an end-to-end model for detecting multi-type text in video images. To validate the proposed model, we create our own dataset (named TW-866), which provides video images containing text watermark, caption (subtitles), as well as scene text. The proposed model is also evaluated on standard natural scene text detection datasets, namely, ICDAR 2019 MLT, CTW1500, Total-Text, and DAST1500. The results show that the proposed method outperforms the existing methods. This is the first work on text watermark detection in video images to the best of our knowledge
Address Montreal; Quebec; Canada; August 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) ICPR
Notes DAG; Approved no
Call Number Admin @ si @ BSA2022 Serial 3788
Permanent link to this record
 

 
Author Katerine Diaz; Francesc J. Ferri; W. Diaz
Title Fast Approximated Discriminative Common Vectors using rank-one SVD updates Type Conference Article
Year 2013 Publication 20th International Conference On Neural Information Processing Abbreviated Journal
Volume 8228 Issue III Pages 368-375
Keywords
Abstract An efficient incremental approach to the discriminative common vector (DCV) method for dimensionality reduction and classification is presented. The proposal consists of a rank-one update along with an adaptive restriction on the rank of the null space which leads to an approximate but convenient solution. The algorithm can be implemented very efficiently in terms of matrix operations and space complexity, which enables its use in large-scale dynamic application domains. Deep comparative experimentation using publicly available high dimensional image datasets has been carried out in order to properly assess the proposed algorithm against several recent incremental formulations.
K. Diaz-Chito, F.J. Ferri, W. Diaz
Address Daegu; Korea; November 2013
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-42050-4 Medium
Area Expedition Conference (down) ICONIP
Notes ADAS Approved no
Call Number Admin @ si @ DFD2013 Serial 2439
Permanent link to this record
 

 
Author Jaume Amores; David Geronimo; Antonio Lopez
Title Multiple instance and active learning for weakly-supervised object-class segmentation Type Conference Article
Year 2010 Publication 3rd IEEE International Conference on Machine Vision Abbreviated Journal
Volume Issue Pages
Keywords Multiple Instance Learning; Active Learning; Object-class segmentation.
Abstract In object-class segmentation, one of the most tedious tasks is to manually segment many object examples in order to learn a model of the object category. Yet, there has been little research on reducing the degree of manual annotation for
object-class segmentation. In this work we explore alternative strategies which do not require full manual segmentation of the object in the training set. In particular, we study the use of bounding boxes as a coarser and much cheaper form of segmentation and we perform a comparative study of several Multiple-Instance Learning techniques that allow to obtain a model with this type of weak annotation. We show that some of these methods can be competitive, when used with coarse
segmentations, with methods that require full manual segmentation of the objects. Furthermore, we show how to use active learning combined with this weakly supervised strategy.
As we see, this strategy permits to reduce the amount of annotation and optimize the number of examples that require full manual segmentation in the training set.
Address Hong-Kong
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) ICMV
Notes ADAS Approved no
Call Number ADAS @ adas @ AGL2010b Serial 1429
Permanent link to this record
 

 
Author Maedeh Aghaei; Mariella Dimiccoli; Petia Radeva
Title Towards social interaction detection in egocentric photo-streams Type Conference Article
Year 2015 Publication Proceedings of SPIE, 8th International Conference on Machine Vision , ICMV 2015 Abbreviated Journal
Volume 9875 Issue Pages
Keywords
Abstract Detecting social interaction in videos relying solely on visual cues is a valuable task that is receiving increasing attention in recent years. In this work, we address this problem in the challenging domain of egocentric photo-streams captured by a low temporal resolution wearable camera (2fpm). The major difficulties to be handled in this context are the sparsity of observations as well as unpredictability of camera motion and attention orientation due to the fact that the camera is worn as part of clothing. Our method consists of four steps: multi-faces localization and tracking, 3D localization, pose estimation and analysis of f-formations. By estimating pair-to-pair interaction probabilities over the sequence, our method states the presence or absence of interaction with the camera wearer and specifies which people are more involved in the interaction. We tested our method over a dataset of 18.000 images and we show its reliability on our considered purpose. © (2015) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) ICMV
Notes MILAB Approved no
Call Number Admin @ si @ ADR2015a Serial 2702
Permanent link to this record
 

 
Author Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Jorma Laaksonen
Title Combining Holistic and Part-based Deep Representations for Computational Painting Categorization Type Conference Article
Year 2016 Publication 6th International Conference on Multimedia Retrieval Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Automatic analysis of visual art, such as paintings, is a challenging inter-disciplinary research problem. Conventional approaches only rely on global scene characteristics by encoding holistic information for computational painting categorization.We argue that such approaches are sub-optimal and that discriminative common visual structures provide complementary information for painting classification. We present an approach that encodes both the global scene layout and discriminative latent common structures for computational painting categorization. The region of interests are automatically extracted, without any manual part labeling, by training class-specific deformable part-based models. Both holistic and region-of-interests are then described using multi-scale dense convolutional features. These features are pooled separately using Fisher vector encoding and concatenated afterwards in a single image representation. Experiments are performed on a challenging dataset with 91 different painters and 13 diverse painting styles. Our approach outperforms the standard method, which only employs the global scene characteristics. Furthermore, our method achieves state-of-the-art results outperforming a recent multi-scale deep features based approach [11] by 6.4% and 3.8% respectively on artist and style classification.
Address New York; USA; June 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) ICMR
Notes LAMP; 600.068; 600.079;ADAS Approved no
Call Number Admin @ si @ RKW2016 Serial 2763
Permanent link to this record
 

 
Author Y. Patel; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas; C.V. Jawahar
Title Self-Supervised Visual Representations for Cross-Modal Retrieval Type Conference Article
Year 2019 Publication ACM International Conference on Multimedia Retrieval Abbreviated Journal
Volume Issue Pages 182–186
Keywords
Abstract Cross-modal retrieval methods have been significantly improved in last years with the use of deep neural networks and large-scale annotated datasets such as ImageNet and Places. However, collecting and annotating such datasets requires a tremendous amount of human effort and, besides, their annotations are limited to discrete sets of popular visual classes that may not be representative of the richer semantics found on large-scale cross-modal retrieval datasets. In this paper, we present a self-supervised cross-modal retrieval framework that leverages as training data the correlations between images and text on the entire set of Wikipedia articles. Our method consists in training a CNN to predict: (1) the semantic context of the article in which an image is more probable to appear as an illustration, and (2) the semantic context of its caption. Our experiments demonstrate that the proposed method is not only capable of learning discriminative visual representations for solving vision tasks like classification, but that the learned representations are better for cross-modal retrieval when compared to supervised pre-training of the network on the ImageNet dataset.
Address Otawa; Canada; june 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) ICMR
Notes DAG; 600.121; 600.129 Approved no
Call Number Admin @ si @ PGR2019 Serial 3288
Permanent link to this record
 

 
Author Alex Falcon; Swathikiran Sudhakaran; Giuseppe Serra; Sergio Escalera; Oswald Lanz
Title Relevance-based Margin for Contrastively-trained Video Retrieval Models Type Conference Article
Year 2022 Publication ICMR '22: Proceedings of the 2022 International Conference on Multimedia Retrieval Abbreviated Journal
Volume Issue Pages 146-157
Keywords
Abstract Video retrieval using natural language queries has attracted increasing interest due to its relevance in real-world applications, from intelligent access in private media galleries to web-scale video search. Learning the cross-similarity of video and text in a joint embedding space is the dominant approach. To do so, a contrastive loss is usually employed because it organizes the embedding space by putting similar items close and dissimilar items far. This framework leads to competitive recall rates, as they solely focus on the rank of the groundtruth items. Yet, assessing the quality of the ranking list is of utmost importance when considering intelligent retrieval systems, since multiple items may share similar semantics, hence a high relevance. Moreover, the aforementioned framework uses a fixed margin to separate similar and dissimilar items, treating all non-groundtruth items as equally irrelevant. In this paper we propose to use a variable margin: we argue that varying the margin used during training based on how much relevant an item is to a given query, i.e. a relevance-based margin, easily improves the quality of the ranking lists measured through nDCG and mAP. We demonstrate the advantages of our technique using different models on EPIC-Kitchens-100 and YouCook2. We show that even if we carefully tuned the fixed margin, our technique (which does not have the margin as a hyper-parameter) would still achieve better performance. Finally, extensive ablation studies and qualitative analysis support the robustness of our approach. Code will be released at \urlhttps://github.com/aranciokov/RelevanceMargin-ICMR22.
Address Newwark, NJ, USA, 27 June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) ICMR
Notes HuPBA; no menciona Approved no
Call Number Admin @ si @ FSS2022 Serial 3808
Permanent link to this record
 

 
Author Marc Masana; Bartlomiej Twardowski; Joost Van de Weijer
Title On Class Orderings for Incremental Learning Type Conference Article
Year 2020 Publication ICML Workshop on Continual Learning Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The influence of class orderings in the evaluation of incremental learning has received very little attention. In this paper, we investigate the impact of class orderings for incrementally learned classifiers. We propose a method to compute various orderings for a dataset. The orderings are derived by simulated annealing optimization from the confusion matrix and reflect different incremental learning scenarios, including maximally and minimally confusing tasks. We evaluate a wide range of state-of-the-art incremental learning methods on the proposed orderings. Results show that orderings can have a significant impact on performance and the ranking of the methods.
Address Virtual; July 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) ICMLW
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ MTW2020 Serial 3505
Permanent link to this record
 

 
Author David Berga; Marc Masana; Joost Van de Weijer
Title Disentanglement of Color and Shape Representations for Continual Learning Type Conference Article
Year 2020 Publication ICML Workshop on Continual Learning Abbreviated Journal
Volume Issue Pages
Keywords
Abstract We hypothesize that disentangled feature representations suffer less from catastrophic forgetting. As a case study we perform explicit disentanglement of color and shape, by adjusting the network architecture. We tested classification accuracy and forgetting in a task-incremental setting with Oxford-102 Flowers dataset. We combine our method with Elastic Weight Consolidation, Learning without Forgetting, Synaptic Intelligence and Memory Aware Synapses, and show that feature disentanglement positively impacts continual learning performance.
Address Virtual; July 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) ICMLW
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ BMW2020 Serial 3506
Permanent link to this record
 

 
Author Albin Soutif; Marc Masana; Joost Van de Weijer; Bartlomiej Twardowski
Title On the importance of cross-task features for class-incremental learning Type Conference Article
Year 2021 Publication Theory and Foundation of continual learning workshop of ICML Abbreviated Journal
Volume Issue Pages
Keywords
Abstract In class-incremental learning, an agent with limited resources needs to learn a sequence of classification tasks, forming an ever growing classification problem, with the constraint of not being able to access data from previous tasks. The main difference with task-incremental learning, where a task-ID is available at inference time, is that the learner also needs to perform crosstask discrimination, i.e. distinguish between classes that have not been seen together. Approaches to tackle this problem are numerous and mostly make use of an external memory (buffer) of non-negligible size. In this paper, we ablate the learning of crosstask features and study its influence on the performance of basic replay strategies used for class-IL. We also define a new forgetting measure for class-incremental learning, and see that forgetting is not the principal cause of low performance. Our experimental results show that future algorithms for class-incremental learning should not only prevent forgetting, but also aim to improve the quality of the cross-task features. This is especially important when the number of classes per task is small.
Address Virtual; July 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) ICMLW
Notes LAMP Approved no
Call Number Admin @ si @ SMW2021 Serial 3588
Permanent link to this record
 

 
Author Isabelle Guyon; Kristin Bennett; Gavin Cawley; Hugo Jair Escalante; Sergio Escalera; Tin Kam Ho; Nuria Macia; Bisakha Ray; Mehreen Saeed; Alexander Statnikov; Evelyne Viegas
Title AutoML Challenge 2015: Design and First Results Type Conference Article
Year 2015 Publication 32nd International Conference on Machine Learning, ICML workshop, JMLR proceedings ICML15 Abbreviated Journal
Volume Issue Pages 1-8
Keywords AutoML Challenge; machine learning; model selection; meta-learning; repre- sentation learning; active learning
Abstract ChaLearn is organizing the Automatic Machine Learning (AutoML) contest 2015, which challenges participants to solve classi cation and regression problems without any human intervention. Participants' code is automatically run on the contest servers to train and test learning machines. However, there is no obligation to submit code; half of the prizes can be won by submitting prediction results only. Datasets of progressively increasing diculty are introduced throughout the six rounds of the challenge. (Participants can
enter the competition in any round.) The rounds alternate phases in which learners are tested on datasets participants have not seen (AutoML), and phases in which participants have limited time to tweak their algorithms on those datasets to improve performance (Tweakathon). This challenge will push the state of the art in fully automatic machine learning on a wide range of real-world problems. The platform will remain available beyond the termination of the challenge: http://codalab.org/AutoML.
Address Lille; France; July 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) ICML
Notes HuPBA;MILAB Approved no
Call Number Admin @ si @ GBC2015c Serial 2656
Permanent link to this record
 

 
Author Isabelle Guyon; Imad Chaabane; Hugo Jair Escalante; Sergio Escalera; Damir Jajetic; James Robert Lloyd; Nuria Macia; Bisakha Ray; Lukasz Romaszko; Michele Sebag; Alexander Statnikov; Sebastien Treguer; Evelyne Viegas
Title A brief Review of the ChaLearn AutoML Challenge: Any-time Any-dataset Learning without Human Intervention Type Conference Article
Year 2016 Publication AutoML Workshop Abbreviated Journal
Volume Issue 1 Pages 1-8
Keywords AutoML Challenge; machine learning; model selection; meta-learning; repre- sentation learning; active learning
Abstract The ChaLearn AutoML Challenge team conducted a large scale evaluation of fully automatic, black-box learning machines for feature-based classification and regression problems. The test bed was composed of 30 data sets from a wide variety of application domains and ranged across different types of complexity. Over six rounds, participants succeeded in delivering AutoML software capable of being trained and tested without human intervention. Although improvements can still be made to close the gap between human-tweaked and AutoML models, this competition contributes to the development of fully automated environments by challenging practitioners to solve problems under specific constraints and sharing their approaches; the platform will remain available for post-challenge submissions at http://codalab.org/AutoML.
Address New York; USA; June 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference (down) ICML
Notes HuPBA;MILAB Approved no
Call Number Admin @ si @ GCE2016 Serial 2769
Permanent link to this record