Author Vacit Oguz Yazici; Abel Gonzalez-Garcia; Arnau Ramisa; Bartlomiej Twardowski; Joost Van de Weijer
Title Orderless Recurrent Models for Multi-label Classification Type Conference Article
Year 2020 Publication 33rd IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Recurrent neural networks (RNN) are popular for many computer vision tasks, including multi-label classification. Since RNNs produce sequential outputs, labels need to be ordered for the multi-label classification task. Current approaches sort labels according to their frequency, typically ordering them in either rare-first or frequent-first. These imposed orderings do not take into account that the natural order to generate the labels can change for each image, e.g. first the dominant object before summing up the smaller objects in the image. Therefore, in this paper, we propose ways to dynamically order the ground truth labels with the predicted label sequence. This allows for the faster training of more optimal LSTM models for multi-label classification. Our analysis shows that our method does not suffer from duplicate generation, something which is common for other models. Furthermore, it outperforms other CNN-RNN models, and we show that a standard architecture of an image encoder and language decoder trained with our proposed loss obtains state-of-the-art results on the challenging MS-COCO, WIDER Attribute and PA-100K datasets and competitive results on NUS-WIDE.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes LAMP; 600.109; 601.309; 600.141; 600.120 Approved no
Call Number Admin @ si @ YGR2020 Serial 3408
Permanent link to this record
 

 
Author Jaume Garcia; Francesc Carreras; Sandra Pujades; Debora Gil
Title Regional motion patterns for the Left Ventricle function assessment Type Conference Article
Year 2008 Publication Proc. 19th Int. Conf. Pattern Recognition ICPR 2008 Abbreviated Journal
Volume Issue Pages 1-4
Keywords
Abstract Regional scores (e.g. strain, perfusion) of the Left Ventricle (LV) functionality are playing an increasing role in the diagnosis of cardiac diseases. A main limitation is the lack of normality models for complementary scores oriented to assessment of the LV integrity. This paper introduces an original framework based on a parametrization of the LV domain, which allows comparison across subjects of local physiological measures of different nature. We compute regional normality patterns in a feature space characterizing the LV function. We show the consistency of the model for the regional motion on healthy and hypokinetic pathological cases.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM Approved no
Call Number IAM @ iam @ GCP2008 Serial 1510
Permanent link to this record
 

 
Author Marçal Rusiñol; Lluis Pere de las Heras; Oriol Ramos Terrades
Title Flowchart Recognition for Non-Textual Information Retrieval in Patent Search Type Journal Article
Year 2014 Publication Information Retrieval Abbreviated Journal IR
Volume 17 Issue 5-6 Pages 545-562
Keywords Flowchart recognition; Patent documents; Text/graphics separation; Raster-to-vector conversion; Symbol recognition
Abstract Relatively little research has been done on the topic of patent image retrieval and in general in most of the approaches the retrieval is performed in terms of a similarity measure between the query image and the images in the corpus. However, systems aimed at overcoming the semantic gap between the visual description of patent images and their conveyed concepts would be very helpful for patent professionals. In this paper we present a flowchart recognition method aimed at achieving a structured representation of flowchart images that can be further queried semantically. The proposed method was submitted to the CLEF-IP 2012 flowchart recognition task. We report the obtained results on this dataset.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1386-4564 ISBN Medium
Area Expedition Conference
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ RHR2013 Serial 2342
Permanent link to this record
 

 
Author Marc Oliu; Ciprian Corneanu; Kamal Nasrollahi; Olegs Nikisins; Sergio Escalera; Yunlian Sun; Haiqing Li; Zhenan Sun; Thomas B. Moeslund; Modris Greitans
Title Improved RGB-D-T based Face Recognition Type Journal Article
Year 2016 Publication IET Biometrics Abbreviated Journal BIO
Volume 5 Issue 4 Pages 297-303
Keywords
Abstract Reliable facial recognition systems are of crucial importance in various applications from entertainment to security. Thanks to the deep-learning concepts introduced in the field, a significant improvement in the performance of unimodal facial recognition systems has been observed in recent years. At the same time, multimodal facial recognition is a promising approach. This study combines the latest successes in both directions by applying deep learning convolutional neural networks (CNN) to the multimodal RGB, depth, and thermal (RGB-D-T) based facial recognition problem, outperforming previously published results. Furthermore, a late fusion of the CNN-based recognition block with various hand-crafted features (local binary patterns, histograms of oriented gradients, Haar-like rectangular features, histograms of Gabor ordinal measures) is introduced, demonstrating even better recognition performance on a benchmark RGB-D-T database. The results obtained in this study show that the classical engineered features and CNN-based features can complement each other for recognition purposes.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA;MILAB; Approved no
Call Number Admin @ si @ OCN2016 Serial 2854
Permanent link to this record
 

 
Author Partha Pratim Roy; Umapada Pal; Josep Llados
Title Seal detection and recognition: An approach for document indexing Type Conference Article
Year 2009 Publication 10th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages 101–105
Keywords
Abstract Reliable indexing of documents having seal instances can be achieved by recognizing seal information. This paper presents a novel approach for detecting and classifying such multi-oriented seals in these documents. First, Hough Transform based methods are applied to extract the seal regions in documents. Next, isolated text characters within these regions are detected. Rotation and size invariant features and a support vector machine based classifier are used to recognize these detected text characters. Next, for each pair of characters, we encode their relative spatial organization using their distance and angular position with respect to the centre of the seal, and enter this code into a hash table. Given an input seal, we recognize the individual text characters and compute the code for each pair of characters based on their relative spatial organization. The codes obtained from the input seal help to retrieve model hypotheses from the hash table. The seal model that receives the maximum number of hypotheses is selected for the recognition of the input seal. The methodology is tested to index seals in a rotation and size invariant environment and we obtained encouraging results.
Address Barcelona, Spain
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1520-5363 ISBN 978-1-4244-4500-4 Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number DAG @ dag @ RPL2009b Serial 1239
Permanent link to this record
 

 
Author Mohamed Ramzy Ibrahim; Robert Benavente; Daniel Ponsa; Felipe Lumbreras
Title SWViT-RRDB: Shifted Window Vision Transformer Integrating Residual in Residual Dense Block for Remote Sensing Super-Resolution Type Conference Article
Year 2024 Publication 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Remote sensing applications, impacted by acquisition season and sensor variety, require high-resolution images. Transformer-based models improve satellite image super-resolution but are less effective than convolutional neural networks (CNNs) at extracting local details, crucial for image clarity. This paper introduces SWViT-RRDB, a new deep learning model for satellite imagery super-resolution. The SWViT-RRDB, combining transformer with convolution and attention blocks, overcomes the limitations of existing models by better representing small objects in satellite images. In this model, a pipeline of residual fusion group (RFG) blocks is used to combine the multi-headed self-attention (MSA) with residual in residual dense block (RRDB). This combines global and local image data for better super-resolution. Additionally, an overlapping cross-attention block (OCAB) is used to enhance fusion and allow interaction between neighboring pixels to maintain long-range pixel dependencies across the image. The SWViT-RRDB model and its larger variants outperform state-of-the-art (SoTA) models on two different satellite datasets in terms of PSNR and SSIM.
Address Roma; Italia; February 2024
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MSIAU Approved no
Call Number Admin @ si @ RBP2024 Serial 4004
Permanent link to this record
 

 
Author Jaume Gibert; Ernest Valveny; Horst Bunke
Title Feature Selection on Node Statistics Based Embedding of Graphs Type Journal Article
Year 2012 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 33 Issue 15 Pages 1980–1990
Keywords Structural pattern recognition; Graph embedding; Feature ranking; PCA; Graph classification
Abstract Representing a graph with a feature vector is a common way of making statistical machine learning algorithms applicable to the domain of graphs. Such a transition from graphs to vectors is known as graph embedding. A key issue in graph embedding is to select a proper set of features in order to make the vectorial representation of graphs as strong and discriminative as possible. In this article, we propose features that are constructed out of frequencies of node label representatives. We first build a large set of features and then select the most discriminative ones according to different ranking criteria and feature transformation algorithms. On different classification tasks, we experimentally show that only a small significant subset of these features is needed to achieve the same classification rates as competing state-of-the-art methods.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG Approved no
Call Number Admin @ si @ GVB2012b Serial 1993
Permanent link to this record
 

 
Author Naveen Onkarappa; Angel Sappa
Title Synthetic sequences and ground-truth flow field generation for algorithm validation Type Journal Article
Year 2015 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 74 Issue 9 Pages 3121-3135
Keywords Ground-truth optical flow; Synthetic sequence; Algorithm validation
Abstract Research in computer vision advances thanks to the availability of good datasets that help to improve algorithms, validate results and obtain comparative analysis. The datasets can be real or synthetic. For some computer vision problems such as optical flow, it is not possible to obtain ground-truth optical flow with high accuracy in natural outdoor real scenarios directly by any sensor, although it is possible to obtain ground-truth data of real scenarios in a laboratory setup with limited motion. In this difficult situation computer graphics offers a viable option for creating realistic virtual scenarios. In the current work we present a framework to design virtual scenes and generate sequences as well as ground-truth flow fields. In particular, we generate a dataset containing sequences of driving scenarios. The sequences in the dataset vary in the speed of the on-board vision system, the road texture, the complexity of the vehicle motion and the presence of independently moving vehicles in the scene. This dataset enables the analysis and adaptation of existing optical flow methods, and leads to the invention of new approaches, particularly for driver assistance systems.
Address
Corporate Author Thesis
Publisher Springer US Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1380-7501 ISBN Medium
Area Expedition Conference
Notes ADAS; 600.055; 601.215; 600.076 Approved no
Call Number Admin @ si @ OnS2014b Serial 2472
Permanent link to this record
 

 
Author Riccardo Del Chiaro; Bartlomiej Twardowski; Andrew Bagdanov; Joost Van de Weijer
Title Recurrent attention to transient tasks for continual image captioning Type Conference Article
Year 2020 Publication 34th Conference on Neural Information Processing Systems Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Research on continual learning has led to a variety of approaches to mitigating catastrophic forgetting in feed-forward classification networks. Until now surprisingly little attention has been focused on continual learning of recurrent models applied to problems like image captioning. In this paper we take a systematic look at continual learning of LSTM-based models for image captioning. We propose an attention-based approach that explicitly accommodates the transient nature of vocabularies in continual image captioning tasks, i.e. that task vocabularies are not disjoint. We call our method Recurrent Attention to Transient Tasks (RATT), and also show how to adapt continual learning approaches based on weight regularization and knowledge distillation to recurrent continual learning problems. We apply our approaches to the incremental image captioning problem on two new continual learning benchmarks we define using the MS-COCO and Flickr30 datasets. Our results demonstrate that RATT is able to sequentially learn five captioning tasks while incurring no forgetting of previously learned ones.
Address virtual; December 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference NEURIPS
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ CTB2020 Serial 3484
Permanent link to this record
 

 
Author Soumya Jahagirdar; Minesh Mathew; Dimosthenis Karatzas; CV Jawahar
Title Understanding Video Scenes Through Text: Insights from Text-Based Video Question Answering Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Researchers have extensively studied the field of vision and language, discovering that both visual and textual content is crucial for understanding scenes effectively. Particularly, comprehending text in videos holds great significance, requiring both scene text understanding and temporal reasoning. This paper focuses on exploring two recently introduced datasets, NewsVideoQA and M4-ViteVQA, which aim to address video question answering based on textual content. The NewsVideoQA dataset contains question-answer pairs related to the text in news videos, while M4-ViteVQA comprises question-answer pairs from diverse categories like vlogging, traveling, and shopping. We provide an analysis of the formulation of these datasets on various levels, exploring the degree of visual understanding and multi-frame comprehension required for answering the questions. Additionally, the study includes experimentation with BERT-QA, a text-only model, which demonstrates comparable performance to the original methods on both datasets, indicating the shortcomings in the formulation of these datasets. Furthermore, we also look into the domain adaptation aspect by examining the effectiveness of training on M4-ViteVQA and evaluating on NewsVideoQA and vice-versa, thereby shedding light on the challenges and potential benefits of out-of-domain training.
Address Paris; France; October 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCVW
Notes DAG Approved no
Call Number Admin @ si @ JMK2023 Serial 3946
Permanent link to this record
 

 
Author Alejandro Tabas; Emili Balaguer-Ballester; Laura Igual
Title Spatial Discriminant ICA for RS-fMRI characterisation Type Conference Article
Year 2014 Publication 4th International Workshop on Pattern Recognition in Neuroimaging Abbreviated Journal
Volume Issue Pages 1-4
Keywords
Abstract Resting-State fMRI (RS-fMRI) is a brain imaging technique useful for exploring functional connectivity. A major point of interest in RS-fMRI analysis is to isolate connectivity patterns characterising disorders such as ADHD. Such characterisation is usually performed in two steps: first, all connectivity patterns in the data are extracted by means of Independent Component Analysis (ICA); second, standard statistical tests are performed over the extracted patterns to find differences between control and clinical groups. In this work we introduce a novel, single-step approach for this problem termed Spatial Discriminant ICA. The algorithm can efficiently isolate networks of functional connectivity characterising a clinical group by combining ICA and a new variant of the Fisher’s Linear Discriminant also introduced in this work. As the characterisation is carried out in a single step, it potentially provides for a richer characterisation of inter-class differences. The algorithm is tested using synthetic and real fMRI data, showing promising results in both experiments.
Address Tübingen; June 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4799-4150-6 Medium
Area Expedition Conference PRNI
Notes OR;MILAB Approved no
Call Number Admin @ si @ TBI2014 Serial 2493
Permanent link to this record
 

 
Author Debora Gil; Oriol Rodriguez; J. Mauri; Petia Radeva
Title Statistical descriptors of the Myocardial perfusion in angiographic images Type Conference Article
Year 2006 Publication Proc. Computers in Cardiology Abbreviated Journal
Volume Issue Pages 677-680
Keywords Anisotropic processing; intravascular ultrasound (IVUS); vessel border segmentation; vessel structure classification.
Abstract Restoration of coronary flow after primary percutaneous coronary intervention in acute myocardial infarction does not always correlate with adequate myocardial perfusion. Recently, coronary angiography has been used to assess microcirculation integrity (Myocardial Blush Analysis, MBA). Although MBA correlates with patient prognosis, there are few image processing methods addressing objective perfusion quantification. The goal of this work is to develop statistical descriptors of the myocardial dyeing pattern allowing objective assessment of myocardial perfusion. Experiments on healthy right coronary arteries show that our approach allows reliable measurements without any specific image acquisition protocol.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM;MILAB Approved no
Call Number IAM @ iam @ GRR2006 Serial 1528
Permanent link to this record
 

 
Author Adriana Romero; Carlo Gatta
Title Do We Really Need All These Neurons? Type Conference Article
Year 2013 Publication 6th Iberian Conference on Pattern Recognition and Image Analysis Abbreviated Journal
Volume 7887 Issue Pages 460-467
Keywords Restricted Boltzmann Machine; hidden units; unsupervised learning; classification
Abstract Restricted Boltzmann Machines (RBMs) are generative neural networks that have received much attention recently. In particular, choosing the appropriate number of hidden units is important as it might hinder their representative power. According to the literature, RBMs require numerous hidden units to approximate any distribution properly. In this paper, we present an experiment to determine whether such a number of hidden units is required in a classification context. We then propose an incremental algorithm that trains RBMs reusing the previously trained parameters, using a trade-off measure to determine the appropriate number of hidden units. Results on the MNIST and OCR letters databases show that using a number of hidden units one order of magnitude smaller than the literature estimate suffices to achieve similar performance. Moreover, the proposed algorithm allows estimating the required number of hidden units without the need to train many RBMs from scratch.
Address Madeira; Portugal; June 2013
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-38627-5 Medium
Area Expedition Conference IbPRIA
Notes MILAB; 600.046 Approved no
Call Number Admin @ si @ RoG2013 Serial 2311
Permanent link to this record
 

 
Author Masakazu Iwamura; Naoyuki Morimoto; Keishi Tainaka; Dena Bazazian; Lluis Gomez; Dimosthenis Karatzas
Title ICDAR2017 Robust Reading Challenge on Omnidirectional Video Type Conference Article
Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Results of the ICDAR 2017 Robust Reading Challenge on Omnidirectional Video are presented. This competition uses the Downtown Osaka Scene Text (DOST) Dataset that was captured in Osaka, Japan with an omnidirectional camera. Hence, it consists of sequential images (videos) of different view angles. Regarding the sequential images as videos (video mode), two tasks of localisation and end-to-end recognition are prepared. Regarding them as a set of still images (still image mode), three tasks of localisation, cropped word recognition and end-to-end recognition are prepared. As the dataset has been captured in Japan, it contains Japanese text but also includes text consisting of alphanumeric characters (Latin text). Hence, a submitted result for each task is evaluated in three ways: using Japanese only ground truth (GT), using Latin only GT and using combined GTs of both. Finally, by the submission deadline, we had received two submissions in the text localisation task of the still image mode. We intend to continue the competition in the open mode. Expecting further submissions, in this report we provide baseline results in all the tasks in addition to the submissions from the community.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.084; 600.121 Approved no
Call Number Admin @ si @ IMT2017 Serial 3077
Permanent link to this record
 

 
Author David Lloret; Joan Serrat; Antonio Lopez; A. Soler; Juan J. Villanueva
Title Retinal image registration using creases as anatomical landmarks. Type Conference Article
Year 2000 Publication 15th International Conference on Pattern Recognition Abbreviated Journal
Volume 3 Issue Pages 207-2010
Keywords
Abstract Retinal images are routinely used in ophthalmology to study the optical nerve head and the retina. To assess objectively the evolution of an illness, images taken at different times must be registered. Most methods so far have been designed specifically for a single image modality, like temporal series or stereo pairs of angiographies, fluorescein angiographies or scanning laser ophthalmoscope (SLO) images, which makes them prone to fail when conditions vary. In contrast, the method we propose has been shown to be accurate and reliable on all the former modalities. It has been adapted from the 3D registration of CT and MR images to 2D. Relevant features (also known as landmarks) are extracted by means of a robust creaseness operator, and the resulting images are iteratively transformed until a maximum in their correlation is achieved. Our method has succeeded in more than 100 pairs tried so far, in all cases including also the scaling as a parameter to be optimized.
Address Barcelona.
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes ADAS Approved no
Call Number ADAS @ adas @ LSL2000 c Serial 233
Permanent link to this record