Publicacions CVC -- Query Results

Records
Author	Vacit Oguz Yazici; Abel Gonzalez-Garcia; Arnau Ramisa; Bartlomiej Twardowski; Joost Van de Weijer
Title	Orderless Recurrent Models for Multi-label Classification			Type	Conference Article
Year	2020	Publication	33rd IEEE Conference on Computer Vision and Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Recurrent neural networks (RNNs) are popular for many computer vision tasks, including multi-label classification. Since RNNs produce sequential outputs, labels need to be ordered for the multi-label classification task. Current approaches sort labels according to their frequency, typically ordering them in either rare-first or frequent-first. These imposed orderings do not take into account that the natural order to generate the labels can change for each image, e.g. first the dominant object before summing up the smaller objects in the image. Therefore, in this paper, we propose ways to dynamically order the ground truth labels with the predicted label sequence. This allows for the faster training of more optimal LSTM models for multi-label classification. Analysis evidences that our method does not suffer from duplicate generation, something which is common for other models. Furthermore, it outperforms other CNN-RNN models, and we show that a standard architecture of an image encoder and language decoder trained with our proposed loss obtains state-of-the-art results on the challenging MS-COCO, WIDER Attribute and PA-100K and competitive results on NUS-WIDE.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CVPR
Notes	LAMP; 600.109; 601.309; 600.141; 600.120			Approved	no
Call Number	Admin @ si @ YGR2020			Serial	3408



Author	Jaume Garcia; Francesc Carreras; Sandra Pujades; Debora Gil
Title	Regional motion patterns for the Left Ventricle function assessment			Type	Conference Article
Year	2008	Publication	Proc. 19th Int. Conf. Pattern Recognition ICPR 2008	Abbreviated Journal
Volume		Issue		Pages	1-4
Keywords
Abstract	Regional scores (e.g. strain, perfusion) of the Left Ventricle (LV) functionality are playing an increasing role in the diagnosis of cardiac diseases. A main limitation is the lack of normality models for complementary scores oriented to assessment of the LV integrity. This paper introduces an original framework based on a parametrization of the LV domain, which allows comparison across subjects of local physiological measures of different nature. We compute regional normality patterns in a feature space characterizing the LV function. We show the consistency of the model for the regional motion on healthy and hypokinetic pathological cases.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	IAM			Approved	no
Call Number	IAM @ iam @ GCP2008			Serial	1510



Author	Marçal Rusiñol; Lluis Pere de las Heras; Oriol Ramos Terrades
Title	Flowchart Recognition for Non-Textual Information Retrieval in Patent Search			Type	Journal Article
Year	2014	Publication	Information Retrieval	Abbreviated Journal	IR
Volume	17	Issue	5-6	Pages	545-562
Keywords	Flowchart recognition; Patent documents; Text/graphics separation; Raster-to-vector conversion; Symbol recognition
Abstract	Relatively little research has been done on the topic of patent image retrieval and in general in most of the approaches the retrieval is performed in terms of a similarity measure between the query image and the images in the corpus. However, systems aimed at overcoming the semantic gap between the visual description of patent images and their conveyed concepts would be very helpful for patent professionals. In this paper we present a flowchart recognition method aimed at achieving a structured representation of flowchart images that can be further queried semantically. The proposed method was submitted to the CLEF-IP 2012 flowchart recognition task. We report the obtained results on this dataset.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1386-4564	ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.077			Approved	no
Call Number	Admin @ si @ RHR2013			Serial	2342



Author	Marc Oliu; Ciprian Corneanu; Kamal Nasrollahi; Olegs Nikisins; Sergio Escalera; Yunlian Sun; Haiqing Li; Zhenan Sun; Thomas B. Moeslund; Modris Greitans
Title	Improved RGB-D-T based Face Recognition			Type	Journal Article
Year	2016	Publication	IET Biometrics	Abbreviated Journal	BIO
Volume	5	Issue	4	Pages	297-303
Keywords
Abstract	Reliable facial recognition systems are of crucial importance in various applications from entertainment to security. Thanks to the deep-learning concepts introduced in the field, a significant improvement in the performance of unimodal facial recognition systems has been observed in recent years. At the same time, multimodal facial recognition is a promising approach. This study combines the latest successes in both directions by applying deep learning convolutional neural networks (CNN) to the multimodal RGB, depth, and thermal (RGB-D-T) based facial recognition problem, outperforming previously published results. Furthermore, a late fusion of the CNN-based recognition block with various hand-crafted features (local binary patterns, histograms of oriented gradients, Haar-like rectangular features, histograms of Gabor ordinal measures) is introduced, demonstrating even better recognition performance on a benchmark RGB-D-T database. The results obtained in this study show that the classical engineered features and CNN-based features can complement each other for recognition purposes.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA;MILAB;			Approved	no
Call Number	Admin @ si @ OCN2016			Serial	2854



Author	Partha Pratim Roy; Umapada Pal; Josep Llados
Title	Seal detection and recognition: An approach for document indexing			Type	Conference Article
Year	2009	Publication	10th International Conference on Document Analysis and Recognition	Abbreviated Journal
Volume		Issue		Pages	101–105
Keywords
Abstract	Reliable indexing of documents having seal instances can be achieved by recognizing seal information. This paper presents a novel approach for detecting and classifying such multi-oriented seals in these documents. First, Hough Transform based methods are applied to extract the seal regions in documents. Next, isolated text characters within these regions are detected. Rotation and size invariant features and a support vector machine based classifier have been used to recognize these detected text characters. Next, for each pair of characters, we encode their relative spatial organization using their distance and angular position with respect to the centre of the seal, and enter this code into a hash table. Given an input seal, we recognize the individual text characters and compute the codes for character pairs based on their relative spatial organization. The codes obtained from the input seal help to retrieve model hypotheses from the hash table. The seal model that receives the most hypotheses is selected for the recognition of the input seal. The methodology is tested for indexing seals in a rotation and size invariant setting, and we obtained encouraging results.
Address	Barcelona, Spain
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1520-5363	ISBN	978-1-4244-4500-4	Medium
Area		Expedition		Conference	ICDAR
Notes	DAG			Approved	no
Call Number	DAG @ dag @ RPL2009b			Serial	1239



Author	Mohamed Ramzy Ibrahim; Robert Benavente; Daniel Ponsa; Felipe Lumbreras
Title	SWViT-RRDB: Shifted Window Vision Transformer Integrating Residual in Residual Dense Block for Remote Sensing Super-Resolution			Type	Conference Article
Year	2024	Publication	19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Remote sensing applications, impacted by acquisition season and sensor variety, require high-resolution images. Transformer-based models improve satellite image super-resolution but are less effective than convolutional neural networks (CNNs) at extracting local details, crucial for image clarity. This paper introduces SWViT-RRDB, a new deep learning model for satellite imagery super-resolution. The SWViT-RRDB, combining transformer with convolution and attention blocks, overcomes the limitations of existing models by better representing small objects in satellite images. In this model, a pipeline of residual fusion group (RFG) blocks is used to combine the multi-headed self-attention (MSA) with residual in residual dense block (RRDB). This combines global and local image data for better super-resolution. Additionally, an overlapping cross-attention block (OCAB) is used to enhance fusion and allow interaction between neighboring pixels to maintain long-range pixel dependencies across the image. The SWViT-RRDB model and its larger variants outperform state-of-the-art (SoTA) models on two different satellite datasets in terms of PSNR and SSIM.
Address	Roma; Italia; February 2024
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MSIAU			Approved	no
Call Number	Admin @ si @ RBP2024			Serial	4004



Author	Jaume Gibert; Ernest Valveny; Horst Bunke
Title	Feature Selection on Node Statistics Based Embedding of Graphs			Type	Journal Article
Year	2012	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
Volume	33	Issue	15	Pages	1980–1990
Keywords	Structural pattern recognition; Graph embedding; Feature ranking; PCA; Graph classification
Abstract	Representing a graph with a feature vector is a common way of making statistical machine learning algorithms applicable to the domain of graphs. Such a transition from graphs to vectors is known as graph embedding. A key issue in graph embedding is to select a proper set of features in order to make the vectorial representation of graphs as strong and discriminative as possible. In this article, we propose features that are constructed out of frequencies of node label representatives. We first build a large set of features and then select the most discriminative ones according to different ranking criteria and feature transformation algorithms. On different classification tasks, we experimentally show that only a small significant subset of these features is needed to achieve the same classification rates as competing state-of-the-art methods.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG			Approved	no
Call Number	Admin @ si @ GVB2012b			Serial	1993



Author	Naveen Onkarappa; Angel Sappa
Title	Synthetic sequences and ground-truth flow field generation for algorithm validation			Type	Journal Article
Year	2015	Publication	Multimedia Tools and Applications	Abbreviated Journal	MTAP
Volume	74	Issue	9	Pages	3121-3135
Keywords	Ground-truth optical flow; Synthetic sequence; Algorithm validation
Abstract	Research in computer vision is advancing thanks to the availability of good datasets that help to improve algorithms, validate results and obtain comparative analysis. The datasets can be real or synthetic. For some computer vision problems, such as optical flow, it is not possible to obtain ground-truth optical flow with high accuracy in natural outdoor real scenarios directly by any sensor, although it is possible to obtain ground-truth data of real scenarios in a laboratory setup with limited motion. In this difficult situation computer graphics offers a viable option for creating realistic virtual scenarios. In the current work we present a framework to design virtual scenes and generate sequences as well as ground-truth flow fields. In particular, we generate a dataset containing sequences of driving scenarios. The sequences in the dataset vary in the speed of the on-board vision system, the road texture, the complexity of the vehicle's motion and the presence of independently moving vehicles in the scene. This dataset enables the analysis and adaptation of existing optical flow methods, and leads to the invention of new approaches, particularly for driver assistance systems.
Address
Corporate Author				Thesis
Publisher	Springer US	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1380-7501	ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.055; 601.215; 600.076			Approved	no
Call Number	Admin @ si @ OnS2014b			Serial	2472



Author	Riccardo Del Chiaro; Bartlomiej Twardowski; Andrew Bagdanov; Joost Van de Weijer
Title	Recurrent attention to transient tasks for continual image captioning			Type	Conference Article
Year	2020	Publication	34th Conference on Neural Information Processing Systems	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Research on continual learning has led to a variety of approaches to mitigating catastrophic forgetting in feed-forward classification networks. Until now surprisingly little attention has been focused on continual learning of recurrent models applied to problems like image captioning. In this paper we take a systematic look at continual learning of LSTM-based models for image captioning. We propose an attention-based approach that explicitly accommodates the transient nature of vocabularies in continual image captioning tasks, i.e. that task vocabularies are not disjoint. We call our method Recurrent Attention to Transient Tasks (RATT), and also show how to adapt continual learning approaches based on weight regularization and knowledge distillation to recurrent continual learning problems. We apply our approaches to the incremental image captioning problem on two new continual learning benchmarks we define using the MS-COCO and Flickr30 datasets. Our results demonstrate that RATT is able to sequentially learn five captioning tasks while incurring no forgetting of previously learned ones.
Address	virtual; December 2020
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	NEURIPS
Notes	LAMP; 600.120			Approved	no
Call Number	Admin @ si @ CTB2020			Serial	3484



Author	Soumya Jahagirdar; Minesh Mathew; Dimosthenis Karatzas; CV Jawahar
Title	Understanding Video Scenes Through Text: Insights from Text-Based Video Question Answering			Type	Conference Article
Year	2023	Publication	Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Researchers have extensively studied the field of vision and language, discovering that both visual and textual content is crucial for understanding scenes effectively. Particularly, comprehending text in videos holds great significance, requiring both scene text understanding and temporal reasoning. This paper focuses on exploring two recently introduced datasets, NewsVideoQA and M4-ViteVQA, which aim to address video question answering based on textual content. The NewsVideoQA dataset contains question-answer pairs related to the text in news videos, while M4-ViteVQA comprises question-answer pairs from diverse categories like vlogging, traveling, and shopping. We provide an analysis of the formulation of these datasets on various levels, exploring the degree of visual understanding and multi-frame comprehension required for answering the questions. Additionally, the study includes experimentation with BERT-QA, a text-only model, which demonstrates comparable performance to the original methods on both datasets, indicating the shortcomings in the formulation of these datasets. Furthermore, we also look into the domain adaptation aspect by examining the effectiveness of training on M4-ViteVQA and evaluating on NewsVideoQA and vice-versa, thereby shedding light on the challenges and potential benefits of out-of-domain training.
Address	Paris; France; October 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICCVW
Notes	DAG			Approved	no
Call Number	Admin @ si @ JMK2023			Serial	3946



Author	Alejandro Tabas; Emili Balaguer-Ballester; Laura Igual
Title	Spatial Discriminant ICA for RS-fMRI characterisation			Type	Conference Article
Year	2014	Publication	4th International Workshop on Pattern Recognition in Neuroimaging	Abbreviated Journal
Volume		Issue		Pages	1-4
Keywords
Abstract	Resting-State fMRI (RS-fMRI) is a brain imaging technique useful for exploring functional connectivity. A major point of interest in RS-fMRI analysis is to isolate connectivity patterns characterising disorders such as ADHD. Such characterisation is usually performed in two steps: first, all connectivity patterns in the data are extracted by means of Independent Component Analysis (ICA); second, standard statistical tests are performed over the extracted patterns to find differences between control and clinical groups. In this work we introduce a novel, single-step approach to this problem termed Spatial Discriminant ICA. The algorithm can efficiently isolate networks of functional connectivity characterising a clinical group by combining ICA and a new variant of the Fisher’s Linear Discriminant also introduced in this work. As the characterisation is carried out in a single step, it potentially provides a richer characterisation of inter-class differences. The algorithm is tested using synthetic and real fMRI data, showing promising results in both experiments.
Address	Tübingen; June 2014
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-4799-4150-6	Medium
Area		Expedition		Conference	PRNI
Notes	OR;MILAB			Approved	no
Call Number	Admin @ si @ TBI2014			Serial	2493



Author	Debora Gil; Oriol Rodriguez; J. Mauri; Petia Radeva
Title	Statistical descriptors of the Myocardial perfusion in angiographic images			Type	Conference Article
Year	2006	Publication	Proc. Computers in Cardiology	Abbreviated Journal
Volume		Issue		Pages	677-680
Keywords	Anisotropic processing; intravascular ultrasound (IVUS); vessel border segmentation; vessel structure classification.
Abstract	Restoration of coronary flow after primary percutaneous coronary intervention in acute myocardial infarction does not always correlate with adequate myocardial perfusion. Recently, coronary angiography has been used to assess microcirculation integrity (Myocardial Blush Analysis, MBA). Although MBA correlates with patient prognosis, there are few image processing methods addressing objective perfusion quantification. The goal of this work is to develop statistical descriptors of the myocardial dyeing pattern allowing objective assessment of myocardial perfusion. Experiments on healthy right coronary arteries show that our approach allows reliable measurements without any specific image acquisition protocol.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	IAM;MILAB			Approved	no
Call Number	IAM @ iam @ GRR2006			Serial	1528



Author	Adriana Romero; Carlo Gatta
Title	Do We Really Need All These Neurons?			Type	Conference Article
Year	2013	Publication	6th Iberian Conference on Pattern Recognition and Image Analysis	Abbreviated Journal
Volume	7887	Issue		Pages	460-467
Keywords	Restricted Boltzmann Machine; hidden units; unsupervised learning; classification
Abstract	Restricted Boltzmann Machines (RBMs) are generative neural networks that have received much attention recently. In particular, choosing the appropriate number of hidden units is important as it might hinder their representative power. According to the literature, RBMs require numerous hidden units to approximate any distribution properly. In this paper, we present an experiment to determine whether such a number of hidden units is required in a classification context. We then propose an incremental algorithm that trains RBMs reusing the previously trained parameters, using a trade-off measure to determine the appropriate number of hidden units. Results on the MNIST and OCR letters databases show that a number of hidden units one order of magnitude smaller than the literature estimate suffices to achieve similar performance. Moreover, the proposed algorithm makes it possible to estimate the required number of hidden units without training many RBMs from scratch.
Address	Madeira; Portugal; June 2013
Corporate Author				Thesis
Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN	0302-9743	ISBN	978-3-642-38627-5	Medium
Area		Expedition		Conference	IbPRIA
Notes	MILAB; 600.046			Approved	no
Call Number	Admin @ si @ RoG2013			Serial	2311



Author	Masakazu Iwamura; Naoyuki Morimoto; Keishi Tainaka; Dena Bazazian; Lluis Gomez; Dimosthenis Karatzas
Title	ICDAR2017 Robust Reading Challenge on Omnidirectional Video			Type	Conference Article
Year	2017	Publication	14th International Conference on Document Analysis and Recognition	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Results of the ICDAR 2017 Robust Reading Challenge on Omnidirectional Video are presented. This competition uses the Downtown Osaka Scene Text (DOST) Dataset, which was captured in Osaka, Japan with an omnidirectional camera. Hence, it consists of sequential images (videos) of different view angles. Regarding the sequential images as videos (video mode), two tasks of localisation and end-to-end recognition are prepared. Regarding them as a set of still images (still image mode), three tasks of localisation, cropped word recognition and end-to-end recognition are prepared. As the dataset has been captured in Japan, it contains Japanese text but also includes text consisting of alphanumeric characters (Latin text). Hence, a submitted result for each task is evaluated in three ways: using Japanese only ground truth (GT), using Latin only GT and using combined GTs of both. Finally, by the submission deadline, we have received two submissions in the text localisation task of the still image mode. We intend to continue the competition in the open mode. Expecting further submissions, in this report we provide baseline results in all the tasks in addition to the submissions from the community.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICDAR
Notes	DAG; 600.084; 600.121			Approved	no
Call Number	Admin @ si @ IMT2017			Serial	3077



Author	David Lloret; Joan Serrat; Antonio Lopez; A. Soler; Juan J. Villanueva
Title	Retinal image registration using creases as anatomical landmarks.			Type	Conference Article
Year	2000	Publication	15th International Conference on Pattern Recognition	Abbreviated Journal
Volume	3	Issue		Pages	207-2010
Keywords
Abstract	Retinal images are routinely used in ophthalmology to study the optical nerve head and the retina. To assess objectively the evolution of an illness, images taken at different times must be registered. Most methods so far have been designed specifically for a single image modality, like temporal series or stereo pairs of angiographies, fluorescein angiographies or scanning laser ophthalmoscope (SLO) images, which makes them prone to fail when conditions vary. In contrast, the method we propose has been shown to be accurate and reliable on all the former modalities. It has been adapted from the 3D registration of CT and MR images to 2D. Relevant features (also known as landmarks) are extracted by means of a robust creaseness operator, and the resulting images are iteratively transformed until a maximum in their correlation is achieved. Our method has succeeded in more than 100 pairs tried so far, in all cases including also the scaling as a parameter to be optimized.
Address	Barcelona.
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICPR
Notes	ADAS			Approved	no
Call Number	ADAS @ adas @ LSL2000 c			Serial	233