Author: Ali Furkan Biten; Lluis Gomez; Dimosthenis Karatzas
Title: Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning
Type: Conference Article
Year: 2022
Publication: Winter Conference on Applications of Computer Vision (WACV)
Pages: 1381-1390
Keywords: Measurement; Training; Visualization; Analytical models; Computer vision; Computational modeling; Training data
Abstract: Explaining an image with missing or non-existent objects is known as object bias (hallucination) in image captioning. This behaviour is quite common in state-of-the-art captioning models and is undesirable to humans. To reduce object hallucination in captioning, we propose three simple yet efficient training augmentation methods for sentences, which require no new training data and no increase in model size. Through extensive analysis, we show that the proposed methods significantly diminish our models' object bias on hallucination metrics. Moreover, we experimentally demonstrate that our methods decrease the dependency on visual features. All of our code, configuration files and model weights are available online.
Address: Virtual (Waikoloa, Hawaii, USA); January 2022
Conference: WACV
Notes: DAG; 600.155; 302.105
Approved: no
Call Number: Admin @ si @ BGK2022
Serial: 3662
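
The hallucination metrics mentioned in the abstract are typically the CHAIR scores of Rohrbach et al. (2018). As context, a minimal CHAIR-style sketch in Python, assuming a fixed object vocabulary and exact word matching (real evaluations use synonym lists and MSCOCO object annotations); this is not the authors' evaluation code.

# Minimal CHAIR-style hallucination metric (sketch).
# CHAIRi = hallucinated object mentions / all object mentions
# CHAIRs = captions with at least one hallucination / all captions

def chair(captions, gt_objects, vocab):
    """captions: list[str]; gt_objects: list[set[str]], one set per image;
    vocab: set of object words to look for (simplified: exact match)."""
    hallucinated, mentioned, bad_caps = 0, 0, 0
    for cap, gt in zip(captions, gt_objects):
        objs = [w for w in cap.lower().split() if w in vocab]
        halluc = [w for w in objs if w not in gt]
        mentioned += len(objs)
        hallucinated += len(halluc)
        bad_caps += bool(halluc)
    chair_i = hallucinated / max(mentioned, 1)
    chair_s = bad_caps / max(len(captions), 1)
    return chair_i, chair_s

# Toy usage: "clock" is not annotated in the image, so it counts as hallucinated.
ci, cs = chair(["a clock on the beach"], [{"beach", "person"}],
               vocab={"clock", "beach", "person"})
print(ci, cs)  # 0.5 1.0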
 

 
Author: Josep Brugues Pujolras; Lluis Gomez; Dimosthenis Karatzas
Title: A Multilingual Approach to Scene Text Visual Question Answering
Type: Conference Article
Year: 2022
Publication: Document Analysis Systems: 15th IAPR International Workshop (DAS 2022)
Pages: 65-79
Keywords: Scene text; Visual question answering; Multilingual word embeddings; Vision and language; Deep learning
Abstract: Scene Text Visual Question Answering (ST-VQA) has recently emerged as a hot research topic in computer vision. Current ST-VQA models have great potential for many types of applications, but lack the ability to perform well in more than one language at a time, due to the scarcity of multilingual data and the use of monolingual word embeddings for training. In this work, we explore the possibility of obtaining bilingual and multilingual VQA models. In that regard, we take an established VQA model that uses monolingual word embeddings as part of its pipeline and substitute them with FastText and BPEmb multilingual word embeddings that have been aligned to English. Our experiments demonstrate that it is possible to obtain bilingual and multilingual VQA models with only a minimal loss in performance in languages not used during training, as well as a multilingual model trained on multiple languages that matches the performance of the respective monolingual baselines.
Address: La Rochelle, France; May 22–25, 2022
Conference: DAS
Notes: DAG; 611.004; 600.155; 601.002
Approved: no
Call Number: Admin @ si @ BGK2022b
Serial: 3695
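
A minimal sketch of the embedding-substitution idea the abstract describes: loading fastText-style English-aligned vectors and embedding a question by averaging its word vectors so that questions in several languages land in one shared space. The file paths are placeholders, and the averaging step is a simplification of the VQA model's actual question encoder.

import numpy as np

def load_aligned_vectors(path, dim=300):
    """Load a fastText-style '.vec' text file: 'word v1 ... vdim' per line."""
    vecs = {}
    with open(path, encoding="utf-8") as f:
        next(f)  # skip the "count dim" header line
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) == dim + 1:
                vecs[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vecs

def embed_question(question, vecs, dim=300):
    """Average the word vectors of a whitespace-tokenized question."""
    words = [w for w in question.lower().split() if w in vecs]
    if not words:
        return np.zeros(dim, dtype=np.float32)
    return np.mean([vecs[w] for w in words], axis=0)

# vecs_en = load_aligned_vectors("wiki.en.align.vec")   # placeholder paths
# vecs_ca = load_aligned_vectors("wiki.ca.align.vec")
# q_feat  = embed_question("what does the sign say", vecs_en)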
 

 
Author: Dena Bazazian; Raul Gomez; Anguelos Nicolaou; Lluis Gomez; Dimosthenis Karatzas; Andrew Bagdanov
Title: Improving Text Proposals for Scene Images with Fully Convolutional Networks
Type: Conference Article
Year: 2016
Publication: 23rd International Conference on Pattern Recognition Workshops
Abstract: Text proposals have emerged as a class-dependent version of object proposals – efficient approaches to reducing the search space of possible text object locations in an image. Combined with strong word classifiers, text proposals currently yield state-of-the-art results in end-to-end scene text recognition. In this paper we propose an improvement over the original Text Proposals algorithm of [1], combining it with Fully Convolutional Networks to improve the ranking of proposals. Results on the ICDAR RRC and COCO-Text datasets show superior performance over the current state of the art.
Address: Cancun, Mexico; December 2016
Conference: ICPRW
Notes: DAG; LAMP; 600.084
Approved: no
Call Number: Admin @ si @ BGN2016
Serial: 2823
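
A minimal sketch of the re-ranking idea, assuming the FCN outputs a per-pixel "textness" heatmap and that each proposal is scored by the mean heatmap value inside its box; the scoring function is an assumption, not the paper's exact formulation. A summed-area table makes each box score O(1).

import numpy as np

def integral_image(heat):
    return np.pad(heat, ((1, 0), (1, 0))).cumsum(0).cumsum(1)

def box_mean(ii, x0, y0, x1, y1):
    """Mean heatmap value inside [x0,x1) x [y0,y1) via the integral image."""
    s = ii[y1, x1] - ii[y0, x1] - ii[y1, x0] + ii[y0, x0]
    return s / ((x1 - x0) * (y1 - y0))

def rerank(proposals, heat):
    """proposals: list of (x0, y0, x1, y1); heat: HxW array in [0, 1]."""
    ii = integral_image(heat)
    scored = [(box_mean(ii, *p), p) for p in proposals]
    return sorted(scored, key=lambda t: t[0], reverse=True)

heat = np.zeros((100, 100)); heat[40:60, 10:90] = 0.9        # toy text band
print(rerank([(10, 40, 90, 60), (0, 0, 30, 30)], heat)[0][1])  # (10, 40, 90, 60)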
 

 
Author: R. Bertrand; P. Gomez-Krämer; Oriol Ramos Terrades; P. Franco; Jean-Marc Ogier
Title: A System Based On Intrinsic Features for Fraudulent Document Detection
Type: Conference Article
Year: 2013
Publication: 12th International Conference on Document Analysis and Recognition
Pages: 106-110
Keywords: paper document; document analysis; fraudulent document; forgery; fake
Abstract: Paper documents still account for a large share of the information supports used nowadays and may contain critical data. Even though official documents are secured with techniques such as printed patterns or artwork, paper documents suffer from a lack of security. The wide availability of cheap scanning and printing hardware allows non-experts to easily create fake documents. As the use of a watermarking system added during the document production step is hardly possible, solutions have to be proposed to distinguish a genuine document from a forged one. In this paper, we present an automatic forgery detection method based on a document's intrinsic features at character level. The method is based, on the one hand, on outlier character detection in a discriminant feature space and, on the other hand, on the detection of strictly similar characters. To this end, a feature set is computed for all characters, and characters of the same class are then compared using a distance in this feature space.
Address: Washington, USA; August 2013
ISSN: 1520-5363
Conference: ICDAR
Notes: DAG; 600.061
Approved: no
Call Number: Admin @ si @ BGR2013a
Serial: 2332
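
A minimal sketch of the two cues the abstract describes, assuming per-character feature vectors grouped by character class (e.g. all 'e' glyphs from one document); the thresholds are illustrative, not the paper's calibrated values.

import numpy as np

def flag_outliers(feats, z_thresh=3.0):
    """Flag characters far from their class centroid (possible retouching)."""
    mu, sd = feats.mean(0), feats.std(0) + 1e-8
    z = np.abs((feats - mu) / sd).max(axis=1)   # worst-dimension z-score
    return np.where(z > z_thresh)[0]

def flag_duplicates(feats, eps=1e-3):
    """Flag glyph pairs that are *too* similar: genuine print/scan noise
    makes exact repeats unlikely, so near-zero distance suggests copy-paste."""
    pairs = []
    for i in range(len(feats)):
        for j in range(i + 1, len(feats)):
            if np.linalg.norm(feats[i] - feats[j]) < eps:
                pairs.append((i, j))
    return pairs

glyphs = np.random.default_rng(0).normal(size=(20, 8))
glyphs[7] = glyphs[3]                  # simulate a copy-pasted character
print(flag_duplicates(glyphs))         # [(3, 7)]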
 

 
Author: Marc Bolaños; Maite Garolera; Petia Radeva
Title: Active labeling application applied to food-related object recognition
Type: Conference Article
Year: 2013
Publication: 5th International Workshop on Multimedia for Cooking & Eating Activities
Pages: 45-50
Abstract: Every day, lifelogging devices available for recording different aspects of our daily life increase in number, quality and functions, just like the multiple applications that we give them. Applying wearable devices to analyse people's nutritional habits is a challenging application based on acquiring and analysing life records over long periods of time. However, to extract the information of interest related to people's eating patterns, we need automatic methods to process large amounts of lifelogging data (e.g. recognition of food-related objects). Creating a rich set of manually labeled samples to train the algorithms is slow, tedious and subjective. To address this problem, we propose a novel method in the framework of Active Labeling for constructing a training set of thousands of images. Inspired by the hierarchical sampling method for active learning [6], we propose an Active Forest that organizes the data hierarchically for easy and fast labeling. Moreover, introducing a classifier into the hierarchical structures, as well as transforming the feature space for better data clustering, further improves the algorithm. Our method is successfully tested to label 89,700 food-related objects and achieves a significant reduction in expert labelling time.
Address: Barcelona; October 2013
Conference: ACM-CEA
Notes: MILAB
Approved: no
Call Number: Admin @ si @ BGR2013b
Serial: 2637
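
A minimal sketch of cluster-then-bulk-label in the spirit of hierarchical sampling for active learning: query an oracle on a few samples per cluster and, if they agree, propagate that label to the whole cluster, otherwise split and recurse. The probe count, purity rule and leaf size are assumptions; this is not the Active Forest itself, which additionally embeds a classifier and a transformed feature space.

import numpy as np
from sklearn.cluster import KMeans

def active_label(X, oracle, idx=None, probes=3, min_size=10):
    idx = np.arange(len(X)) if idx is None else idx
    labels = {}
    probe_ids = idx[np.linspace(0, len(idx) - 1, probes, dtype=int)]
    answers = {oracle(i) for i in probe_ids}
    if len(answers) == 1 or len(idx) <= min_size:
        bulk = answers.pop() if len(answers) == 1 else None
        for i in idx:   # bulk-label pure clusters, ask one by one otherwise
            labels[i] = bulk if bulk is not None else oracle(i)
        return labels
    split = KMeans(n_clusters=2, n_init=10).fit_predict(X[idx])
    for side in (0, 1):
        labels.update(active_label(X, oracle, idx[split == side],
                                   probes, min_size))
    return labels

# Toy usage with a perfectly separable oracle:
X = np.vstack([np.zeros((50, 2)), np.ones((50, 2))])
labs = active_label(X, oracle=lambda i: int(i >= 50))
print(sum(labs[i] == int(i >= 50) for i in range(100)))  # 100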
 

 
Author: Marc Bolaños; Maite Garolera; Petia Radeva
Title: Video Segmentation of Life-Logging Videos
Type: Conference Article
Year: 2014
Publication: 8th Conference on Articulated Motion and Deformable Objects
Volume: 8563
Pages: 1-9
Conference: AMDO
Notes: MILAB
Approved: no
Call Number: Admin @ si @ BGR2014
Serial: 2558
 

 
Author: Marc Bolaños; Maite Garolera; Petia Radeva
Title: Object Discovery using CNN Features in Egocentric Videos
Type: Conference Article
Year: 2015
Publication: Pattern Recognition and Image Analysis: Proceedings of the 7th Iberian Conference (IbPRIA 2015)
Volume: 9117
Pages: 67-74
Keywords: Object discovery; Egocentric videos; Lifelogging; CNN
Abstract: Lifelogging devices based on photo/video are spreading faster every day. This growth can bring great benefits if methods are developed to extract meaningful information about the user wearing the device and his/her environment. In this paper, we propose a semi-supervised strategy for easily discovering objects relevant to the person wearing a first-person camera. Given the egocentric video sequence acquired by the camera, our method uses both the appearance extracted by means of a deep convolutional neural network and an object refill methodology that allows discovering objects even when they appear in only a small number of images in the collection. We validate our method on a sequence of 1000 egocentric daily images and obtain an F-measure of 0.5, which is 0.17 higher than the state-of-the-art approach.
Address: Santiago de Compostela, Spain; June 2015
Series Title: LNCS
ISSN: 0302-9743
ISBN: 978-3-319-19389-2
Conference: IbPRIA
Notes: MILAB
Approved: no
Call Number: Admin @ si @ BGR2015
Serial: 2596
 

 
Author: Ali Furkan Biten; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas
Title: Good News, Everyone! Context driven entity-aware captioning for news images
Type: Conference Article
Year: 2019
Publication: 32nd IEEE Conference on Computer Vision and Pattern Recognition
Pages: 12458-12467
Abstract: Current image captioning systems perform at a merely descriptive level, essentially enumerating the objects in the scene and their relations. Humans, on the contrary, interpret images by integrating several sources of prior knowledge about the world. In this work, we aim to take a step closer to producing captions that offer a plausible interpretation of the scene by integrating such contextual information into the captioning pipeline. For this we focus on the captioning of images used to illustrate news articles. We propose a novel captioning method that is able to leverage contextual information provided by the text of the news article associated with an image. Our model can selectively draw information from the article guided by visual cues, and can dynamically extend the output dictionary to out-of-vocabulary named entities that appear in the context source. Furthermore, we introduce "GoodNews", the largest news image captioning dataset in the literature, and demonstrate state-of-the-art results.
Address: Long Beach, California, USA; June 2019
Conference: CVPR
Notes: DAG; 600.129; 600.135; 601.338; 600.121
Approved: no
Call Number: Admin @ si @ BGR2019
Serial: 3289
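
A minimal sketch of the template-caption idea behind entity-aware captioning: strip named entities to type placeholders for training, then fill placeholders from the article at decode time. spaCy supplies the NER here; the "first matching entity" fill rule is a simplification of the paper's attention-guided selection.

import spacy

nlp = spacy.load("en_core_web_sm")

def to_template(caption):
    """Replace each named entity with its type tag, e.g. <PERSON>."""
    doc = nlp(caption)
    out = caption
    for ent in reversed(doc.ents):   # right-to-left keeps char offsets valid
        out = out[:ent.start_char] + f"<{ent.label_}>" + out[ent.end_char:]
    return out

def fill_template(template, article):
    """Fill each placeholder with the first article entity of that type."""
    ents = {}
    for ent in nlp(article).ents:
        ents.setdefault(ent.label_, ent.text)
    for label, text in ents.items():
        template = template.replace(f"<{label}>", text, 1)
    return template

t = to_template("Barack Obama speaks in Chicago")
print(t)                                  # "<PERSON> speaks in <GPE>"
print(fill_template(t, "President Barack Obama visited Chicago on Monday."))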
 

 
Author: Jorge Bernal; Debora Gil; Carles Sanchez; F. Javier Sanchez
Title: Discarding Non Informative Regions for Efficient Colonoscopy Image Analysis
Type: Conference Article
Year: 2014
Publication: 1st MICCAI Workshop on Computer-Assisted and Robotic Endoscopy
Volume: 8899
Pages: 1-10
Keywords: Image Segmentation; Polyps; Colonoscopy; Valley Information; Energy Maps
Abstract: In this paper we present a novel polyp region segmentation method for colonoscopy videos. Our method uses valley information associated with polyp boundaries to provide an initial segmentation. This first segmentation is refined to eliminate boundary discontinuities caused by image artifacts or other elements of the scene. Experimental results over a publicly annotated database show that our method outperforms both general and specific segmentation methods by providing more accurate regions rich in polyp content. We also show that image preprocessing is needed to improve the final polyp region segmentation.
Address: Boston, USA; September 2014
Publisher: Springer International Publishing
Series Title: LNCS
ISSN: 0302-9743
ISBN: 978-3-319-13409-3
Conference: CARE
Notes: MV; IAM; 600.044; 600.047; 600.060; 600.075
Approved: no
Call Number: Admin @ si @ BGS2014b
Serial: 2503
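
One common way to implement a valley cue like the one described is through the eigenvalues of the Gaussian-smoothed image Hessian: along a dark valley (as at polyp boundaries), the second derivative across the valley is strongly positive, so the largest eigenvalue serves as a valley-energy map. A minimal sketch follows; the scale sigma and the clipping rule are illustrative choices, not the paper's exact valley detector.

import numpy as np
from scipy.ndimage import gaussian_filter

def valley_energy(img, sigma=3.0):
    # Hessian entries via Gaussian derivative filters.
    ixx = gaussian_filter(img, sigma, order=(2, 0))
    iyy = gaussian_filter(img, sigma, order=(0, 2))
    ixy = gaussian_filter(img, sigma, order=(1, 1))
    # Closed-form eigenvalues of the 2x2 symmetric Hessian per pixel.
    tr = (ixx + iyy) / 2.0
    disc = np.sqrt(((ixx - iyy) / 2.0) ** 2 + ixy ** 2)
    lam_max = tr + disc
    return np.clip(lam_max, 0.0, None)   # keep only valley-like responses

# Toy usage: a dark horizontal line on a bright background.
img = np.full((64, 64), 1.0)
img[32, :] = 0.0
e = valley_energy(img)
print(e[32].mean() > e[10].mean())   # True: energy peaks on the valley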
 

 
Author: Sonia Baeza; Debora Gil; Carles Sanchez; Guillermo Torres; Ignasi Garcia Olive; Ignasi Guasch; Samuel Garcia Reina; Felipe Andreo; Jose Luis Mate; Jose Luis Vercher; Antonio Rosell
Title: Biopsia virtual radiomica para el diagnóstico histológico de nódulos pulmonares – Resultados intermedios del proyecto Radiolung [Radiomic virtual biopsy for the histological diagnosis of pulmonary nodules – Interim results of the Radiolung project]
Type: Conference Article
Year: 2023
Publication: SEPAR
Abstract: Poster presentation.
Address: Granada, Spain; June 2023
Conference: SEPAR
Notes: IAM
Approved: no
Call Number: Admin @ si @ BGS2023
Serial: 3951
 

 
Author: Mohammad Ali Bagheri; Gang Hu; Qigang Gao; Sergio Escalera
Title: A Framework of Multi-Classifier Fusion for Human Action Recognition
Type: Conference Article
Year: 2014
Publication: 22nd International Conference on Pattern Recognition
Pages: 1260-1265
Abstract: The performance of different action-recognition methods using skeleton joint locations has recently been studied by several computer vision researchers. However, the potential improvement in classification through classifier fusion by ensemble-based methods has remained unexplored. In this work, we evaluate the performance of an ensemble of five action learning techniques, each performing the recognition task from a different perspective. The underlying rationale of the fusion approach is that different learners employ varying structures of input descriptors/features for training, and these varying structures cannot be combined and used by a single learner. In addition, combining the outputs of several learners can reduce the risk of an unfortunate selection of a poorly performing learner, leading to a more robust and generally applicable framework. We also propose two simple, yet effective, action description techniques. To improve recognition performance, a powerful combination strategy based on Dempster-Shafer theory is utilized, which can effectively exploit the diversity of base learners trained on different sources of information. The recognition results of the individual classifiers are compared with those obtained by fusing the classifiers' outputs, showing the advanced performance of the proposed methodology.
Address: Stockholm, Sweden; August 2014
ISSN: 1051-4651
Conference: ICPR
Notes: HuPBA; MILAB
Approved: no
Call Number: Admin @ si @ BHG2014
Serial: 2446
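
The fusion step relies on Dempster's rule of combination. A minimal sketch of the rule itself, assuming the basic probability assignments (masses over *sets* of action classes) have already been derived from each classifier's output; turning a softmax into a mass function is a separate design choice not shown here.

from itertools import product

def dempster_combine(m1, m2):
    """m1, m2: dict mapping frozenset of classes -> mass (each sums to 1)."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb          # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

# Two learners that partially agree on the action "wave":
m1 = {frozenset({"wave"}): 0.7, frozenset({"wave", "clap"}): 0.3}
m2 = {frozenset({"wave"}): 0.6, frozenset({"clap"}): 0.4}
fused = dempster_combine(m1, m2)
print(max(fused, key=fused.get))   # frozenset({'wave'})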
 

 
Author: Jorge Bernal; Aymeric Histace; Marc Masana; Quentin Angermann; Cristina Sanchez Montes; Cristina Rodriguez de Miguel; Maroua Hammami; Ana Garcia Rodriguez; Henry Cordova; Olivier Romain; Gloria Fernandez Esparrach; Xavier Dray; F. Javier Sanchez
Title: Polyp Detection Benchmark in Colonoscopy Videos using GTCreator: A Novel Fully Configurable Tool for Easy and Fast Annotation of Image Databases
Type: Conference Article
Year: 2018
Publication: 32nd International Congress and Exhibition on Computer Assisted Radiology & Surgery
Conference: CARS
Notes: ISE; MV; 600.119
Approved: no
Call Number: Admin @ si @ BHM2018
Serial: 3089
 

 
Author: Miguel Angel Bautista; Antonio Hernandez; Victor Ponce; Xavier Perez Sala; Xavier Baro; Oriol Pujol; Cecilio Angulo; Sergio Escalera
Title: Probability-based Dynamic Time Warping for Gesture Recognition on RGB-D data
Type: Conference Article
Year: 2012
Publication: 21st International Conference on Pattern Recognition, International Workshop on Depth Image Analysis
Volume: 7854
Pages: 126-135
Abstract: Dynamic Time Warping (DTW) is commonly used in gesture recognition tasks to tackle the temporal length variability of gestures. In the DTW framework, a set of gesture patterns is compared one by one to a possibly infinite test sequence, and a query gesture category is recognized if a warping cost below a certain threshold is found within the test sequence. Nevertheless, taking either a single sample per gesture category or a set of isolated samples may not encode the variability of the gesture category. In this paper, a probability-based DTW for gesture recognition is proposed. Different samples of the same gesture pattern obtained from RGB-Depth data are used to build a Gaussian-based probabilistic model of the gesture, and the cost of DTW is adapted accordingly to the new model. The proposed approach is tested in a challenging scenario, showing better performance of the probability-based DTW in comparison to state-of-the-art approaches for gesture recognition on RGB-D data.
Publisher: Springer Berlin Heidelberg
ISSN: 0302-9743
ISBN: 978-3-642-40302-6
Conference: WDIA
Notes: MILAB; OR; HuPBA; MV
Approved: no
Call Number: Admin @ si @ BHP2012
Serial: 2120
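
A minimal sketch of the core idea, assuming the training samples are already temporally aligned: a per-frame Gaussian model is estimated from several samples of a gesture, and the DTW local cost becomes a probability-based (diagonal Mahalanobis) distance instead of a plain Euclidean one. Begin/end handling for spotting in long streams is simplified away.

import numpy as np

def gesture_model(samples):
    """samples: array (n_samples, T, d) of aligned training sequences."""
    return samples.mean(axis=0), samples.var(axis=0) + 1e-6  # mu, var: (T, d)

def prob_dtw(seq, mu, var):
    """DTW cost of test sequence seq (N, d) against the Gaussian model."""
    N, T = len(seq), len(mu)
    D = np.full((N + 1, T + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, N + 1):
        for j in range(1, T + 1):
            # diagonal-covariance Mahalanobis distance to frame model j
            cost = np.sqrt((((seq[i - 1] - mu[j - 1]) ** 2) / var[j - 1]).sum())
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[N, T] / (N + T)   # length-normalized warping cost

rng = np.random.default_rng(0)
train = rng.normal(size=(5, 20, 3)) * 0.1 + np.linspace(0, 1, 20)[None, :, None]
mu, var = gesture_model(train)
test = np.linspace(0, 1, 25)[:, None].repeat(3, axis=1)         # same shape
print(prob_dtw(test, mu, var) < prob_dtw(test[::-1], mu, var))  # True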
 

 
Author: Marco Bellantonio; Mohammad A. Haque; Pau Rodriguez; Kamal Nasrollahi; Taisi Telve; Sergio Escalera; Jordi Gonzalez; Thomas B. Moeslund; Pejman Rasti; Golamreza Anbarjafari
Title: Spatio-Temporal Pain Recognition in CNN-based Super-Resolved Facial Images
Type: Conference Article
Year: 2016
Publication: 23rd International Conference on Pattern Recognition
Volume: 10165
Abstract: Automatic pain detection is a long-awaited solution to the prevalent medical problem of pain management. This is most relevant when the subjects are young children or patients with a limited ability to communicate their pain experience. Computer vision-based analysis of facial pain expression provides a way of efficient pain detection. With the advent of deep machine learning methods, automatic pain detection has exhibited even better performance. In this paper, we identify three important factors to exploit in automatic pain detection: the spatial pain information available in each facial video frame, the temporal information about the pain expression pattern across a subject's video sequence, and the variation of face resolution. We employ a combination of a convolutional neural network and a recurrent neural network to set up a deep hybrid pain detection framework that is able to exploit both spatial and temporal pain information from facial video. To analyze the effect of different facial resolutions, we introduce a super-resolution algorithm to generate facial video frames at different resolutions. We investigate the performance on the publicly available UNBC-McMaster Shoulder Pain database. As a contribution, the paper provides novel and important information regarding the performance of a hybrid deep learning framework for pain detection in facial images of different resolutions.
Address: Cancun, Mexico; December 2016
Series Title: LNCS
Conference: ICPR
Notes: HuPBA; ISE; 600.098; 600.119
Approved: no
Call Number: Admin @ si @ BHR2016
Serial: 2902
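
A minimal PyTorch sketch of a hybrid architecture of the kind described: a convolutional encoder embeds each facial frame, an LSTM aggregates the frame sequence, and a linear head scores pain. Layer sizes and the single-channel 48x48 input are placeholders, not the paper's network.

import torch
import torch.nn as nn

class PainNet(nn.Module):
    def __init__(self, feat_dim=64, hidden=128, n_classes=2):
        super().__init__()
        self.cnn = nn.Sequential(                       # per-frame encoder
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, clips):                           # (B, T, 1, H, W)
        b, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).view(b, t, -1)
        out, _ = self.rnn(feats)
        return self.head(out[:, -1])                    # score from last step

model = PainNet()
logits = model(torch.randn(2, 16, 1, 48, 48))           # 2 clips of 16 frames
print(logits.shape)                                     # torch.Size([2, 2])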
 

 
Author: Dena Bazazian; Dimosthenis Karatzas; Andrew Bagdanov
Title: Soft-PHOC Descriptor for End-to-End Word Spotting in Egocentric Scene Images
Type: Conference Article
Year: 2018
Publication: International Workshop on Egocentric Perception, Interaction and Computing at ECCV
Abstract: Word spotting in natural scene images has many applications in scene understanding and visual assistance. We propose Soft-PHOC, an intermediate representation of images based on character probability maps. Our representation extends the concept of the Pyramidal Histogram Of Characters (PHOC) by exploiting Fully Convolutional Networks to derive a pixel-wise mapping of the character distribution within candidate word regions. We show how to use our descriptors for word spotting tasks in egocentric camera streams through an efficient text line proposal algorithm, based on the Hough Transform over character attribute maps followed by scoring with Dynamic Time Warping (DTW). We evaluate our results on the ICDAR 2015 Challenge 4 dataset of incidental scene text captured by an egocentric camera.
Address: Munich, Germany; September 2018
Conference: ECCVW
Notes: DAG; 600.129; 600.121
Approved: no
Call Number: Admin @ si @ BKB2018b
Serial: 3174
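
A minimal sketch of the DTW scoring step, assuming a per-column character probability map has been extracted for a candidate text line; the local cost used here, one minus the probability of the query character, is an illustrative choice rather than the paper's exact formulation.

import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def dtw_word_score(prob_map, word):
    """prob_map: (n_cols, len(ALPHABET)), rows summing to 1; lower is better."""
    q = [ALPHABET.index(c) for c in word]
    n, m = len(prob_map), len(q)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 1.0 - prob_map[i - 1, q[j - 1]]
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)   # length-normalized matching cost

# Toy map spelling "cat" over 9 columns (3 columns per character):
pm = np.full((9, 26), 1e-3)
for col, ch in zip(range(9), "cccaaattt"):
    pm[col, ALPHABET.index(ch)] = 0.9
print(dtw_word_score(pm, "cat") < dtw_word_score(pm, "dog"))  # True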