Author Xose M. Pardo; Petia Radeva
  Title Discriminant Snakes for 3D Reconstruction in Medical Images Type Conference Article
  Year 2000 Publication 15th International Conference on Pattern Recognition Abbreviated Journal
  Volume 4 Issue Pages 336-339  
  Keywords  
  Abstract  
  Address Barcelona.  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes MILAB Approved no  
  Call Number BCNPCL @ bcnpcl @ PaR2000 Serial 234  
 

 
Author Xose M. Pardo; Petia Radeva; D. Cabello
  Title Discriminant Snakes for 3D Reconstruction of Anatomical Organs Type Journal Article
  Year 2003 Publication Medical Image Analysis (IF: 4.442) Abbreviated Journal
  Volume 7 Issue 3 Pages 293-310
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB Approved no  
  Call Number BCNPCL @ bcnpcl @ PPC2003 Serial 398  
 

 
Author Xose M. Pardo; Petia Radeva; Juan J. Villanueva
  Title Self-Training Statistic Snake for Image Segmentation and Tracking Type Miscellaneous
  Year 1999 Publication Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address Venice  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB Approved no  
  Call Number BCNPCL @ bcnpcl @ PRV1999 Serial 26  
 

 
Author Xu Hu
  Title Real-Time Part Based Models for Object Detection Type Report
  Year 2012 Publication CVC Technical Report Abbreviated Journal  
  Volume 171 Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis Master's thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS;ISE Approved no  
  Call Number Admin @ si @ Hu2012 Serial 2415  
 

 
Author Y. Mori; M. Misawa; Jorge Bernal; M. Bretthauer; S. Kudo; A. Rastogi; Gloria Fernandez Esparrach
  Title Artificial Intelligence for Disease Diagnosis - the Gold Standard Challenge Type Journal Article
  Year 2022 Publication Gastrointestinal Endoscopy Abbreviated Journal  
  Volume 96 Issue 2 Pages 370-372  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ MMB2022 Serial 3701  
 

 
Author Y. Patel; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas
  Title Dynamic Lexicon Generation for Natural Scene Images Type Conference Article
  Year 2016 Publication 14th European Conference on Computer Vision Workshops Abbreviated Journal  
  Volume Issue Pages 395-410  
  Keywords scene text; photo OCR; scene understanding; lexicon generation; topic modeling; CNN  
  Abstract Many scene text understanding methods approach the end-to-end recognition problem from a word-spotting perspective and take huge benefit from using small per-image lexicons. Such customized lexicons are normally assumed as given and their source is rarely discussed. In this paper we propose a method that generates contextualized lexicons for scene images using only visual information. For this, we exploit the correlation between visual and textual information in a dataset consisting of images and textual content associated with them. Using the topic modeling framework to discover a set of latent topics in such a dataset allows us to re-rank a fixed dictionary in a way that prioritizes the words that are more likely to appear in a given image. Moreover, we train a CNN that is able to reproduce those word rankings but using only the image raw pixels as input. We demonstrate that the quality of the automatically obtained custom lexicons is superior to a generic frequency-based baseline.
 
  Address Amsterdam; The Netherlands; October 2016  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ECCVW  
  Notes DAG; 600.084 Approved no  
  Call Number Admin @ si @ PGR2016 Serial 2825  
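The re-ranking step described in the abstract above amounts to mixing per-topic word distributions with an image's topic distribution and sorting a fixed dictionary by the resulting word probabilities. The minimal sketch below illustrates only that computation; the function name, toy dictionary and toy topic matrices are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch of topic-model-based lexicon re-ranking (assumed reading of the
# abstract above): score each word of a fixed dictionary by p(word | image) =
# sum_t p(word | topic t) * p(topic t | image) and sort the dictionary by it.
import numpy as np

def rerank_lexicon(dictionary, topic_word, image_topics):
    """Return the dictionary sorted by p(word | image), plus the sorted scores.

    dictionary   -- list of W words (fixed, generic lexicon)
    topic_word   -- (T, W) array, p(word | topic), rows sum to 1
    image_topics -- (T,) array, p(topic | image), e.g. predicted by a CNN
    """
    word_scores = image_topics @ topic_word        # (W,) mixture p(word | image)
    order = np.argsort(-word_scores)               # highest probability first
    return [dictionary[i] for i in order], word_scores[order]

if __name__ == "__main__":
    dictionary = ["exit", "stop", "menu", "pizza", "sale"]
    topic_word = np.array([
        [0.40, 0.35, 0.05, 0.05, 0.15],   # toy topic 0: traffic / street
        [0.05, 0.05, 0.40, 0.45, 0.05],   # toy topic 1: restaurant
    ])
    image_topics = np.array([0.2, 0.8])   # image looks like a restaurant scene
    ranked, scores = rerank_lexicon(dictionary, topic_word, image_topics)
    print(ranked)                          # restaurant words come first
```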
 

 
Author Y. Patel; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas; C.V. Jawahar
  Title Self-Supervised Visual Representations for Cross-Modal Retrieval Type Conference Article
  Year 2019 Publication ACM International Conference on Multimedia Retrieval Abbreviated Journal  
  Volume Issue Pages 182–186  
  Keywords  
  Abstract Cross-modal retrieval methods have improved significantly in recent years with the use of deep neural networks and large-scale annotated datasets such as ImageNet and Places. However, collecting and annotating such datasets requires a tremendous amount of human effort and, besides, their annotations are limited to discrete sets of popular visual classes that may not be representative of the richer semantics found on large-scale cross-modal retrieval datasets. In this paper, we present a self-supervised cross-modal retrieval framework that leverages as training data the correlations between images and text on the entire set of Wikipedia articles. Our method consists of training a CNN to predict: (1) the semantic context of the article in which an image is more probable to appear as an illustration, and (2) the semantic context of its caption. Our experiments demonstrate that the proposed method is not only capable of learning discriminative visual representations for solving vision tasks like classification, but that the learned representations are better for cross-modal retrieval when compared to supervised pre-training of the network on the ImageNet dataset.
  Address Ottawa; Canada; June 2019
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICMR  
  Notes DAG; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ PGR2019 Serial 3288  
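The training signal summarized in the abstract above can be pictured as a CNN with two prediction heads, each regressing a text-derived semantic-context distribution. The PyTorch sketch below is a schematic reading of that setup under explicit assumptions: the tiny backbone, the KL-divergence objective and the random stand-in topic targets are illustrative, not the authors' architecture or training code.

```python
# Hedged sketch: a two-head CNN trained, from pixels alone, to match
# (1) the topic distribution of the Wikipedia article an image illustrates and
# (2) the topic distribution of its caption. All components are stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_TOPICS = 40  # assumed number of latent text topics

class TwoHeadCNN(nn.Module):
    def __init__(self, topics=NUM_TOPICS):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.article_head = nn.Linear(64, topics)   # semantic context of the article
        self.caption_head = nn.Linear(64, topics)   # semantic context of the caption

    def forward(self, x):
        f = self.backbone(x)
        return self.article_head(f), self.caption_head(f)

def self_supervised_step(model, optimizer, images, article_topics, caption_topics):
    """One optimization step: match both heads to the text-derived distributions."""
    pred_a, pred_c = model(images)
    loss = (F.kl_div(F.log_softmax(pred_a, dim=1), article_topics, reduction="batchmean")
            + F.kl_div(F.log_softmax(pred_c, dim=1), caption_topics, reduction="batchmean"))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    model = TwoHeadCNN()
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    imgs = torch.rand(8, 3, 64, 64)                           # toy image batch
    art = torch.softmax(torch.rand(8, NUM_TOPICS), dim=1)     # stand-in topic targets
    cap = torch.softmax(torch.rand(8, NUM_TOPICS), dim=1)
    print(self_supervised_step(model, opt, imgs, art, cap))
```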
 

 
Author Y. Patel; Lluis Gomez; Raul Gomez; Marçal Rusiñol; Dimosthenis Karatzas; C.V. Jawahar
  Title TextTopicNet - Self-Supervised Learning of Visual Features Through Embedding Images on Semantic Text Spaces Type Miscellaneous
  Year 2018 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The immense success of deep learning based methods in computer vision heavily relies on large-scale training datasets. These richly annotated datasets help the network learn discriminative visual features. Collecting and annotating such datasets requires a tremendous amount of human effort, and annotations are limited to a popular set of classes. As an alternative, learning visual features by designing auxiliary tasks that make use of freely available self-supervision has become increasingly popular in the computer vision community. In this paper, we put forward an idea to take advantage of multi-modal context to provide self-supervision for the training of computer vision algorithms. We show that adequate visual features can be learned efficiently by training a CNN to predict the semantic textual context in which a particular image is more probable to appear as an illustration. More specifically, we use popular text embedding techniques to provide the self-supervision for the training of a deep CNN.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.084; 601.338; 600.121 Approved no  
  Call Number Admin @ si @ PGG2018 Serial 3177  
 

 
Author Yael Tudela; Ana Garcia Rodriguez; Gloria Fernandez Esparrach; Jorge Bernal
  Title Towards Fine-Grained Polyp Segmentation and Classification Type Conference Article
  Year 2023 Publication Workshop on Clinical Image-Based Procedures Abbreviated Journal  
  Volume 14242 Issue Pages 32-42  
  Keywords Medical image segmentation; Colorectal Cancer; Vision Transformer; Classification  
  Abstract Colorectal cancer is one of the main causes of cancer death worldwide. Colonoscopy is the gold standard screening tool as it allows lesion detection and removal during the same procedure. During the last decades, several efforts have been made to develop CAD systems to assist clinicians in lesion detection and classification. Regarding the latter, and in order to be used in the exploration room as part of resect and discard or leave-in-situ strategies, these systems must identify correctly all different lesion types. This is a challenging task, as the data used to train these systems presents great inter-class similarity, high class imbalance, and low representation of clinically relevant histology classes such as serrated sessile adenomas.

In this paper, a new polyp segmentation and classification method, Swin-Expand, is introduced. Based on Swin-Transformer, it uses a simple and lightweight decoder. The performance of this method has been assessed on a novel dataset, comprising 1126 high-definition images representing the three main histological classes. Results show a clear improvement in both segmentation and classification performance, also achieving competitive results when tested in public datasets. These results confirm that both the method and the data are important to obtain more accurate polyp representations.
 
  Address Vancouver; October 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference MICCAIW  
  Notes ISE Approved no  
  Call Number Admin @ si @ TGF2023 Serial 3837  
 

 
Author Yagmur Gucluturk; Umut Guclu; Marc Perez; Hugo Jair Escalante; Xavier Baro; Isabelle Guyon; Carlos Andujar; Julio C. S. Jacques Junior; Meysam Madadi; Sergio Escalera
  Title Visualizing Apparent Personality Analysis with Deep Residual Networks Type Conference Article
  Year 2017 Publication Chalearn Workshop on Action, Gesture, and Emotion Recognition: Large Scale Multimodal Gesture Recognition and Real versus Fake expressed emotions at ICCV Abbreviated Journal  
  Volume Issue Pages 3101-3109  
  Keywords  
  Abstract Automatic prediction of personality traits is a subjective task that has recently received much attention. Specifically, automatic apparent personality trait prediction from multimodal data has emerged as a hot topic within the field of computer vision and, more particularly, the so-called “looking at people” sub-field. Considering “apparent” personality traits as opposed to real ones considerably reduces the subjectivity of the task. Real-world applications are encountered in a wide range of domains, including entertainment, health, human-computer interaction, recruitment and security. Predictive models of personality traits are useful for individuals in many scenarios (e.g., preparing for job interviews, preparing for public speaking). However, these predictions in and of themselves might be deemed untrustworthy without human-understandable supportive evidence. Through a series of experiments on a recently released benchmark dataset for automatic apparent personality trait prediction, this paper characterizes the audio and visual information that is used by a state-of-the-art model while making its predictions, so as to provide such supportive evidence by explaining the predictions made. Additionally, the paper describes a new web application, which gives feedback on apparent personality traits of its users by combining model predictions with their explanations.
 
  Address Venice; Italy; October 2017  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCVW  
  Notes HUPBA; 6002.143 Approved no  
  Call Number Admin @ si @ GGP2017 Serial 3067  
 

 
Author Yagmur Gucluturk; Umut Guclu; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon; Sergio Escalera; Marcel A. J. van Gerven; Rob van Lier
  Title Multimodal First Impression Analysis with Deep Residual Networks Type Journal Article
  Year 2018 Publication IEEE Transactions on Affective Computing Abbreviated Journal TAC  
  Volume 8 Issue 3 Pages 316-329  
  Keywords  
  Abstract People form first impressions about the personalities of unfamiliar individuals even after very brief interactions with them. In this study we present and evaluate several models that mimic this automatic social behavior. Specifically, we present several models trained on a large dataset of short YouTube video blog posts for predicting apparent Big Five personality traits of people and whether they seem suitable to be recommended for a job interview. Along with presenting our audiovisual approach and results that won third place in the ChaLearn First Impressions Challenge, we investigate modeling in different modalities including audio only, visual only, language only, audiovisual, and a combination of audiovisual and language. Our results demonstrate that the best performance could be obtained using a fusion of all data modalities. Finally, in order to promote explainability in machine learning and to provide an example for the upcoming ChaLearn challenges, we present a simple approach for explaining the predictions for job interview recommendations.
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ GGB2018 Serial 3210  
 

 
Author Yainuvis Socarras
  Title Image segmentation for improving pedestrian detection Type Report
  Year 2011 Publication CVC Technical Report Abbreviated Journal  
  Volume 167 Issue Pages  
  Keywords  
  Abstract  
  Address Bellaterra (Spain)  
  Corporate Author Computer Vision Center Thesis Master's thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; Approved no  
  Call Number Admin @ si @ Soc2011 Serial 1933  
 

 
Author Yainuvis Socarras; David Vazquez; Antonio Lopez; David Geronimo; Theo Gevers
  Title Improving HOG with Image Segmentation: Application to Human Detection Type Conference Article
  Year 2012 Publication 11th International Conference on Advanced Concepts for Intelligent Vision Systems Abbreviated Journal  
  Volume 7517 Issue Pages 178-189  
  Keywords Segmentation; Pedestrian Detection  
  Abstract In this paper we improve the histogram of oriented gradients (HOG), a core descriptor of state-of-the-art object detection, by the use of higher-level information coming from image segmentation. The idea is to re-weight the descriptor while computing it, without increasing its size. The benefits of the proposal are two-fold: (i) to improve the performance of the detector by enriching the descriptor information and (ii) to take advantage of the information of image segmentation, which in fact is likely to be used in other stages of the detection system, such as candidate generation or refinement. We test our technique on the INRIA person dataset, which was originally developed to test HOG, embedding it in a human detection system. The well-known mean-shift segmentation method (from smaller to larger super-pixels) and different methods to re-weight the original descriptor (constant, region-luminance, color- or texture-dependent) have been evaluated. We achieve performance improvements of 4.47% in detection rate through the use of differences of color between contour pixel neighborhoods as the re-weighting function.
 
  Address Brno, Czech Republic  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor J. Blanc-Talon et al.  
  Language English Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-33139-8 Medium  
  Area Expedition Conference ACIVS  
  Notes ADAS;ISE Approved no  
  Call Number ADAS @ adas @ SLV2012 Serial 1980  
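The core idea of the abstract above, re-weighting the HOG descriptor during its computation with information from an image segmentation, can be made concrete with a small sketch. The boundary-based weight, the plain per-cell histograms (no block normalization) and all names below are simplifying assumptions for illustration, not the re-weighting functions evaluated in the paper.

```python
# Hedged sketch: while accumulating HOG orientation histograms, multiply each
# pixel's gradient magnitude by a weight derived from a segmentation label map
# (here a fixed boost on segment boundaries), so the descriptor keeps its size
# but embeds segmentation information.
import numpy as np

def segmentation_weights(labels, boost=2.0):
    """Per-pixel weight: `boost` on segment boundaries, 1.0 elsewhere."""
    boundary = np.zeros(labels.shape, dtype=bool)
    boundary[:, 1:] |= labels[:, 1:] != labels[:, :-1]
    boundary[1:, :] |= labels[1:, :] != labels[:-1, :]
    return np.where(boundary, boost, 1.0)

def reweighted_hog(gray, labels, cell=8, bins=9):
    """Cell-wise orientation histograms with segmentation-driven re-weighting."""
    gy, gx = np.gradient(gray.astype(float))
    mag = np.hypot(gx, gy) * segmentation_weights(labels)   # re-weighted magnitudes
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0             # unsigned orientation
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)
    h, w = gray.shape
    hist = np.zeros((h // cell, w // cell, bins))
    for i in range(hist.shape[0]):
        for j in range(hist.shape[1]):
            m = mag[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            b = bin_idx[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            hist[i, j] = np.bincount(b.ravel(), weights=m.ravel(), minlength=bins)
    return hist.ravel()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.random((64, 32))                       # toy 64x32 detection window
    seg = np.tile(np.arange(32) // 16, (64, 1))      # two toy vertical segments
    print(reweighted_hog(img, seg).shape)            # (8 * 4 * 9,) = (288,)
```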
 

 
Author Yainuvis Socarras; Sebastian Ramos; David Vazquez; Antonio Lopez; Theo Gevers
  Title Adapting Pedestrian Detection from Synthetic to Far Infrared Images Type Conference Article
  Year 2013 Publication ICCV Workshop on Visual Domain Adaptation and Dataset Bias Abbreviated Journal  
  Volume Issue Pages  
  Keywords Domain Adaptation; Far Infrared; Pedestrian Detection  
  Abstract We present different techniques to adapt a pedestrian classifier trained with synthetic images and the corresponding automatically generated annotations to operate with far infrared (FIR) images. The information contained in this kind of image allows us to develop a robust pedestrian detector invariant to extreme illumination changes.
  Address Sydney; Australia; December 2013  
  Corporate Author Thesis  
  Publisher Place of Publication Sydney, Australia Editor
  Language English Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCVW-VisDA  
  Notes ADAS; 600.054; 600.055; 600.057; 601.217;ISE Approved no  
  Call Number ADAS @ adas @ SRV2013 Serial 2334  
 

 
Author Yasuko Sugito; Javier Vazquez; Trevor Canham; Marcelo Bertalmio
  Title Image quality evaluation in professional HDR/WCG production questions the need for HDR metrics Type Journal Article
  Year 2022 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 31 Issue Pages 5163 - 5177  
  Keywords Measurement; Image color analysis; Image coding; Production; Dynamic range; Brightness; Extraterrestrial measurements  
  Abstract In the quality evaluation of high dynamic range and wide color gamut (HDR/WCG) images, a number of works have concluded that native HDR metrics, such as HDR visual difference predictor (HDR-VDP), HDR video quality metric (HDR-VQM), or convolutional neural network (CNN)-based visibility metrics for HDR content, provide the best results. These metrics consider only the luminance component, but several color difference metrics have been specifically developed for, and validated with, HDR/WCG images. In this paper, we perform subjective evaluation experiments in a professional HDR/WCG production setting, under a real use case scenario. The results are quite relevant in that they show, firstly, that the performance of HDR metrics is worse than that of a classic, simple standard dynamic range (SDR) metric applied directly to the HDR content; and secondly, that the chrominance metrics specifically developed for HDR/WCG imaging have poor correlation with observer scores and are also outperformed by an SDR metric. Based on these findings, we show how a very simple framework for creating color HDR metrics, that uses only luminance SDR metrics, transfer functions, and classic color spaces, is able to consistently outperform, by a considerable margin, state-of-the-art HDR metrics on a varied set of HDR content, for both perceptual quantization (PQ) and Hybrid Log-Gamma (HLG) encoding, luminance and chroma distortions, and on different color spaces of common use.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes 600.161; 611.007 Approved no  
  Call Number Admin @ si @ SVG2022 Serial 3683  
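The framework described at the end of the abstract above builds a color HDR metric from nothing more than a luminance SDR metric, a transfer function and a classic color space. A minimal illustration, assuming PSNR as the SDR metric and per-channel evaluation of PQ-encoded linear RGB, is given below; it is an editorial reading of that recipe, not the authors' exact pipeline.

```python
# Hedged sketch: apply a classic SDR metric (PSNR) to PQ-encoded HDR content,
# channel by channel, to obtain a simple "color HDR metric". The choice of PSNR,
# the per-channel averaging and the toy data are illustrative assumptions.
import numpy as np

def pq_encode(luminance_nits):
    """SMPTE ST 2084 (PQ) inverse EOTF: absolute luminance in nits -> [0, 1]."""
    m1, m2 = 2610 / 16384, 2523 / 4096 * 128
    c1, c2, c3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32
    y = np.clip(luminance_nits / 10000.0, 0.0, 1.0) ** m1
    return ((c1 + c2 * y) / (1.0 + c3 * y)) ** m2

def psnr(a, b, peak=1.0):
    """Classic SDR luminance metric used here as the building block."""
    mse = np.mean((a - b) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def hdr_color_metric(ref_rgb_nits, test_rgb_nits):
    """Mean per-channel PSNR of the PQ-encoded linear RGB images."""
    ref, test = pq_encode(ref_rgb_nits), pq_encode(test_rgb_nits)
    return np.mean([psnr(ref[..., c], test[..., c]) for c in range(3)])

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    ref = rng.uniform(0, 1000, size=(32, 32, 3))           # toy HDR frame, in nits
    test = np.clip(ref + rng.normal(0, 5, size=ref.shape), 0, None)  # mild distortion
    print(round(hdr_color_metric(ref, test), 2))
```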