Publicacions CVC -- Query Results

Details

Records
Author	Artur Xarles; Sergio Escalera; Thomas B. Moeslund; Albert Clapes
Title	ASTRA: An Action Spotting TRAnsformer for Soccer Videos			Type	Conference Article
Year	2023	Publication	Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports	Abbreviated Journal
Volume		Issue		Pages	93–102
Keywords
Abstract	In this paper, we introduce ASTRA, a Transformer-based model designed for the task of Action Spotting in soccer matches. ASTRA addresses several challenges inherent in the task and dataset, including the requirement for precise action localization, the presence of a long-tail data distribution, non-visibility in certain actions, and inherent label noise. To do so, ASTRA incorporates (a) a Transformer encoder-decoder architecture to achieve the desired output temporal resolution and to produce precise predictions, (b) a balanced mixup strategy to handle the long-tail distribution of the data, (c) an uncertainty-aware displacement head to capture the label variability, and (d) input audio signal to enhance detection of non-visible actions. Results demonstrate the effectiveness of ASTRA, achieving a tight Average-mAP of 66.82 on the test set. Moreover, in the SoccerNet 2023 Action Spotting challenge, we secure the 3rd position with an Average-mAP of 70.21 on the challenge set.
Address	Ottawa; Canada; October 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	MMSports
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ XEM2023			Serial	3970



Author	Y. Patel; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas; C.V. Jawahar
Title	Self-Supervised Visual Representations for Cross-Modal Retrieval			Type	Conference Article
Year	2019	Publication	ACM International Conference on Multimedia Retrieval	Abbreviated Journal
Volume		Issue		Pages	182–186
Keywords
Abstract	Cross-modal retrieval methods have been significantly improved in recent years with the use of deep neural networks and large-scale annotated datasets such as ImageNet and Places. However, collecting and annotating such datasets requires a tremendous amount of human effort and, besides, their annotations are limited to discrete sets of popular visual classes that may not be representative of the richer semantics found on large-scale cross-modal retrieval datasets. In this paper, we present a self-supervised cross-modal retrieval framework that leverages as training data the correlations between images and text on the entire set of Wikipedia articles. Our method consists in training a CNN to predict: (1) the semantic context of the article in which an image is more probable to appear as an illustration, and (2) the semantic context of its caption. Our experiments demonstrate that the proposed method is not only capable of learning discriminative visual representations for solving vision tasks like classification, but that the learned representations are better for cross-modal retrieval when compared to supervised pre-training of the network on the ImageNet dataset.
Address	Ottawa; Canada; June 2019
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICMR
Notes	DAG; 600.121; 600.129			Approved	no
Call Number	Admin @ si @ PGR2019			Serial	3288



Author	Jorge Bernal; F. Javier Sanchez; Fernando Vilariño
Title	Impact of Image Preprocessing Methods on Polyp Localization in Colonoscopy Frames			Type	Conference Article
Year	2013	Publication	35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society	Abbreviated Journal
Volume		Issue		Pages	7350 - 7354
Keywords
Abstract	In this paper we present our image preprocessing methods as a key part of our automatic polyp localization scheme. These methods are used to assess the impact of different endoluminal scene elements when characterizing polyps. More precisely we tackle the influence of specular highlights, blood vessels and black mask surrounding the scene. Experimental results prove that the appropriate handling of these elements leads to a great improvement in polyp localization results.
Address	Osaka; Japan; July 2013
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1557-170X	ISBN		Medium
Area	800	Expedition		Conference	EMBC
Notes	MV; 600.047; 600.060;SIAI			Approved	no
Call Number	Admin @ si @ BSV2013			Serial	2286



Author	Fernando Vilariño; Panagiota Spyridonos; Jordi Vitria; Fernando Azpiroz; Petia Radeva
Title	Cascade analysis for intestinal contraction detection			Type	Conference Article
Year	2006	Publication	20th International Congress and exhibition Computer Assisted Radiology and Surgery	Abbreviated Journal
Volume		Issue		Pages	9-10
Keywords	intestine video analysis, anisotropic features, support vector machine, cascade of classifiers
Abstract	In this work, we address the study of intestinal contractions in a novel approach based on a machine learning framework to process data from Wireless Capsule Video Endoscopy. Wireless endoscopy represents a unique way to visualize the intestine motility by creating long videos to visualize intestine dynamics. In this paper we argue that to analyze a huge amount of wireless endoscopy data and define robust methods for contraction detection we should base our approach on sophisticated machine learning techniques. In particular, we propose a cascade of classifiers in order to remove different physiological phenomena and obtain the motility pattern of small intestines. Our results show high specificity and sensitivity rates that highlight the high efficiency of the selected approach and support the feasibility of the proposed methodology in the automatic detection and analysis of intestine contractions.
Address	Osaka (Japan)
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area	800	Expedition		Conference	CARS
Notes	MV;OR;MILAB;SIAI			Approved	no
Call Number	BCNPCL @ bcnpcl @ VSV2006a; IAM @ iam @ VSV2006h			Serial	726



Author	Kamal Nasrollahi; Sergio Escalera; P. Rasti; Gholamreza Anbarjafari; Xavier Baro; Hugo Jair Escalante; Thomas B. Moeslund
Title	Deep Learning based Super-Resolution for Improved Action Recognition			Type	Conference Article
Year	2015	Publication	5th International Conference on Image Processing Theory, Tools and Applications IPTA2015	Abbreviated Journal
Volume		Issue		Pages	67 - 72
Keywords
Abstract	Action recognition systems mostly work with videos of proper quality and resolution. Even the most challenging benchmark databases for action recognition hardly include low-resolution videos from, e.g., surveillance cameras. In videos recorded by such cameras, due to the distance between people and cameras, people are pictured very small and hence challenge action recognition algorithms. Simple upsampling methods, like bicubic interpolation, cannot retrieve all the detailed information that can help the recognition. To deal with this problem, in this paper we combine results of bicubic interpolation with results of a state-of-the-art deep learning-based super-resolution algorithm, through an alpha-blending approach. The experimental results obtained on a down-sampled version of a large subset of the Hollywood2 benchmark database show the importance of the proposed system in increasing the recognition rate of a state-of-the-art action recognition system for handling low-resolution videos.
Address	Orleans; France; November 2015
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	IPTA
Notes	HuPBA;MV			Approved	no
Call Number	Admin @ si @ NER2015			Serial	2648



Author	Ekaterina Zaytseva; Jordi Vitria
Title	A search based approach to non maximum suppression in face detection			Type	Conference Article
Year	2012	Publication	19th IEEE International Conference on Image Processing	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Poster paper TA.P5.12. Face detectors typically produce a large number of false positives and this leads to the need to have a further non maximum suppression stage to eliminate multiple and spurious responses. This stage is based on considering spatial heuristics: true positive responses are selected by implicitly considering several restrictions on the spatial distribution of detector responses in natural images. In this paper we analyze the limitations of this approach and propose an efficient search method to overcome them. Results show how the application of this new non-maximum suppression approach to a simple face detector boosts its performance to state of the art results.
Address	Orlando; USA; September 2012
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1522-4880	ISBN	978-1-4673-2534-9	Medium
Area		Expedition		Conference	ICIP
Notes	OR;MV			Approved	no
Call Number	Admin @ si @ ZaV2012			Serial	2060



Author	Miquel Ferrer; Ernest Valveny; F. Serratosa; Horst Bunke
Title	Exact Median Graph Computation via Graph Embedding			Type	Conference Article
Year	2008	Publication	12th International Workshop on Structural and Syntactic Pattern Recognition	Abbreviated Journal
Volume	5324	Issue		Pages	15–24
Keywords
Abstract
Address	Orlando – Florida (USA)
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	SSPR
Notes	DAG			Approved	no
Call Number	DAG @ dag @ FVS2008b			Serial	1076



Author	T. Alejandra Vidal; Andrew J. Davison; Juan Andrade; David W. Murray
Title	Active Control for Single Camera SLAM			Type	Miscellaneous
Year	2006	Publication	IEEE International Conference on Robotics and Automation	Abbreviated Journal
Volume		Issue		Pages	1930–1936
Keywords
Abstract
Address	Orlando (Florida)
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes				Approved	no
Call Number	DAG @ dag @ VDA2006			Serial	666



Author	Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla
Title	Multi-Image Super-Resolution for Thermal Images			Type	Conference Article
Year	2022	Publication	17th International Conference on Computer Vision Theory and Applications (VISAPP 2022)	Abbreviated Journal
Volume	4	Issue		Pages	635-642
Keywords	Thermal Images; Multi-view; Multi-frame; Super-Resolution; Deep Learning; Attention Block
Abstract	This paper proposes a novel CNN architecture for the multi-thermal image super-resolution problem. In the proposed scheme, the multi-images are synthetically generated by downsampling and slightly shifting the given image; noise is also added to each of these synthesized images. The proposed architecture uses two attention blocks paths to extract high-frequency details taking advantage of the large information extracted from multiple images of the same scene. Experimental results are provided, showing the proposed scheme has overcome the state-of-the-art approaches.
Address	Online; Feb 6-8, 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	VISAPP
Notes	MSIAU; 601.349			Approved	no
Call Number	Admin @ si @ RSV2022a			Serial	3690



Author	Shiqi Yang; Yaxing Wang; Joost Van de Weijer; Luis Herranz; Shangling Jui
Title	Exploiting the Intrinsic Neighborhood Structure for Source-free Domain Adaptation			Type	Conference Article
Year	2021	Publication	Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021)	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Domain adaptation (DA) aims to alleviate the domain shift between source domain and target domain. Most DA methods require access to the source data, but often that is not possible (e.g. due to data privacy or intellectual property). In this paper, we address the challenging source-free domain adaptation (SFDA) problem, where the source pretrained model is adapted to the target domain in the absence of source data. Our method is based on the observation that target data, which might no longer align with the source domain classifier, still forms clear clusters. We capture this intrinsic structure by defining local affinity of the target data, and encourage label consistency among data with high local affinity. We observe that higher affinity should be assigned to reciprocal neighbors, and propose a self regularization loss to decrease the negative impact of noisy neighbors. Furthermore, to aggregate information with more context, we consider expanded neighborhoods with small affinity values. In the experimental results we verify that the inherent structure of the target features is an important source of information for domain adaptation. We demonstrate that this local structure can be efficiently captured by considering the local neighbors, the reciprocal neighbors, and the expanded neighborhood. Finally, we achieve state-of-the-art performance on several 2D image and 3D point cloud recognition datasets. Code is available in https://github.com/Albert0147/SFDA_neighbors.
Address	Online; December 7-10, 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	NIPS
Notes	LAMP; 600.147; 600.141			Approved	no
Call Number	Admin @ si @			Serial	3691



Author	Jorge Charco; Angel Sappa; Boris X. Vintimilla
Title	Human Pose Estimation through a Novel Multi-view Scheme			Type	Conference Article
Year	2022	Publication	17th International Conference on Computer Vision Theory and Applications (VISAPP 2022)	Abbreviated Journal
Volume	5	Issue		Pages	855-862
Keywords	Multi-view Scheme; Human Pose Estimation; Relative Camera Pose; Monocular Approach
Abstract	This paper presents a multi-view scheme to tackle the challenging problem of self-occlusion in human pose estimation. The proposed approach first obtains the human body joints of a set of images, which are captured from different views at the same time. Then, it enhances the obtained joints by using a multi-view scheme. Basically, the joints from a given view are used to enhance poorly estimated joints from another view, especially intended to tackle self-occlusion cases. A network architecture initially proposed for the monocular case is adapted to be used in the proposed multi-view scheme. Experimental results and comparisons with the state-of-the-art approaches on the Human3.6m dataset are presented, showing improvements in the accuracy of body joint estimations.
Address	Online; Feb 6-8, 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	2184-4321	ISBN	978-989-758-555-5	Medium
Area		Expedition		Conference	VISAPP
Notes	MSIAU; 600.160			Approved	no
Call Number	Admin @ si @ CSV2022			Serial	3689



Author	Eduardo Aguilar; Bhalaji Nagarajan; Beatriz Remeseiro; Petia Radeva
Title	Bayesian deep learning for semantic segmentation of food images			Type	Journal Article
Year	2022	Publication	Computers and Electrical Engineering	Abbreviated Journal	CEE
Volume	103	Issue		Pages	108380
Keywords	Deep learning; Uncertainty quantification; Bayesian inference; Image segmentation; Food analysis
Abstract	Deep learning has provided promising results in various applications; however, algorithms tend to be overconfident in their predictions, even though they may be entirely wrong. Particularly for critical applications, the model should provide answers only when it is very sure of them. This article presents a Bayesian version of two different state-of-the-art semantic segmentation methods to perform multi-class segmentation of foods and estimate the uncertainty about the given predictions. The proposed methods were evaluated on three public pixel-annotated food datasets. As a result, we can conclude that Bayesian methods improve the performance achieved by the baseline architectures and, in addition, provide information to improve decision-making. Furthermore, based on the extracted uncertainty map, we proposed three measures to rank the images according to the degree of noisy annotations they contained. Note that the top 135 images ranked by one of these measures include more than half of the worst-labeled food images.
Address	October 2022
Corporate Author				Thesis
Publisher	Science Direct	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB			Approved	no
Call Number	Admin @ si @ ANR2022			Serial	3763



Author	Javad Zolfaghari Bengar; Joost Van de Weijer; Bartlomiej Twardowski; Bogdan Raducanu
Title	Reducing Label Effort: Self-Supervised Meets Active Learning			Type	Conference Article
Year	2021	Publication	International Conference on Computer Vision Workshops	Abbreviated Journal
Volume		Issue		Pages	1631-1639
Keywords
Abstract	Active learning is a paradigm aimed at reducing the annotation effort by training the model on actively selected informative and/or representative samples. Another paradigm to reduce the annotation effort is self-training that learns from a large amount of unlabeled data in an unsupervised way and fine-tunes on few labeled samples. Recent developments in self-training have achieved very impressive results rivaling supervised learning on some datasets. The current work focuses on whether the two paradigms can benefit from each other. We studied object recognition datasets including CIFAR10, CIFAR100 and Tiny ImageNet with several labeling budgets for the evaluations. Our experiments reveal that self-training is remarkably more efficient than active learning at reducing the labeling effort, that for a low labeling budget, active learning offers no benefit to self-training, and finally that the combination of active learning and self-training is fruitful when the labeling budget is high. The performance gap between active learning trained either with self-training or from scratch diminishes as we approach to the point where almost half of the dataset is labeled.
Address	October 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICCVW
Notes	LAMP; OR			Approved	no
Call Number	Admin @ si @ ZVT2021			Serial	3672



Author	Antonio Esteban Lansaque
Title	An Endoscopic Navigation System for Lung Cancer Biopsy			Type	Book Whole
Year	2019	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Lung cancer is one of the most diagnosed cancers among men and women. Actually, lung cancer accounts for 13% of the total cases with a 5-year global survival rate in patients. Although early detection increases survival rate from 38% to 67%, accurate diagnosis remains a challenge. Pathological confirmation requires extracting a sample of the lesion tissue for its biopsy. The preferred procedure for tissue biopsy is called bronchoscopy. A bronchoscopy is an endoscopic technique for the internal exploration of airways which facilitates the performance of minimal invasive interventions with low risk for the patient. Recent advances in bronchoscopic devices have increased their use for minimal invasive diagnostic and intervention procedures, like lung cancer biopsy sampling. Despite the improvement in bronchoscopic device quality, there is a lack of intelligent computational systems for supporting in-vivo clinical decision during examinations. Existing technologies fail to accurately reach the lesion due to several aspects at intervention off-line planning and poor intra-operative guidance at exploration time. Existing guiding systems radiate patients and clinical staff, might be expensive and achieve a suboptimal 70% of yield boost. Diagnostic yield could be improved reducing radiation and costs by developing intra-operative support systems able to guide the bronchoscopist to the lesion during the intervention. The goal of this PhD thesis is to develop an image-based navigation system for intra-operative guidance of bronchoscopists to a target lesion across a path previously planned on a CT-scan. We propose a 3D navigation system which uses the anatomy of video bronchoscopy frames to locate the bronchoscope within the airways. Once the bronchoscope is located, our navigation system is able to indicate the bifurcation which needs to be followed to reach the lesion.
In order to facilitate an off-line validation as realistic as possible, we also present a method for augmenting simulated virtual bronchoscopies with the appearance of intra-operative videos. Experiments performed on augmented and intra-operative videos, prove that our algorithm can be speeded up for an on-line implementation in the operating room.
Address	October 2019
Corporate Author				Thesis	Ph.D. thesis
Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Debora Gil;Carles Sanchez
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-84-121011-0-2	Medium
Area		Expedition		Conference
Notes	IAM; 600.139; 600.145			Approved	no
Call Number	Admin @ si @ Est2019			Serial	3392



Author	Aymen Azaza
Title	Context, Motion and Semantic Information for Computational Saliency			Type	Book Whole
Year	2018	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	The main objective of this thesis is to highlight the salient object in an image or in a video sequence. We address three important—but in our opinion insufficiently investigated—aspects of saliency detection. Firstly, we start by extending previous research on saliency which explicitly models the information provided from the context. Then, we show the importance of explicit context modelling for saliency estimation. Several important works in saliency are based on the usage of object proposals. However, these methods focus on the saliency of the object proposal itself and ignore the context. To introduce context in such saliency approaches, we couple every object proposal with its direct context. This allows us to evaluate the importance of the immediate surround (context) for its saliency. We propose several saliency features which are computed from the context proposals including features based on omni-directional and horizontal context continuity. Secondly, we investigate the usage of top-down methods (high-level semantic information) for the task of saliency prediction since most computational methods are bottom-up or only include few semantic classes. We propose to consider a wider group of object classes. These objects represent important semantic information which we will exploit in our saliency prediction approach. Thirdly, we develop a method to detect video saliency by computing saliency from supervoxels and optical flow. In addition, we apply the context features developed in this thesis for video saliency detection. The method combines shape and motion features with our proposed context features. To summarize, we prove that extending object proposals with their direct context improves the task of saliency detection in both image and video data. Also the importance of the semantic information in saliency estimation is evaluated. Finally, we propose a new motion feature to detect saliency in video data.
The three proposed novelties are evaluated on standard saliency benchmark datasets and are shown to improve with respect to state-of-the-art.
Address	October 2018
Corporate Author				Thesis	Ph.D. thesis
Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Joost Van de Weijer;Ali Douik
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-84-945373-9-4	Medium
Area		Expedition		Conference
Notes	LAMP; 600.120			Approved	no
Call Number	Admin @ si @ Aza2018			Serial	3218