Publicacions CVC -- Query Results

	31–45 of 3413 records found matching your query:


	Records
	Author	Sergio Vera; Debora Gil; Agnes Borras; F. Javier Sanchez; Frederic Perez; Marius G. Linguraru; Miguel Angel Gonzalez Ballester
	Title	Computation and Evaluation of Medial Surfaces for Shape Representation of Abdominal Organs			Type	Book Chapter
	Year	2012	Publication	Workshop on Computational and Clinical Applications in Abdominal Imaging	Abbreviated Journal
	Volume	7029	Issue		Pages	223–230
	Keywords	medial manifolds, abdomen.
	Abstract	Medial representations are powerful tools for describing and parameterizing the volumetric shape of anatomical structures. Existing methods show excellent results when applied to 2D objects, but their quality drops across dimensions. This paper contributes to the computation of medial manifolds in two aspects. First, we provide a standard scheme for the computation of medial manifolds that avoid degenerated medial axis segments; second, we introduce an energy based method which performs independently of the dimension. We evaluate quantitatively the performance of our method with respect to existing approaches, by applying them to synthetic shapes of known medial geometry. Finally, we show results on shape representation of multiple abdominal organs, exploring the use of medial manifolds for the representation of multi-organ relations.
	Address	Toronto; Canada;
	Corporate Author				Thesis
	Publisher	Springer Link	Place of Publication	Berlin	Editor	H. Yoshida et al
	Language	English	Summary Language	English	Original Title
	Series Editor		Series Title	Lecture Notes in Computer Science	Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN	0302-9743	ISBN	978-3-642-28556-1	Medium
	Area		Expedition		Conference	ABDI
	Notes	IAM;MV			Approved	no
	Call Number	IAM @ iam @ VGB2012			Serial	1834



	Author	Sergio Vera; Debora Gil; Agnes Borras; F. Javier Sanchez; Frederic Perez; Marius G. Linguraru
	Title	Computation and Evaluation of Medial Surfaces for Shape Representation of Abdominal Organs			Type	Conference Article
	Year	2011	Publication	Workshop on Computational and Clinical Applications in Abdominal Imaging	Abbreviated Journal
	Volume	7029	Issue		Pages	223-230
	Keywords
	Abstract	Medial representations are powerful tools for describing and parameterizing the volumetric shape of anatomical structures. Existing methods show excellent results when applied to 2D objects, but their quality drops across dimensions. This paper contributes to the computation of medial manifolds in two aspects. First, we provide a standard scheme for the computation of medial manifolds that avoid degenerated medial axis segments; second, we introduce an energy based method which performs independently of the dimension. We evaluate quantitatively the performance of our method with respect to existing approaches, by applying them to synthetic shapes of known medial geometry. Finally, we show results on shape representation of multiple abdominal organs, exploring the use of medial manifolds for the representation of multi-organ relations.
	Address	Nice, France
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor	H. Yoshida et al
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ABDI
	Notes	IAM; MV			Approved	no
	Call Number	VGB2011			Serial	2036



	Author	Yael Tudela; Ana Garcia Rodriguez; Gloria Fernandez Esparrach; Jorge Bernal
	Title	Towards Fine-Grained Polyp Segmentation and Classification			Type	Conference Article
	Year	2023	Publication	Workshop on Clinical Image-Based Procedures	Abbreviated Journal
	Volume	14242	Issue		Pages	32-42
	Keywords	Medical image segmentation; Colorectal Cancer; Vision Transformer; Classification
	Abstract	Colorectal cancer is one of the main causes of cancer death worldwide. Colonoscopy is the gold standard screening tool as it allows lesion detection and removal during the same procedure. During the last decades, several efforts have been made to develop CAD systems to assist clinicians in lesion detection and classification. Regarding the latter, and in order to be used in the exploration room as part of resect and discard or leave-in-situ strategies, these systems must identify correctly all different lesion types. This is a challenging task, as the data used to train these systems presents great inter-class similarity, high class imbalance, and low representation of clinically relevant histology classes such as serrated sessile adenomas. In this paper, a new polyp segmentation and classification method, Swin-Expand, is introduced. Based on Swin-Transformer, it uses a simple and lightweight decoder. The performance of this method has been assessed on a novel dataset, comprising 1126 high-definition images representing the three main histological classes. Results show a clear improvement in both segmentation and classification performance, also achieving competitive results when tested in public datasets. These results confirm that both the method and the data are important to obtain more accurate polyp representations.
	Address	Vancouver; October 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	MICCAIW
	Notes	ISE			Approved	no
	Call Number	Admin @ si @ TGF2023			Serial	3837



	Author	Pierdomenico Fiadino; Victor Ponce; Juan Antonio Torrero-Gonzalez; Marc Torrent-Moreno
	Title	Call Detail Records for Human Mobility Studies: Taking Stock of the Situation in the “Always Connected Era”			Type	Conference Article
	Year	2017	Publication	Workshop on Big Data Analytics and Machine Learning for Data Communication Networks	Abbreviated Journal
	Volume		Issue		Pages	43-48
	Keywords	mobile networks; call detail records; human mobility
	Abstract	The exploitation of cellular network data for studying human mobility has been a popular research topic in the last decade. Indeed, mobile terminals could be considered ubiquitous sensors that allow the observation of human movements on a large scale without the need of relying on non-scalable techniques, such as surveys, or dedicated and expensive monitoring infrastructures. In particular, Call Detail Records (CDRs), collected by operators for billing purposes, have been extensively employed due to their rather large availability, compared to other types of cellular data (e.g., signaling). Despite the interest aroused around this topic, the research community has generally agreed about the scarcity of information provided by CDRs: the position of mobile terminals is logged when some kind of activity (calls, SMS, data connections) occurs, which translates into a picture of mobility somehow biased by the activity degree of users. By studying two datasets collected by a nationwide operator in 2014 and 2016, we show that the situation has drastically changed in terms of data volume and quality. The increase of flat data plans and the higher penetration of “always connected” terminals have driven up the number of recorded CDRs, providing higher temporal accuracy for users’ locations.
	Address	UCLA; USA; August 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-1-4503-5054-9	Medium
	Area		Expedition		Conference	ACMW (SIGCOMM)
	Notes	HuPBA; no menciona			Approved	no
	Call Number	Admin @ si @ FPT2017			Serial	2980



	Author	Fernando Vilariño
	Title	Unveiling the Social Impact of AI			Type	Conference Article
	Year	2020	Publication	Workshop at Digital Living Lab Days Conference	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address	September 2020
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MV; DAG; 600.121; 600.140;SIAI			Approved	no
	Call Number	Admin @ si @ Vil2020			Serial	3459



	Author	Debora Gil; Oriol Ramos Terrades; Raquel Perez
	Title	Topological Radiomics (TOPiomics): Early Detection of Genetic Abnormalities in Cancer Treatment Evolution			Type	Conference Article
	Year	2020	Publication	Women in Geometry and Topology	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address	Barcelona; September 2019
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	IAM; DAG; 600.139; 600.145; 600.121			Approved	no
	Call Number	Admin @ si @ GRP2020			Serial	3473



	Author	Mohamed Ali Souibgui; Ali Furkan Biten; Sounak Dey; Alicia Fornes; Yousri Kessentini; Lluis Gomez; Dimosthenis Karatzas; Josep Llados
	Title	One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition			Type	Conference Article
	Year	2022	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	Document Analysis
	Abstract	Low resource Handwritten Text Recognition (HTR) is a hard problem due to the scarce annotated data and the very limited linguistic information (dictionaries and language models). This appears, for example, in the case of historical ciphered manuscripts, which are usually written with invented alphabets to hide the content. Thus, in this paper we address this problem through a data generation technique based on Bayesian Program Learning (BPL). Contrary to traditional generation approaches, which require a huge amount of annotated images, our method is able to generate human-like handwriting using only one sample of each symbol from the desired alphabet. After generating symbols, we create synthetic lines to train state-of-the-art HTR architectures in a segmentation free fashion. Quantitative and qualitative analyses were carried out and confirm the effectiveness of the proposed method, achieving competitive results compared to the usage of real annotated data.
	Address	Virtual; January 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	DAG; 602.230; 600.140			Approved	no
	Call Number	Admin @ si @ SBD2022			Serial	3615



	Author	Minesh Mathew; Viraj Bagal; Ruben Tito; Dimosthenis Karatzas; Ernest Valveny; C.V. Jawahar
	Title	InfographicVQA			Type	Conference Article
	Year	2022	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	1697-1706
	Keywords	Document Analysis Datasets; Evaluation and Comparison of Vision Algorithms; Vision and Languages
	Abstract	Infographics communicate information using a combination of textual, graphical and visual elements. This work explores the automatic understanding of infographic images by using a Visual Question Answering technique. To this end, we present InfographicVQA, a new dataset comprising a diverse collection of infographics and question-answer annotations. The questions require methods that jointly reason over the document layout, textual content, graphical elements, and data visualizations. We curate the dataset with an emphasis on questions that require elementary reasoning and basic arithmetic skills. For VQA on the dataset, we evaluate two Transformer-based strong baselines. Both the baselines yield unsatisfactory results compared to near perfect human performance on the dataset. The results suggest that VQA on infographics--images that are designed to communicate information quickly and clearly to the human brain--is ideal for benchmarking machine understanding of complex document images. The dataset is available for download at docvqa.org
	Address	Virtual; Waikoloa; Hawaii; USA; January 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	DAG; 600.155			Approved	no
	Call Number	MBT2022			Serial	3625



	Author	Joakim Bruslund Haurum; Meysam Madadi; Sergio Escalera; Thomas B. Moeslund
	Title	Multi-Task Classification of Sewer Pipe Defects and Properties Using a Cross-Task Graph Neural Network Decoder			Type	Conference Article
	Year	2022	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	2806-2817
	Keywords	Vision Systems; Applications Multi-Task Classification
	Abstract	The sewerage infrastructure is one of the most important and expensive infrastructures in modern society. In order to efficiently manage the sewerage infrastructure, automated sewer inspection has to be utilized. However, while sewer defect classification has been investigated for decades, little attention has been given to classifying sewer pipe properties such as water level, pipe material, and pipe shape, which are needed to evaluate the level of sewer pipe deterioration. In this work we classify sewer pipe defects and properties concurrently and present a novel decoder-focused multi-task classification architecture Cross-Task Graph Neural Network (CT-GNN), which refines the disjointed per-task predictions using cross-task information. The CT-GNN architecture extends the traditional disjointed task-heads decoder, by utilizing a cross-task graph and unique class node embeddings. The cross-task graph can either be determined a priori based on the conditional probability between the task classes or determined dynamically using self-attention. CT-GNN can be added to any backbone and trained end-to-end at a small increase in the parameter count. We achieve state-of-the-art performance on all four classification tasks in the Sewer-ML dataset, improving defect classification and water level classification by 5.3 and 8.0 percentage points, respectively. We also outperform the single task methods as well as other multi-task classification approaches while introducing 50 times fewer parameters than previous model-focused approaches.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ BME2022			Serial	3638



	Author	Ali Furkan Biten; Lluis Gomez; Dimosthenis Karatzas
	Title	Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning			Type	Conference Article
	Year	2022	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	1381-1390
	Keywords	Measurement; Training; Visualization; Analytical models; Computer vision; Computational modeling; Training data
	Abstract	Explaining an image with missing or non-existent objects is known as object bias (hallucination) in image captioning. This behaviour is quite common in the state-of-the-art captioning models which is not desirable by humans. To decrease the object hallucination in captioning, we propose three simple yet efficient training augmentation methods for sentences which require no new training data or increase in the model size. By extensive analysis, we show that the proposed methods can significantly diminish our models’ object bias on hallucination metrics. Moreover, we experimentally demonstrate that our methods decrease the dependency on the visual features. All of our code, configuration files and model weights are available online.
	Address	Virtual; Waikoloa; Hawaii; USA; January 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	DAG; 600.155; 302.105			Approved	no
	Call Number	Admin @ si @ BGK2022			Serial	3662



	Author	Ali Furkan Biten; Andres Mafla; Lluis Gomez; Dimosthenis Karatzas
	Title	Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching			Type	Conference Article
	Year	2022	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	1391-1400
	Keywords	Measurement; Training; Integrated circuits; Annotations; Semantics; Training data; Semisupervised learning
	Abstract	The task of image-text matching aims to map representations from different modalities into a common joint visual-textual embedding. However, the most widely used datasets for this task, MSCOCO and Flickr30K, are actually image captioning datasets that offer a very limited set of relationships between images and sentences in their ground-truth annotations. This limited ground truth information forces us to use evaluation metrics based on binary relevance: given a sentence query we consider only one image as relevant. However, many other relevant images or captions may be present in the dataset. In this work, we propose two metrics that evaluate the degree of semantic relevance of retrieved items, independently of their annotated binary relevance. Additionally, we incorporate a novel strategy that uses an image captioning metric, CIDEr, to define a Semantic Adaptive Margin (SAM) to be optimized in a standard triplet loss. By incorporating our formulation to existing models, a large improvement is obtained in scenarios where available training data is limited. We also demonstrate that the performance on the annotated image-caption pairs is maintained while improving on other non-annotated relevant items when employing the full training set. The code for our new metric can be found at github.com/furkanbiten/ncsmetric and the model implementation at github.com/andrespmd/semanticadaptive_margin.
	Address	Virtual; Waikoloa; Hawaii; USA; January 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	DAG; 600.155; 302.105;			Approved	no
	Call Number	Admin @ si @ BMG2022			Serial	3663



	Author	Javad Zolfaghari Bengar; Joost Van de Weijer; Laura Lopez-Fuentes; Bogdan Raducanu
	Title	Class-Balanced Active Learning for Image Classification			Type	Conference Article
	Year	2022	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Active learning aims to reduce the labeling effort that is required to train algorithms by learning an acquisition function selecting the most relevant data for which a label should be requested from a large unlabeled data pool. Active learning is generally studied on balanced datasets where an equal amount of images per class is available. However, real-world datasets suffer from severe imbalanced classes, the so-called long-tail distribution. We argue that this further complicates the active learning process, since the imbalanced data pool can result in suboptimal classifiers. To address this problem in the context of active learning, we proposed a general optimization framework that explicitly takes class-balancing into account. Results on three datasets showed that the method is general (it can be combined with most existing active learning algorithms) and can be effectively applied to boost the performance of both informative and representative-based active learning methods. In addition, we showed that also on balanced datasets our method generally results in a performance gain.
	Address	Virtual; Waikoloa; Hawaii; USA; January 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	LAMP; 602.200; 600.147; 600.120			Approved	no
	Call Number	Admin @ si @ ZWL2022			Serial	3703



	Author	Alloy Das; Sanket Biswas; Ayan Banerjee; Josep Llados; Umapada Pal; Saumik Bhattacharya
	Title	Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance			Type	Conference Article
	Year	2024	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	718-728
	Keywords
	Abstract	The adaptation capability to a wide range of domains is crucial for scene text spotting models when deployed to real-world conditions. However, existing state-of-the-art (SOTA) approaches usually incorporate scene text detection and recognition simply by pretraining on natural scene text datasets, which do not directly exploit the intermediate feature representations between multiple domains. Here, we investigate the problem of domain-adaptive scene text spotting, i.e., training a model on multi-domain source data such that it can directly adapt to target domains rather than being specialized for a specific domain or scenario. Further, we investigate a transformer baseline called Swin-TESTR to focus on solving scene-text spotting for both regular and arbitrary-shaped scene text along with an exhaustive evaluation. The results clearly demonstrate the potential of intermediate representations to achieve significant performance on text spotting benchmarks across multiple domains (e.g. language, synth-to-real, and documents), both in terms of accuracy and efficiency.
	Address	Waikoloa; Hawaii; USA; January 2024
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ DBB2024			Serial	3986



	Author	Alex Gomez-Villa; Bartlomiej Twardowski; Kai Wang; Joost van de Weijer
	Title	Plasticity-Optimized Complementary Networks for Unsupervised Continual Learning			Type	Conference Article
	Year	2024	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	1690-1700
	Keywords
	Abstract	Continuous unsupervised representation learning (CURL) research has greatly benefited from improvements in self-supervised learning (SSL) techniques. As a result, existing CURL methods using SSL can learn high-quality representations without any labels, but with a notable performance drop when learning on a many-tasks data stream. We hypothesize that this is caused by the regularization losses that are imposed to prevent forgetting, leading to a suboptimal plasticity-stability trade-off: they either do not adapt fully to the incoming data (low plasticity), or incur significant forgetting when allowed to fully adapt to a new SSL pretext-task (low stability). In this work, we propose to train an expert network that is relieved of the duty of keeping the previous knowledge and can focus on performing optimally on the new tasks (optimizing plasticity). In the second phase, we combine this new knowledge with the previous network in an adaptation-retrospection phase to avoid forgetting and initialize a new expert with the knowledge of the old network. We perform several experiments showing that our proposed approach outperforms other CURL exemplar-free methods in few- and many-task split settings. Furthermore, we show how to adapt our approach to semi-supervised continual learning (Semi-SCL) and show that we surpass the accuracy of other exemplar-free Semi-SCL methods and reach the results of some others that use exemplars.
	Address	Waikoloa; Hawaii; USA; January 2024
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	LAMP			Approved	no
	Call Number	Admin @ si @ GTW2024			Serial	3989



	Author	Sergi Garcia Bordils; Dimosthenis Karatzas; Marçal Rusiñol
	Title	STEP – Towards Structured Scene-Text Spotting			Type	Conference Article
	Year	2024	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	883-892
	Keywords
	Abstract	We introduce the structured scene-text spotting task, which requires a scene-text OCR system to spot text in the wild according to a query regular expression. Contrary to generic scene text OCR, structured scene-text spotting seeks to dynamically condition both scene text detection and recognition on user-provided regular expressions. To tackle this task, we propose the Structured TExt sPotter (STEP), a model that exploits the provided text structure to guide the OCR process. STEP is able to deal with regular expressions that contain spaces and it is not bound to detection at the word-level granularity. Our approach enables accurate zero-shot structured text spotting in a wide variety of real-world reading scenarios and is solely trained on publicly available data. To demonstrate the effectiveness of our approach, we introduce a new challenging test dataset that contains several types of out-of-vocabulary structured text, reflecting important reading applications of fields such as prices, dates, serial numbers, license plates etc. We demonstrate that STEP can provide specialised OCR performance on demand in all tested scenarios.
	Address	Waikoloa; Hawaii; USA; January 2024
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ GKR2024			Serial	3992