Publicacions CVC -- Query Results

Records
Author	Michael Teutsch; Angel Sappa; Riad I. Hammoud
Title	Cross-Spectral Image Processing			Type	Book Chapter
Year	2022	Publication	Computer Vision in the Infrared Spectrum. Synthesis Lectures on Computer Vision	Abbreviated Journal
Volume		Issue		Pages	23-34
Keywords
Abstract	Although this book is on IR computer vision and its main focus lies on IR image and video processing and analysis, special attention is dedicated to cross-spectral image processing due to the increasing number of publications and applications in this domain. In these cross-spectral frameworks, IR information is used together with information from other spectral bands to tackle some specific problems by developing more robust solutions. Tasks considered for cross-spectral processing are for instance dehazing, segmentation, vegetation index estimation, or face recognition. This increasing number of applications is motivated by cross- and multi-spectral camera setups already available on the market, such as smartphones, remote sensing multispectral cameras, or multi-spectral cameras for automotive systems or drones. In this chapter, different cross-spectral image processing techniques will be reviewed together with possible applications. Initially, image registration approaches for the cross-spectral case are reviewed: the registration stage is the first image processing task, which is needed to align images acquired by different sensors within the same reference coordinate system. Then, recent cross-spectral image colorization approaches, which are intended to colorize infrared images for different applications, are presented. Finally, the cross-spectral image enhancement problem is tackled by including guided super resolution techniques, image dehazing approaches, cross-spectral filtering and edge detection. Figure 3.1 illustrates cross-spectral image processing stages as well as their possible connections. Table 3.1 presents some of the available public cross-spectral datasets generally used as reference data to evaluate cross-spectral image registration, colorization, enhancement, or exploitation results.
Address
Corporate Author				Thesis
Publisher	Springer	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	SLCV
Series Volume		Series Issue		Edition
ISSN		ISBN	978-3-031-00698-2	Medium
Area		Expedition		Conference
Notes	MSIAU; MACO			Approved	no
Call Number	Admin @ si @ TSH2022b			Serial	3805



Author	Michael Teutsch; Angel Sappa; Riad I. Hammoud
Title	Detection, Classification, and Tracking			Type	Book Chapter
Year	2022	Publication	Computer Vision in the Infrared Spectrum. Synthesis Lectures on Computer Vision	Abbreviated Journal
Volume		Issue		Pages	35-58
Keywords
Abstract	Automatic image and video exploitation or content analysis is a technique to extract higher-level information from a scene such as objects, behavior, (inter-)actions, environment, or even weather conditions. The relevant information is assumed to be contained in the two-dimensional signal provided in an image (width and height in pixels) or the three-dimensional signal provided in a video (width, height, and time). But also intermediate-level information such as object classes [196], locations [197], or motion [198] can help applications to fulfill certain tasks such as intelligent compression [199], video summarization [200], or video retrieval [201]. Usually, videos with their temporal dimension are a richer source of data compared to single images [202] and thus certain video content can be extracted from videos only such as object motion or object behavior. Often, machine learning or nowadays deep learning techniques are utilized to model prior knowledge about object or scene appearance using labeled training samples [203, 204]. After a learning phase, these models are then applied in real world applications, which is called inference.
Address
Corporate Author				Thesis
Publisher	Springer	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	SLCV
Series Volume		Series Issue		Edition
ISSN		ISBN	978-3-031-00698-2	Medium
Area		Expedition		Conference
Notes	MSIAU; MACO			Approved	no
Call Number	Admin @ si @ TSH2022c			Serial	3806



Author	Michael Teutsch; Angel Sappa; Riad I. Hammoud
Title	Image and Video Enhancement			Type	Book Chapter
Year	2022	Publication	Computer Vision in the Infrared Spectrum. Synthesis Lectures on Computer Vision	Abbreviated Journal
Volume		Issue		Pages	9-21
Keywords
Abstract	Image and video enhancement aims at improving the signal quality relative to imaging artifacts such as noise and blur or atmospheric perturbations such as turbulence and haze. It is usually performed in order to assist humans in analyzing image and video content or simply to present humans visually appealing images and videos. However, image and video enhancement can also be used as a preprocessing technique to ease the task and thus improve the performance of subsequent automatic image content analysis algorithms: preceding dehazing can improve object detection as shown by [23] or explicit turbulence modeling can improve moving object detection as discussed by [24]. But it remains an open question whether image and video enhancement should rather be performed explicitly as a preprocessing step or implicitly for example by feeding affected images directly to a neural network for image content analysis like object detection [25]. Especially for real-time video processing at low latency it can be better to handle image perturbation implicitly in order to minimize the processing time of an algorithm. This can be achieved by making algorithms for image content analysis robust or even invariant to perturbations such as noise or blur. Additionally, mistakes of an individual preprocessing module can obviously affect the quality of the entire processing pipeline.
Address
Corporate Author				Thesis
Publisher	Springer	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	SLCV
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MSIAU; MACO			Approved	no
Call Number	Admin @ si @ TSH2022a			Serial	3807



Author	Alex Falcon; Swathikiran Sudhakaran; Giuseppe Serra; Sergio Escalera; Oswald Lanz
Title	Relevance-based Margin for Contrastively-trained Video Retrieval Models			Type	Conference Article
Year	2022	Publication	ICMR '22: Proceedings of the 2022 International Conference on Multimedia Retrieval	Abbreviated Journal
Volume		Issue		Pages	146-157
Keywords
Abstract	Video retrieval using natural language queries has attracted increasing interest due to its relevance in real-world applications, from intelligent access in private media galleries to web-scale video search. Learning the cross-similarity of video and text in a joint embedding space is the dominant approach. To do so, a contrastive loss is usually employed because it organizes the embedding space by putting similar items close and dissimilar items far. This framework leads to competitive recall rates, as they solely focus on the rank of the groundtruth items. Yet, assessing the quality of the ranking list is of utmost importance when considering intelligent retrieval systems, since multiple items may share similar semantics, hence a high relevance. Moreover, the aforementioned framework uses a fixed margin to separate similar and dissimilar items, treating all non-groundtruth items as equally irrelevant. In this paper we propose to use a variable margin: we argue that varying the margin used during training based on how relevant an item is to a given query, i.e. a relevance-based margin, easily improves the quality of the ranking lists measured through nDCG and mAP. We demonstrate the advantages of our technique using different models on EPIC-Kitchens-100 and YouCook2. We show that even if we carefully tuned the fixed margin, our technique (which does not have the margin as a hyper-parameter) would still achieve better performance. Finally, extensive ablation studies and qualitative analysis support the robustness of our approach. Code will be released at https://github.com/aranciokov/RelevanceMargin-ICMR22.
Address	Newark, NJ, USA, 27 June 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICMR
Notes	HuPBA; no menciona			Approved	no
Call Number	Admin @ si @ FSS2022			Serial	3808
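The relevance-based margin described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the hinge form of the loss, the mapping `margin = 1 - relevance`, and the function name `relevance_margin_loss` are assumptions made here for illustration only.

```python
def relevance_margin_loss(sim_pos, sim_negs, relevances):
    """Hinge-style ranking loss with one margin per non-groundtruth item.

    sim_pos:    similarity between the query and its groundtruth item
    sim_negs:   similarities between the query and the other items
    relevances: relevance of each other item to the query, in [0, 1]
    """
    total = 0.0
    for sim_neg, rel in zip(sim_negs, relevances):
        # Assumed mapping: the more relevant an item is, the smaller the
        # margin that separates it from the groundtruth item.
        margin = 1.0 - rel
        total += max(0.0, margin + sim_neg - sim_pos)
    return total / len(sim_negs)


# Two non-groundtruth items with the same similarity but different relevance:
# only the irrelevant one (relevance 0.0) is pushed away.
loss = relevance_margin_loss(0.9, [0.5, 0.5], [0.0, 1.0])
```

With a fixed margin both items would contribute equally to the loss; here the highly relevant item contributes nothing, which is the behaviour the abstract argues for.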



Author	Utkarsh Porwal; Alicia Fornes; Faisal Shafait (eds)
Title	Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition. 18th International Conference, ICFHR 2022			Type	Book Whole
Year	2022	Publication	Frontiers in Handwriting Recognition.	Abbreviated Journal
Volume	13639	Issue		Pages
Keywords
Abstract
Address	ICFHR 2022, Hyderabad, India, December 4–7, 2022
Corporate Author				Thesis
Publisher	Springer	Place of Publication		Editor	Utkarsh Porwal; Alicia Fornes; Faisal Shafait
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN	978-3-031-21648-0	Medium
Area		Expedition		Conference	ICFHR
Notes	DAG			Approved	no
Call Number	Admin @ si @ PFS2022			Serial	3809



Author	Jorge Charco; Angel Sappa; Boris X. Vintimilla; Henry Velesaca
Title	Human Body Pose Estimation in Multi-view Environments			Type	Book Chapter
Year	2022	Publication	ICT Applications for Smart Cities. Intelligent Systems Reference Library	Abbreviated Journal
Volume	224	Issue		Pages	79-99
Keywords
Abstract	This chapter tackles the challenging problem of human pose estimation in multi-view environments to handle scenes with self-occlusions. The proposed approach starts by first estimating the camera pose (extrinsic parameters) in multi-view scenarios; because few real image datasets are available, different virtual scenes are generated with a special simulator for training and testing the proposed convolutional neural network based approaches. Then, these extrinsic parameters are used to establish the relation between the different cameras in the multi-view scheme, which captures the pose of the person from different points of view at the same time. The proposed multi-view scheme allows robust estimation of human body joint positions even in situations where they are occluded. This would help to avoid possible false alarms in behavioral analysis systems of smart cities, as well as in applications for physical therapy and safe moving assistance for the elderly, among others. The chapter concludes by presenting experimental results in real scenes using state-of-the-art and the proposed multi-view approaches.
Address	September 2022
Corporate Author				Thesis
Publisher	Springer	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	ISRL
Series Volume		Series Issue		Edition
ISSN		ISBN	978-3-031-06306-0	Medium
Area		Expedition		Conference
Notes	MSIAU; MACO			Approved	no
Call Number	Admin @ si @ CSV2022b			Serial	3810



Author	Henry Velesaca; Patricia Suarez; Dario Carpio; Rafael E. Rivadeneira; Angel Sanchez; Angel Morera
Title	Video Analytics in Urban Environments: Challenges and Approaches			Type	Book Chapter
Year	2022	Publication	ICT Applications for Smart Cities	Abbreviated Journal
Volume	224	Issue		Pages	101-121
Keywords
Abstract	This chapter reviews state-of-the-art approaches generally present in the pipeline of video analytics on urban scenarios. A typical pipeline is used to cluster approaches in the literature, including image preprocessing, object detection, object classification, and object tracking modules. Then, a review of recent approaches for each module is given. Additionally, applications and datasets generally used for training and evaluating the performance of these approaches are included. This chapter does not claim to be an exhaustive review of state-of-the-art video analytics in urban environments but rather an illustration of some of the different recent contributions. The chapter concludes by presenting current trends in video analytics in the urban scenario field.
Address	September 2022
Corporate Author				Thesis
Publisher	Springer	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	ISRL
Series Volume		Series Issue		Edition
ISSN		ISBN	978-3-031-06306-0	Medium
Area		Expedition		Conference
Notes	MSIAU; MACO			Approved	no
Call Number	Admin @ si @ VSC2022			Serial	3811



Author	Guillermo Torres; Debora Gil; Antoni Rosell; S. Mena; Carles Sanchez
Title	Virtual Radiomics Biopsy for the Histological Diagnosis of Pulmonary Nodules – Intermediate Results of the RadioLung Project			Type	Journal Article
Year	2023	Publication	International Journal of Computer Assisted Radiology and Surgery	Abbreviated Journal	IJCARS
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	IAM			Approved	no
Call Number	Admin @ si @ TGM2023			Serial	3830



Author	Angel Sappa (ed)
Title	ICT Applications for Smart Cities			Type	Book Whole
Year	2022	Publication	ICT Applications for Smart Cities	Abbreviated Journal
Volume	224	Issue		Pages
Keywords	Computational Intelligence; Intelligent Systems; Smart Cities; ICT Applications; Machine Learning; Pattern Recognition; Computer Vision; Image Processing
Abstract	Part of the book series: Intelligent Systems Reference Library (ISRL). This book is the result of four years of work in the framework of the Ibero-American Research Network TICs4CI funded by the CYTED program. In the following decades, 85% of the world's population is expected to live in cities; hence, urban centers should be prepared to provide smart solutions for problems ranging from video surveillance and intelligent mobility to solid waste recycling processes, just to mention a few. More specifically, the book describes underlying technologies and practical implementations of several successful case studies of ICTs developed in the following smart city areas: urban environment monitoring; intelligent mobility; waste recycling processes; video surveillance; computer-aided diagnosis in healthcare systems; and computer vision-based approaches for efficiency in production processes. The book is intended for researchers and engineers in the field of ICTs for smart cities, as well as anyone who wants to know about state-of-the-art approaches and challenges in this field.
Address	September 2022
Corporate Author				Thesis
Publisher	Springer	Place of Publication		Editor	Angel Sappa
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	ISRL
Series Volume		Series Issue		Edition
ISSN		ISBN	978-3-031-06306-0	Medium
Area		Expedition		Conference
Notes	MSIAU; MACO			Approved	no
Call Number	Admin @ si @ Sap2022			Serial	3812



Author	Victoria Ruiz; Angel Sanchez; Jose F. Velez; Bogdan Raducanu
Title	Waste Classification with Small Datasets and Limited Resources			Type	Book Chapter
Year	2022	Publication	ICT Applications for Smart Cities. Intelligent Systems Reference Library	Abbreviated Journal
Volume	224	Issue		Pages	185-203
Keywords
Abstract	Automatic waste recycling has become a very important societal challenge nowadays, raising people’s awareness for a cleaner environment and a more sustainable lifestyle. With the transition to Smart Cities, and thanks to advanced ICT solutions, this problem has received a new impulse. The waste recycling focus has shifted from general waste treatment facilities to individual responsibility, where each person should become aware of selective waste separation. The surge of mobile devices, accompanied by a significant increase in computation power, has strengthened and facilitated this individual role. An automated image-based waste classification mechanism can help with more efficient recycling and a reduction of contamination from residuals. Despite the good results achieved with deep learning methodologies for this task, their Achilles’ heel is that they require large neural networks which need significant computational resources for training and therefore are not suitable for mobile devices. To circumvent this apparently intractable problem, we rely on knowledge distillation in order to transfer the network’s knowledge from a larger network (called ‘teacher’) to a smaller, more compact one (referred to as ‘student’), thus making image classification possible on a device with limited resources. For evaluation, we considered as ‘teachers’ large architectures such as InceptionResNet or DenseNet and as ‘students’ several configurations of the MobileNets. We used the publicly available TrashNet dataset to demonstrate that the distillation process does not significantly affect the performance (e.g. classification accuracy) of the student network.
Address	September 2022
Corporate Author				Thesis
Publisher	Springer	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	ISRL
Series Volume		Series Issue		Edition
ISSN		ISBN	978-3-031-06306-0	Medium
Area		Expedition		Conference
Notes	LAMP			Approved	no
Call Number	Admin @ si @			Serial	3813
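The teacher-to-student transfer mentioned in the abstract above can be sketched with the standard temperature-scaled distillation loss. The record does not state which loss the chapter uses, so the formulation below (soft-target KL divergence plus hard-label cross-entropy) and the names `temperature` and `alpha` are assumptions for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0)
    # Ordinary cross-entropy of the student against the true label.
    ce = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft-term gradients on a comparable scale.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# A student that disagrees with the teacher is penalised by the soft term.
loss = distillation_loss([0.5, 1.5, 0.0], [3.0, 1.0, 0.2], label=0)
```

When the student reproduces the teacher's logits exactly, the soft-target term is zero and only the hard-label cross-entropy remains.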



Author	Shiqi Yang; Yaxing Wang; Kai Wang; Shangling Jui; Joost Van de Weijer
Title	Local Prediction Aggregation: A Frustratingly Easy Source-free Domain Adaptation Method			Type	Miscellaneous
Year	2022	Publication	Arxiv	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	We propose a simple but effective source-free domain adaptation (SFDA) method. Treating SFDA as an unsupervised clustering problem and following the intuition that local neighbors in feature space should have more similar predictions than other features, we propose to optimize an objective of prediction consistency. This objective encourages local neighborhood features in feature space to have similar predictions while features farther away in feature space have dissimilar predictions, leading to efficient feature clustering and cluster assignment simultaneously. For efficient training, we seek to optimize an upper-bound of the objective resulting in two simple terms. Furthermore, we relate popular existing methods in domain adaptation, source-free domain adaptation and contrastive learning via the perspective of discriminability and diversity. The experimental results prove the superiority of our method, and our method can be adopted as a simple but strong baseline for future research in SFDA. Our method can be also adapted to source-free open-set and partial-set DA which further shows the generalization ability of our method. Code is available in this https URL.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; 600.147			Approved	no
Call Number	Admin @ si @ YWW2022b			Serial	3815
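The prediction-consistency objective in the abstract above (similar predictions for local neighbors, dissimilar predictions for features farther away) can be sketched as two dot-product terms. This is an illustrative reading of the abstract, not the authors' code: the precomputed `neighbors` lists, the weight `lam`, and the function names are assumptions.

```python
def dot(p, q):
    """Inner product of two prediction vectors."""
    return sum(a * b for a, b in zip(p, q))


def prediction_consistency_loss(preds, neighbors, lam=1.0):
    """Attract predictions of neighbors, disperse all other predictions.

    preds:     one softmax prediction vector per target sample
    neighbors: neighbors[i] lists the indices of sample i's nearest
               neighbors in feature space (assumed to be precomputed)
    lam:       weight of the dispersion term (hypothetical parameter)
    """
    total = 0.0
    for i, p_i in enumerate(preds):
        near = set(neighbors[i])
        for j, p_j in enumerate(preds):
            if j == i:
                continue
            if j in near:
                total -= dot(p_i, p_j)        # pull neighbors together
            else:
                total += lam * dot(p_i, p_j)  # push the rest apart
    return total / len(preds)


# Samples 0 and 1 are mutual neighbors and agree; sample 2 differs from both.
preds = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
loss = prediction_consistency_loss(preds, [[1], [0], []])
```

Minimising this quantity rewards agreement inside a neighborhood and disagreement outside it, which is how clustering and cluster assignment can emerge together without access to source data.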



Author	Kunal Biswas; Palaiahnakote Shivakumara; Umapada Pal; Tong Lu; Michel Blumenstein; Josep Llados
Title	Classification of aesthetic natural scene images using statistical and semantic features			Type	Journal Article
Year	2023	Publication	Multimedia Tools and Applications	Abbreviated Journal	MTAP
Volume	82	Issue	9	Pages	13507-13532
Keywords
Abstract	Aesthetic image analysis is essential for improving the performance of multimedia image retrieval systems, especially from a repository of social media and multimedia content stored on mobile devices. This paper presents a novel method for classifying aesthetic natural scene images by studying the naturalness of image content using statistical features, and reading text in the images using semantic features. Unlike existing methods that focus only on image quality with human information, the proposed approach focuses on image features as well as text-based semantic features without human intervention to reduce the gap between subjectivity and objectivity in the classification. The aesthetic classes considered in this work are (i) Very Pleasant, (ii) Pleasant, (iii) Normal and (iv) Unpleasant. The naturalness is represented by features of focus, defocus, perceived brightness, perceived contrast, blurriness and noisiness, while semantics are represented by text recognition, description of the images and labels of images, profile pictures, and banner images. Furthermore, a deep learning model is proposed in a novel way to fuse statistical and semantic features for the classification of aesthetic natural scene images. Experiments on our own dataset and the standard datasets demonstrate that the proposed approach achieves 92.74%, 88.67% and 83.22% average classification rates on our own dataset, AVA dataset and CUHKPQ dataset, respectively. Furthermore, a comparative study of the proposed model with the existing methods shows that the proposed method is effective for the classification of aesthetic social media images.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG			Approved	no
Call Number	Admin @ si @ BSP2023			Serial	3873



Author	Asma Bensalah; Antonio Parziale; Giuseppe De Gregorio; Angelo Marcelli; Alicia Fornes; Josep Llados
Title	I Can’t Believe It’s Not Better: In-air Movement for Alzheimer Handwriting Synthetic Generation			Type	Conference Article
Year	2023	Publication	21st International Graphonomics Conference	Abbreviated Journal
Volume		Issue		Pages	136–148
Keywords
Abstract	During recent years, there has been a boom in the use of deep learning for handwriting analysis and recognition. One main application for handwriting analysis is early detection and diagnosis in the health field. Unfortunately, most real case problems still suffer from a scarcity of data, which makes the use of deep learning-based models difficult. To alleviate this problem, some works resort to synthetic data generation. Lately, more works are directed towards guided synthetic data generation, a generation that uses the domain and data knowledge to generate realistic data that can be useful to train deep learning models. In this work, we combine the domain knowledge about Alzheimer’s disease for handwriting and use it for a more guided data generation. Concretely, we have explored the use of in-air movements for synthetic data generation.
Address	Evora, Portugal, October 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	IGS
Notes	DAG			Approved	no
Call Number	Admin @ si @ BPG2023			Serial	3838



Author	Ali Furkan Biten; Ruben Tito; Lluis Gomez; Ernest Valveny; Dimosthenis Karatzas
Title	OCR-IDL: OCR Annotations for Industry Document Library Dataset			Type	Conference Article
Year	2022	Publication	ECCV Workshop on Text in Everything	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Pretraining has proven successful in Document Intelligence tasks where a deluge of documents is used to pretrain the models, only later to be finetuned on downstream tasks. One of the problems of the pretraining approaches is the inconsistent usage of pretraining data with different OCR engines, leading to incomparable results between models. In other words, it is not obvious whether the performance gain is coming from the differing amounts of data and distinct OCR engines or from the proposed models. To remedy the problem, we make public the OCR annotations for IDL documents using a commercial OCR engine, given its superior performance over open source OCR models. The contributed dataset (OCR-IDL) has an estimated monetary value over 20K US$. It is our hope that OCR-IDL can be a starting point for future works on Document Intelligence. All of our data and its collection process with the annotations can be found in this https URL.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECCV
Notes	DAG; no proj			Approved	no
Call Number	Admin @ si @ BTG2022			Serial	3817



Author	Shiqi Yang; Yaxing Wang; Kai Wang; Shangling Jui; Joost Van de Weijer
Title	One Ring to Bring Them All: Towards Open-Set Recognition under Domain Shift			Type	Miscellaneous
Year	2022	Publication	Arxiv	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	In this paper, we investigate model adaptation under domain and category shift, where the final goal is to achieve source-free universal domain adaptation (SF-UNDA), which addresses the situation where there exist both domain and category shifts between source and target domains. Under the SF-UNDA setting, the model cannot access source data anymore during target adaptation, which aims to address data privacy concerns. We propose a novel training scheme to learn an (n+1)-way classifier to predict the n source classes and the unknown class, where samples of only known source categories are available for training. Furthermore, for target adaptation, we simply adopt a weighted entropy minimization to adapt the source pretrained model to the unlabeled target domain without source data. In experiments, we show: after source training, the resulting source model can get excellent performance for open-set single domain generalization; after target adaptation, our method surpasses current UNDA approaches which demand source data during adaptation. The versatility to several different tasks strongly proves the efficacy and generalization ability of our method. When augmented with a closed-set domain adaptation approach during target adaptation, our source-free method further outperforms the current state-of-the-art UNDA method by 2.5%, 7.2% and 13% on Office-31, Office-Home and VisDA respectively.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; no proj			Approved	no
Call Number	Admin @ si @ YWW2022c			Serial	3818