Publicacions CVC -- Query Results

	Publicacions CVC Home \| Show All \| Simple Search \| Advanced Search \| Add Record \| Import	Login Quick Search: Field: contains: ...
	106–120 of 140 records found matching your query (RSS \| history):

Search & Display Options

Select All Deselect All

<< 1 2 3 4 5 6 7 8 9 10 >>

List View

Citations

Details

	Records
	Author	Mohamed Ramzy Ibrahim; Robert Benavente; Felipe Lumbreras; Daniel Ponsa
	Title	3DRRDB: Super Resolution of Multiple Remote Sensing Images using 3D Residual in Residual Dense Blocks			Type	Conference Article
	Year	2022	Publication	CVPR 2022 Workshop on IEEE Perception Beyond the Visible Spectrum workshop series (PBVS, 18th Edition)	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	Training; Solid modeling; Three-dimensional displays; PSNR; Convolution; Superresolution; Pattern recognition
	Abstract	The rapid advancement of Deep Convolutional Neural Networks helped in solving many remote sensing problems, especially the problems of super-resolution. However, most state-of-the-art methods focus more on Single Image Super-Resolution neglecting Multi-Image Super-Resolution. In this work, a new proposed 3D Residual in Residual Dense Blocks model (3DRRDB) focuses on remote sensing Multi-Image Super-Resolution for two different single spectral bands. The proposed 3DRRDB model explores the idea of 3D convolution layers in deeply connected Dense Blocks and the effect of local and global residual connections with residual scaling in Multi-Image Super-Resolution. The model tested on the Proba-V challenge dataset shows a significant improvement above the current state-of-the-art models scoring a Corrected Peak Signal to Noise Ratio (cPSNR) of 48.79 dB and 50.83 dB for Near Infrared (NIR) and RED Bands respectively. Moreover, the proposed 3DRRDB model scores a Corrected Structural Similarity Index Measure (cSSIM) of 0.9865 and 0.9909 for NIR and RED bands respectively.
	Address	New Orleans, USA; 19 June 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	MSIAU; 600.130			Approved	no
	Call Number	Admin @ si @ IBL2022			Serial	3693
Permanent link to this record



	Author	Joakim Bruslund Haurum; Meysam Madadi; Sergio Escalera; Thomas B. Moeslund
	Title	Multi-Task Classification of Sewer Pipe Defects and Properties Using a Cross-Task Graph Neural Network Decoder			Type	Conference Article
	Year	2022	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	2806-2817
	Keywords	Vision Systems; Applications Multi-Task Classification
	Abstract	The sewerage infrastructure is one of the most important and expensive infrastructures in modern society. In order to efficiently manage the sewerage infrastructure, automated sewer inspection has to be utilized. However, while sewer defect classification has been investigated for decades, little attention has been given to classifying sewer pipe properties such as water level, pipe material, and pipe shape, which are needed to evaluate the level of sewer pipe deterioration. In this work we classify sewer pipe defects and properties concurrently and present a novel decoder-focused multi-task classification architecture Cross-Task Graph Neural Network (CT-GNN), which refines the disjointed per-task predictions using cross-task information. The CT-GNN architecture extends the traditional disjointed task-heads decoder, by utilizing a cross-task graph and unique class node embeddings. The cross-task graph can either be determined a priori based on the conditional probability between the task classes or determined dynamically using self-attention. CT-GNN can be added to any backbone and trained end-toend at a small increase in the parameter count. We achieve state-of-the-art performance on all four classification tasks in the Sewer-ML dataset, improving defect classification and water level classification by 5.3 and 8.0 percentage points, respectively. We also outperform the single task methods as well as other multi-task classification approaches while introducing 50 times fewer parameters than previous modelfocused approaches.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ BME2022			Serial	3638
Permanent link to this record



	Author	Silvio Giancola; Anthony Cioppa; Adrien Deliege; Floriane Magera; Vladimir Somers; Le Kang; Xin Zhou; Olivier Barnich; Christophe De Vleeschouwer; Alexandre Alahi; Bernard Ghanem; Marc Van Droogenbroeck; Abdulrahman Darwish; Adrien Maglo; Albert Clapes; Andreas Luyts; Andrei Boiarov; Artur Xarles; Astrid Orcesi; Avijit Shah; Baoyu Fan; Bharath Comandur; Chen Chen; Chen Zhang; Chen Zhao; Chengzhi Lin; Cheuk-Yiu Chan; Chun Chuen Hui; Dengjie Li; Fan Yang; Fan Liang; Fang Da; Feng Yan; Fufu Yu; Guanshuo Wang; H. Anthony Chan; He Zhu; Hongwei Kan; Jiaming Chu; Jianming Hu; Jianyang Gu; Jin Chen; Joao V. B. Soares; Jonas Theiner; Jorge De Corte; Jose Henrique Brito; Jun Zhang; Junjie Li; Junwei Liang; Leqi Shen; Lin Ma; Lingchi Chen; Miguel Santos Marques; Mike Azatov; Nikita Kasatkin; Ning Wang; Qiong Jia; Quoc Cuong Pham; Ralph Ewerth; Ran Song; Rengang Li; Rikke Gade; Ruben Debien; Runze Zhang; Sangrok Lee; Sergio Escalera; Shan Jiang; Shigeyuki Odashima; Shimin Chen; Shoichi Masui; Shouhong Ding; Sin-wai Chan; Siyu Chen; Tallal El-Shabrawy; Tao He; Thomas B. Moeslund; Wan-Chi Siu; Wei Zhang; Wei Li; Xiangwei Wang; Xiao Tan; Xiaochuan Li; Xiaolin Wei; Xiaoqing Ye; Xing Liu; Xinying Wang; Yandong Guo; Yaqian Zhao; Yi Yu; Yingying Li; Yue He; Yujie Zhong; Zhenhua Guo; Zhiheng Li
	Title	SoccerNet 2022 Challenges Results			Type	Conference Article
	Year	2022	Publication	5th International ACM Workshop on Multimedia Content Analysis in Sports	Abbreviated Journal
	Volume		Issue		Pages	75-86
	Keywords
	Abstract	The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team. In 2022, the challenges were composed of 6 vision-based tasks: (1) action spotting, focusing on retrieving action timestamps in long untrimmed videos, (2) replay grounding, focusing on retrieving the live moment of an action shown in a replay, (3) pitch localization, focusing on detecting line and goal part elements, (4) camera calibration, dedicated to retrieving the intrinsic and extrinsic camera parameters, (5) player re-identification, focusing on retrieving the same players across multiple views, and (6) multiple object tracking, focusing on tracking players and the ball through unedited video streams. Compared to last year's challenges, tasks (1-2) had their evaluation metrics redefined to consider tighter temporal accuracies, and tasks (3-6) were novel, including their underlying data and annotations. More information on the tasks, challenges and leaderboards are available on this https URL. Baselines and development kits are available on this https URL.
	Address	Lisboa; Portugal; October 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ACMW
	Notes	HUPBA; no menciona			Approved	no
	Call Number	Admin @ si @ GCD2022			Serial	3801
Permanent link to this record



	Author	Ali Furkan Biten; Andres Mafla; Lluis Gomez; Dimosthenis Karatzas
	Title	Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching			Type	Conference Article
	Year	2022	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	1391-1400
	Keywords	Measurement; Training; Integrated circuits; Annotations; Semantics; Training data; Semisupervised learning
	Abstract	The task of image-text matching aims to map representations from different modalities into a common joint visual-textual embedding. However, the most widely used datasets for this task, MSCOCO and Flickr30K, are actually image captioning datasets that offer a very limited set of relationships between images and sentences in their ground-truth annotations. This limited ground truth information forces us to use evaluation metrics based on binary relevance: given a sentence query we consider only one image as relevant. However, many other relevant images or captions may be present in the dataset. In this work, we propose two metrics that evaluate the degree of semantic relevance of retrieved items, independently of their annotated binary relevance. Additionally, we incorporate a novel strategy that uses an image captioning metric, CIDEr, to define a Semantic Adaptive Margin (SAM) to be optimized in a standard triplet loss. By incorporating our formulation to existing models, a large improvement is obtained in scenarios where available training data is limited. We also demonstrate that the performance on the annotated image-caption pairs is maintained while improving on other non-annotated relevant items when employing the full training set. The code for our new metric can be found at github. com/furkanbiten/ncsmetric and the model implementation at github. com/andrespmd/semanticadaptive_margin.
	Address	Virtual; Waikoloa; Hawai; USA; January 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	DAG; 600.155; 302.105;			Approved	no
	Call Number	Admin @ si @ BMG2022			Serial	3663
Permanent link to this record



	Author	Arnau Baro
	Title	Reading Music Systems: From Deep Optical Music Recognition to Contextual Methods			Type	Book Whole
	Year	2022	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	The transcription of sheet music into some machine-readable format can be carried out manually. However, the complexity of music notation inevitably leads to burdensome software for music score editing, which makes the whole process very time-consuming and prone to errors. Consequently, automatic transcription systems for musical documents represent interesting tools. Document analysis is the subject that deals with the extraction and processing of documents through image and pattern recognition. It is a branch of computer vision. Taking music scores as source, the field devoted to address this task is known as Optical Music Recognition (OMR). Typically, an OMR system takes an image of a music score and automatically extracts its content into some symbolic structure such as MEI or MusicXML. In this dissertation, we have investigated different methods for recognizing a single staff section (e.g. scores for violin, flute, etc.), much in the same way as most text recognition research focuses on recognizing words appearing in a given line image. These methods are based in two different methodologies. On the one hand, we present two methods based on Recurrent Neural Networks, in particular, the Long Short-Term Memory Neural Network. On the other hand, a method based on Sequence to Sequence models is detailed. Music context is needed to improve the OMR results, just like language models and dictionaries help in handwriting recognition. For example, syntactical rules and grammars could be easily defined to cope with the ambiguities in the rhythm. In music theory, for example, the time signature defines the amount of beats per bar unit. Thus, in the second part of this dissertation, different methodologies have been investigated to improve the OMR recognition. We have explored three different methods: (a) a graphic tree-structure representation, Dendrograms, that joins, at each level, its primitives following a set of rules, (b) the incorporation of Language Models to model the probability of a sequence of tokens, and (c) graph neural networks to analyze the music scores to avoid meaningless relationships between music primitives. Finally, to train all these methodologies, and given the method-specificity of the datasets in the literature, we have created four different music datasets. Two of them are synthetic with a modern or old handwritten appearance, whereas the other two are real handwritten scores, being one of them modern and the other old.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	IMPRIMA	Place of Publication		Editor	Alicia Fornes
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-124793-8-6	Medium
	Area		Expedition		Conference
	Notes	DAG;			Approved	no
	Call Number	Admin @ si @ Bar2022			Serial	3754
Permanent link to this record



	Author	Henry Velesaca; Patricia Suarez; Dario Carpio; Rafael E. Rivadeneira; Angel Sanchez; Angel Morera
	Title	Video Analytics in Urban Environments: Challenges and Approaches			Type	Book Chapter
	Year	2022	Publication	ICT Applications for Smart Cities	Abbreviated Journal
	Volume	224	Issue		Pages	101-121
	Keywords
	Abstract	This chapter reviews state-of-the-art approaches generally present in the pipeline of video analytics on urban scenarios. A typical pipeline is used to cluster approaches in the literature, including image preprocessing, object detection, object classification, and object tracking modules. Then, a review of recent approaches for each module is given. Additionally, applications and datasets generally used for training and evaluating the performance of these approaches are included. This chapter does not pretend to be an exhaustive review of state-of-the-art video analytics in urban environments but rather an illustration of some of the different recent contributions. The chapter concludes by presenting current trends in video analytics in the urban scenario field.
	Address	September 2022
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	ISRL
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-031-06306-0	Medium
	Area		Expedition		Conference
	Notes	MSIAU; MACO			Approved	no
	Call Number	Admin @ si @ VSC2022			Serial	3811
Permanent link to this record



	Author	Jorge Charco; Angel Sappa; Boris X. Vintimilla; Henry Velesaca
	Title	Human Body Pose Estimation in Multi-view Environments			Type	Book Chapter
	Year	2022	Publication	ICT Applications for Smart Cities. Intelligent Systems Reference Library	Abbreviated Journal
	Volume	224	Issue		Pages	79-99
	Keywords
	Abstract	This chapter tackles the challenging problem of human pose estimation in multi-view environments to handle scenes with self-occlusions. The proposed approach starts by first estimating the camera pose—extrinsic parameters—in multi-view scenarios; due to few real image datasets, different virtual scenes are generated by using a special simulator, for training and testing the proposed convolutional neural network based approaches. Then, these extrinsic parameters are used to establish the relation between different cameras into the multi-view scheme, which captures the pose of the person from different points of view at the same time. The proposed multi-view scheme allows to robustly estimate human body joints’ position even in situations where they are occluded. This would help to avoid possible false alarms in behavioral analysis systems of smart cities, as well as applications for physical therapy, safe moving assistance for the elderly among other. The chapter concludes by presenting experimental results in real scenes by using state-of-the-art and the proposed multi-view approaches.
	Address	September 2022
	Corporate Author				Thesis
	Publisher	Springer	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	ISRL
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-031-06306-0	Medium
	Area		Expedition		Conference
	Notes	MSIAU; MACO			Approved	no
	Call Number	Admin @ si @ CSV2022b			Serial	3810
Permanent link to this record



	Author	Spencer Low; Oliver Nina; Angel Sappa; Erik Blasch; Nathan Inkawhich
	Title	Multi-Modal Aerial View Object Classification Challenge Results – PBVS 2022			Type	Conference Article
	Year	2022	Publication	IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)	Abbreviated Journal
	Volume		Issue		Pages	350-358
	Keywords
	Abstract	This paper details the results and main findings of the second iteration of the Multi-modal Aerial View Object Classification (MAVOC) challenge. The primary goal of both MAVOC challenges is to inspire research into methods for building recognition models that utilize both synthetic aperture radar (SAR) and electro-optical (EO) imagery. Teams are encouraged to develop multi-modal approaches that incorporate complementary information from both domains. While the 2021 challenge showed a proof of concept that both modalities could be used together, the 2022 challenge focuses on the detailed multi-modal methods. The 2022 challenge uses the same UNIfied Coincident Optical and Radar for recognitioN (UNICORN) dataset and competition format that was used in 2021. Specifically, the challenge focuses on two tasks, (1) SAR classification and (2) SAR + EO classification. The bulk of this document is dedicated to discussing the top performing methods and describing their performance on our blind test set. Notably, all of the top ten teams outperform a Resnet-18 baseline. For SAR classification, the top team showed a 129% improvement over baseline and an 8% average improvement from the 2021 winner. The top team for SAR + EO classification shows a 165% improvement with a 32% average improvement over 2021.
	Address	New Orleans; USA; June 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	MSIAU			Approved	no
	Call Number	Admin @ si @ LNS2022			Serial	3768
Permanent link to this record



	Author	Xavier Soria; Gonzalo Pomboza-Junez; Angel Sappa
	Title	LDC: Lightweight Dense CNN for Edge Detection			Type	Journal Article
	Year	2022	Publication	IEEE Access	Abbreviated Journal	ACCESS
	Volume	10	Issue		Pages	68281-68290
	Keywords
	Abstract	This paper presents a Lightweight Dense Convolutional (LDC) neural network for edge detection. The proposed model is an adaptation of two state-of-the-art approaches, but it requires less than 4% of parameters in comparison with these approaches. The proposed architecture generates thin edge maps and reaches the highest score (i.e., ODS) when compared with lightweight models (models with less than 1 million parameters), and reaches a similar performance when compare with heavy architectures (models with about 35 million parameters). Both quantitative and qualitative results and comparisons with state-of-the-art models, using different edge detection datasets, are provided. The proposed LDC does not use pre-trained weights and requires straightforward hyper-parameter settings. The source code is released at https://github.com/xavysp/LDC
	Address	27 June 2022
	Corporate Author				Thesis
	Publisher	IEEE	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MSIAU; MACO; 600.160; 600.167			Approved	no
	Call Number	Admin @ si @ SPS2022			Serial	3751
Permanent link to this record



	Author	Jorge Charco; Angel Sappa; Boris X. Vintimilla
	Title	Human Pose Estimation through a Novel Multi-view Scheme			Type	Conference Article
	Year	2022	Publication	17th International Conference on Computer Vision Theory and Applications (VISAPP 2022)	Abbreviated Journal
	Volume	5	Issue		Pages	855-862
	Keywords	Multi-view Scheme; Human Pose Estimation; Relative Camera Pose; Monocular Approach
	Abstract	This paper presents a multi-view scheme to tackle the challenging problem of the self-occlusion in human pose estimation problem. The proposed approach first obtains the human body joints of a set of images, which are captured from different views at the same time. Then, it enhances the obtained joints by using a multi-view scheme. Basically, the joints from a given view are used to enhance poorly estimated joints from another view, especially intended to tackle the self occlusions cases. A network architecture initially proposed for the monocular case is adapted to be used in the proposed multi-view scheme. Experimental results and comparisons with the state-of-the-art approaches on Human3.6m dataset are presented showing improvements in the accuracy of body joints estimations.
	Address	On line; Feb 6, 2022 – Feb 8, 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	2184-4321	ISBN	978-989-758-555-5	Medium
	Area		Expedition		Conference	VISAPP
	Notes	MSIAU; 600.160			Approved	no
	Call Number	Admin @ si @ CSV2022			Serial	3689
Permanent link to this record



	Author	Patricia Suarez; Dario Carpio; Angel Sappa; Henry Velesaca
	Title	Transformer based Image Dehazing			Type	Conference Article
	Year	2022	Publication	16th IEEE International Conference on Signal Image Technology & Internet Based System	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	atmospheric light; brightness component; computational cost; dehazing quality; haze-free image
	Abstract	This paper presents a novel approach to remove non homogeneous haze from real images. The proposed method consists mainly of image feature extraction, haze removal, and image reconstruction. To accomplish this challenging task, we propose an architecture based on transformers, which have been recently introduced and have shown great potential in different computer vision tasks. Our model is based on the SwinIR an image restoration architecture based on a transformer, but by modifying the deep feature extraction module, the depth level of the model, and by applying a combined loss function that improves styling and adapts the model for the non-homogeneous haze removal present in images. The obtained results prove to be superior to those obtained by state-of-the-art models.
	Address	Dijon; France; October 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	SITIS
	Notes	MSIAU; no proj			Approved	no
	Call Number	Admin @ si @ SCS2022			Serial	3803
Permanent link to this record



	Author	Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla; Riad I. Hammoud
	Title	A Novel Domain Transfer-Based Approach for Unsupervised Thermal Image Super-Resolution			Type	Journal Article
	Year	2022	Publication	Sensors	Abbreviated Journal	SENS
	Volume	22	Issue	6	Pages	2254
	Keywords	Thermal image super-resolution; unsupervised super-resolution; thermal images; attention module; semiregistered thermal images
	Abstract	This paper presents a transfer domain strategy to tackle the limitations of low-resolution thermal sensors and generate higher-resolution images of reasonable quality. The proposed technique employs a CycleGAN architecture and uses a ResNet as an encoder in the generator along with an attention module and a novel loss function. The network is trained on a multi-resolution thermal image dataset acquired with three different thermal sensors. Results report better performance benchmarking results on the 2nd CVPR-PBVS-2021 thermal image super-resolution challenge than state-of-the-art methods. The code of this work is available online.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MSIAU;			Approved	no
	Call Number	Admin @ si @ RSV2022b			Serial	3688
Permanent link to this record



	Author	Iban Berganzo-Besga; Hector A. Orengo; Felipe Lumbreras; Paloma Aliende; Monica N. Ramsey
	Title	Automated detection and classification of multi-cell Phytoliths using Deep Learning-Based Algorithms			Type	Journal Article
	Year	2022	Publication	Journal of Archaeological Science	Abbreviated Journal	JArchSci
	Volume	148	Issue		Pages	105654
	Keywords
	Abstract	This paper presents an algorithm for automated detection and classification of multi-cell phytoliths, one of the major components of many archaeological and paleoenvironmental deposits. This identification, based on phytolith wave pattern, is made using a pretrained VGG19 deep learning model. This approach has been tested in three key phytolith genera for the study of agricultural origins in Near East archaeology: Avena, Hordeum and Triticum. Also, this classification has been validated at species-level using Triticum boeoticum and dicoccoides images. Due to the diversity of microscopes, cameras and chemical treatments that can influence images of phytolith slides, three types of data augmentation techniques have been implemented: rotation of the images at 45-degree angles, random colour and brightness jittering, and random blur/sharpen. The implemented workflow has resulted in an overall accuracy of 93.68% for phytolith genera, improving previous attempts. The algorithm has also demonstrated its potential to automatize the classification of phytoliths species with an overall accuracy of 100%. The open code and platforms employed to develop the algorithm assure the method's accessibility, reproducibility and reusability.
	Address	December 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MSIAU; MACO; 600.167			Approved	no
	Call Number	Admin @ si @ BOL2022			Serial	3753
Permanent link to this record



	Author	Sergi Garcia Bordils; Andres Mafla; Ali Furkan Biten; Oren Nuriel; Aviad Aberdam; Shai Mazor; Ron Litman; Dimosthenis Karatzas
	Title	Out-of-Vocabulary Challenge Report			Type	Conference Article
	Year	2022	Publication	Proceedings European Conference on Computer Vision Workshops	Abbreviated Journal
	Volume	13804	Issue		Pages	359–375
	Keywords
	Abstract	This paper presents final results of the Out-Of-Vocabulary 2022 (OOV) challenge. The OOV contest introduces an important aspect that is not commonly studied by Optical Character Recognition (OCR) models, namely, the recognition of unseen scene text instances at training time. The competition compiles a collection of public scene text datasets comprising of 326,385 images with 4,864,405 scene text instances, thus covering a wide range of data distributions. A new and independent validation and test set is formed with scene text instances that are out of vocabulary at training time. The competition was structured in two tasks, end-to-end and cropped scene text recognition respectively. A thorough analysis of results from baselines and different participants is presented. Interestingly, current state-of-the-art models show a significant performance gap under the newly studied setting. We conclude that the OOV dataset proposed in this challenge will be an essential area to be explored in order to develop scene text models that achieve more robust and generalized predictions.
	Address	Tel-Aviv; Israel; October 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ECCVW
	Notes	DAG; 600.155; 302.105; 611.002			Approved	no
	Call Number	Admin @ si @ GMB2022			Serial	3771
Permanent link to this record



	Author	Rafael E. Rivadeneira; Angel Sappa; Boris X. Vintimilla; Jin Kim; Dogun Kim; Zhihao Li; Yingchun Jian; Bo Yan; Leilei Cao; Fengliang Qi; Hongbin Wang Rongyuan Wu; Lingchen Sun; Yongqiang Zhao; Lin Li; Kai Wang; Yicheng Wang; Xuanming Zhang; Huiyuan Wei; Chonghua Lv; Qigong Sun; Xiaolin Tian; Zhuang Jia; Jiakui Hu; Chenyang Wang; Zhiwei Zhong; Xianming Liu; Junjun Jiang
	Title	Thermal Image Super-Resolution Challenge Results – PBVS 2022			Type	Conference Article
	Year	2022	Publication	IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)	Abbreviated Journal
	Volume		Issue		Pages	418-426
	Keywords
	Abstract	This paper presents results from the third Thermal Image Super-Resolution (TISR) challenge organized in the Perception Beyond the Visible Spectrum (PBVS) 2022 workshop. The challenge uses the same thermal image dataset as the first two challenges, with 951 training images and 50 validation images at each resolution. A set of 20 images was kept aside for testing. The evaluation tasks were to measure the PSNR and SSIM between the SR image and the ground truth (HR thermal noisy image downsampled by four), and also to measure the PSNR and SSIM between the SR image and the semi-registered HR image (acquired with another camera). The results outperformed those from last year’s challenge, improving both evaluation metrics. This year, almost 100 teams participants registered for the challenge, showing the community’s interest in this hot topic.
	Address	New Orleans; USA; June 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPRW
	Notes	MSIAU; no menciona			Approved	no
	Call Number	Admin @ si @ RSV2022c			Serial	3775
Permanent link to this record