Publicacions CVC -- Query Results

[51–60] << 61 62 63 64 65 66 67 68 69 70 >> [71–80]

Details

Records
Author	Miquel Ferrer
Title	Theory and Algorithms on the Median Graph. Application to Graph-based Classification and Clustering			Type	Book Whole
Year	2008	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis	Ph.D. thesis
Publisher		Place of Publication		Editor	Francesc Serratosa Casanelles;Ernest Valveny
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition	978-84-935251-7-0	Conference
Notes				Approved	no
Call Number	Admin @ si @ Fer2008			Serial	1105
Permanent link to this record



Author	Miquel Angel Piera; Jose Luis Muñoz; Debora Gil; Gonzalo Martin; Jordi Manzano
Title	A Socio-Technical Simulation Model for the Design of the Future Single Pilot Cockpit: An Opportunity to Improve Pilot Performance			Type	Journal Article
Year	2022	Publication	IEEE Access	Abbreviated Journal	ACCESS
Volume	10	Issue		Pages	22330-22343
Keywords	Human factors ; Performance evaluation ; Simulation; Sociotechnical systems ; System performance
Abstract	The future deployment of single pilot operations must be supported by new cockpit computer services. Such services require an adaptive context-aware integration of technical functionalities with the concurrent tasks that a pilot must deal with. Advanced artificial intelligence supporting services and improved communication capabilities are the key enabling technologies that will render future cockpits more integrated with the present digitalized air traffic management system. However, an issue in the integration of such technologies is the lack of socio-technical analysis in the design of these teaming mechanisms. A key factor in determining how and when a service support should be provided is the dynamic evolution of pilot workload. This paper investigates how the socio-technical model-based systems engineering approach paves the way for the design of a digital assistant framework by formalizing this workload. The model was validated in an Airbus A-320 cockpit simulator, and the results confirmed the degraded pilot behavioral model and the performance impact according to different contextual flight deck information. This study contributes to practical knowledge for designing human-machine task-sharing systems.
Address	Feb 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	IAM;			Approved	no
Call Number	Admin @ si @ PMG2022			Serial	3697
Permanent link to this record



Author	Mingyi Yang; Luis Herranz; Fei Yang; Luka Murn; Marc Gorriz Blanch; Shuai Wan; Fuzheng Yang; Marta Mrak
Title	Semantic Preprocessor for Image Compression for Machines			Type	Conference Article
Year	2023	Publication	IEEE International Conference on Acoustics, Speech and Signal Processing	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Visual content is being increasingly transmitted and consumed by machines rather than humans to perform automated content analysis tasks. In this paper, we propose an image preprocessor that optimizes the input image for machine consumption prior to encoding by an off-the-shelf codec designed for human consumption. To achieve a better trade-off between the accuracy of the machine analysis task and bitrate, we propose leveraging pre-extracted semantic information to improve the preprocessor’s ability to accurately identify and filter out task-irrelevant information. Furthermore, we propose a two-part loss function to optimize the preprocessor, consisted of a rate-task performance loss and a semantic distillation loss, which helps the reconstructed image obtain more information that contributes to the accuracy of the task. Experiments show that the proposed preprocessor can save up to 48.83% bitrate compared with the method without the preprocessor, and save up to 36.24% bitrate compared to existing preprocessors for machine vision.
Address	Rodhes Islands; Greece; June 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICASSP
Notes	MACO; LAMP			Approved	no
Call Number	Admin @ si @ YHY2023			Serial	3912
Permanent link to this record



Author	Mingyi Yang; Fei Yang; Luka Murn; Marc Gorriz Blanch; Juil Sock; Shuai Wan; Fuzheng Yang; Luis Herranz
Title	Task-Switchable Pre-Processor for Image Compression for Multiple Machine Vision Tasks			Type	Journal Article
Year	2024	Publication	IEEE Transactions on Circuits and Systems for Video Technology	Abbreviated Journal
Volume		Issue		Pages
Keywords	M Yang, F Yang, L Murn, MG Blanch, J Sock, S Wan, F Yang, L Herranz
Abstract	Visual content is increasingly being processed by machines for various automated content analysis tasks instead of being consumed by humans. Despite the existence of several compression methods tailored for machine tasks, few consider real-world scenarios with multiple tasks. In this paper, we aim to address this gap by proposing a task-switchable pre-processor that optimizes input images specifically for machine consumption prior to encoding by an off-the-shelf codec designed for human consumption. The proposed task-switchable pre-processor adeptly maintains relevant semantic information based on the specific characteristics of different downstream tasks, while effectively suppressing irrelevant information to reduce bitrate. To enhance the processing of semantic information for diverse tasks, we leverage pre-extracted semantic features to modulate the pixel-to-pixel mapping within the pre-processor. By switching between different modulations, multiple tasks can be seamlessly incorporated into the system. Extensive experiments demonstrate the practicality and simplicity of our approach. It significantly reduces the number of parameters required for handling multiple tasks while still delivering impressive performance. Our method showcases the potential to achieve efficient and effective compression for machine vision tasks, supporting the evolving demands of real-world applications.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	xxx			Approved	no
Call Number	Admin @ si @ YYM2024			Serial	4007
Permanent link to this record



Author	Minesh Mathew; Viraj Bagal; Ruben Tito; Dimosthenis Karatzas; Ernest Valveny; C.V. Jawahar
Title	InfographicVQA			Type	Conference Article
Year	2022	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
Volume		Issue		Pages	1697-1706
Keywords	Document Analysis Datasets; Evaluation and Comparison of Vision Algorithms; Vision and Languages
Abstract	Infographics communicate information using a combination of textual, graphical and visual elements. This work explores the automatic understanding of infographic images by using a Visual Question Answering technique. To this end, we present InfographicVQA, a new dataset comprising a diverse collection of infographics and question-answer annotations. The questions require methods that jointly reason over the document layout, textual content, graphical elements, and data visualizations. We curate the dataset with an emphasis on questions that require elementary reasoning and basic arithmetic skills. For VQA on the dataset, we evaluate two Transformer-based strong baselines. Both the baselines yield unsatisfactory results compared to near perfect human performance on the dataset. The results suggest that VQA on infographics--images that are designed to communicate information quickly and clearly to human brain--is ideal for benchmarking machine understanding of complex document images. The dataset is available for download at docvqa. org
Address	Virtual; Waikoloa; Hawai; USA; January 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WACV
Notes	DAG; 600.155			Approved	no
Call Number	MBT2022			Serial	3625
Permanent link to this record



Author	Minesh Mathew; Ruben Tito; Dimosthenis Karatzas; R.Manmatha; C.V. Jawahar
Title	Document Visual Question Answering Challenge 2020			Type	Conference Article
Year	2020	Publication	33rd IEEE Conference on Computer Vision and Pattern Recognition – Short paper	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	This paper presents results of Document Visual Question Answering Challenge organized as part of “Text and Documents in the Deep Learning Era” workshop, in CVPR 2020. The challenge introduces a new problem – Visual Question Answering on document images. The challenge comprised two tasks. The first task concerns with asking questions on a single document image. On the other hand, the second task is set as a retrieval task where the question is posed over a collection of images. For the task 1 a new dataset is introduced comprising 50,000 questions-answer(s) pairs defined over 12,767 document images. For task 2 another dataset has been created comprising 20 questions over 14,362 document images which share the same document template.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CVPR
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ MTK2020			Serial	3558
Permanent link to this record



Author	Minesh Mathew; Lluis Gomez; Dimosthenis Karatzas; C.V. Jawahar
Title	Asking questions on handwritten document collections			Type	Journal Article
Year	2021	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
Volume	24	Issue		Pages	235-249
Keywords
Abstract	This work addresses the problem of Question Answering (QA) on handwritten document collections. Unlike typical QA and Visual Question Answering (VQA) formulations where the answer is a short text, we aim to locate a document snippet where the answer lies. The proposed approach works without recognizing the text in the documents. We argue that the recognition-free approach is suitable for handwritten documents and historical collections where robust text recognition is often difficult. At the same time, for human users, document image snippets containing answers act as a valid alternative to textual answers. The proposed approach uses an off-the-shelf deep embedding network which can project both textual words and word images into a common sub-space. This embedding bridges the textual and visual domains and helps us retrieve document snippets that potentially answer a question. We evaluate results of the proposed approach on two new datasets: (i) HW-SQuAD: a synthetic, handwritten document image counterpart of SQuAD1.0 dataset and (ii) BenthamQA: a smaller set of QA pairs defined on documents from the popular Bentham manuscripts collection. We also present a thorough analysis of the proposed recognition-free approach compared to a recognition-based approach which uses text recognized from the images using an OCR. Datasets presented in this work are available to download at docvqa.org.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ MGK2021			Serial	3621
Permanent link to this record



Author	Minesh Mathew; Dimosthenis Karatzas; C.V. Jawahar
Title	DocVQA: A Dataset for VQA on Document Images			Type	Conference Article
Year	2021	Publication	IEEE Winter Conference on Applications of Computer Vision	Abbreviated Journal
Volume		Issue		Pages	2200-2209
Keywords
Abstract	We present a new dataset for Visual Question Answering (VQA) on document images called DocVQA. The dataset consists of 50,000 questions defined on 12,000+ document images. Detailed analysis of the dataset in comparison with similar datasets for VQA and reading comprehension is presented. We report several baseline results by adopting existing VQA and reading comprehension models. Although the existing models perform reasonably well on certain types of questions, there is large performance gap compared to human performance (94.36% accuracy). The models need to improve specifically on questions where understanding structure of the document is crucial. The dataset, code and leaderboard are available at docvqa. org
Address	Virtual; January 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WACV
Notes	DAG; 600.121			Approved	no
Call Number	Admin @ si @ MKJ2021			Serial	3498
Permanent link to this record



Author	Mikkel Thogersen; Sergio Escalera; Jordi Gonzalez; Thomas B. Moeslund
Title	Segmentation of RGB-D Indoor scenes by Stacking Random Forests and Conditional Random Fields			Type	Journal Article
Year	2016	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
Volume	80	Issue		Pages	208–215
Keywords
Abstract	This paper proposes a technique for RGB-D scene segmentation using Multi-class Multi-scale Stacked Sequential Learning (MMSSL) paradigm. Following recent trends in state-of-the-art, a base classifier uses an initial SLIC segmentation to obtain superpixels which provide a diminution of data while retaining object boundaries. A series of color and depth features are extracted from the superpixels, and are used in a Conditional Random Field (CRF) to predict superpixel labels. Furthermore, a Random Forest (RF) classifier using random offset features is also used as an input to the CRF, acting as an initial prediction. As a stacked classifier, another Random Forest is used acting on a spatial multi-scale decomposition of the CRF confidence map to correct the erroneous labels assigned by the previous classifier. The model is tested on the popular NYU-v2 dataset. The approach shows that simple multi-modal features with the power of the MMSSL paradigm can achieve better performance than state of the art results on the same dataset.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA; ISE;MILAB; 600.098; 600.119			Approved	no
Call Number	Admin @ si @ TEG2016			Serial	2843
Permanent link to this record



Author	Mikhail Mozerov; V. Kober; I.A. Ovseyevich
Title	A Stereo Matching Algorithm with Global Smoothness Criterion			Type	Miscellaneous
Year	2006	Publication	Topical Meeting on Optoinformatics / Information Photonics, 133–135	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Saint-Petersburg (Russia)
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE			Approved	no
Call Number	ISE @ ise @ MKO2006			Serial	675
Permanent link to this record



Author	Mikhail Mozerov; V. Kober; I.A. Ovseyevich
Title	Robust Dynamic Programming Algorithm for Motion Detection and Estimation			Type	Journal
Year	2007	Publication		Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE			Approved	no
Call Number	ISE @ ise @ MKO2007			Serial	810
Permanent link to this record



Author	Mikhail Mozerov; V. Kober
Title	Impulse Noise Removal with Gradient Adaptive Neighborhoods			Type	Journal
Year	2006	Publication	Optical Engineering, 45: 67003	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE			Approved	no
Call Number	ISE @ ise @ MoK2006			Serial	676
Permanent link to this record



Author	Mikhail Mozerov; Joost Van de Weijer
Title	Accurate stereo matching by two step global optimization			Type	Journal Article
Year	2015	Publication	IEEE Transactions on Image Processing	Abbreviated Journal	TIP
Volume	24	Issue	3	Pages	1153-1163
Keywords
Abstract	In stereo matching cost filtering methods and energy minimization algorithms are considered as two different techniques. Due to their global extend energy minimization methods obtain good stereo matching results. However, they tend to fail in occluded regions, in which cost filtering approaches obtain better results. In this paper we intend to combine both approaches with the aim to improve overall stereo matching results. We show that a global optimization with a fully connected model can be solved by cost fil tering methods. Based on this observation we propose to perform stereo matching as a two-step energy minimization algorithm. We consider two MRF models: a fully connected model defined on the complete set of pixels in an image and a conventional locally connected model. We solve the energy minimization problem for the fully connected model, after which the marginal function of the solution is used as the unary potential in the locally connected MRF model. Experiments on the Middlebury stereo datasets show that the proposed method achieves state-of-the-arts results.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1057-7149	ISBN		Medium
Area		Expedition		Conference
Notes	ISE; LAMP; 600.079; 600.078			Approved	no
Call Number	Admin @ si @ MoW2015a			Serial	2568
Permanent link to this record



Author	Mikhail Mozerov; Joost Van de Weijer
Title	Global Color Sparseness and a Local Statistics Prior for Fast Bilateral Filtering			Type	Journal Article
Year	2015	Publication	IEEE Transactions on Image Processing	Abbreviated Journal	TIP
Volume	24	Issue	12	Pages	5842-5853
Keywords
Abstract	The property of smoothing while preserving edges makes the bilateral filter a very popular image processing tool. However, its non-linear nature results in a computationally costly operation. Various works propose fast approximations to the bilateral filter. However, the majority does not generalize to vector input as is the case with color images. We propose a fast approximation to the bilateral filter for color images. The filter is based on two ideas. First, the number of colors, which occur in a single natural image, is limited. We exploit this color sparseness to rewrite the initial non-linear bilateral filter as a number of linear filter operations. Second, we impose a statistical prior to the image values that are locally present within the filter window. We show that this statistical prior leads to a closed-form solution of the bilateral filter. Finally, we combine both ideas into a single fast and accurate bilateral filter for color images. Experimental results show that our bilateral filter based on the local prior yields an extremely fast bilateral filter approximation, but with limited accuracy, which has potential application in real-time video filtering. Our bilateral filter, which combines color sparseness and local statistics, yields a fast and accurate bilateral filter approximation and obtains the state-of-the-art results.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1057-7149	ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; 600.079;ISE			Approved	no
Call Number	Admin @ si @ MoW2015b			Serial	2689
Permanent link to this record



Author	Mikhail Mozerov; Joost Van de Weijer
Title	Improved Recursive Geodesic Distance Computation for Edge Preserving Filter			Type	Journal Article
Year	2017	Publication	IEEE Transactions on Image Processing	Abbreviated Journal	TIP
Volume	26	Issue	8	Pages	3696 - 3706
Keywords	Geodesic distance filter; color image filtering; image enhancement
Abstract	All known recursive filters based on the geodesic distance affinity are realized by two 1D recursions applied in two orthogonal directions of the image plane. The 2D extension of the filter is not valid and has theoretically drawbacks, which lead to known artifacts. In this paper, a maximum influence propagation method is proposed to approximate the 2D extension for the geodesic distance-based recursive filter. The method allows to partially overcome the drawbacks of the 1D recursion approach. We show that our improved recursion better approximates the true geodesic distance filter, and the application of this improved filter for image denoising outperforms the existing recursive implementation of the geodesic distance. As an application, we consider a geodesic distance-based filter for image denoising. Experimental evaluation of our denoising method demonstrates comparable and for several test images better results, than stateof-the-art approaches, while our algorithm is considerably fasterwith computational complexity O(8P).
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; ISE; 600.120; 600.098; 600.119			Approved	no
Call Number	Admin @ si @ Moz2017			Serial	2921
Permanent link to this record