Records
Author Jiaolong Xu; Peng Wang; Heng Yang; Antonio Lopez
Title Training a Binary Weight Object Detector by Knowledge Transfer for Autonomous Driving Type Conference Article
Year 2019 Publication IEEE International Conference on Robotics and Automation Abbreviated Journal
Volume Issue Pages 2379-2384
Keywords
Abstract Autonomous driving imposes strict requirements on model size and energy efficiency so that embedded systems can achieve real-time on-board object detection. Recent object detectors based on deep convolutional neural networks achieve state-of-the-art accuracy, but their numerous parameters, high computational cost and large storage footprint prohibit deployment on systems with limited memory and computation resources. Low-precision neural networks are a popular technique for reducing computation requirements and memory footprint. Among them, the binary weight network (BWN) is the extreme case, which quantizes each floating-point weight to a single bit. BWNs are difficult to train and suffer accuracy degradation due to the extremely low-bit representation. To address this problem, we propose a knowledge transfer (KT) method to aid the training of a BWN using a full-precision teacher network. We build DarkNet- and MobileNet-based binary weight YOLO-v2 detectors and conduct experiments on the KITTI benchmark for car, pedestrian and cyclist detection. The experimental results show that the proposed method maintains high detection accuracy while reducing the model size of DarkNet-YOLO from 257 MB to 8.8 MB and MobileNet-YOLO from 193 MB to 7.9 MB.
Address Montreal; Canada; May 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICRA
Notes ADAS; 600.124; 600.116; 600.118 Approved no
Call Number Admin @ si @ XWY2018 Serial 3182
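The two ingredients of this paper's approach lend themselves to a compact illustration: binary weight quantization with a per-filter scaling factor, and a teacher-guided transfer loss. The minimal PyTorch sketch below shows one common way to realize both; the straight-through gradient trick and the L2 feature-mimicking term are standard BWN/distillation practice and are assumptions here, not the authors' released code.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BinaryConv2d(nn.Conv2d):
        """Convolution whose weights are binarized (sign * per-filter scale)
        in the forward pass; full-precision weights are kept for updates."""
        def forward(self, x):
            w = self.weight
            alpha = w.abs().mean(dim=(1, 2, 3), keepdim=True)  # per-filter scale
            w_bin = alpha * torch.sign(w)
            w_q = w + (w_bin - w).detach()  # straight-through estimator
            return F.conv2d(x, w_q, self.bias, self.stride,
                            self.padding, self.dilation, self.groups)

    def kt_loss(det_loss, student_feat, teacher_feat, beta=1.0):
        """Detection loss plus an L2 feature-mimicking term from the
        full-precision teacher (hypothetical form of the KT objective)."""
        return det_loss + beta * F.mse_loss(student_feat, teacher_feat.detach())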
 

 
Author Sangeeth Reddy; Minesh Mathew; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas; C.V. Jawahar
Title RoadText-1K: Text Detection and Recognition Dataset for Driving Videos Type Conference Article
Year 2020 Publication IEEE International Conference on Robotics and Automation Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Perceiving text is crucial to understanding the semantics of outdoor scenes and hence is a critical requirement for building intelligent systems for driver assistance and self-driving. Most existing datasets for text detection and recognition comprise still images and are mostly compiled with text in mind. This paper introduces the new "RoadText-1K" dataset for text in driving videos. The dataset is 20 times larger than the existing largest dataset for text in videos. It comprises 1000 video clips of driving, collected without any bias towards text, with annotations for text bounding boxes and transcriptions in every frame. State-of-the-art methods for text detection, recognition and tracking are evaluated on the new dataset, and the results signify the challenges of unconstrained driving videos compared to existing datasets. This suggests that RoadText-1K is suited for research and development of reading systems robust enough to be incorporated into more complex downstream tasks like driver assistance and self-driving. The dataset can be found at http://cvit.iiit.ac.in/research/projects/cvit-projects/roadtext-1k
Address Paris; France; ???
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICRA
Notes DAG; 600.121; 600.129 Approved no
Call Number Admin @ si @ RMG2020 Serial 3400
 

 
Author Alloy Das; Sanket Biswas; Umapada Pal; Josep Llados
Title Diving into the Depths of Spotting Text in Multi-Domain Noisy Scenes Type Conference Article
Year 2024 Publication IEEE International Conference on Robotics and Automation Abbreviated Journal
Volume Issue Pages
Keywords
Abstract When used in a real-world noisy environment, the capacity to generalize to multiple domains is essential for any autonomous scene text spotting system. However, existing state-of-the-art methods employ pretraining and fine-tuning strategies on natural scene datasets, which do not exploit the feature interaction across other complex domains. In this work, we explore and investigate the problem of domain-agnostic scene text spotting, i.e., training a model on multi-domain source data such that it can directly generalize to target domains rather than being specialized for a specific domain or scenario. In this regard, we present to the community a text spotting validation benchmark called Under-Water Text (UWT) for noisy underwater scenes to establish an important case study. Moreover, we design an efficient super-resolution-based end-to-end transformer baseline called DA-TextSpotter, which achieves comparable or superior performance to existing text spotting architectures on both regular and arbitrary-shaped scene text spotting benchmarks in terms of both accuracy and model efficiency. The dataset, code and pre-trained models will be released upon acceptance.
Address Yokohama; Japan; May 2024
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICRA
Notes DAG Approved no
Call Number Admin @ si @ DBP2024 Serial 3979
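The domain-agnostic training setup described in the abstract (one model optimized over pooled multi-domain source data) can be sketched as a mixed-domain data loader. The snippet below is a hypothetical illustration with stand-in tensor datasets; it is not the DA-TextSpotter code, which the authors state will be released upon acceptance.

    import torch
    from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

    # Stand-ins for two source domains (e.g., natural scenes and underwater).
    domain_a = TensorDataset(torch.randn(64, 3, 64, 64), torch.zeros(64))
    domain_b = TensorDataset(torch.randn(64, 3, 64, 64), torch.ones(64))

    # Shuffling the concatenated set yields mixed-domain batches, so the
    # spotter is optimized jointly across domains rather than per domain.
    loader = DataLoader(ConcatDataset([domain_a, domain_b]), batch_size=8,
                        shuffle=True)
    for images, labels in loader:
        pass  # a spotting model and its loss would be applied here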
 

 
Author Dena Bazazian; Raul Gomez; Anguelos Nicolaou; Lluis Gomez; Dimosthenis Karatzas; Andrew Bagdanov
Title Improving Text Proposals for Scene Images with Fully Convolutional Networks Type Conference Article
Year 2016 Publication 23rd International Conference on Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Text Proposals have emerged as a class-dependent version of object proposals: efficient approaches to reduce the search space of possible text object locations in an image. Combined with strong word classifiers, text proposals currently yield state-of-the-art results in end-to-end scene text recognition. In this paper we propose an improvement over the original Text Proposals algorithm of [1], combining it with Fully Convolutional Networks to improve the ranking of proposals. Results on the ICDAR RRC and COCO-Text datasets show superior performance over the current state of the art.
Address Cancun; Mexico; December 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPRW
Notes DAG; LAMP; 600.084 Approved no
Call Number Admin @ si @ BGN2016 Serial 2823
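The core idea (using an FCN's per-pixel text-ness map to improve the ranking of text proposals) can be illustrated with a simple re-scoring rule. The sketch below scores each proposal by the mean heatmap response inside its box; this particular scoring rule is an illustrative assumption, not necessarily the ranking used in the paper.

    import numpy as np

    def rerank_proposals(boxes, heatmap):
        """boxes: (N, 4) int array of [x1, y1, x2, y2]; heatmap: (H, W) FCN
        per-pixel text-ness map in [0, 1]. Returns boxes sorted by score."""
        scores = np.array([heatmap[y1:y2 + 1, x1:x2 + 1].mean()
                           if (y2 >= y1 and x2 >= x1) else 0.0
                           for x1, y1, x2, y2 in boxes])
        order = np.argsort(scores)[::-1]  # highest text-ness first
        return boxes[order], scores[order]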
 

 
Author Fatemeh Noroozi; Marina Marjanovic; Angelina Njegus; Sergio Escalera; Gholamreza Anbarjafari
Title Fusion of Classifier Predictions for Audio-Visual Emotion Recognition Type Conference Article
Year 2016 Publication 23rd International Conference on Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages
Keywords
Abstract This paper presents a novel multimodal emotion recognition system based on the analysis of audio and visual cues. MFCC-based features are extracted from the audio channel and facial landmark geometric relations are computed from the visual data. Both sets of features are learnt separately using state-of-the-art classifiers. In addition, we summarise each emotion video into a reduced set of key-frames, which are learnt to visually discriminate emotions by means of a Convolutional Neural Network. Finally, the confidence outputs of all classifiers from all modalities are used to define a new feature space, which is learnt for final emotion prediction in a late fusion/stacking fashion. Experiments on the eNTERFACE'05 database show significant performance improvements of the proposed system in comparison to state-of-the-art approaches.
Address Cancun; Mexico; December 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPRW
Notes HuPBA; MILAB Approved no
Call Number Admin @ si @ NMN2016 Serial 2839
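The late fusion/stacking step described in the abstract reduces to concatenating the per-modality confidence outputs into a new feature space and training a meta-classifier on it. A minimal sketch, assuming scikit-learn and a logistic-regression meta-classifier (an illustrative choice, not the paper's exact learner):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def fuse(conf_audio, conf_landmarks, conf_keyframe_cnn):
        """Stack (n_samples, n_classes) confidence matrices column-wise."""
        return np.hstack([conf_audio, conf_landmarks, conf_keyframe_cnn])

    # Hypothetical confidences for 4 samples over 6 emotions per modality.
    rng = np.random.default_rng(0)
    Xa, Xl, Xc = (rng.random((4, 6)) for _ in range(3))
    meta = LogisticRegression(max_iter=1000).fit(fuse(Xa, Xl, Xc), [0, 1, 2, 3])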
 

 
Author Iiris Lusi; Sergio Escalera; Gholamreza Anbarjafari
Title Human Head Pose Estimation on SASE database using Random Hough Regression Forests Type Conference Article
Year 2016 Publication 23rd International Conference on Pattern Recognition Workshops Abbreviated Journal
Volume 10165 Issue Pages
Keywords
Abstract In recent years, head pose estimation has become an important task in face analysis scenarios. Given the availability of high-resolution 3D sensors, the design of a high-resolution head pose database would be beneficial for the community. In this paper, Random Hough Forests are used to estimate 3D head pose and location on a new 3D head database, SASE, establishing the baseline performance on the new data for an upcoming international head pose estimation competition. The data in SASE were acquired with a Microsoft Kinect 2 camera and include the RGB and depth information of 50 subjects with a large sample of head poses, allowing methods to be tested for real-life scenarios. We briefly review the database while showing baseline head pose estimation results based on Random Hough Forests.
Address Cancun; Mexico; December 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPRW
Notes HuPBA Approved no
Call Number Admin @ si @ LEA2016b Serial 2910
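Random Hough regression forests for head pose can be sketched as depth patches voting for a head-centre offset plus pose angles, with the votes aggregated afterwards. The following minimal illustration uses synthetic data; the patch size, target layout and mean-vote aggregation are assumptions, not the paper's exact setup.

    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 16 * 16))  # flattened 16x16 depth patches
    Y = rng.normal(size=(500, 6))        # [dx, dy, dz, yaw, pitch, roll]
    forest = RandomForestRegressor(n_estimators=20).fit(X, Y)

    def estimate_pose(patches, patch_centres):
        """Aggregate per-patch votes into one head location and pose."""
        votes = forest.predict(patches)         # (N, 6) Hough votes
        centres = patch_centres + votes[:, :3]  # voted head-centre locations
        return centres.mean(axis=0), votes[:, 3:].mean(axis=0)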
 

 
Author Gabriela Ramirez; Esau Villatoro; Bogdan Ionescu; Hugo Jair Escalante; Sergio Escalera; Martha Larson; Henning Muller; Isabelle Guyon
Title Overview of the Multimedia Information Processing for Personality & Social Networks Analysis Contest Type Conference Article
Year 2018 Publication Multimedia Information Processing for Personality and Social Networks Analysis (MIPPSNA 2018) Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Beijing; China; August 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPRW
Notes HUPBA Approved no
Call Number Admin @ si @ RVI2018 Serial 3211
 

 
Author Asma Bensalah; Jialuo Chen; Alicia Fornes; Cristina Carmona-Duarte; Josep Llados; Miguel A. Ferrer
Title Towards Stroke Patients' Upper-limb Automatic Motor Assessment Using Smartwatches Type Conference Article
Year 2020 Publication International Workshop on Artificial Intelligence for Healthcare Applications Abbreviated Journal
Volume 12661 Issue Pages 476-489
Keywords
Abstract Assessing the physical condition in rehabilitation scenarios is a challenging problem, since it involves Human Activity Recognition (HAR) and kinematic analysis methods. In addition, the difficulties increase in unconstrained rehabilitation scenarios, which are much closer to the real use cases. In particular, our aim is to design an upper-limb assessment pipeline for stroke patients using smartwatches. We focus on the HAR task, as it is the first part of the assessing pipeline. Our main target is to automatically detect and recognize four key movements inspired by the Fugl-Meyer assessment scale, which are performed in both constrained and unconstrained scenarios. In addition to the application protocol and dataset, we propose two detection and classification baseline methods. We believe that the proposed framework, dataset and baseline results will serve to foster this research field.
Address Virtual; January 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPRW
Notes DAG; 600.121; 600.140 Approved no
Call Number Admin @ si @ BCF2020 Serial 3508
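A common baseline for the HAR part of such a pipeline is sliding-window statistics over the smartwatch inertial stream followed by a standard classifier. The sketch below is one hypothetical instance; the window length, features and classifier are assumptions, not the authors' exact baselines.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def window_features(stream, win=128, step=64):
        """stream: (T, C) accelerometer/gyro signal -> one stat vector
        (mean, std, min, max per channel) for each sliding window."""
        feats = [np.hstack([w.mean(0), w.std(0), w.min(0), w.max(0)])
                 for w in (stream[s:s + win]
                           for s in range(0, len(stream) - win + 1, step))]
        return np.array(feats)

    # Hypothetical usage: clf.fit(window_features(train_stream), labels)
    clf = RandomForestClassifier(n_estimators=100)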
 

 
Author Roberto Morales; Juan Quispe; Eduardo Aguilar
Title Exploring multi-food detection using deep learning-based algorithms Type Conference Article
Year 2023 Publication 13th International Conference on Pattern Recognition Systems Abbreviated Journal
Volume Issue Pages 1-7
Keywords
Abstract People are becoming increasingly concerned about their diet, whether for disease prevention, medical treatment or other purposes. In meals served in restaurants, schools or public canteens, it is not easy to identify the ingredients and/or the nutritional information they contain. Currently, technological solutions based on deep learning models have facilitated the recording and tracking of food consumed based on the recognition of the main dish present in an image. Considering that multiple foods may be served on the same plate, food analysis should be treated as a multi-class object detection problem. EfficientDet and YOLOv5 are object detection algorithms that have demonstrated high mAP and real-time performance on general-domain data. However, these models have not been evaluated and compared on public food datasets. Unlike general-domain objects, foods have more challenging features inherent to their nature that increase the complexity of detection. In this work, we evaluated the performance of EfficientDet and YOLOv5 on three public food datasets: UNIMIB2016, UECFood256 and ChileanFood64. The results show that YOLOv5 provides a significant improvement in terms of both mAP and response time compared to EfficientDet on all datasets. Furthermore, YOLOv5 outperforms the state of the art on UECFood256, achieving an improvement of more than 4% in terms of mAP@.50 over the best previously reported result.
Address Guayaquil; Ecuador; July 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPRS
Notes MILAB Approved no
Call Number Admin @ si @ MQA2023 Serial 3843
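For readers who want to reproduce this kind of evaluation, YOLOv5 is conveniently available through torch.hub. The snippet below runs a COCO-pretrained checkpoint on a single image; a food-trained checkpoint and the food datasets named above would replace these stand-ins, and 'plate.jpg' is a hypothetical path.

    import torch

    # Downloads the repo and a COCO-pretrained checkpoint on first use.
    model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
    results = model('plate.jpg')  # hypothetical image of a served plate
    results.print()               # summary of detections per class
    boxes = results.xyxy[0]       # tensor of [x1, y1, x2, y2, conf, class]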
 

 
Author Gisel Bastidas-Guacho; Patricio Moreno; Boris X. Vintimilla; Angel Sappa
Title Application on the Loop of Multimodal Image Fusion: Trends on Deep-Learning Based Approaches Type Conference Article
Year 2023 Publication 13th International Conference on Pattern Recognition Systems Abbreviated Journal
Volume 14234 Issue Pages 25–36
Keywords
Abstract Multimodal image fusion allows the combination of information from different modalities, which is useful for tasks such as object detection, edge detection, and tracking, to name a few. Using the fused representation for applications results in better task performance. There are several image fusion approaches, which have been summarized in surveys. However, the existing surveys focus on image fusion approaches where the application on the loop of multimodal image fusion is not considered. On the contrary, this study summarizes deep learning-based multimodal image fusion for computer vision (e.g., object detection) and image processing applications (e.g., semantic segmentation), that is, approaches where the application module leverages the multimodal fusion process to enhance the final result. Firstly, we introduce image fusion and the existing general frameworks for image fusion tasks such as multifocus, multiexposure and multimodal. Then, we describe the multimodal image fusion approaches. Next, we review the state-of-the-art deep learning multimodal image fusion approaches for vision applications. Finally, we conclude our survey with the trends of task-driven multimodal image fusion.
Address Guayaquil; Ecuador; July 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPRS
Notes MSIAU Approved no
Call Number Admin @ si @ BMV2023 Serial 3932
 

 
Author Santiago Segui; Michal Drozdzal; Petia Radeva; Jordi Vitria
Title An Integrated Approach to Contextual Face Detection Type Conference Article
Year 2012 Publication 1st International Conference on Pattern Recognition Applications and Methods Abbreviated Journal
Volume Issue Pages 143-150
Keywords
Abstract Face detection is, in general, based on content-based detectors. Nevertheless, the face is a non-rigid object with well-defined relations with respect to other human body parts. In this paper, we propose to take advantage of context information in order to improve content-based face detection. We propose a novel framework for integrating multiple content- and context-based detectors in a discriminative way. Moreover, we develop an integrated scoring procedure that measures the 'faceness' of each hypothesis and is used to discriminate the detection results. Our approach detects a higher rate of faces while minimizing the number of false detections, giving an average increase of more than 10% in average precision when compared to state-of-the-art face detectors.
Address Vilamoura, Algarve, Portugal
Corporate Author Thesis
Publisher Springer Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPRAM
Notes MILAB; OR;MV Approved no
Call Number Admin @ si @ SDR2012 Serial 1895
 

 
Author Diego Cheda; Daniel Ponsa; Antonio Lopez
Title Monocular Egomotion Estimation based on Image Matching Type Conference Article
Year 2012 Publication 1st International Conference on Pattern Recognition Applications and Methods Abbreviated Journal
Volume Issue Pages 425-430
Keywords SLAM
Abstract
Address Portugal
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPRAM
Notes ADAS Approved no
Call Number Admin @ si @ CPL2012a; ADAS @ adas Serial 2011
 

 
Author Jose Carlos Rubio; Joan Serrat; Antonio Lopez
Title Multiple target tracking and identity linking under split, merge and occlusion of targets and observations Type Conference Article
Year 2012 Publication 1st International Conference on Pattern Recognition Applications and Methods Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Algarve, Portugal
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPRAM
Notes ADAS Approved no
Call Number Admin @ si @ RSL2012c; ADAS @ adas Serial 2034
 

 
Author Ferran Diego; G.D. Evangelidis; Joan Serrat
Title Night-time outdoor surveillance by mobile cameras Type Conference Article
Year 2012 Publication 1st International Conference on Pattern Recognition Applications and Methods Abbreviated Journal
Volume 2 Issue Pages 365-371
Keywords
Abstract This paper addresses the problem of video surveillance by mobile cameras. We present a method that allows online change detection in night-time outdoor surveillance. Because of the camera movement, background frames are not available and must be “localized” in former sequences and registered with the current frames. To this end, we propose a Frame Localization And Registration (FLAR) approach that solves the problem efficiently. Frames of former sequences define a database which is queried by current frames in turn. To quickly retrieve nearest neighbors, the database is indexed through a visual dictionary method based on the SURF descriptor. Furthermore, frame localization benefits from a temporal filter that exploits the temporal coherence of videos. Next, the recently proposed ECC alignment scheme is used to spatially register the synchronized frames. Finally, change detection methods are applied to the aligned frames in order to mark suspicious areas. Experiments with real night sequences recorded by in-vehicle cameras demonstrate the performance of the proposed method and verify its efficiency and effectiveness against other methods.
Address Algarve, Portugal
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPRAM
Notes ADAS Approved no
Call Number Admin @ si @ DES2012 Serial 2035
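The register-then-detect tail of the FLAR pipeline maps naturally onto OpenCV: ECC alignment of the retrieved background frame, followed by differencing and thresholding. The sketch below assumes the background frame has already been retrieved by the SURF visual-dictionary query; the threshold value and the Euclidean motion model are illustrative assumptions.

    import cv2
    import numpy as np

    def detect_changes(background_gray, current_gray, thresh=40):
        """Register the retrieved background frame to the current frame
        with ECC, then mark changed pixels by thresholded differencing."""
        warp = np.eye(2, 3, dtype=np.float32)
        criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 50, 1e-5)
        _, warp = cv2.findTransformECC(current_gray, background_gray, warp,
                                       cv2.MOTION_EUCLIDEAN, criteria, None, 5)
        h, w = current_gray.shape
        aligned = cv2.warpAffine(background_gray, warp, (w, h),
                                 flags=cv2.INTER_LINEAR | cv2.WARP_INVERSE_MAP)
        diff = cv2.absdiff(current_gray, aligned)
        _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
        return mask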
 

 
Author F. de la Torre; Jordi Vitria; Petia Radeva; J. Melenchon
Title EigenFiltering for flexible Eigentracking. Type Conference Article
Year 2000 Publication 15th International Conference on Pattern Recognition Abbreviated Journal
Volume 3 Issue Pages 1118-1121
Keywords
Abstract
Address Barcelona; Spain
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes OR;MILAB;MV Approved no
Call Number BCNPCL @ bcnpcl @ TVR2000 Serial 179