|   | 
Details
   web
Records
Author Sergi Garcia Bordils; George Tom; Sangeeth Reddy; Minesh Mathew; Marçal Rusiñol; C.V. Jawahar; Dimosthenis Karatzas
Title (up) Read While You Drive-Multilingual Text Tracking on the Road Type Conference Article
Year 2022 Publication 15th IAPR International workshop on document analysis systems Abbreviated Journal
Volume 13237 Issue Pages 756–770
Keywords
Abstract Visual data obtained during driving scenarios usually contain large amounts of text that conveys semantic information necessary to analyse the urban environment and is integral to the traffic control plan. Yet, research on autonomous driving or driver assistance systems typically ignores this information. To advance research in this direction, we present RoadText-3K, a large driving video dataset with fully annotated text. RoadText-3K is three times bigger than its predecessor and contains data from varied geographical locations, unconstrained driving conditions and multiple languages and scripts. We offer a comprehensive analysis of tracking by detection and detection by tracking methods exploring the limits of state-of-the-art text detection. Finally, we propose a new end-to-end trainable tracking model that yields state-of-the-art results on this challenging dataset. Our experiments demonstrate the complexity and variability of RoadText-3K and establish a new, realistic benchmark for scene text tracking in the wild.
Address La Rochelle; France; May 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN 978-3-031-06554-5 Medium
Area Expedition Conference DAS
Notes DAG; 600.155; 611.022; 611.004 Approved no
Call Number Admin @ si @ GTR2022 Serial 3783
Permanent link to this record
 

 
Author George Tom; Minesh Mathew; Sergi Garcia Bordils; Dimosthenis Karatzas; CV Jawahar
Title (up) Reading Between the Lanes: Text VideoQA on the Road Type Conference Article
Year 2023 Publication 17th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 14192 Issue Pages 137–154
Keywords VideoQA; scene text; driving videos
Abstract Text and signs around roads provide crucial information for drivers, vital for safe navigation and situational awareness. Scene text recognition in motion is a challenging problem, while textual cues typically appear for a short time span, and early detection at a distance is necessary. Systems that exploit such information to assist the driver should not only extract and incorporate visual and textual cues from the video stream but also reason over time. To address this issue, we introduce RoadTextVQA, a new dataset for the task of video question answering (VideoQA) in the context of driver assistance. RoadTextVQA consists of 3, 222 driving videos collected from multiple countries, annotated with 10, 500 questions, all based on text or road signs present in the driving videos. We assess the performance of state-of-the-art video question answering models on our RoadTextVQA dataset, highlighting the significant potential for improvement in this domain and the usefulness of the dataset in advancing research on in-vehicle support systems and text-aware multimodal question answering. The dataset is available at http://cvit.iiit.ac.in/research/projects/cvit-projects/roadtextvqa.
Address San Jose; CA; USA; August 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number Admin @ si @ TMG2023 Serial 3906
Permanent link to this record
 

 
Author Arnau Baro
Title (up) Reading Music Systems: From Deep Optical Music Recognition to Contextual Methods Type Book Whole
Year 2022 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The transcription of sheet music into some machine-readable format can be carried out manually. However, the complexity of music notation inevitably leads to burdensome software for music score editing, which makes the whole process
very time-consuming and prone to errors. Consequently, automatic transcription
systems for musical documents represent interesting tools.
Document analysis is the subject that deals with the extraction and processing
of documents through image and pattern recognition. It is a branch of computer
vision. Taking music scores as source, the field devoted to address this task is
known as Optical Music Recognition (OMR). Typically, an OMR system takes an
image of a music score and automatically extracts its content into some symbolic
structure such as MEI or MusicXML.
In this dissertation, we have investigated different methods for recognizing a
single staff section (e.g. scores for violin, flute, etc.), much in the same way as most text recognition research focuses on recognizing words appearing in a given line image. These methods are based in two different methodologies. On the one hand, we present two methods based on Recurrent Neural Networks, in particular, the
Long Short-Term Memory Neural Network. On the other hand, a method based on Sequence to Sequence models is detailed.
Music context is needed to improve the OMR results, just like language models
and dictionaries help in handwriting recognition. For example, syntactical rules
and grammars could be easily defined to cope with the ambiguities in the rhythm.
In music theory, for example, the time signature defines the amount of beats per
bar unit. Thus, in the second part of this dissertation, different methodologies
have been investigated to improve the OMR recognition. We have explored three
different methods: (a) a graphic tree-structure representation, Dendrograms, that
joins, at each level, its primitives following a set of rules, (b) the incorporation of Language Models to model the probability of a sequence of tokens, and (c) graph neural networks to analyze the music scores to avoid meaningless relationships between music primitives.
Finally, to train all these methodologies, and given the method-specificity of
the datasets in the literature, we have created four different music datasets. Two of them are synthetic with a modern or old handwritten appearance, whereas the
other two are real handwritten scores, being one of them modern and the other
old.
Address
Corporate Author Thesis Ph.D. thesis
Publisher IMPRIMA Place of Publication Editor Alicia Fornes
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-124793-8-6 Medium
Area Expedition Conference
Notes DAG; Approved no
Call Number Admin @ si @ Bar2022 Serial 3754
Permanent link to this record
 

 
Author Leonardo Galteri; Dena Bazazian; Lorenzo Seidenari; Marco Bertini; Andrew Bagdanov; Anguelos Nicolaou; Dimosthenis Karatzas; Alberto del Bimbo
Title (up) Reading Text in the Wild from Compressed Images Type Conference Article
Year 2017 Publication 1st International workshop on Egocentric Perception, Interaction and Computing Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Reading text in the wild is gaining attention in the computer vision community. Images captured in the wild are almost always compressed to varying degrees, depending on application context, and this compression introduces artifacts
that distort image content into the captured images. In this paper we investigate the impact these compression artifacts have on text localization and recognition in the wild. We also propose a deep Convolutional Neural Network (CNN) that can eliminate text-specific compression artifacts and which leads to an improvement in text recognition. Experimental results on the ICDAR-Challenge4 dataset demonstrate that compression artifacts have a significant
impact on text localization and recognition and that our approach yields an improvement in both – especially at high compression rates.
Address Venice; Italy; October 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCV - EPIC
Notes DAG; 600.084; 600.121 Approved no
Call Number Admin @ si @ GBS2017 Serial 3006
Permanent link to this record
 

 
Author David Masip; Jordi Vitria
Title (up) Real Time Face Detection and Verification for Uncontrolled Environments Type Miscellaneous
Year 2004 Publication Second COST 275 Workshop Biometrics on the Internet: Fundamentals, Advances and Applications, 55–58. Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Vigo
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes OR;MV Approved no
Call Number BCNPCL @ bcnpcl @ MaV2004a Serial 446
Permanent link to this record
 

 
Author Fadi Dornaika; Angel Sappa
Title (up) Real Time Image Registration for Planar Structure and 3D Sensor Pose Estimation Type Book Chapter
Year 2008 Publication Stereo Vision Abbreviated Journal
Volume 18 Issue Pages 299–316
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor Asim Bhatti
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number ADAS @ adas @ DoS2008c Serial 1057
Permanent link to this record
 

 
Author Fadi Dornaika; Angel Sappa
Title (up) Real Time on Board Stereo Camera Pose through Image Registration Type Conference Article
Year 2008 Publication IEEE Intelligent Vehicles Symposium, Abbreviated Journal
Volume Issue Pages 804–809
Keywords
Abstract
Address Eindhoven (Netherlands)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number ADAS @ adas @ DoS2008a Serial 1015
Permanent link to this record
 

 
Author A. Pujol; Jordi Vitria; Petia Radeva; Xavier Binefa; Robert Benavente; Ernest Valveny; Craig Von Land
Title (up) Real time pharmaceutical product recognition using color and shape indexing. Type Conference Article
Year 1999 Publication Proceedings of the 2nd International Workshop on European Scientific and Industrial Collaboration (WESIC´99), Promotoring Advanced Technologies in Manufacturing. Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Wales
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes OR;MILAB;DAG;CIC;MV Approved no
Call Number BCNPCL @ bcnpcl @ PVR1999 Serial 24
Permanent link to this record
 

 
Author Jordi Vitria; Petia Radeva; X. Binefa; A. Pujol; Ernest Valveny; Robert Benavente; Craig Von Land
Title (up) Real time recognition of pharmaceutical products by subspace methods Type Report
Year 1999 Publication CVC Technical Report #35 Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address CVC (UAB)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes OR;MILAB;DAG;CIC;MV Approved no
Call Number BCNPCL @ bcnpcl @ VRB1999b Serial 54
Permanent link to this record
 

 
Author Angel Sappa; David Geronimo; Fadi Dornaika; Antonio Lopez
Title (up) Real Time Vehicle Pose Using On-Board Stereo Vision System Type Conference Article
Year 2006 Publication International Conference on Image Analysis and Recognition Abbreviated Journal ICIAR
Volume Issue LNCS 4142 Pages 205–216
Keywords
Abstract This paper presents a robust technique for a real time estimation of both camera’s position and orientation—referred as pose. A commercial stereo vision system is used. Unlike previous approaches, it can be used either for urban or highway scenarios. The proposed technique consists of two stages. Initially, a compact 2D representation of the original 3D data points is computed. Then, a RANSAC based least squares approach is used for fitting a plane to the road. At the same time,
relative camera’s position and orientation are computed. The proposed technique is intended to be used on a driving assistance scheme for applications such as obstacle or pedestrian detection. Experimental results on urban environments with different road geometries are presented.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number ADAS @ adas @ SGD2006b Serial 671
Permanent link to this record
 

 
Author Daniel Hernandez; Juan Carlos Moure; Toni Espinosa; Alejandro Chacon; David Vazquez; Antonio Lopez
Title (up) Real-time 3D Reconstruction for Autonomous Driving via Semi-Global Matching Type Conference Article
Year 2016 Publication GPU Technology Conference Abbreviated Journal
Volume Issue Pages
Keywords Stereo; Autonomous Driving; GPU; 3d reconstruction
Abstract Robust and dense computation of depth information from stereo-camera systems is a computationally demanding requirement for real-time autonomous driving. Semi-Global Matching (SGM) [1] approximates heavy-computation global algorithms results but with lower computational complexity, therefore it is a good candidate for a real-time implementation. SGM minimizes energy along several 1D paths across the image. The aim of this work is to provide a real-time system producing reliable results on energy-efficient hardware. Our design runs on a NVIDIA Titan X GPU at 104.62 FPS and on a NVIDIA Drive PX at 6.7 FPS, promising for real-time platforms
Address Silicon Valley; San Francisco; USA; April 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference GTC
Notes ADAS; 600.085; 600.082; 600.076 Approved no
Call Number ADAS @ adas @ HME2016 Serial 2738
Permanent link to this record
 

 
Author Miguel Reyes; Jordi Vitria; Petia Radeva; Sergio Escalera
Title (up) Real-time Activity Monitoring of Inpatients Type Conference Article
Year 2010 Publication Medical Image Computing in Catalunya: Graduate Student Workshop Abbreviated Journal
Volume Issue Pages 35–36
Keywords
Abstract In this paper, we present the development of an application capable of monitoring a set of patient vital signs in real time. The application has been designed to support the medical staff of a hospital. Preliminary results show the suitability
of the system to prevent the injury produced by the agitation of the patients.
Address Girona
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MICCAT
Notes OR;MILAB;HUPBA;MV Approved no
Call Number BCNPCL @ bcnpcl @ RVR2010 Serial 1477
Permanent link to this record
 

 
Author Eduardo Tusa; Arash Akbarinia; Raquel Gil Rodriguez; Corina Barbalata
Title (up) Real-Time Face Detection and Tracking Utilising OpenMP and ROS Type Conference Article
Year 2015 Publication 3rd Asia-Pacific Conference on Computer Aided System Engineering Abbreviated Journal
Volume Issue Pages 179 - 184
Keywords RGB-D; Kinect; Human Detection and Tracking; ROS; OpenMP
Abstract The first requisite of a robot to succeed in social interactions is accurate human localisation, i.e. subject detection and tracking. Later, it is estimated whether an interaction partner seeks attention, for example by interpreting the position and orientation of the body. In computer vision, these cues usually are obtained in colour images, whose qualities are degraded in ill illuminated social scenes. In these scenarios depth sensors offer a richer representation. Therefore, it is important to combine colour and depth information. The
second aspect that plays a fundamental role in the acceptance of social robots is their real-time-ability. Processing colour and depth images is computationally demanding. To overcome this we propose a parallelisation strategy of face detection and tracking based on two different architectures: message passing and shared memory. Our results demonstrate high accuracy in
low computational time, processing nine times more number of frames in a parallel implementation. This provides a real-time social robot interaction.
Address Quito; Ecuador; July 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference APCASE
Notes NEUROBIT Approved no
Call Number Admin @ si @ TAG2015 Serial 2659
Permanent link to this record
 

 
Author Bogdan Raducanu; Jordi Vitria
Title (up) Real-Time Face Tracking for Context-Aware Computing Type Miscellaneous
Year 2005 Publication 8th Catalan Conference on Artificial Intelligence (CCIA 2005) (published in Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes OR;MV Approved no
Call Number BCNPCL @ bcnpcl @ RaV2005a Serial 560
Permanent link to this record
 

 
Author Bogdan Raducanu; Jordi Vitria
Title (up) Real-Time Face Tracking for Context-Aware Computing Type Book Chapter
Year 2005 Publication Artificial Intelligence Research and Development, IOS Press, 91–98 Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Amsterdam
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes OR;MV Approved no
Call Number BCNPCL @ bcnpcl @ RaV2005b Serial 616
Permanent link to this record