Author Daniel Hernandez; Juan Carlos Moure; Toni Espinosa; Alejandro Chacon; David Vazquez; Antonio Lopez
Title Real-time 3D Reconstruction for Autonomous Driving via Semi-Global Matching Type Conference Article
Year 2016 Publication GPU Technology Conference Abbreviated Journal
Volume Issue Pages
Keywords Stereo; Autonomous Driving; GPU; 3d reconstruction
Abstract Robust and dense computation of depth information from stereo-camera systems is a computationally demanding requirement for real-time autonomous driving. Semi-Global Matching (SGM) [1] approximates the results of computationally heavy global algorithms at lower computational complexity, which makes it a good candidate for a real-time implementation. SGM minimizes energy along several 1D paths across the image. The aim of this work is to provide a real-time system producing reliable results on energy-efficient hardware. Our design runs on an NVIDIA Titan X GPU at 104.62 FPS and on an NVIDIA Drive PX at 6.7 FPS, which is promising for real-time platforms.
Address Silicon Valley; San Francisco; USA; April 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference GTC
Notes ADAS; 600.085; 600.082; 600.076 Approved no
Call Number ADAS @ adas @ HME2016 Serial 2738
 

 
Author Jose Manuel Alvarez
Title Combining Context and Appearance for Road Detection Type Book Whole
Year 2010 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Road traffic crashes have become a major cause of death and injury throughout the world. Hence, in order to improve road safety, the automobile industry is moving towards the development of vehicles with autonomous functionalities such as keeping in the right lane, keeping a safe distance between vehicles, or regulating the speed of the vehicle according to the traffic conditions. A key component of these systems is vision–based road detection, which aims to detect the free road surface ahead of the moving vehicle. Detecting the road using a monocular vision system is very challenging since the road is an outdoor scenario imaged from a mobile platform. Hence, the detection algorithm must be able to deal with continuously changing imaging conditions such as the presence of different objects (vehicles, pedestrians), different environments (urban, highways, off–road), different road types (shape, color), and different imaging conditions (varying illumination, different viewpoints and changing weather conditions). Therefore, in this thesis, we focus on vision–based road detection using a single color camera. More precisely, we first focus on analyzing and grouping pixels according to their low–level properties. In this way, two different approaches are presented to exploit color and photometric invariance. Then, we focus the research of the thesis on exploiting context information. This information provides relevant knowledge about the road, using not pixel features from road regions but semantic information from the analysis of the scene. In this way, we present two different approaches to infer the geometry of the road ahead of the moving vehicle. Finally, we focus on combining these context and appearance (color) approaches to improve the overall performance of road detection algorithms. The qualitative and quantitative results presented in this thesis on real–world driving sequences show that the proposed method is robust to varying imaging conditions, road types and scenarios, going beyond the state–of–the–art.
Address
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez;Theo Gevers
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-84-937261-8-8 Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number Admin @ si @ Alv2010 Serial 1454
 

 
Author Jose Manuel Alvarez; Theo Gevers; Y. LeCun; Antonio Lopez
Title Road Scene Segmentation from a Single Image Type Conference Article
Year 2012 Publication 12th European Conference on Computer Vision Abbreviated Journal
Volume 7578 Issue VII Pages 376-389
Keywords road detection
Abstract Road scene segmentation is important in computer vision for different applications such as autonomous driving and pedestrian detection. Recovering the 3D structure of road scenes provides relevant contextual information to improve their understanding.
In this paper, we use a convolutional neural network based algorithm to learn features from noisy labels to recover the 3D scene layout of a road image. The novelty of the algorithm lies in generating training labels by applying an algorithm trained on a general image dataset to classify on–board images. Further, we propose a novel texture descriptor based on a learned color plane fusion to obtain maximal uniformity in road areas. Finally, acquired (off–line) and current (on–line) information are combined to detect road areas in single images.
From quantitative and qualitative experiments conducted on publicly available datasets, it is concluded that convolutional neural networks are suitable for learning 3D scene layout from noisy labels and provide a relative improvement of 7% compared to the baseline. Furthermore, combining color planes provides a statistical description of road areas that exhibits maximal uniformity and provides a relative improvement of 8% compared to the baseline. Finally, the improvement is even bigger when acquired and current information from a single image are combined.
Address Florence, Italy
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-33785-7 Medium
Area Expedition Conference ECCV
Notes ADAS;ISE Approved no
Call Number Admin @ si @ AGL2012; ADAS @ adas @ agl2012a Serial 2022
 

 
Author Jose Manuel Alvarez; Felipe Lumbreras; Theo Gevers; Antonio Lopez
Title Geographic Information for vision-based Road Detection Type Conference Article
Year 2010 Publication IEEE Intelligent Vehicles Symposium Abbreviated Journal
Volume Issue Pages 621–626
Keywords road detection
Abstract Road detection is a vital task for the development of autonomous vehicles. The knowledge of the free road surface ahead of the target vehicle can be used for autonomous driving, road departure warning, as well as to support advanced driver assistance systems like vehicle or pedestrian detection. Using vision to detect the road has several advantages over other sensors: richness of features, easy integration, low cost, and low power consumption. Common vision-based road detection approaches use low-level features (such as color or texture) as visual cues to group pixels exhibiting similar properties. However, it is difficult to foresee a perfect clustering algorithm, since roads are outdoor scenarios imaged from a mobile platform. In this paper, we propose a novel high-level approach to vision-based road detection based on geographical information. The key idea of the algorithm is to exploit geographical information to provide a rough detection of the road. Then, this segmentation is refined at a low level using color information to provide the final result. The results presented show the validity of our approach.
Address San Diego; CA; USA
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IV
Notes ADAS;ISE Approved no
Call Number ADAS @ adas @ ALG2010 Serial 1428
 

 
Author Xavier Baro; Sergio Escalera; Petia Radeva; Jordi Vitria
Title Visual Content Layer for Scalable Recognition in Urban Image Databases, Internet Multimedia Search and Mining Type Conference Article
Year 2009 Publication 10th IEEE International Conference on Multimedia and Expo Abbreviated Journal
Volume Issue Pages 1616–1619
Keywords
Abstract Rich online map interaction represents a useful tool to get multimedia information related to physical places. With this type of system, users can automatically compute the optimal route for a trip or look for entertainment places or hotels near their current position. Standard maps are defined as a fusion of layers, where each one contains specific data such as height, streets, or a particular business location. In this paper we propose the construction of a visual content layer which describes the visual appearance of geographic locations in a city. We captured, by means of a Mobile Mapping system, a huge set of georeferenced images (> 500K) which cover the whole city of Barcelona. For each image, hundreds of region descriptions are computed off-line and described as a hash code. This allows an efficient and scalable way of accessing maps by visual content.
Address New York (USA)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4244-4291-1 Medium
Area Expedition Conference ICME
Notes OR;MILAB;HuPBA;MV Approved no
Call Number BCNPCL @ bcnpcl @ BER2009 Serial 1189
 

 
Author Sergio Vera
Title Finger joint modelling from hand X-ray images for assessing rheumatoid arthritis Type Report
Year 2010 Publication CVC Technical Report Abbreviated Journal
Volume 164 Issue Pages
Keywords Rheumatoid arthritis; joint detection; X-ray; Van der Heijde score
Abstract Rheumatoid arthritis is an autoimmune, systemic, inflammatory disorder that mainly affects bone joints. While there is no cure for this disease, continuous advances in palliative treatments require frequent verification of the patient’s illness evolution. Such evolution is measured through several available semi-quantitative methods that require evaluation of hand and foot X-ray images. Accurate assessment is a time-consuming task that requires highly trained personnel. This hinders a generalized use in clinical practice for early diagnosis and disease follow-up. In the context of the automation of such evaluation methods, we present a method for detection and characterization of finger joints in hand radiography images. Several measures for assessing the reduction of joint space width are proposed. We compare for the first time such measures to the Van der Heijde score, the gold standard method for rheumatoid arthritis assessment. The proposed method outperforms existing strategies with a detection rate above 95%. Our comparison to the Van der Heijde index shows a promising correlation that encourages further research.
Address
Corporate Author Thesis Master's thesis
Publisher Place of Publication Bellaterra 01893, Barcelona, Spain Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM Approved no
Call Number IAM @ iam @ Ver2010 Serial 1661
 

 
Author Joan M. Nuñez
Title Computer vision techniques for characterization of finger joints in X-ray image Type Report
Year 2011 Publication CVC Technical Report Abbreviated Journal
Volume 165 Issue Pages
Keywords Rheumatoid arthritis, X-ray, Sharp Van der Heijde, joint characterization, sclerosis detection, bone detection, edge, ridge
Abstract Rheumatoid arthritis (RA) is an autoimmune inflammatory type of arthritis which mainly affects the hands in its first stages. Though it is a chronic disease and there is no cure for it, treatments require an accurate assessment of illness evolution. Such assessment is based on evaluation of hand X-ray images by using one of the several available semi-quantitative methods. This task requires highly trained medical personnel. That is why the automation of the assessment would allow professionals to save time and effort. Two stages are involved in this task: first joint detection, then joint characterization. Unlike the few existing previous works, this contribution clearly separates those two stages and sets the foundations of a modular assessment system focusing on the characterization stage. A hand joint dataset is created and an accurate data analysis is carried out in order to identify relevant features. Since the sclerosis and the lower bone were judged to be the most important features, different computer vision techniques were used in order to develop a detector system for both of them. Joint space width measures are provided and their correlation with Sharp-Van der Heijde is verified.
Address Bellaterra (Barcelona)
Corporate Author Computer Vision Center Thesis Master's thesis
Publisher Place of Publication Editor Dr. Fernando Vilariño and Dra. Debora Gil
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MV;IAM; Approved no
Call Number IAM @ iam @ Nuñ2011 Serial 1795
 

 
Author Hunor Laczko; Meysam Madadi; Sergio Escalera; Jordi Gonzalez
Title A Generative Multi-Resolution Pyramid and Normal-Conditioning 3D Cloth Draping Type Conference Article
Year 2024 Publication Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 8709-8718
Keywords
Abstract RGB cloth generation has been deeply studied in the related literature; however, 3D garment generation remains an open problem. In this paper, we build a conditional variational autoencoder for 3D garment generation and draping. We propose a pyramid network to add garment details progressively in a canonical space, i.e. unposing and unshaping the garments w.r.t. the body. We study conditioning the network on surface normal UV maps, as an intermediate representation, which is an easier problem to optimize than 3D coordinates. Our results on two public datasets, CLOTH3D and CAPE, show that our model is robust, controllable in terms of detail generation by the use of multi-resolution pyramids, and achieves state-of-the-art results that generalize well to unseen garments, poses, and shapes even when training with small amounts of data.
Address Waikoloa; Hawaii; USA; January 2024
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WACV
Notes ISE; HUPBA Approved no
Call Number Admin @ si @ LME2024 Serial 3996
 

 
Author David Lloret; Joan Serrat; Antonio Lopez; A. Soler; Juan J. Villanueva
Title Retinal image registration using creases as anatomical landmarks. Type Miscellaneous
Year 2000 Publication 15th International Conference on Pattern Recognition, 3:207–210. Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Retinal images are routinely used in ophthalmology to study the optical nerve head and the retina. To assess objectively the evolution of an illness, images taken at different times must be registered. Most methods so far have been designed specifically for a single image modality, like temporal series or stereo pairs of angiographies, fluorescein angiographies or scanning laser ophthalmoscope (SLO) images, which makes them prone to fail when conditions vary. In contrast, the method we propose has been shown to be accurate and reliable on all the former modalities. It has been adapted from the 3D registration of CT and MR images to 2D. Relevant features (also known as landmarks) are extracted by means of a robust creaseness operator, and the resulting images are iteratively transformed until a maximum in their correlation is achieved. Our method has succeeded in more than 100 pairs tried so far, in all cases including also the scaling as a parameter to be optimized.
Address Barcelona.
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number ADAS @ adas @ LSL2000 c Serial 233
 

 
Author Masakazu Iwamura; Naoyuki Morimoto; Keishi Tainaka; Dena Bazazian; Lluis Gomez; Dimosthenis Karatzas
Title ICDAR2017 Robust Reading Challenge on Omnidirectional Video Type Conference Article
Year 2017 Publication 14th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Results of the ICDAR 2017 Robust Reading Challenge on Omnidirectional Video are presented. This competition uses the Downtown Osaka Scene Text (DOST) dataset, which was captured in Osaka, Japan with an omnidirectional camera. Hence, it consists of sequential images (videos) of different view angles. Regarding the sequential images as videos (video mode), two tasks of localisation and end-to-end recognition are prepared. Regarding them as a set of still images (still image mode), three tasks of localisation, cropped word recognition and end-to-end recognition are prepared. As the dataset has been captured in Japan, it contains Japanese text but also includes text consisting of alphanumeric characters (Latin text). Hence, a submitted result for each task is evaluated in three ways: using Japanese-only ground truth (GT), using Latin-only GT and using combined GTs of both. By the submission deadline, we had received two submissions in the text localisation task of the still image mode. We intend to continue the competition in the open mode. Expecting further submissions, in this report we provide baseline results in all the tasks in addition to the submissions from the community.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.084; 600.121 Approved no
Call Number Admin @ si @ IMT2017 Serial 3077
 

 
Author Adriana Romero; Carlo Gatta
Title Do We Really Need All These Neurons? Type Conference Article
Year 2013 Publication 6th Iberian Conference on Pattern Recognition and Image Analysis Abbreviated Journal
Volume 7887 Issue Pages 460–467
Keywords Restricted Boltzmann Machine; hidden units; unsupervised learning; classification
Abstract Restricted Boltzmann Machines (RBMs) are generative neural networks that have received much attention recently. In particular, choosing the appropriate number of hidden units is important, as an inappropriate choice might hinder their representational power. According to the literature, RBMs require numerous hidden units to approximate any distribution properly. In this paper, we present an experiment to determine whether such a number of hidden units is required in a classification context. We then propose an incremental algorithm that trains RBMs by reusing the previously trained parameters, using a trade-off measure to determine the appropriate number of hidden units. Results on the MNIST and OCR letters databases show that using a number of hidden units one order of magnitude smaller than the literature estimate suffices to achieve similar performance. Moreover, the proposed algorithm allows estimating the required number of hidden units without training many RBMs from scratch.
Address Madeira; Portugal; June 2013
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-38627-5 Medium
Area Expedition Conference IbPRIA
Notes MILAB; 600.046 Approved no
Call Number Admin @ si @ RoG2013 Serial 2311
 

 
Author Debora Gil; Oriol Rodriguez; Josepa Mauri; Petia Radeva
Title Statistical descriptors of the Myocardial perfusion in angiographic images Type Conference Article
Year 2006 Publication Proc. Computers in Cardiology Abbreviated Journal
Volume Issue Pages 677-680
Keywords Anisotropic processing; intravascular ultrasound (IVUS); vessel border segmentation; vessel structure classification.
Abstract Restoration of coronary flow after primary percutaneous coronary intervention in acute myocardial infarction does not always correlate with adequate myocardial perfusion. Recently, coronary angiography has been used to assess microcirculation integrity (Myocardial Blush Analysis, MBA). Although MBA correlates with patient prognosis, there are few image processing methods addressing objective perfusion quantification. The goal of this work is to develop statistical descriptors of the myocardial dyeing pattern allowing objective assessment of myocardial perfusion. Experiments on healthy right coronary arteries show that our approach allows reliable measurements without any specific image acquisition protocol.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM;MILAB Approved no
Call Number IAM @ iam @ GRR2006 Serial 1528
 

 
Author Alejandro Tabas; Emili Balaguer-Ballester; Laura Igual
Title Spatial Discriminant ICA for RS-fMRI characterisation Type Conference Article
Year 2014 Publication 4th International Workshop on Pattern Recognition in Neuroimaging Abbreviated Journal
Volume Issue Pages 1-4
Keywords
Abstract Resting-State fMRI (RS-fMRI) is a brain imaging technique useful for exploring functional connectivity. A major point of interest in RS-fMRI analysis is to isolate connectivity patterns characterising disorders such as ADHD. Such characterisation is usually performed in two steps: first, all connectivity patterns in the data are extracted by means of Independent Component Analysis (ICA); second, standard statistical tests are performed over the extracted patterns to find differences between control and clinical groups. In this work we introduce a novel single-step approach to this problem, termed Spatial Discriminant ICA. The algorithm can efficiently isolate networks of functional connectivity characterising a clinical group by combining ICA and a new variant of Fisher’s Linear Discriminant, also introduced in this work. As the characterisation is carried out in a single step, it potentially provides a richer characterisation of inter-class differences. The algorithm is tested using synthetic and real fMRI data, showing promising results in both experiments.
Address Tübingen; June 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4799-4150-6 Medium
Area Expedition Conference PRNI
Notes OR;MILAB Approved no
Call Number Admin @ si @ TBI2014 Serial 2493
 

 
Author Soumya Jahagirdar; Minesh Mathew; Dimosthenis Karatzas; CV Jawahar
Title Understanding Video Scenes Through Text: Insights from Text-Based Video Question Answering Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Researchers have extensively studied the field of vision and language, discovering that both visual and textual content are crucial for understanding scenes effectively. Particularly, comprehending text in videos holds great significance, requiring both scene text understanding and temporal reasoning. This paper focuses on exploring two recently introduced datasets, NewsVideoQA and M4-ViteVQA, which aim to address video question answering based on textual content. The NewsVideoQA dataset contains question-answer pairs related to the text in news videos, while M4-ViteVQA comprises question-answer pairs from diverse categories like vlogging, traveling, and shopping. We provide an analysis of the formulation of these datasets on various levels, exploring the degree of visual understanding and multi-frame comprehension required for answering the questions. Additionally, the study includes experimentation with BERT-QA, a text-only model, which demonstrates comparable performance to the original methods on both datasets, indicating the shortcomings in the formulation of these datasets. Furthermore, we also look into the domain adaptation aspect by examining the effectiveness of training on M4-ViteVQA and evaluating on NewsVideoQA and vice versa, thereby shedding light on the challenges and potential benefits of out-of-domain training.
Address Paris; France; October 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCVW
Notes DAG Approved no
Call Number Admin @ si @ JMK2023 Serial 3946
 

 
Author Riccardo Del Chiaro; Bartlomiej Twardowski; Andrew Bagdanov; Joost Van de Weijer
Title Recurrent attention to transient tasks for continual image captioning Type Conference Article
Year 2020 Publication 34th Conference on Neural Information Processing Systems Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Research on continual learning has led to a variety of approaches to mitigating catastrophic forgetting in feed-forward classification networks. Until now, surprisingly little attention has been focused on continual learning of recurrent models applied to problems like image captioning. In this paper we take a systematic look at continual learning of LSTM-based models for image captioning. We propose an attention-based approach that explicitly accommodates the transient nature of vocabularies in continual image captioning tasks, i.e. that task vocabularies are not disjoint. We call our method Recurrent Attention to Transient Tasks (RATT), and also show how to adapt continual learning approaches based on weight regularization and knowledge distillation to recurrent continual learning problems. We apply our approaches to the incremental image captioning problem on two new continual learning benchmarks we define using the MS-COCO and Flickr30 datasets. Our results demonstrate that RATT is able to sequentially learn five captioning tasks while incurring no forgetting of previously learned ones.
Address virtual; December 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference NEURIPS
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ CTB2020 Serial 3484