|
Records |
Links |
|
Author |
Antoni Rosell; Sonia Baeza; S. Garcia-Reina; JL. Mate; Ignasi Guasch; I. Nogueira; I. Garcia-Olive; Guillermo Torres; Carles Sanchez; Debora Gil |
|
|
Title |
Radiomics to increase the effectiveness of lung cancer screening programs. Radiolung preliminary results. |
Type |
Journal Article |
|
Year |
2022 |
Publication |
European Respiratory Journal |
Abbreviated Journal |
ERJ |
|
|
Volume |
60 |
Issue |
66 |
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
IAM |
Approved |
no |
|
|
Call Number |
Admin @ si @ RBG2022c |
Serial |
3835 |
|
Permanent link to this record |
|
|
|
|
Author |
Zahra Raisi-Estabragh; Carlos Martin-Isla; Louise Nissen; Liliana Szabo; Victor M. Campello; Sergio Escalera; Simon Winther; Morten Bottcher; Karim Lekadir; Steffen E. Petersen |
|
|
Title |
Radiomics analysis enhances the diagnostic performance of CMR stress perfusion: a proof-of-concept study using the Dan-NICAD dataset |
Type |
Journal Article |
|
Year |
2023 |
Publication |
Frontiers in Cardiovascular Medicine |
Abbreviated Journal |
FCM |
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HUPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ RMN2023 |
Serial |
3937 |
|
Permanent link to this record |
|
|
|
|
Author |
Bogdan Raducanu; Jordi Vitria; Ales Leonardis |
|
|
Title |
Online pattern recognition and machine learning techniques for computer-vision: Theory and applications |
Type |
Journal Article |
|
Year |
2010 |
Publication |
Image and Vision Computing |
Abbreviated Journal |
IMAVIS |
|
|
Volume |
28 |
Issue |
7 |
Pages |
1063–1064 |
|
|
Keywords |
|
|
|
Abstract |
(Editorial for the Special Issue on Online pattern recognition and machine learning techniques)
In real life, visual learning is supposed to be a continuous process. This paradigm has also found its way into artificial vision systems. There is an increasing trend in pattern recognition represented by online learning approaches, which aim at continuously updating the data representation when new information arrives. Starting with a minimal dataset, the initial knowledge is expanded by incorporating incoming instances, which may not have been previously available or foreseen at the system’s design stage. An interesting characteristic of this strategy is that the train and test phases take place simultaneously. Given the increasing interest in this subject, the aim of this special issue is to be a landmark event in the development of online learning techniques and their applications, with the hope that it will capture the interest of a wider audience and will attract even more researchers. We received 19 contributions, of which 9 have been accepted for publication after having been subjected to the usual peer review process. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0262-8856 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
OR;MV |
Approved |
no |
|
|
Call Number |
BCNPCL @ bcnpcl @ RVL2010 |
Serial |
1280 |
|
Permanent link to this record |
|
|
|
|
Author |
Sergio Escalera; Oriol Pujol; Petia Radeva |
|
|
Title |
Error-Correcting Output Codes Library |
Type |
Journal Article |
|
Year |
2010 |
Publication |
Journal of Machine Learning Research |
Abbreviated Journal |
JMLR |
|
|
Volume |
11 |
Issue |
|
Pages |
661-664 |
|
|
Keywords |
|
|
|
Abstract |
In this paper, we present an open source Error-Correcting Output Codes (ECOC) library. The ECOC framework is a powerful tool to deal with multi-class categorization problems. This library contains both state-of-the-art coding (one-versus-one, one-versus-all, dense random, sparse random, DECOC, forest-ECOC, and ECOC-ONE) and decoding designs (hamming, euclidean, inverse hamming, laplacian, β-density, attenuated, loss-based, probabilistic kernel-based, and loss-weighted) with the parameters defined by the authors, as well as the option to include your own coding, decoding, and base classifier. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1532-4435 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB;HUPBA |
Approved |
no |
|
|
Call Number |
BCNPCL @ bcnpcl @ EPR2010c |
Serial |
1286 |
|
Permanent link to this record |
|
|
|
|
Author |
Partha Pratim Roy; Umapada Pal; Josep Llados |
|
|
Title |
Text line extraction in graphical documents using background and foreground |
Type |
Journal Article |
|
Year |
2012 |
Publication |
International Journal on Document Analysis and Recognition |
Abbreviated Journal |
IJDAR |
|
|
Volume |
15 |
Issue |
3 |
Pages |
227-241 |
|
|
Keywords |
|
|
|
Abstract |
0,405 JCR
In graphical documents (e.g., maps, engineering drawings), artistic documents etc., the text lines are annotated in multiple orientations or curvilinear way to illustrate different locations or symbols. For the optical character recognition of such documents, individual text lines from the documents need to be extracted. In this paper, we propose a novel method to segment such text lines and the method is based on the foreground and background information of the text components. To effectively utilize the background information, a water reservoir concept is used here. In the proposed scheme, at first, individual components are detected and grouped into character clusters in a hierarchical way using size and positional information. Next, the clusters are extended in two extreme sides to determine potential candidate regions. Finally, with the help of these candidate regions, individual lines are extracted. The experimental results are presented on different datasets of graphical documents, camera-based warped documents, noisy images containing seals, etc. The results demonstrate that our approach is robust and invariant to size and orientation of the text lines present in the document. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1433-2833 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ RPL2012b |
Serial |
2134 |
|
Permanent link to this record |
|
|
|
|
Author |
Alicia Fornes; Anjan Dutta; Albert Gordo; Josep Llados |
|
|
Title |
CVC-MUSCIMA: A Ground-Truth of Handwritten Music Score Images for Writer Identification and Staff Removal |
Type |
Journal Article |
|
Year |
2012 |
Publication |
International Journal on Document Analysis and Recognition |
Abbreviated Journal |
IJDAR |
|
|
Volume |
15 |
Issue |
3 |
Pages |
243-251 |
|
|
Keywords |
Music scores; Handwritten documents; Writer identification; Staff removal; Performance evaluation; Graphics recognition; Ground truths |
|
|
Abstract |
0,405 JCR
The analysis of music scores has been an active research field in the last decades. However, there are no publicly available databases of handwritten music scores for the research community. In this paper we present the CVC-MUSCIMA database and ground-truth of handwritten music score images. The dataset consists of 1,000 music sheets written by 50 different musicians. It has been especially designed for writer identification and staff removal tasks. In addition to the description of the dataset, ground-truth, partitioning and evaluation metrics, we also provide some base-line results for easing the comparison between different approaches. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1433-2833 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ FDG2012 |
Serial |
2129 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; Marçal Rusiñol; Alicia Fornes; David Fernandez; Anjan Dutta |
|
|
Title |
On the Influence of Word Representations for Handwritten Word Spotting in Historical Documents |
Type |
Journal Article |
|
Year |
2012 |
Publication |
International Journal of Pattern Recognition and Artificial Intelligence |
Abbreviated Journal |
IJPRAI |
|
|
Volume |
26 |
Issue |
5 |
Pages |
1263002-126027 |
|
|
Keywords |
Handwriting recognition; word spotting; historical documents; feature representation; shape descriptors Read More: http://www.worldscientific.com/doi/abs/10.1142/S0218001412630025 |
|
|
Abstract |
0,624 JCR
Word spotting is the process of retrieving all instances of a queried keyword from a digital library of document images. In this paper we evaluate the performance of different word descriptors to assess the advantages and disadvantages of statistical and structural models in a framework of query-by-example word spotting in historical documents. We compare four word representation models, namely sequence alignment using DTW as a baseline reference, a bag of visual words approach as statistical model, a pseudo-structural model based on a Loci features representation, and a structural approach where words are represented by graphs. The four approaches have been tested with two collections of historical data: the George Washington database and the marriage records from the Barcelona Cathedral. We experimentally demonstrate that statistical representations generally give a better performance; however, it cannot be neglected that large descriptors are difficult to implement in a retrieval scenario where word spotting requires indexing data with millions of word images. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ LRF2012 |
Serial |
2128 |
|
Permanent link to this record |
|
|
|
|
Author |
Partha Pratim Roy; Umapada Pal; Josep Llados; Mathieu Nicolas Delalandre |
|
|
Title |
Multi-oriented touching text character segmentation in graphical documents using dynamic programming |
Type |
Journal Article |
|
Year |
2012 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
45 |
Issue |
5 |
Pages |
1972-1983 |
|
|
Keywords |
|
|
|
Abstract |
2,292 JCR
The touching character segmentation problem becomes complex when touching strings are multi-oriented. Moreover, in graphical documents, characters in a single touching string sometimes have different orientations. Segmentation of such complex touching is more challenging. In this paper, we present a scheme towards the segmentation of English multi-oriented touching strings into individual characters. When two or more characters touch, they generate a big cavity region in the background portion. Based on the convex hull information, at first, we use this background information to find some initial points for segmentation of a touching string into possible primitives (a primitive consists of a single character or part of a character). Next, the primitives are merged to get optimum segmentation. A dynamic programming algorithm is applied for this purpose using the total likelihood of characters as the objective function. An SVM classifier is used to find the likelihood of a character. To consider multi-oriented touching strings, the features used in the SVM are invariant to character orientation. Experiments were performed in different databases of real and synthetic touching characters and the results show that the method is efficient in segmenting touching characters of arbitrary orientations and sizes. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0031-3203 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ RPL2012a |
Serial |
2133 |
|
Permanent link to this record |
|
|
|
|
Author |
Javier Vazquez; C. Alejandro Parraga; Maria Vanrell |
|
|
Title |
Ordinal pairwise method for natural images comparison |
Type |
Journal Article |
|
Year |
2009 |
Publication |
Perception |
Abbreviated Journal |
PER |
|
|
Volume |
38 |
Issue |
|
Pages |
180 |
|
|
Keywords |
|
|
|
Abstract |
38(Suppl.)ECVP Abstract Supplement
We developed a new psychophysical method to compare different colour appearance models when applied to natural scenes. The method was as follows: two images (processed by different algorithms) were displayed on a CRT monitor and observers were asked to select the most natural of them. The original images were gathered by means of a calibrated trichromatic digital camera and presented one on top of the other on a calibrated screen. The selection was made by pressing on a 6-button IR box, which allowed observers to consider not only the most natural but to rate their selection. The rating system allowed observers to register how much more natural was their chosen image (eg, much more, definitely more, slightly more), which gave us valuable extra information on the selection process. The results were analysed considering both the selection as a binary choice (using Thurstone's law of comparative judgement) and using Bradley-Terry method for ordinal comparison. Our results show a significant difference in the rating scales obtained. Although this method has been used in colour constancy algorithm comparisons, its uses are much wider, eg to compare algorithms of image compression, rendering, recolouring, etc. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
CAT @ cat @ VPV2009b |
Serial |
1191 |
|
Permanent link to this record |
|
|
|
|
Author |
Antonio Clavelli; Dimosthenis Karatzas; Josep Llados; Mario Ferraro; Giuseppe Boccignone |
|
|
Title |
Modelling task-dependent eye guidance to objects in pictures |
Type |
Journal Article |
|
Year |
2014 |
Publication |
Cognitive Computation |
Abbreviated Journal |
CoCom |
|
|
Volume |
6 |
Issue |
3 |
Pages |
558-584 |
|
|
Keywords |
Visual attention; Gaze guidance; Value; Payoff; Stochastic fixation prediction |
|
|
Abstract |
5Y Impact Factor: 1.14 / 3rd (Computer Science, Artificial Intelligence)
We introduce a model of attentional eye guidance based on the rationale that the deployment of gaze is to be considered in the context of a general action-perception loop relying on two strictly intertwined processes: sensory processing, depending on current gaze position, identifies sources of information that are most valuable under the given task; motor processing links such information with the oculomotor act by sampling the next gaze position and thus performing the gaze shift. In such a framework, the choice of where to look next is task-dependent and oriented to classes of objects embedded within pictures of complex scenes. The dependence on task is taken into account by exploiting the value and the payoff of gazing at certain image patches or proto-objects that provide a sparse representation of the scene objects. The different levels of the action-perception loop are represented in probabilistic form and eventually give rise to a stochastic process that generates the gaze sequence. This way the model also accounts for statistical properties of gaze shifts such as individual scan path variability. Results of the simulations are compared with experimental data derived both from publicly available datasets and from our own experiments. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer US |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1866-9956 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.056; 600.045; 605.203; 601.212; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CKL2014 |
Serial |
2419 |
|
Permanent link to this record |
|
|
|
|
Author |
T. Chauhan; E. Perales; Kaida Xiao; E. Hird; Dimosthenis Karatzas; Sophie Wuerger |
|
|
Title |
The achromatic locus: Effect of navigation direction in color space |
Type |
Journal Article |
|
Year |
2014 |
Publication |
Journal of Vision |
Abbreviated Journal |
VSS |
|
|
Volume |
14 (1) |
Issue |
25 |
Pages |
1-11 |
|
|
Keywords |
achromatic; unique hues; color constancy; luminance; color space |
|
|
Abstract |
5Y Impact Factor: 2.99 / 1st (Ophthalmology)
An achromatic stimulus is defined as a patch of light that is devoid of any hue. This is usually achieved by asking observers to adjust the stimulus such that it looks neither red nor green and at the same time neither yellow nor blue. Despite the theoretical and practical importance of the achromatic locus, little is known about the variability in these settings. The main purpose of the current study was to evaluate whether achromatic settings were dependent on the task of the observers, namely the navigation direction in color space. Observers could either adjust the test patch along the two chromatic axes in the CIE u*v* diagram or, alternatively, navigate along the unique-hue lines. Our main result is that the navigation method affects the reliability of these achromatic settings. Observers are able to make more reliable achromatic settings when adjusting the test patch along the directions defined by the four unique hues as opposed to navigating along the main axes in the commonly used CIE u*v* chromaticity plane. This result holds across different ambient viewing conditions (Dark, Daylight, Cool White Fluorescent) and different test luminance levels (5, 20, and 50 cd/m²). The reduced variability in the achromatic settings is consistent with the idea that internal color representations are more aligned with the unique-hue lines than the u* and v* axes. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CPX2014 |
Serial |
2418 |
|
Permanent link to this record |
|
|
|
|
Author |
Arjan Gijsenij; Theo Gevers; Joost Van de Weijer |
|
|
Title |
Improving Color Constancy by Photometric Edge Weighting |
Type |
Journal Article |
|
Year |
2012 |
Publication |
IEEE Transactions on Pattern Analysis and Machine Intelligence |
Abbreviated Journal |
TPAMI |
|
|
Volume |
34 |
Issue |
5 |
Pages |
918-929 |
|
|
Keywords |
|
|
|
Abstract |
Edge-based color constancy methods make use of image derivatives to estimate the illuminant. However, different edge types exist in real-world images such as material, shadow and highlight edges. These different edge types may have a distinctive influence on the performance of the illuminant estimation. Therefore, in this paper, an extensive analysis is provided of different edge types on the performance of edge-based color constancy methods. First, an edge-based taxonomy is presented classifying edge types based on their photometric properties (e.g. material, shadow-geometry and highlights). Then, a performance evaluation of edge-based color constancy is provided using these different edge types. From this performance evaluation it is derived that specular and shadow edge types are more valuable than material edges for the estimation of the illuminant. To this end, the (iterative) weighted Grey-Edge algorithm is proposed in which these edge types are more emphasized for the estimation of the illuminant. Images that are recorded under controlled circumstances demonstrate that the proposed iterative weighted Grey-Edge algorithm based on highlights reduces the median angular error by approximately 25%. In an uncontrolled environment, improvements in angular error of up to 11% are obtained with respect to regular edge-based color constancy. |
|
|
Address |
Los Alamitos; CA; USA; |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0162-8828 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
CIC;ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ GGW2012 |
Serial |
1850 |
|
Permanent link to this record |
|
|
|
|
Author |
Sergio Escalera; Oriol Pujol; Petia Radeva |
|
|
Title |
On the Decoding Process in Ternary Error-Correcting Output Codes |
Type |
Journal Article |
|
Year |
2010 |
Publication |
IEEE Transactions on Pattern Analysis and Machine Intelligence |
Abbreviated Journal |
TPAMI |
|
|
Volume |
32 |
Issue |
1 |
Pages |
120–134 |
|
|
Keywords |
|
|
|
Abstract |
A common way to model multiclass classification problems is to design a set of binary classifiers and to combine them. Error-correcting output codes (ECOC) represent a successful framework to deal with this type of problem. Recent works in the ECOC framework showed significant performance improvements by means of new problem-dependent designs based on the ternary ECOC framework. The ternary framework contains a larger set of binary problems because of the use of a “do not care” symbol that allows some classes to be ignored by a given classifier. However, there are no proper studies that analyze the effect of the new symbol at the decoding step. In this paper, we present a taxonomy that embeds all binary and ternary ECOC decoding strategies into four groups. We show that the zero symbol introduces two kinds of biases that require redefinition of the decoding design. A new type of decoding measure is proposed, and two novel decoding strategies are defined. We evaluate the state-of-the-art coding and decoding strategies over a set of UCI machine learning repository data sets and on a real traffic sign categorization problem. The experimental results show that, following the new decoding strategies, the performance of the ECOC design is significantly improved. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0162-8828 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB;HUPBA |
Approved |
no |
|
|
Call Number |
BCNPCL @ bcnpcl @ EPR2010b |
Serial |
1277 |
|
Permanent link to this record |
|
|
|
|
Author |
Yi Xiao; Felipe Codevilla; Akhil Gurram; Onay Urfalioglu; Antonio Lopez |
|
|
Title |
Multimodal end-to-end autonomous driving |
Type |
Journal Article |
|
Year |
2020 |
Publication |
IEEE Transactions on Intelligent Transportation Systems |
Abbreviated Journal |
TITS |
|
|
Volume |
|
Issue |
|
Pages |
1-11 |
|
|
Keywords |
|
|
|
Abstract |
A crucial component of an autonomous vehicle (AV) is the artificial intelligence (AI) that is able to drive towards a desired destination. Today, there are different paradigms addressing the development of AI drivers. On the one hand, we find modular pipelines, which divide the driving task into sub-tasks such as perception and maneuver planning and control. On the other hand, we find end-to-end driving approaches that try to learn a direct mapping from input raw sensor data to vehicle control signals. The latter are relatively less studied, but are gaining popularity since they are less demanding in terms of sensor data annotation. This paper focuses on end-to-end autonomous driving. So far, most proposals relying on this paradigm assume RGB images as input sensor data. However, AVs will not be equipped only with cameras, but also with active sensors providing accurate depth information (e.g., LiDARs). Accordingly, this paper analyses whether combining RGB and depth modalities, i.e. using RGBD data, produces better end-to-end AI drivers than relying on a single modality. We consider multimodality based on early, mid and late fusion schemes, both in multisensory and single-sensor (monocular depth estimation) settings. Using the CARLA simulator and conditional imitation learning (CIL), we show how, indeed, early fusion multimodality outperforms single-modality. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ XCG2020 |
Serial |
3490 |
|
Permanent link to this record |
|
|
|
|
Author |
Joakim Bruslund Haurum; Meysam Madadi; Sergio Escalera; Thomas B. Moeslund |
|
|
Title |
Multi-scale hybrid vision transformer and Sinkhorn tokenizer for sewer defect classification |
Type |
Journal Article |
|
Year |
2022 |
Publication |
Automation in Construction |
Abbreviated Journal |
AC |
|
|
Volume |
144 |
Issue |
|
Pages |
104614 |
|
|
Keywords |
Sewer Defect Classification; Vision Transformers; Sinkhorn-Knopp; Convolutional Neural Networks; Closed-Circuit Television; Sewer Inspection |
|
|
Abstract |
A crucial part of image classification consists of capturing non-local spatial semantics of image content. This paper describes the multi-scale hybrid vision transformer (MSHViT), an extension of the classical convolutional neural network (CNN) backbone, for multi-label sewer defect classification. To better model spatial semantics in the images, features are aggregated at different scales non-locally through the use of a lightweight vision transformer, and a smaller set of tokens was produced through a novel Sinkhorn clustering-based tokenizer using distinct cluster centers. The proposed MSHViT and Sinkhorn tokenizer were evaluated on the Sewer-ML multi-label sewer defect classification dataset, showing consistent performance improvements of up to 2.53 percentage points. |
|
|
Address |
Dec 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ BME2022c |
Serial |
3780 |
|
Permanent link to this record |