|
Records |
Links |
|
Author |
Marçal Rusiñol; David Aldavert; Dimosthenis Karatzas; Ricardo Toledo; Josep Llados |
|
|
Title |
Interactive Trademark Image Retrieval by Fusing Semantic and Visual Content. Advances in Information Retrieval |
Type |
Conference Article |
|
Year |
2011 |
Publication |
33rd European Conference on Information Retrieval |
Abbreviated Journal |
|
|
|
Volume |
6611 |
Issue |
|
Pages |
314-325 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we propose an efficient query-by-example retrieval system that retrieves trademark images by similarity from the digital libraries of patent and trademark offices. Logo images are described both by their semantic content, by means of the Vienna codes, and by their visual content, using shape and color as visual cues. The trademark descriptors are then indexed by a locality-sensitive hashing data structure that aims to perform approximate k-NN search in high-dimensional spaces in sub-linear time. The resulting ranked lists are combined using the Condorcet method, and a relevance feedback step helps to iteratively revise the query and refine the obtained results. The experiments demonstrate the effectiveness and efficiency of this system on a realistic and large dataset. |
|
|
Address |
Dublin, Ireland |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer |
Place of Publication |
Berlin |
Editor |
P. Clough; C. Foley; C. Gurrin; G.J.F. Jones; W. Kraaij; H. Lee; V. Murdoch |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-642-20160-8 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECIR |
|
|
Notes |
DAG; RV; ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ RAK2011 |
Serial |
1737 |
|
|
|
|
|
Author |
Volkmar Frinken; Andreas Fischer; Horst Bunke; Alicia Fornes |
|
|
Title |
Co-training for Handwritten Word Recognition |
Type |
Conference Article |
|
Year |
2011 |
Publication |
11th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
314-318 |
|
|
Keywords |
|
|
|
Abstract |
To cope with the tremendous variations of writing styles encountered between different individuals, unconstrained automatic handwriting recognition systems need to be trained on large sets of labeled data. Traditionally, the training data has to be labeled manually, which is a laborious and costly process. Semi-supervised learning techniques offer methods to utilize unlabeled data, which can be obtained cheaply in large amounts, in order to reduce the need for labeled data. In this paper, we propose the use of Co-Training for improving the recognition accuracy of two weakly trained handwriting recognition systems. The first one is based on Recurrent Neural Networks while the second one is based on Hidden Markov Models. On the IAM off-line handwriting database we demonstrate that a significant increase in recognition accuracy can be achieved with Co-Training for single word recognition. |
|
|
Address |
Beijing, China |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ FFB2011 |
Serial |
1789 |
|
|
|
|
|
Author |
Margarita Torre; Petia Radeva |
|
|
Title |
Agricultural-Field Extraction on Aerial Images by Region Competition Algorithm |
Type |
Conference Article |
|
Year |
2000 |
Publication |
15th International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
1 |
Issue |
|
Pages |
313-316 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Barcelona, Spain, 2000 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
BCNPCL @ bcnpcl @ Tor2000a |
Serial |
222 |
|
|
|
|
|
Author |
Carles Sanchez; F. Javier Sanchez; Antoni Rosell; Debora Gil |
|
|
Title |
An illumination model of the trachea appearance in videobronchoscopy images |
Type |
Book Chapter |
|
Year |
2012 |
Publication |
Image Analysis and Recognition |
Abbreviated Journal |
LNCS |
|
|
Volume |
7325 |
Issue |
|
Pages |
313-320 |
|
|
Keywords |
Bronchoscopy, tracheal ring, stenosis assessment, trachea appearance model, segmentation |
|
|
Abstract |
Videobronchoscopy is a medical imaging technique that allows interactive navigation inside the respiratory pathways. This imaging modality provides realistic images and allows non-invasive minimal intervention procedures. Tracheal procedures are routine interventions that require assessment of the percentage of obstructed pathway for injury (stenosis) detection. Visual assessment in videobronchoscopic sequences requires high expertise in trachea anatomy and is prone to human error.
This paper introduces an automatic method for estimating the percentage reduction of a stenosed trachea in videobronchoscopic images. We look for tracheal rings, whose deformation determines the degree of obstruction. For ring extraction, we present a ring detector based on an illumination and appearance model. This model allows us to parametrise the ring detection. Finally, we can infer optimal estimation parameters for any video resolution. |
|
|
Address |
Aveiro, Portugal |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
Lecture Notes in Computer Science |
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-31297-7 |
Medium |
|
|
|
Area |
800 |
Expedition |
|
Conference |
ICIAR |
|
|
Notes |
MV;IAM |
Approved |
no |
|
|
Call Number |
IAM @ iam @ SSR2012 |
Serial |
1898 |
|
|
|
|
|
Author |
Miguel Oliveira; Victor Santos; Angel Sappa; P. Dias; A. Moreira |
|
|
Title |
Incremental Scenario Representations for Autonomous Driving using Geometric Polygonal Primitives |
Type |
Journal Article |
|
Year |
2016 |
Publication |
Robotics and Autonomous Systems |
Abbreviated Journal |
RAS |
|
|
Volume |
83 |
Issue |
|
Pages |
312-325 |
|
|
Keywords |
Incremental scene reconstruction; Point clouds; Autonomous vehicles; Polygonal primitives |
|
|
Abstract |
When an autonomous vehicle is traveling through some scenario, it receives a continuous stream of sensor data. This sensor data arrives in an asynchronous fashion and often contains overlapping or redundant information. Thus, it is not trivial to create a representation of the environment observed by the vehicle and to update it over time. This paper presents a novel methodology to compute an incremental 3D representation of a scenario from 3D range measurements. We propose to use macro scale polygonal primitives to model the scenario. This means that the representation of the scene is given as a list of large scale polygons that describe the geometric structure of the environment. Furthermore, we propose mechanisms designed to update the geometric polygonal primitives over time whenever fresh sensor data is collected. Results show that the approach is capable of producing accurate descriptions of the scene, and that it is computationally very efficient when compared to other reconstruction techniques. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier B.V. |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS; 600.086, 600.076 |
Approved |
no |
|
|
Call Number |
Admin @ si @ OSS2016a |
Serial |
2806 |
|
|
|
|
|
Author |
C. Gratin; Jordi Vitria; F. Moreso; D. Seron |
|
|
Title |
Texture Classification using Neural Networks and Local Granulometries |
Type |
Conference Article |
|
Year |
1994 |
Publication |
EURASIP Workshop, Mathematical Morphology and Its Applications to Image Processing, J. Serra and P. Soille, editors |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
309-316 |
|
|
Keywords |
Neural Networks; Granulometry; Kidney; Texture; Classification |
|
|
Abstract |
|
|
|
Address |
Fontainebleau, France |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
OR;MV |
Approved |
no |
|
|
Call Number |
BCNPCL @ bcnpcl @ GVM1994 |
Serial |
110 |
|
|
|
|
|
Author |
Ciprian Corneanu; Meysam Madadi; Sergio Escalera |
|
|
Title |
Deep Structure Inference Network for Facial Action Unit Recognition |
Type |
Conference Article |
|
Year |
2018 |
Publication |
15th European Conference on Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
11216 |
Issue |
|
Pages |
309-324 |
|
|
Keywords |
Computer Vision; Machine Learning; Deep Learning; Facial Expression Analysis; Facial Action Units; Structure Inference |
|
|
Abstract |
Facial expressions are combinations of basic components called Action Units (AU). Recognizing AUs is key for general facial expression analysis. Recently, efforts in automatic AU recognition have been dedicated to learning combinations of local features and to exploiting correlations between AUs. We propose a deep neural architecture that tackles both problems by combining learned local and global features in its initial stages and replicating a message passing algorithm between classes, similar to a graphical model inference approach, in later stages. We show that by training the model end-to-end with increased supervision we improve on the state of the art by 5.3% and 8.2% on the BP4D and DISFA datasets, respectively. |
|
|
Address |
Munich; September 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECCV |
|
|
Notes |
HUPBA; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ CME2018 |
Serial |
3205 |
|
|
|
|
|
Author |
Mariella Dimiccoli |
|
|
Title |
Figure-ground segregation: A fully nonlocal approach |
Type |
Journal Article |
|
Year |
2016 |
Publication |
Vision Research |
Abbreviated Journal |
VR |
|
|
Volume |
126 |
Issue |
|
Pages |
308-317 |
|
|
Keywords |
Figure-ground segregation; Nonlocal approach; Directional linear voting; Nonlinear diffusion |
|
|
Abstract |
We present a computational model that computes and integrates in a nonlocal fashion several configural cues for automatic figure-ground segregation. Our working hypothesis is that the figural status of each pixel is a nonlocal function of several geometric shape properties and it can be estimated without explicitly relying on object boundaries. The methodology is grounded on two elements: multi-directional linear voting and nonlinear diffusion. A first estimation of the figural status of each pixel is obtained as a result of a voting process, in which several differently oriented line-shaped neighborhoods vote to express their belief about the figural status of the pixel. A nonlinear diffusion process is then applied to enforce the coherence of figural status estimates among perceptually homogeneous regions. Computer simulations fit human perception and match the experimental evidence that several cues cooperate in defining figure-ground segregation. The results of this work suggest that figure-ground segregation involves feedback from cells with larger receptive fields in higher visual cortical areas. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ Dim2016b |
Serial |
2623 |
|
|
|
|
|
Author |
Ayan Banerjee; Sanket Biswas; Josep Llados; Umapada Pal |
|
|
Title |
SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation |
Type |
Conference Article |
|
Year |
2023 |
Publication |
17th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
14187 |
Issue |
|
Pages |
307–325 |
|
|
Keywords |
|
|
|
Abstract |
Instance-level segmentation of documents consists in assigning a class-aware and instance-aware label to each pixel of the image. It is a key step in parsing documents for their understanding. In this paper, we present a unified transformer encoder-decoder architecture for end-to-end instance segmentation of complex layouts in document images. The method adapts a contrastive training with a mixed query selection for anchor initialization in the decoder. Later on, it performs a dot product between the obtained query embeddings and the pixel embedding map (coming from the encoder) for semantic reasoning. Extensive experimentation on competitive benchmarks like PubLayNet, PRIMA, Historical Japanese (HJ), and TableBank demonstrates that our model with SwinL backbone achieves better segmentation performance than the existing state-of-the-art approaches, with average precision of 93.72, 54.39, 84.65 and 98.04 respectively, under one billion parameters. The code is made publicly available at: github.com/ayanban011/SwinDocSegmenter . |
|
|
Address |
San Jose; CA; USA; August 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ BBL2023 |
Serial |
3893 |
|
|
|
|
|
Author |
Adria Molina; Pau Riba; Lluis Gomez; Oriol Ramos Terrades; Josep Llados |
|
|
Title |
Date Estimation in the Wild of Scanned Historical Photos: An Image Retrieval Approach |
Type |
Conference Article |
|
Year |
2021 |
Publication |
16th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
12822 |
Issue |
|
Pages |
306-320 |
|
|
Keywords |
|
|
|
Abstract |
This paper presents a novel method for date estimation of historical photographs from archival sources. The main contribution is to formulate the date estimation as a retrieval task, where given a query, the retrieved images are ranked in terms of the estimated date similarity. The closer their embedded representations are, the closer their dates are. Contrary to traditional models that design a neural network that learns a classifier or a regressor, we propose a learning objective based on the nDCG ranking metric. We have experimentally evaluated the performance of the method in two different tasks, date estimation and date-sensitive image retrieval, using the DEW public database, outperforming the baseline methods. |
|
|
Address |
Lausanne; Switzerland; September 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.121; 600.140; 110.312 |
Approved |
no |
|
|
Call Number |
Admin @ si @ MRG2021b |
Serial |
3571 |
|
|
|
|
|
Author |
Jordi Gonzalez; Thomas B. Moeslund; Liang Wang |
|
|
Title |
Semantic Understanding of Human Behaviors in Image Sequences: From video-surveillance to video-hermeneutics |
Type |
Journal Article |
|
Year |
2012 |
Publication |
Computer Vision and Image Understanding |
Abbreviated Journal |
CVIU |
|
|
Volume |
116 |
Issue |
3 |
Pages |
305–306 |
|
|
Keywords |
|
|
|
Abstract |
Purpose: Atheromatic plaque progression is affected, among other phenomena, by biomechanical, biochemical, and physiological factors. In this paper, the authors introduce a novel framework able to provide both morphological (vessel radius, plaque thickness, and type) and biomechanical (wall shear stress and Von Mises stress) indices of coronary arteries. Methods: First, the approach reconstructs the three-dimensional morphology of the vessel from intravascular ultrasound (IVUS) and angiographic sequences, requiring minimal user interaction. Then, a computational pipeline automatically assesses fluid-dynamic and mechanical indices. Ten coronary arteries are analyzed, illustrating the capabilities of the tool and confirming previous technical and clinical observations. Results: The relations between the arterial indices obtained by IVUS measurement and simulations have been quantitatively analyzed along the whole surface of the artery, extending the analysis of the coronary arteries shown in previous state-of-the-art studies. Additionally, for the first time in the literature, the framework allows the computation of the membrane stresses using a simplified mechanical model of the arterial wall. Conclusions: Circumferentially (within a given frame), statistical analysis shows an inverse relation between the wall shear stress and the plaque thickness. At the global level (comparing a frame within the entire vessel), it is observed that heavy plaque accumulations are in general calcified and are located in the areas of the vessel having high wall shear stress. Finally, in their experiments the inverse proportionality between fluid and structural stresses is observed. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1077-3142 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ GMW2012 |
Serial |
2005 |
|
|
|
|
|
Author |
Debora Gil; Jaume Garcia; Aura Hernandez-Sabate; Enric Marti |
|
|
Title |
Manifold parametrization of the left ventricle for a statistical modelling of its complete anatomy |
Type |
Conference Article |
|
Year |
2010 |
Publication |
8th Medical Imaging |
Abbreviated Journal |
|
|
|
Volume |
7623 |
Issue |
762304 |
Pages |
304 |
|
|
Keywords |
|
|
|
Abstract |
Distortion of Left Ventricle (LV) external anatomy is related to some dysfunctions, such as hypertrophy. The architecture of myocardial fibers determines LV electromechanical activation patterns as well as mechanics. Thus, their joint modelling would allow the design of specific interventions (such as pacemaker implantation and LV remodelling) and therapies (such as resynchronization). On one hand, accurate modelling of external anatomy requires either a dense sampling or a continuous infinite dimensional approach, which requires non-Euclidean statistics. On the other hand, computation of fiber models requires statistics on Riemannian spaces. Most approaches compute separate statistical models for external anatomy and fiber architecture. In this work we propose a general mathematical framework based on differential geometry concepts for computing a statistical model including both external and fiber anatomy. Our framework provides a continuous approach to external anatomy supporting standard statistics. We also provide a straightforward formula for the computation of the Riemannian fiber statistics. We have applied our methodology to the computation of a complete anatomical atlas of canine hearts from diffusion tensor studies. The orientation of fibers over the average external geometry agrees with the segmental description of orientations reported in the literature. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
SPIE |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
SPIE |
|
|
Notes |
IAM |
Approved |
no |
|
|
Call Number |
IAM @ iam @ GGH2010a |
Serial |
1522 |
|
|
|
|
|
Author |
David Dueñas; Mostafa Kamal; Petia Radeva |
|
|
Title |
Efficient Deep Learning Ensemble for Skin Lesion Classification |
Type |
Conference Article |
|
Year |
2023 |
Publication |
Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
303-314 |
|
|
Keywords |
|
|
|
Abstract |
Vision Transformers (ViTs) are deep learning techniques that have been gaining in popularity in recent years. In this work, we study the performance of ViTs and Convolutional Neural Networks (CNNs) on skin lesion classification tasks, specifically melanoma diagnosis. We show that regardless of the performance of both architectures, an ensemble of them can improve their generalization. We also present an adaptation to the Gram-OOD* method (detecting Out-of-distribution (OOD) using Gram matrices) for skin lesion images. Moreover, the integration of super-convergence was critical to success in building models with strict computing and training time constraints. We evaluated our ensemble of ViTs and CNNs, demonstrating that generalization is enhanced by placing first in the 2019 and third in the 2020 ISIC Challenge Live Leaderboards (available at https://challenge.isic-archive.com/leaderboards/live/). |
|
|
Address |
Lisboa; Portugal; February 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
VISIGRAPP |
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ DKR2023 |
Serial |
3928 |
|
|
|
|
|
Author |
Emanuel Indermühle; Volkmar Frinken; Horst Bunke |
|
|
Title |
Mode Detection in Online Handwritten Documents using BLSTM Neural Networks |
Type |
Conference Article |
|
Year |
2012 |
Publication |
13th International Conference on Frontiers in Handwriting Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
302-307 |
|
|
Keywords |
|
|
|
Abstract |
Mode detection in online handwritten documents refers to the process of distinguishing different types of contents, such as text, formulas, diagrams, or tables, one from another. In this paper a new approach to mode detection is proposed that uses bidirectional long short-term memory (BLSTM) neural networks. The BLSTM neural network is a novel type of recurrent neural network that has been successfully applied in speech and handwriting recognition. In this paper we show that it has the potential to significantly outperform traditional methods for mode detection, which are usually based on stroke classification. As a further advantage over previous approaches, the proposed system is trainable and does not rely on user-defined heuristics. Moreover, it can be easily adapted to new or additional types of modes by just providing the system with new training data. |
|
|
Address |
Bari, Italy |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-4673-2262-1 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ IFB2012 |
Serial |
2056 |
|
|
|
|
|
Author |
D. Perez; L. Tarazon; N. Serrano; F.M. Castro; Oriol Ramos Terrades; A. Juan |
|
|
Title |
The GERMANA Database |
Type |
Conference Article |
|
Year |
2009 |
Publication |
10th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
301-305 |
|
|
Keywords |
|
|
|
Abstract |
A new handwritten text database, GERMANA, is presented to facilitate empirical comparison of different approaches to text line extraction and off-line handwriting recognition. GERMANA is the result of digitising and annotating a 764-page Spanish manuscript from 1891, in which most pages only contain nearly calligraphed text written on ruled sheets of well-separated lines. To our knowledge, it is the first publicly available database for handwriting research, mostly written in Spanish and comparable in size to standard databases. Due to its sequential book structure, it is also well-suited for realistic assessment of interactive handwriting recognition systems. To provide baseline results for reference in future studies, empirical results are also reported, using standard techniques and tools for preprocessing, feature extraction, HMM-based image modelling, and language modelling. |
|
|
Address |
Barcelona; Spain |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1520-5363 |
ISBN |
978-1-4244-4500-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ PTS2009 |
Serial |
1870 |
|