Records |
Author |
Partha Pratim Roy; Josep Llados; Umapada Pal |
Title |
Text/Graphics Separation in Color Maps |
Type |
Conference Article |
Year |
2007 |
Publication |
International Conference on Computing: Theory and Applications |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
545–551 |
Keywords |
|
Abstract |
|
Address |
Kolkata (India) |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICCTA |
Notes |
DAG |
Approved |
no |
Call Number |
DAG @ dag @ RLP2007a |
Serial |
806 |
Permanent link to this record |
|
|
|
Author |
Marçal Rusiñol; Lluis Pere de las Heras; Oriol Ramos Terrades |
Title |
Flowchart Recognition for Non-Textual Information Retrieval in Patent Search |
Type |
Journal Article |
Year |
2014 |
Publication |
Information Retrieval |
Abbreviated Journal |
IR |
Volume |
17 |
Issue |
5-6 |
Pages |
545–562 |
Keywords |
Flowchart recognition; Patent documents; Text/graphics separation; Raster-to-vector conversion; Symbol recognition |
Abstract |
Relatively little research has been done on the topic of patent image retrieval, and in most approaches retrieval is performed in terms of a similarity measure between the query image and the images in the corpus. However, systems aimed at overcoming the semantic gap between the visual description of patent images and the concepts they convey would be very helpful for patent professionals. In this paper we present a flowchart recognition method aimed at achieving a structured representation of flowchart images that can be further queried semantically. The proposed method was submitted to the CLEF-IP 2012 flowchart recognition task, and we report the results obtained on this dataset. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1386-4564 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG; 600.077 |
Approved |
no |
Call Number |
Admin @ si @ RHR2013 |
Serial |
2342 |
Permanent link to this record |
|
|
|
Author |
P. Ricaurte; C. Chilan; Cristhian A. Aguilera-Carrasco; Boris X. Vintimilla; Angel Sappa |
Title |
Performance Evaluation of Feature Point Descriptors in the Infrared Domain |
Type |
Conference Article |
Year |
2014 |
Publication |
9th International Conference on Computer Vision Theory and Applications |
Abbreviated Journal |
|
Volume |
1 |
Issue |
|
Pages |
545–550 |
Keywords |
Infrared Imaging; Feature Point Descriptors |
Abstract |
This paper presents a comparative evaluation of classical feature point descriptors when they are used in the long-wave infrared spectral band. Robustness to changes in rotation, scaling, blur, and additive noise is evaluated using a state-of-the-art framework. Statistical results on an outdoor image dataset are presented, together with a discussion of the differences with respect to the results obtained when images from the visible spectrum are considered. |
Address |
Lisboa; Portugal; January 2014 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
VISAPP |
Notes |
ADAS; 600.055; 600.076 |
Approved |
no |
Call Number |
Admin @ si @ RCA2014b |
Serial |
2476 |
Permanent link to this record |
|
|
|
Author |
Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Llados |
Title |
Efficient segmentation-free keyword spotting in historical document collections |
Type |
Journal Article |
Year |
2015 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
Volume |
48 |
Issue |
2 |
Pages |
545–555 |
Keywords |
Historical documents; Keyword spotting; Segmentation-free; Dense SIFT features; Latent semantic analysis; Product quantization |
Abstract |
In this paper we present an efficient segmentation-free word spotting method, applied in the context of historical document collections, that follows the query-by-example paradigm. We use a patch-based framework where local patches are described by a bag-of-visual-words model powered by SIFT descriptors. By projecting the patch descriptors to a topic space with the latent semantic analysis technique and compressing the descriptors with the product quantization method, we are able to index the document information efficiently in terms of both memory and time. The proposed method is evaluated on four different collections of historical documents, achieving good performance in both handwritten and typewritten scenarios and outperforming recent state-of-the-art keyword spotting approaches. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG; ADAS; 600.076; 600.077; 600.061; 601.223; 602.006; 600.055 |
Approved |
no |
Call Number |
Admin @ si @ RAT2015a |
Serial |
2544 |
Permanent link to this record |
|
|
|
Author |
Beata Megyesi; Bernhard Esslinger; Alicia Fornes; Nils Kopal; Benedek Lang; George Lasry; Karl de Leeuw; Eva Pettersson; Arno Wacker; Michelle Waldispuhl |
Title |
Decryption of historical manuscripts: the DECRYPT project |
Type |
Journal Article |
Year |
2020 |
Publication |
Cryptologia |
Abbreviated Journal |
CRYPT |
Volume |
44 |
Issue |
6 |
Pages |
545–559 |
Keywords |
automatic decryption; cipher collection; historical cryptology; image transcription |
Abstract |
Many historians and linguists are working individually and in an uncoordinated fashion on the identification and decryption of historical ciphers. This is a time-consuming process, as they often work without access to automatic methods and processes that could accelerate the decipherment. At the same time, computer scientists and cryptologists are developing algorithms to decrypt various cipher types without having access to a large number of original ciphertexts. In this paper, we describe the DECRYPT project, which aims at creating resources and tools for historical cryptology by bringing the expertise of various disciplines together to collect data and exchange methods for faster progress in transcribing, decrypting, and contextualizing historical encrypted manuscripts. We present our goals and work in progress on a general approach for analyzing historical encrypted manuscripts using standardized methods and a new set of state-of-the-art tools. We release the data and tools as open source, hoping that all of the disciplines mentioned will benefit from and contribute to the research infrastructure of historical cryptology. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG; 600.140; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ MEF2020 |
Serial |
3347 |
Permanent link to this record |
|
|
|
Author |
Juan Ignacio Toledo; Sebastian Sudholt; Alicia Fornes; Jordi Cucurull; A. Fink; Josep Llados |
Title |
Handwritten Word Image Categorization with Convolutional Neural Networks and Spatial Pyramid Pooling |
Type |
Conference Article |
Year |
2016 |
Publication |
Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR) |
Abbreviated Journal |
|
Volume |
10029 |
Issue |
|
Pages |
543–552 |
Keywords |
Document image analysis; Word image categorization; Convolutional neural networks; Named entity detection |
Abstract |
The extraction of relevant information from historical document collections is one of the key steps in making these documents available for access and search. The usual approach combines transcription and grammars to extract semantically meaningful entities. In this paper, we describe a new method to obtain word categories directly from non-preprocessed handwritten word images. The method can be used to extract information directly, offering an alternative to transcription, and can thus serve as a first step in any kind of syntactical analysis. The approach is based on Convolutional Neural Networks with a Spatial Pyramid Pooling layer to deal with the different shapes of the input images. We performed experiments on a historical marriage record dataset, obtaining promising results. |
Address |
Merida; Mexico; December 2016 |
Corporate Author |
|
Thesis |
|
Publisher |
Springer International Publishing |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
978-3-319-49054-0 |
Medium |
|
Area |
|
Expedition |
|
Conference |
S+SSPR |
Notes |
DAG; 600.097; 602.006 |
Approved |
no |
Call Number |
Admin @ si @ TSF2016 |
Serial |
2877 |
Permanent link to this record |
|
|
|
Author |
Smriti Joshi; Richard Osuala; Carlos Martin Isla; Victor M.Campello; Carla Sendra-Balcells; Karim Lekadir; Sergio Escalera |
Title |
nn-UNet Training on CycleGAN-Translated Images for Cross-modal Domain Adaptation in Biomedical Imaging |
Type |
Conference Article |
Year |
2022 |
Publication |
International MICCAI Brainlesion Workshop |
Abbreviated Journal |
|
Volume |
12963 |
Issue |
|
Pages |
540–551 |
Keywords |
Domain adaptation; Vestibular schwannoma (VS); Deep learning; nn-UNet; CycleGAN |
Abstract |
In recent years, deep learning models have considerably advanced the performance of segmentation tasks on Brain Magnetic Resonance Imaging (MRI). However, these models show a considerable performance drop when they are evaluated on unseen data from a different distribution. Since annotation is often a hard and costly task requiring expert supervision, it is necessary to develop ways in which existing models can be adapted to unseen domains without any additional labelled information. In this work, we explore one such technique, which extends the CycleGAN [2] architecture to generate label-preserving data in the target domain. The synthetic target domain data is used to train the nn-UNet [3] framework for the task of multi-label segmentation. The experiments are conducted and evaluated on the dataset [1] provided in the ‘Cross-Modality Domain Adaptation for Medical Image Segmentation’ challenge [23] for segmentation of vestibular schwannoma (VS) tumour and cochlea on contrast-enhanced (ceT1) and high-resolution (hrT2) MRI scans. In the proposed approach, our model obtains Dice scores (DSC) of 0.73 and 0.49 for tumour and cochlea, respectively, on the validation set of the dataset. This indicates the applicability of the proposed technique to real-world problems where data may be obtained by different acquisition protocols, as in [1], where hrT2 images are a more reliable, safer, and lower-cost alternative to ceT1. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
MICCAIW |
Notes |
HUPBA; not mentioned |
Approved |
no |
Call Number |
Admin @ si @ JOM2022 |
Serial |
3800 |
Permanent link to this record |
|
|
|
Author |
Maryam Asadi-Aghbolaghi; Albert Clapes; Marco Bellantonio; Hugo Jair Escalante; Victor Ponce; Xavier Baro; Isabelle Guyon; Shohreh Kasaei; Sergio Escalera |
Title |
Deep Learning for Action and Gesture Recognition in Image Sequences: A Survey |
Type |
Book Chapter |
Year |
2017 |
Publication |
Gesture Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
539–578 |
Keywords |
Action recognition; Gesture recognition; Deep learning architectures; Fusion strategies |
Abstract |
Interest in automatic action and gesture recognition has grown considerably in the last few years, due in part to the large number of application domains for this type of technology. As in many other computer vision areas, deep learning based methods have quickly become a reference methodology for obtaining state-of-the-art performance in both tasks. This chapter is a survey of current deep learning based methodologies for action and gesture recognition in sequences of images. The survey reviews both fundamental and cutting-edge methodologies reported in the last few years. We introduce a taxonomy that summarizes important aspects of deep learning for approaching both tasks. Details of the proposed architectures, fusion strategies, main datasets, and competitions are reviewed. We also summarize and discuss the main works proposed so far, with particular interest in how they treat the temporal dimension of data, their distinguishing features, and opportunities and challenges for future research. To the best of our knowledge, this is the first survey on the topic, and we foresee it will become a reference in this ever-dynamic field of research. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
HUPBA; no proj |
Approved |
no |
Call Number |
Admin @ si @ ACB2017a |
Serial |
2981 |
Permanent link to this record |
|
|
|
Author |
Arnau Ramisa; Adriana Tapus; Ramon Lopez de Mantaras; Ricardo Toledo |
Title |
Mobile Robot Localization using Panoramic Vision and Combination of Feature Region Detectors |
Type |
Conference Article |
Year |
2008 |
Publication |
IEEE International Conference on Robotics and Automation |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
538–543 |
Keywords |
|
Abstract |
|
Address |
Pasadena; CA; USA |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICRA |
Notes |
RV; ADAS |
Approved |
no |
Call Number |
Admin @ si @ RTL2008 |
Serial |
1144 |
Permanent link to this record |
|
|
|
Author |
Wenwen Yu; Chengquan Zhang; Haoyu Cao; Wei Hua; Bohan Li; Huang Chen; Mingyu Liu; Mingrui Chen; Jianfeng Kuang; Mengjun Cheng; Yuning Du; Shikun Feng; Xiaoguang Hu; Pengyuan Lyu; Kun Yao; Yuechen Yu; Yuliang Liu; Wanxiang Che; Errui Ding; Cheng-Lin Liu; Jiebo Luo; Shuicheng Yan; Min Zhang; Dimosthenis Karatzas; Xing Sun; Jingdong Wang; Xiang Bai |
Title |
ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images |
Type |
Conference Article |
Year |
2023 |
Publication |
17th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
Volume |
14188 |
Issue |
|
Pages |
536–552 |
Keywords |
|
Abstract |
Structured text extraction is one of the most valuable and challenging application directions in the field of Document AI. However, the scenarios of past benchmarks are limited, and the corresponding evaluation protocols usually focus on submodules of the structured text extraction scheme. To eliminate these problems, we organized the ICDAR 2023 competition on Structured text extraction from Visually-Rich Document images (SVRD). We set up two tracks for SVRD: Track 1, HUST-CELL, which aims to evaluate the end-to-end performance of Complex Entity Linking and Labeling; and Track 2, Baidu-FEST, which focuses on evaluating the performance and generalization of zero-shot/few-shot structured text extraction from an end-to-end perspective. Compared to current document benchmarks, our two competition tracks greatly enrich the scenarios and contain more than 50 types of visually-rich document images (mainly from actual enterprise applications). The competition opened on 30th December 2022 and closed on 24th March 2023. There were 35 participants and 91 valid submissions for Track 1, and 15 participants and 26 valid submissions for Track 2. In this report we present the motivation, competition datasets, task definition, evaluation protocol, and submission summaries. Judging from the performance of the submissions, we believe there is still a large gap with respect to the expected information extraction performance in complex and zero-shot scenarios. We hope that this competition will attract many researchers in the fields of CV and NLP and bring new ideas to the field of Document AI. |
Address |
San Jose; CA; USA; August 2023 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICDAR |
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ YZC2023 |
Serial |
3896 |
Permanent link to this record |
|
|
|
Author |
Fadi Dornaika; Angel Sappa |
Title |
Instantaneous 3D motion from image derivatives using the Least Trimmed Square Regression |
Type |
Journal Article |
Year |
2009 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
Volume |
30 |
Issue |
5 |
Pages |
535–543 |
Keywords |
|
Abstract |
This paper presents a new technique for instantaneous 3D motion estimation. The main contributions are as follows. First, we show that the 3D camera or scene velocity can be retrieved from image derivatives alone, assuming that the scene contains a dominant plane. Second, we propose a new robust algorithm that simultaneously provides the Least Trimmed Square solution and the percentage of inliers (the non-contaminated data). Experiments on both synthetic and real image sequences demonstrate the effectiveness of the developed method and show that the new robust approach can outperform classical robust schemes. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
Elsevier Science Inc. |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
0167-8655 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
ADAS |
Approved |
no |
Call Number |
ADAS @ adas @ DoS2009a |
Serial |
1115 |
Permanent link to this record |
|
|
|
Author |
David Rotger; Petia Radeva; N. Bruining |
Title |
Automatic Detection of Bioabsorbable Coronary Stents in IVUS Images using a Cascade of Classifiers |
Type |
Journal Article |
Year |
2010 |
Publication |
IEEE Transactions on Information Technology in Biomedicine |
Abbreviated Journal |
TITB |
Volume |
14 |
Issue |
2 |
Pages |
535–537 |
Keywords |
|
Abstract |
Bioabsorbable drug-eluting coronary stents are a very promising improvement over common metallic ones, solving one of the most important problems of stent implantation: late restenosis. These stents, made of poly-L-lactic acid, cast a very subtle acoustic shadow (compared to metallic stents), which makes automatic detection and measurement in images difficult. In this paper, we propose a novel approach based on a cascade of GentleBoost classifiers that detects the stent struts using structural features to encode the information of the different subregions of the struts. A stochastic gradient descent method is applied to optimize the overall performance of the detector. Validation results for strut detection are very encouraging, with an average F-measure of 81%. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MILAB |
Approved |
no |
Call Number |
BCNPCL @ bcnpcl @ RRB2010 |
Serial |
1287 |
Permanent link to this record |
|
|
|
Author |
Antonio Hernandez; Nadezhda Zlateva; Alexander Marinov; Miguel Reyes; Petia Radeva; Dimo Dimov; Sergio Escalera |
Title |
Human Limb Segmentation in Depth Maps based on Spatio-Temporal Graph Cuts Optimization |
Type |
Journal Article |
Year |
2012 |
Publication |
Journal of Ambient Intelligence and Smart Environments |
Abbreviated Journal |
JAISE |
Volume |
4 |
Issue |
6 |
Pages |
535–546 |
Keywords |
Multi-modal vision processing; Random Forest; Graph-cuts; multi-label segmentation; human body segmentation |
Abstract |
We present a framework for object segmentation using depth maps, based on Random Forest and Graph-cuts theory, and apply it to the segmentation of human limbs. First, from a set of random depth features, Random Forest is used to infer a set of label probabilities for each data sample. This vector of probabilities is used as the unary term in the α−β swap Graph-cuts algorithm. Moreover, depth values of spatio-temporally neighboring data points are used as boundary potentials. Results on a new multi-label human depth data set show that the novel methodology achieves high segmentation overlap compared to classical approaches. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1876-1364 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MILAB; HuPBA |
Approved |
no |
Call Number |
Admin @ si @ HZM2012a |
Serial |
2006 |
Permanent link to this record |
|
|
|
Author |
Mohamed Ilyes Lakhal; Albert Clapes; Sergio Escalera; Oswald Lanz; Andrea Cavallaro |
Title |
Residual Stacked RNNs for Action Recognition |
Type |
Conference Article |
Year |
2018 |
Publication |
9th International Workshop on Human Behavior Understanding |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
534–548 |
Keywords |
Action recognition; Deep residual learning; Two-stream RNN |
Abstract |
Action recognition pipelines that use Recurrent Neural Networks (RNN) are currently 5–10% less accurate than Convolutional Neural Networks (CNN). While most works that use RNNs employ a 2D CNN on each frame to extract descriptors for action recognition, we extract spatiotemporal features from a 3D CNN and then learn the temporal relationship of these descriptors through a stacked residual recurrent neural network (Res-RNN). We introduce for the first time residual learning to counter the degradation problem in multi-layer RNNs, which have been successful for temporal aggregation in two-stream action recognition pipelines. Finally, we use a late fusion strategy to combine the RGB and optical flow data of the two-stream Res-RNN. Experimental results show that the proposed pipeline achieves competitive results on UCF-101 and state-of-the-art results for RNN-like architectures on the challenging HMDB-51 dataset. |
Address |
Munich; September 2018 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ECCVW |
Notes |
HUPBA; no proj |
Approved |
no |
Call Number |
Admin @ si @ LCE2018b |
Serial |
3206 |
Permanent link to this record |
|
|
|
Author |
David Masip; Agata Lapedriza; Jordi Vitria |
Title |
Boosted Online Learning for Face Recognition |
Type |
Journal Article |
Year |
2009 |
Publication |
IEEE Transactions on Systems, Man and Cybernetics part B |
Abbreviated Journal |
TSMCB |
Volume |
39 |
Issue |
2 |
Pages |
530–538 |
Keywords |
|
Abstract |
Face recognition applications commonly suffer from three main drawbacks: a reduced training set, information lying in high-dimensional subspaces, and the need to incorporate new people to recognize. In the recent literature, the extension of a face classifier to include new people in the model has been solved using online feature extraction techniques, the most successful of which are extensions of principal component analysis or linear discriminant analysis. In the current paper, a new online boosting algorithm is introduced: a face recognition method that extends a boosting-based classifier by adding new classes while avoiding the need to retrain the classifier each time a new person joins the system. The classifier is learned using the multitask learning principle, where multiple verification tasks are trained together sharing the same feature space. The new classes are added taking advantage of the structure learned previously, so that adding new classes is not computationally demanding. The present proposal has been experimentally validated on two different facial data sets by comparing our approach with current state-of-the-art techniques. The results show that the proposed online boosting algorithm fares better in terms of final accuracy. In addition, the global performance does not decrease drastically even when the number of classes of the base problem is multiplied by eight. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1083-4419 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
OR;MV |
Approved |
no |
Call Number |
BCNPCL @ bcnpcl @ MLV2009 |
Serial |
1155 |
Permanent link to this record |