Records | |||||
---|---|---|---|---|---|
Author | Kamal Nasrollahi; Sergio Escalera; P. Rasti; Gholamreza Anbarjafari; Xavier Baro; Hugo Jair Escalante; Thomas B. Moeslund | ||||
Title | Deep Learning based Super-Resolution for Improved Action Recognition | Type | Conference Article | ||
Year | 2015 | Publication | 5th International Conference on Image Processing Theory, Tools and Applications IPTA2015 | Abbreviated Journal | |
Volume | Issue | Pages | 67 - 72 | ||
Keywords | |||||
Abstract | Action recognition systems mostly work with videos of proper quality and resolution. Even the most challenging benchmark databases for action recognition hardly include low-resolution videos from, e.g., surveillance cameras. In videos recorded by such cameras, due to the distance between people and the cameras, people appear very small and hence challenge action recognition algorithms. Simple upsampling methods, like bicubic interpolation, cannot retrieve all the detailed information that can help the recognition. To deal with this problem, in this paper we combine the results of bicubic interpolation with those of a state-of-the-art deep learning-based super-resolution algorithm through an alpha-blending approach. The experimental results obtained on a down-sampled version of a large subset of the Hollywood2 benchmark database show the importance of the proposed system in increasing the recognition rate of a state-of-the-art action recognition system for handling low-resolution videos. | ||||
Address | Orleans; France; November 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IPTA | ||
Notes | HuPBA; MV | Approved | no | ||
Call Number | Admin @ si @ NER2015 | Serial | 2648 | ||
Permanent link to this record | |||||
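The alpha-blending step described in the record above is a per-pixel convex combination of the two upsampled images. The following is a minimal sketch; the function name, the flat-list image representation, and the fixed alpha value are illustrative assumptions, not taken from the paper:

```python
def alpha_blend(sr, bicubic, alpha=0.5):
    """Convex combination of a super-resolved image and its bicubic
    upsampling; both images are given as flat lists of pixel values."""
    return [alpha * s + (1.0 - alpha) * b for s, b in zip(sr, bicubic)]
```

With alpha = 0.5 the two sources contribute equally; in practice the weight would be tuned on a validation set.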
Author | Katerine Diaz; Francesc J. Ferri; W. Diaz | ||||
Title | Incremental Generalized Discriminative Common Vectors for Image Classification | Type | Journal Article | ||
Year | 2015 | Publication | IEEE Transactions on Neural Networks and Learning Systems | Abbreviated Journal | TNNLS |
Volume | 26 | Issue | 8 | Pages | 1761 - 1775 |
Keywords | |||||
Abstract | Subspace-based methods have become popular due to their ability to appropriately represent complex data in such a way that both dimensionality is reduced and discriminativeness is enhanced. Several recent works have concentrated on the discriminative common vector (DCV) method and other closely related algorithms also based on the concept of null space. In this paper, we present a generalized incremental formulation of the DCV methods, which allows a given model to be updated with new examples, even from unseen classes. Having efficient incremental formulations of well-behaved batch algorithms allows us to conveniently adapt previously trained classifiers without the need to recompute them from scratch. The proposed generalized incremental method has been empirically validated in case studies from different application domains (faces, objects, and handwritten digits), considering several scenarios in which new data are continuously added at different rates starting from an initial model. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2162-237X | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 600.076 | Approved | no | ||
Call Number | Admin @ si @ DFD2015 | Serial | 2547 | ||
Permanent link to this record | |||||
Author | Lluis Garrido; M.Guerrieri; Laura Igual | ||||
Title | Image Segmentation with Cage Active Contours | Type | Journal Article | ||
Year | 2015 | Publication | IEEE Transactions on Image Processing | Abbreviated Journal | TIP |
Volume | 24 | Issue | 12 | Pages | 5557 - 5566 |
Keywords | Level sets; Mean value coordinates; Parametrized active contours | ||||
Abstract | In this paper, we present a framework for image segmentation based on parametrized active contours. The evolving contour is parametrized according to a reduced set of control points that form a closed polygon and have a clear visual interpretation. The parametrization, called mean value coordinates, stems from the techniques used in computer graphics to animate virtual models. Our framework makes it easy to formulate region-based energies for segmenting an image. In particular, we present three different local region-based energy terms: 1) the mean model; 2) the Gaussian model; and 3) the histogram model. We show the behavior of our method on synthetic and real images and compare its performance with state-of-the-art level set methods. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1057-7149 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ GGI2015 | Serial | 2673 | ||
Permanent link to this record | |||||
Author | Lluis Gomez; Dimosthenis Karatzas | ||||
Title | Object Proposals for Text Extraction in the Wild | Type | Conference Article | ||
Year | 2015 | Publication | 13th International Conference on Document Analysis and Recognition ICDAR2015 | Abbreviated Journal | |
Volume | Issue | Pages | 206 - 210 | ||
Keywords | |||||
Abstract | Object Proposals is a recent computer vision technique that is receiving increasing interest from the research community. Its main objective is to generate a relatively small set of bounding box proposals that are most likely to contain objects of interest. The use of Object Proposals techniques in the scene text understanding field is novel. Motivated by the success of powerful yet expensive techniques that recognize words in a holistic way, Object Proposals techniques emerge as an alternative to traditional text detectors. In this paper we study to what extent existing generic Object Proposals methods may be useful for scene text understanding. We also propose a new Object Proposals algorithm that is specifically designed for text and compare it with other generic methods in the state of the art. Experiments show that our proposal is superior in its ability to produce good-quality word proposals efficiently. The source code of our method is made publicly available. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.077; 600.084; 601.197 | Approved | no | ||
Call Number | Admin @ si @ GoK2015 | Serial | 2691 | ||
Permanent link to this record | |||||
Author | Lluis Pere de las Heras; Oriol Ramos Terrades; Josep Llados | ||||
Title | Attributed Graph Grammar for floor plan analysis | Type | Conference Article | ||
Year | 2015 | Publication | 13th International Conference on Document Analysis and Recognition ICDAR2015 | Abbreviated Journal | |
Volume | Issue | Pages | 726 - 730 | ||
Keywords | |||||
Abstract | In this paper, we propose the use of an Attributed Graph Grammar as a unified framework to model and recognize the structure of floor plans. This grammar represents a building as a hierarchical composition of structurally and semantically related elements, where common representations are learned stochastically from annotated data. Given an input image, the parsing consists in constructing the graph representation that best agrees with the probabilistic model defined by the grammar. The proposed method provides several advantages with respect to traditional floor plan analysis techniques. It uses an unsupervised statistical approach for detecting walls that adapts to different graphical notations and relaxes strong structural assumptions such as straightness and orthogonality. Moreover, the independence between the knowledge model and the parsing implementation allows the method to learn different building configurations automatically and thus to cope with the existing variability. These advantages are clearly demonstrated by comparing it with the most recent floor plan interpretation techniques on 4 datasets of real floor plans with different notations. | ||||
Address | Nancy; France; August 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.077; 600.061 | Approved | no | ||
Call Number | Admin @ si @ HRL2015b | Serial | 2727 | ||
Permanent link to this record | |||||
Author | Lluis Pere de las Heras; Oriol Ramos Terrades; Josep Llados; David Fernandez; Cristina Cañero | ||||
Title | Use case visual Bag-of-Words techniques for camera based identity document classification | Type | Conference Article | ||
Year | 2015 | Publication | 13th International Conference on Document Analysis and Recognition ICDAR2015 | Abbreviated Journal | |
Volume | Issue | Pages | 721 - 725 | ||
Keywords | |||||
Abstract | Nowadays, automatic identity document recognition, including passport and driving license recognition, is at the core of many applications within the administrative and service sectors, such as police, hospitality, car renting, etc. In former years, the document information was manually extracted, whereas today this data is recognized automatically from images obtained by flat-bed scanners. Yet, since these scanners tend to be expensive and voluminous, companies in the sector have recently turned their attention to cheaper, smaller, yet computationally powerful scanners: mobile devices. Identity document recognition from mobile images entails several new difficulties with respect to traditional scanned images, such as the loss of a controlled background, perspective distortion, blurring, etc. In this paper we present a real application for identity document classification of images taken from mobile devices. This classification step is of extreme importance, since prior knowledge of the document type and origin strongly facilitates the subsequent information extraction. The proposed method is based on a traditional Bag-of-Words approach in which we have taken several key aspects into consideration to enhance the recognition rate. The method's performance has been studied on three datasets containing more than 2000 images from 129 different document classes. | ||||
Address | Nancy; France; August 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.077; 600.061 | Approved | no | ||
Call Number | Admin @ si @ HRL2015a | Serial | 2726 | ||
Permanent link to this record | |||||
Author | Lluis Pere de las Heras; Oriol Ramos Terrades; Sergi Robles; Gemma Sanchez | ||||
Title | CVC-FP and SGT: a new database for structural floor plan analysis and its groundtruthing tool | Type | Journal Article | ||
Year | 2015 | Publication | International Journal on Document Analysis and Recognition | Abbreviated Journal | IJDAR |
Volume | 18 | Issue | 1 | Pages | 15-30 |
Keywords | |||||
Abstract | Recent results on structured learning methods have shown the impact of structural information in a wide range of pattern recognition tasks. In the field of document image analysis, there is long experience with structural methods for the analysis and information extraction of multiple types of documents. Yet, the lack of conveniently annotated and freely accessible databases has not benefited progress in some areas, such as technical drawing understanding. In this paper, we present a floor plan database, named CVC-FP, that is annotated for architectural objects and their structural relations. To construct this database, we have implemented a groundtruthing tool, the SGT tool, that allows this sort of information to be specified in a natural manner. The tool has been built for general-purpose groundtruthing: it allows users to define their own object classes and properties, supports multiple labeling options, enables cooperative work, and provides user and version control. Finally, we have collected some of the recent work on floor plan interpretation and present a quantitative benchmark for this database. Both the CVC-FP database and the SGT tool are freely released to the research community to ease comparisons between methods and boost reproducible research. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1433-2833 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; ADAS; 600.061; 600.076; 600.077 | Approved | no | ||
Call Number | Admin @ si @ HRR2015 | Serial | 2567 | ||
Permanent link to this record | |||||
Author | M. Campos-Taberner; Adriana Romero; Carlo Gatta; Gustavo Camps-Valls | ||||
Title | Shared feature representations of LiDAR and optical images: Trading sparsity for semantic discrimination | Type | Conference Article | ||
Year | 2015 | Publication | IEEE International Geoscience and Remote Sensing Symposium IGARSS2015 | Abbreviated Journal | |
Volume | Issue | Pages | 4169 - 4172 | ||
Keywords | |||||
Abstract | This paper studies the level of complementary information conveyed by extremely high resolution LiDAR and optical images. We pursue this goal following an indirect approach via unsupervised spatial-spectral feature extraction. We used a recently presented unsupervised convolutional neural network trained to enforce both population and lifetime sparsity in the feature representation. We derived independent and joint feature representations, and analyzed the sparsity scores and the discriminative power. Interestingly, the obtained results revealed that the RGB+LiDAR representation is no longer sparse, and the derived basis functions merge color and elevation, yielding a set of more expressive colored edge filters. The joint feature representation is also more discriminative when used for clustering and topological data visualization. | ||||
Address | Milan; Italy; July 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IGARSS | ||
Notes | LAMP; 600.079; MILAB | Approved | no | ||
Call Number | Admin @ si @ CRG2015 | Serial | 2724 | ||
Permanent link to this record | |||||
Author | M. Cruz; Cristhian A. Aguilera-Carrasco; Boris X. Vintimilla; Ricardo Toledo; Angel Sappa | ||||
Title | Cross-spectral image registration and fusion: an evaluation study | Type | Conference Article | ||
Year | 2015 | Publication | 2nd International Conference on Machine Vision and Machine Learning | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | multispectral imaging; image registration; data fusion; infrared and visible spectra | ||||
Abstract | This paper presents a preliminary study on the registration and fusion of cross-spectral imaging. The objective is to evaluate the validity of widely used computer vision approaches when they are applied at different spectral bands. In particular, we are interested in merging images from the infrared (both long wave infrared: LWIR and near infrared: NIR) and visible spectrum (VS). Experimental results with different data sets are presented. | ||||
Address | Barcelona; July 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | MVML | ||
Notes | ADAS; 600.076 | Approved | no | ||
Call Number | Admin @ si @ CAV2015 | Serial | 2629 | ||
Permanent link to this record | |||||
Author | Maedeh Aghaei; Mariella Dimiccoli; Petia Radeva | ||||
Title | Towards social interaction detection in egocentric photo-streams | Type | Conference Article | ||
Year | 2015 | Publication | Proceedings of SPIE, 8th International Conference on Machine Vision , ICMV 2015 | Abbreviated Journal | |
Volume | 9875 | Issue | Pages | ||
Keywords | |||||
Abstract | Detecting social interaction in videos relying solely on visual cues is a valuable task that has received increasing attention in recent years. In this work, we address this problem in the challenging domain of egocentric photo-streams captured by a low temporal resolution wearable camera (2 fpm). The major difficulties to be handled in this context are the sparsity of observations as well as the unpredictability of camera motion and attention orientation, due to the fact that the camera is worn as part of clothing. Our method consists of four steps: multi-face localization and tracking, 3D localization, pose estimation, and analysis of f-formations. By estimating pair-to-pair interaction probabilities over the sequence, our method determines the presence or absence of interaction with the camera wearer and specifies which people are more involved in the interaction. We tested our method over a dataset of 18,000 images and show its reliability for the intended purpose. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICMV | ||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ ADR2015a | Serial | 2702 | ||
Permanent link to this record | |||||
Author | Maedeh Aghaei; Mariella Dimiccoli; Petia Radeva | ||||
Title | Multi-Face Tracking by Extended Bag-of-Tracklets in Egocentric Videos | Type | Miscellaneous | ||
Year | 2015 | Publication | Arxiv | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Egocentric images offer a hands-free way to record daily experiences and special events, where social interactions are of special interest. A natural question that arises is how to extract and track the appearance of multiple persons in a social event captured by a wearable camera. In this paper, we propose a novel method to find correspondences of multiple faces in low temporal resolution egocentric sequences acquired through a wearable camera. This kind of sequence imposes additional challenges on the multi-tracking problem with respect to conventional videos. Due to the free motion of the camera and its low temporal resolution (2 fpm), abrupt changes in the field of view, in illumination conditions, and in the target location are very frequent. To overcome this difficulty, we propose to generate, for each detected face, a set of correspondences along the whole sequence that we call a tracklet, and to take advantage of their redundancy to deal with both false positive face detections and unreliable tracklets. Similar tracklets are grouped into a so-called extended bag-of-tracklets (eBoT), which is aimed to correspond to a specific person. Finally, a prototype tracklet is extracted for each eBoT. We validated our method over a dataset of 18,000 images from 38 egocentric sequences with 52 trackable persons and compared it with state-of-the-art methods, demonstrating its effectiveness and robustness. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ ADR2015b | Serial | 2713 | ||
Permanent link to this record | |||||
Author | Manuel Graña; Bogdan Raducanu | ||||
Title | Special Issue on Bioinspired and knowledge based techniques and applications | Type | Journal Article | ||
Year | 2015 | Publication | Neurocomputing | Abbreviated Journal | NEUCOM |
Volume | Issue | Pages | 1-3 | ||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP | Approved | no | ||
Call Number | Admin @ si @ GrR2015 | Serial | 2598 | ||
Permanent link to this record | |||||
Author | Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Llados | ||||
Title | Efficient segmentation-free keyword spotting in historical document collections | Type | Journal Article | ||
Year | 2015 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 48 | Issue | 2 | Pages | 545–555 |
Keywords | Historical documents; Keyword spotting; Segmentation-free; Dense SIFT features; Latent semantic analysis; Product quantization | ||||
Abstract | In this paper we present an efficient segmentation-free word spotting method, applied in the context of historical document collections, that follows the query-by-example paradigm. We use a patch-based framework where local patches are described by a bag-of-visual-words model powered by SIFT descriptors. By projecting the patch descriptors to a topic space with the latent semantic analysis technique and compressing the descriptors with the product quantization method, we are able to efficiently index the document information both in terms of memory and time. The proposed method is evaluated on four different collections of historical documents, achieving good performance in both handwritten and typewritten scenarios and outperforming recent state-of-the-art keyword spotting approaches. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; ADAS; 600.076; 600.077; 600.061; 601.223; 602.006; 600.055 | Approved | no | ||
Call Number | Admin @ si @ RAT2015a | Serial | 2544 | ||
Permanent link to this record | |||||
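The product-quantization compression mentioned in the abstract above encodes a descriptor as one codeword index per sub-vector. The sketch below illustrates the encoding step only; the function name and the use of tiny pre-trained sub-codebooks are illustrative assumptions, not the paper's implementation:

```python
def pq_encode(vec, codebooks):
    """Encode a descriptor as one codeword index per sub-vector.

    codebooks: list of m sub-codebooks; codebooks[i] is a list of
    centroids (each a list of floats) for the i-th sub-vector.
    """
    m = len(codebooks)
    d = len(vec) // m  # dimensionality of each sub-vector
    codes = []
    for i, centroids in enumerate(codebooks):
        sub = vec[i * d:(i + 1) * d]
        # assign the sub-vector to its nearest centroid (squared Euclidean)
        codes.append(min(
            range(len(centroids)),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(sub, centroids[c]))
        ))
    return codes
```

Storing only these small integer codes instead of the full SIFT-based descriptors is what yields the memory savings the abstract refers to.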
Author | Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Llados | ||||
Title | Towards Query-by-Speech Handwritten Keyword Spotting | Type | Conference Article | ||
Year | 2015 | Publication | 13th International Conference on Document Analysis and Recognition ICDAR2015 | Abbreviated Journal | |
Volume | Issue | Pages | 501-505 | ||
Keywords | |||||
Abstract | In this paper, we present a new querying paradigm for handwritten keyword spotting. We propose to represent handwritten word images by both visual and audio representations, enabling a query-by-speech keyword spotting system. The two representations are merged together and projected to a common sub-space in the training phase. This transform makes it possible, given a spoken query, to retrieve word instances that were only represented by the visual modality. In addition, the same method can be used backwards at no additional cost to produce a handwritten text-to-speech system. We present our first results on this new querying mechanism using synthetic voices over the George Washington dataset. | ||||
Address | Nancy; France; August 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.084; 600.061; 601.223; 600.077; ADAS | Approved | no | ||
Call Number | Admin @ si @ RAT2015b | Serial | 2682 | ||
Permanent link to this record | |||||
Author | Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados | ||||
Title | Automatic Verification of Properly Signed Multi-page Document Images | Type | Conference Article | ||
Year | 2015 | Publication | Proceedings of the Eleventh International Symposium on Visual Computing | Abbreviated Journal | |
Volume | 9475 | Issue | Pages | 327-336 | |
Keywords | Document Image; Manual Inspection; Signature Verification; Rejection Criterion; Document Flow | ||||
Abstract | In this paper we present an industrial application for the automatic screening of incoming multi-page documents in a banking workflow, aimed at determining whether these documents are properly signed or not. The proposed method is divided into three main steps. First, individual pages are classified in order to identify those that should contain a signature. In the second step, we segment within those key pages the locations where the signatures should appear. The last step checks whether the signatures are present or not. Our method is tested in a real large-scale environment, and we report the results of checking two different types of real multi-page contracts, totaling more than 14,500 pages. | ||||
Address | Las Vegas, Nevada, USA; December 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | 9475 | Series Issue | Edition | ||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ISVC | ||
Notes | DAG; 600.077 | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3189 | ||
Permanent link to this record |