Records | |||||
---|---|---|---|---|---|
Author | Pau Riba; Josep Llados; Alicia Fornes | ||||
Title | Handwritten Word Spotting by Inexact Matching of Grapheme Graphs | Type | Conference Article | ||
Year | 2015 | Publication | 13th International Conference on Document Analysis and Recognition ICDAR2015 | Abbreviated Journal | |
Volume | Issue | Pages | 781 - 785 | ||
Keywords | |||||
Abstract | This paper presents a graph-based word spotting for handwritten documents. Contrary to most word spotting techniques, which use statistical representations, we propose a structural representation suitable to be robust to the inherent deformations of handwriting. Attributed graphs are constructed using a part-based approach. Graphemes extracted from shape convexities are used as stable units of handwriting, and are associated to graph nodes. Then, spatial relations between them determine graph edges. Spotting is defined in terms of an error-tolerant graph matching using bipartite-graph matching algorithm. To make the method usable in large datasets, a graph indexing approach that makes use of binary embeddings is used as preprocessing. Historical documents are used as experimental framework. The approach is comparable to statistical ones in terms of time and memory requirements, especially when dealing with large document collections. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.077; 600.061; 602.006 | Approved | no | ||
Call Number | Admin @ si @ RLF2015b | Serial | 2642 | ||
Permanent link to this record | |||||
Author | Pau Riba; Josep Llados; Alicia Fornes; Anjan Dutta | ||||
Title | Large-scale Graph Indexing using Binary Embeddings of Node Contexts | Type | Conference Article | ||
Year | 2015 | Publication | 10th IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition | Abbreviated Journal | |
Volume | 9069 | Issue | Pages | 208-217 | |
Keywords | Graph matching; Graph indexing; Application in document analysis; Word spotting; Binary embedding | ||||
Abstract | Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations in terms of feature vectors. Retrieving a query graph from a large dataset of graphs has the drawback of the high computational complexity required to compare the query and the target graphs. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. In this paper we propose a fast indexation formalism for graph retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Hence, each attribute counts the length of a walk of order k originated in a vertex with label l. Each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in a handwritten word spotting scenario in images of historical documents. | ||||
Address | Beijing; China; May 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer International Publishing | Place of Publication | Editor | C.-L. Liu; B. Luo; W.G. Kropatsch; J. Cheng | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-319-18223-0 | Medium | |
Area | Expedition | Conference | GbRPR | ||
Notes | DAG; 600.061; 602.006; 600.077 | Approved | no | ||
Call Number | Admin @ si @ RLF2015a | Serial | 2618 | ||
Permanent link to this record | |||||
Author | Dennis G. Romero; Anselmo Frizera; Angel Sappa; Boris X. Vintimilla; Teodiano F. Bastos | ||||
Title | A predictive model for human activity recognition by observing actions and context | Type | Conference Article | ||
Year | 2015 | Publication | Advanced Concepts for Intelligent Vision Systems, Proceedings of 16th International Conference, ACIVS 2015 | Abbreviated Journal | |
Volume | 9386 | Issue | Pages | 323-333 | |
Keywords | |||||
Abstract | This paper presents a novel model to estimate human activities — a human activity is defined by a set of human actions. The proposed approach is based on the usage of Recurrent Neural Networks (RNN) and Bayesian inference through the continuous monitoring of human actions and its surrounding environment. In the current work human activities are inferred considering not only visual analysis but also additional resources; external sources of information, such as context information, are incorporated to contribute to the activity estimation. The novelty of the proposed approach lies in the way the information is encoded, so that it can be later associated according to a predefined semantic structure. Hence, a pattern representing a given activity can be defined by a set of actions, plus contextual information or other kind of information that could be relevant to describe the activity. Experimental results with real data are provided showing the validity of the proposed approach. | ||||
Address | Catania; Italy; October 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer International Publishing | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-319-25902-4 | Medium | |
Area | Expedition | Conference | ACIVS | ||
Notes | ADAS; 600.076 | Approved | no | ||
Call Number | Admin @ si @ RFS2015 | Serial | 2661 | ||
Permanent link to this record | |||||
Author | Marçal Rusiñol; J. Chazalon; Jean-Marc Ogier; Josep Llados | ||||
Title | A Comparative Study of Local Detectors and Descriptors for Mobile Document Classification | Type | Conference Article | ||
Year | 2015 | Publication | 13th International Conference on Document Analysis and Recognition ICDAR2015 | Abbreviated Journal | |
Volume | Issue | Pages | 596-600 | ||
Keywords | |||||
Abstract | In this paper we conduct a comparative study of local key-point detectors and local descriptors for the specific task of mobile document classification. A classification architecture based on direct matching of local descriptors is used as baseline for the comparative study. A set of four different key-point detectors and four different local descriptors are tested in all the possible combinations. The experiments are conducted in a database consisting of 30 model documents acquired on 6 different backgrounds, totaling more than 36,000 test images. | ||||
Address | Nancy; France; August 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.084; 600.61; 601.223; 600.077 | Approved | no | ||
Call Number | Admin @ si @ RCO2015 | Serial | 2684 | ||
Permanent link to this record | |||||
Author | Adriana Romero; Nicolas Ballas; Samira Ebrahimi Kahou; Antoine Chassang; Carlo Gatta; Yoshua Bengio | ||||
Title | FitNets: Hints for Thin Deep Nets | Type | Conference Article | ||
Year | 2015 | Publication | 3rd International Conference on Learning Representations ICLR2015 | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Computer Science; Learning; Computer Science; Neural and Evolutionary Computing | ||||
Abstract | While depth tends to improve network performances, it also makes gradient-based training more difficult since deeper networks tend to be more non-linear. The recently proposed knowledge distillation approach is aimed at obtaining small and fast-to-execute models, and it has shown that a student network could imitate the soft output of a larger teacher network or ensemble of networks. In this paper, we extend this idea to allow the training of a student that is deeper and thinner than the teacher, using not only the outputs but also the intermediate representations learned by the teacher as hints to improve the training process and final performance of the student. Because the student intermediate hidden layer will generally be smaller than the teacher's intermediate hidden layer, additional parameters are introduced to map the student hidden layer to the prediction of the teacher hidden layer. This allows one to train deeper students that can generalize better or run faster, a trade-off that is controlled by the chosen student capacity. For example, on CIFAR-10, a deep student network with almost 10.4 times less parameters outperforms a larger, state-of-the-art teacher network. | ||||
Address | San Diego; CA; May 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICLR | ||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ RBK2015 | Serial | 2593 | ||
Permanent link to this record | |||||
Author | Bogdan Raducanu; Alireza Bosaghzadeh; Fadi Dornaika | ||||
Title | Multi-observation Face Recognition in Videos based on Label Propagation | Type | Conference Article | ||
Year | 2015 | Publication | 6th Workshop on Analysis and Modeling of Faces and Gestures AMFG2015 | Abbreviated Journal | |
Volume | Issue | Pages | 10-17 | ||
Keywords | |||||
Abstract | In order to deal with the huge amount of content generated by social media, especially for indexing and retrieval purposes, the focus shifted from single object recognition to multi-observation object recognition. Of particular interest is the problem of face recognition (used as primary cue for persons’ identity assessment), since it is highly required by popular social media search engines like Facebook and YouTube. Recently, several approaches for graph-based label propagation were proposed. However, the associated graphs were constructed in an ad-hoc manner (e.g., using the KNN graph) that cannot cope properly with the rapid and frequent changes in data appearance, a phenomenon intrinsically related with video sequences. In this paper, we propose a novel approach for efficient and adaptive graph construction, based on a two-phase scheme: (i) the first phase is used to adaptively find the neighbors of a sample and also to find the adequate weights for the minimization function of the second phase; (ii) in the second phase, the selected neighbors along with their corresponding weights are used to locally and collaboratively estimate the sparse affinity matrix weights. Experimental results performed on Honda Video Database (HVDB) and a subset of video sequences extracted from the popular TV-series ’Friends’ show a distinct advantage of the proposed method over the existing standard graph construction methods. | ||||
Address | Boston; USA; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | LAMP; 600.068; 600.072; | Approved | no | ||
Call Number | Admin @ si @ RBD2015 | Serial | 2627 | ||
Permanent link to this record | |||||
Author | Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Llados | ||||
Title | Towards Query-by-Speech Handwritten Keyword Spotting | Type | Conference Article | ||
Year | 2015 | Publication | 13th International Conference on Document Analysis and Recognition ICDAR2015 | Abbreviated Journal | |
Volume | Issue | Pages | 501-505 | ||
Keywords | |||||
Abstract | In this paper, we present a new querying paradigm for handwritten keyword spotting. We propose to represent handwritten word images both by visual and audio representations, enabling a query-by-speech keyword spotting system. The two representations are merged together and projected to a common sub-space in the training phase. This transform makes it possible, given a spoken query, to retrieve word instances that were only represented by the visual modality. In addition, the same method can be used backwards at no additional cost to produce a handwritten text-to-speech system. We present our first results on this new querying mechanism using synthetic voices over the George Washington dataset. | ||||
Address | Nancy; France; August 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.084; 600.061; 601.223; 600.077; ADAS | Approved | no | ||
Call Number | Admin @ si @ RAT2015b | Serial | 2682 | ||
Permanent link to this record | |||||
Author | Marçal Rusiñol; David Aldavert; Ricardo Toledo; Josep Llados | ||||
Title | Efficient segmentation-free keyword spotting in historical document collections | Type | Journal Article | ||
Year | 2015 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 48 | Issue | 2 | Pages | 545–555 |
Keywords | Historical documents; Keyword spotting; Segmentation-free; Dense SIFT features; Latent semantic analysis; Product quantization | ||||
Abstract | In this paper we present an efficient segmentation-free word spotting method, applied in the context of historical document collections, that follows the query-by-example paradigm. We use a patch-based framework where local patches are described by a bag-of-visual-words model powered by SIFT descriptors. By projecting the patch descriptors to a topic space with the latent semantic analysis technique and compressing the descriptors with the product quantization method, we are able to efficiently index the document information both in terms of memory and time. The proposed method is evaluated using four different collections of historical documents achieving good performances on both handwritten and typewritten scenarios. The yielded performances outperform the recent state-of-the-art keyword spotting approaches. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; ADAS; 600.076; 600.077; 600.061; 601.223; 602.006; 600.055 | Approved | no | ||
Call Number | Admin @ si @ RAT2015a | Serial | 2544 | ||
Permanent link to this record | |||||
Author | Marco Pedersoli; Andrea Vedaldi; Jordi Gonzalez; Xavier Roca | ||||
Title | A coarse-to-fine approach for fast deformable object detection | Type | Journal Article | ||
Year | 2015 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 48 | Issue | 5 | Pages | 1844-1853 |
Keywords | |||||
Abstract | We present a method that can dramatically accelerate object detection with part based models. The method is based on the observation that the cost of detection is likely to be dominated by the cost of matching each part to the image, and not by the cost of computing the optimal configuration of the parts as commonly assumed. Therefore accelerating detection requires minimizing the number of part-to-image comparisons. To this end we propose a multiple-resolutions hierarchical part based model and a corresponding coarse-to-fine inference procedure that recursively eliminates from the search space unpromising part placements. The method yields a ten-fold speedup over the standard dynamic programming approach and is complementary to the cascade-of-parts approach of [9]. Compared to the latter, our method does not have parameters to be determined empirically, which simplifies its use during the training of the model. Most importantly, the two techniques can be combined to obtain a very significant speedup, of two orders of magnitude in some cases. We evaluate our method extensively on the PASCAL VOC and INRIA datasets, demonstrating a very high increase in the detection speed with little degradation of the accuracy. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE; 600.078; 602.005; 605.001; 302.012 | Approved | no | ||
Call Number | Admin @ si @ PVG2015 | Serial | 2628 | ||
Permanent link to this record | |||||
Author | Monica Piñol; Angel Sappa; Ricardo Toledo | ||||
Title | Adaptive Feature Descriptor Selection based on a Multi-Table Reinforcement Learning Strategy | Type | Journal Article | ||
Year | 2015 | Publication | Neurocomputing | Abbreviated Journal | NEUCOM |
Volume | 150 | Issue | A | Pages | 106–115 |
Keywords | Reinforcement learning; Q-learning; Bag of features; Descriptors | ||||
Abstract | This paper presents and evaluates a framework to improve the performance of visual object classification methods, which are based on the usage of image feature descriptors as inputs. The goal of the proposed framework is to learn the best descriptor for each image in a given database. This goal is reached by means of a reinforcement learning process using the minimum information. The visual classification system used to demonstrate the proposed framework is based on a bag of features scheme, and the reinforcement learning technique is implemented through the Q-learning approach. The behavior of the reinforcement learning with different state definitions is evaluated. Additionally, a method that combines all these states is formulated in order to select the optimal state. Finally, the chosen actions are obtained from the best set of image descriptors in the literature: PHOW, SIFT, C-SIFT, SURF and Spin. Experimental results using two public databases (ETH and COIL) are provided showing both the validity of the proposed approach and comparisons with state of the art. In all the cases the best results are obtained with the proposed approach. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; 600.055; 600.076 | Approved | no | ||
Call Number | Admin @ si @ PST2015 | Serial | 2473 | ||
Permanent link to this record | |||||
Author | Olivier Penacchio; Xavier Otazu; A. Wilkins; J. Harris | ||||
Title | Uncomfortable images prevent lateral interactions in the cortex from providing a sparse code | Type | Conference Article | ||
Year | 2015 | Publication | European Conference on Visual Perception ECVP2015 | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Liverpool; UK; August 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ECVP | ||
Notes | NEUROBIT; | Approved | no | ||
Call Number | Admin @ si @ POW2015 | Serial | 2633 | ||
Permanent link to this record | |||||
Author | Victor Ponce; Sergio Escalera; Marc Perez; Oriol Janes; Xavier Baro | ||||
Title | Non-Verbal Communication Analysis in Victim-Offender Mediations | Type | Journal Article | ||
Year | 2015 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 67 | Issue | 1 | Pages | 19-27 |
Keywords | Victim–Offender Mediation; Multi-modal human behavior analysis; Face and gesture recognition; Social signal processing; Computer vision; Machine learning | ||||
Abstract | We present a non-invasive ambient intelligence framework for the semi-automatic analysis of non-verbal communication applied to the restorative justice field. We propose the use of computer vision and social signal processing technologies in real scenarios of Victim–Offender Mediations, applying feature extraction techniques to multi-modal audio-RGB-depth data. We compute a set of behavioral indicators that define communicative cues from the fields of psychology and observational methodology. We test our methodology on data captured in real Victim–Offender Mediation sessions in Catalonia. We define the ground truth based on expert opinions when annotating the observed social responses. Using different state of the art binary classification approaches, our system achieves recognition accuracies of 86% when predicting satisfaction, and 79% when predicting both agreement and receptivity. Applying a regression strategy, we obtain a mean deviation for the predictions between 0.5 and 0.7 in the range [1–5] for the computed social signals. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; MV | Approved | no | ||
Call Number | Admin @ si @ PEP2015 | Serial | 2583 | ||
Permanent link to this record | |||||
Author | Eloi Puertas; Sergio Escalera; Oriol Pujol | ||||
Title | Generalized Multi-scale Stacked Sequential Learning for Multi-class Classification | Type | Journal Article | ||
Year | 2015 | Publication | Pattern Analysis and Applications | Abbreviated Journal | PAA |
Volume | 18 | Issue | 2 | Pages | 247-261 |
Keywords | Stacked sequential learning; Multi-scale; Error-correct output codes (ECOC); Contextual classification | ||||
Abstract | In many classification problems, neighbor data labels have inherent sequential relationships. Sequential learning algorithms take benefit of these relationships in order to improve generalization. In this paper, we revise the multi-scale sequential learning approach (MSSL) for applying it in the multi-class case (MMSSL). We introduce the error-correcting output codes framework in the MSSL classifiers and propose a formulation for calculating confidence maps from the margins of the base classifiers. In addition, we propose a MMSSL compression approach which reduces the number of features in the extended data set without a loss in performance. The proposed methods are tested on several databases, showing significant performance improvement compared to classical approaches. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer-Verlag | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1433-7541 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | HuPBA; MILAB | Approved | no | ||
Call Number | Admin @ si @ PEP2013 | Serial | 2251 | ||
Permanent link to this record | |||||
Author | Victor Ponce; Hugo Jair Escalante; Sergio Escalera; Xavier Baro | ||||
Title | Gesture and Action Recognition by Evolved Dynamic Subgestures | Type | Conference Article | ||
Year | 2015 | Publication | 26th British Machine Vision Conference | Abbreviated Journal | |
Volume | Issue | Pages | 129.1-129.13 | ||
Keywords | |||||
Abstract | This paper introduces a framework for gesture and action recognition based on the evolution of temporal gesture primitives, or subgestures. Our work is inspired by the principle of producing genetic variations within a population of gesture subsequences, with the goal of obtaining a set of gesture units that enhance the generalization capability of standard gesture recognition approaches. In our context, gesture primitives are evolved over time using dynamic programming and generative models in order to recognize complex actions. In a few generations, the proposed subgesture-based representation of actions and gestures outperforms the state of the art results on the MSRDaily3D and MSRAction3D datasets. | ||||
Address | Swansea; UK; September 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | BMVC | ||
Notes | HuPBA; MV | Approved | no | ||
Call Number | Admin @ si @ PEE2015 | Serial | 2657 | ||
Permanent link to this record | |||||
Author | C. Alejandro Parraga | ||||
Title | Perceptual Psychophysics | Type | Book Chapter | ||
Year | 2015 | Publication | Biologically-Inspired Computer Vision: Fundamentals and Applications | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | G. Cristobal; M. Keil; L. Perrinet | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-527-41264-8 | Medium | ||
Area | Expedition | Conference | |||
Notes | CIC; 600.074 | Approved | no | ||
Call Number | Admin @ si @ Par2015 | Serial | 2600 | ||
Permanent link to this record |