Author Michal Drozdzal
  Title Sequential image analysis for computer-aided wireless endoscopy Type Book Whole
  Year 2014 Publication PhD Thesis, Universitat de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Wireless Capsule Endoscopy (WCE) is a technique for inner-visualization of the entire small intestine and, thus, offers an interesting perspective on intestinal motility. The two major drawbacks of this technique are: 1) the huge amount of data acquired by WCE makes the motility analysis tedious and 2) since the capsule is the first tool that offers complete inner-visualization of the small intestine, the exact importance of the observed events is still an open issue. Therefore, in this thesis, a novel computer-aided system for intestinal motility analysis is presented. The goal of the system is to provide an easily comprehensible visual description of motility-related intestinal events to a physician. In order to do so, several tools based either on computer vision concepts or on machine learning techniques are presented. A method for transforming the 3D video signal into a holistic image of intestinal motility, called a motility bar, is proposed. The method calculates the optimal mapping from video to image from the intestinal motility point of view.
To characterize intestinal motility, methods for the automatic extraction of motility information from WCE are presented. Two of them are based on the motility bar and two on frame-by-frame analysis. In particular, four algorithms dealing with the problems of intestinal contraction detection, lumen size estimation, intestinal content characterization and wrinkle frame detection are proposed and validated. The results of the algorithms are converted into sequential features using an online statistical test designed to work with multivariate data streams. To this end, we propose a novel formulation of a concentration inequality that is introduced into a robust adaptive windowing algorithm for multivariate data streams. The algorithm is used to obtain a robust representation of segments with constant intestinal motility activity. The obtained sequential features are shown to be discriminative in the problem of abnormal motility characterization.
Finally, we tackle the problem of efficient labeling. To this end, we incorporate active learning concepts into the problems present in WCE data and propose two approaches. The first one is based on the concepts of sequential learning, and the second adapts partition-based active learning to an error-free labeling scheme. All these steps suffice to provide an extensive visual description of intestinal motility that an expert can use as a decision-support system.
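A minimal sketch of the adaptive-windowing idea described above, in the spirit of ADWIN-style change detectors. This is only an illustration with a univariate Hoeffding-style bound, not the thesis's multivariate formulation; the function name and the `delta` parameter are our own:

```python
import numpy as np

def adaptive_window_segments(samples, delta=0.05):
    """Cut the current window whenever two of its sub-windows have means
    that differ by more than a Hoeffding-style concentration bound,
    yielding (start, end) indices of constant-activity segments."""
    window, segments, start = [], [], 0
    for t, x in enumerate(samples):
        window.append(float(x))
        for split in range(1, len(window)):
            old = np.array(window[:split])
            recent = np.array(window[split:])
            m = 1.0 / (1.0 / len(old) + 1.0 / len(recent))   # harmonic size
            eps = np.sqrt(np.log(4.0 / delta) / (2.0 * m))   # confidence radius
            if abs(old.mean() - recent.mean()) > eps:
                segments.append((start, t - len(recent)))    # close old segment
                window, start = list(recent), t - len(recent) + 1
                break
    segments.append((start, len(samples) - 1))               # trailing segment
    return segments
```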
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Petia Radeva  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-940902-3-3 Medium  
  Area Expedition Conference  
  Notes MILAB Approved no  
  Call Number Admin @ si @ Dro2014 Serial 2486  
 

 
Author Fahad Shahbaz Khan; Joost Van de Weijer; Muhammad Anwer Rao; Michael Felsberg; Carlo Gatta
  Title Semantic Pyramids for Gender and Action Recognition Type Journal Article
  Year 2014 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 23 Issue 8 Pages 3633-3645  
  Keywords  
  Abstract Person description is a challenging problem in computer vision. We investigated two major aspects of person description: 1) gender and 2) action recognition in still images. Most state-of-the-art approaches for gender and action recognition rely on the description of a single body part, such as face or full-body. However, relying on a single body part is suboptimal due to significant variations in scale, viewpoint, and pose in real-world images. This paper proposes a semantic pyramid approach for pose normalization. Our approach is fully automatic and based on combining information from full-body, upper-body, and face regions for gender and action recognition in still images. The proposed approach does not require any annotations for upper-body and face of a person. Instead, we rely on pretrained state-of-the-art upper-body and face detectors to automatically extract semantic information of a person. Given multiple bounding boxes from each body part detector, we then propose a simple method to select the best candidate bounding box, which is used for feature extraction. Finally, the extracted features from the full-body, upper-body, and face regions are combined into a single representation for classification. To validate the proposed approach for gender recognition, experiments are performed on three large data sets namely: 1) human attribute; 2) head-shoulder; and 3) proxemics. For action recognition, we perform experiments on four data sets most used for benchmarking action recognition in still images: 1) Sports; 2) Willow; 3) PASCAL VOC 2010; and 4) Stanford-40. Our experiments clearly demonstrate that the proposed approach, despite its simplicity, outperforms state-of-the-art methods for gender and action recognition.  
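The region-combination step the abstract describes lends itself to a short sketch. Below, `extract` stands in for any feature extractor and the detector outputs are hypothetical `(score, box)` pairs; this is only a hedged illustration of the late fusion of full-body, upper-body and face features, not the authors' code:

```python
import numpy as np

def best_box(detections):
    """Select the highest-scoring candidate box from a part detector;
    `detections` is a list of (score, box) pairs, possibly empty."""
    return max(detections, key=lambda d: d[0])[1] if detections else None

def semantic_pyramid_descriptor(image, full_box, upper_dets, face_dets, extract):
    """Concatenate features from the full-body, upper-body and face
    regions; a missing part falls back to the enclosing region."""
    upper = best_box(upper_dets)
    if upper is None:
        upper = full_box
    face = best_box(face_dets)
    if face is None:
        face = upper
    return np.concatenate([extract(image, b) for b in (full_box, upper, face)])
```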
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1057-7149 ISBN Medium  
  Area Expedition Conference  
  Notes CIC; LAMP; 601.160; 600.074; 600.079;MILAB Approved no  
  Call Number Admin @ si @ KWR2014 Serial 2507  
 

 
Author Jon Almazan; Albert Gordo; Alicia Fornes; Ernest Valveny
  Title Segmentation-free Word Spotting with Exemplar SVMs Type Journal Article
  Year 2014 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 47 Issue 12 Pages 3967–3978  
  Keywords Word spotting; Segmentation-free; Unsupervised learning; Reranking; Query expansion; Compression  
  Abstract In this paper we propose an unsupervised segmentation-free method for word spotting in document images. Documents are represented with a grid of HOG descriptors, and a sliding-window approach is used to locate the document regions that are most similar to the query. We use the Exemplar SVM framework to produce a better representation of the query in an unsupervised way. Then, we use a more discriminative representation based on Fisher Vector to rerank the best regions retrieved, and the most promising ones are used to expand the Exemplar SVM training set and improve the query representation. Finally, the document descriptors are precomputed and compressed with Product Quantization. This offers two advantages: first, a large number of documents can be kept in RAM at the same time. Second, the sliding window becomes significantly faster since distances between quantized HOG descriptors can be precomputed. Our results significantly outperform other segmentation-free methods in the literature, both in accuracy and in speed and memory usage.
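The Product Quantization step mentioned at the end of the abstract is easy to illustrate. The sketch below assumes the sub-space codebooks were already learned (e.g. with k-means); names and shapes are ours, and it shows only the asymmetric distance-table trick that makes the sliding window fast:

```python
import numpy as np

def pq_encode(X, codebooks):
    """Quantize descriptors: split each row of X into sub-vectors and
    store the index of the nearest codeword per sub-space.
    `codebooks` is a list of (k, d_sub) arrays."""
    d = X.shape[1] // len(codebooks)
    return np.stack(
        [((X[:, s*d:(s+1)*d][:, None] - C[None]) ** 2).sum(-1).argmin(1)
         for s, C in enumerate(codebooks)], axis=1)

def pq_distances(query, codes, codebooks):
    """Asymmetric distances: build one query-to-codeword table per
    sub-space once, then score every stored vector by lookups only."""
    d = len(query) // len(codebooks)
    tables = [((query[s*d:(s+1)*d] - C) ** 2).sum(-1)
              for s, C in enumerate(codebooks)]
    return sum(t[codes[:, s]] for s, t in enumerate(tables))
```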
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.045; 600.056; 600.061; 602.006; 600.077 Approved no  
  Call Number Admin @ si @ AGF2014b Serial 2485  
 

 
Author Lluis Gomez; Dimosthenis Karatzas
  Title Scene Text Recognition: No Country for Old Men? Type Conference Article
  Year 2014 Publication 1st International Workshop on Robust Reading Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference IWRR  
  Notes DAG; 600.077 Approved no  
  Call Number Admin @ si @ GoK2014c Serial 2538  
 

 
Author Fahad Shahbaz Khan; Joost Van de Weijer; Andrew Bagdanov; Michael Felsberg
  Title Scale Coding Bag-of-Words for Action Recognition Type Conference Article
  Year 2014 Publication 22nd International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 1514-1519  
  Keywords  
  Abstract Recognizing human actions in still images is a challenging problem in computer vision due to significant scale, illumination and pose variation. Given the bounding box of a person at both training and test time, the task is to classify the action associated with each bounding box in an image.
Most state-of-the-art methods use the bag-of-words paradigm for action recognition. The bag-of-words framework employing a dense multi-scale grid sampling strategy is the de facto standard for feature detection. This results in a scale-invariant image representation where all the features at multiple scales are binned in a single histogram. We argue that such a scale-invariant strategy is sub-optimal since it ignores the multi-scale information available with each bounding box of a person.
This paper investigates alternative approaches to scale coding for action recognition in still images. We encode multi-scale information explicitly in three different histograms for small, medium and large scale visual words. Our first approach exploits multi-scale information with respect to the image size. In our second approach, we encode multi-scale information relative to the size of the bounding box of a person instance. In each approach, the multi-scale histograms are then concatenated into a single representation for action classification. We validate our approaches on the Willow dataset, which contains seven action categories: interacting with computer, photography, playing music, riding bike, riding horse, running and walking. Our results clearly suggest that the proposed scale coding approaches outperform the conventional scale-invariant technique. Moreover, we show that our approach obtains promising results compared to more complex state-of-the-art methods.
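A hedged sketch of the relative scale-coding variant described above (the thresholds, names and normalization below are illustrative choices, not the paper's exact parameters):

```python
import numpy as np

def scale_coded_bow(words, patch_sizes, vocab_size, box_height,
                    edges=(0.1, 0.25)):
    """Build three concatenated histograms for small/medium/large
    visual words. `words` holds one visual-word index per local
    feature and `patch_sizes` the sampling scale of each feature;
    dividing by the person box height gives the relative coding."""
    rel = np.asarray(patch_sizes, dtype=float) / box_height
    band = np.digitize(rel, edges)        # 0 = small, 1 = medium, 2 = large
    hist = np.zeros((3, vocab_size))
    for w, b in zip(words, band):
        hist[b, w] += 1.0
    total = hist.sum()
    return hist.ravel() / total if total else hist.ravel()
```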
 
  Address Stockholm; August 2014  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes CIC; LAMP; 601.240; 600.074; 600.079 Approved no  
  Call Number Admin @ si @ KWB2014 Serial 2450  
 

 
Author Lluis Pere de las Heras; David Fernandez; Alicia Fornes; Ernest Valveny; Gemma Sanchez; Josep Llados
  Title Runlength Histogram Image Signature for Perceptual Retrieval of Architectural Floor Plans Type Book Chapter
  Year 2014 Publication Graphics Recognition. Current Trends and Challenges Abbreviated Journal  
  Volume 8746 Issue Pages 135-146  
  Keywords Graphics recognition; Graphics retrieval; Image classification  
  Abstract This paper proposes a runlength histogram signature as a perceptual descriptor of architectural plans in a retrieval scenario. The style of an architectural drawing is characterized by the perception of lines, shapes and texture. Such visual stimuli are the basis for defining semantic concepts such as space properties, symmetry, density, etc. We propose runlength histograms extracted in the vertical, horizontal and diagonal directions as a characterization of line and space properties in floor plans, so they can be roughly associated with a description of walls and room structure. A retrieval application illustrates the performance of the proposed approach, where, given a plan as a query, similar ones are obtained from a database. A ground truth based on human observation has been constructed to validate the hypothesis. Additional retrieval results on sketched building facades are reported qualitatively. The good description the signature provides and its adaptability to two different sketch drawing types, despite its simplicity, show the interest of the proposed approach and open a challenging research line in graphics recognition.
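The descriptor itself is simple enough to sketch. The following assumes a binary image with background pixels set to True; the maximum run length and the normalization are our illustrative choices, not the chapter's exact settings:

```python
import numpy as np

def runlengths_1d(line, max_len=64):
    """Histogram of background-run lengths along one scanline."""
    h, run = np.zeros(max_len), 0
    for px in line:
        if px:
            run += 1
        elif run:
            h[min(run, max_len) - 1] += 1
            run = 0
    if run:
        h[min(run, max_len) - 1] += 1
    return h

def runlength_signature(img, max_len=64):
    """Concatenate horizontal, vertical and both diagonal run
    histograms of a binary floor plan into one signature."""
    diags = range(-img.shape[0] + 1, img.shape[1])
    parts = [
        sum(runlengths_1d(r, max_len) for r in img),            # horizontal
        sum(runlengths_1d(c, max_len) for c in img.T),          # vertical
        sum(runlengths_1d(np.diagonal(img, k), max_len) for k in diags),
        sum(runlengths_1d(np.diagonal(np.fliplr(img), k), max_len)
            for k in diags),                                    # anti-diagonal
    ]
    sig = np.concatenate(parts)
    total = sig.sum()
    return sig / total if total else sig                        # size-invariant
```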
  Address  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-662-44853-3 Medium  
  Area Expedition Conference  
  Notes DAG; ADAS; 600.045; 600.056; 600.061; 600.076; 600.077 Approved no  
  Call Number Admin @ si @ HFF2014 Serial 2536  
 

 
Author Juan Ramon Terven Salinas; Joaquin Salas; Bogdan Raducanu
  Title Robust Head Gestures Recognition for Assistive Technology Type Book Chapter
  Year 2014 Publication Pattern Recognition Abbreviated Journal  
  Volume 8495 Issue Pages 152-161  
  Keywords  
  Abstract This paper presents a system capable of recognizing six head gestures: nodding, shaking, turning right, turning left, looking up, and looking down. The main difference of our system compared to other methods is that the Hidden Markov Models presented in this paper are fully connected and consider all possible states in any given order, providing the following advantages to the system: (1) it allows unconstrained movement of the head, and (2) it can be easily integrated into a wearable device (e.g. glasses, neck-hung devices), in which case it can robustly recognize gestures in the presence of ego-motion. Experimental results show that this approach outperforms common methods that use restricted HMMs for each gesture.
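A fully connected HMM scores a sequence by summing over all state paths, which the forward algorithm does efficiently. The sketch below is a generic discrete-observation version in log space; the parameters and the maximum-likelihood classification rule are illustrative, not the chapter's trained models:

```python
import numpy as np

def log_forward(obs, log_pi, log_A, log_B):
    """Log-likelihood of an observation sequence under a discrete HMM.
    obs: observation symbol indices; log_pi (S,), log_A (S, S) and
    log_B (S, V) are log initial, transition and emission tables.
    A fully connected model has no -inf entries in log_A, so every
    state may follow every other state."""
    alpha = log_pi + log_B[:, obs[0]]
    for o in obs[1:]:
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_B[:, o]
    return np.logaddexp.reduce(alpha)

def classify_gesture(obs, models):
    """Pick the gesture whose HMM explains the sequence best;
    `models` maps gesture name -> (log_pi, log_A, log_B)."""
    return max(models, key=lambda g: log_forward(obs, *models[g]))
```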
  Address  
  Corporate Author Thesis  
  Publisher Springer International Publishing Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-319-07490-0 Medium  
  Area Expedition Conference  
  Notes LAMP; Approved no  
  Call Number Admin @ si @ TSR2014b Serial 2505  
 

 
Author P. Wang; V. Eglin; C. Garcia; C. Largeron; Josep Llados; Alicia Fornes
  Title Représentation par graphe de mots manuscrits dans les images pour la recherche par similarité Type Conference Article
  Year 2014 Publication Colloque International Francophone sur l'Écrit et le Document Abbreviated Journal  
  Volume Issue Pages 233-248  
  Keywords word spotting; graph-based representation; shape context description; graph edit distance; DTW; block merging; query by example  
  Abstract Effective information retrieval on handwritten document images has always been a challenging task. In this paper, we propose a novel handwritten word spotting approach based on a graph representation. The presented model comprises both topological and morphological signatures of the handwriting. Skeleton-based graphs with Shape Context-labeled vertices are established for connected components. Each word image is represented as a sequence of graphs. In order to be robust to handwriting variations, an exhaustive merging process based on DTW alignment results is introduced into the similarity measure between word images. With respect to computational complexity, an approximate graph edit distance approach using bipartite matching is employed for graph matching. Experiments on the George Washington dataset and the marriage records from the Barcelona Cathedral dataset demonstrate that the proposed approach outperforms state-of-the-art structural methods.
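The DTW alignment between two word images, each a sequence of connected-component graphs, can be sketched as follows; `graph_cost` stands in for the approximate (bipartite) graph edit distance the paper uses, and the length normalization is our choice:

```python
import numpy as np

def dtw_word_distance(seq_a, seq_b, graph_cost):
    """Dynamic time warping between two sequences of graphs, with the
    local cost supplied by a graph-dissimilarity function."""
    n, m = len(seq_a), len(seq_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = graph_cost(seq_a[i - 1], seq_b[j - 1])
            D[i, j] = c + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)      # length-normalized alignment cost
```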
 
  Address Nancy; France; March 2014  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CIFED  
  Notes DAG; 600.061; 602.006; 600.077 Approved no  
  Call Number Admin @ si @ WEG2014c Serial 2564  
 

 
Author Cesar Isaza; Joaquin Salas; Bogdan Raducanu
  Title Rendering ground truth data sets to detect shadows cast by static objects in outdoors Type Journal Article
  Year 2014 Publication Multimedia Tools and Applications Abbreviated Journal MTAP  
  Volume 70 Issue 1 Pages 557-571  
  Keywords Synthetic ground truth data set; Sun position; Shadow detection; Static objects shadow detection  
  Abstract In our work, we are particularly interested in studying the shadows cast by static objects in outdoor environments, during daytime. To assess the accuracy of a shadow detection algorithm, we need ground truth information. The collection of such information is a very tedious task because it is a process that requires manual annotation. To overcome this severe limitation, we propose in this paper a methodology to automatically render ground truth using a virtual environment. To increase the degree of realism and usefulness of the simulated environment, we incorporate in the scenario the precise longitude, latitude and elevation of the actual location of the object, as well as the sun’s position for a given time and day. To evaluate our method, we consider a qualitative and a quantitative comparison. In the quantitative one, we analyze the shadow cast by a real object in a particular geographical location and its corresponding rendered model. To evaluate qualitatively the methodology, we use some ground truth images obtained both manually and automatically.  
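Placing the virtual sun requires the solar elevation and azimuth for a given place and time. A low-accuracy textbook approximation (Cooper's declination formula plus the hour-angle relations) is sketched below; it only illustrates the kind of computation involved and is not the paper's implementation:

```python
import math

def sun_position(lat_deg, lon_deg, day_of_year, utc_hour):
    """Approximate solar elevation and azimuth in degrees
    (azimuth measured clockwise from north)."""
    lat = math.radians(lat_deg)
    # Cooper's approximation of the solar declination
    decl = math.radians(23.45) * math.sin(
        math.radians(360.0 * (284 + day_of_year) / 365.0))
    # hour angle: 15 degrees per hour away from local solar noon
    solar_time = utc_hour + lon_deg / 15.0
    ha = math.radians(15.0 * (solar_time - 12.0))
    elevation = math.asin(math.sin(lat) * math.sin(decl)
                          + math.cos(lat) * math.cos(decl) * math.cos(ha))
    azimuth = math.atan2(
        -math.cos(decl) * math.sin(ha),
        math.sin(decl) * math.cos(lat)
        - math.cos(decl) * math.sin(lat) * math.cos(ha))
    return math.degrees(elevation), math.degrees(azimuth) % 360.0
```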
  Address  
  Corporate Author Thesis  
  Publisher Springer US Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1380-7501 ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; Approved no  
  Call Number Admin @ si @ ISR2014 Serial 2229  
 

 
Author Lluis Pere de las Heras
  Title Relational Models for Visual Understanding of Graphical Documents. Application to Architectural Drawings. Type Book Whole
  Year 2014 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Graphical documents express complex concepts using a visual language. This language consists of a vocabulary (symbols) and a syntax (structural relations between symbols) that articulate a semantic meaning in a certain context. Therefore, the automatic interpretation of this sort of document by computers entails three main steps: the detection of the symbols, the extraction of the structural relations between these symbols, and the modeling of the knowledge that permits the extraction of the semantics. Different domains of graphical documents include architectural and engineering drawings, maps, flowcharts, etc.
Graphics Recognition in particular and Document Image Analysis in general were born from the industrial need to interpret a massive amount of digitized documents after the emergence of the scanner. Although many years have passed, the graphical document understanding problem still seems to be far from solved. The main reason is that the vast majority of the systems in the literature focus on very specific problems, where the domain of the document dictates the implementation of the interpretation. As a result, it is difficult to reuse these strategies on different data and in different contexts, hindering the natural progress of the field.
In this thesis, we face the graphical document understanding problem by proposing several relational models at different levels that are designed from a generic perspective. Firstly, we introduce three different strategies for the detection of symbols. The first method tackles the problem structurally, wherein general knowledge of the domain guides the detection. The second is a statistical method that learns the graphical appearance of the symbols and easily adapts to the large variability of the problem. The third method is a combination of the previous two that inherits their respective strengths, i.e. it copes with the large variability and does not need annotated data. Secondly, we present two relational strategies that tackle the problem of visual context extraction. The first one is a fully bottom-up method that heuristically searches a graph representation for the contextual relations between symbols. The second, by contrast, is a syntactic method that probabilistically models the structure of the documents. It automatically learns the model, which guides the inference algorithm to find the best structural representation for a given input. Finally, we construct a knowledge-based model consisting of an ontological definition of the domain and real data. This model permits contextual reasoning and the detection of semantic inconsistencies within the data. We evaluate the suitability of the proposed contributions in the framework of floor plan interpretation. Since there is no standard for modeling these documents, there is enormous notation variability from plan to plan in terms of vocabulary and syntax. Therefore, floor plan interpretation is a relevant task in the graphical document understanding problem. It is also worth mentioning that we make freely available all the resources used in this thesis (the data, the tool used to generate the data, and the evaluation scripts) with the aim of fostering research in the graphical document understanding task.
 
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Gemma Sanchez  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-940902-8-8 Medium  
  Area Expedition Conference  
  Notes DAG; 600.077 Approved no  
  Call Number Admin @ si @ Her2014 Serial 2574  
 

 
Author Monica Piñol
  Title Reinforcement Learning of Visual Descriptors for Object Recognition Type Book Whole
  Year 2014 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract The human visual system is able to recognize an object in an image even if the object is partially occluded, seen from various points of view or in different colors, or independently of the distance to the object. To do this, the eye obtains an image and extracts features that are sent to the brain, where the object is recognized. In computer vision, the object recognition branch tries to learn from the behaviour of the human visual system to achieve its goal. Hence, an algorithm is used to identify representative features of the scene (detection), another algorithm is used to describe these points (descriptor), and finally the extracted information is used to classify the object in the scene. The selection of this set of algorithms is a very complicated task and thus a very active research field. In this thesis we focus on the selection/learning of the best descriptor for a given image. The state of the art offers several descriptors, but we do not know how to choose the best one, because the choice depends on the scenes to be used (the dataset) and on the algorithm chosen for the classification. We propose a framework based on reinforcement learning and bag of features to choose the best descriptor according to the given image. The system can analyse the behaviour of different learning algorithms and descriptor sets. Furthermore, the proposed framework for improving the classification/recognition ratio can be used, with minor changes, in other computer vision fields, such as video retrieval.
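A toy stand-in for the descriptor-selection loop: treat each descriptor as an action of an epsilon-greedy value learner, and reward it when the downstream classifier is correct. This deliberately simplified bandit-style sketch is far simpler than the thesis's framework; every name here is ours:

```python
import random

class DescriptorSelector:
    """Epsilon-greedy action-value estimates over a descriptor set."""
    def __init__(self, descriptors, eps=0.1):
        self.q = {d: 0.0 for d in descriptors}   # running mean reward
        self.n = {d: 0 for d in descriptors}
        self.eps = eps

    def choose(self):
        if random.random() < self.eps:
            return random.choice(list(self.q))   # explore
        return max(self.q, key=self.q.get)       # exploit

    def update(self, descriptor, reward):
        self.n[descriptor] += 1
        self.q[descriptor] += (reward - self.q[descriptor]) / self.n[descriptor]

# usage: reward = 1.0 if the bag-of-features classifier built on the
# chosen descriptor labels the image correctly, else 0.0
selector = DescriptorSelector(["SIFT", "SURF", "BRIEF"])
d = selector.choose()
selector.update(d, reward=1.0)
```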
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Ricardo Toledo; Angel Sappa  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-940902-5-7 Medium  
  Area Expedition Conference  
  Notes ADAS; 600.076 Approved no  
  Call Number Admin @ si @ Piñ2014 Serial 2464  
 

 
Author Adria Ruiz; Joost Van de Weijer; Xavier Binefa
  Title Regularized Multi-Concept MIL for weakly-supervised facial behavior categorization Type Conference Article
  Year 2014 Publication 25th British Machine Vision Conference Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract We address the problem of estimating high-level semantic labels for videos of recorded people by means of analysing their facial expressions. This problem, which we refer to as facial behavior categorization, is a weakly-supervised learning problem where we do not have access to frame-by-frame facial gesture annotations; only weak labels at the video level are available. Therefore, the goal is to learn a set of discriminative expressions and how they determine the video weak-labels. Facial behavior categorization can be posed as a Multi-Instance-Learning (MIL) problem, and we propose a novel MIL method called Regularized Multi-Concept MIL (RMC-MIL) to solve it. In contrast to previous approaches applied in facial behavior analysis, RMC-MIL follows a Multi-Concept assumption which allows different facial expressions (concepts) to contribute differently to the video label. Moreover, to handle the high-dimensional nature of facial descriptors, RMC-MIL uses a discriminative approach to model the concepts and structured sparsity regularization to discard non-informative features. RMC-MIL is posed as a convex-constrained optimization problem where all the parameters are jointly learned using the Projected Quasi-Newton method. In our experiments, we use two public data sets to show the advantages of the Regularized Multi-Concept approach and its improvement over existing MIL methods. RMC-MIL outperforms state-of-the-art results on the UNBC data set for pain detection.
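The multi-concept scoring idea can be illustrated compactly: several linear "concept" models respond to each frame, responses are pooled over frames per concept, and the pooled scores jointly determine the video label. This is only a sketch of that pooling under our own choices (soft-max pooling, linear concepts); the actual RMC-MIL adds structured sparsity and joint convex training:

```python
import numpy as np

def video_score(frames, W):
    """Multi-concept MIL scoring sketch. `frames` is (n_frames, d),
    each row of W a linear concept (expression) model. Soft-max
    pooling over frames gives one score per concept; the video score
    aggregates the concepts' contributions."""
    resp = frames @ W.T                               # (n_frames, n_concepts)
    per_concept = np.log(np.exp(resp).mean(axis=0))   # soft-max over frames
    return per_concept.sum()                          # joint contribution
```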
  Address Nottingham; UK; September 2014  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference BMVC  
  Notes LAMP; CIC; 600.074; 600.079 Approved no  
  Call Number Admin @ si @ RWB2014 Serial 2508  
 

 
Author E. Bondi; L. Seidenari; Andrew Bagdanov; Alberto del Bimbo
  Title Real-time people counting from depth imagery of crowded environments Type Conference Article
  Year 2014 Publication 11th IEEE International Conference on Advanced Video and Signal based Surveillance Abbreviated Journal  
  Volume Issue Pages 337-342  
  Keywords  
  Abstract In this paper we describe a system for automatic people counting in crowded environments. The approach we propose is a counting-by-detection method based on depth imagery. It is designed to be deployed as an autonomous appliance for crowd analysis in video surveillance application scenarios. Our system performs foreground/background segmentation on depth image streams in order to coarsely segment persons; depth information is then used to localize head candidates, which are tracked in time on an automatically estimated ground plane. The system runs in real time, at a frame rate of about 20 fps. We collected a dataset of RGB-D sequences representing three typical and challenging surveillance scenarios, including crowds, queuing and groups. An extensive comparative evaluation is given between our system and a more complex, Latent-SVM-based head localization approach for person counting applications.
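The head-localization step can be sketched compactly: with an overhead depth camera, each foreground blob's closest points form a head candidate. The thresholds below (metres, pixel counts) are illustrative assumptions, not the paper's values:

```python
import numpy as np
from scipy import ndimage

def head_candidates(depth, fg_mask, min_area=200, head_band=0.15):
    """Return (x, y, depth) head candidates from one depth frame.
    `depth` is in metres, `fg_mask` a boolean foreground map from
    background subtraction on the depth stream."""
    labels, n = ndimage.label(fg_mask)
    heads = []
    for i in range(1, n + 1):
        blob = labels == i
        if blob.sum() < min_area:        # too small to be a person
            continue
        top = depth[blob].min()          # closest point = top of the head
        head = blob & (depth < top + head_band)
        ys, xs = np.nonzero(head)
        heads.append((xs.mean(), ys.mean(), top))
    return heads
```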
  Address Seoul; Korea; August 2014  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference AVSS  
  Notes LAMP; 600.079 Approved no  
  Call Number Admin @ si @ BSB2014 Serial 2540  
 

 
Author Clement Guerin; Christophe Rigaud; Karell Bertet; Jean-Christophe Burie; Arnaud Revel; Jean-Marc Ogier
  Title Réduction de l’espace de recherche pour les personnages de bandes dessinées Type Conference Article
  Year 2014 Publication 19th National Congress Reconnaissance de Formes et l'Intelligence Artificielle Abbreviated Journal  
  Volume Issue Pages  
  Keywords contextual search; document analysis; comics characters  
  Abstract Comics represent an important cultural heritage in many countries, and their massive digitization makes it possible to search the content of the images. To date, mainly the page structures and their textual content have been studied; little work addresses the graphical content. We propose to build on elements that have already been studied, such as the position of panels and speech balloons, to reduce the search space and to localize characters based on the balloon tails. The evaluation of our contributions on the eBDtheque database shows a balloon-tail detection rate of 81.2%, character localization of up to 85%, and a search-space reduction of more than 50%.
  Address Rouen; France; July 2014  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference RFIA  
  Notes DAG; 600.077 Approved no  
  Call Number Admin @ si @ GRB2014 Serial 2480  
 

 
Author Antonio Hernandez; Miguel Angel Bautista; Xavier Perez Sala; Victor Ponce; Sergio Escalera; Xavier Baro; Oriol Pujol; Cecilio Angulo
  Title Probability-based Dynamic Time Warping and Bag-of-Visual-and-Depth-Words for Human Gesture Recognition in RGB-D Type Journal Article
  Year 2014 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 50 Issue 1 Pages 112-121  
  Keywords RGB-D; Bag-of-Words; Dynamic Time Warping; Human Gesture Recognition  
  Abstract We present a methodology to address the problem of human gesture segmentation and recognition in video and depth image sequences. A Bag-of-Visual-and-Depth-Words (BoVDW) model is introduced as an extension of the Bag-of-Visual-Words (BoVW) model. State-of-the-art RGB and depth features, including a newly proposed depth descriptor, are analysed and combined in a late-fusion form. The method is integrated in a Human Gesture Recognition pipeline, together with a novel probability-based Dynamic Time Warping (PDTW) algorithm which is used to perform prior segmentation of idle gestures. The proposed DTW variant uses samples of the same gesture category to build a Gaussian Mixture Model driven probabilistic model of that gesture class. Results of the whole Human Gesture Recognition pipeline on a public data set show better performance in comparison to both the standard BoVW model and the DTW approach.
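A hedged sketch of a PDTW-style matcher: model each temporal state of a gesture with a GMM fitted on pooled training frames, then run DTW with the negative log-likelihood as the local cost. It uses scikit-learn's GaussianMixture; the state count, component count and normalization are our illustrative choices, not the paper's exact formulation:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_gesture_model(samples, n_states=10, n_components=2):
    """One GMM per temporal state, fitted on frames pooled from all
    training samples (each an array of per-frame feature vectors)."""
    models = []
    for s in range(n_states):
        frames = np.vstack([x[s * len(x) // n_states:
                              (s + 1) * len(x) // n_states] for x in samples])
        models.append(GaussianMixture(n_components=n_components).fit(frames))
    return models

def pdtw_cost(seq, models):
    """DTW against the probabilistic model: the local cost of frame i
    versus state j is the frame's negative log-likelihood under the
    state's GMM."""
    n, m = len(seq), len(models)
    cost = np.stack([-g.score_samples(seq) for g in models], axis=1)  # (n, m)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = cost[i - 1, j - 1] + min(
                D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)
```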
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HuPBA;MV; 605.203 Approved no  
  Call Number Admin @ si @ HBP2014 Serial 2353  