Publicacions CVC -- Query Results


Details

Records
Author	Lluis Gomez; Dimosthenis Karatzas
Title	A fine-grained approach to scene text script identification			Type	Conference Article
Year	2016	Publication	12th IAPR Workshop on Document Analysis Systems	Abbreviated Journal
Volume		Issue		Pages	192-197
Keywords
Abstract	This paper focuses on the problem of script identification in unconstrained scenarios. Script identification is an important prerequisite to recognition, and an indispensable condition for automatic text understanding systems designed for multi-language environments. Although widely studied for document images and handwritten documents, it remains an almost unexplored territory for scene text images. We detail a novel method for script identification in natural images that combines convolutional features and the Naive-Bayes Nearest Neighbor classifier. The proposed framework efficiently exploits the discriminative power of small stroke-parts in a fine-grained classification framework. In addition, we propose a new public benchmark dataset for the evaluation of joint text detection and script identification in natural scenes. Experiments on this new dataset demonstrate that the proposed method yields state of the art results, while it generalizes well to different datasets and to a variable number of scripts. The evidence provided shows that multi-lingual scene text recognition in the wild is a viable proposition. Source code of the proposed method is made available online.
Address	Santorini; Greece; April 2016
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	DAS
Notes	DAG; 601.197; 600.084			Approved	no
Call Number	Admin @ si @ GoK2016b			Serial	2863
Permanent link to this record
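Note (illustrative sketch, not part of the record): the abstract above names the Naive-Bayes Nearest Neighbor (NBNN) classifier. As a rough illustration of the general NBNN decision rule only, the snippet below sums, for each class, the distance from every query descriptor to its nearest descriptor in that class's pool and picks the class with the smallest total. The toy 2-D descriptors, the labels, and the function names are assumptions made for the example; the paper's convolutional features and any weighting it applies are not reproduced here.

```python
# Illustrative NBNN sketch; the descriptors are toy 2-D points, not real
# convolutional features, and all names here are hypothetical.

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nbnn_classify(query_descriptors, class_descriptors):
    """Return the label minimising the summed nearest-neighbour distance.

    class_descriptors maps each label to the pool of descriptors
    extracted from all training images of that class.
    """
    totals = {}
    for label, pool in class_descriptors.items():
        totals[label] = sum(
            min(sq_dist(d, ref) for ref in pool) for d in query_descriptors
        )
    return min(totals, key=totals.get)

training = {
    "latin": [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1)],
    "greek": [(1.0, 1.0), (0.9, 1.1), (1.1, 0.8)],
}
query = [(0.05, 0.1), (0.15, 0.15), (0.9, 0.9)]
predicted = nbnn_classify(query, training)
```

Because the decision is a sum over local descriptors, a single outlying part (the third query point here) does not necessarily flip the prediction.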



Author	Lluis Gomez
Title	Exploiting Similarity Hierarchies for Multi-script Scene Text Understanding			Type	Book Whole
Year	2016	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	This thesis addresses the problem of automatic scene text understanding in unconstrained conditions. In particular, we tackle the tasks of multi-language and arbitrary-oriented text detection, tracking, and script identification in natural scenes. For this we have developed a set of generic methods that build on the basic observation that text always has certain key visual and structural characteristics that are independent of the language or script in which it is written. Text instances in any language or script are always formed as groups of similar atomic parts, be they individual characters, small stroke parts, or even whole words in the case of cursive text. This holistic (sum-of-parts) and recursive perspective has led us to explore different variants of the “segmentation and grouping” paradigm of computer vision. Scene text detection methodologies are usually based on the classification of individual regions or patches, using a priori knowledge for a given script or language. Human perception of text, on the other hand, is based on perceptual organization through which text emerges as a perceptually significant group of atomic objects. In this thesis, we argue that the text detection problem must be posed as the detection of meaningful groups of regions. We address the problem of text detection in natural scenes from a hierarchical perspective, making explicit use of the recursive nature of text, aiming directly at the detection of region groupings corresponding to text within a hierarchy produced by an agglomerative similarity clustering process over individual regions. We propose an optimal way to construct such a hierarchy, introducing a feature space designed to produce text group hypotheses with high recall and a novel stopping rule combining a discriminative classifier and a probabilistic measure of group meaningfulness based on perceptual organization.
Within this generic framework, we design a text-specific object proposals algorithm that, contrary to existing generic object proposals methods, aims directly at the detection of text region groupings. For this, we abandon the rigid definition of “what is text” of traditional specialized text detectors, and move towards the fuzzier perspective of grouping-based object proposals methods. Then, we present a hybrid algorithm for detection and tracking of scene text where the notion of region groupings also plays a central role. By leveraging the structural arrangement of text group components between consecutive frames we can improve the overall tracking performance of the system. Finally, since our generic detection framework is inherently designed for multi-language environments, we focus on the problem of script identification in order to build a multi-language end-to-end reading system. Facing this problem with state of the art CNN classifiers is not straightforward, as they fail to address a key characteristic of scene text instances: their extremely variable aspect ratio. Instead of resizing input images to a fixed size as in the typical use of holistic CNN classifiers, we propose a patch-based classification framework in order to preserve discriminative parts of the image that are characteristic of its class. We describe a novel method based on the use of ensembles of conjoined networks to jointly learn discriminative stroke-parts representations and their relative importance in a patch-based classification scheme.
Address
Corporate Author				Thesis	Ph.D. thesis
Publisher		Place of Publication		Editor	Dimosthenis Karatzas
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG			Approved	no
Call Number	Admin @ si @ Gom2016			Serial	2891
Permanent link to this record



Author	L. Calvet; A. Ferrer; M. Gomes; A. Juan; David Masip
Title	Combining Statistical Learning with Metaheuristics for the Multi-Depot Vehicle Routing Problem with Market Segmentation			Type	Journal Article
Year	2016	Publication	Computers & Industrial Engineering	Abbreviated Journal	CIE
Volume	94	Issue		Pages	93-104
Keywords	Multi-Depot Vehicle Routing Problem; market segmentation applications; hybrid algorithms; statistical learning
Abstract	In real-life logistics and distribution activities it is usual to face situations in which the distribution of goods has to be made from multiple warehouses or depots to the final customers. This problem is known as the Multi-Depot Vehicle Routing Problem (MDVRP), and it typically includes two sequential and correlated stages: (a) the assignment map of customers to depots, and (b) the corresponding design of the distribution routes. Most of the existing work in the literature has focused on minimizing distance-based distribution costs while satisfying a number of capacity constraints. However, no attention has been given so far to potential variations in demands due to the fitness of the customer-depot mapping in the case of heterogeneous depots. In this paper, we consider this realistic version of the problem in which the depots are heterogeneous in terms of their commercial offer and customers show different willingness to consume depending on how well the assigned depot fits their preferences. Thus, we assume that different customer-depot assignment maps will lead to different customer-expenditure levels. As a consequence, market-segmentation strategies need to be considered in order to increase sales and total income while accounting for the distribution costs. To solve this extension of the MDVRP, we propose a hybrid approach that combines statistical learning techniques with a metaheuristic framework. First, a set of predictive models is generated from historical data. These statistical models allow estimating the demand of any customer depending on the assigned depot. Then, the estimated expenditure of each customer is included as part of an enriched objective function as a way to better guide the stochastic local search inside the metaheuristic framework. A set of computational experiments contribute to illustrate our approach and how the extended MDVRP considered here differs in terms of the proposed solutions from the traditional one.
Address
Corporate Author				Thesis
Publisher	PERGAMON-ELSEVIER SCIENCE LTD	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	CIE
Series Volume		Series Issue		Edition
ISSN	0360-8352	ISBN		Medium
Area		Expedition		Conference
Notes	OR;MV;			Approved	no
Call Number	Admin @ si @ CFG2016			Serial	2749
Permanent link to this record
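Note (illustrative sketch, not part of the record): the abstract above describes an enriched objective that accounts for estimated customer expenditure as well as distribution costs. The snippet below shows one simple way such an objective could be evaluated for a candidate solution, as estimated income minus route cost. The functional form, the `predicted_spend` table (standing in for the output of the statistical models), and all figures are assumptions for illustration, not the paper's formulation.

```python
# Hypothetical enriched objective: estimated income minus distribution cost.

def enriched_objective(assignment, predicted_spend, route_cost):
    """Score one candidate solution.

    assignment      maps each customer to its assigned depot
    predicted_spend maps (customer, depot) to the estimated expenditure
    route_cost      is the total distribution cost of the candidate's routes
    """
    income = sum(predicted_spend[(c, d)] for c, d in assignment.items())
    return income - route_cost

# Toy predictions: customer c1 is expected to spend more when served by depot B.
predicted_spend = {
    ("c1", "A"): 100, ("c1", "B"): 140,
    ("c2", "A"): 80,  ("c2", "B"): 60,
}

distance_only = enriched_objective({"c1": "A", "c2": "A"}, predicted_spend, 50)
segmented = enriched_objective({"c1": "B", "c2": "A"}, predicted_spend, 70)
```

In this toy case the second candidate has the higher route cost but the better enriched score, which is the kind of trade-off a purely distance-based objective would not see.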



Author	Katerine Diaz; Aura Hernandez-Sabate; Antonio Lopez
Title	A reduced feature set for driver head pose estimation			Type	Journal Article
Year	2016	Publication	Applied Soft Computing	Abbreviated Journal	ASOC
Volume	45	Issue		Pages	98-107
Keywords	Head pose estimation; driving performance evaluation; subspace based methods; linear regression
Abstract	Evaluation of driving performance is of utmost importance in order to reduce the road accident rate. Since driving ability includes visual-spatial and operational attention, among others, head pose estimation of the driver is a crucial indicator of driving performance. This paper proposes a new automatic method for coarse and fine estimation of the driver's head yaw angle. We rely on a set of geometric features computed from just three representative facial keypoints, namely the center of the eyes and the nose tip. With these geometric features, our method combines two manifold embedding methods and a linear regression one. In addition, the method has a confidence mechanism to decide if the classification of a sample is not reliable. The approach has been tested using the CMU-PIE dataset and our own driver dataset. Despite the very few facial keypoints required, the results are comparable to the state-of-the-art techniques. The low computational cost of the method and its robustness make it feasible to integrate it into mass-consumer devices as a real-time application.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.085; 600.076;			Approved	no
Call Number	Admin @ si @ DHL2016			Serial	2760
Permanent link to this record



Author	Jun Wan; Yibing Zhao; Shuai Zhou; Isabelle Guyon; Sergio Escalera
Title	ChaLearn Looking at People RGB-D Isolated and Continuous Datasets for Gesture Recognition			Type	Conference Article
Year	2016	Publication	29th IEEE Conference on Computer Vision and Pattern Recognition Workshops	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	In this paper, we present two large multi-modal video datasets for RGB and RGB-D gesture recognition: the ChaLearn LAP RGB-D Isolated Gesture Dataset (IsoGD) and the Continuous Gesture Dataset (ConGD). Both datasets are derived from the ChaLearn Gesture Dataset (CGD), which has a total of more than 50000 gestures for the “one-shot-learning” competition. To increase the potential of the old dataset, we designed new well curated datasets composed of 249 gesture labels and including 47933 gestures whose begin and end frames in the sequences were manually labeled. Using these datasets we will open two competitions on the CodaLab platform so that researchers can test and compare their methods for “user independent” gesture recognition. The first challenge is designed for gesture spotting and recognition in continuous sequences of gestures while the second one is designed for gesture classification from segmented data. A baseline method based on the bag of visual words model is also presented.
Address	Las Vegas; USA; July 2016
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CVPRW
Notes	HuPBA;MILAB;			Approved	no
Call Number	Admin @ si @ WZZ2016			Serial	2771
Permanent link to this record



Author	Juan Ramon Terven Salinas; Bogdan Raducanu; Maria Elena Meza-de-Luna; Joaquin Salas
Title	Head-gestures mirroring detection in dyadic social interactions with computer vision-based wearable devices			Type	Journal Article
Year	2016	Publication	Neurocomputing	Abbreviated Journal	NEUCOM
Volume	175	Issue	B	Pages	866–876
Keywords	Head gestures recognition; Mirroring detection; Dyadic social interaction analysis; Wearable devices
Abstract	During face-to-face human interaction, nonverbal communication plays a fundamental role. A relevant aspect that takes part during social interactions is represented by mirroring, in which a person tends to mimic the non-verbal behavior (head and body gestures, vocal prosody, etc.) of the counterpart. In this paper, we introduce a computer vision-based system to detect mirroring in dyadic social interactions with the use of a wearable platform. In our context, mirroring is inferred as simultaneous head noddings displayed by the interlocutors. Our approach consists of the following steps: (1) facial features extraction; (2) facial features stabilization; (3) head nodding recognition; and (4) mirroring detection. Our system achieves a mirroring detection accuracy of 72% on a custom mirroring dataset.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	LAMP; 600.072; 600.068;			Approved	no
Call Number	Admin @ si @ TRM2016			Serial	2721
Permanent link to this record
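Note (illustrative sketch, not part of the record): the abstract above infers mirroring from simultaneous head nods by the two interlocutors. Assuming the nod recognition step has already produced time intervals for each person, the final step could be sketched as an interval-overlap check. The interval representation in seconds and the absence of any timing tolerance are assumptions made for this example.

```python
# Hypothetical final step: mirroring as overlapping nod intervals.

def mirroring_events(nods_a, nods_b):
    """Return the time intervals during which both people nod at once.

    nods_a and nods_b are lists of (start, end) times in seconds.
    """
    events = []
    for start_a, end_a in nods_a:
        for start_b, end_b in nods_b:
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:  # the two nods share a non-empty time span
                events.append((start, end))
    return sorted(events)

wearer = [(1.0, 2.0), (5.0, 6.5)]
partner = [(1.5, 2.5), (8.0, 9.0)]
events = mirroring_events(wearer, partner)
```

Here only the first pair of nods overlaps, so a single mirroring event is reported.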



Author	Juan Ignacio Toledo; Sebastian Sudholt; Alicia Fornes; Jordi Cucurull; A. Fink; Josep Llados
Title	Handwritten Word Image Categorization with Convolutional Neural Networks and Spatial Pyramid Pooling			Type	Conference Article
Year	2016	Publication	Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR)	Abbreviated Journal
Volume	10029	Issue		Pages	543-552
Keywords	Document image analysis; Word image categorization; Convolutional neural networks; Named entity detection
Abstract	The extraction of relevant information from historical document collections is one of the key steps in order to make these documents available for access and searches. The usual approach combines transcription and grammars in order to extract semantically meaningful entities. In this paper, we describe a new method to obtain word categories directly from non-preprocessed handwritten word images. The method can be used to directly extract information, being an alternative to the transcription. Thus it can be used as a first step in any kind of syntactical analysis. The approach is based on Convolutional Neural Networks with a Spatial Pyramid Pooling layer to deal with the different shapes of the input images. We performed the experiments on a historical marriage record dataset, obtaining promising results.
Address	Merida; Mexico; December 2016
Corporate Author				Thesis
Publisher	Springer International Publishing	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN	978-3-319-49054-0	Medium
Area		Expedition		Conference	S+SSPR
Notes	DAG; 600.097; 602.006			Approved	no
Call Number	Admin @ si @ TSF2016			Serial	2877
Permanent link to this record
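Note (illustrative sketch, not part of the record): the abstract above relies on a Spatial Pyramid Pooling layer to handle word images of different shapes. The snippet below illustrates the general idea on a single-channel feature map: each pyramid level splits the map into an n x n grid and keeps the maximum of every cell, so the output length depends only on the chosen levels. The levels `(1, 2)`, the plain nested-list representation, and the requirement that the map be at least as large as the finest grid are assumptions for this example; a real layer applies the same pooling to every channel.

```python
import math

def spatial_pyramid_pool(feature_map, levels=(1, 2)):
    """Max-pool a 2-D feature map into a fixed-length vector.

    Each level n splits the map into an n x n grid; the output length is
    sum(n * n for n in levels) regardless of the input height and width.
    """
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            r0, r1 = (i * height) // n, math.ceil((i + 1) * height / n)
            for j in range(n):
                c0, c1 = (j * width) // n, math.ceil((j + 1) * width / n)
                pooled.append(max(feature_map[r][c]
                                  for r in range(r0, r1)
                                  for c in range(c0, c1)))
    return pooled

wide = [[1, 2, 3, 4, 5, 6],
        [7, 8, 9, 10, 11, 12]]
tall = [[1, 2],
        [3, 4],
        [5, 6]]
wide_vec = spatial_pyramid_pool(wide)
tall_vec = spatial_pyramid_pool(tall)
```

Both inputs yield a vector of length 5 (one value for the whole map plus four for the 2 x 2 grid), which is what lets a fixed-size classifier follow the pooling step.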



Author	Juan Ignacio Toledo; Alicia Fornes; Jordi Cucurull; Josep Llados
Title	Election Tally Sheets Processing System			Type	Conference Article
Year	2016	Publication	12th IAPR Workshop on Document Analysis Systems	Abbreviated Journal
Volume		Issue		Pages	364-368
Keywords
Abstract	In paper based elections, manual tallies at polling station level produce myriads of documents. These documents share a common form-like structure and a reduced vocabulary worldwide. On the other hand, each tally sheet is filled in by a different writer and, in different countries, different scripts are used. We present a complete document analysis system for electoral tally sheet processing combining state of the art techniques with a new handwriting recognition subprocess based on unsupervised feature discovery with Variational Autoencoders and sequence classification with BLSTM neural networks. The whole system is designed to be script independent and allows a fast and reliable results consolidation process with reduced operational cost.
Address	Santorini; Greece; April 2016
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	DAS
Notes	DAG; 602.006; 600.061; 601.225; 600.077; 600.097			Approved	no
Call Number	TFC2016			Serial	2752
Permanent link to this record



Author	Juan A. Carvajal Ayala; Dennis Romero; Angel Sappa
Title	Fine-tuning based deep convolutional networks for lepidopterous genus recognition			Type	Conference Article
Year	2016	Publication	21st Ibero American Congress on Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages	467-475
Keywords
Abstract	This paper describes an image classification approach oriented to identify specimens of lepidopterous insects at Ecuadorian ecological reserves. This work seeks to contribute to studies in the area of biology about the genera of butterflies and also to facilitate the registration of unrecognized specimens. The proposed approach is based on the fine-tuning of three widely used pre-trained Convolutional Neural Networks (CNNs). This strategy is intended to overcome the reduced number of labeled images. Experimental results with a dataset labeled by expert biologists are presented, reaching a recognition accuracy above 92%.
Address	Lima; Peru; November 2016
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CIARP
Notes	ADAS; 600.086			Approved	no
Call Number	Admin @ si @ CRS2016			Serial	2913
Permanent link to this record



Author	Jose Ramirez Moreno; Juan R Revilla; Miguel Reyes; Sergio Escalera
Title	Validación del Software ADIBAS asociado al sensor Kinect de Microsoft para la evaluación de la posición corporal			Type	Conference Article
Year	2016	Publication	4th Congreso WCPT-SAR	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Buenos Aires; Argentina; June 2016
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	WCPT-SAR
Notes	HuPBA;MILAB			Approved	no
Call Number	Admin @ si @ RRR2016			Serial	2853
Permanent link to this record



Author	Jose Marone; Simone Balocco; Marc Bolaños; Jose Massa; Petia Radeva
Title	Learning the Lumen Border using a Convolutional Neural Networks classifier			Type	Conference Article
Year	2016	Publication	19th International Conference on Medical Image Computing and Computer Assisted Intervention Workshop	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	IntraVascular UltraSound (IVUS) is a technique allowing the diagnosis of coronary plaque. An accurate (semi-)automatic assessment of the luminal contours could speed up the diagnosis. In most of the approaches, the information on the vessel shape is obtained by combining a supervised learning step with a local refinement algorithm. In this paper, we explore for the first time the use of a Convolutional Neural Network (CNN) architecture that on the one hand is able to extract the optimal image features and at the same time can serve as a supervised classifier to detect the lumen border in IVUS images. The main limitation of CNNs is that the technique requires a large amount of training data due to its huge number of parameters. To solve this issue, we introduce a patch classification approach to generate an extended training set from a few annotated images. An accuracy of 93% and an F-score of 71% were obtained with this technique, even when it was applied to challenging frames containing calcified plaques, stents and catheter shadows.
Address	Athens; Greece; October 2016
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	MICCAIW
Notes	MILAB;			Approved	no
Call Number	Admin @ si @ MBB2016			Serial	2822
Permanent link to this record
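Note (illustrative sketch, not part of the record): the abstract above generates an extended training set by classifying patches taken from a few annotated images. One plausible reading is a sliding window whose label comes from the annotation at the patch centre, sketched below. The centre-pixel labelling rule, the odd patch size, the stride, and the toy image are all assumptions for this example and may differ from what the paper does.

```python
# Hypothetical patch sampling: one annotated image yields many labelled patches.

def extract_patches(image, mask, size=3, stride=1):
    """Return (patch, label) pairs from a 2-D image and its annotation mask.

    size is assumed odd; the label is the mask value at the patch centre.
    """
    half = size // 2
    samples = []
    for r in range(half, len(image) - half, stride):
        for c in range(half, len(image[0]) - half, stride):
            patch = [row[c - half:c + half + 1]
                     for row in image[r - half:r + half + 1]]
            samples.append((patch, mask[r][c]))
    return samples

# Toy 4 x 5 "image" and a mask marking the right-hand columns as class 1.
image = [[r * 5 + c for c in range(5)] for r in range(4)]
mask = [[1 if c >= 2 else 0 for c in range(5)] for r in range(4)]
samples = extract_patches(image, mask)
```

Even this tiny image produces six labelled 3 x 3 patches, which illustrates how a handful of annotated frames can be expanded into a much larger training set.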



Author	Jose A. Garcia; David Masip; Valerio Sbragaglia; Jacopo Aguzzi
Title	Automated Identification and Tracking of Nephrops norvegicus (L.) Using Infrared and Monochromatic Blue Light			Type	Conference Article
Year	2016	Publication	19th International Conference of the Catalan Association for Artificial Intelligence	Abbreviated Journal
Volume		Issue		Pages
Keywords	computer vision; video analysis; object recognition; tracking; behaviour; social; decapod; Nephrops norvegicus
Abstract	Automated video and image analysis can be a very efficient tool to analyze animal behavior based on sociality, especially in environments that are hard for researchers to access. The understanding of this social behavior can play a key role in the sustainable design of capture policies of many species. This paper proposes the use of computer vision algorithms to identify and track a specific species, the Norway lobster, Nephrops norvegicus, a burrowing decapod with relevant commercial value which is captured by trawling. These animals can only be captured when they are engaged in seabed excursions, which are strongly related to their social behavior. This emergent behavior is modulated by the day-night cycle, but their social interactions remain unknown to the scientific community. The paper introduces an identification scheme made of four distinguishable black and white tags (geometric shapes). The project has recorded 15-day experiments in laboratory pools, under monochromatic blue light (472 nm) and darkness conditions (recorded using infrared light). Using this massive image set, we propose a comparison of state-of-the-art computer vision algorithms to distinguish and track the different animals’ movements. We evaluate the robustness to the high noise presence in the infrared video signals and free out-of-plane rotations due to animal movement. The experiments show promising accuracies under a cross-validation protocol, being adaptable to the automation and analysis of large scale data. In a second contribution, we created an extensive dataset of shapes (46027 different shapes) from four daily experimental video recordings, which will be available to the community.
Address	Barcelona; Spain; October 2016
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CCIA
Notes	OR;MV;			Approved	no
Call Number	Admin @ si @ GMS2016			Serial	2816
Permanent link to this record



Author	Jose A. Garcia; David Masip; Valerio Sbragaglia; Jacopo Aguzzi
Title	Using ORB, BoW and SVM to identificate and track tagged Norway lobster Nephrops Norvegicus (L.)			Type	Conference Article
Year	2016	Publication	3rd International Conference on Maritime Technology and Engineering	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Sustainable capture policies of many species strongly depend on the understanding of their social behaviour. Nevertheless, the analysis of emergent behaviour in marine species poses several challenges. Usually animals are captured and observed in tanks, and their behaviour is inferred from their dynamics and interactions. Therefore, researchers must deal with thousands of hours of video data. Without loss of generality, this paper proposes a computer vision approach to identify and track a specific species, the Norway lobster, Nephrops norvegicus. We propose an identification scheme where animals are marked using black and white tags with a geometric shape in the center (holed triangle, filled triangle, holed circle and filled circle). Using a massive labelled dataset, we extract local features based on the ORB descriptor. These features are clustered a posteriori, and we construct a Bag of Visual Words feature vector per animal. This approach gives us invariance to rotation and translation. An SVM classifier achieves generalization results above 99%. In a second contribution, we will make the code and training data publicly available.
Address	Lisbon; Portugal; July 2016
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	MARTECH
Notes	OR;MV;			Approved	no
Call Number	Admin @ si @ GMS2016b			Serial	2817
Permanent link to this record
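Note (illustrative sketch, not part of the record): the abstract above clusters ORB descriptors and builds a Bag of Visual Words vector per animal before classification. Assuming a codebook of visual words is already available from the clustering step, the histogram construction could look like the snippet below. The 4-bit toy descriptors, the use of Hamming distance (a common choice for binary descriptors such as ORB), and the normalisation are assumptions for this example; feature extraction and the SVM are left out.

```python
# Hypothetical Bag of Visual Words step with toy binary descriptors.

def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def bow_histogram(descriptors, codebook):
    """Normalised count of descriptors assigned to each visual word."""
    counts = [0] * len(codebook)
    for d in descriptors:
        nearest = min(range(len(codebook)),
                      key=lambda k: hamming(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)

codebook = [(0, 0, 0, 0), (1, 1, 1, 1), (1, 0, 1, 0)]
descriptors = [(0, 0, 0, 1), (1, 1, 0, 1), (1, 1, 1, 1), (1, 0, 1, 0)]
histogram = bow_histogram(descriptors, codebook)
```

The histogram discards where each descriptor was found in the image, which is consistent with the translation invariance the abstract mentions; a vector like this would then be passed to the classifier.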



Author	Joana Maria Pujadas-Mora; Alicia Fornes; Josep Llados; Anna Cabre
Title	Bridging the gap between historical demography and computing: tools for computer-assisted transcription and the analysis of demographic sources			Type	Book Chapter
Year	2016	Publication	The future of historical demography. Upside down and inside out	Abbreviated Journal
Volume		Issue		Pages	127-131
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher	Acco Publishers	Place of Publication		Editor	K.Matthijs; S.Hin; H.Matsuo; J.Kok
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-94-6292-722-3	Medium
Area		Expedition		Conference
Notes	DAG; 600.097			Approved	no
Call Number	Admin @ si @ PFL2016			Serial	2907
Permanent link to this record



Author	Joan Mas; Alicia Fornes; Josep Llados
Title	An Interactive Transcription System of Census Records using Word-Spotting based Information Transfer			Type	Conference Article
Year	2016	Publication	12th IAPR Workshop on Document Analysis Systems	Abbreviated Journal
Volume		Issue		Pages	54-59
Keywords
Abstract	This paper presents a system to assist in the transcription of historical handwritten census records in a crowdsourcing platform. Census records have a tabular structured layout. They consist of a sequence of rows with information on homes ordered by street address. For each household snippet in the page, the list of family members is reported. The censuses are recorded at intervals of a few years and the information on individuals in each household is quite stable from one point in time to the next. This redundancy is used to assist the transcriber, so the redundant information is transferred from the census already transcribed to the next one. Household records are aligned from one year to the next using the knowledge of the ordering by street address. Given an already transcribed census, a query by string word spotting is applied. Thus, names from the census at time t are used as queries in the corresponding home record at time t+1. Since the search is constrained, the obtained precision-recall values are very high, with an important reduction in the transcription time. The proposed system has been tested in a real citizen-science experience where non expert users transcribe the census data of their home town.
Address	Santorini; Greece; April 2016
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	DAS
Notes	DAG; 603.053; 602.006; 600.061; 600.077; 600.097			Approved	no
Call Number	Admin @ si @ MFL2016			Serial	2751
Permanent link to this record