Publicacions CVC -- Query Results

	Publicacions CVC Home \| Show All \| Simple Search \| Advanced Search \| Add Record \| Import	Login Quick Search: Field: contains: ...
	196–210 of 734 records found matching your query (RSS \| history):

Search & Display Options

Select All Deselect All

[1–10] << 11 12 13 14 15 16 17 18 19 20 >> [21–30]

List View

Citations

Details

	Records
	Author	Veronica Romero; Alicia Fornes; Nicolas Serrano; Joan Andreu Sanchez; A.H. Toselli; Volkmar Frinken; E. Vidal; Josep Llados
	Title	The ESPOSALLES database: An ancient marriage license corpus for off-line handwriting recognition			Type	Journal Article
	Year	2013	Publication	Pattern Recognition	Abbreviated Journal	PR
	Volume	46	Issue	6	Pages	1658-1669
	Keywords
	Abstract	Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demography studies and genealogical research. Automatic processing of historical documents, however, has mostly been focused on single works of literature and less on social records, which tend to have a distinct layout, structure, and vocabulary. Such information is usually collected by expert demographers that devote a lot of time to manually transcribe them. This paper presents a new database, compiled from a marriage license books collection, to support research in automatic handwriting recognition for historical documents containing social records. Marriage license books are documents that were used for centuries by ecclesiastical institutions to register marriage licenses. Books from this collection are handwritten and span nearly half a millennium until the beginning of the 20th century. In addition, a study is presented about the capability of state-of-the-art handwritten text recognition systems, when applied to the presented database. Baseline results are reported for reference in future studies.
	Address
	Corporate Author				Thesis
	Publisher	Elsevier Science Inc. New York, NY, USA	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	0031-3203	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.045; 602.006; 605.203			Approved	no
	Call Number	Admin @ si @ RFS2013			Serial	2298
Permanent link to this record



	Author	Albert Gordo; Florent Perronnin; Yunchao Gong; Svetlana Lazebnik
	Title	Asymmetric Distances for Binary Embeddings			Type	Journal Article
	Year	2014	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	36	Issue	1	Pages	33-47
	Keywords
	Abstract	In large-scale query-by-example retrieval, embedding image signatures in a binary space offers two benefits: data compression and search efficiency. While most embedding algorithms binarize both query and database signatures, it has been noted that this is not strictly a requirement. Indeed, asymmetric schemes which binarize the database signatures but not the query still enjoy the same two benefits but may provide superior accuracy. In this work, we propose two general asymmetric distances which are applicable to a wide variety of embedding techniques including Locality Sensitive Hashing (LSH), Locality Sensitive Binary Codes (LSBC), Spectral Hashing (SH), PCA Embedding (PCAE), PCA Embedding with random rotations (PCAE-RR), and PCA Embedding with iterative quantization (PCAE-ITQ). We experiment on four public benchmarks containing up to 1M images and show that the proposed asymmetric distances consistently lead to large improvements over the symmetric Hamming distance for all binary embedding techniques.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	0162-8828	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.045; 605.203; 600.077			Approved	no
	Call Number	Admin @ si @ GPG2014			Serial	2272
Permanent link to this record



	Author	Jon Almazan; Albert Gordo; Alicia Fornes; Ernest Valveny
	Title	Word Spotting and Recognition with Embedded Attributes			Type	Journal Article
	Year	2014	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	36	Issue	12	Pages	2552 - 2566
	Keywords
	Abstract	This article addresses the problems of word spotting and word recognition on images. In word spotting, the goal is to find all instances of a query word in a dataset of images. In recognition, the goal is to recognize the content of the word image, usually aided by a dictionary or lexicon. We describe an approach in which both word images and text strings are embedded in a common vectorial subspace. This is achieved by a combination of label embedding and attributes learning, and a common subspace regression. In this subspace, images and strings that represent the same word are close together, allowing one to cast recognition and retrieval tasks as a nearest neighbor problem. Contrary to most other existing methods, our representation has a fixed length, is low dimensional, and is very fast to compute and, especially, to compare. We test our approach on four public datasets of both handwritten documents and natural images showing results comparable or better than the state-of-the-art on spotting and recognition tasks.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	0162-8828	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.056; 600.045; 600.061; 602.006; 600.077			Approved	no
	Call Number	Admin @ si @ AGF2014a			Serial	2483
Permanent link to this record



	Author	Antonio Clavelli; Dimosthenis Karatzas; Josep Llados; Mario Ferraro; Giuseppe Boccignone
	Title	Modelling task-dependent eye guidance to objects in pictures			Type	Journal Article
	Year	2014	Publication	Cognitive Computation	Abbreviated Journal	CoCom
	Volume	6	Issue	3	Pages	558-584
	Keywords	Visual attention; Gaze guidance; Value; Payoff; Stochastic fixation prediction
	Abstract	5Y Impact Factor: 1.14 / 3rd (Computer Science, Artificial Intelligence) We introduce a model of attentional eye guidance based on the rationale that the deployment of gaze is to be considered in the context of a general action-perception loop relying on two strictly intertwined processes: sensory processing, depending on current gaze position, identifies sources of information that are most valuable under the given task; motor processing links such information with the oculomotor act by sampling the next gaze position and thus performing the gaze shift. In such a framework, the choice of where to look next is task-dependent and oriented to classes of objects embedded within pictures of complex scenes. The dependence on task is taken into account by exploiting the value and the payoff of gazing at certain image patches or proto-objects that provide a sparse representation of the scene objects. The different levels of the action-perception loop are represented in probabilistic form and eventually give rise to a stochastic process that generates the gaze sequence. This way the model also accounts for statistical properties of gaze shifts such as individual scan path variability. Results of the simulations are compared either with experimental data derived from publicly available datasets and from our own experiments.
	Address
	Corporate Author				Thesis
	Publisher	Springer US	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1866-9956	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.056; 600.045; 605.203; 601.212; 600.077			Approved	no
	Call Number	Admin @ si @ CKL2014			Serial	2419
Permanent link to this record



	Author	David Fernandez; Josep Llados; Alicia Fornes
	Title	A graph-based approach for segmenting touching lines in historical handwritten documents			Type	Journal Article
	Year	2014	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
	Volume	17	Issue	3	Pages	293-312
	Keywords	Text line segmentation; Handwritten documents; Document image processing; Historical document analysis
	Abstract	Text line segmentation in handwritten documents is an important task in the recognition of historical documents. Handwritten document images contain text lines with multiple orientations, touching and overlapping characters between consecutive text lines and different document structures, making line segmentation a difficult task. In this paper, we present a new approach for handwritten text line segmentation solving the problems of touching components, curvilinear text lines and horizontally overlapping components. The proposed algorithm formulates line segmentation as finding the central path in the area between two consecutive lines. This is solved as a graph traversal problem. A graph is constructed using the skeleton of the image. Then, a path-finding algorithm is used to find the optimum path between text lines. The proposed algorithm has been evaluated on a comprehensive dataset consisting of five databases: ICDAR2009, ICDAR2013, UMD, the George Washington and the Barcelona Marriages Database. The proposed method outperforms the state-of-the-art considering the different types and difficulties of the benchmarking data.
	Address
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1433-2833	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.056; 600.061; 602.006; 600.077			Approved	no
	Call Number	Admin @ si @ FLF2014			Serial	2459
Permanent link to this record



	Author	Christophe Rigaud; Clement Guerin; Dimosthenis Karatzas; Jean-Christophe Burie; Jean-Marc Ogier
	Title	Knowledge-driven understanding of images in comic books			Type	Journal Article
	Year	2015	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
	Volume	18	Issue	3	Pages	199-221
	Keywords	Document Understanding; comics analysis; expert system
	Abstract	Document analysis is an active field of research, which can attain a complete understanding of the semantics of a given document. One example of the document understanding process is enabling a computer to identify the key elements of a comic book story and arrange them according to a predefined domain knowledge. In this study, we propose a knowledge-driven system that can interact with bottom-up and top-down information to progressively understand the content of a document. We model the comic book’s and the image processing domains knowledge for information consistency analysis. In addition, different image processing methods are improved or developed to extract panels, balloons, tails, texts, comic characters and their semantic relations in an unsupervised way.
	Address
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1433-2833	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.056; 600.077			Approved	no
	Call Number	RGK2015			Serial	2595
Permanent link to this record



	Author	Lluis Gomez; Dimosthenis Karatzas
	Title	A fast hierarchical method for multi‐script and arbitrary oriented scene text extraction			Type	Journal Article
	Year	2016	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
	Volume	19	Issue	4	Pages	335-349
	Keywords	scene text; segmentation; detection; hierarchical grouping; perceptual organisation
	Abstract	Typography and layout lead to the hierarchical organisation of text in words, text lines, paragraphs. This inherent structure is a key property of text in any script and language, which has nonetheless been minimally leveraged by existing text detection methods. This paper addresses the problem of text segmentation in natural scenes from a hierarchical perspective. Contrary to existing methods, we make explicit use of text structure, aiming directly to the detection of region groupings corresponding to text within a hierarchy produced by an agglomerative similarity clustering process over individual regions. We propose an optimal way to construct such an hierarchy introducing a feature space designed to produce text group hypotheses with high recall and a novel stopping rule combining a discriminative classifier and a probabilistic measure of group meaningfulness based in perceptual organization. Results obtained over four standard datasets, covering text in variable orientations and different languages, demonstrate that our algorithm, while being trained in a single mixed dataset, outperforms state of the art methods in unconstrained scenarios.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.056; 601.197			Approved	no
	Call Number	Admin @ si @ GoK2016a			Serial	2862
Permanent link to this record



	Author	Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades
	Title	Sparse representation over learned dictionary for symbol recognition			Type	Journal Article
	Year	2016	Publication	Signal Processing	Abbreviated Journal	SP
	Volume	125	Issue		Pages	36-47
	Keywords	Symbol Recognition; Sparse Representation; Learned Dictionary; Shape Context; Interest Points
	Abstract	In this paper we propose an original sparse vector model for symbol retrieval task. More specically, we apply the K-SVD algorithm for learning a visual dictionary based on symbol descriptors locally computed around interest points. Results on benchmark datasets show that the obtained sparse representation is competitive related to state-of-the-art methods. Moreover, our sparse representation is invariant to rotation and scale transforms and also robust to degraded images and distorted symbols. Thereby, the learned visual dictionary is able to represent instances of unseen classes of symbols.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.061; 600.077			Approved	no
	Call Number	Admin @ si @ DTR2016			Serial	2946
Permanent link to this record



	Author	Palaiahnakote Shivakumara; Anjan Dutta; Chew Lim Tan; Umapada Pal
	Title	Multi-oriented scene text detection in video based on wavelet and angle projection boundary growing			Type	Journal Article
	Year	2014	Publication	Multimedia Tools and Applications	Abbreviated Journal	MTAP
	Volume	72	Issue	1	Pages	515-539
	Keywords
	Abstract	In this paper, we address two complex issues: 1) Text frame classification and 2) Multi-oriented text detection in video text frame. We first divide a video frame into 16 blocks and propose a combination of wavelet and median-moments with k-means clustering at the block level to identify probable text blocks. For each probable text block, the method applies the same combination of feature with k-means clustering over a sliding window running through the blocks to identify potential text candidates. We introduce a new idea of symmetry on text candidates in each block based on the observation that pixel distribution in text exhibits a symmetric pattern. The method integrates all blocks containing text candidates in the frame and then all text candidates are mapped on to a Sobel edge map of the original frame to obtain text representatives. To tackle the multi-orientation problem, we present a new method called Angle Projection Boundary Growing (APBG) which is an iterative algorithm and works based on a nearest neighbor concept. APBG is then applied on the text representatives to fix the bounding box for multi-oriented text lines in the video frame. Directional information is used to eliminate false positives. Experimental results on a variety of datasets such as non-horizontal, horizontal, publicly available data (Hua’s data) and ICDAR-03 competition data (camera images) show that the proposed method outperforms existing methods proposed for video and the state of the art methods for scene text as well.
	Address
	Corporate Author				Thesis
	Publisher	Springer US	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1380-7501	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.077			Approved	no
	Call Number	Admin @ si @ SDT2014			Serial	2357
Permanent link to this record



	Author	Marçal Rusiñol; Lluis Pere de las Heras; Oriol Ramos Terrades
	Title	Flowchart Recognition for Non-Textual Information Retrieval in Patent Search			Type	Journal Article
	Year	2014	Publication	Information Retrieval	Abbreviated Journal	IR
	Volume	17	Issue	5-6	Pages	545-562
	Keywords	Flowchart recognition; Patent documents; Text/graphics separation; Raster-to-vector conversion; Symbol recognition
	Abstract	Relatively little research has been done on the topic of patent image retrieval and in general in most of the approaches the retrieval is performed in terms of a similarity measure between the query image and the images in the corpus. However, systems aimed at overcoming the semantic gap between the visual description of patent images and their conveyed concepts would be very helpful for patent professionals. In this paper we present a flowchart recognition method aimed at achieving a structured representation of flowchart images that can be further queried semantically. The proposed method was submitted to the CLEF-IP 2012 flowchart recognition task. We report the obtained results on this dataset.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1386-4564	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.077			Approved	no
	Call Number	Admin @ si @ RHR2013			Serial	2342
Permanent link to this record



	Author	T.Chauhan; E.Perales; Kaida Xiao; E.Hird ; Dimosthenis Karatzas; Sophie Wuerger
	Title	The achromatic locus: Effect of navigation direction in color space			Type	Journal Article
	Year	2014	Publication	Journal of Vision	Abbreviated Journal	VSS
	Volume	14 (1)	Issue	25	Pages	1-11
	Keywords	achromatic; unique hues; color constancy; luminance; color space
	Abstract	5Y Impact Factor: 2.99 / 1st (Ophthalmology) An achromatic stimulus is defined as a patch of light that is devoid of any hue. This is usually achieved by asking observers to adjust the stimulus such that it looks neither red nor green and at the same time neither yellow nor blue. Despite the theoretical and practical importance of the achromatic locus, little is known about the variability in these settings. The main purpose of the current study was to evaluate whether achromatic settings were dependent on the task of the observers, namely the navigation direction in color space. Observers could either adjust the test patch along the two chromatic axes in the CIE uv diagram or, alternatively, navigate along the unique-hue lines. Our main result is that the navigation method affects the reliability of these achromatic settings. Observers are able to make more reliable achromatic settings when adjusting the test patch along the directions defined by the four unique hues as opposed to navigating along the main axes in the commonly used CIE uv chromaticity plane. This result holds across different ambient viewing conditions (Dark, Daylight, Cool White Fluorescent) and different test luminance levels (5, 20, and 50 cd/m2). The reduced variability in the achromatic settings is consistent with the idea that internal color representations are more aligned with the unique-hue lines than the u* and v* axes.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.077			Approved	no
	Call Number	Admin @ si @ CPX2014			Serial	2418
Permanent link to this record



	Author	Volkmar Frinken; Andreas Fischer; Markus Baumgartner; Horst Bunke
	Title	Keyword spotting for self-training of BLSTM NN based handwriting recognition systems			Type	Journal Article
	Year	2014	Publication	Pattern Recognition	Abbreviated Journal	PR
	Volume	47	Issue	3	Pages	1073-1082
	Keywords	Document retrieval; Keyword spotting; Handwriting recognition; Neural networks; Semi-supervised learning
	Abstract	The automatic transcription of unconstrained continuous handwritten text requires well trained recognition systems. The semi-supervised paradigm introduces the concept of not only using labeled data but also unlabeled data in the learning process. Unlabeled data can be gathered at little or not cost. Hence it has the potential to reduce the need for labeling training data, a tedious and costly process. Given a weak initial recognizer trained on labeled data, self-training can be used to recognize unlabeled data and add words that were recognized with high confidence to the training set for re-training. This process is not trivial and requires great care as far as selecting the elements that are to be added to the training set is concerned. In this paper, we propose to use a bidirectional long short-term memory neural network handwritten recognition system for keyword spotting in order to select new elements. A set of experiments shows the high potential of self-training for bootstrapping handwriting recognition systems, both for modern and historical handwritings, and demonstrate the benefits of using keyword spotting over previously published self-training schemes.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.077; 602.101			Approved	no
	Call Number	Admin @ si @ FFB2014			Serial	2297
Permanent link to this record



	Author	Lluis Gomez; Ali Furkan Biten; Ruben Tito; Andres Mafla; Marçal Rusiñol; Ernest Valveny; Dimosthenis Karatzas
	Title	Multimodal grid features and cell pointers for scene text visual question answering			Type	Journal Article
	Year	2021	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
	Volume	150	Issue		Pages	242-249
	Keywords
	Abstract	This paper presents a new model for the task of scene text visual question answering. In this task questions about a given image can only be answered by reading and understanding scene text. Current state of the art models for this task make use of a dual attention mechanism in which one attention module attends to visual features while the other attends to textual features. A possible issue with this is that it makes difficult for the model to reason jointly about both modalities. To fix this problem we propose a new model that is based on an single attention mechanism that attends to multi-modal features conditioned to the question. The output weights of this attention module over a grid of multi-modal spatial features are interpreted as the probability that a certain spatial location of the image contains the answer text to the given question. Our experiments demonstrate competitive performance in two standard datasets with a model that is faster than previous methods at inference time. Furthermore, we also provide a novel analysis of the ST-VQA dataset based on a human performance study. Supplementary material, code, and data is made available through this link.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.084; 600.121			Approved	no
	Call Number	Admin @ si @ GBT2021			Serial	3620
Permanent link to this record



	Author	Lluis Gomez; Anguelos Nicolaou; Dimosthenis Karatzas
	Title	Improving patch‐based scene text script identification with ensembles of conjoined networks			Type	Journal Article
	Year	2017	Publication	Pattern Recognition	Abbreviated Journal	PR
	Volume	67	Issue		Pages	85-96
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.084; 600.121; 600.129			Approved	no
	Call Number	Admin @ si @ GNK2017			Serial	2887
Permanent link to this record



	Author	Dena Bazazian; Raul Gomez; Anguelos Nicolaou; Lluis Gomez; Dimosthenis Karatzas; Andrew Bagdanov
	Title	Fast: Facilitated and accurate scene text proposals through fcn guided pruning			Type	Journal Article
	Year	2019	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
	Volume	119	Issue		Pages	112-120
	Keywords
	Abstract	Class-specific text proposal algorithms can efficiently reduce the search space for possible text object locations in an image. In this paper we combine the Text Proposals algorithm with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same recall level and thus gaining a significant speed up. Our experiments demonstrate that such text proposal approaches yield significantly higher recall rates than state-of-the-art text localization techniques, while also producing better-quality localizations. Our results on the ICDAR 2015 Robust Reading Competition (Challenge 4) and the COCO-text datasets show that, when combined with strong word classifiers, this recall margin leads to state-of-the-art results in end-to-end scene text recognition.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.084; 600.121; 600.129			Approved	no
	Call Number	Admin @ si @ BGN2019			Serial	3342
Permanent link to this record