Publicacions CVC -- Query Results

	Publicacions CVC Home \| Show All \| Simple Search \| Advanced Search \| Add Record \| Import	Login Quick Search: Field: contains: ...
	1891–1905 of 3413 records found matching your query (RSS \| history):

Search & Display Options

Select All Deselect All

[111–120] << 121 122 123 124 125 126 127 128 129 130 >> [131–140]

List View

Citations

Details

	Records
	Author	Alicia Fornes; Gemma Sanchez
	Title	Analysis and Recognition of Music Scores			Type	Book Chapter
	Year	2014	Publication	Handbook of Document Image Processing and Recognition	Abbreviated Journal
	Volume	E	Issue		Pages	749-774
	Keywords
	Abstract	The analysis and recognition of music scores has attracted the interest of researchers for decades. Optical Music Recognition (OMR) is a classical research field of Document Image Analysis and Recognition (DIAR), whose aim is to extract information from music scores. Music scores contain both graphical and textual information, and for this reason, techniques are closely related to graphics recognition and text recognition. Since music scores use a particular diagrammatic notation that follow the rules of music theory, many approaches make use of context information to guide the recognition and solve ambiguities. This chapter overviews the main Optical Music Recognition (OMR) approaches. Firstly, the different methods are grouped according to the OMR stages, namely, staff removal, music symbol recognition, and syntactical analysis. Secondly, specific approaches for old and handwritten music scores are reviewed. Finally, online approaches and commercial systems are also commented.
	Address
	Corporate Author				Thesis
	Publisher	Springer London	Place of Publication		Editor	D. Doermann; K. Tombre
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-0-85729-860-7	Medium
	Area		Expedition		Conference
	Notes	DAG; ADAS; 600.076; 600.077			Approved	no
	Call Number	Admin @ si @ FoS2014			Serial	2484
Permanent link to this record



	Author	Lluis Pere de las Heras; Ernest Valveny; Gemma Sanchez
	Title	Unsupervised and Notation-Independent Wall Segmentation in Floor Plans Using a Combination of Statistical and Structural Strategies			Type	Book Chapter
	Year	2014	Publication	Graphics Recognition. Current Trends and Challenges	Abbreviated Journal
	Volume	8746	Issue		Pages	109-121
	Keywords	Graphics recognition; Floor plan analysis; Object segmentation
	Abstract	In this paper we present a wall segmentation approach in floor plans that is able to work independently to the graphical notation, does not need any pre-annotated data for learning, and is able to segment multiple-shaped walls such as beams and curved-walls. This method results from the combination of the wall segmentation approaches [3, 5] presented recently by the authors. Firstly, potential straight wall segments are extracted in an unsupervised way similar to [3], but restricting even more the wall candidates considered in the original approach. Then, based on [5], these segments are used to learn the texture pattern of walls and spot the lost instances. The presented combination of both methods has been tested on 4 available datasets with different notations and compared qualitatively and quantitatively to the state-of-the-art applied on these collections. Additionally, some qualitative results on floor plans directly downloaded from the Internet are reported in the paper. The overall performance of the method demonstrates either its adaptability to different wall notations and shapes, and to document qualities and resolutions.
	Address
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN	0302-9743	ISBN	978-3-662-44853-3	Medium
	Area		Expedition		Conference
	Notes	DAG; ADAS; 600.076; 600.077			Approved	no
	Call Number	Admin @ si @ HVS2014			Serial	2535
Permanent link to this record



	Author	Lluis Pere de las Heras; Oriol Ramos Terrades; Sergi Robles; Gemma Sanchez
	Title	CVC-FP and SGT: a new database for structural floor plan analysis and its groundtruthing tool			Type	Journal Article
	Year	2015	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
	Volume	18	Issue	1	Pages	15-30
	Keywords
	Abstract	Recent results on structured learning methods have shown the impact of structural information in a wide range of pattern recognition tasks. In the field of document image analysis, there is a long experience on structural methods for the analysis and information extraction of multiple types of documents. Yet, the lack of conveniently annotated and free access databases has not benefited the progress in some areas such as technical drawing understanding. In this paper, we present a floor plan database, named CVC-FP, that is annotated for the architectural objects and their structural relations. To construct this database, we have implemented a groundtruthing tool, the SGT tool, that allows to make specific this sort of information in a natural manner. This tool has been made for general purpose groundtruthing: It allows to define own object classes and properties, multiple labeling options are possible, grants the cooperative work, and provides user and version control. We finally have collected some of the recent work on floor plan interpretation and present a quantitative benchmark for this database. Both CVC-FP database and the SGT tool are freely released to the research community to ease comparisons between methods and boost reproducible research.
	Address
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1433-2833	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; ADAS; 600.061; 600.076; 600.077			Approved	no
	Call Number	Admin @ si @ HRR2015			Serial	2567
Permanent link to this record



	Author	David Aldavert; Marçal Rusiñol; Ricardo Toledo; Josep Llados
	Title	A Study of Bag-of-Visual-Words Representations for Handwritten Keyword Spotting			Type	Journal Article
	Year	2015	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
	Volume	18	Issue	3	Pages	223-234
	Keywords	Bag-of-Visual-Words; Keyword spotting; Handwritten documents; Performance evaluation
	Abstract	The Bag-of-Visual-Words (BoVW) framework has gained popularity among the document image analysis community, specifically as a representation of handwritten words for recognition or spotting purposes. Although in the computer vision field the BoVW method has been greatly improved, most of the approaches in the document image analysis domain still rely on the basic implementation of the BoVW method disregarding such latest refinements. In this paper, we present a review of those improvements and its application to the keyword spotting task. We thoroughly evaluate their impact against a baseline system in the well-known George Washington dataset and compare the obtained results against nine state-of-the-art keyword spotting methods. In addition, we also compare both the baseline and improved systems with the methods presented at the Handwritten Keyword Spotting Competition 2014.
	Address
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1433-2833	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; ADAS; 600.055; 600.061; 601.223; 600.077; 600.097			Approved	no
	Call Number	Admin @ si @ ART2015			Serial	2679
Permanent link to this record



	Author	Lluis Pere de las Heras; David Fernandez; Alicia Fornes; Ernest Valveny; Gemma Sanchez; Josep Llados
	Title	Runlength Histogram Image Signature for Perceptual Retrieval of Architectural Floor Plans			Type	Book Chapter
	Year	2014	Publication	Graphics Recognition. Current Trends and Challenges	Abbreviated Journal
	Volume	8746	Issue		Pages	135-146
	Keywords	Graphics recognition; Graphics retrieval; Image classification
	Abstract	This paper proposes a runlength histogram signature as a perceptual descriptor of architectural plans in a retrieval scenario. The style of an architectural drawing is characterized by the perception of lines, shapes and texture. Such visual stimuli are the basis for defining semantic concepts as space properties, symmetry, density, etc. We propose runlength histograms extracted in vertical, horizontal and diagonal directions as a characterization of line and space properties in floorplans, so it can be roughly associated to a description of walls and room structure. A retrieval application illustrates the performance of the proposed approach, where given a plan as a query, similar ones are obtained from a database. A ground truth based on human observation has been constructed to validate the hypothesis. Additional retrieval results on sketched building’s facades are reported qualitatively in this paper. Its good description and its adaptability to two different sketch drawings despite its simplicity shows the interest of the proposed approach and opens a challenging research line in graphics recognition.
	Address
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN	0302-9743	ISBN	978-3-662-44853-3	Medium
	Area		Expedition		Conference
	Notes	DAG; ADAS; 600.045; 600.056; 600.061; 600.076; 600.077			Approved	no
	Call Number	Admin @ si @ HFF2014			Serial	2536
Permanent link to this record



	Author	David Aldavert; Marçal Rusiñol; Ricardo Toledo; Josep Llados
	Title	Integrating Visual and Textual Cues for Query-by-String Word Spotting			Type	Conference Article
	Year	2013	Publication	12th International Conference on Document Analysis and Recognition	Abbreviated Journal
	Volume		Issue		Pages	511 - 515
	Keywords
	Abstract	In this paper, we present a word spotting framework that follows the query-by-string paradigm where word images are represented both by textual and visual representations. The textual representation is formulated in terms of character $n$-grams while the visual one is based on the bag-of-visual-words scheme. These two representations are merged together and projected to a sub-vector space. This transform allows to, given a textual query, retrieve word instances that were only represented by the visual modality. Moreover, this statistical representation can be used together with state-of-the-art indexation structures in order to deal with large-scale scenarios. The proposed method is evaluated using a collection of historical documents outperforming state-of-the-art performances.
	Address	Washington; USA; August 2013
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1520-5363	ISBN		Medium
	Area		Expedition		Conference	ICDAR
	Notes	DAG; ADAS; 600.045; 600.055; 600.061			Approved	no
	Call Number	Admin @ si @ ART2013			Serial	2224
Permanent link to this record



	Author	Josep Brugues Pujolras; Lluis Gomez; Dimosthenis Karatzas
	Title	A Multilingual Approach to Scene Text Visual Question Answering			Type	Conference Article
	Year	2022	Publication	Document Analysis Systems.15th IAPR International Workshop, (DAS2022)	Abbreviated Journal
	Volume		Issue		Pages	65-79
	Keywords	Scene text; Visual question answering; Multilingual word embeddings; Vision and language; Deep learning
	Abstract	Scene Text Visual Question Answering (ST-VQA) has recently emerged as a hot research topic in Computer Vision. Current ST-VQA models have a big potential for many types of applications but lack the ability to perform well on more than one language at a time due to the lack of multilingual data, as well as the use of monolingual word embeddings for training. In this work, we explore the possibility to obtain bilingual and multilingual VQA models. In that regard, we use an already established VQA model that uses monolingual word embeddings as part of its pipeline and substitute them by FastText and BPEmb multilingual word embeddings that have been aligned to English. Our experiments demonstrate that it is possible to obtain bilingual and multilingual VQA models with a minimal loss in performance in languages not used during training, as well as a multilingual model trained in multiple languages that match the performance of the respective monolingual baselines.
	Address	La Rochelle, France; May 22–25, 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	DAS
	Notes	DAG; 611.004; 600.155; 601.002			Approved	no
	Call Number	Admin @ si @ BGK2022b			Serial	3695
Permanent link to this record



	Author	Francisco Alvaro; Francisco Cruz; Joan Andreu Sanchez; Oriol Ramos Terrades; Jose Miguel Bemedi
	Title	Page Segmentation of Structured Documents Using 2D Stochastic Context-Free Grammars			Type	Conference Article
	Year	2013	Publication	6th Iberian Conference on Pattern Recognition and Image Analysis	Abbreviated Journal
	Volume	7887	Issue		Pages	133-140
	Keywords
	Abstract	In this paper we define a bidimensional extension of Stochastic Context-Free Grammars for page segmentation of structured documents. Two sets of text classification features are used to perform an initial classification of each zone of the page. Then, the page segmentation is obtained as the most likely hypothesis according to a grammar. This approach is compared to Conditional Random Fields and results show significant improvements in several cases. Furthermore, grammars provide a detailed segmentation that allowed a semantic evaluation which also validates this model.
	Address	Madeira; Portugal; June 2013
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN	0302-9743	ISBN	978-3-642-38627-5	Medium
	Area		Expedition		Conference	IbPRIA
	Notes	DAG; 605.203			Approved	no
	Call Number	Admin @ si @ ACS2013			Serial	2328
Permanent link to this record



	Author	Joan Mas; Alicia Fornes; Josep Llados
	Title	An Interactive Transcription System of Census Records using Word-Spotting based Information Transfer			Type	Conference Article
	Year	2016	Publication	12th IAPR Workshop on Document Analysis Systems	Abbreviated Journal
	Volume		Issue		Pages	54-59
	Keywords
	Abstract	This paper presents a system to assist in the transcription of historical handwritten census records in a crowdsourcing platform. Census records have a tabular structured layout. They consist in a sequence of rows with information of homes ordered by street address. For each household snippet in the page, the list of family members is reported. The censuses are recorded in intervals of a few years and the information of individuals in each household is quite stable from a point in time to the next one. This redundancy is used to assist the transcriber, so the redundant information is transferred from the census already transcribed to the next one. Household records are aligned from one year to the next one using the knowledge of the ordering by street address. Given an already transcribed census, a query by string word spotting is applied. Thus, names from the census in time t are used as queries in the corresponding home record in time t+1. Since the search is constrained, the obtained precision-recall values are very high, with an important reduction in the transcription time. The proposed system has been tested in a real citizen-science experience where non expert users transcribe the census data of their home town.
	Address	Santorini; Greece; April 2016
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	DAS
	Notes	DAG; 603.053; 602.006; 600.061; 600.077; 600.097			Approved	no
	Call Number	Admin @ si @ MFL2016			Serial	2751
Permanent link to this record



	Author	Jialuo Chen; Mohamed Ali Souibgui; Alicia Fornes; Beata Megyesi
	Title	Unsupervised Alphabet Matching in Historical Encrypted Manuscript Images			Type	Conference Article
	Year	2021	Publication	4th International Conference on Historical Cryptology	Abbreviated Journal
	Volume		Issue		Pages	34-37
	Keywords
	Abstract	Historical ciphers contain a wide range ofsymbols from various symbol sets. Iden-tifying the cipher alphabet is a prerequi-site before decryption can take place andis a time-consuming process. In this workwe explore the use of image processing foridentifying the underlying alphabet in ci-pher images, and to compare alphabets be-tween ciphers. The experiments show thatciphers with similar alphabets can be suc-cessfully discovered through clustering.
	Address	Virtual; September 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	HistoCrypt
	Notes	DAG; 602.230; 600.140; 600.121			Approved	no
	Call Number	Admin @ si @ CSF2021			Serial	3617
Permanent link to this record



	Author	Pau Torras; Mohamed Ali Souibgui; Jialuo Chen; Alicia Fornes
	Title	A Transcription Is All You Need: Learning to Align through Attention			Type	Conference Article
	Year	2021	Publication	14th IAPR International Workshop on Graphics Recognition	Abbreviated Journal
	Volume	12916	Issue		Pages	141–146
	Keywords
	Abstract	Historical ciphered manuscripts are a type of document where graphical symbols are used to encrypt their content instead of regular text. Nowadays, expert transcriptions can be found in libraries alongside the corresponding manuscript images. However, those transcriptions are not aligned, so these are barely usable for training deep learning-based recognition methods. To solve this issue, we propose a method to align each symbol in the transcript of an image with its visual representation by using an attention-based Sequence to Sequence (Seq2Seq) model. The core idea is that, by learning to recognise symbols sequence within a cipher line image, the model also identifies their position implicitly through an attention mechanism. Thus, the resulting symbol segmentation can be later used for training algorithms. The experimental evaluation shows that this method is promising, especially taking into account the small size of the cipher dataset.
	Address	Virtual; September 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	GREC
	Notes	DAG; 602.230; 600.140; 600.121			Approved	no
	Call Number	Admin @ si @ TSC2021			Serial	3619
Permanent link to this record



	Author	Mohamed Ali Souibgui; Ali Furkan Biten; Sounak Dey; Alicia Fornes; Yousri Kessentini; Lluis Gomez; Dimosthenis Karatzas; Josep Llados
	Title	One-shot Compositional Data Generation for Low Resource Handwritten Text Recognition			Type	Conference Article
	Year	2022	Publication	Winter Conference on Applications of Computer Vision	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	Document Analysis
	Abstract	Low resource Handwritten Text Recognition (HTR) is a hard problem due to the scarce annotated data and the very limited linguistic information (dictionaries and language models). This appears, for example, in the case of historical ciphered manuscripts, which are usually written with invented alphabets to hide the content. Thus, in this paper we address this problem through a data generation technique based on Bayesian Program Learning (BPL). Contrary to traditional generation approaches, which require a huge amount of annotated images, our method is able to generate human-like handwriting using only one sample of each symbol from the desired alphabet. After generating symbols, we create synthetic lines to train state-of-the-art HTR architectures in a segmentation free fashion. Quantitative and qualitative analyses were carried out and confirm the effectiveness of the proposed method, achieving competitive results compared to the usage of real annotated data.
	Address	Virtual; January 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	DAG; 602.230; 600.140			Approved	no
	Call Number	Admin @ si @ SBD2022			Serial	3615
Permanent link to this record



	Author	Mohamed Ali Souibgui; Y.Kessentini
	Title	DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement			Type	Journal Article
	Year	2022	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	44	Issue	3	Pages	1180-1191
	Keywords
	Abstract	Documents often exhibit various forms of degradation, which make it hard to be read and substantially deteriorate the performance of an OCR system. In this paper, we propose an effective end-to-end framework named Document Enhancement Generative Adversarial Networks (DE-GAN) that uses the conditional GANs (cGANs) to restore severely degraded document images. To the best of our knowledge, this practice has not been studied within the context of generative adversarial deep networks. We demonstrate that, in different tasks (document clean up, binarization, deblurring and watermark removal), DE-GAN can produce an enhanced version of the degraded document with a high quality. In addition, our approach provides consistent improvements compared to state-of-the-art methods over the widely used DIBCO 2013, DIBCO 2017 and H-DIBCO 2018 datasets, proving its ability to restore a degraded document image to its ideal condition. The obtained results on a wide variety of degradation reveal the flexibility of the proposed model to be exploited in other document enhancement problems.
	Address	1 March 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 602.230; 600.121; 600.140			Approved	no
	Call Number	Admin @ si @ SoK2022			Serial	3454
Permanent link to this record



	Author	Anjan Dutta; Hichem Sahbi
	Title	Stochastic Graphlet Embedding			Type	Journal Article
	Year	2018	Publication	IEEE Transactions on Neural Networks and Learning Systems	Abbreviated Journal	TNNLS
	Volume		Issue		Pages	1-14
	Keywords	Stochastic graphlets; Graph embedding; Graph classification; Graph hashing; Betweenness centrality
	Abstract	Graph-based methods are known to be successful in many machine learning and pattern classification tasks. These methods consider semi-structured data as graphs where nodes correspond to primitives (parts, interest points, segments, etc.) and edges characterize the relationships between these primitives. However, these non-vectorial graph data cannot be straightforwardly plugged into off-the-shelf machine learning algorithms without a preliminary step of – explicit/implicit –graph vectorization and embedding. This embedding process should be resilient to intra-class graph variations while being highly discriminant. In this paper, we propose a novel high-order stochastic graphlet embedding (SGE) that maps graphs into vector spaces. Our main contribution includes a new stochastic search procedure that efficiently parses a given graph and extracts/samples unlimitedly high-order graphlets. We consider these graphlets, with increasing orders, to model local primitives as well as their increasingly complex interactions. In order to build our graph representation, we measure the distribution of these graphlets into a given graph, using particular hash functions that efficiently assign sampled graphlets into isomorphic sets with a very low probability of collision. When combined with maximum margin classifiers, these graphlet-based representations have positive impact on the performance of pattern comparison and recognition as corroborated through extensive experiments using standard benchmark databases.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 602.167; 602.168; 600.097; 600.121			Approved	no
	Call Number	Admin @ si @ DuS2018			Serial	3225
Permanent link to this record



	Author	Sounak Dey; Anjan Dutta; Suman Ghosh; Ernest Valveny; Josep Llados; Umapada Pal
	Title	Learning Cross-Modal Deep Embeddings for Multi-Object Image Retrieval using Text and Sketch			Type	Conference Article
	Year	2018	Publication	24th International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	916 - 921
	Keywords
	Abstract	In this work we introduce a cross modal image retrieval system that allows both text and sketch as input modalities for the query. A cross-modal deep network architecture is formulated to jointly model the sketch and text input modalities as well as the the image output modality, learning a common embedding between text and images and between sketches and images. In addition, an attention model is used to selectively focus the attention on the different objects of the image, allowing for retrieval with multiple objects in the query. Experiments show that the proposed method performs the best in both single and multiple object image retrieval in standard datasets.
	Address	Beijing; China; August 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	DAG; 602.167; 602.168; 600.097; 600.084; 600.121; 600.129			Approved	no
	Call Number	Admin @ si @ DDG2018b			Serial	3152
Permanent link to this record