Publicacions CVC -- Query Results

	Records
	Author	Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades
	Title	Sparse representation over learned dictionary for symbol recognition			Type	Journal Article
	Year	2016	Publication	Signal Processing	Abbreviated Journal	SP
	Volume	125	Issue		Pages	36-47
	Keywords	Symbol Recognition; Sparse Representation; Learned Dictionary; Shape Context; Interest Points
	Abstract	In this paper we propose an original sparse vector model for the symbol retrieval task. More specifically, we apply the K-SVD algorithm to learn a visual dictionary based on symbol descriptors locally computed around interest points. Results on benchmark datasets show that the obtained sparse representation is competitive with state-of-the-art methods. Moreover, our sparse representation is invariant to rotation and scale transforms and also robust to degraded images and distorted symbols. Thereby, the learned visual dictionary is able to represent instances of unseen classes of symbols.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.061; 600.077			Approved	no
	Call Number	Admin @ si @ DTR2016			Serial	2946



	Author	Pau Riba; Alicia Fornes; Josep Llados
	Title	Towards the Alignment of Handwritten Music Scores			Type	Book Chapter
	Year	2017	Publication	International Workshop on Graphics Recognition. GREC 2015. Graphic Recognition. Current Trends and Challenges	Abbreviated Journal
	Volume	9657	Issue		Pages	103-116
	Keywords	Optical Music Recognition; Handwritten Music Scores; Dynamic Time Warping alignment
	Abstract	It is very common to find different versions of the same music work in archives of Opera Theaters. These differences correspond to modifications and annotations from the musicians. From the musicologist's point of view, these variations are very interesting and deserve study. This paper explores the alignment of music scores as a tool for automatically detecting the passages that contain such differences. Given the difficulties in the recognition of handwritten music scores, our goal is to align the music scores and, at the same time, avoid the recognition of music elements as much as possible. After removing the staff lines, braces and ties, the bar lines are detected. Then, the bar units are described as a whole using the Blurred Shape Model. The bar unit alignment is performed using Dynamic Time Warping. The analysis of the alignment path is used to detect the variations in the music scores. The method has been evaluated on a subset of the CVC-MUSCIMA dataset, showing encouraging results.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor	Bart Lamiroy; R Dueire Lins
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-3-319-52158-9	Medium
	Area		Expedition		Conference
	Notes	DAG; 600.097; 602.006; 600.121			Approved	no
	Call Number	Admin @ si @ RFL2017			Serial	2955



	Author	Marçal Rusiñol; J. Chazalon; Katerine Diaz
	Title	Augmented Songbook: an Augmented Reality Educational Application for Raising Music Awareness			Type	Journal Article
	Year	2018	Publication	Multimedia Tools and Applications	Abbreviated Journal	MTAP
	Volume	77	Issue	11	Pages	13773-13798
	Keywords	Augmented reality; Document image matching; Educational applications
	Abstract	This paper presents the development of an Augmented Reality mobile application which aims to raise young children's awareness of abstract concepts of music. Such concepts are, for instance, musical notation or the idea of rhythm. Recent studies in Augmented Reality for education suggest that such technologies have multiple benefits for students, including younger ones. As mobile document image acquisition and processing gains maturity on mobile platforms, we explore how it is possible to build a markerless and real-time application to augment the physical documents with didactic animations and interactive virtual content. Given a standard image processing pipeline, we compare the performance of different local descriptors at two key stages of the process. Results suggest alternatives to the SIFT local descriptors, regarding result quality and computational efficiency, both for document model identification and perspective transform estimation. All experiments are performed on an original and public dataset we introduce here.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; ADAS; 600.084; 600.121; 600.118; 600.129			Approved	no
	Call Number	Admin @ si @ RCD2018			Serial	2996



	Author	Katerine Diaz; Jesus Martinez del Rincon; Aura Hernandez-Sabate; Marçal Rusiñol; Francesc J. Ferri
	Title	Fast Kernel Generalized Discriminative Common Vectors for Feature Extraction			Type	Journal Article
	Year	2018	Publication	Journal of Mathematical Imaging and Vision	Abbreviated Journal	JMIV
	Volume	60	Issue	4	Pages	512-524
	Keywords
	Abstract	This paper presents a supervised subspace learning method called Kernel Generalized Discriminative Common Vectors (KGDCV), as a novel extension of the known Discriminative Common Vectors method with Kernels. Our method combines the advantages of kernel methods to model complex data and solve nonlinear problems with moderate computational complexity, with the better generalization properties of generalized approaches for large dimensional data. This attractive combination makes KGDCV especially suited for feature extraction and classification in computer vision, image processing and pattern recognition applications. Two different approaches to this generalization are proposed: a first one based on the kernel trick (KT) and a second one based on the nonlinear projection trick (NPT) for even higher efficiency. Both methodologies have been validated on four different image datasets containing faces, objects and handwritten digits, and compared against well-known nonlinear state-of-the-art methods. Results show better discriminant properties than other generalized approaches, both linear and kernel. In addition, the KGDCV-NPT approach presents a considerable computational gain, without compromising the accuracy of the model.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; ADAS; 600.086; 600.130; 600.121; 600.118; 600.129; IAM			Approved	no
	Call Number	Admin @ si @ DMH2018a			Serial	3062



	Author	David Aldavert
	Title	Efficient and Scalable Handwritten Word Spotting on Historical Documents using Bag of Visual Words			Type	Book Whole
	Year	2021	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Word spotting can be defined as the pattern recognition task aimed at locating and retrieving a specific keyword within a document image collection without explicitly transcribing the whole corpus. Its use is particularly interesting when applied in scenarios where Optical Character Recognition performs poorly or cannot be used at all. This thesis focuses on such a scenario: word spotting on historical handwritten documents that have been written by a single author or by multiple authors with a similar calligraphy. This problem requires a visual signature that is robust to image artifacts, flexible to accommodate script variations and efficient to retrieve information in a rapid manner. For this, we have developed a set of word spotting methods that use the well-known Bag-of-Visual-Words (BoVW) representation as their foundation. This representation has gained popularity among the document image analysis community to characterize handwritten words in an unsupervised manner. However, most approaches in this field rely on a basic BoVW configuration and disregard complex encoding and spatial representations. We determine which BoVW configurations provide the best performance boost to a spotting system. Then, we extend the segmentation-based word spotting, where word candidates are given a priori, to segmentation-free spotting. The proposed approach seeds the document images with overlapping word location candidates and characterizes them with a BoVW signature. Retrieval is achieved by comparing the query and candidate signatures and returning the locations that provide a higher consensus. This is a simple but powerful approach that requires a more compact signature than in a segmentation-based scenario. We first project the BoVW signature into a reduced semantic topics space and then compress it further using Product Quantizers. The resulting signature only requires a few dozen bytes, allowing us to index thousands of pages on a common desktop computer. The final system still yields a performance comparable to the state of the art despite all the information loss during the compression phases. Afterwards, we also study how to combine different modalities of information in order to create a query-by-X spotting system where words are indexed using one information modality and queries are retrieved using another. We consider three different information modalities: visual, textual and audio. Our proposal is to create a latent feature space where features which are semantically related are projected onto the same topics, thus creating a new feature space where information from different modalities can be compared. Later, we consider the codebook generation and descriptor encoding problem. The codebooks used to encode the BoVW signatures are usually created using an unsupervised clustering algorithm, and multiple parameters must be tested to determine which configuration is best for a certain document collection. We propose a semantic clustering algorithm which allows the best parameter to be estimated from data. Since gathering annotated data is costly, we use synthetically generated word images. The resulting codebook is database agnostic, i.e. a codebook that yields a good performance on document collections that use the same script. We also propose the use of an additional codebook to approximate descriptors and reduce the descriptor encoding complexity to sub-linear. Finally, we focus on the problem of signature dimensionality. We propose a new symbol probability signature where each bin represents the probability that a certain symbol is present at a certain location of the word image. This signature is extremely compact and, combined with compression techniques, can represent word images with just a few bytes per signature.
	Address	April 2021
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Marçal Rusiñol; Josep Llados
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-122714-5-4	Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121; ADAS			Approved	no
	Call Number	Admin @ si @ Ald2021			Serial	3601



	Author	Sounak Dey; Anjan Dutta; Juan Ignacio Toledo; Suman Ghosh; Josep Llados; Umapada Pal
	Title	SigNet: Convolutional Siamese Network for Writer Independent Offline Signature Verification			Type	Miscellaneous
	Year	2018	Publication	arXiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Offline signature verification is one of the most challenging tasks in biometrics and document forensics. Unlike other verification problems, it needs to model minute but critical details between genuine and forged signatures, because a skilled falsification might often resemble the real signature with small deformations. This verification task is even harder in writer-independent scenarios, which is undeniably fiscal for realistic cases. In this paper, we model an offline writer-independent signature verification task with a convolutional Siamese network. Siamese networks are twin networks with shared weights, which can be trained to learn a feature space where similar observations are placed in proximity. This is achieved by exposing the network to pairs of similar and dissimilar observations and minimizing the Euclidean distance between similar pairs while simultaneously maximizing it between dissimilar pairs. Experiments conducted on cross-domain datasets emphasize the capability of our network to model forgery in different languages (scripts) and handwriting styles. Moreover, our designed Siamese network, named SigNet, exceeds the state-of-the-art results on most of the benchmark signature datasets, which paves the way for further research in this direction.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.097; 600.121			Approved	no
	Call Number	Admin @ si @ DDT2018			Serial	3085



	Author	Lluis Pere de las Heras; Oriol Ramos Terrades; Josep Llados
	Title	Ontology-Based Understanding of Architectural Drawings			Type	Book Chapter
	Year	2017	Publication	International Workshop on Graphics Recognition. GREC 2015. Graphic Recognition. Current Trends and Challenges	Abbreviated Journal
	Volume	9657	Issue		Pages	75-85
	Keywords	Graphics recognition; Floor plan analysis; Domain ontology
	Abstract	In this paper we present a knowledge base of architectural documents aiming at improving existing methods of floor plan classification and understanding. It consists of an ontological definition of the domain and the inclusion of real instances coming from both automatically interpreted and manually labeled documents. The knowledge base has proven to be an effective tool to structure our knowledge and to easily maintain and upgrade it. Moreover, it is an appropriate means to automatically check the consistency of relational data and a convenient complement to hard-coded knowledge interpretation systems.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ HRL2017			Serial	3086



	Author	Sangheeta Roy; Palaiahnakote Shivakumara; Namita Jain; Vijeta Khare; Anjan Dutta; Umapada Pal; Tong Lu
	Title	Rough-Fuzzy based Scene Categorization for Text Detection and Recognition in Video			Type	Journal Article
	Year	2018	Publication	Pattern Recognition	Abbreviated Journal	PR
	Volume	80	Issue		Pages	64-82
	Keywords	Rough set; Fuzzy set; Video categorization; Scene image classification; Video text detection; Video text recognition
	Abstract	Scene image or video understanding is a challenging task, especially when the number of video types increases drastically with high variations in background and foreground. This paper proposes a new method for categorizing scene videos into different classes, namely, Animation, Outlet, Sports, e-Learning, Medical, Weather, Defense, Economics, Animal Planet and Technology, for the performance improvement of text detection and recognition, which is an effective approach for scene image or video understanding. For this purpose, we first present a new combination of rough and fuzzy concepts to study irregular shapes of edge components in input scene videos, which helps to classify edge components into several groups. Next, the proposed method explores gradient direction information of each pixel in each edge component group to extract stroke-based features by dividing each group into several intra and inter planes. We further extract correlation and covariance features to encode semantic features located inside planes or between planes. Features of intra and inter planes of groups are then concatenated to get a feature matrix. Finally, the feature matrix is verified with temporal frames and fed to a neural network for categorization. Experimental results show that the proposed method outperforms the existing state-of-the-art methods; at the same time, the performance of text detection and recognition methods is also improved significantly due to categorization.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.097; 600.121			Approved	no
	Call Number	Admin @ si @ RSJ2018			Serial	3096



	Author	Thanh Nam Le; Muhammad Muzzamil Luqman; Anjan Dutta; Pierre Heroux; Christophe Rigaud; Clement Guerin; Pasquale Foggia; Jean Christophe Burie; Jean Marc Ogier; Josep Llados; Sebastien Adam
	Title	Subgraph spotting in graph representations of comic book images			Type	Journal Article
	Year	2018	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
	Volume	112	Issue		Pages	118-124
	Keywords	Attributed graph; Region adjacency graph; Graph matching; Graph isomorphism; Subgraph isomorphism; Subgraph spotting; Graph indexing; Graph retrieval; Query by example; Dataset and comic book images
	Abstract	Graph-based representations are the most powerful data structures for extracting, representing and preserving the structural information of underlying data. Subgraph spotting is an interesting research problem, especially for studying and investigating structure-based content-based image retrieval (CBIR) and query by example (QBE) in image databases. In this paper we address the lack of freely available ground-truthed datasets for subgraph spotting and present a new dataset for subgraph spotting in graph representations of comic book images (SSGCI) with its ground-truth and evaluation protocol. Experimental results of two state-of-the-art methods of subgraph spotting are presented on the new SSGCI dataset.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.097; 600.121			Approved	no
	Call Number	Admin @ si @ LLD2018			Serial	3150



	Author	Fernando Vilariño; Dimosthenis Karatzas; Alberto Valcarce
	Title	The Library Living Lab Barcelona: A participative approach to technology as an enabling factor for innovation in cultural spaces			Type	Journal
	Year	2018	Publication	Technology Innovation Management Review	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; MV; 600.097; 600.121; 600.129; SIAI			Approved	no
	Call Number	Admin @ si @ VKV2018a			Serial	3153