Publicacions CVC -- Query Results

[51–60] << 61 62 63 64 65 66 67 68 69 70 >> [71–74]

Details

	Records
	Author	Agnes Borras; Josep Llados
	Title	Object Image Retrieval by Shape Content in Complex Scenes Using Geometric Constraints			Type	Book Chapter
	Year	2005	Publication	Pattern Recognition And Image Analysis	Abbreviated Journal	LNCS
	Volume	3522	Issue		Pages	325–332
	Keywords
	Abstract	This paper presents an image retrieval system based on 2D shape information. Query shape objects and database images are repre- sented by polygonal approximations of their contours. Afterwards they are encoded, using geometric features, in terms of predefined structures. Shapes are then located in database images by a voting procedure on the spatial domain. Then an alignment matching provides a probability value to rank de database image in the retrieval result. The method al- lows to detect a query object in database images even when they contain complex scenes. Also the shape matching tolerates partial occlusions and affine transformations as translation, rotation or scaling.
	Address	Estoril (Portugal)
	Corporate Author				Thesis
	Publisher	Springer Link	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG;			Approved	no
	Call Number	DAG @ dag @ BoL2005; IAM @ iam @ BoL2005			Serial	556
Permanent link to this record



	Author	Lluis Gomez; Y. Patel; Marçal Rusiñol; C.V. Jawahar; Dimosthenis Karatzas
	Title	Self‐supervised learning of visual features through embedding images into text topic spaces			Type	Conference Article
	Year	2017	Publication	30th IEEE Conference on Computer Vision and Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	End-to-end training from scratch of current deep architectures for new computer vision problems would require Imagenet-scale datasets, and this is not always possible. In this paper we present a method that is able to take advantage of freely available multi-modal content to train computer vision algorithms without human supervision. We put forward the idea of performing self-supervised learning of visual features by mining a large scale corpus of multi-modal (text and image) documents. We show that discriminative visual features can be learnt efficiently by training a CNN to predict the semantic context in which a particular image is more probable to appear as an illustration. For this we leverage the hidden semantic structures discovered in the text corpus with a well-known topic modeling technique. Our experiments demonstrate state of the art performance in image classification, object detection, and multi-modal retrieval compared to recent self-supervised or natural-supervised approaches.
	Address	Honolulu; Hawaii; July 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPR
	Notes	DAG; 600.084; 600.121			Approved	no
	Call Number	Admin @ si @ GPR2017			Serial	2889
Permanent link to this record



	Author	Sounak Dey; Anjan Dutta; Juan Ignacio Toledo; Suman Ghosh; Josep Llados; Umapada Pal
	Title	SigNet: Convolutional Siamese Network for Writer Independent Offline Signature Verification			Type	Miscellaneous
	Year	2018	Publication	Arxiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Offline signature verification is one of the most challenging tasks in biometrics and document forensics. Unlike other verification problems, it needs to model minute but critical details between genuine and forged signatures, because a skilled falsification might often resembles the real signature with small deformation. This verification task is even harder in writer independent scenarios which is undeniably fiscal for realistic cases. In this paper, we model an offline writer independent signature verification task with a convolutional Siamese network. Siamese networks are twin networks with shared weights, which can be trained to learn a feature space where similar observations are placed in proximity. This is achieved by exposing the network to a pair of similar and dissimilar observations and minimizing the Euclidean distance between similar pairs while simultaneously maximizing it between dissimilar pairs. Experiments conducted on cross-domain datasets emphasize the capability of our network to model forgery in different languages (scripts) and handwriting styles. Moreover, our designed Siamese network, named SigNet, exceeds the state-of-the-art results on most of the benchmark signature datasets, which paves the way for further research in this direction.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.097; 600.121			Approved	no
	Call Number	Admin @ si @ DDT2018			Serial	3085
Permanent link to this record



	Author	Oriol Ramos Terrades; Albert Berenguel; Debora Gil
	Title	A flexible outlier detector based on a topology given by graph communities			Type	Miscellaneous
	Year	2020	Publication	Arxiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Outlier, or anomaly, detection is essential for optimal performance of machine learning methods and statistical predictive models. It is not just a technical step in a data cleaning process but a key topic in many fields such as fraudulent document detection, in medical applications and assisted diagnosis systems or detecting security threats. In contrast to population-based methods, neighborhood based local approaches are simple flexible methods that have the potential to perform well in small sample size unbalanced problems. However, a main concern of local approaches is the impact that the computation of each sample neighborhood has on the method performance. Most approaches use a distance in the feature space to define a single neighborhood that requires careful selection of several parameters. This work presents a local approach based on a local measure of the heterogeneity of sample labels in the feature space considered as a topological manifold. Topology is computed using the communities of a weighted graph codifying mutual nearest neighbors in the feature space. This way, we provide with a set of multiple neighborhoods able to describe the structure of complex spaces without parameter fine tuning. The extensive experiments on real-world data sets show that our approach overall outperforms, both, local and global strategies in multi and single view settings.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	IAM; DAG; 600.139; 600.145; 600.140; 600.121			Approved	no
	Call Number	Admin @ si @ RBG2020			Serial	3475
Permanent link to this record



	Author	Lei Kang; Pau Riba; Marçal Rusiñol; Alicia Fornes; Mauricio Villegas
	Title	Pay Attention to What You Read: Non-recurrent Handwritten Text-Line Recognition			Type	Journal Article
	Year	2022	Publication	Pattern Recognition	Abbreviated Journal	PR
	Volume	129	Issue		Pages	108766
	Keywords
	Abstract	The advent of recurrent neural networks for handwriting recognition marked an important milestone reaching impressive recognition accuracies despite the great variability that we observe across different writing styles. Sequential architectures are a perfect fit to model text lines, not only because of the inherent temporal aspect of text, but also to learn probability distributions over sequences of characters and words. However, using such recurrent paradigms comes at a cost at training stage, since their sequential pipelines prevent parallelization. In this work, we introduce a non-recurrent approach to recognize handwritten text by the use of transformer models. We propose a novel method that bypasses any recurrence. By using multi-head self-attention layers both at the visual and textual stages, we are able to tackle character recognition as well as to learn language-related dependencies of the character sequences to be decoded. Our model is unconstrained to any predefined vocabulary, being able to recognize out-of-vocabulary words, i.e. words that do not appear in the training vocabulary. We significantly advance over prior art and demonstrate that satisfactory recognition accuracies are yielded even in few-shot learning scenarios.
	Address	Sept. 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121; 600.162			Approved	no
	Call Number	Admin @ si @ KRR2022			Serial	3556
Permanent link to this record



	Author	Pau Riba; Andreas Fischer; Josep Llados; Alicia Fornes
	Title	Learning Graph Edit Distance by Graph NeuralNetworks			Type	Miscellaneous
	Year	2020	Publication	Arxiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	The emergence of geometric deep learning as a novel framework to deal with graph-based representations has faded away traditional approaches in favor of completely new methodologies. In this paper, we propose a new framework able to combine the advances on deep metric learning with traditional approximations of the graph edit distance. Hence, we propose an efficient graph distance based on the novel field of geometric deep learning. Our method employs a message passing neural network to capture the graph structure, and thus, leveraging this information for its use on a distance computation. The performance of the proposed graph distance is validated on two different scenarios. On the one hand, in a graph retrieval of handwritten words~\ie~keyword spotting, showing its superior performance when compared with (approximate) graph edit distance benchmarks. On the other hand, demonstrating competitive results for graph similarity learning when compared with the current state-of-the-art on a recent benchmark dataset.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121; 600.140; 601.302			Approved	no
	Call Number	Admin @ si @ RFL2020			Serial	3555
Permanent link to this record



	Author	Minesh Mathew; Ruben Tito; Dimosthenis Karatzas; R.Manmatha; C.V. Jawahar
	Title	Document Visual Question Answering Challenge 2020			Type	Conference Article
	Year	2020	Publication	33rd IEEE Conference on Computer Vision and Pattern Recognition – Short paper	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	This paper presents results of Document Visual Question Answering Challenge organized as part of “Text and Documents in the Deep Learning Era” workshop, in CVPR 2020. The challenge introduces a new problem – Visual Question Answering on document images. The challenge comprised two tasks. The first task concerns with asking questions on a single document image. On the other hand, the second task is set as a retrieval task where the question is posed over a collection of images. For the task 1 a new dataset is introduced comprising 50,000 questions-answer(s) pairs defined over 12,767 document images. For task 2 another dataset has been created comprising 20 questions over 14,362 document images which share the same document template.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPR
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ MTK2020			Serial	3558
Permanent link to this record



	Author	Soumya Jahagirdar; Minesh Mathew; Dimosthenis Karatzas; CV Jawahar
	Title	Watching the News: Towards VideoQA Models that can Read			Type	Conference Article
	Year	2023	Publication	Proceedings of the IEEE/CVF Winter Conference on Applications of Computer	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Video Question Answering methods focus on commonsense reasoning and visual cognition of objects or persons and their interactions over time. Current VideoQA approaches ignore the textual information present in the video. Instead, we argue that textual information is complementary to the action and provides essential contextualisation cues to the reasoning process. To this end, we propose a novel VideoQA task that requires reading and understanding the text in the video. To explore this direction, we focus on news videos and require QA systems to comprehend and answer questions about the topics presented by combining visual and textual cues in the video. We introduce the ``NewsVideoQA'' dataset that comprises more than 8,600 QA pairs on 3,000+ news videos obtained from diverse news channels from around the world. We demonstrate the limitations of current Scene Text VQA and VideoQA methods and propose ways to incorporate scene text information into VideoQA methods.
	Address	Waikoloa; Hawai; USA; January 2023
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WACV
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ JMK2023			Serial	3899
Permanent link to this record



	Author	Arnau Baro; Carles Badal; Pau Torras; Alicia Fornes
	Title	Handwritten Historical Music Recognition through Sequence-to-Sequence with Attention Mechanism			Type	Conference Article
	Year	2022	Publication	3rd International Workshop on Reading Music Systems (WoRMS2021)	Abbreviated Journal
	Volume		Issue		Pages	55-59
	Keywords	Optical Music Recognition; Digits; Image Classification
	Abstract	Despite decades of research in Optical Music Recognition (OMR), the recognition of old handwritten music scores remains a challenge because of the variabilities in the handwriting styles, paper degradation, lack of standard notation, etc. Therefore, the research in OMR systems adapted to the particularities of old manuscripts is crucial to accelerate the conversion of music scores existing in archives into digital libraries, fostering the dissemination and preservation of our music heritage. In this paper we explore the adaptation of sequence-to-sequence models with attention mechanism (used in translation and handwritten text recognition) and the generation of specific synthetic data for recognizing old music scores. The experimental validation demonstrates that our approach is promising, especially when compared with long short-term memory neural networks.
	Address	July 23, 2021, Alicante (Spain)
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	WoRMS
	Notes	DAG; 600.121; 600.162; 602.230; 600.140			Approved	no
	Call Number	Admin @ si @ BBT2022			Serial	3734
Permanent link to this record



	Author	Marwa Dhiaf; Mohamed Ali Souibgui; Kai Wang; Yuyang Liu; Yousri Kessentini; Alicia Fornes; Ahmed Cheikh Rouhou
	Title	CSSL-MHTR: Continual Self-Supervised Learning for Scalable Multi-script Handwritten Text Recognition			Type	Miscellaneous
	Year	2023	Publication	Arxiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Self-supervised learning has recently emerged as a strong alternative in document analysis. These approaches are now capable of learning high-quality image representations and overcoming the limitations of supervised methods, which require a large amount of labeled data. However, these methods are unable to capture new knowledge in an incremental fashion, where data is presented to the model sequentially, which is closer to the realistic scenario. In this paper, we explore the potential of continual self-supervised learning to alleviate the catastrophic forgetting problem in handwritten text recognition, as an example of sequence recognition. Our method consists in adding intermediate layers called adapters for each task, and efficiently distilling knowledge from the previous model while learning the current task. Our proposed framework is efficient in both computation and memory complexity. To demonstrate its effectiveness, we evaluate our method by transferring the learned model to diverse text recognition downstream tasks, including Latin and non-Latin scripts. As far as we know, this is the first application of continual self-supervised learning for handwritten text recognition. We attain state-of-the-art performance on English, Italian and Russian scripts, whilst adding only a few parameters per task. The code and trained models will be publicly available.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ DSW2023			Serial	3851
Permanent link to this record

Select All Deselect All

[51–60] << 61 62 63 64 65 66 67 68 69 70 >> [71–74]

List View

Citations

Details

All Found Records Selected Records:

Save Citations: Format:

Export Records: Format: