Publicacions CVC -- Query Results

[1–10] << 11 12 13 14 15 16 17 18 19 20 >> [21–26]

Details

	Records
	Author	Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal
	Title	Beyond Document Object Detection: Instance-Level Segmentation of Complex Layouts			Type	Journal Article
	Year	2021	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
	Volume	24	Issue		Pages	269–281
	Keywords
	Abstract	Information extraction is a fundamental task of many business intelligence services that entail massive document processing. Understanding a document page structure in terms of its layout provides contextual support which is helpful in the semantic interpretation of the document terms. In this paper, inspired by the progress of deep learning methodologies applied to the task of object recognition, we transfer these models to the specific case of document object detection, reformulating the traditional problem of document layout analysis. Moreover, we importantly contribute to prior arts by defining the task of instance segmentation on the document image domain. An instance segmentation paradigm is especially important in complex layouts whose contents should interact for the proper rendering of the page, i.e., the proper text wrapping around an image. Finally, we provide an extensive evaluation, both qualitative and quantitative, that demonstrates the superior performance of the proposed methodology over the current state of the art.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121; 600.140; 110.312			Approved	no
	Call Number	Admin @ si @ BRL2021b			Serial	3574
Permanent link to this record



	Author	Minesh Mathew; Lluis Gomez; Dimosthenis Karatzas; C.V. Jawahar
	Title	Asking questions on handwritten document collections			Type	Journal Article
	Year	2021	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
	Volume	24	Issue		Pages	235-249
	Keywords
	Abstract	This work addresses the problem of Question Answering (QA) on handwritten document collections. Unlike typical QA and Visual Question Answering (VQA) formulations where the answer is a short text, we aim to locate a document snippet where the answer lies. The proposed approach works without recognizing the text in the documents. We argue that the recognition-free approach is suitable for handwritten documents and historical collections where robust text recognition is often difficult. At the same time, for human users, document image snippets containing answers act as a valid alternative to textual answers. The proposed approach uses an off-the-shelf deep embedding network which can project both textual words and word images into a common sub-space. This embedding bridges the textual and visual domains and helps us retrieve document snippets that potentially answer a question. We evaluate results of the proposed approach on two new datasets: (i) HW-SQuAD: a synthetic, handwritten document image counterpart of SQuAD1.0 dataset and (ii) BenthamQA: a smaller set of QA pairs defined on documents from the popular Bentham manuscripts collection. We also present a thorough analysis of the proposed recognition-free approach compared to a recognition-based approach which uses text recognized from the images using an OCR. Datasets presented in this work are available to download at docvqa.org.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ MGK2021			Serial	3621
Permanent link to this record



	Author	Ayan Banerjee; Sanket Biswas; Josep Llados; Umapada Pal
	Title	SemiDocSeg: Harnessing Semi-Supervised Learning for Document Layout Analysis			Type	Journal Article
	Year	2024	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
	Volume		Issue		Pages
	Keywords	Document layout analysis; Semi-supervised learning; Co-Occurrence matrix; Instance segmentation; Swin transformer
	Abstract	Document Layout Analysis (DLA) is the process of automatically identifying and categorizing the structural components (e.g. Text, Figure, Table, etc.) within a document to extract meaningful content and establish the page's layout structure. It is a crucial stage in document parsing, contributing to their comprehension. However, traditional DLA approaches often demand a significant volume of labeled training data, and the labor-intensive task of generating high-quality annotated training data poses a substantial challenge. In order to address this challenge, we proposed a semi-supervised setting that aims to perform learning on limited annotated categories by eliminating exhaustive and expensive mask annotations. The proposed setting is expected to be generalizable to novel categories as it learns the underlying positional information through a support set and class information through Co-Occurrence that can be generalized from annotated categories to novel categories. Here, we first extract features from the input image and support set with a shared multi-scale feature acquisition backbone. Then, the extracted feature representation is fed to the transformer encoder as a query. Later on, we utilize a semantic embedding network before the decoder to capture the underlying semantic relationships and similarities between different instances, enabling the model to make accurate predictions or classifications with only a limited amount of labeled data. Extensive experimentation on competitive benchmarks like PRIMA, DocLayNet, and Historical Japanese (HJ) demonstrate that this generalized setup obtains significant performance compared to the conventional supervised approach.
	Address	June 2024
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ BBL2024a			Serial	4001
Permanent link to this record



	Author	Josep Llados; Gemma Sanchez
	Title	Graph Matching vs. Graph Parsing in Graphics Recognition: A Combined Approach			Type	Journal
	Year	2004	Publication	International Journal of Pattern Recognition and Artificial Intelligence	Abbreviated Journal	IJPRAI
	Volume	18	Issue	3	Pages	455–473
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; IF: 0.588			Approved	no
	Call Number	DAG @ dag @ LlS2004			Serial	445
Permanent link to this record



	Author	Jaume Gibert; Ernest Valveny; Horst Bunke
	Title	Embedding of Graphs with Discrete Attributes Via Label Frequencies			Type	Journal Article
	Year	2013	Publication	International Journal of Pattern Recognition and Artificial Intelligence	Abbreviated Journal	IJPRAI
	Volume	27	Issue	3	Pages	1360002-1360029
	Keywords	Discrete attributed graphs; graph embedding; graph classification
	Abstract	Graph-based representations of patterns are very flexible and powerful, but they are not easily processed due to the lack of learning algorithms in the domain of graphs. Embedding a graph into a vector space solves this problem since graphs are turned into feature vectors and thus all the statistical learning machinery becomes available for graph input patterns. In this work we present a new way of embedding discrete attributed graphs into vector spaces using node and edge label frequencies. The methodology is experimentally tested on graph classification problems, using patterns of different nature, and it is shown to be competitive to state-of-the-art classification algorithms for graphs, while being computationally much more efficient.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ GVB2013			Serial	2305
Permanent link to this record

Select All Deselect All

[1–10] << 11 12 13 14 15 16 17 18 19 20 >> [21–26]

List View

Citations

Details

All Found Records Selected Records:

Save Citations: Format:

Export Records: Format: