Publicacions CVC -- Query Results

[41–50] << 51 52 53 54 55 56 57 58 59 60 >> [61–70]

Details

	Records
	Author	Alicia Fornes; Josep Llados; Gemma Sanchez
	Title	Primitive Segmentation in Old Handwritten Music Scores			Type	Book Chapter
	Year	2006	Publication	Graphics Recognition: Ten Years Review and Future Perspectives, W. Liu, J. Llados (Eds.), LNCS 3926: 288–299	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	DAG @ dag @ FLS2006			Serial	697
Permanent link to this record



	Author	Alfons Juan-Ciscar; Gemma Sanchez
	Title	PRIS 2008. Pattern Recognition in Information Systems. Proceedings of the 8th international Workshop on Pattern Recognition in Information systems – PRIS 2008, in conjunction with ICEIS 2008			Type	Book Whole
	Year	2008	Publication		Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address	Barcelona (Spain)
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	DAG @ dag @ JuS2008			Serial	1054
Permanent link to this record



	Author	Ruben Tito; Khanh Nguyen; Marlon Tobaben; Raouf Kerkouche; Mohamed Ali Souibgui; Kangsoo Jung; Lei Kang; Ernest Valveny; Antti Honkela; Mario Fritz; Dimosthenis Karatzas
	Title	Privacy-Aware Document Visual Question Answering			Type	Miscellaneous
	Year	2023	Publication	Arxiv	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Document Visual Question Answering (DocVQA) is a fast growing branch of document understanding. Despite the fact that documents contain sensitive or copyrighted information, none of the current DocVQA methods offers strong privacy guarantees. In this work, we explore privacy in the domain of DocVQA for the first time. We highlight privacy issues in state of the art multi-modal LLM models used for DocVQA, and explore possible solutions. Specifically, we focus on the invoice processing use case as a realistic, widely used scenario for document understanding, and propose a large scale DocVQA dataset comprising invoice documents and associated questions and answers. We employ a federated learning scheme, that reflects the real-life distribution of documents in different businesses, and we explore the use case where the ID of the invoice issuer is the sensitive information to be protected. We demonstrate that non-private models tend to memorise, behaviour that can lead to exposing private information. We then evaluate baseline training schemes employing federated learning and differential privacy in this multi-modal scenario, where the sensitive information might be exposed through any of the two input modalities: vision (document image) or language (OCR tokens). Finally, we design an attack exploiting the memorisation effect of the model, and demonstrate its effectiveness in probing different DocVQA models.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ PNT2023			Serial	4012
Permanent link to this record



	Author	Francisco Cruz
	Title	Probabilistic Graphical Models for Document Analysis			Type	Book Whole
	Year	2016	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Latest advances in digitization techniques have fostered the interest in creating digital copies of collections of documents. Digitized documents permit an easy maintenance, loss-less storage, and efficient ways for transmission and to perform information retrieval processes. This situation has opened a new market niche to develop systems able to automatically extract and analyze information contained in these collections, specially in the ambit of the business activity. Due to the great variety of types of documents this is not a trivial task. For instance, the automatic extraction of numerical data from invoices differs substantially from a task of text recognition in historical documents. However, in order to extract the information of interest, is always necessary to identify the area of the document where it is located. In the area of Document Analysis we refer to this process as layout analysis, which aims at identifying and categorizing the different entities that compose the document, such as text regions, pictures, text lines, or tables, among others. To perform this task it is usually necessary to incorporate a prior knowledge about the task into the analysis process, which can be modeled by defining a set of contextual relations between the different entities of the document. The use of context has proven to be useful to reinforce the recognition process and improve the results on many computer vision tasks. It presents two fundamental questions: What kind of contextual information is appropriate for a given task, and how to incorporate this information into the models. In this thesis we study several ways to incorporate contextual information to the task of document layout analysis, and to the particular case of handwritten text line segmentation. We focus on the study of Probabilistic Graphical Models and other mechanisms for this purpose, and propose several solutions to these problems. First, we present a method for layout analysis based on Conditional Random Fields. With this model we encode local contextual relations between variables, such as pair-wise constraints. Besides, we encode a set of structural relations between different classes of regions at feature level. Second, we present a method based on 2D-Probabilistic Context-free Grammars to encode structural and hierarchical relations. We perform a comparative study between Probabilistic Graphical Models and this syntactic approach. Third, we propose a method for structured documents based on Bayesian Networks to represent the document structure, and an algorithm based in the Expectation-Maximization to find the best configuration of the page. We perform a thorough evaluation of the proposed methods on two particular collections of documents: a historical collection composed of ancient structured documents, and a collection of contemporary documents. In addition, we present a general method for the task of handwritten text line segmentation. We define a probabilistic framework where we combine the EM algorithm with variational approaches for computing inference and parameter learning on a Markov Random Field. We evaluate our method on several collections of documents, including a general dataset of annotated administrative documents. Results demonstrate the applicability of our method to real problems, and the contribution of the use of contextual information to this kind of problems.
	Address
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Oriol Ramos Terrades
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-945373-2-5	Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ Cru2016			Serial	2861
Permanent link to this record



	Author	Anjan Dutta; Josep Llados; Horst Bunke; Umapada Pal
	Title	Product graph-based higher order contextual similarities for inexact subgraph matching			Type	Journal Article
	Year	2018	Publication	Pattern Recognition	Abbreviated Journal	PR
	Volume	76	Issue		Pages	596-611
	Keywords
	Abstract	Many algorithms formulate graph matching as an optimization of an objective function of pairwise quantification of nodes and edges of two graphs to be matched. Pairwise measurements usually consider local attributes but disregard contextual information involved in graph structures. We address this issue by proposing contextual similarities between pairs of nodes. This is done by considering the tensor product graph (TPG) of two graphs to be matched, where each node is an ordered pair of nodes of the operand graphs. Contextual similarities between a pair of nodes are computed by accumulating weighted walks (normalized pairwise similarities) terminating at the corresponding paired node in TPG. Once the contextual similarities are obtained, we formulate subgraph matching as a node and edge selection problem in TPG. We use contextual similarities to construct an objective function and optimize it with a linear programming approach. Since random walk formulation through TPG takes into account higher order information, it is not a surprise that we obtain more reliable similarities and better discrimination among the nodes and edges. Experimental results shown on synthetic as well as real benchmarks illustrate that higher order contextual similarities increase discriminating power and allow one to find approximate solutions to the subgraph matching problem.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 602.167; 600.097; 600.121			Approved	no
	Call Number	Admin @ si @ DLB2018			Serial	3083
Permanent link to this record



	Author	T.O. Nguyen; Salvatore Tabbone; Oriol Ramos Terrades; A.T. Thierry
	Title	Proposition d'un descripteur de formes et du modèle vectoriel pour la recherche de symboles			Type	Conference Article
	Year	2008	Publication	Colloque International Francophone sur l'Ecrit et le Document	Abbreviated Journal
	Volume		Issue		Pages	79-84
	Keywords
	Abstract
	Address	Rouen, France
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CIFED
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ NTR2008b			Serial	1875
Permanent link to this record



	Author	Fernando Vilariño
	Title	Public Libraries Exploring how technology transforms the cultural experience of people			Type	Conference Article
	Year	2019	Publication	Workshop on Social Impact of AI. Open Living Lab Days Conference.	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address	Thessaloniki; Grecia; September 2019
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MV; DAG; 600.140; 600.121;SIAI			Approved	no
	Call Number	Admin @ si @ Vil2019b			Serial	3458
Permanent link to this record



	Author	Anjan Dutta; Pau Riba; Josep Llados; Alicia Fornes
	Title	Pyramidal Stochastic Graphlet Embedding for Document Pattern Classification			Type	Conference Article
	Year	2017	Publication	14th International Conference on Document Analysis and Recognition	Abbreviated Journal
	Volume		Issue		Pages	33-38
	Keywords	graph embedding; hierarchical graph representation; graph clustering; stochastic graphlet embedding; graph classification
	Abstract	Document pattern classification methods using graphs have received a lot of attention because of its robust representation paradigm and rich theoretical background. However, the way of preserving and the process for delineating documents with graphs introduce noise in the rendition of underlying data, which creates instability in the graph representation. To deal with such unreliability in representation, in this paper, we propose Pyramidal Stochastic Graphlet Embedding (PSGE). Given a graph representing a document pattern, our method first computes a graph pyramid by successively reducing the base graph. Once the graph pyramid is computed, we apply Stochastic Graphlet Embedding (SGE) for each level of the pyramid and combine their embedded representation to obtain a global delineation of the original graph. The consideration of pyramid of graphs rather than just a base graph extends the representational power of the graph embedding, which reduces the instability caused due to noise and distortion. When plugged with support vector machine, our proposed PSGE has outperformed the state-of-the-art results in recognition of handwritten words as well as graphical symbols
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICDAR
	Notes	DAG; 600.097; 601.302; 600.121			Approved	no
	Call Number	Admin @ si @ DRL2017			Serial	3054
Permanent link to this record



	Author	Suman Ghosh; Ernest Valveny
	Title	Query by String word spotting based on character bi-gram indexing			Type	Conference Article
	Year	2015	Publication	13th International Conference on Document Analysis and Recognition ICDAR2015	Abbreviated Journal
	Volume		Issue		Pages	881-885
	Keywords
	Abstract	In this paper we propose a segmentation-free query by string word spotting method. Both the documents and query strings are encoded using a recently proposed word representa- tion that projects images and strings into a common atribute space based on a pyramidal histogram of characters(PHOC). These attribute models are learned using linear SVMs over the Fisher Vector representation of the images along with the PHOC labels of the corresponding strings. In order to search through the whole page, document regions are indexed per character bi- gram using a similar attribute representation. On top of that, we propose an integral image representation of the document using a simplified version of the attribute model for efficient computation. Finally we introduce a re-ranking step in order to boost retrieval performance. We show state-of-the-art results for segmentation-free query by string word spotting in single-writer and multi-writer standard datasets
	Address	Nancy; France; August 2015
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICDAR
	Notes	DAG; 600.077			Approved	no
	Call Number	Admin @ si @ GhV2015a			Serial	2715
Permanent link to this record



	Author	Partha Pratim Roy; Umapada Pal; Josep Llados
	Title	Query Driven Word Retrieval in Graphical Documents			Type	Conference Article
	Year	2010	Publication	9th IAPR International Workshop on Document Analysis Systems	Abbreviated Journal
	Volume		Issue		Pages	191–198
	Keywords
	Abstract	In this paper, we present an approach towards the retrieval of words from graphical document images. In graphical documents, due to presence of multi-oriented characters in non-structured layout, word indexing is a challenging task. The proposed approach uses recognition results of individual components to form character pairs with the neighboring components. An indexing scheme is designed to store the spatial description of components and to access them efficiently. Given a query text word (ascii/unicode format), the character pairs present in it are searched in the document. Next the retrieved character pairs are linked sequentially to form character string. Dynamic programming is applied to find different instances of query words. A string edit distance is used here to match the query word as the objective function. Recognition of multi-scale and multi-oriented character component is done using Support Vector Machine classifier. To consider multi-oriented character strings the features used in the SVM are invariant to character orientation. Experimental results show that the method is efficient to locate a query word from multi-oriented text in graphical documents.
	Address	Boston; USA
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-1-60558-773-8	Medium
	Area		Expedition		Conference	DAS
	Notes	DAG			Approved	no
	Call Number	DAG @ dag @ RPL2010b			Serial	1433
Permanent link to this record

Select All Deselect All

[41–50] << 51 52 53 54 55 56 57 58 59 60 >> [61–70]

List View

Citations

Details

All Found Records Selected Records:

Save Citations: Format:

Export Records: Format: