|
Records |
Links |
|
Author |
Ali Furkan Biten; R. Tito; Andres Mafla; Lluis Gomez; Marçal Rusiñol; M. Mathew; C.V. Jawahar; Ernest Valveny; Dimosthenis Karatzas |
|
|
Title |
ICDAR 2019 Competition on Scene Text Visual Question Answering |
Type |
Conference Article |
|
Year |
2019 |
Publication |
15th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1563-1570 |
|
|
Keywords |
|
|
|
Abstract |
This paper presents the final results of the ICDAR 2019 Scene Text Visual Question Answering competition (ST-VQA). ST-VQA introduces an important aspect that no Visual Question Answering system has addressed to date, namely the incorporation of scene text to answer questions asked about an image. The competition introduces a new dataset comprising 23,038 images annotated with 31,791 question/answer pairs, where the answer is always grounded in text instances present in the image. The images are taken from 7 different public computer vision datasets, covering a wide range of scenarios. The competition was structured in three tasks of increasing difficulty that require reading the text in a scene and understanding it in the context of the scene to correctly answer a given question. A novel evaluation metric is presented, which elegantly assesses both key capabilities expected from an optimal model: text recognition and image understanding. A detailed analysis of results from different participants is showcased, which provides insight into the current capabilities of VQA systems that can read. We firmly believe the dataset proposed in this challenge will be an important milestone on the path towards more robust and general models that can exploit scene text to achieve holistic image understanding. |
|
|
Address |
Sydney; Australia; September 2019 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.129; 601.338; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BTM2019c |
Serial |
3286 |
|
Permanent link to this record |
|
|
|
|
Author |
Ali Furkan Biten; R. Tito; Andres Mafla; Lluis Gomez; Marçal Rusiñol; M. Mathew; C.V. Jawahar; Ernest Valveny; Dimosthenis Karatzas |
|
|
Title |
ICDAR 2019 Competition on Scene Text Visual Question Answering |
Type |
Conference Article |
|
Year |
2019 |
Publication |
3rd Workshop on Closing the Loop Between Vision and Language, in conjunction with ICCV2019 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
This paper presents the final results of the ICDAR 2019 Scene Text Visual Question Answering competition (ST-VQA). ST-VQA introduces an important aspect that no Visual Question Answering system has addressed to date, namely the incorporation of scene text to answer questions asked about an image. The competition introduces a new dataset comprising 23,038 images annotated with 31,791 question/answer pairs, where the answer is always grounded in text instances present in the image. The images are taken from 7 different public computer vision datasets, covering a wide range of scenarios. The competition was structured in three tasks of increasing difficulty that require reading the text in a scene and understanding it in the context of the scene to correctly answer a given question. A novel evaluation metric is presented, which elegantly assesses both key capabilities expected from an optimal model: text recognition and image understanding. A detailed analysis of results from different participants is showcased, which provides insight into the current capabilities of VQA systems that can read. We firmly believe the dataset proposed in this challenge will be an important milestone on the path towards more robust and general models that can exploit scene text to achieve holistic image understanding. |
|
|
Address |
Sydney; Australia; September 2019 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CLVL |
|
|
Notes |
DAG; 600.129; 601.338; 600.135; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BTM2019a |
Serial |
3284 |
|
Permanent link to this record |
|
|
|
|
Author |
Agnes Borras; Josep Llados |
|
|
Title |
Object Image Retrieval by Shape Content in Complex Scenes Using Geometric Constraints |
Type |
Book Chapter |
|
Year |
2005 |
Publication |
Pattern Recognition And Image Analysis |
Abbreviated Journal |
LNCS |
|
|
Volume |
3522 |
Issue |
|
Pages |
325–332 |
|
|
Keywords |
|
|
|
Abstract |
This paper presents an image retrieval system based on 2D shape information. Query shape objects and database images are represented by polygonal approximations of their contours. Afterwards they are encoded, using geometric features, in terms of predefined structures. Shapes are then located in database images by a voting procedure on the spatial domain. Then an alignment matching provides a probability value to rank the database image in the retrieval result. The method can detect a query object in database images even when they contain complex scenes. The shape matching also tolerates partial occlusions and affine transformations such as translation, rotation or scaling. |
|
|
Address |
Estoril (Portugal) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Link |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; |
Approved |
no |
|
|
Call Number |
DAG @ dag @ BoL2005; IAM @ iam @ BoL2005 |
Serial |
556 |
|
Permanent link to this record |
|
|
|
|
Author |
David Fernandez; Jon Almazan; Nuria Cirera; Alicia Fornes; Josep Llados |
|
|
Title |
BH2M: the Barcelona Historical Handwritten Marriages database |
Type |
Conference Article |
|
Year |
2014 |
Publication |
22nd International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
256-261 |
|
|
Keywords |
|
|
|
Abstract |
This paper presents an image database of historical handwritten marriage records stored in the archives of Barcelona cathedral, together with the corresponding metadata intended for evaluating the performance of document analysis algorithms. The contribution of this paper is twofold. First, it presents a complete ground truth which covers the whole pipeline of handwriting recognition research, from layout analysis to recognition and understanding. Second, it is the first dataset in the emerging area of genealogical document analysis, where documents are pseudo-structured manuscripts with specific lexicons and the interest goes beyond pure transcription to context-dependent understanding. |
|
|
Address |
Crete Island; Greece; September 2014 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1051-4651 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
DAG; 600.056; 600.061; 602.006; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ FAC2014 |
Serial |
2461 |
|
Permanent link to this record |
|
|
|
|
Author |
Lluis Pere de las Heras; Ernest Valveny; Gemma Sanchez |
|
|
Title |
Combining structural and statistical strategies for unsupervised wall detection in floor plans |
Type |
Conference Article |
|
Year |
2013 |
Publication |
10th IAPR International Workshop on Graphics Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
This paper presents an evolution of the first unsupervised wall segmentation method in floor plans, which was presented by the authors in [1]. This first approach, contrary to the existing ones, is able to segment walls independently of their notation and without the need for any pre-annotated data to learn their visual appearance. Despite the good performance of the first approach, some specific cases, such as curved walls, were not correctly segmented because they do not satisfy the strict structural assumptions that guide the whole methodology in order to learn, in an unsupervised way, the structure of a wall. In this paper, we refine this strategy by dividing the process into two steps. In the first step, potential wall segments are extracted in an unsupervised way using a modification of [1] that initially restricts even further the areas considered as walls. In the second step, these segments are used to learn and spot lost instances based on a modified version of [2], also presented by the authors. The presented combined method has been tested on 4 datasets with different notations and compared with the state-of-the-art applied to the same datasets. The results show its adaptability to different wall notations and shapes, significantly outperforming the original approach. |
|
|
Address |
Bethlehem; PA; USA; August 2013 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GREC |
|
|
Notes |
DAG; 600.045 |
Approved |
no |
|
|
Call Number |
Admin @ si @ HVS2013a |
Serial |
2321 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; Jaime Lopez-Krahe; Enric Marti |
|
|
Title |
Hand drawn document understanding using the straight line Hough transform and graph matching |
Type |
Conference Article |
|
Year |
1996 |
Publication |
Proceedings of the 13th International Pattern Recognition Conference (ICPR’96) |
Abbreviated Journal |
|
|
|
Volume |
2 |
Issue |
|
Pages |
497-501 |
|
|
Keywords |
|
|
|
Abstract |
This paper presents a system to understand hand drawn architectural drawings in a CAD environment. The procedure is to identify in a floor plan the building elements, stored in a library of patterns, and their spatial relationships. The vectorized input document and the patterns to recognize are represented by attributed graphs. To recognize the patterns as such, we apply a structural approach based on subgraph isomorphism techniques. In spite of their value, graph matching techniques do not adequately recognize those building elements characterized by hatching patterns, i.e. walls. Here we focus on the recognition of hatching patterns and develop a method based on the straight line Hough transform to detect the regions filled in with parallel straight lines. This not only allows filling patterns to be recognized, but also reduces the computational load associated with the subgraph isomorphism computation. The result is that the document can be redrawn by editing all the patterns recognized. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
Vienna, Austria |
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG;IAM; |
Approved |
no |
|
|
Call Number |
IAM @ iam @ LLM1996 |
Serial |
1579 |
|
Permanent link to this record |
|
|
|
|
Author |
Joan Mas; Alicia Fornes; Josep Llados |
|
|
Title |
An Interactive Transcription System of Census Records using Word-Spotting based Information Transfer |
Type |
Conference Article |
|
Year |
2016 |
Publication |
12th IAPR Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
54-59 |
|
|
Keywords |
|
|
|
Abstract |
This paper presents a system to assist in the transcription of historical handwritten census records in a crowdsourcing platform. Census records have a tabular structured layout. They consist of a sequence of rows with information on homes ordered by street address. For each household snippet in the page, the list of family members is reported. The censuses are recorded at intervals of a few years, and the information on individuals in each household is quite stable from one point in time to the next. This redundancy is used to assist the transcriber: the redundant information is transferred from the census already transcribed to the next one. Household records are aligned from one year to the next using the knowledge of the ordering by street address. Given an already transcribed census, a query-by-string word spotting is applied. Thus, names from the census at time t are used as queries in the corresponding home record at time t+1. Since the search is constrained, the obtained precision-recall values are very high, with an important reduction in the transcription time. The proposed system has been tested in a real citizen-science experience where non-expert users transcribe the census data of their home town. |
|
|
Address |
Santorini; Greece; April 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG; 603.053; 602.006; 600.061; 600.077; 600.097 |
Approved |
no |
|
|
Call Number |
Admin @ si @ MFL2016 |
Serial |
2751 |
|
Permanent link to this record |
|
|
|
|
Author |
Lluis Pere de las Heras; Gemma Sanchez |
|
|
Title |
And-Or Graph Grammar for Architectural Floorplan Representation, Learning and Recognition. A Semantic, Structural and Hierarchical Model |
Type |
Conference Article |
|
Year |
2011 |
Publication |
5th Iberian Conference on Pattern Recognition and Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
6669 |
Issue |
|
Pages |
17-24 |
|
|
Keywords |
|
|
|
Abstract |
This paper presents a syntactic model for architectural floor plan interpretation. A stochastic image grammar over an And-Or graph is inferred to represent the hierarchical, structural and semantic relations between elements of all possible floor plans. This grammar is augmented with three different probabilistic models, learnt from a training set, to account for the frequency of those relations. Then, a Bottom-Up/Top-Down parser with a pruning strategy is used for floor plan recognition. For a given input, the parser generates the most probable parse graph for that document. This graph contains not only the structural and semantic relations of its elements, but also its hierarchical composition, which allows the floor plan to be interpreted at different levels of abstraction. |
|
|
Address |
Las Palmas de Gran Canaria; Spain |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-642-21256-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IbPRIA |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ HeS2011 |
Serial |
1736 |
|
Permanent link to this record |
|
|
|
|
Author |
Joan Mas; Josep Llados; Gemma Sanchez; J.A. Jorge |
|
|
Title |
A syntactic approach based on distortion-tolerant Adjacency Grammars and a spatial-directed parser to interpret sketched diagrams |
Type |
Journal Article |
|
Year |
2010 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
43 |
Issue |
12 |
Pages |
4148–4164 |
|
|
Keywords |
Syntactic Pattern Recognition; Symbol recognition; Diagram understanding; Sketched diagrams; Adjacency Grammars; Incremental parsing; Spatial directed parsing |
|
|
Abstract |
This paper presents a syntactic approach based on Adjacency Grammars (AG) for sketch diagram modeling and understanding. Diagrams are a combination of graphical symbols arranged according to a set of spatial rules defined by a visual language. AG describe visual shapes by productions defined in terms of terminal and non-terminal symbols (graphical primitives and subshapes), and a set of functions describing the spatial arrangements between symbols. Our approach to sketch diagram understanding provides three main contributions. First, since AG are linear grammars, there is a need to define shapes and relations inherently bidimensional using a sequential formalism. Second, our parsing approach uses an indexing structure based on a spatial tessellation. This serves to reduce the search space when finding candidates to produce a valid reduction. This allows order-free parsing of 2D visual sentences while keeping combinatorial explosion in check. Third, working with sketches requires a distortion model to cope with the natural variations of hand drawn strokes. To this end we extended the basic grammar with a distortion measure modeled on the allowable variation on spatial constraints associated with grammar productions. Finally, the paper reports on an experimental framework, an interactive system for sketch analysis. User tests performed on two real scenarios show that our approach is usable in interactive settings. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ MLS2010 |
Serial |
1336 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; Agnes Borras; Josep Llados |
|
|
Title |
Relational Indexing of Vectorial Primitives for Symbol Spotting in Line-Drawing Images |
Type |
Journal Article |
|
Year |
2010 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
31 |
Issue |
3 |
Pages |
188–201 |
|
|
Keywords |
Document image analysis and recognition; Graphics recognition; Symbol spotting; Vectorial representations; Line-drawings |
|
|
Abstract |
This paper presents a symbol spotting approach for indexing by content a database of line-drawing images. As line-drawings are digital-born documents designed with vector-graphics software, instead of using a pixel-based approach, we present a spotting method based on vector primitives. Graphical symbols are represented by a set of vectorial primitives which are described by an off-the-shelf shape descriptor. A relational indexing strategy aims to retrieve symbol locations in the target documents by using a combined numerical-relational description of 2D structures. The zones which are likely to contain the queried symbol are validated by a Hough-like voting scheme. In addition, a performance evaluation framework for symbol spotting in graphical documents is proposed. The presented methodology has been evaluated with a benchmarking set of architectural documents, achieving good performance results. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ RBL2010 |
Serial |
1177 |
|
Permanent link to this record |