|
Records |
Links |
|
Author |
Francisco Alvaro; Francisco Cruz; Joan Andreu Sanchez; Oriol Ramos Terrades; Jose Miguel Benedi |
|
|
Title |
Structure Detection and Segmentation of Documents Using 2D Stochastic Context-Free Grammars |
Type |
Journal Article |
|
Year |
2015 |
Publication |
Neurocomputing |
Abbreviated Journal |
NEUCOM |
|
|
Volume |
150 |
Issue |
A |
Pages |
147-154 |
|
|
Keywords |
document image analysis; stochastic context-free grammars; text classification features |
|
|
Abstract |
In this paper we define a bidimensional extension of Stochastic Context-Free Grammars for structure detection and segmentation of images of documents. Two sets of text classification features are used to perform an initial classification of each zone of the page. Then, the document segmentation is obtained as the most likely hypothesis according to a stochastic grammar. We used a dataset of historical marriage license books to validate this approach. We also tested several inference algorithms for Probabilistic Graphical Models, and the results showed that the proposed grammatical model outperformed the other methods. Furthermore, grammars also provide the document structure along with its segmentation. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 601.158; 600.077; 600.061 |
Approved |
no |
|
|
Call Number |
Admin @ si @ ACS2015 |
Serial |
2531 |
|
Permanent link to this record |
|
|
|
|
Author |
Lluis Gomez; Ali Furkan Biten; Ruben Tito; Andres Mafla; Marçal Rusiñol; Ernest Valveny; Dimosthenis Karatzas |
|
|
Title |
Multimodal grid features and cell pointers for scene text visual question answering |
Type |
Journal Article |
|
Year |
2021 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
150 |
Issue |
|
Pages |
242-249 |
|
|
Keywords |
|
|
|
Abstract |
This paper presents a new model for the task of scene text visual question answering. In this task, questions about a given image can only be answered by reading and understanding scene text. Current state-of-the-art models for this task make use of a dual attention mechanism in which one attention module attends to visual features while the other attends to textual features. A possible issue with this is that it makes it difficult for the model to reason jointly about both modalities. To fix this problem we propose a new model that is based on a single attention mechanism that attends to multi-modal features conditioned on the question. The output weights of this attention module over a grid of multi-modal spatial features are interpreted as the probability that a certain spatial location of the image contains the answer text to the given question. Our experiments demonstrate competitive performance on two standard datasets with a model that is faster than previous methods at inference time. Furthermore, we also provide a novel analysis of the ST-VQA dataset based on a human performance study. Supplementary material, code, and data are made available through this link. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.084; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GBT2021 |
Serial |
3620 |
|
Permanent link to this record |
|
|
|
|
Author |
Anjan Dutta |
|
|
Title |
Symbol Spotting in Graphical Documents by Serialized Subgraph Matching |
Type |
Report |
|
Year |
2010 |
Publication |
CVC Technical Report |
Abbreviated Journal |
|
|
|
Volume |
159 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Master's thesis |
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ Dut2010 |
Serial |
1351 |
|
Permanent link to this record |
|
|
|
|
Author |
Mohamed Ali Souibgui; Alicia Fornes; Yousri Kessentini; Beata Megyesi |
|
|
Title |
Few shots are all you need: A progressive learning approach for low resource handwritten text recognition |
Type |
Journal Article |
|
Year |
2022 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
160 |
Issue |
|
Pages |
43-49 |
|
|
Keywords |
|
|
|
Abstract |
Handwritten text recognition in low-resource scenarios, such as manuscripts with rare alphabets, is a challenging problem. In this paper, we propose a few-shot learning-based handwriting recognition approach that significantly reduces the human annotation effort by requiring only a few images of each alphabet symbol. The method consists of detecting all the symbols of a given alphabet in a text-line image and decoding the obtained similarity scores into the final sequence of transcribed symbols. Our model is first pretrained on synthetic line images generated from an alphabet, which could differ from the alphabet of the target domain. A second training step is then applied to reduce the gap between the source and the target data. Since this retraining would require annotation of thousands of handwritten symbols together with their bounding boxes, we propose to avoid such human effort through an unsupervised progressive learning approach that automatically assigns pseudo-labels to the unlabeled data. The evaluation on different datasets shows that our model can lead to competitive results with a significant reduction in human effort. The code will be publicly available in the following repository: https://github.com/dali92002/HTRbyMatching |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.121; 600.162; 602.230 |
Approved |
no |
|
|
Call Number |
Admin @ si @ SFK2022 |
Serial |
3736 |
|
Permanent link to this record |
|
|
|
|
Author |
David Fernandez |
|
|
Title |
Handwritten Word Spotting in Old Manuscript Images using Shape Descriptors |
Type |
Report |
|
Year |
2010 |
Publication |
CVC Technical Report |
Abbreviated Journal |
|
|
|
Volume |
161 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Master's thesis |
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ Fer2010b |
Serial |
1353 |
|
Permanent link to this record |
|
|
|
|
Author |
Lluis Gomez |
|
|
Title |
Perceptual Organization for Text Extraction in Natural Scenes |
Type |
Report |
|
Year |
2012 |
Publication |
CVC Technical Report |
Abbreviated Journal |
|
|
|
Volume |
173 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Bellaterra |
|
|
Corporate Author |
|
Thesis |
Master's thesis |
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ Gom2012 |
Serial |
2309 |
|
Permanent link to this record |
|
|
|
|
Author |
Nuria Cirera |
|
|
Title |
Recognition of Handwritten Historical Documents |
Type |
Report |
|
Year |
2012 |
Publication |
CVC Technical Report |
Abbreviated Journal |
|
|
|
Volume |
174 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Master's thesis |
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ Cir2012 |
Serial |
2416 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; J. Lopez-Krahe; D. Archambault |
|
|
Title |
Special Issue on Information Technologies for Visually Impaired People |
Type |
Journal |
|
Year |
2007 |
Publication |
Novatica |
Abbreviated Journal |
|
|
|
Volume |
186 |
Issue |
|
Pages |
4-7 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Guest Editors |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ LLA2007a |
Serial |
903 |
|
Permanent link to this record |
|
|
|
|
Author |
Sounak Dey; Palaiahnakote Shivakumara; K.S. Raghunanda; Umapada Pal; Tong Lu; G. Hemantha Kumar; Chee Seng Chan |
|
|
Title |
Script independent approach for multi-oriented text detection in scene image |
Type |
Journal Article |
|
Year |
2017 |
Publication |
Neurocomputing |
Abbreviated Journal |
NEUCOM |
|
|
Volume |
242 |
Issue |
|
Pages |
96-112 |
|
|
Keywords |
|
|
|
Abstract |
Developing a text detection method which is invariant to scripts in natural scene images is a challenging task due to the different geometrical structures of various scripts. Besides, the multi-oriented nature of text lines in natural scene images makes the problem more challenging. This paper proposes to explore the ring radius transform (RRT) for text detection in multi-oriented and multi-script environments. The method finds component regions based on the convex hull to generate radius matrices using RRT. RRT provides low radius values for the pixels that are near edges, constant radius values for the pixels that represent stroke width, and high radius values that represent holes created in the background and convex hull because of the regular structures of text components. We apply k-means clustering on the radius matrices to group such spatially coherent regions into individual clusters. Then the proposed method studies the radius values of such cluster components that are close to the centroid and far from the centroid to detect text components. Furthermore, we have developed a Bangla dataset (named the ISI-UM dataset) and propose a semi-automatic system for generating its ground truth for text detection of arbitrary orientations, which can be used by researchers for text detection and recognition in the future. The ground truth will be released to the public. Experimental results on our ISI-UM data and other standard datasets, namely ICDAR 2013 scene, SVT and MSRA data, show that the proposed method outperforms the existing methods in terms of multi-lingual and multi-oriented text detection ability. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ DSR2017 |
Serial |
3260 |
|
Permanent link to this record |
|
|
|
|
Author |
Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades |
|
|
Title |
Spotting Symbol over Graphical Documents Via Sparsity in Visual Vocabulary |
Type |
Book Chapter |
|
Year |
2016 |
Publication |
Recent Trends in Image Processing and Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
709 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
RTIP2R |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ HTR2016 |
Serial |
2956 |
|
Permanent link to this record |