Author |
S. Chanda; Umapada Pal; Oriol Ramos Terrades |
|
|
Title |
Word-Wise Thai and Roman Script Identification |
Type |
Journal Article |
|
Year |
2009 |
Publication |
ACM Transactions on Asian Language Information Processing |
Abbreviated Journal |
TALIP |
|
|
Volume |
8 |
Issue |
3 |
Pages |
1-21 |
|
|
|
|
Abstract |
In some Thai documents, a single text line of a printed document page may contain words of both Thai and Roman scripts. For the Optical Character Recognition (OCR) of such a document page it is better to identify, at first, Thai and Roman script portions and then to use individual OCR systems of the respective scripts on these identified portions. In this article, an SVM-based method is proposed for identification of word-wise printed Roman and Thai scripts from a single line of a document page. Here, at first, the document is segmented into lines and then lines are segmented into character groups (words). In the proposed scheme, we identify the script of a character group combining different character features obtained from structural shape, profile behavior, component overlapping information, topological properties, and water reservoir concept, etc. Based on the experiment on 10,000 data (words) we obtained 99.62% script identification accuracy from the proposed scheme. |
|
|
|
|
ISSN |
1530-0226 |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ CPR2009f |
Serial |
1869 |
|
|
|
|
|
Author |
Marçal Rusiñol; Josep Llados |
|
|
Title |
Boosting the Handwritten Word Spotting Experience by Including the User in the Loop |
Type |
Journal Article |
|
Year |
2014 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
47 |
Issue |
3 |
Pages |
1063–1072 |
|
|
Keywords |
Handwritten word spotting; Query by example; Relevance feedback; Query fusion; Multidimensional scaling |
|
|
Abstract |
In this paper, we study the effect of taking the user into account in a query-by-example handwritten word spotting framework. Several off-the-shelf query fusion and relevance feedback strategies have been tested in the handwritten word spotting context. The increase in terms of precision when the user is included in the loop is assessed using two datasets of historical handwritten documents and two baseline word spotting approaches both based on the bag-of-visual-words model. We finally present two alternative ways of presenting the results to the user that might be more attractive and suitable to the user's needs than the classic ranked list. |
|
|
|
|
ISSN |
0031-3203 |
|
|
Notes |
DAG; 600.045; 600.061; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RuL2013 |
Serial |
2343 |
|
|
|
|
|
Author |
Marçal Rusiñol; Lluis Pere de las Heras; Oriol Ramos Terrades |
|
|
Title |
Flowchart Recognition for Non-Textual Information Retrieval in Patent Search |
Type |
Journal Article |
|
Year |
2014 |
Publication |
Information Retrieval |
Abbreviated Journal |
IR |
|
|
Volume |
17 |
Issue |
5-6 |
Pages |
545-562 |
|
|
Keywords |
Flowchart recognition; Patent documents; Text/graphics separation; Raster-to-vector conversion; Symbol recognition |
|
|
Abstract |
Relatively little research has been done on the topic of patent image retrieval and in general in most of the approaches the retrieval is performed in terms of a similarity measure between the query image and the images in the corpus. However, systems aimed at overcoming the semantic gap between the visual description of patent images and their conveyed concepts would be very helpful for patent professionals. In this paper we present a flowchart recognition method aimed at achieving a structured representation of flowchart images that can be further queried semantically. The proposed method was submitted to the CLEF-IP 2012 flowchart recognition task. We report the obtained results on this dataset. |
|
|
|
|
ISSN |
1386-4564 |
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RHR2013 |
Serial |
2342 |
|
|
|
|
|
Author |
Albert Gordo; Alicia Fornes; Ernest Valveny |
|
|
Title |
Writer identification in handwritten musical scores with bags of notes |
Type |
Journal Article |
|
Year |
2013 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
46 |
Issue |
5 |
Pages |
1337-1345 |
|
|
|
|
Abstract |
Writer Identification is an important task for the automatic processing of documents. However, the identification of the writer in graphical documents is still challenging. In this work, we adapt the Bag of Visual Words framework to the task of writer identification in handwritten musical scores. A vanilla implementation of this method already performs comparably to the state-of-the-art. Furthermore, we analyze the effect of two improvements of the representation: a Bhattacharyya embedding, which improves the results at virtually no extra cost, and a Fisher Vector representation that very significantly improves the results at the cost of a more complex and costly representation. Experimental evaluation shows results more than 20 points above the state-of-the-art in a new, challenging dataset. |
|
|
|
|
ISSN |
0031-3203 |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ GFV2013 |
Serial |
2307 |
|
|
|
|
|
Author |
Volkmar Frinken; Andreas Fischer; Markus Baumgartner; Horst Bunke |
|
|
Title |
Keyword spotting for self-training of BLSTM NN based handwriting recognition systems |
Type |
Journal Article |
|
Year |
2014 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
47 |
Issue |
3 |
Pages |
1073-1082 |
|
|
Keywords |
Document retrieval; Keyword spotting; Handwriting recognition; Neural networks; Semi-supervised learning |
|
|
Abstract |
The automatic transcription of unconstrained continuous handwritten text requires well trained recognition systems. The semi-supervised paradigm introduces the concept of not only using labeled data but also unlabeled data in the learning process. Unlabeled data can be gathered at little or no cost. Hence it has the potential to reduce the need for labeling training data, a tedious and costly process. Given a weak initial recognizer trained on labeled data, self-training can be used to recognize unlabeled data and add words that were recognized with high confidence to the training set for re-training. This process is not trivial and requires great care as far as selecting the elements that are to be added to the training set is concerned. In this paper, we propose to use a bidirectional long short-term memory neural network handwriting recognition system for keyword spotting in order to select new elements. A set of experiments shows the high potential of self-training for bootstrapping handwriting recognition systems, both for modern and historical handwritings, and demonstrates the benefits of using keyword spotting over previously published self-training schemes. |
|
|
|
|
Notes |
DAG; 600.077; 602.101 |
Approved |
no |
|
|
Call Number |
Admin @ si @ FFB2014 |
Serial |
2297 |
|