Home | << 1 2 3 4 5 6 7 8 9 10 >> [11–12] |
Records | |||||
---|---|---|---|---|---|
Author | Dimosthenis Karatzas; Faisal Shafait; Seiichi Uchida; Masakazu Iwamura; Lluis Gomez; Sergi Robles; Joan Mas; David Fernandez; Jon Almazan; Lluis Pere de las Heras | ||||
Title | ICDAR 2013 Robust Reading Competition | Type | Conference Article | ||
Year | 2013 | Publication | 12th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1484-1493 | ||
Keywords | |||||
Abstract | This report presents the final results of the ICDAR 2013 Robust Reading Competition. The competition is structured in three Challenges addressing text extraction in different application domains, namely born-digital images, real scene images and real-scene videos. The Challenges are organised around specific tasks covering text localisation, text segmentation and word recognition. The competition took place in the first quarter of 2013, and received a total of 42 submissions over the different tasks offered. This report describes the datasets and ground truth specification, details the performance evaluation protocols used and presents the final results along with a brief summary of the participating methods. | ||||
Address | Washington; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1520-5363 | ISBN | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.056 | Approved | no | ||
Call Number | Admin @ si @ KSU2013 | Serial | 2318 | ||
Permanent link to this record | |||||
Author | Lluis Gomez; Dimosthenis Karatzas | ||||
Title | Multi-script Text Extraction from Natural Scenes | Type | Conference Article | ||
Year | 2013 | Publication | 12th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 467-471 | ||
Keywords | |||||
Abstract | Scene text extraction methodologies are usually based in classification of individual regions or patches, using a priori knowledge for a given script or language. Human perception of text, on the other hand, is based on perceptual organisation through which text emerges as a perceptually significant group of atomic objects. Therefore humans are able to detect text even in languages and scripts never seen before. In this paper, we argue that the text extraction problem could be posed as the detection of meaningful groups of regions. We present a method built around a perceptual organisation framework that exploits collaboration of proximity and similarity laws to create text-group hypotheses. Experiments demonstrate that our algorithm is competitive with state of the art approaches on a standard dataset covering text in variable orientations and two languages. | ||||
Address | Washington; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1520-5363 | ISBN | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.056; 601.158; 601.197 | Approved | no | ||
Call Number | Admin @ si @ GoK2013 | Serial | 2310 | ||
Permanent link to this record | |||||
Author | Hongxing Gao; Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados; Tomokazu Sato; Masakazu Iwamura; Koichi Kise | ||||
Title | Key-region detection for document images -applications to administrative document retrieval | Type | Conference Article | ||
Year | 2013 | Publication | 12th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 230-234 | ||
Keywords | |||||
Abstract | In this paper we argue that a key-region detector designed to take into account the special characteristics of document images can result in the detection of less and more meaningful key-regions. We propose a fast key-region detector able to capture aspects of the structural information of the document, and demonstrate its efficiency by comparing against standard detectors in an administrative document retrieval scenario. We show that using the proposed detector results to a smaller number of detected key-regions and higher performance without any drop in speed compared to standard state of the art detectors. | ||||
Address | Washington; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1520-5363 | ISBN | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.056; 600.045 | Approved | no | ||
Call Number | Admin @ si @ GRK2013b | Serial | 2293 | ||
Permanent link to this record | |||||
Author | Andreas Fischer; Volkmar Frinken; Horst Bunke; Ching Y. Suen | ||||
Title | Improving HMM-Based Keyword Spotting with Character Language Models | Type | Conference Article | ||
Year | 2013 | Publication | 12th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 506-510 | ||
Keywords | |||||
Abstract | Facing high error rates and slow recognition speed for full text transcription of unconstrained handwriting images, keyword spotting is a promising alternative to locate specific search terms within scanned document images. We have previously proposed a learning-based method for keyword spotting using character hidden Markov models that showed a high performance when compared with traditional template image matching. In the lexicon-free approach pursued, only the text appearance was taken into account for recognition. In this paper, we integrate character n-gram language models into the spotting system in order to provide an additional language context. On the modern IAM database as well as the historical George Washington database, we demonstrate that character language models significantly improve the spotting performance. | ||||
Address | Washington; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1520-5363 | ISBN | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 600.045; 605.203 | Approved | no | ||
Call Number | Admin @ si @ FFB2013 | Serial | 2295 | ||
Permanent link to this record | |||||
Author | Christophe Rigaud; Dimosthenis Karatzas; Joost Van de Weijer; Jean-Christophe Burie; Jean-Marc Ogier | ||||
Title | An active contour model for speech balloon detection in comics | Type | Conference Article | ||
Year | 2013 | Publication | 12th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1240-1244 | ||
Keywords | |||||
Abstract | Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent comic book understanding would enable a variety of new applications, including content-based retrieval and content retargeting. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. Few studies have been done in this direction. In this work we detail a novel approach for closed and non-closed speech balloon localization in scanned comic book pages, an essential step towards a fully automatic comic book understanding. The approach is compared with existing methods for closed balloon localization found in the literature and results are presented. | ||||
Address | washington; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1520-5363 | ISBN | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; CIC; 600.056 | Approved | no | ||
Call Number | Admin @ si @ RKW2013a | Serial | 2260 | ||
Permanent link to this record | |||||
Author | Alicia Fornes; Xavier Otazu; Josep Llados | ||||
Title | Show through cancellation and image enhancement by multiresolution contrast processing | Type | Conference Article | ||
Year | 2013 | Publication | 12th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 200-204 | ||
Keywords | |||||
Abstract | Historical documents suffer from different types of degradation and noise such as background variation, uneven illumination or dark spots. In case of double-sided documents, another common problem is that the back side of the document usually interferes with the front side because of the transparency of the document or ink bleeding. This effect is called the show through phenomenon. Many methods are developed to solve these problems, and in the case of show-through, by scanning and matching both the front and back sides of the document. In contrast, our approach is designed to use only one side of the scanned document. We hypothesize that show-trough are low contrast components, while foreground components are high contrast ones. A Multiresolution Contrast (MC) decomposition is presented in order to estimate the contrast of features at different spatial scales. We cancel the show-through phenomenon by thresholding these low contrast components. This decomposition is also able to enhance the image removing shadowed areas by weighting spatial scales. Results show that the enhanced images improve the readability of the documents, allowing scholars both to recover unreadable words and to solve ambiguities. | ||||
Address | Washington; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1520-5363 | ISBN | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; 602.006; 600.045; 600.061; 600.052;CIC | Approved | no | ||
Call Number | Admin @ si @ FOL2013 | Serial | 2241 | ||
Permanent link to this record | |||||
Author | David Aldavert; Marçal Rusiñol; Ricardo Toledo; Josep Llados | ||||
Title | Integrating Visual and Textual Cues for Query-by-String Word Spotting | Type | Conference Article | ||
Year | 2013 | Publication | 12th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 511 - 515 | ||
Keywords | |||||
Abstract | In this paper, we present a word spotting framework that follows the query-by-string paradigm where word images are represented both by textual and visual representations. The textual representation is formulated in terms of character $n$-grams while the visual one is based on the bag-of-visual-words scheme. These two representations are merged together and projected to a sub-vector space. This transform allows to, given a textual query, retrieve word instances that were only represented by the visual modality. Moreover, this statistical representation can be used together with state-of-the-art indexation structures in order to deal with large-scale scenarios. The proposed method is evaluated using a collection of historical documents outperforming state-of-the-art performances. | ||||
Address | Washington; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1520-5363 | ISBN | Medium | ||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG; ADAS; 600.045; 600.055; 600.061 | Approved | no | ||
Call Number | Admin @ si @ ART2013 | Serial | 2224 | ||
Permanent link to this record | |||||
Author | Gioacchino Vino; Angel Sappa | ||||
Title | Revisiting Harris Corner Detector Algorithm: a Gradual Thresholding Approach | Type | Conference Article | ||
Year | 2013 | Publication | 10th International Conference on Image Analysis and Recognition | Abbreviated Journal | |
Volume | 7950 | Issue | Pages | 354-363 | |
Keywords | |||||
Abstract | This paper presents an adaptive thresholding approach intended to increase the number of detected corners, while reducing the amount of those ones corresponding to noisy data. The proposed approach works by using the classical Harris corner detector algorithm and overcome the difficulty in finding a general threshold that work well for all the images in a given data set by proposing a novel adaptive thresholding scheme. Initially, two thresholds are used to discern between strong corners and flat regions. Then, a region based criteria is used to discriminate between weak corners and noisy points in the midway interval. Experimental results show that the proposed approach has a better capability to reject false corners and, at the same time, to detect weak ones. Comparisons with the state of the art are provided showing the validity of the proposed approach. | ||||
Address | Póvoa de Varzim; Portugal; June 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-39093-7 | Medium | |
Area | Expedition | Conference | ICIAR | ||
Notes | ADAS; 600.055 | Approved | no | ||
Call Number | Admin @ si @ ViS2013 | Serial | 2562 | ||
Permanent link to this record | |||||
Author | Rahat Khan; Joost Van de Weijer; Dimosthenis Karatzas; Damien Muselet | ||||
Title | Towards multispectral data acquisition with hand-held devices | Type | Conference Article | ||
Year | 2013 | Publication | 20th IEEE International Conference on Image Processing | Abbreviated Journal | |
Volume | Issue | Pages | 2053 - 2057 | ||
Keywords | Multispectral; mobile devices; color measurements | ||||
Abstract | We propose a method to acquire multispectral data with handheld devices with front-mounted RGB cameras. We propose to use the display of the device as an illuminant while the camera captures images illuminated by the red, green and
blue primaries of the display. Three illuminants and three response functions of the camera lead to nine response values which are used for reflectance estimation. Results are promising and show that the accuracy of the spectral reconstruction improves in the range from 30-40% over the spectral reconstruction based on a single illuminant. Furthermore, we propose to compute sensor-illuminant aware linear basis by discarding the part of the reflectances that falls in the sensorilluminant null-space. We show experimentally that optimizing reflectance estimation on these new basis functions decreases the RMSE significantly over basis functions that are independent to sensor-illuminant. We conclude that, multispectral data acquisition is potentially possible with consumer hand-held devices such as tablets, mobiles, and laptops, opening up applications which are currently considered to be unrealistic. |
||||
Address | Melbourne; Australia; September 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICIP | ||
Notes | CIC; DAG; 600.048 | Approved | no | ||
Call Number | Admin @ si @ KWK2013b | Serial | 2265 | ||
Permanent link to this record | |||||
Author | Shida Beigpour; Marc Serra; Joost Van de Weijer; Robert Benavente; Maria Vanrell; Olivier Penacchio; Dimitris Samaras | ||||
Title | Intrinsic Image Evaluation On Synthetic Complex Scenes | Type | Conference Article | ||
Year | 2013 | Publication | 20th IEEE International Conference on Image Processing | Abbreviated Journal | |
Volume | Issue | Pages | 285 - 289 | ||
Keywords | |||||
Abstract | Scene decomposition into its illuminant, shading, and reflectance intrinsic images is an essential step for scene understanding. Collecting intrinsic image groundtruth data is a laborious task. The assumptions on which the ground-truth
procedures are based limit their application to simple scenes with a single object taken in the absence of indirect lighting and interreflections. We investigate synthetic data for intrinsic image research since the extraction of ground truth is straightforward, and it allows for scenes in more realistic situations (e.g, multiple illuminants and interreflections). With this dataset we aim to motivate researchers to further explore intrinsic image decomposition in complex scenes. |
||||
Address | Melbourne; Australia; September 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICIP | ||
Notes | CIC; 600.048; 600.052; 600.051 | Approved | no | ||
Call Number | Admin @ si @ BSW2013 | Serial | 2264 | ||
Permanent link to this record | |||||
Author | H. Emrah Tasli; Cevahir Çigla; Theo Gevers; A. Aydin Alatan | ||||
Title | Super pixel extraction via convexity induced boundary adaptation | Type | Conference Article | ||
Year | 2013 | Publication | 14th IEEE International Conference on Multimedia and Expo | Abbreviated Journal | |
Volume | Issue | Pages | 1-6 | ||
Keywords | |||||
Abstract | This study presents an efficient super-pixel extraction algorithm with major contributions to the state-of-the-art in terms of accuracy and computational complexity. Segmentation accuracy is improved through convexity constrained geodesic distance utilization; while computational efficiency is achieved by replacing complete region processing with boundary adaptation idea. Starting from the uniformly distributed rectangular equal-sized super-pixels, region boundaries are adapted to intensity edges iteratively by assigning boundary pixels to the most similar neighboring super-pixels. At each iteration, super-pixel regions are updated and hence progressively converging to compact pixel groups. Experimental results with state-of-the-art comparisons, validate the performance of the proposed technique in terms of both accuracy and speed. | ||||
Address | San Jose; USA; July 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1945-7871 | ISBN | Medium | ||
Area | Expedition | Conference | ICME | ||
Notes | ALTRES;ISE | Approved | no | ||
Call Number | Admin @ si @ TÇG2013 | Serial | 2367 | ||
Permanent link to this record | |||||
Author | Sergio Escalera; Jordi Gonzalez; Xavier Baro; Miguel Reyes; Oscar Lopes; Isabelle Guyon; V. Athitsos; Hugo Jair Escalante | ||||
Title | Multi-modal Gesture Recognition Challenge 2013: Dataset and Results | Type | Conference Article | ||
Year | 2013 | Publication | 15th ACM International Conference on Multimodal Interaction | Abbreviated Journal | |
Volume | Issue | Pages | 445-452 | ||
Keywords | |||||
Abstract | The recognition of continuous natural gestures is a complex and challenging problem due to the multi-modal nature of involved visual cues (e.g. fingers and lips movements, subtle facial expressions, body pose, etc.), as well as technical limitations such as spatial and temporal resolution and unreliable
depth cues. In order to promote the research advance on this field, we organized a challenge on multi-modal gesture recognition. We made available a large video database of 13; 858 gestures from a lexicon of 20 Italian gesture categories recorded with a KinectTM camera, providing the audio, skeletal model, user mask, RGB and depth images. The focus of the challenge was on user independent multiple gesture learning. There are no resting positions and the gestures are performed in continuous sequences lasting 1-2 minutes, containing between 8 and 20 gesture instances in each sequence. As a result, the dataset contains around 1:720:800 frames. In addition to the 20 main gesture categories, ‘distracter’ gestures are included, meaning that additional audio and gestures out of the vocabulary are included. The final evaluation of the challenge was defined in terms of the Levenshtein edit distance, where the goal was to indicate the real order of gestures within the sequence. 54 international teams participated in the challenge, and outstanding results were obtained by the first ranked participants. |
||||
Address | Sidney; Australia; December 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4503-2129-7 | Medium | ||
Area | Expedition | Conference | ICMI | ||
Notes | HUPBA; ISE; 600.063;MV | Approved | no | ||
Call Number | Admin @ si @ EGB2013 | Serial | 2373 | ||
Permanent link to this record | |||||
Author | Victor Ponce; Sergio Escalera; Xavier Baro | ||||
Title | Multi-modal Social Signal Analysis for Predicting Agreement in Conversation Settings | Type | Conference Article | ||
Year | 2013 | Publication | 15th ACM International Conference on Multimodal Interaction | Abbreviated Journal | |
Volume | Issue | Pages | 495-502 | ||
Keywords | |||||
Abstract | In this paper we present a non-invasive ambient intelligence framework for the analysis of non-verbal communication applied to conversational settings. In particular, we apply feature extraction techniques to multi-modal audio-RGB-depth data. We compute a set of behavioral indicators that define communicative cues coming from the fields of psychology and observational methodology. We test our methodology over data captured in victim-offender mediation scenarios. Using different state-of-the-art classification approaches, our system achieve upon 75% of recognition predicting agreement among the parts involved in the conversations, using as ground truth the experts opinions. | ||||
Address | Sidney; Australia; December 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4503-2129-7 | Medium | ||
Area | Expedition | Conference | ICMI | ||
Notes | HuPBA;MV | Approved | no | ||
Call Number | Admin @ si @ PEB2013 | Serial | 2488 | ||
Permanent link to this record | |||||
Author | Katerine Diaz; Francesc J. Ferri; W. Diaz | ||||
Title | Fast Approximated Discriminative Common Vectors using rank-one SVD updates | Type | Conference Article | ||
Year | 2013 | Publication | 20th International Conference On Neural Information Processing | Abbreviated Journal | |
Volume | 8228 | Issue | III | Pages | 368-375 |
Keywords | |||||
Abstract | An efficient incremental approach to the discriminative common vector (DCV) method for dimensionality reduction and classification is presented. The proposal consists of a rank-one update along with an adaptive restriction on the rank of the null space which leads to an approximate but convenient solution. The algorithm can be implemented very efficiently in terms of matrix operations and space complexity, which enables its use in large-scale dynamic application domains. Deep comparative experimentation using publicly available high dimensional image datasets has been carried out in order to properly assess the proposed algorithm against several recent incremental formulations.
K. Diaz-Chito, F.J. Ferri, W. Diaz |
||||
Address | Daegu; Korea; November 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-42050-4 | Medium | |
Area | Expedition | Conference | ICONIP | ||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ DFD2013 | Serial | 2439 | ||
Permanent link to this record | |||||
Author | German Ros; J. Guerrero; Angel Sappa; Antonio Lopez | ||||
Title | VSLAM pose initialization via Lie groups and Lie algebras optimization | Type | Conference Article | ||
Year | 2013 | Publication | Proceedings of IEEE International Conference on Robotics and Automation | Abbreviated Journal | |
Volume | Issue | Pages | 5740 - 5747 | ||
Keywords | SLAM | ||||
Abstract | We present a novel technique for estimating initial 3D poses in the context of localization and Visual SLAM problems. The presented approach can deal with noise, outliers and a large amount of input data and still performs in real time in a standard CPU. Our method produces solutions with an accuracy comparable to those produced by RANSAC but can be much faster when the percentage of outliers is high or for large amounts of input data. On the current work we propose to formulate the pose estimation as an optimization problem on Lie groups, considering their manifold structure as well as their associated Lie algebras. This allows us to perform a fast and simple optimization at the same time that conserve all the constraints imposed by the Lie group SE(3). Additionally, we present several key design concepts related with the cost function and its Jacobian; aspects that are critical for the good performance of the algorithm. | ||||
Address | Karlsruhe; Germany; May 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1050-4729 | ISBN | 978-1-4673-5641-1 | Medium | |
Area | Expedition | Conference | ICRA | ||
Notes | ADAS; 600.054; 600.055; 600.057 | Approved | no | ||
Call Number | Admin @ si @ RGS2013a; ADAS @ adas @ | Serial | 2225 | ||
Permanent link to this record |