|
Christophe Rigaud and Clement Guerin. 2014. Localisation contextuelle des personnages de bandes dessinées. Colloque International Francophone sur l'Écrit et le Document.
Abstract: The authors propose a method for locating characters in comic book panels that relies on the characteristics of speech balloons. The evaluation shows a character localization rate of up to 65%.
|
|
|
Marçal Rusiñol, J. Chazalon and Jean-Marc Ogier. 2014. Normalisation et validation d'images de documents capturées en mobilité. Colloque International Francophone sur l'Écrit et le Document, 109–124.
Abstract: Mobile document image acquisition introduces many distortions that must be corrected or detected on the device, before the document becomes unavailable or data transmission fees are incurred. In this paper, we propose a system to correct perspective and illumination issues and to estimate the sharpness of the image for OCR. The correction step relies on fast and accurate border detection followed by illumination normalization. Its evaluation on a private dataset shows a clear improvement in OCR accuracy. The quality assessment step relies on a combination of focus measures. Its evaluation on a public dataset shows that this simple method compares well with state-of-the-art learning-based methods, which cannot be embedded on a mobile device, and outperforms metric-based methods.
Keywords: mobile document image acquisition; perspective correction; illumination correction; quality assessment; focus measure; OCR accuracy prediction
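The abstract states that quality assessment relies on a combination of focus measures but does not name them. As a minimal sketch of one widely used focus measure only (the variance of the Laplacian), and not necessarily one the authors used, assuming a grayscale image given as a list of rows; the function name `laplacian_variance` is hypothetical:

```python
from typing import Sequence


def laplacian_variance(image: Sequence[Sequence[float]]) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels.

    Assumption: higher values indicate a sharper image; this is one
    common focus measure, not the combination used in the paper.
    """
    rows, cols = len(image), len(image[0])
    responses = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Discrete Laplacian: sum of the 4 neighbours minus 4x the centre.
            lap = (
                image[r - 1][c] + image[r + 1][c]
                + image[r][c - 1] + image[r][c + 1]
                - 4 * image[r][c]
            )
            responses.append(lap)
    if not responses:
        return 0.0  # image too small to have interior pixels
    mean = sum(responses) / len(responses)
    return sum((x - mean) ** 2 for x in responses) / len(responses)


# A uniform image has no edges; a checkerboard has strong ones.
flat = [[10.0] * 5 for _ in range(5)]
checker = [[255.0 * ((r + c) % 2) for c in range(5)] for r in range(5)]
```

A score like this could be compared against a threshold, or combined with other measures, to decide whether a capture is sharp enough for OCR; how to combine them is left open here.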
|
|
|
P. Wang, V. Eglin, C. Garcia, C. Largeron, Josep Llados and Alicia Fornes. 2014. Représentation par graphe de mots manuscrits dans les images pour la recherche par similarité. Colloque International Francophone sur l'Écrit et le Document, 233–248.
Abstract: Effective information retrieval on handwritten document images has always been a challenging task. In this paper, we propose a novel handwritten word spotting approach based on graph representation. The presented model comprises both topological and morphological signatures of handwriting. Skeleton-based graphs with Shape Context labeled vertices are established for connected components. Each word image is represented as a sequence of graphs. To be robust to handwriting variations, an exhaustive merging process based on DTW alignment results is introduced in the similarity measure between word images. To limit computational complexity, an approximate graph edit distance approach using bipartite matching is employed for graph matching. The experiments on the George Washington dataset and the marriage records from the Barcelona Cathedral dataset demonstrate that the proposed approach outperforms the state-of-the-art structural methods.
Keywords: word spotting; graph-based representation; shape context description; graph edit distance; DTW; block merging; query by example
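The abstract names DTW alignment as the basis of the similarity measure between word images. As a minimal sketch of classic dynamic time warping only, not the authors' graph-sequence pipeline, assuming each sequence element is a plain number and the local cost is the absolute difference; the function name `dtw_distance` and the `cost` parameter are hypothetical:

```python
from typing import Callable, Sequence


def dtw_distance(
    a: Sequence[float],
    b: Sequence[float],
    cost: Callable[[float, float], float] = lambda x, y: abs(x - y),
) -> float:
    """Classic dynamic time warping distance between two sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # acc[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = cost(a[i - 1], b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            acc[i][j] = step + min(
                acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1]
            )
    return acc[n][m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0: the repeated 2 is absorbed
```

In a setting like the one described, `cost` could in principle be swapped for a dissimilarity between graphs, such as an approximate graph edit distance; that substitution is an assumption and is not shown here.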
|
|
|
Marçal Rusiñol, J. Chazalon and Jean-Marc Ogier. 2016. Filtrage de descripteurs locaux pour l'amélioration de la détection de documents. Colloque International Francophone sur l'Écrit et le Document.
Abstract: In this paper we propose an effective method aimed at reducing the number of local descriptors to be indexed in a document matching framework. In an off-line training stage, the matching between the model document and incoming images is computed, and the local descriptors from the model that consistently produce good matches are retained. We have evaluated this approach using the ICDAR2015 SmartDOC dataset, which contains nearly 25,000 images of documents captured by a mobile device. We have tested the performance of this filtering step using ORB and SIFT local detectors and descriptors. The results show a significant gain both in the quality of the final matching and in time and space requirements.
Keywords: Local descriptors; mobile capture; document matching; keypoint selection
|
|
|
Herve Locteau, Sebastien Mace, Ernest Valveny and Salvatore Tabbone. 2010. Extraction des pièces d'un plan d'habitation. Colloque International Francophone sur l'Écrit et le Document, 1–12.
Abstract: In this article, a method to extract the rooms of an architectural floor plan image is described. We first present a line detection algorithm to extract long lines in the image. Those lines are analyzed to identify the existing walls. From this point, room extraction can be seen as a classical segmentation task in which each region corresponds to a room. The chosen resolution strategy consists of recursively decomposing the image until nearly convex regions are obtained. The notion of convexity is difficult to quantify, and the selection of separation lines can also be rough. Thus, we take advantage of knowledge associated with architectural floor plans in order to obtain mainly rectangular rooms. Preliminary tests on a set of real documents show promising results.
|
|
|
Antonio Clavelli, Dimosthenis Karatzas, Josep Llados, Mario Ferraro and Giuseppe Boccignone. 2014. Modelling task-dependent eye guidance to objects in pictures. CoCom, 6(3), 558–584.
5Y Impact Factor: 1.14 / 3rd (Computer Science, Artificial Intelligence)
Abstract: We introduce a model of attentional eye guidance based on the rationale that the deployment of gaze is to be considered in the context of a general action-perception loop relying on two strictly intertwined processes: sensory processing, depending on current gaze position, identifies sources of information that are most valuable under the given task; motor processing links such information with the oculomotor act by sampling the next gaze position and thus performing the gaze shift. In such a framework, the choice of where to look next is task-dependent and oriented to classes of objects embedded within pictures of complex scenes. The dependence on task is taken into account by exploiting the value and the payoff of gazing at certain image patches or proto-objects that provide a sparse representation of the scene objects. The different levels of the action-perception loop are represented in probabilistic form and eventually give rise to a stochastic process that generates the gaze sequence. In this way, the model also accounts for statistical properties of gaze shifts such as individual scan path variability. Results of the simulations are compared with experimental data derived both from publicly available datasets and from our own experiments.
Keywords: Visual attention; Gaze guidance; Value; Payoff; Stochastic fixation prediction
|
|
|
Chenyang Fu, Kaida Xiao, Dimosthenis Karatzas and Sophie Wuerger. 2011. Investigation of Unique Hue Setting Changes with Ageing. COL, 9(5), 053301-5.
Abstract: Chromatic sensitivity along the protan, deutan, and tritan lines and the loci of the unique hues (red, green, yellow, blue) for a very large sample (n = 185) of colour-normal observers ranging from 18 to 75 years of age are assessed. Visual judgments are obtained under normal viewing conditions using colour patches on a self-luminous display under controlled adaptation conditions. Trivector discrimination thresholds show an increase as a function of age along the protan, deutan, and tritan axes, with the largest increase present along the tritan line; less pronounced shifts in unique hue settings are also observed. Based on the chromatic (protan, deutan, tritan) thresholds and using scaled cone signals, we predict the unique hue changes with ageing. A dependency on age for unique red and unique yellow for predicted hue angle is found. We conclude that chromatic sensitivity deteriorates significantly with age, whereas the appearance of unique hues is much less affected, remaining almost constant despite the known changes in the ocular media.
|
|
|
Joan M. Nuñez, Jorge Bernal, Miquel Ferrer and Fernando Vilariño. 2014. Impact of Keypoint Detection on Graph-based Characterization of Blood Vessels in Colonoscopy Videos. CARE workshop.
Abstract: We explore the potential of using blood vessels as anatomical landmarks for developing image registration methods in colonoscopy images. An unequivocal representation of blood vessels could be used to guide follow-up methods to track lesions over different interventions. We propose a graph-based representation to characterize network structures, such as blood vessels, based on the use of intersections and endpoints. We present a study assessing the minimal performance a keypoint detector should achieve so that the structure can still be recognized. Experimental results show that, even with a loss of 35% of the keypoints, the descriptive power of the graphs associated with the vessel pattern is still high enough to recognize blood vessels.
Keywords: Colonoscopy; Graph Matching; Biometrics; Vessel; Intersection
|
|
|
Marçal Rusiñol, Lluis Gomez, A. Landman, M. Silva Constenla and Dimosthenis Karatzas. 2019. Automatic Structured Text Reading for License Plates and Utility Meters. BMVC Workshop on Visual Artificial Intelligence and Entrepreneurship.
Abstract: Reading text in images has attracted interest from computer vision researchers for many years. Our technology focuses on the extraction of structured text – such as serial numbers, machine readings, product codes, etc. – so that it is able to center its attention just on the relevant textual elements. It is conceived to work in an end-to-end fashion, bypassing any explicit text segmentation stage. In this paper we present two different industrial use cases where we have applied our automatic structured text reading technology. In the first one, we demonstrate an outstanding performance when reading license plates compared to the current state of the art. In the second one, we present results on our solution for reading utility meters. The technology is commercialized by a recently created spin-off company, and both solutions are at different stages of integration with final clients.
|
|
|
Oriol Ramos Terrades, Albert Berenguel and Debora Gil. 2022. A Flexible Outlier Detector Based on a Topology Given by Graph Communities. BDR, 29, 100332.
Abstract: Outlier detection is essential for optimal performance of machine learning methods and statistical predictive models. It is especially critical in small-sample-size unbalanced problems, since in such settings outliers become highly influential and significantly bias models. These experimental settings are common in medical applications, such as the diagnosis of rare pathologies, the outcome of experimental personalized treatments, or pandemic emergencies. In contrast to population-based methods, neighborhood-based local approaches, which compute an outlier score from the neighbors of each sample, are simple, flexible methods that have the potential to perform well in small-sample-size unbalanced problems. A main concern of local approaches is the impact that the computation of each sample neighborhood has on the method's performance. Most approaches use a distance in the feature space to define a single neighborhood, which requires careful selection of several parameters, such as the number of neighbors.
This work presents a local approach based on a local measure of the heterogeneity of sample labels in the feature space, considered as a topological manifold. The topology is computed using the communities of a weighted graph codifying mutual nearest neighbors in the feature space. In this way, we provide a set of multiple neighborhoods able to describe the structure of complex spaces without parameter fine-tuning. Extensive experiments on real-world and synthetic data sets show that our approach outperforms both local and global strategies in multi-view and single-view settings.
Keywords: Classification algorithms; Detection algorithms; Description of feature space local structure; Graph communities; Machine learning algorithms; Outlier detectors
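The abstract describes a weighted graph codifying mutual nearest neighbors in the feature space. As a rough illustration of that first step only (community detection and outlier scoring are not covered), assuming Euclidean distance, a fixed neighbor count `k`, and the distance itself as the edge weight; none of these choices, nor the function name `mutual_knn_edges`, is taken from the paper:

```python
import math
from typing import Dict, List, Sequence, Set, Tuple


def mutual_knn_edges(
    points: Sequence[Sequence[float]], k: int
) -> Dict[Tuple[int, int], float]:
    """Edges (i, j), i < j, where i and j are in each other's k nearest neighbours.

    Assumption: the edge weight is the Euclidean distance between the points.
    """
    n = len(points)
    neighbours: List[Set[int]] = []
    for i in range(n):
        # Rank every other point by its distance to point i, keep the k closest.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )
        neighbours.append(set(others[:k]))
    edges: Dict[Tuple[int, int], float] = {}
    for i in range(n):
        for j in neighbours[i]:
            # Keep the edge only if the neighbour relation holds both ways.
            if i < j and i in neighbours[j]:
                edges[(i, j)] = math.dist(points[i], points[j])
    return edges


# Three nearby points and one far away: the distant point gets no mutual edge.
sample = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
edges = mutual_knn_edges(sample, k=2)
```

Samples left without mutual edges, like the distant point here, are natural candidates for closer inspection, although the paper's actual score is based on label heterogeneity within graph communities.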
|
|