|
Sebastien Mace, Herve Locteau, Ernest Valveny, & Salvatore Tabbone. (2010). A system to detect rooms in architectural floor plan images. In 9th IAPR International Workshop on Document Analysis Systems (pp. 167–174).
Abstract: In this article, a system to detect rooms in architectural floor plan images is described. We first present a primitive extraction algorithm for line detection. It is based on an original coupling of the classical Hough transform with image vectorization in order to perform robust and efficient line detection. We show how the lines that satisfy some graphical arrangements are combined into walls. We also present the way we detect door hypotheses through the extraction of arcs. Walls and door hypotheses are then used by our room segmentation strategy, which consists in recursively decomposing the image until nearly convex regions are obtained. The notion of convexity is difficult to quantify, and the selection of separation lines between regions can also be rough. We take advantage of knowledge associated with architectural floor plans in order to obtain mostly rectangular rooms. Qualitative and quantitative evaluations performed on a corpus of real documents show promising results.
|
|
|
Gemma Sanchez, Ernest Valveny, Josep Llados, Enric Marti, Oriol Ramos Terrades, N. Lozano, et al. (2003). A system for virtual prototyping of architectural projects. In Proceedings of Fifth IAPR International Workshop on Pattern Recognition (pp. 65–74).
|
|
|
Enric Marti, Jordi Regincos, Jaime Lopez-Krahe, & Juan J. Villanueva. (1991). A system for interpretation of hand line drawings as three-dimensional scene for CAD input. In Proceedings of the First International Conference on Document Analysis and Recognition (pp. 472–480).
|
|
|
R. Bertrand, P. Gomez-Krämer, Oriol Ramos Terrades, P. Franco, & Jean-Marc Ogier. (2013). A System Based On Intrinsic Features for Fraudulent Document Detection. In 12th International Conference on Document Analysis and Recognition (pp. 106–110).
Abstract: Paper documents still represent a large amount of information supports used nowadays and may contain critical data. Even though official documents are secured with techniques such as printed patterns or artwork, paper documents suffer from a lack of security.
However, the high availability of cheap scanning and printing hardware allows non-experts to easily create fake documents. As the use of a watermarking system added during the document production step is hardly possible, solutions have to be proposed to distinguish a genuine document from a forged one.
In this paper, we present an automatic forgery detection method based on the document’s intrinsic features at character level. This method is based on the one hand on outlier character detection in a discriminant feature space and on the other hand on the detection of strictly similar characters. Therefore, a feature set is computed for all characters. Then, based on a distance between characters of the same class.
Keywords: paper document; document analysis; fraudulent document; forgery; fake
|
|
|
Joan Mas. (2010). A Syntactic Pattern Recognition Approach based on a Distribution Tolerant Adjacency Grammar and a Spatial Indexed Parser. Application to Sketched Document Recognition (Gemma Sanchez, & Josep Llados, Eds.). Ph.D. thesis, Ediciones Graficas Rey.
Abstract: Sketch recognition is a discipline which has gained increasing interest in the last 20 years. This is due to the appearance of new devices such as PDAs, Tablet PCs or digital pen & paper protocols. From the wide range of sketched documents, we focus on those that represent structured documents such as architectural floor plans, engineering drawings, UML diagrams, etc. To recognize and understand these kinds of documents, we first have to recognize the different component symbols and then identify the relations between these elements. Depending on how a sketch is captured, there are two categories: on-line and off-line. On-line input modes refer to drawing directly on a PDA or a Tablet PC, while off-line input modes refer to scanning a previously drawn sketch.
This thesis lies at the intersection of three areas of Computer Science: Pattern Recognition, Document Analysis and Human-Computer Interaction. The aim of this thesis is to interpret sketched documents regardless of whether they are captured on-line or off-line. For this reason, the proposed approach should have the following features. First, as we are working with sketches, the elements present in our input contain distortions. Second, as we work in both on-line and off-line input modes, the order in which the primitives are input is irrelevant. Finally, as the proposed method should be applicable in real scenarios, its response time must be short.
To interpret a sketched document we propose a syntactic approach. A syntactic approach is composed of two correlated components: a grammar and a parser. The grammar describes the different elements of the document as well as their relations. The parser, given a document, checks whether or not it belongs to the language generated by the grammar. Thus, the grammar should be able to cope with the distortions appearing in the instances of the elements. Moreover, it is necessary to define a symbol independently of the order of its primitives. Concerning the parser, when analyzing 2D sentences it does not assume an order in the primitives. Then, at each new primitive in the input, the parser searches among the previously analyzed symbols for candidates to produce a valid reduction.
Taking into account these features, we have proposed a grammar based on Adjacency Grammars. This kind of grammar defines its productions as a multiset of symbols rather than a list. This allows describing a symbol without an order in its components. To cope with distortion we have proposed a distortion model. This distortion model is an attribute estimated over the constraints of the grammar and passed through the productions. This measure gives an idea of how far the symbol is from its ideal model. In addition to the distortion on the constraints, other distortions appear when working with sketches: overtracing, overlapping, gaps or spurious strokes. Some grammatical productions have been defined to cope with these errors. Concerning recognition, we have proposed an incremental parser with an indexing mechanism. Incremental parsers analyze the input symbol by symbol, giving a response to the user when a primitive is analyzed. This makes incremental parsers suitable for both on-line and off-line input modes. The parser has been adapted with an indexing mechanism based on a spatial division. This indexing mechanism allows placing the primitives in space and reducing the search to a neighbourhood.
A third contribution is a grammatical inference algorithm. Given a set of symbols, this method captures the production describing them. In the field of formal languages different approaches have been proposed, but in the graphical domain little work has been done. The proposed method is able to capture the production from a set of symbols even when they are drawn in different orders. A matching step based on the Hausdorff distance and the Hungarian method has been proposed to match the primitives of the different symbols. In addition, the proposed approach is able to capture the variability in the parameters of the constraints.
From the experimental results, we may conclude that we have proposed a robust approach to describe and recognize sketches. Moreover, the addition of new symbols to the alphabet is not restricted to an expert. Finally, the proposed approach has been used in two real scenarios, obtaining good performance.
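The spatial indexing idea mentioned in this abstract (placing primitives in a spatial division so that the parser only searches a neighbourhood) can be illustrated with a uniform grid. This is a minimal sketch under stated assumptions, not the implementation from the thesis: the class name `GridIndex`, the cell size, the stroke identifiers, and the choice to represent each primitive by a single point are all hypothetical.

```python
from collections import defaultdict


class GridIndex:
    """Uniform grid over the plane (hypothetical stand-in for a spatial index)."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (col, row) -> list of primitive ids

    def _cell(self, x, y):
        # Floor division maps a point to the integer cell that contains it.
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, prim_id, x, y):
        self.cells[self._cell(x, y)].append(prim_id)

    def neighbours(self, x, y):
        """Return ids stored in the cell containing (x, y) and its 8 adjacent cells."""
        col, row = self._cell(x, y)
        found = []
        for dc in (-1, 0, 1):
            for dr in (-1, 0, 1):
                found.extend(self.cells.get((col + dc, row + dr), []))
        return found


# Illustrative data: two strokes close together and one far away.
index = GridIndex(cell_size=10.0)
index.insert("stroke_a", 3.0, 4.0)
index.insert("stroke_b", 12.0, 8.0)
index.insert("stroke_c", 95.0, 95.0)
nearby = sorted(index.neighbours(5.0, 5.0))
```

With this kind of structure, a query around a new primitive inspects at most nine cells instead of every primitive seen so far, which is the sort of search reduction the abstract describes.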
|
|
|
Joan Mas, Josep Llados, Gemma Sanchez, & J.A. Jorge. (2010). A syntactic approach based on distortion-tolerant Adjacency Grammars and a spatial-directed parser to interpret sketched diagrams. PR - Pattern Recognition, 43(12), 4148–4164.
Abstract: This paper presents a syntactic approach based on Adjacency Grammars (AG) for sketch diagram modeling and understanding. Diagrams are a combination of graphical symbols arranged according to a set of spatial rules defined by a visual language. AG describe visual shapes by productions defined in terms of terminal and non-terminal symbols (graphical primitives and subshapes), and a set of functions describing the spatial arrangements between symbols. Our approach to sketch diagram understanding provides three main contributions. First, since AG are linear grammars, there is a need to define shapes and relations inherently bidimensional using a sequential formalism. Second, our parsing approach uses an indexing structure based on a spatial tessellation. This serves to reduce the search space when finding candidates to produce a valid reduction. This allows order-free parsing of 2D visual sentences while keeping combinatorial explosion in check. Third, working with sketches requires a distortion model to cope with the natural variations of hand drawn strokes. To this end we extended the basic grammar with a distortion measure modeled on the allowable variation on spatial constraints associated with grammar productions. Finally, the paper reports on an experimental framework, an interactive system for sketch analysis. User tests performed on two real scenarios show that our approach is usable in interactive settings.
Keywords: Syntactic Pattern Recognition; Symbol recognition; Diagram understanding; Sketched diagrams; Adjacency Grammars; Incremental parsing; Spatial directed parsing
|
|
|
Alicia Fornes, & Josep Llados. (2010). A Symbol-dependent Writer Identification Approach in Old Handwritten Music Scores. In 12th International Conference on Frontiers in Handwriting Recognition (pp. 634–639).
Abstract: Writer identification consists in determining the writer of a piece of handwriting from a set of writers. In this paper we introduce a symbol-dependent approach for identifying the writer of old music scores, which is based on two symbol recognition methods. The main idea is to use the Blurred Shape Model descriptor and a DTW-based method for detecting, recognizing and describing the music clefs and notes. The proposed approach has been evaluated in a database of old music scores, achieving very high writer identification rates.
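For readers unfamiliar with the DTW (dynamic time warping) component named in this abstract, the sketch below shows the textbook dynamic-programming recurrence on two numeric sequences. It is an illustration only: the function name `dtw_distance`, the absolute-difference local cost, and the toy input sequences are assumptions, and the features actually compared in the paper are not specified here.

```python
def dtw_distance(a, b):
    """Textbook DTW cost between two numeric sequences (assumed local cost: |x - y|)."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy feature sequences: the second repeats one sample, which DTW absorbs at no cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

The point of the warping is that sequences of different lengths can still be compared, which is why DTW is a common choice for shapes whose sampled profiles stretch or compress.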
|
|
|
Anjan Dutta, Josep Llados, & Umapada Pal. (2013). A symbol spotting approach in graphical documents by hashing serialized graphs. PR - Pattern Recognition, 46(3), 752–768.
Abstract: In this paper we propose a symbol spotting technique in graphical documents. Graphs are used to represent the documents and a (sub)graph matching technique is used to detect the symbols in them. We propose a graph serialization to reduce the usual computational complexity of graph matching. Serialization of graphs is performed by computing acyclic graph paths between each pair of connected nodes. Graph paths are one-dimensional structures of graphs which are less expensive in terms of computation. At the same time they enable robust localization even in the presence of noise and distortion. Indexing in large graph databases involves a computational burden as well. We propose a graph factorization approach to tackle this problem. Factorization is intended to create a unified indexed structure over the database of graphical documents. Once graph paths are extracted, the entire database of graphical documents is indexed in hash tables by locality sensitive hashing (LSH) of shape descriptors of the paths. The hashing data structure aims to execute an approximate k-NN search in a sub-linear time. We have performed detailed experiments with various datasets of line drawings and compared our method with the state-of-the-art works. The results demonstrate the effectiveness and efficiency of our technique.
Keywords: Symbol spotting; Graphics recognition; Graph matching; Graph serialization; Graph factorization; Graph paths; Hashing
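The indexing step in this abstract (hashing shape descriptors of graph paths with locality sensitive hashing so that similar paths share a bucket) can be sketched as follows. The abstract does not say which LSH family is used; random-hyperplane (sign) hashing is swapped in here as an assumption, and the function names, descriptor vectors, and path identifiers are hypothetical.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplane normals in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def lsh_key(vector, hyperplanes):
    """One bit per hyperplane: which side of the hyperplane the vector lies on."""
    return tuple(
        1 if sum(w * v for w, v in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def build_table(descriptors, hyperplanes):
    """Group item ids by hash key so that a lookup touches a single bucket."""
    table = {}
    for item_id, vec in descriptors.items():
        table.setdefault(lsh_key(vec, hyperplanes), []).append(item_id)
    return table


# Hypothetical path descriptors: path_2 points the same way as path_1,
# path_3 points in the opposite direction.
planes = make_hyperplanes(num_bits=8, dim=3)
descriptors = {
    "path_1": [1.0, 2.0, 3.0],
    "path_2": [2.0, 4.0, 6.0],
    "path_3": [-1.0, -2.0, -3.0],
}
table = build_table(descriptors, planes)
candidates = table.get(lsh_key([1.0, 2.0, 3.0], planes), [])
```

A query is answered by hashing its descriptor and reading one bucket, rather than comparing against every stored path; a practical system would typically use several tables to raise recall.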
|
|
|
Xavier Perez Sala, Sergio Escalera, Cecilio Angulo, & Jordi Gonzalez. (2014). A survey on model based approaches for 2D and 3D visual human pose recovery. SENS - Sensors, 14(3), 4189–4210.
Abstract: Human Pose Recovery has been studied in the field of Computer Vision for the last 40 years. Several approaches have been reported, and significant improvements have been obtained in both data representation and model design. However, the problem of Human Pose Recovery in uncontrolled environments is far from being solved. In this paper, we define a general taxonomy to group model based approaches for Human Pose Recovery, which is composed of five main modules: appearance, viewpoint, spatial relations, temporal consistence, and behavior. Subsequently, a methodological comparison is performed following the proposed taxonomy, evaluating current SoA approaches in the aforementioned five group categories. As a result of this comparison, we discuss the main advantages and drawbacks of the reviewed literature.
Keywords: human pose recovery; human body modelling; behavior analysis; computer vision
|
|
|
Maryam Asadi-Aghbolaghi, Albert Clapes, Marco Bellantonio, Hugo Jair Escalante, Victor Ponce, Xavier Baro, et al. (2017). A survey on deep learning based approaches for action and gesture recognition in image sequences. In 12th IEEE International Conference on Automatic Face and Gesture Recognition.
Abstract: The interest in action and gesture recognition has grown considerably in recent years. In this paper, we present a survey on current deep learning methodologies for action and gesture recognition in image sequences. We introduce a taxonomy that summarizes important aspects of deep learning for approaching both tasks. We review the details of the proposed architectures, fusion strategies, main datasets, and competitions. We summarize and discuss the main works proposed so far with particular interest in how they treat the temporal dimension of data, discussing their main features and identifying opportunities and challenges for future research.
|
|
|
David Castells, Vinh Ngo, Juan Borrego-Carazo, Marc Codina, Carles Sanchez, Debora Gil, et al. (2022). A Survey of FPGA-Based Vision Systems for Autonomous Cars. ACESS - IEEE Access, 10, 132525–132563.
Abstract: On the road to making self-driving cars a reality, academic and industrial researchers are working hard to continue to increase safety while meeting technical and regulatory constraints. Understanding the surrounding environment is a fundamental task in self-driving cars. It requires combining complex computer vision algorithms. Although state-of-the-art algorithms achieve good accuracy, their implementations often require powerful computing platforms with high power consumption. In some cases, the processing speed does not meet real-time constraints. FPGA platforms are often used to implement a category of latency-critical algorithms that demand maximum performance and energy efficiency. Since self-driving car computer vision functions fall into this category, one could expect to see a wide adoption of FPGAs in autonomous cars. In this paper, we survey the computer vision FPGA-based works from the literature targeting automotive applications over the last decade. Based on the survey, we identify the strengths and weaknesses of FPGAs in this domain and future research opportunities and challenges.
Keywords: Autonomous automobile; Computer vision; field programmable gate arrays; reconfigurable architectures
|
|
|
Xavier Otazu, & Maria Vanrell. (2005). A surround-induction function to unify assimilation and contrast in a computational model of color appearance.
|
|
|
Bogdan Raducanu, & Fadi Dornaika. (2012). A Supervised Non-linear Dimensionality Reduction Approach for Manifold Learning. PR - Pattern Recognition, 45(6), 2432–2444.
IF=2.61 (2010)
Abstract: In this paper we introduce a novel supervised manifold learning technique called Supervised Laplacian Eigenmaps (S-LE), which makes use of class label information to guide the procedure of non-linear dimensionality reduction by adopting the large margin concept. The graph Laplacian is split into two components: within-class graph and between-class graph to better characterize the discriminant property of the data. Our approach has two important characteristics: (i) it adaptively estimates the local neighborhood surrounding each sample based on data density and similarity and (ii) the objective function simultaneously maximizes the local margin between heterogeneous samples and pushes the homogeneous samples closer to each other.
Our approach has been tested on several challenging face databases and it has been conveniently compared with other linear and non-linear techniques, demonstrating its superiority. Although we have concentrated in this paper on the face recognition problem, the proposed approach could also be applied to other categories of objects characterized by large variations in their appearance (such as hand or body pose, for instance).
|
|
|
A. Pujol, & Juan J. Villanueva. (2002). A supervised Modification of the Hausdorff distance for visual shape classification. International Journal of Pattern Recognition and Artificial Intelligence, 349–359.
|
|
|
Laura Igual, Joan Carles Soliva, Antonio Hernandez, Sergio Escalera, Oscar Vilarroya, & Petia Radeva. (2012). A Supervised Graph-cut Deformable Model for Brain MRI Segmentation. Deformation models: tracking, animation and applications. In Computational Vision and Biomechanics. LNCS. Springer Netherlands.
|
|