|
Anjan Dutta, Josep Llados, & Umapada Pal. (2013). A symbol spotting approach in graphical documents by hashing serialized graphs. PR - Pattern Recognition, 46(3), 752–768.
Abstract: In this paper we propose a symbol spotting technique in graphical documents. Graphs are used to represent the documents and a (sub)graph matching technique is used to detect the symbols in them. We propose a graph serialization to reduce the usual computational complexity of graph matching. Serialization of graphs is performed by computing acyclic graph paths between each pair of connected nodes. Graph paths are one-dimensional structures of graphs which are less expensive in terms of computation. At the same time they enable robust localization even in the presence of noise and distortion. Indexing in large graph databases involves a computational burden as well. We propose a graph factorization approach to tackle this problem. Factorization is intended to create a unified indexed structure over the database of graphical documents. Once graph paths are extracted, the entire database of graphical documents is indexed in hash tables by locality sensitive hashing (LSH) of shape descriptors of the paths. The hashing data structure aims to execute an approximate k-NN search in sub-linear time. We have performed detailed experiments with various datasets of line drawings and compared our method with the state-of-the-art works. The results demonstrate the effectiveness and efficiency of our technique.
Keywords: Symbol spotting; Graphics recognition; Graph matching; Graph serialization; Graph factorization; Graph paths; Hashing
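The hash-table indexing step described in the abstract above can be illustrated with a minimal sketch. It assumes random-hyperplane (sign of random projection) hashing, which is one common LSH family; the abstract does not say which hash family or shape descriptor the authors use. The function names and the descriptor vectors are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplane normals in a dim-dimensional space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_key(vec, planes):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(
        1 if sum(p * v for p, v in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )


def build_index(descriptors, planes):
    """Map each hash key to the ids of the paths whose descriptor falls in it."""
    table = defaultdict(list)
    for path_id, vec in descriptors.items():
        table[hash_key(vec, planes)].append(path_id)
    return table


def query(table, planes, vec):
    """Return candidate path ids from the bucket of the query descriptor."""
    return table.get(hash_key(vec, planes), [])
```

A query only inspects one bucket rather than the whole database, which is where the sub-linear lookup cost comes from. Practical systems usually keep several such tables with independent hyperplanes to reduce missed neighbours.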
|
|
|
Katerine Diaz, Jesus Martinez del Rincon, & Aura Hernandez-Sabate. (2017). Decremental generalized discriminative common vectors applied to images classification. KBS - Knowledge-Based Systems, 131, 46–57.
Abstract: In this paper, a novel decremental subspace-based learning method called Decremental Generalized Discriminative Common Vectors method (DGDCV) is presented. The method makes use of the concept of decremental learning, which we introduce in the field of supervised feature extraction and classification. By efficiently removing unnecessary data and/or classes from a knowledge base, our methodology is able to update the model without recalculating the full projection or accessing the previously processed training data, while retaining the previously acquired knowledge. The proposed method has been validated on 6 standard face recognition datasets, showing a considerable computational gain without compromising the accuracy of the model.
Keywords: Decremental learning; Generalized Discriminative Common Vectors; Feature extraction; Linear subspace methods; Classification
|
|
|
Rada Deeb, Damien Muselet, Mathieu Hebert, Alain Tremeau, & Joost Van de Weijer. (2017). 3D color charts for camera spectral sensitivity estimation. In 28th British Machine Vision Conference.
Abstract: Estimating spectral data such as camera sensor responses or illuminant spectral power distribution from raw RGB camera outputs is crucial in many computer vision applications. Usually, 2D color charts with various patches of known spectral reflectance are used as reference for such purpose. Deducing n-D spectral data (n ≫ 3) from 3D RGB inputs is an ill-posed problem that requires a high number of inputs. Unfortunately, most of the natural color surfaces have spectral reflectances that are well described by low-dimensional linear models, i.e. each spectral reflectance can be approximated by a weighted sum of the others. It has been shown that adding patches to color charts does not help in practice, because the information they add is redundant with the information provided by the first set of patches. In this paper, we propose to use spectral data of higher dimensionality by using 3D color charts that create inter-reflections between the surfaces. These inter-reflections produce multiplications between natural spectral curves and so provide non-linear spectral curves. We show that such data provide enough information for accurate spectral data estimation.
|
|
|
Katerine Diaz, Jesus Martinez del Rincon, Aura Hernandez-Sabate, Marçal Rusiñol, & Francesc J. Ferri. (2018). Fast Kernel Generalized Discriminative Common Vectors for Feature Extraction. JMIV - Journal of Mathematical Imaging and Vision, 60(4), 512–524.
Abstract: This paper presents a supervised subspace learning method called Kernel Generalized Discriminative Common Vectors (KGDCV), as a novel extension of the known Discriminative Common Vectors method with Kernels. Our method combines the advantages of kernel methods to model complex data and solve nonlinear problems with moderate computational complexity, with the better generalization properties of generalized approaches for large dimensional data. This attractive combination makes KGDCV especially suited for feature extraction and classification in computer vision, image processing and pattern recognition applications. Two different approaches to this generalization are proposed, a first one based on the kernel trick (KT) and a second one based on the nonlinear projection trick (NPT) for even higher efficiency. Both methodologies have been validated on four different image datasets containing faces, objects and handwritten digits, and compared against well-known non-linear state-of-the-art methods. Results show better discriminant properties than other generalized approaches, both linear and kernel. In addition, the KGDCV-NPT approach presents a considerable computational gain, without compromising the accuracy of the model.
|
|
|
Katerine Diaz, Jesus Martinez del Rincon, Aura Hernandez-Sabate, & Debora Gil. (2018). Continuous head pose estimation using manifold subspace embedding and multivariate regression. ACCESS - IEEE Access, 6, 18325–18334.
Abstract: In this paper, a continuous head pose estimation system is proposed to estimate yaw and pitch head angles from raw facial images. Our approach is based on manifold learning-based methods, due to their promising generalization properties shown for face modelling from images. The method combines histograms of oriented gradients, generalized discriminative common vectors and continuous local regression to achieve successful performance. Our proposal was tested on multiple standard face datasets, as well as in a realistic scenario. Results show a considerable performance improvement and a higher consistency of our model in comparison with other state-of-the-art methods, with angular errors varying between 9 and 17 degrees.
Keywords: Head Pose estimation; HOG features; Generalized Discriminative Common Vectors; B-splines; Multiple linear regression
|
|
|
Wenlong Deng, Yongli Mou, Takahiro Kashiwa, Sergio Escalera, Kohei Nagai, Kotaro Nakayama, et al. (2020). Vision based Pixel-level Bridge Structural Damage Detection Using a Link ASPP Network. AC - Automation in Construction, 110, 102973.
Abstract: Structural Health Monitoring (SHM) has greatly benefited from computer vision. Recently, deep learning approaches have been widely used to accurately estimate the state of deterioration of infrastructure. In this work, we focus on the problem of bridge surface structural damage detection, such as delamination and rebar exposure. It is well known that the quality of a deep learning model is highly dependent on the quality of the training dataset. Bridge damage detection, our application domain, has the following main challenges: (i) labeling the damages requires knowledgeable civil engineering professionals, which makes it difficult to collect a large annotated dataset; (ii) the damage area could be very small, whereas the background area is large, which creates an unbalanced training environment; (iii) because the exact extent of the damage is difficult to determine, there is often a variation among different labelers who perform pixel-wise labeling. In this paper, we propose a novel model for bridge structural damage detection to address the first two challenges. This paper follows the idea of an atrous spatial pyramid pooling (ASPP) module that is designed as a novel network for bridge damage detection. Further, we introduce the weight balanced Intersection over Union (IoU) loss function to achieve accurate segmentation on a highly unbalanced small dataset. The experimental results show that (i) the IoU loss function improves the overall performance of damage detection, as compared to cross entropy loss or focal loss, and (ii) the proposed model has a better ability to detect a minority class than other light segmentation networks.
Keywords: Semantic image segmentation; Deep learning
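The abstract above names a weight balanced IoU loss but does not give its formula. The following is a minimal sketch of one plausible form, assuming a soft (differentiable) IoU per class combined with per-class weights; the function name, the weighting scheme and the input layout are assumptions, not the authors' definition.

```python
def weighted_iou_loss(probs, targets, weights, eps=1e-7):
    """Weighted soft IoU loss.

    probs, targets: one flattened list of per-pixel values for each class
                    (predicted probabilities and 0/1 labels respectively).
    weights:        one non-negative weight per class (assumed scheme).
    """
    total = 0.0
    for p_c, y_c, w_c in zip(probs, targets, weights):
        # Soft intersection and union over all pixels of this class.
        inter = sum(p * y for p, y in zip(p_c, y_c))
        union = sum(p + y - p * y for p, y in zip(p_c, y_c))
        total += w_c * (inter + eps) / (union + eps)
    # 0 when every class overlaps perfectly, close to 1 when none does.
    return 1.0 - total / sum(weights)
```

Because each class contributes a ratio rather than a per-pixel sum, a small damage region is not swamped by the large background area, and raising the weight of the minority class shifts the loss further toward it.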
|
|
|
Marta Diez-Ferrer, Arturo Morales, Rosa Lopez Lisbona, Noelia Cubero, Cristian Tebe, Susana Padrones, et al. (2019). Ultrathin Bronchoscopy with and without Virtual Bronchoscopic Navigation: Influence of Segmentation on Diagnostic Yield. RES - Respiration, 97(3), 252–258.
Abstract: Background: Bronchoscopy is a safe technique for diagnosing peripheral pulmonary lesions (PPLs), and virtual bronchoscopic navigation (VBN) helps guide the bronchoscope to PPLs. Objectives: We aimed to compare the diagnostic yield of VBN-guided and unguided ultrathin bronchoscopy (UTB) and explore clinical and technical factors associated with better results. We developed a diagnostic algorithm for deciding whether to use VBN to reach PPLs or choose an alternative diagnostic approach. Methods: We compared diagnostic yield between VBN-UTB (prospective cases) and unguided UTB (historical controls) and analyzed the VBN-UTB subgroup to identify clinical and technical variables that could predict the success of VBN-UTB. Results: Fifty-five cases and 110 controls were included. The overall diagnostic yield did not differ between the VBN-guided and unguided arms (47 and 40%, respectively; p = 0.354). Although the yield was slightly higher for PPLs ≤20 mm in the VBN-UTB arm, the difference was not significant (p = 0.069). No other clinical characteristics were associated with a higher yield in a subgroup analysis, but an 85% diagnostic yield was observed when segmentation was optimal and the PPL was endobronchial (vs. 30% when segmentation was suboptimal and 20% when segmentation was optimal but the PPL was extrabronchial). Conclusions: VBN-guided UTB is not superior to unguided UTB. A greater impact of VBN-guided over unguided UTB is highly dependent on both segmentation quality and an endobronchial location of the PPL. Segmentation quality should be considered before starting a procedure, when an alternative technique that may improve yield can be chosen, saving time and resources.
Keywords: Lung cancer; Peripheral lung lesion; Diagnosis; Bronchoscopy; Ultrathin bronchoscopy; Virtual bronchoscopic navigation
|
|
|
Fadi Dornaika, Abdelmalik Moujahid, & Bogdan Raducanu. (2013). Facial expression recognition using tracked facial actions: Classifier performance analysis. EAAI - Engineering Applications of Artificial Intelligence, 26(1), 467–477.
Abstract: In this paper, we address the analysis and recognition of facial expressions in continuous videos. More precisely, we study the performance of classifiers that exploit head pose independent temporal facial action parameters. These are provided by an appearance-based 3D face tracker that simultaneously provides the 3D head pose and facial actions. The use of such a tracker makes the recognition pose- and texture-independent. Two different schemes are studied. The first scheme adopts a dynamic time warping technique for recognizing expressions where training data are given by temporal signatures associated with different universal facial expressions. The second scheme models temporal signatures associated with facial actions with fixed length feature vectors (observations), and uses some machine learning algorithms in order to recognize the displayed expression. Experiments quantified the performance of different schemes. These were carried out on CMU video sequences and home-made video sequences. The results show that the use of dimension reduction techniques on the extracted time series can improve the classification performance. Moreover, these experiments show that the best recognition rate can be above 90%.
Keywords: Visual face tracking; 3D deformable models; Facial actions; Dynamic facial expression recognition; Human–computer interaction
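The first scheme in the abstract above matches a test sequence against stored temporal signatures with dynamic time warping. A minimal sketch of that idea follows, assuming scalar-valued sequences and absolute difference as the local cost; the paper's facial action parameters are multi-dimensional, and the names `dtw_distance` and `classify` are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # D[i][j] = cost of the best alignment of a[:i] with b[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


def classify(sequence, templates):
    """Return the label of the template closest to the sequence under DTW."""
    return min(templates, key=lambda label: dtw_distance(sequence, templates[label]))
```

The warping lets an expression performed slowly align with the same expression performed quickly, which a fixed frame-by-frame comparison would penalise.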
|
|
|
Sounak Dey, Anguelos Nicolaou, Josep Llados, & Umapada Pal. (2016). Local Binary Pattern for Word Spotting in Handwritten Historical Document. In Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR) (pp. 574–583). LNCS.
Abstract: Digital libraries store images which can be highly degraded, and to index this kind of image we resort to word spotting as our information retrieval system. Information retrieval for handwritten document images is more challenging due to the difficulties in complex layout analysis, large variations of writing styles, and degradation or low quality of historical manuscripts. This paper presents a simple, innovative, learning-free method for word spotting in large-scale historical documents combining Local Binary Pattern (LBP) and spatial sampling. This method offers three advantages: firstly, it operates in a completely learning-free paradigm, which is very different from unsupervised learning methods; secondly, the computational time is significantly low because of the LBP features, which are very fast to compute; and thirdly, the method can be used in scenarios where annotations are not available. Finally, we compare the results of our proposed retrieval method with other methods in the literature and we obtain the best results in the learning-free paradigm.
Keywords: Local binary patterns; Spatial sampling; Learning-free; Word spotting; Handwritten; Historical document analysis; Large-scale data
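The LBP features mentioned in the abstract above are cheap because each pixel needs only eight comparisons. The sketch below shows the basic 3x3 LBP code and a 256-bin histogram over a grayscale image given as a list of rows; the paper's exact LBP variant and its spatial sampling scheme are not described in the abstract, so this is the textbook operator only, with hypothetical function names.

```python
# Clockwise 8-neighbourhood, starting at the top-left pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """8-bit code: bit k is set when neighbour k is >= the centre pixel."""
    center = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Histogram of LBP codes over all interior pixels of the image."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

Two word images can then be compared by a distance between their histograms, with no training stage involved.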
|
|
|
Sounak Dey, Anguelos Nicolaou, Josep Llados, & Umapada Pal. (2019). Evaluation of the Effect of Improper Segmentation on Word Spotting. IJDAR - International Journal on Document Analysis and Recognition, 22, 361–374.
Abstract: Word spotting is an important recognition task in large-scale retrieval of document collections. In most cases, methods are developed and evaluated assuming perfect word segmentation. In this paper, we propose an experimental framework to quantify the effect that word segmentation has on the performance achieved by word spotting methods in identical unbiased conditions. The framework consists of generating systematic distortions on segmentation and retrieving the original queries from the distorted dataset. We have tested our framework on several established and state-of-the-art methods using George Washington and Barcelona Marriage Datasets. The experiments allow for an estimate of the end-to-end performance of word spotting methods.
|
|
|
Fadi Dornaika, & J. Ahlberg. (2006). Fitting 3D face models for tracking and active appearance model training. Image and Vision Computing, 24(9), 1010–1024.
|
|
|
Fadi Dornaika, & Franck Davoine. (2005). Facial expression recognition in continuous videos using dynamic programming.
|
|
|
Fadi Dornaika, & Franck Davoine. (2005). SFM for planar scenes using image derivatives.
|
|
|
Fadi Dornaika, & Franck Davoine. (2005). Simultaneous Facial Action Tracking and Expression Recognition using a Particle Filter.
|
|
|
Fadi Dornaika, & Franck Davoine. (2006). Facial expression recognition using auto-regressive models.
|
|