Maria Alberich-Carramiñana, Guillem Alenya, Juan Andrade, E. Martinez, & Carme Torras. (2006). Affine Epipolar Direction from Two Views of a Planar Contour. In Proceedings of the Advanced Concepts for Intelligent Vision Systems Conference, LNCS 4179: 944–955.
|
Carme Julia. (2004). Motion segmentation through factorization. Application to night driving assistance.
|
Oriol Martinez. (2004). Semantic Retrieval of Memory Color Content.
|
J. Martinez. (2002). Automotive sector and Machine Vision.
|
Antonio Lopez, W. Niessen, Joan Serrat, K. Nikolay, B. Ter Haar Romeny, Juan J. Villanueva, et al. (2000). New improvements in the multiscale analysis of trabecular bone patterns. In Pattern Recognition and Applications (pp. 251–260). IOS Press.
|
Marçal Rusiñol, V. Poulain d'Andecy, Dimosthenis Karatzas, & Josep Llados. (2013). Classification of Administrative Document Images by Logo Identification. In 10th IAPR International Workshop on Graphics Recognition.
Abstract: This paper focuses on the categorization of administrative document images (such as invoices) based on the recognition of the supplier's graphical logo. Two different methods are proposed: the first uses a bag-of-visual-words model, whereas the second tries to locate logo images, described by the blurred shape model descriptor, within documents by a sliding-window technique. Preliminary results are reported on a dataset of real administrative documents.
|
Adriana Romero, Simeon Petkov, Carlo Gatta, M.Sabate, & Petia Radeva. (2012). Efficient automatic segmentation of vessels. In 16th Conference on Medical Image Understanding and Analysis.
|
Alejandro Gonzalez Alzate, David Vazquez, Antonio Lopez, & Jaume Amores. (2017). On-Board Object Detection: Multicue, Multimodal, and Multiview Random Forest of Local Experts. Cyber - IEEE Transactions on Cybernetics, 47(11), 3980–3990.
Abstract: Despite recent significant advances, object detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage multiple cues, multiple imaging modalities, and a strong multiview (MV) classifier that accounts for different object views and poses. In this paper, we provide an extensive evaluation that gives insight into how each of these aspects (multicue, multimodality, and strong MV classifier) affects accuracy both individually and when integrated together. In the multimodality component, we explore the fusion of RGB and depth maps obtained by high-definition light detection and ranging, a type of modality that is starting to receive increasing attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the accuracy, the fusion of visible spectrum and depth information boosts the accuracy by a much larger margin. The resulting detector not only ranks among the top performers in the challenging KITTI benchmark, but it is built upon very simple blocks that are easy to implement and computationally efficient.
Keywords: Multicue; multimodal; multiview; object detection
|
Sergio Escalera, Jordi Gonzalez, Xavier Baro, & Jamie Shotton. (2016). Guest Editor Introduction to the Special Issue on Multimodal Human Pose Recovery and Behavior Analysis. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 38, 1489–1491.
Abstract: The sixteen papers in this special section focus on human pose recovery and behavior analysis (HuPBA). This is one of the most challenging topics in computer vision, pattern analysis, and machine learning. It is of critical importance for application areas that include gaming, computer interaction, human-robot interaction, security, commerce, assistive technologies and rehabilitation, sports, sign language recognition, and driver assistance technology, to mention just a few. In essence, HuPBA requires dealing with the articulated nature of the human body, changes in appearance due to clothing, and the inherent problems of cluttered scenes, such as background artifacts, occlusions, and illumination changes. These papers represent the most recent research in this field, including new methods considering still images, image sequences, depth data, stereo vision, 3D vision, audio, and IMUs, among others.
|
Pau Riba, Alicia Fornes, & Josep Llados. (2015). Towards the Alignment of Handwritten Music Scores. In Bart Lamiroy, & Rafael Dueire Lins (Eds.), 11th IAPR International Workshop on Graphics Recognition. LNCS. Springer International Publishing.
Abstract: It is very common to find different versions of the same music work in archives of Opera Theaters. These differences correspond to modifications and annotations from the musicians. From the musicologist's point of view, these variations are very interesting and deserve study. This paper explores the alignment of music scores as a tool for automatically detecting the passages that contain such differences. Given the difficulties in the recognition of handwritten music scores, our goal is to align the music scores and, at the same time, avoid the recognition of music elements as much as possible. After removing the staff lines, braces and ties, the bar lines are detected. Then, the bar units are described as a whole using the Blurred Shape Model. The alignment of the bar units is performed using Dynamic Time Warping. The analysis of the alignment path is used to detect the variations in the music scores. The method has been evaluated on a subset of the CVC-MUSCIMA dataset, showing encouraging results.
|
Patricia Suarez, Angel Sappa, & Boris X. Vintimilla. (2017). Learning to Colorize Infrared Images. In 15th International Conference on Practical Applications of Agents and Multi-Agent System.
Abstract: This paper focuses on near infrared (NIR) image colorization by using a Generative Adversarial Network (GAN) architecture model. The proposed architecture consists of two stages. Firstly, it learns to colorize the given input, resulting in an RGB image. Then, in the second stage, a discriminative model is used to estimate the probability that the generated image came from the training dataset, rather than the image automatically generated. The proposed model starts the learning process from scratch, because our set of images is very different from the dataset used in existing pre-trained models, so transfer learning strategies cannot be used. Infrared image colorization is an important problem when human perception needs to be considered, e.g., in remote sensing applications. Experimental results with a large set of real images are provided showing the validity of the proposed approach.
Keywords: CNN in multispectral imaging; Image colorization
|
Mireia Sole, Joan Blanco, Debora Gil, G. Fonseka, Richard Frodsham, Oliver Valero, et al. (2017). Is there a pattern of Chromosome territoriality along mice spermatogenesis? In 3rd Spanish MeioNet Meeting Abstract Book (pp. 55–56).
|
Marc Masana, Joost Van de Weijer, Luis Herranz, Andrew Bagdanov, & Jose Manuel Alvarez. (2017). Domain-adaptive deep network compression. In 17th IEEE International Conference on Computer Vision.
Abstract: Deep Neural Networks trained on large datasets can be easily transferred to new domains with far fewer labeled examples by a process called fine-tuning. This has the advantage that representations learned in the large source domain can be exploited on smaller target domains. However, networks designed to be optimal for the source task are often prohibitively large for the target task. In this work we address the compression of networks after domain transfer.
We focus on compression algorithms based on low-rank matrix decomposition. Existing methods base compression solely on learned network weights and ignore the statistics of network activations. We show that domain transfer leads to large shifts in network activations and that it is desirable to take this into account when compressing.
We demonstrate that considering activation statistics when compressing weights leads to a rank-constrained regression problem with a closed-form solution. Because our method takes into account the target domain, it can more optimally remove the redundancy in the weights. Experiments show that our Domain Adaptive Low Rank (DALR) method significantly outperforms existing low-rank compression techniques. With our approach, the fc6 layer of VGG19 can be compressed more than 4x more than using truncated SVD alone – with only a minor or no loss in accuracy. When applied to domain-transferred networks it allows for compression down to only 5-20% of the original number of parameters with only a minor drop in performance.
|
Albert Clapes, Ozan Bilici, Dariia Temirova, Egils Avots, Gholamreza Anbarjafari, & Sergio Escalera. (2018). From apparent to real age: gender, age, ethnic, makeup, and expression bias analysis in real age estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 2373–2382).
|
Marçal Rusiñol, Dimosthenis Karatzas, & Josep Llados. (2015). Automatic Verification of Properly Signed Multi-page Document Images. In Proceedings of the Eleventh International Symposium on Visual Computing (Vol. 9475, pp. 327–336). LNCS, 9475.
Abstract: In this paper we present an industrial application for the automatic screening of incoming multi-page documents in a banking workflow aimed at determining whether these documents are properly signed or not. The proposed method is divided into three main steps. First, individual pages are classified in order to identify the pages that should contain a signature. In a second step, we segment within those key pages the location where the signatures should appear. The last step checks whether the signatures are present or not. Our method is tested in a real large-scale environment and we report the results when checking two different types of real multi-page contracts, having in total more than 14,500 pages.
Keywords: Document Image; Manual Inspection; Signature Verification; Rejection Criterion; Document Flow
|