|
Albert Gordo, Marçal Rusiñol, Dimosthenis Karatzas, & Andrew Bagdanov. (2013). Document Classification and Page Stream Segmentation for Digital Mailroom Applications. In 12th International Conference on Document Analysis and Recognition (pp. 621–625).
Abstract: In this paper we present a method for the segmentation of continuous page streams into multipage documents and the simultaneous classification of the resulting documents. We first present an approach to combine the multiple pages of a document into a single feature vector that represents the whole document. Despite its simplicity and low computational cost, the proposed representation yields results comparable to more complex methods in multipage document classification tasks. We then exploit this representation in the context of page stream segmentation. The most plausible segmentation of a page stream into a sequence of multipage documents is obtained by optimizing a statistical model that represents the probability of each segmented multipage document belonging to a particular class. Experimental results are reported on a large sample of real administrative multipage documents.
|
|
|
Adriana Romero, & Carlo Gatta. (2013). Do We Really Need All These Neurons? In 6th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 7887, pp. 460–467). LNCS. Springer Berlin Heidelberg.
Abstract: Restricted Boltzmann Machines (RBMs) are generative neural networks that have received much attention recently. In particular, choosing the appropriate number of hidden units is important as it might hinder their representative power. According to the literature, RBM require numerous hidden units to approximate any distribution properly. In this paper, we present an experiment to determine whether such amount of hidden units is required in a classification context. We then propose an incremental algorithm that trains RBM reusing the previously trained parameters using a trade-off measure to determine the appropriate number of hidden units. Results on the MNIST and OCR letters databases show that using a number of hidden units, which is one order of magnitude smaller than the literature estimate, suffices to achieve similar performance. Moreover, the proposed algorithm allows to estimate the required number of hidden units without the need of training many RBM from scratch.
Keywords: Retricted Boltzmann Machine; hidden units; unsupervised learning; classification
|
|
|
Ariel Amato, Angel Sappa, Alicia Fornes, Felipe Lumbreras, & Josep Llados. (2013). Divide and Conquer: Atomizing and Parallelizing A Task in A Mobile Crowdsourcing Platform. In 2nd International ACM Workshop on Crowdsourcing for Multimedia (pp. 21–22).
Abstract: In this paper we present some conclusions about the advantages of having an efficient task formulation when a crowdsourcing platform is used. In particular we show how the task atomization and distribution can help to obtain results in an efficient way. Our proposal is based on a recursive splitting of the original task into a set of smaller and simpler tasks. As a result both more accurate and faster solutions are obtained. Our evaluation is performed on a set of ancient documents that need to be digitized.
|
|
|
Rahat Khan, Joost Van de Weijer, Fahad Shahbaz Khan, Damien Muselet, christophe Ducottet, & Cecile Barat. (2013). Discriminative Color Descriptors. In IEEE Conference on Computer Vision and Pattern Recognition (pp. 2866–2873).
Abstract: Color description is a challenging task because of large variations in RGB values which occur due to scene accidental events, such as shadows, shading, specularities, illuminant color changes, and changes in viewing geometry. Traditionally, this challenge has been addressed by capturing the variations in physics-based models, and deriving invariants for the undesired variations. The drawback of this approach is that sets of distinguishable colors in the original color space are mapped to the same value in the photometric invariant space. This results in a drop of discriminative power of the color description. In this paper we take an information theoretic approach to color description. We cluster color values together based on their discriminative power in a classification problem. The clustering has the explicit objective to minimize the drop of mutual information of the final representation. We show that such a color description automatically learns a certain degree of photometric invariance. We also show that a universal color representation, which is based on other data sets than the one at hand, can obtain competing performance. Experiments show that the proposed descriptor outperforms existing photometric invariants. Furthermore, we show that combined with shape description these color descriptors obtain excellent results on four challenging datasets, namely, PASCAL VOC 2007, Flowers-102, Stanford dogs-120 and Birds-200.
|
|
|
Marina Alberti. (2013). Detection and Alignment of Vascular Structures in Intravascular Ultrasound using Pattern Recognition Techniques (Simone Balocco, & Petia Radeva, Eds.). Ph.D. thesis, Ediciones Graficas Rey, .
Abstract: In this thesis, several methods for the automatic analysis of Intravascular Ultrasound
(IVUS) sequences are presented, aimed at assisting physicians in the diagnosis, the assessment of the intervention and the monitoring of the patients with coronary disease.
The basis for the developed frameworks are machine learning, pattern recognition and
image processing techniques.
First, a novel approach for the automatic detection of vascular bifurcations in
IVUS is presented. The task is addressed as a binary classication problem (identifying bifurcation and non-bifurcation angular sectors in the sequence images). The
multiscale stacked sequential learning algorithm is applied, to take into account the
spatial and temporal context in IVUS sequences, and the results are rened using
a-priori information about branching dimensions and geometry. The achieved performance is comparable to intra- and inter-observer variability.
Then, we propose a novel method for the automatic non-rigid alignment of IVUS
sequences of the same patient, acquired at dierent moments (before and after percutaneous coronary intervention, or at baseline and follow-up examinations). The
method is based on the description of the morphological content of the vessel, obtained by extracting temporal morphological proles from the IVUS acquisitions, by
means of methods for segmentation, characterization and detection in IVUS. A technique for non-rigid sequence alignment – the Dynamic Time Warping algorithm -
is applied to the proles and adapted to the specic clinical problem. Two dierent robust strategies are proposed to address the partial overlapping between frames
of corresponding sequences, and a regularization term is introduced to compensate
for possible errors in the prole extraction. The benets of the proposed strategy
are demonstrated by extensive validation on synthetic and in-vivo data. The results
show the interest of the proposed non-linear alignment and the clinical value of the
method.
Finally, a novel automatic approach for the extraction of the luminal border in
IVUS images is presented. The method applies the multiscale stacked sequential
learning algorithm and extends it to 2-D+T, in a rst classication phase (the identi-
cation of lumen and non-lumen regions of the images), while an active contour model
is used in a second phase, to identify the lumen contour. The method is extended
to the longitudinal dimension of the sequences and it is validated on a challenging
data-set.
|
|
|
David Roche, Debora Gil, & Jesus Giraldo. (2013). Detecting loss of diversity for an efficient termination of EAs. In 15th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (pp. 561–566).
Abstract: Termination of Evolutionary Algorithms (EA) at its steady state so that useless iterations are not performed is a main point for its efficient application to black-box problems. Many EA algorithms evolve while there is still diversity in their population and, thus, they could be terminated by analyzing the behavior some measures of EA population diversity. This paper presents a numeric approximation to steady states that can be used to detect the moment EA population has lost its diversity for EA termination. Our condition has been applied to 3 EA paradigms based on diversity and a selection of functions
covering the properties most relevant for EA convergence.
Experiments show that our condition works regardless of the search space dimension and function landscape.
Keywords: EA termination; EA population diversity; EA steady state
|
|
|
Jiaolong Xu, Sebastian Ramos, David Vazquez, & Antonio Lopez. (2013). DA-DPM Pedestrian Detection. In ICCV Workshop on Reconstruction meets Recognition.
Keywords: Domain Adaptation; Pedestrian Detection
|
|
|
Joan Serrat, Felipe Lumbreras, & Antonio Lopez. (2013). Cost estimation of custom hoses from STL files and CAD drawings. COMPUTIND - Computers in Industry, 64(3), 299–309.
Abstract: We present a method for the cost estimation of custom hoses from CAD models. They can come in two formats, which are easy to generate: a STL file or the image of a CAD drawing showing several orthogonal projections. The challenges in either cases are, first, to obtain from them a high level 3D description of the shape, and second, to learn a regression function for the prediction of the manufacturing time, based on geometric features of the reconstructed shape. The chosen description is the 3D line along the medial axis of the tube and the diameter of the circular sections along it. In order to extract it from STL files, we have adapted RANSAC, a robust parametric fitting algorithm. As for CAD drawing images, we propose a new technique for 3D reconstruction from data entered on any number of orthogonal projections. The regression function is a Gaussian process, which does not constrain the function to adopt any specific form and is governed by just two parameters. We assess the accuracy of the manufacturing time estimation by k-fold cross validation on 171 STL file models for which the time is provided by an expert. The results show the feasibility of the method, whereby the relative error for 80% of the testing samples is below 15%.
Keywords: On-line quotation; STL format; Regression; Gaussian process
|
|
|
David Fernandez, Simone Marinai, Josep Llados, & Alicia Fornes. (2013). Contextual Word Spotting in Historical Manuscripts using Markov Logic Networks. In 2nd International Workshop on Historical Document Imaging and Processing (pp. 36–43).
Abstract: Natural languages can often be modelled by suitable grammars whose knowledge can improve the word spotting results. The implicit contextual information is even more useful when dealing with information that is intrinsically described as one collection of records. In this paper, we present one approach to word spotting which uses the contextual information of records to improve the results. The method relies on Markov Logic Networks to probabilistically model the relational organization of handwritten records. The performance has been evaluated on the Barcelona Marriages Dataset that contains structured handwritten records that summarize marriage information.
|
|
|
Mikhail Mozerov. (2013). Constrained Optical Flow Estimation as a Matching Problem. TIP - IEEE Transactions on Image Processing, 22(5), 2044–2055.
Abstract: In general, discretization in the motion vector domain yields an intractable number of labels. In this paper we propose an approach that can reduce general optical flow to the constrained matching problem by pre-estimating a 2D disparity labeling map of the desired discrete motion vector function. One of the goals of the proposed paper is estimating coarse distribution of motion vectors and then utilizing this distribution as global constraints for discrete optical flow estimation. This pre-estimation is done with a simple frame-to-frame correlation technique also known as the digital symmetric-phase-only-filter (SPOF). We discover a strong correlation between the output of the SPOF and the motion vector distribution of the related optical flow. The two step matching paradigm for optical flow estimation is applied: pixel accuracy (integer flow), and subpixel accuracy estimation. The matching problem is solved by global optimization. Experiments on the Middlebury optical flow datasets confirm our intuitive assumptions about strong correlation between motion vector distribution of optical flow and maximal peaks of SPOF outputs. The overall performance of the proposed method is promising and achieves state-of-the-art results on the Middlebury benchmark.
|
|
|
Sezer Karaoglu, Jan van Gemert, & Theo Gevers. (2013). Con-text: text detection using background connectivity for fine-grained object classification. In 21ST ACM International Conference on Multimedia (pp. 757–760).
|
|
|
Jorge Bernal, & David Vazquez (Eds.). (2013). Computer vision Trends and Challenges.
Abstract: This book contains the papers presented at the Eighth CVC Workshop on Computer Vision Trends and Challenges (CVCR&D'2013). The workshop was held at the Computer Vision Center (Universitat Autònoma de Barcelona), the October 25th, 2013. The CVC workshops provide an excellent opportunity for young researchers and project engineers to share new ideas and knowledge about the progress of their work, and also, to discuss about challenges and future perspectives. In addition, the workshop is the welcome event for new people that recently have joined the institute.
The program of CVCR&D is organized in a single-track single-day workshop. It comprises several sessions dedicated to specific topics. For each session, a doctor working on the topic introduces the general research lines. The PhD students expose their specific research. A poster session will be held for open questions. Session topics cover the current research lines and development projects of the CVC: Medical Imaging, Medical Imaging, Color & Texture Analysis, Object Recognition, Image Sequence Evaluation, Advanced Driver Assistance Systems, Machine Vision, Document Analysis, Pattern Recognition and Applications. We want to thank all paper authors and Program Committee members. Their contribution shows that the CVC has a dynamic, active, and promising scientific community.
We hope you all enjoy this Eighth workshop and we are looking forward to meeting you and new people next year in the Ninth CVCR&D.
Keywords: CVCRD; Computer Vision
|
|
|
Ferran Poveda. (2013). Computer Graphics and Vision Techniques for the Study of the Muscular Fiber Architecture of the Myocardium (Debora Gil, & Enric Marti, Eds.). Ph.D. thesis, , .
|
|
|
Lluis Pere de las Heras, Ernest Valveny, & Gemma Sanchez. (2013). Combining structural and statistical strategies for unsupervised wall detection in floor plans. In 10th IAPR International Workshop on Graphics Recognition.
Abstract: This paper presents an evolution of the first unsupervised wall segmentation method in floor plans, that was presented by the authors in [1]. This first approach, contrarily to the existing ones, is able to segment walls independently to their notation and without the need of any pre-annotated data
to learn their visual appearance. Despite the good performance of the first approach, some specific cases, such as curved shaped walls, were not correctly segmented since they do not agree the strict structural assumptions that guide the whole methodology in order to be able to learn, in an unsupervised way, the structure of a wall. In this paper, we refine this strategy by dividing the
process in two steps. In a first step, potential wall segments are extracted unsupervisedly using a modification of [1], by restricting even more the areas considered as walls in a first moment. In a second step, these segments are used to learn and spot lost instances based on a modified version of [2], also presented by the authors. The presented combined method have been tested on
4 datasets with different notations and compared with the stateof-the-art applyed on the same datasets. The results show its adaptability to different wall notations and shapes, significantly outperforming the original approach.
|
|
|
Fahad Shahbaz Khan, Muhammad Anwer Rao, Joost Van de Weijer, Andrew Bagdanov, Antonio Lopez, & Michael Felsberg. (2013). Coloring Action Recognition in Still Images. IJCV - International Journal of Computer Vision, 105(3), 205–221.
Abstract: In this article we investigate the problem of human action recognition in static images. By action recognition we intend a class of problems which includes both action classification and action detection (i.e. simultaneous localization and classification). Bag-of-words image representations yield promising results for action classification, and deformable part models perform very well object detection. The representations for action recognition typically use only shape cues and ignore color information. Inspired by the recent success of color in image classification and object detection, we investigate the potential of color for action classification and detection in static images. We perform a comprehensive evaluation of color descriptors and fusion approaches for action recognition. Experiments were conducted on the three datasets most used for benchmarking action recognition in still images: Willow, PASCAL VOC 2010 and Stanford-40. Our experiments demonstrate that incorporating color information considerably improves recognition performance, and that a descriptor based on color names outperforms pure color descriptors. Our experiments demonstrate that late fusion of color and shape information outperforms other approaches on action recognition. Finally, we show that the different color–shape fusion approaches result in complementary information and combining them yields state-of-the-art performance for action classification.
|
|