|
Albert Gordo, Jaume Gibert, Ernest Valveny, & Marçal Rusiñol. (2010). A Kernel-based Approach to Document Retrieval. In 9th IAPR International Workshop on Document Analysis Systems (377–384).
Abstract: In this paper we tackle the problem of document image retrieval by combining a similarity measure between documents and the probability that a given document belongs to a certain class. The membership probability for a specific class is computed using Support Vector Machines in conjunction with a similarity-measure-based kernel applied to structural document representations. In the presented experiments, we use different document representations, both visual and structural, and we apply them to a database of historical documents. We show how our method based on similarity kernels outperforms the usual distance-based retrieval.
|
|
|
Antonio Clavelli, Dimosthenis Karatzas, & Josep Llados. (2010). A framework for the assessment of text extraction algorithms on complex colour images. In 9th IAPR International Workshop on Document Analysis Systems (19–26).
Abstract: The availability of open, ground-truthed datasets and clear performance metrics is a crucial factor in the development of an application domain. The domain of colour text image analysis (real scenes, Web and spam images, scanned colour documents) has traditionally suffered from the lack of a comprehensive performance evaluation framework. Such a framework is extremely difficult to specify, and the corresponding pixel-level accurate information is tedious to define. In this paper we discuss the challenges and technical issues associated with developing such a framework. Then, we describe a complete framework for the evaluation of text extraction methods at multiple levels, provide a detailed ground-truth specification and present a case study on how this framework can be used in a real-life situation.
|
|
|
Farshad Nourbakhsh, Dimosthenis Karatzas, & Ernest Valveny. (2010). A polar-based logo representation based on topological and colour features. In 9th IAPR International Workshop on Document Analysis Systems (341–348).
Abstract: In this paper, we propose a novel rotation and scale invariant method for colour logo retrieval and classification, which involves performing a simple colour segmentation and subsequently describing each of the resultant colour components based on a set of topological and colour features. A polar representation is used to represent the logo and the subsequent logo matching is based on Cyclic Dynamic Time Warping (CDTW). We also show how combining information about the global distribution of the logo components and their local neighbourhood using the Delaunay triangulation improves the results. All experiments are performed on a dataset of 2500 instances of 100 colour logo images in different rotations and scales.
|
|
|
Jaume Garcia, Albert Andaluz, Debora Gil, & Francesc Carreras. (2010). Decoupled External Forces in a Predictor-Corrector Segmentation Scheme for LV Contours in Tagged MR Images. In 32nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society (pp. 4805–4808).
Abstract: Computation of functional regional scores requires proper identification of LV contours. On one hand, manual segmentation is robust, but it is time-consuming and requires high expertise. On the other hand, the tag pattern in TMR sequences is a problem for automatic segmentation of LV boundaries. We propose a segmentation method based on a predictor-corrector (Active Contours – Shape Models) scheme. Special emphasis is placed on the definition of the AC external forces. First, we introduce a semantic description of the LV that discriminates myocardial tissue by using texture and motion descriptors. Second, in order to ensure convergence regardless of the initial contour, the external energy is decoupled according to the orientation of the edges in the image potential. We have validated the model in terms of error in segmented contours and accuracy of regional clinical scores.
|
|
|
Santiago Segui, Laura Igual, & Jordi Vitria. (2010). Weighted Bagging for Graph based One-Class Classifiers. In 9th International Workshop on Multiple Classifier Systems (Vol. 5997, pp. 1–10). LNCS. Springer Berlin Heidelberg.
Abstract: Most conventional learning algorithms require both positive and negative training data for achieving accurate classification results. However, the problem of learning classifiers from only positive data arises in many applications where negative data are too costly, difficult to obtain, or not available at all. Minimum Spanning Tree Class Descriptor (MSTCD) was presented as a method that achieves better accuracies than other one-class classifiers in high dimensional data. However, the presence of outliers in the target class severely harms the performance of this classifier. In this paper we propose two bagging strategies for MSTCD that reduce the influence of outliers in training data. We show the improved performance on both real and artificially contaminated data.
|
|
|
David Geronimo, Angel Sappa, Daniel Ponsa, & Antonio Lopez. (2010). 2D-3D based on-board pedestrian detection system. CVIU - Computer Vision and Image Understanding, 114(5), 583–595.
Abstract: During the next decade, on-board pedestrian detection systems will play a key role in the challenge of increasing traffic safety. The main target of these systems, to detect pedestrians in urban scenarios, implies overcoming difficulties like processing outdoor scenes from a mobile platform and searching for aspect-changing objects in cluttered environments. These difficulties require such systems to combine state-of-the-art Computer Vision techniques. In this paper we present a three-module system based on both 2D and 3D cues. The first module uses 3D information to estimate the road plane parameters and thus select a coherent set of regions of interest (ROIs) to be further analyzed. The second module uses Real AdaBoost and a combined set of Haar wavelets and edge orientation histograms to classify the incoming ROIs as pedestrian or non-pedestrian. The final module loops again with the 3D cue in order to verify the classified ROIs and with the 2D cue in order to refine the final results. According to the results, the integration of the proposed techniques gives rise to a promising system.
Keywords: Pedestrian detection; Advanced Driver Assistance Systems; Horizon line; Haar wavelets; Edge orientation histograms
|
|
|
Marco Pedersoli, Jordi Gonzalez, Andrew Bagdanov, & Juan J. Villanueva. (2010). Recursive Coarse-to-Fine Localization for fast Object Recognition. In 11th European Conference on Computer Vision (Vol. 6313, 280–293). LNCS. Springer Berlin Heidelberg.
Abstract: Cascading techniques are commonly used to speed up the scan of an image for object detection. However, cascades of detectors are slow to train due to the high number of detectors and corresponding thresholds to learn. Furthermore, they do not use any prior knowledge about the scene structure to decide where to focus the search. To handle these problems, we propose a new way to scan an image, where we couple a recursive coarse-to-fine refinement with spatial constraints on the object location. To do so, we split an image into a set of uniformly distributed neighborhood regions, and for each of these we apply a local greedy search over feature resolutions. The neighborhood is defined as a scanning region that only one object can occupy. Therefore the best hypothesis is obtained as the location with maximum score and no thresholds are needed. We present an implementation of our method using a pyramid of HOG features and we evaluate it on two standard databases, the VOC2007 and INRIA datasets. Results show that the Recursive Coarse-to-Fine Localization (RCFL) achieves a 12x speed-up compared to standard sliding windows. Compared with a cascade of multiple resolutions approach, our method has slightly better performance in speed and Average-Precision. Furthermore, in contrast to the cascading approach, the speed-up is independent of image conditions, the number of detected objects and clutter.
|
|
|
Carles Fernandez, Jordi Gonzalez, & Xavier Roca. (2010). Automatic Learning of Background Semantics in Generic Surveilled Scenes. In 11th European Conference on Computer Vision (Vol. 6313, 678–692). LNCS. Springer Berlin Heidelberg.
Abstract: Advanced surveillance systems for behavior recognition in outdoor traffic scenes depend strongly on the particular configuration of the scenario. Scene-independent trajectory analysis techniques statistically infer semantics in locations where motion occurs, and such inferences are typically limited to abnormality. Thus, it is interesting to design contributions that automatically categorize more specific semantic regions. State-of-the-art approaches for unsupervised scene labeling exploit trajectory data to segment areas like sources, sinks, or waiting zones. Our method, in addition, incorporates scene-independent knowledge to assign more meaningful labels like crosswalks, sidewalks, or parking spaces. First, a spatiotemporal scene model is obtained from trajectory analysis. Subsequently, a so-called GI-MRF inference process reinforces spatial coherence, and incorporates taxonomy-guided smoothness constraints. Our method achieves automatic and effective labeling of conceptual regions in urban scenarios, and is robust to tracking errors. Experimental validation on 5 surveillance databases has been conducted to assess the generality and accuracy of the segmentations. The resulting scene models are used for model-based behavior analysis.
|
|
|
N. Serrano, L. Tarazon, D. Perez, Oriol Ramos Terrades, & S. Juan. (2010). The GIDOC Prototype. In 10th International Workshop on Pattern Recognition in Information Systems (pp. 82–89).
Abstract: Transcription of handwritten text in (old) documents is an important, time-consuming task for digital libraries. It might be carried out by first processing all document images off-line, and then manually supervising system transcriptions to edit incorrect parts. However, current techniques for automatic page layout analysis, text line detection and handwriting recognition are still far from perfect, and thus post-editing system output is not clearly better than simply ignoring it.
A more effective approach to transcribe old text documents is to follow an interactive-predictive paradigm in which the system is guided by the user and the user is assisted by the system to complete the transcription task as efficiently as possible. Following this approach, a system prototype called GIDOC (Gimp-based Interactive transcription of old text DOCuments) has been developed to provide user-friendly, integrated support for interactive-predictive layout analysis, line detection and handwriting transcription.
GIDOC is designed to work with (large) collections of homogeneous documents, that is, of similar structure and writing styles. They are annotated sequentially, by (partially) supervising hypotheses drawn from statistical models that are constantly updated with an increasing number of available annotated documents. And this is done at different annotation levels. For instance, at the level of page layout analysis, GIDOC uses a novel text block detection method in which conventional, memoryless techniques are improved with a “history” model of text block positions. Similarly, at the level of text line image transcription, GIDOC includes a handwriting recognizer which is steadily improved with a growing number of (partially) supervised transcriptions.
|
|
|
Francesco Ciompi, Oriol Pujol, E Fernandez-Nofrerias, J. Mauri, & Petia Radeva. (2010). Conditional Random Fields for image segmentation in Intravascular Ultrasound. In Medical Image Computing in Catalunya: Graduate Student Workshop (13–14).
Abstract: We present a Conditional Random Fields based approach for segmenting Intravascular Ultrasound (IVUS) images. The presented method uses a contextual discriminative graphical model to deal with the presence of distortions and artifacts in IVUS images, which make the segmentation of regions of interest a difficult task. An accurate lumen segmentation on IVUS longitudinal images is achieved.
|
|
|
Pierluigi Casale, Oriol Pujol, & Petia Radeva. (2010). Classifying Agitation in Sedated ICU Patients. In Medical Image Computing in Catalunya: Graduate Student Workshop (19–20).
Abstract: Agitation is a serious problem in sedated intensive care unit (ICU) patients. In this work, standard machine learning techniques working on wearable accelerometer data have been used to classify agitation levels, achieving very good classification performance.
|
|
|
Antonio Hernandez, Carlo Gatta, Petia Radeva, Laura Igual, R. Letaz, & Sergio Escalera. (2010). Automatic Vessel Segmentation For Angiography and CT Registration. In Medical Image Computing in Catalunya: Graduate Student Workshop (1–2).
|
|
|
Miguel Reyes, Jordi Vitria, Petia Radeva, & Sergio Escalera. (2010). Real-time Activity Monitoring of Inpatients. In Medical Image Computing in Catalunya: Graduate Student Workshop (35–36).
Abstract: In this paper, we present the development of an application capable of monitoring a set of patient vital signs in real time. The application has been designed to support the medical staff of a hospital. Preliminary results show the suitability of the system for preventing injuries caused by patient agitation.
|
|
|
Michal Drozdzal, Laura Igual, Jordi Vitria, Petia Radeva, Carolina Malagelada, & Fernando Azpiroz. (2010). SIFT flow-based Sequences Alignment. In Medical Image Computing in Catalunya: Graduate Student Workshop (7–8).
|
|
|
Santiago Segui, Michal Drozdzal, Petia Radeva, & Jordi Vitria. (2010). Severe Motility Diagnosis using WCE. In Medical Image Computing in Catalunya: Graduate Student Workshop (45–46).
|
|