Publicacions CVC -- Query Results

Alejandro Gonzalez Alzate. (2015). Multi-modal Pedestrian Detection (David Vazquez & Antonio Lopez, Eds.). Ph.D. thesis, Ediciones Graficas Rey. Abstract: Pedestrian detection continues to be an extremely challenging problem in real scenarios, where illumination changes, noisy images, unexpected objects, uncontrolled conditions and varying object appearance occur constantly. All these problems force the development of more robust detectors for relevant applications like vision-based autonomous vehicles, intelligent surveillance, and pedestrian tracking for behavior analysis. Most reliable vision-based pedestrian detectors base their decision on features extracted using a single sensor capturing complementary features, e.g., appearance and texture. These features are usually extracted from the current frame, ignoring temporal information or including it only in a post-processing step, e.g., tracking or temporal coherence. Taking these issues into account, we formulate the following question: can we generate more robust pedestrian detectors by introducing new information sources in the feature extraction step? In order to answer this question we develop different approaches for introducing new information sources to well-known pedestrian detectors. We start with the inclusion of temporal information following the Stacked Sequential Learning (SSL) paradigm, which suggests that information extracted from the neighboring samples in a sequence can improve the accuracy of a base classifier. We then focus on the inclusion of complementary information from different sensors like 3D point clouds (LIDAR – depth), far infrared images (FIR), or disparity maps (stereo pair cameras). To this end we develop a multi-modal framework in which information from different sensors is used for increasing detection accuracy (by increasing information redundancy). Finally, we propose a multi-view pedestrian detector. This multi-view approach splits the detection problem into n sub-problems. Each sub-problem detects objects in a given specific view, thereby reducing the variability faced when a single detector is used for the whole problem. We show that these approaches obtain results competitive with other state-of-the-art methods but, instead of designing new features, we reuse existing ones and boost their performance. http://refbase.cvc.uab.es/show.php?record=2706
Alejandro Gonzalez Alzate, Sebastian Ramos, David Vazquez, Antonio Lopez, & Jaume Amores. (2015). Spatiotemporal Stacked Sequential Learning for Pedestrian Detection. In Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference, ibPRIA 2015 (pp. 3–12). Abstract: Pedestrian classifiers decide which image windows contain a pedestrian. In practice, such classifiers provide a relatively high response at neighboring windows overlapping a pedestrian, while the responses around potential false positives are expected to be lower. An analogous reasoning applies to image sequences. If there is a pedestrian located within a frame, the same pedestrian is expected to appear close to the same location in neighboring frames. Therefore, such a location is likely to receive high classification scores over several frames, while false positives are expected to be more spurious. In this paper we propose to exploit such correlations for improving the accuracy of base pedestrian classifiers. In particular, we propose to use two-stage classifiers which rely not only on the image descriptors required by the base classifiers but also on the response of such base classifiers in a given spatiotemporal neighborhood. More specifically, we train pedestrian classifiers using a stacked sequential learning (SSL) paradigm. We use a new pedestrian dataset we have acquired from a car to evaluate our proposal at different frame rates. We also test on a well-known dataset: Caltech. The obtained results show that our SSL proposal boosts detection accuracy significantly with a minimal impact on the computational cost. Interestingly, SSL improves accuracy most in the most dangerous situations, i.e., when a pedestrian is close to the camera. Keywords: SSL; Pedestrian Detection http://refbase.cvc.uab.es/show.php?record=2454
David Vazquez, David Geronimo, & Antonio Lopez. (2009). The effect of the distance in pedestrian detection (Vol. 149). Master's thesis. Abstract: Pedestrian accidents are one of the leading preventable causes of death. In order to reduce the number of accidents, pedestrian protection systems have been introduced in the last decade. These are a special type of advanced driver assistance system in which an on-board camera explores the road ahead for possible collisions with pedestrians in order to warn the driver or perform braking actions. As a result of the variability of appearance, pose and size, pedestrian detection is a very challenging task, and many techniques, models and features have been proposed to solve the problem. As the appearance of pedestrians varies significantly as a function of distance, a system based on multiple classifiers specialized on different depths is likely to improve the overall performance with respect to a typical system based on a general detector. Accordingly, the main aim of this work is to explore the effect of the distance in pedestrian detection. We have evaluated three pedestrian detectors (HOG, HAAR and EOH) on two different databases (INRIA and Daimler09) for two different sizes (small and big). Through an extensive set of experiments we answer questions such as which datasets and evaluation methods are the most adequate, which is the best method for each pedestrian size and why, and how the optimum parameters of each method vary with respect to the distance. Keywords: Pedestrian Detection http://refbase.cvc.uab.es/show.php?record=1669
Guillermo Torres, Debora Gil, Antoni Rosell, S. Mena, & Carles Sanchez. (2023). Virtual Radiomics Biopsy for the Histological Diagnosis of Pulmonary Nodules. In Computer Assisted Radiology and Surgery, 37th International Congress and Exhibition. Abstract: Poster http://refbase.cvc.uab.es/show.php?record=3950
Sonia Baeza, Debora Gil, Carles Sanchez, Guillermo Torres, Ignasi Garcia Olive, Ignasi Guasch, et al. (2023). Biopsia virtual radiomica para el diagnóstico histológico de nódulos pulmonares – Resultados intermedios del proyecto Radiolung. In SEPAR. Abstract: Poster http://refbase.cvc.uab.es/show.php?record=3951
Debora Gil, Guillermo Torres, & Carles Sanchez. (2023). Transforming radiomic features into radiological words. In IEEE International Symposium on Biomedical Imaging. Abstract: Poster http://refbase.cvc.uab.es/show.php?record=3952
Guillermo Torres, Debora Gil, Antonio Rosell, Sonia Baeza, & Carles Sanchez. (2023). A radiomic biopsy for virtual histology of pulmonary nodules. In IEEE International Symposium on Biomedical Imaging. Abstract: Poster http://refbase.cvc.uab.es/show.php?record=3954
Jaume Gibert. (2012). Vector Space Embedding of Graphs via Statistics of Labelling Information (Ernest Valveny, Ed.). Ph.D. thesis, Ediciones Graficas Rey. Abstract: Pattern recognition is the task of distinguishing objects among different classes. When such a task is to be solved automatically, a crucial step is how to formally represent such patterns to the computer. Based on the different representational formalisms, we may distinguish between statistical and structural pattern recognition. The former describes objects as a set of measurements arranged in the form of what is called a feature vector. The latter assumes that relations between parts of the underlying objects need to be explicitly represented and thus it uses relational structures such as graphs for encoding their inherent information. Vector spaces are a very flexible mathematical structure that has enabled several efficient ways of analysing patterns in the form of feature vectors. Nevertheless, such a representation cannot explicitly cope with binary relations between parts of the objects, and it is restricted to measuring exactly the same number of features for each pattern under study regardless of its complexity. Graph-based representations present the opposite situation. They can easily adapt to the inherent complexity of the patterns but introduce a problem of high computational complexity, hindering the design of efficient tools to process and analyse patterns. Solving this paradox is the main goal of this thesis. The ideal situation for solving pattern recognition problems would be to represent the patterns using relational structures such as graphs, and to be able to use the rich repository of data processing tools from the statistical pattern recognition domain. An elegant solution to this problem is to transform the graph domain into a vector domain where any processing algorithm can be applied. In other words, by mapping each graph to a point in a vector space we automatically get access to the rich set of algorithms from the statistical domain to be applied in the graph domain. Such a methodology is called graph embedding. In this thesis we propose to associate feature vectors with graphs in a simple and very efficient way by focusing on the labelling information that graphs store. In particular, we count frequencies of node labels and of edges between labels. Despite their locality, these features are able to robustly represent structurally global properties of graphs when considered together in the form of a vector. We initially deal with the case of discrete attributed graphs, where features are easy to compute. The continuous case is tackled as a natural generalization of the discrete one, where rather than counting node and edge labelling instances, we count statistics of some representatives of them. We find that the proposed vectorial representations of graphs suffer from high dimensionality and correlation among components, and we address these problems with feature selection algorithms. We also explore how the diversity of different embedding representations can be exploited in order to boost the performance of base classifiers in a multiple classifier systems framework. An extensive experimental evaluation finally shows that the methodology we propose can be computed efficiently and competes with other graph matching and embedding methodologies. http://refbase.cvc.uab.es/show.php?record=2204
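The label-counting idea in the abstract above can be illustrated with a small sketch. It is not the thesis's implementation: it assumes an undirected graph with discrete string labels, and the names `embed_graph`, `node_labels`, `edges` and `alphabet` are hypothetical, chosen only for this example.

```python
from collections import Counter
from itertools import combinations_with_replacement


def embed_graph(node_labels, edges, alphabet):
    """Map a discretely labelled, undirected graph to a vector of counts.

    node_labels: dict mapping node id -> label (assumed to be sortable, e.g. str)
    edges:       iterable of (u, v) node-id pairs
    alphabet:    collection of all labels that may occur
    """
    labels = sorted(alphabet)

    # First block of features: how often each node label occurs.
    node_counts = Counter(node_labels.values())

    # Second block: how often an edge joins each unordered pair of labels.
    edge_counts = Counter(
        tuple(sorted((node_labels[u], node_labels[v]))) for u, v in edges
    )
    label_pairs = list(combinations_with_replacement(labels, 2))

    return [node_counts[a] for a in labels] + [edge_counts[p] for p in label_pairs]


# Toy graph: a chain C - C - O.
vector = embed_graph(
    node_labels={0: "C", 1: "C", 2: "O"},
    edges=[(0, 1), (1, 2)],
    alphabet=["C", "O"],
)
print(vector)  # [2, 1, 1, 1, 0] -> counts for C, O, C-C, C-O, O-O
```

Every graph over the same label alphabet maps to a vector of the same length, which is what allows standard vector-space classifiers to be applied afterwards.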
V. Valev, & Petia Radeva. (1992). Determining Structural Description by Boolean Formulas. In H. Bunke (Ed.), Advances in Structural and Syntactic Pattern Recognition (Vol. 5, 131–140). Machine Perception and Artificial Intelligence. World Scientific. Abstract: Pattern recognition is an active area of research with many applications, some of which have reached commercial maturity. Structural and syntactic methods are very powerful. They are based on symbolic data structures together with matching, parsing, and reasoning procedures that are able to infer interpretations of complex input patterns. This book gives an overview of the latest developments and achievements in the field. http://refbase.cvc.uab.es/show.php?record=254
Antonio Hernandez, Miguel Angel Bautista, Xavier Perez Sala, Victor Ponce, Sergio Escalera, Xavier Baro, et al. (2014). Probability-based Dynamic Time Warping and Bag-of-Visual-and-Depth-Words for Human Gesture Recognition in RGB-D. PRL - Pattern Recognition Letters, 50(1), 112–121. Abstract: We present a methodology to address the problem of human gesture segmentation and recognition in video and depth image sequences. A Bag-of-Visual-and-Depth-Words (BoVDW) model is introduced as an extension of the Bag-of-Visual-Words (BoVW) model. State-of-the-art RGB and depth features, including a newly proposed depth descriptor, are analysed and combined in a late fusion form. The method is integrated in a Human Gesture Recognition pipeline, together with a novel probability-based Dynamic Time Warping (PDTW) algorithm which is used to perform prior segmentation of idle gestures. The proposed DTW variant uses samples of the same gesture category to build a Gaussian Mixture Model driven probabilistic model of that gesture class. Results of the whole Human Gesture Recognition pipeline in a public data set show better performance in comparison to both standard BoVW model and DTW approach. Keywords: RGB-D; Bag-of-Words; Dynamic Time Warping; Human Gesture Recognition http://refbase.cvc.uab.es/show.php?record=2353
Angel Sappa (Ed.). (2022). ICT Applications for Smart Cities (Vol. 224). ISRL. Springer. Abstract: Part of the book series: Intelligent Systems Reference Library (ISRL). This book is the result of four years of work in the framework of the Ibero-American Research Network TICs4CI funded by the CYTED program. In the following decades, 85% of the world's population is expected to live in cities; hence, urban centers should be prepared to provide smart solutions for problems ranging from video surveillance and intelligent mobility to solid waste recycling processes, just to mention a few. More specifically, the book describes underlying technologies and practical implementations of several successful case studies of ICTs developed in the following smart city areas: urban environment monitoring; intelligent mobility; waste recycling processes; video surveillance; computer-aided diagnosis in healthcare systems; and computer vision-based approaches for efficiency in production processes. The book is intended for researchers and engineers in the field of ICTs for smart cities, as well as for anyone who wants to know about state-of-the-art approaches and challenges in this field. Keywords: Computational Intelligence; Intelligent Systems; Smart Cities; ICT Applications; Machine Learning; Pattern Recognition; Computer Vision; Image Processing http://refbase.cvc.uab.es/show.php?record=3812
Esteve Cervantes, Long Long Yu, Andrew Bagdanov, Marc Masana, & Joost Van de Weijer. (2016). Hierarchical Part Detection with Deep Neural Networks. In 23rd IEEE International Conference on Image Processing. Abstract: Part detection is an important aspect of object recognition. Most approaches apply object proposals to generate hundreds of possible part bounding box candidates which are then evaluated by part classifiers. Recently several methods have investigated directly regressing to a limited set of bounding boxes from a deep neural network representation. However, for object parts such methods may be infeasible due to their relatively small size with respect to the image. We propose a hierarchical method for object and part detection. In a single network we first detect the object and then regress to part location proposals based only on the feature representation inside the object. Experiments show that our hierarchical approach outperforms a network which directly regresses the part locations. We also show that our approach obtains part detection accuracy comparable to or better than the state of the art on the CUB-200 bird and Fashionista clothing item datasets with only a fraction of the number of part proposals. Keywords: Object Recognition; Part Detection; Convolutional Neural Networks http://refbase.cvc.uab.es/show.php?record=2762
R. Bertrand, P. Gomez-Krämer, Oriol Ramos Terrades, P. Franco, & Jean-Marc Ogier. (2013). A System Based On Intrinsic Features for Fraudulent Document Detection. In 12th International Conference on Document Analysis and Recognition (pp. 106–110). Abstract: Paper documents still account for a large share of the information media used nowadays and may contain critical data. Even though official documents are secured with techniques such as printed patterns or artwork, paper documents suffer from a lack of security. Moreover, the high availability of cheap scanning and printing hardware allows non-experts to easily create fake documents. As the use of a watermarking system added during the document production step is hardly possible, solutions have to be proposed to distinguish a genuine document from a forged one. In this paper, we present an automatic forgery detection method based on the document's intrinsic features at character level. This method is based on the one hand on outlier character detection in a discriminant feature space and on the other hand on the detection of strictly similar characters. Therefore, a feature set is computed for all characters. Then, based on a distance between characters of the same class. Keywords: paper document; document analysis; fraudulent document; forgery; fake http://refbase.cvc.uab.es/show.php?record=2332
Mohammad Ali Bagheri, Qigang Gao, & Sergio Escalera. (2015). Combining Local and Global Learners in the Pairwise Multiclass Classification. PAA - Pattern Analysis and Applications, 18(4), 845–860. Abstract: Pairwise classification is a well-known class binarization technique that converts a multiclass problem into a number of two-class problems, one problem for each pair of classes. However, in the pairwise technique, nuisance votes of many irrelevant classifiers may result in a wrong class prediction. To overcome this problem, a simple, but efficient method is proposed and evaluated in this paper. The proposed method is based on excluding some classes and focusing on the most probable classes in the neighborhood space, named Local Crossing Off (LCO). This procedure is performed by employing a modified version of standard K-nearest neighbor and large margin nearest neighbor algorithms. The LCO method takes advantage of nearest neighbor classification algorithm because of its local learning behavior as well as the global behavior of powerful binary classifiers to discriminate between two classes. Combining these two properties in the proposed LCO technique will avoid the weaknesses of each method and will increase the efficiency of the whole classification system. On several benchmark datasets of varying size and difficulty, we found that the LCO approach leads to significant improvements using different base learners. The experimental results show that the proposed technique not only achieves better classification accuracy in comparison to other standard approaches, but also is computationally more efficient for tackling classification problems which have a relatively large number of target classes. Keywords: Multiclass classification; Pairwise approach; One-versus-one http://refbase.cvc.uab.es/show.php?record=2441
Naila Murray, & Eduard Vazquez. (2010). Lacuna Restoration: How to choose a neutral colour? In Proceedings of The CREATE 2010 Conference (248–252). Abstract: Painting restoration which involves filling in material loss (called lacuna) is a complex process. Several standard techniques exist to tackle lacuna restoration, and this article focuses on those techniques that employ a “neutral” colour to mask the defect. Restoration experts often disagree on the choice of such a colour and in fact, the concept of a neutral colour is controversial. We posit that a neutral colour is one that attracts relatively little visual attention for a specific lacuna. We conducted an eye tracking experiment to compare two common neutral colour selection methods, specifically the most common local colour and the mean local colour. Results obtained demonstrate that the most common local colour triggers less visual attention in general. Notwithstanding, we have observed instances in which the most common colour triggers a significant amount of attention when subjects spent time resolving their confusion about whether or not a lacuna was part of the painting. http://refbase.cvc.uab.es/show.php?record=1297
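The two neutral colour selection methods compared in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the local neighbourhood of a lacuna is given as a list of RGB tuples, and the function names `most_common_colour` and `mean_colour` are hypothetical, not taken from the paper.

```python
from collections import Counter


def most_common_colour(pixels):
    """Return the RGB tuple that occurs most often among the local pixels."""
    return Counter(pixels).most_common(1)[0][0]


def mean_colour(pixels):
    """Return the per-channel average of the local pixels as an RGB tuple."""
    n = len(pixels)
    return tuple(sum(p[channel] for p in pixels) / n for channel in range(3))


# Hypothetical neighbourhood: three light ochre pixels and one dark pixel.
local_pixels = [(200, 180, 150)] * 3 + [(20, 20, 20)]

print(most_common_colour(local_pixels))  # (200, 180, 150)
print(mean_colour(local_pixels))         # (155.0, 140.0, 117.5)
```

In this toy case the mean is pulled toward the dark outlier and matches no pixel actually present, while the most common colour is one that already appears around the lacuna.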
