|
Lluis Pere de las Heras, Joan Mas, Gemma Sanchez, & Ernest Valveny. (2011). Wall Patch-Based Segmentation in Architectural Floorplans. In 11th International Conference on Document Analysis and Recognition (pp. 1270–1274).
Abstract: Segmentation of architectural floor plans is a challenging task, mainly because of the large variability in the notation between different plans. In general, traditional techniques, usually based on analyzing and grouping structural primitives obtained by vectorization, are only able to handle a reduced range of similar notations. In this paper we propose an alternative patch-based segmentation approach working at pixel level, without need of vectorization. The image is divided into a set of patches and a set of features is extracted for every patch. Then, each patch is assigned to a visual word of a previously learned vocabulary and given a probability of belonging to each class of objects. Finally, a post-process assigns the final label for every pixel. This approach has been applied to the detection of walls on two datasets of architectural floor plans with different notations, achieving high accuracy rates.
|
|
|
Yaxing Wang, Abel Gonzalez-Garcia, Joost Van de Weijer, & Luis Herranz. (2019). SDIT: Scalable and Diverse Cross-domain Image Translation. In 27th ACM International Conference on Multimedia (pp. 1267–1276).
Abstract: Recently, image-to-image translation research has witnessed remarkable progress. Although current approaches successfully generate diverse outputs or perform scalable image transfer, these properties have not been combined into a single method. To address this limitation, we propose SDIT: Scalable and Diverse image-to-image translation. These properties are combined into a single generator. The diversity is determined by a latent variable which is randomly sampled from a normal distribution. The scalability is obtained by conditioning the network on the domain attributes. Additionally, we also exploit an attention mechanism that permits the generator to focus on the domain-specific attribute. We empirically demonstrate the performance of the proposed method on face mapping and other datasets beyond faces.
|
|
|
Mohammad Ali Bagheri, Gang Hu, Qigang Gao, & Sergio Escalera. (2014). A Framework of Multi-Classifier Fusion for Human Action Recognition. In 22nd International Conference on Pattern Recognition (pp. 1260–1265).
Abstract: The performance of different action-recognition methods using skeleton joint locations has recently been studied by several computer vision researchers. However, the potential improvement in classification through classifier fusion by ensemble-based methods has remained unexplored. In this work, we evaluate the performance of an ensemble of five action learning techniques, each performing the recognition task from a different perspective. The underlying rationale of the fusion approach is that different learners employ varying structures of input descriptors/features to be trained. These varying structures cannot be concatenated and used by a single learner. In addition, combining the outputs of several learners can reduce the risk of an unfortunate selection of a poorly performing learner. This leads to a more robust and generally applicable framework. Also, we propose two simple, yet effective, action description techniques. In order to improve the recognition performance, a powerful combination strategy is utilized based on the Dempster-Shafer theory, which can effectively make use of the diversity of base learners trained on different sources of information. The recognition results of the individual classifiers are compared with those obtained from fusing the classifiers' outputs, showing improved performance of the proposed methodology.
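Note: the abstract above names Dempster-Shafer theory as the basis of its combination strategy but does not give the details. As a rough illustration only, the following is a minimal sketch of Dempster's rule of combination for two classifiers. The function name `dempster_combine`, the dictionary representation of mass functions, and the example class labels are assumptions made for this sketch, not the authors' implementation.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function is a dict mapping a frozenset of class labels
    (a focal element) to its mass; the masses in each dict sum to 1.
    This representation is an assumption made for illustration.
    """
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            intersection = a & b
            if intersection:
                # Agreeing evidence supports the shared hypotheses.
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                # Disjoint focal elements contribute to the conflict mass.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize so the remaining masses sum to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


# Hypothetical example: two learners voting over the actions "walk" and "run".
frame = frozenset({"walk", "run"})
learner_1 = {frozenset({"walk"}): 0.6, frame: 0.4}
learner_2 = {frozenset({"run"}): 0.5, frame: 0.5}
fused = dempster_combine(learner_1, learner_2)
```

Mass assigned to the whole frame expresses a learner's uncertainty, which is what lets the rule weigh a confident learner more heavily than a hesitant one.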
|
|
|
Mohammad Ali Bagheri, Qigang Gao, & Sergio Escalera. (2014). Generic Subclass Ensemble: A Novel Approach to Ensemble Classification. In 22nd International Conference on Pattern Recognition (pp. 1254–1259).
Abstract: Multiple classifier systems, also known as classifier ensembles, have received great attention in recent years because of their improved classification accuracy in different applications. In this paper, we propose a new general approach to ensemble classification, named generic subclass ensemble, in which each base classifier is trained with data belonging to a subset of classes, and thus discriminates among a subset of target categories. The ensemble classifiers are then fused using a combination rule. The proposed approach differs from existing methods that manipulate the target attribute, since in our approach individual classification problems are not restricted to two-class problems. We perform a series of experiments to evaluate the efficiency of the generic subclass approach on a set of benchmark datasets. Experimental results with multilayer perceptrons show that the proposed approach presents a viable alternative to the most commonly used ensemble classification approaches.
|
|
|
Patricia Suarez, Angel Sappa, Boris X. Vintimilla, & Riad I. Hammoud. (2018). Deep Learning based Single Image Dehazing. In 31st IEEE Conference on Computer Vision and Pattern Recognition Workshop (pp. 1250–12507).
Abstract: This paper proposes a novel approach to remove haze degradations in RGB images using a stacked conditional Generative Adversarial Network (GAN). It employs a triplet of GANs to remove the haze on each color channel independently. A multiple loss functions scheme, applied over a conditional probabilistic model, is proposed. The proposed GAN architecture learns to remove the haze, taking as conditional input the hazy images from which the clear images will be obtained. Such a formulation ensures fast model training convergence and homogeneous model generalization. Experiments showed that the proposed method generates high-quality clear images.
Keywords: Gallium nitride; Atmospheric modeling; Generators; Generative adversarial networks; Convergence; Image color analysis
|
|
|
Lluis Pere de las Heras, David Fernandez, Ernest Valveny, Josep Llados, & Gemma Sanchez. (2013). Unsupervised wall detector in architectural floor plan. In 12th International Conference on Document Analysis and Recognition (pp. 1245–1249).
Abstract: Wall detection in floor plans is a crucial step in a complete floor plan recognition system. Walls define the main structure of buildings and convey essential information for the detection of other structural elements. Nevertheless, wall segmentation is a difficult task, mainly because of the lack of a standard graphical notation. The existing approaches are restricted to a small group of similar notations or require a pre-annotated corpus of input images to learn each new notation. In this paper we present an automatic wall segmentation system that can handle completely different notations without the need for any annotated dataset. It only takes advantage of the general knowledge that walls are a repetitive element, naturally distributed within the plan and commonly modeled by straight parallel lines. The method has been tested on four datasets of real floor plans with different notations, and compared with the state of the art. The results show its suitability for different graphical notations, achieving higher recall rates than the rest of the methods while keeping a high average precision.
|
|
|
Christophe Rigaud, Dimosthenis Karatzas, Joost Van de Weijer, Jean-Christophe Burie, & Jean-Marc Ogier. (2013). An active contour model for speech balloon detection in comics. In 12th International Conference on Document Analysis and Recognition (pp. 1240–1244).
Abstract: Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent comic book understanding would enable a variety of new applications, including content-based retrieval and content retargeting. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. Few studies have been done in this direction. In this work we detail a novel approach for closed and non-closed speech balloon localization in scanned comic book pages, an essential step towards a fully automatic comic book understanding. The approach is compared with existing methods for closed balloon localization found in the literature and results are presented.
|
|
|
Albert Berenguel, Oriol Ramos Terrades, Josep Llados, & Cristina Cañero. (2017). Evaluation of Texture Descriptors for Validation of Counterfeit Documents. In 14th International Conference on Document Analysis and Recognition (pp. 1237–1242).
Abstract: This paper describes an exhaustive comparative analysis and evaluation of different existing texture descriptor algorithms to differentiate between genuine and counterfeit documents. We include in our experiments different categories of algorithms and compare them in different scenarios with several counterfeit datasets, comprising banknotes and identity documents. Computational time in the extraction of each descriptor is important because the final objective is to use it in a real industrial scenario. HoG and CNN based descriptors stand out statistically over the rest in terms of the F1-score/time ratio performance.
|
|
|
Suman Ghosh, Lluis Gomez, Dimosthenis Karatzas, & Ernest Valveny. (2015). Efficient indexing for Query By String text retrieval. In 6th IAPR International Workshop on Camera Based Document Analysis and Recognition CBDAR2015 (pp. 1236–1240).
Abstract: This paper deals with Query By String word spotting in scene images. A hierarchical text segmentation algorithm based on text specific selective search is used to find text regions. These regions are indexed per character n-grams present in the text region. An attribute representation based on Pyramidal Histogram of Characters (PHOC) is used to compare text regions with the query text. For generation of the index a similar attribute space based Pyramidal Histogram of character n-grams is used. These attribute models are learned using linear SVMs over the Fisher Vector [1] representation of the images along with the PHOC labels of the corresponding strings.
|
|
|
J. Chazalon, Marçal Rusiñol, & Jean-Marc Ogier. (2015). Improving Document Matching Performance by Local Descriptor Filtering. In 6th IAPR International Workshop on Camera Based Document Analysis and Recognition CBDAR2015 (pp. 1216–1220).
Abstract: In this paper we propose an effective method aimed at reducing the number of local descriptors to be indexed in a document matching framework. In an off-line training stage, the matching between the model document and incoming images is computed, retaining the local descriptors from the model that steadily produce good matches. We have evaluated this approach by using the ICDAR2015 SmartDOC dataset, containing nearly 25,000 images of documents captured by a mobile device. We have tested the performance of this filtering step by using ORB and SIFT local detectors and descriptors. The results show an important gain both in the quality of the final matching and in time and space requirements.
|
|
|
Ana Maria Ares, Jorge Bernal, Maria Jesus Nozal, F. Javier Sanchez, & Jose Bernal. (2018). Results of the use of Kahoot! gamification tool in a course of Chemistry. In 4th International Conference on Higher Education Advances (pp. 1215–1222).
Abstract: The present study examines the use of Kahoot! as a gamification tool to explore mixed learning strategies. We analyze its use in two different groups of a theoretical subject of the third course of the Degree in Chemistry. An empirical-analytical methodology was applied, using Kahoot! in two different groups of students with different frequencies. The academic results of these two groups of students were compared with each other and with those obtained in the previous course, in which Kahoot! was not employed, with the aim of measuring the evolution in the students' knowledge. The results showed, in all cases, that the use of Kahoot! led to a significant increase in the overall marks and in the number of students who passed the subject. Moreover, some differences were also observed in students' academic performance according to the group. Finally, it can be concluded that the use of a gamification tool (Kahoot!) in a university classroom generally improved students' learning and marks, and that this improvement is more prevalent in those students who achieved a better Kahoot! performance.
|
|
|
Xavier Otazu, Olivier Penacchio, & Xim Cerda-Company. (2015). Brightness and colour induction through contextual influences in V1. In Scottish Vision Group 2015 SVG2015 (Vol. 12, pp. 1208–2012).
|
|
|
Angel Sappa, Fadi Dornaika, David Geronimo, & Antonio Lopez. (2007). Efficient On-Board Stereo Vision Pose Estimation. In Computer Aided Systems Theory, Selected paper from (Vol. 4739, pp. 1183–1190). LNCS.
Abstract: This paper presents an efficient technique for real-time estimation of on-board stereo vision system pose. The whole process is performed in the Euclidean space and consists of two stages. Initially, a compact representation of the original 3D data points is computed. Then, a RANSAC based least squares approach is used for fitting a plane to the 3D road points. Fast RANSAC fitting is obtained by selecting points according to a probability distribution function that takes into account the density of points at a given depth. Finally, the stereo camera position and orientation (its pose) are computed relative to the road plane. The proposed technique is intended to be used on driver assistance systems for applications such as obstacle or pedestrian detection. Real-time performance is reached. Experimental results on several environments and comparisons with a previous work are presented.
|
|
|
Miguel Reyes, Gabriel Dominguez, & Sergio Escalera. (2011). Feature Weighting in Dynamic Time Warping for Gesture Recognition in Depth Data. In 1st IEEE Workshop on Consumer Depth Cameras for Computer Vision (pp. 1182–1188).
Abstract: We present a gesture recognition approach for depth video data based on a novel Feature Weighting approach within the Dynamic Time Warping framework. Depth features from human joints are compared through video sequences using Dynamic Time Warping, and weights are assigned to features based on inter-intra class gesture variability. Feature Weighting in Dynamic Time Warping is then applied for recognizing the beginning and end of gestures in data sequences. The results obtained when recognizing several gestures in depth data show high performance compared with the classical Dynamic Time Warping approach.
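Note: as a rough illustration of the idea in the abstract above, the following is a minimal sketch of Dynamic Time Warping in which each feature dimension carries its own weight in the frame-to-frame distance. The function name `weighted_dtw`, the weighted Euclidean distance, and the hand-picked weights are assumptions for this sketch; the abstract does not specify how the authors compute the weights or the distance.

```python
import math


def weighted_dtw(seq_a, seq_b, weights):
    """Return the DTW alignment cost between two sequences of feature vectors.

    `weights` holds one non-negative weight per feature dimension; a weight
    of 0 makes that feature irrelevant to the alignment. How the weights are
    learned is outside this sketch (here they are supplied by the caller).
    """
    def frame_distance(x, y):
        # Weighted Euclidean distance between two frames.
        return math.sqrt(sum(w * (xi - yi) ** 2 for w, xi, yi in zip(weights, x, y)))

    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    # cost[i][j] is the best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical two-feature sequences that differ only in the second feature.
gesture = [(0.0, 0.0), (1.0, 5.0)]
template = [(0.0, 0.0), (1.0, 0.0)]
cost_equal_weights = weighted_dtw(gesture, template, (1.0, 1.0))
cost_ignore_second = weighted_dtw(gesture, template, (1.0, 0.0))
```

Down-weighting a feature that varies a lot within a gesture class lowers the alignment cost between samples of that class, which is the intuition behind weighting by inter-intra class variability.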
|
|
|
Ajian Liu, Zichang Tan, Jun Wan, Sergio Escalera, Guodong Guo, & Stan Z. Li. (2021). CASIA-SURF CeFA: A Benchmark for Multi-modal Cross-Ethnicity Face Anti-Spoofing. In IEEE Winter Conference on Applications of Computer Vision (pp. 1178–1186).
Abstract: Ethnic bias has been shown in previous works to affect the performance of face recognition, while it remains unexplored in face anti-spoofing. Therefore, in order to study the ethnic bias for face anti-spoofing, we introduce the largest CASIA-SURF Cross-ethnicity Face Anti-spoofing (CeFA) dataset, covering 3 ethnicities, 3 modalities, 1,607 subjects, and 2D plus 3D attack types. Five protocols are introduced to measure the effect under varied evaluation conditions, such as cross-ethnicity, unknown spoofs or both of them. To our knowledge, CASIA-SURF CeFA is the first dataset including explicit ethnic labels among currently released datasets. Then, we propose a novel multi-modal fusion method as a strong baseline to alleviate the ethnic bias, which employs a partially shared fusion strategy to learn complementary information from multiple modalities. Extensive experiments have been conducted on the proposed dataset to verify its significance and generalization capability for other existing datasets, i.e., CASIA-SURF, OULU-NPU and SiW datasets. The dataset is available at https://sites.google.com/qq.com/face-anti-spoofing/welcome/challengecvpr2020?authuser=0.
|
|