|
Sergio Vera, Miguel Angel Gonzalez Ballester, & Debora Gil. (2015). A Novel Cochlear Reference Frame Based On The Laplace Equation. In 29th international Congress and Exhibition on Computer Assisted Radiology and Surgery (Vol. 10, pp. 1–312).
|
|
|
Pau Riba, Josep Llados, Alicia Fornes, & Anjan Dutta. (2015). Large-scale Graph Indexing using Binary Embeddings of Node Contexts. In C.-L.Liu, B.Luo, W.G.Kropatsch, & J.Cheng (Eds.), 10th IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition (Vol. 9069, pp. 208–217). LNCS. Springer International Publishing.
Abstract: Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations in terms of feature vectors. Retrieving a query graph from a large dataset of graphs has the drawback of the high computational complexity required to compare the query and the target graphs. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. In this paper we propose a fast indexation formalism for graph retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Hence, each attribute counts the length of a walk of order k originated in a vertex with label l. Each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in a handwritten word spotting scenario in images of historical documents.
Keywords: Graph matching; Graph indexing; Application in document analysis; Word spotting; Binary embedding
|
|
|
Juan Ignacio Toledo, Jordi Cucurull, Jordi Puiggali, Alicia Fornes, & Josep Llados. (2015). Document Analysis Techniques for Automatic Electoral Document Processing: A Survey. In E-Voting and Identity, Proceedings of 5th international conference, VoteID 2015 (pp. 139–141). LNCS.
Abstract: In this paper, we will discuss the most common challenges in electoral document processing and study the different solutions from the document analysis community that can be applied in each case. We will cover Optical Mark Recognition techniques to detect voter selections in the Australian Ballot, handwritten number recognition for preferential elections and handwriting recognition for write-in areas. We will also propose some particular adjustments that can be made to those general techniques in the specific context of electoral documents.
Keywords: Document image analysis; Computer vision; Paper ballots; Paper based elections; Optical scan; Tally
|
|
|
Onur Ferhat, Arcadi Llanza, & Fernando Vilariño. (2015). Gaze interaction for multi-display systems using natural light eye-tracker. In 2nd International Workshop on Solutions for Automatic Gaze Data Analysis.
|
|
|
Xavier Baro, Jordi Gonzalez, Junior Fabian, Miguel Angel Bautista, Marc Oliu, Hugo Jair Escalante, et al. (2015). ChaLearn Looking at People 2015 challenges: action spotting and cultural event recognition. In 2015 IEEE Conference on Computer Vision and Pattern Recognition Worshops (CVPRW) (pp. 1–9).
Abstract: Following previous series on Looking at People (LAP) challenges [6, 5, 4], ChaLearn ran two competitions to be presented at CVPR 2015: action/interaction spotting and cultural event recognition in RGB data. We ran a second round on human activity recognition on RGB data sequences. In terms of cultural event recognition, tens of categories have to be recognized. This involves scene understanding and human analysis. This paper summarizes the two performed challenges and obtained results. Details of the ChaLearn LAP competitions can be found at http://gesture.chalearn.org/.
|
|
|
Andres Traumann, Sergio Escalera, & Gholamreza Anbarjafari. (2015). A New Retexturing Method for Virtual Fitting Room Using Kinect 2 Camera. In 2015 IEEE Conference on Computer Vision and Pattern Recognition Worshops (CVPRW) (pp. 75–79).
|
|
|
Ramin Irani, Kamal Nasrollahi, Chris Bahnsen, D.H. Lundtoft, Thomas B. Moeslund, Marc O. Simon, et al. (2015). Spatio-temporal Analysis of RGB-D-T Facial Images for Multimodal Pain Level Recognition. In 2015 IEEE Conference on Computer Vision and Pattern Recognition Worshops (CVPRW) (pp. 88–95).
Abstract: Pain is a vital sign of human health and its automatic detection can be of crucial importance in many different contexts, including medical scenarios. While most available computer vision techniques are based on RGB, in this paper, we investigate the effect of combining RGB, depth, and thermal
facial images for pain detection and pain intensity level recognition. For this purpose, we extract energies released by facial pixels using a spatiotemporal filter. Experiments on a group of 12 elderly people applying the multimodal approach show that the proposed method successfully detects pain and recognizes between three intensity levels in 82% of the analyzed frames improving more than 6% over RGB only analysis in similar conditions.
|
|
|
Mohammad Ali Bagheri, Qigang Gao, Sergio Escalera, Albert Clapes, Kamal Nasrollahi, Michael Holte, et al. (2015). Keep it Accurate and Diverse: Enhancing Action Recognition Performance by Ensemble Learning. In IEEE Conference on Computer Vision and Pattern Recognition Worshops (CVPRW) (pp. 22–29).
Abstract: The performance of different action recognition techniques has recently been studied by several computer vision researchers. However, the potential improvement in classification through classifier fusion by ensemble-based methods has remained unattended. In this work, we evaluate the performance of an ensemble of action learning techniques, each performing the recognition task from a different perspective.
The underlying idea is that instead of aiming a very sophisticated and powerful representation/learning technique, we can learn action categories using a set of relatively simple and diverse classifiers, each trained with different feature set. In addition, combining the outputs of several learners can reduce the risk of an unfortunate selection of a learner on an unseen action recognition scenario.
This leads to having a more robust and general-applicable framework. In order to improve the recognition performance, a powerful combination strategy is utilized based on the Dempster-Shafer theory, which can effectively make use
of diversity of base learners trained on different sources of information. The recognition results of the individual classifiers are compared with those obtained from fusing the classifiers’ output, showing enhanced performance of the proposed methodology.
|
|
|
Bogdan Raducanu, Alireza Bosaghzadeh, & Fadi Dornaika. (2015). Multi-observation Face Recognition in Videos based on Label Propagation. In 6th Workshop on Analysis and Modeling of Faces and Gestures AMFG2015 (pp. 10–17).
Abstract: In order to deal with the huge amount of content generated by social media, especially for indexing and retrieval purposes, the focus shifted from single object recognition to multi-observation object recognition. Of particular interest is the problem of face recognition (used as primary cue for persons’ identity assessment), since it is highly required by popular social media search engines like Facebook and Youtube. Recently, several approaches for graph-based label propagation were proposed. However, the associated graphs were constructed in an ad-hoc manner (e.g., using the KNN graph) that cannot cope properly with the rapid and frequent changes in data appearance, a phenomenon intrinsically related with video sequences. In this paper, we
propose a novel approach for efficient and adaptive graph construction, based on a two-phase scheme: (i) the first phase is used to adaptively find the neighbors of a sample and also to find the adequate weights for the minimization function of the second phase; (ii) in the second phase, the
selected neighbors along with their corresponding weights are used to locally and collaboratively estimate the sparse affinity matrix weights. Experimental results performed on Honda Video Database (HVDB) and a subset of video
sequences extracted from the popular TV-series ’Friends’ show a distinct advantage of the proposed method over the existing standard graph construction methods.
|
|
|
Santiago Segui, Oriol Pujol, & Jordi Vitria. (2015). Learning to count with deep object features. In Deep Vision: Deep Learning in Computer Vision, CVPR 2015 Workshop (pp. 90–96).
Abstract: Learning to count is a learning strategy that has been recently proposed in the literature for dealing with problems where estimating the number of object instances in a scene is the final objective. In this framework, the task of learning to detect and localize individual object instances is seen as a harder task that can be evaded by casting the problem as that of computing a regression value from hand-crafted image features. In this paper we explore the features that are learned when training a counting convolutional neural
network in order to understand their underlying representation.
To this end we define a counting problem for MNIST data and show that the internal representation of the network is able to classify digits in spite of the fact that no direct supervision was provided for them during training.
We also present preliminary results about a deep network that is able to count the number of pedestrians in a scene.
|
|
|
Xavier Otazu, Olivier Penacchio, & Xim Cerda-Company. (2015). Brightness and colour induction through contextual influences in V1. In Scottish Vision Group 2015 SGV2015 (Vol. 12, pp. 1208–2012).
|
|
|
Dennis G.Romero, Anselmo Frizera, Angel Sappa, Boris X. Vintimilla, & Teodiano F.Bastos. (2015). A predictive model for human activity recognition by observing actions and context. In Advanced Concepts for Intelligent Vision Systems, Proceedings of 16th International Conference, ACIVS 2015 (Vol. 9386, pp. 323–333). LNCS. Springer International Publishing.
Abstract: This paper presents a novel model to estimate human activities — a human activity is defined by a set of human actions. The proposed approach is based on the usage of Recurrent Neural Networks (RNN) and Bayesian inference through the continuous monitoring of human actions and its surrounding environment. In the current work human activities are inferred considering not only visual analysis but also additional resources; external sources of information, such as context information, are incorporated to contribute to the activity estimation. The novelty of the proposed approach lies in the way the information is encoded, so that it can be later associated according to a predefined semantic structure. Hence, a pattern representing a given activity can be defined by a set of actions, plus contextual information or other kind of information that could be relevant to describe the activity. Experimental results with real data are provided showing the validity of the proposed approach.
|
|
|
Fahad Shahbaz Khan, Muhammad Anwer Rao, Joost Van de Weijer, Michael Felsberg, & J.Laaksonen. (2015). Deep semantic pyramids for human attributes and action recognition. In Image Analysis, Proceedings of 19th Scandinavian Conference , SCIA 2015 (Vol. 9127, pp. 341–353). Springer International Publishing.
Abstract: Describing persons and their actions is a challenging problem due to variations in pose, scale and viewpoint in real-world images. Recently, semantic pyramids approach [1] for pose normalization has shown to provide excellent results for gender and action recognition. The performance of semantic pyramids approach relies on robust image description and is therefore limited due to the use of shallow local features. In the context of object recognition [2] and object detection [3], convolutional neural networks (CNNs) or deep features have shown to improve the performance over the conventional shallow features.
We propose deep semantic pyramids for human attributes and action recognition. The method works by constructing spatial pyramids based on CNNs of different part locations. These pyramids are then combined to obtain a single semantic representation. We validate our approach on the Berkeley and 27 Human Attributes datasets for attributes classification. For action recognition, we perform experiments on two challenging datasets: Willow and PASCAL VOC 2010. The proposed deep semantic pyramids provide a significant gain of 17.2%, 13.9%, 24.3% and 22.6% compared to the standard shallow semantic pyramids on Berkeley, 27 Human Attributes, Willow and PASCAL VOC 2010 datasets respectively. Our results also show that deep semantic pyramids outperform conventional CNNs based on the full bounding box of the person. Finally, we compare our approach with state-of-the-art methods and show a gain in performance compared to best methods in literature.
Keywords: Action recognition; Human attributes; Semantic pyramids
|
|
|
Martha Mackay, Fernando Alonso, Pere Salamero, Xavier Baro, Jordi Gonzalez, & Sergio Escalera. (2015). Care and caring: future proofing the new demographics. In 6th International Carers Conference.
Abstract: With an ageing population, the issue of care provision is becoming increasingly important. The simple aspiration of the majority of older people is to live safely and well at home. Housing will be part of health & care integration in the following years and decades. A higher proportion of people will have to rely on informal care through family, friends, neighbors and others who
provide care to an older person in need of assistance (around 80% of care across the EU). They do not usually have a formal status and are usually unpaid. We need to ensure that all disabled or chronically ill people can get the help they need without overburdening their families.
The physical and emotional stress of carers is one of the dangers that this dependency can bring. To prevent carers burnout it is necessary to provide new solutions that are affordable and user friendly for the families and caregivers.
|
|
|
Firat Ismailoglu, Ida G. Sprinkhuizen-Kuyper, Evgueni Smirnov, Sergio Escalera, & Ralf Peeters. (2015). Fractional Programming Weighted Decoding for Error-Correcting Output Codes. In Multiple Classifier Systems, Proceedings of 12th International Workshop , MCS 2015 (pp. 38–50). Springer International Publishing.
Abstract: In order to increase the classification performance obtained using Error-Correcting Output Codes designs (ECOC), introducing weights in the decoding phase of the ECOC has attracted a lot of interest. In this work, we present a method for ECOC designs that focuses on increasing hypothesis margin on the data samples given a base classifier. While achieving this, we implicitly reward the base classifiers with high performance, whereas punish those with low performance. The resulting objective function is of the fractional programming type and we deal with this problem through the Dinkelbach’s Algorithm. The conducted tests over well known UCI datasets show that the presented method is superior to the unweighted decoding and that it outperforms the results of the state-of-the-art weighted decoding methods in most of the performed experiments.
|
|