Hugo Jair Escalante, Jose Martinez, Sergio Escalera, Victor Ponce, & Xavier Baro. (2015). Improving Bag of Visual Words Representations with Genetic Programming. In IEEE International Joint Conference on Neural Networks IJCNN2015.
Abstract: The bag of visual words is a well established representation in diverse computer vision problems. Taking inspiration from the fields of text mining and retrieval, this representation has proved to be very effective in a large number of domains.
In most cases, a standard term-frequency weighting scheme is considered for representing images and videos in computer vision. This is somewhat surprising, as there are many alternative ways of generating bag of words representations within the text processing community. This paper explores the use of alternative weighting schemes for landmark tasks in computer vision: image
categorization and gesture recognition. We study the suitability of using well-known supervised and unsupervised weighting schemes for such tasks. More importantly, we devise a genetic program that learns new ways of representing images and videos under the bag of visual words representation. The proposed method learns to combine term-weighting primitives trying to maximize the classification performance. Experimental results are reported in standard image and video data sets showing the effectiveness of the proposed evolutionary algorithm.
|
J. Chazalon, Marçal Rusiñol, & Jean-Marc Ogier. (2015). Improving Document Matching Performance by Local Descriptor Filtering. In 6th IAPR International Workshop on Camera Based Document Analysis and Recognition CBDAR2015 (pp. 1216–1220).
Abstract: In this paper we propose an effective method aimed at reducing the amount of local descriptors to be indexed in a document matching framework. In an off-line training stage, the matching between the model document and incoming images is computed retaining the local descriptors from the model that steadily produce good matches. We have evaluated this approach by using the ICDAR2015 SmartDOC dataset containing near 25 000 images from documents to be captured by a mobile device. We have tested the performance of this filtering step by using
ORB and SIFT local detectors and descriptors. The results show an important gain both in quality of the final matching as well as in time and space requirements.
|
Katerine Diaz, Francesc J. Ferri, & W. Diaz. (2015). Incremental Generalized Discriminative Common Vectors for Image Classification. TNNLS - IEEE Transactions on Neural Networks and Learning Systems, 26(8), 1761–1775.
Abstract: Subspace-based methods have become popular due to their ability to appropriately represent complex data in such a way that both dimensionality is reduced and discriminativeness is enhanced. Several recent works have concentrated on the discriminative common vector (DCV) method and other closely related algorithms also based on the concept of null space. In this paper, we present a generalized incremental formulation of the DCV methods, which allows the update of a given model by considering the addition of new examples even from unseen classes. Having efficient incremental formulations of well-behaved batch algorithms allows us to conveniently adapt previously trained classifiers without the need of recomputing them from scratch. The proposed generalized incremental method has been empirically validated in different case studies from different application domains (faces, objects, and handwritten digits) considering several different scenarios in which new data are continuously added at different rates starting from an initial model.
|
R.A.Bendezu, E.Barba, E.Burri, D.Cisternas, Carolina Malagelada, Santiago Segui, et al. (2015). Intestinal gas content and distribution in health and in patients with functional gut symptoms. NEUMOT - Neurogastroenterology & Motility, 27(9), 1249–1257.
Abstract: BACKGROUND:
The precise relation of intestinal gas to symptoms, particularly abdominal bloating and distension remains incompletely elucidated. Our aim was to define the normal values of intestinal gas volume and distribution and to identify abnormalities in relation to functional-type symptoms.
METHODS:
Abdominal computed tomography scans were evaluated in healthy subjects (n = 37) and in patients in three conditions: basal (when they were feeling well; n = 88), during an episode of abdominal distension (n = 82) and after a challenge diet (n = 24). Intestinal gas content and distribution were measured by an original analysis program. Identification of patients outside the normal range was performed by machine learning techniques (one-class classifier). Results are expressed as median (IQR) or mean ± SE, as appropriate.
KEY RESULTS:
In healthy subjects the gut contained 95 (71, 141) mL gas distributed along the entire lumen. No differences were detected between patients studied under asymptomatic basal conditions and healthy subjects. However, either during a spontaneous bloating episode or once challenged with a flatulogenic diet, luminal gas was found to be increased and/or abnormally distributed in about one-fourth of the patients. These patients detected outside the normal range by the classifier exhibited a significantly greater number of abnormal features than those within the normal range (3.7 ± 0.4 vs 0.4 ± 0.1; p < 0.001).
CONCLUSIONS & INFERENCES:
The analysis of a large cohort of subjects using original techniques provides unique and heretofore unavailable information on the volume and distribution of intestinal gas in normal conditions and in relation to functional gastrointestinal symptoms.
|
Mohammad Ali Bagheri, Qigang Gao, Sergio Escalera, Albert Clapes, Kamal Nasrollahi, Michael Holte, et al. (2015). Keep it Accurate and Diverse: Enhancing Action Recognition Performance by Ensemble Learning. In IEEE Conference on Computer Vision and Pattern Recognition Worshops (CVPRW) (pp. 22–29).
Abstract: The performance of different action recognition techniques has recently been studied by several computer vision researchers. However, the potential improvement in classification through classifier fusion by ensemble-based methods has remained unattended. In this work, we evaluate the performance of an ensemble of action learning techniques, each performing the recognition task from a different perspective.
The underlying idea is that instead of aiming a very sophisticated and powerful representation/learning technique, we can learn action categories using a set of relatively simple and diverse classifiers, each trained with different feature set. In addition, combining the outputs of several learners can reduce the risk of an unfortunate selection of a learner on an unseen action recognition scenario.
This leads to having a more robust and general-applicable framework. In order to improve the recognition performance, a powerful combination strategy is utilized based on the Dempster-Shafer theory, which can effectively make use
of diversity of base learners trained on different sources of information. The recognition results of the individual classifiers are compared with those obtained from fusing the classifiers’ output, showing enhanced performance of the proposed methodology.
|
Christophe Rigaud, Clement Guerin, Dimosthenis Karatzas, Jean-Christophe Burie, & Jean-Marc Ogier. (2015). Knowledge-driven understanding of images in comic books. IJDAR - International Journal on Document Analysis and Recognition, 18(3), 199–221.
Abstract: Document analysis is an active field of research, which can attain a complete understanding of the semantics of a given document. One example of the document understanding process is enabling a computer to identify the key elements of a comic book story and arrange them according to a predefined domain knowledge. In this study, we propose a knowledge-driven system that can interact with bottom-up and top-down information to progressively understand the content of a document. We model the comic book’s and the image processing domains knowledge for information consistency analysis. In addition, different image processing methods are improved or developed to extract panels, balloons, tails, texts, comic characters and their semantic relations in an unsupervised way.
Keywords: Document Understanding; comics analysis; expert system
|
Antoni Gurgui, Debora Gil, & Enric Marti. (2015). Laplacian Unitary Domain for Texture Morphing. In Proceedings of the 10th International Conference on Computer Vision Theory and Applications VISIGRAPP2015 (Vol. 1, pp. 693–699). SciTePress.
Abstract: Deformation of expressive textures is the gateway to realistic computer synthesis of expressions. By their good mathematical properties and flexible formulation on irregular meshes, most texture mappings rely on solutions to the Laplacian in the cartesian space. In the context of facial expression morphing, this approximation can be seen from the opposite point of view by neglecting the metric. In this paper, we use the properties of the Laplacian in manifolds to present a novel approach to warping expressive facial images in order to generate a morphing between them.
Keywords: Facial; metamorphosis;LaplacianMorphing
|
Pau Riba, Josep Llados, Alicia Fornes, & Anjan Dutta. (2015). Large-scale Graph Indexing using Binary Embeddings of Node Contexts. In C.-L.Liu, B.Luo, W.G.Kropatsch, & J.Cheng (Eds.), 10th IAPR-TC15 Workshop on Graph-based Representations in Pattern Recognition (Vol. 9069, pp. 208–217). LNCS. Springer International Publishing.
Abstract: Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations in terms of feature vectors. Retrieving a query graph from a large dataset of graphs has the drawback of the high computational complexity required to compare the query and the target graphs. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. In this paper we propose a fast indexation formalism for graph retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Hence, each attribute counts the length of a walk of order k originated in a vertex with label l. Each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in a handwritten word spotting scenario in images of historical documents.
Keywords: Graph matching; Graph indexing; Application in document analysis; Word spotting; Binary embedding
|
Santiago Segui, Oriol Pujol, & Jordi Vitria. (2015). Learning to count with deep object features. In Deep Vision: Deep Learning in Computer Vision, CVPR 2015 Workshop (pp. 90–96).
Abstract: Learning to count is a learning strategy that has been recently proposed in the literature for dealing with problems where estimating the number of object instances in a scene is the final objective. In this framework, the task of learning to detect and localize individual object instances is seen as a harder task that can be evaded by casting the problem as that of computing a regression value from hand-crafted image features. In this paper we explore the features that are learned when training a counting convolutional neural
network in order to understand their underlying representation.
To this end we define a counting problem for MNIST data and show that the internal representation of the network is able to classify digits in spite of the fact that no direct supervision was provided for them during training.
We also present preliminary results about a deep network that is able to count the number of pedestrians in a scene.
|
Cristhian A. Aguilera-Carrasco, Angel Sappa, & Ricardo Toledo. (2015). LGHD: a Feature Descriptor for Matching Across Non-Linear Intensity Variations. In 22th IEEE International Conference on Image Processing (pp. 178–181).
|
Mariella Dimiccoli, & Petia Radeva. (2015). Lifelogging in the era of outstanding digitization. In International Conference on Digital Presentation and Preservation of Cultural and Scientific Heritage.
Abstract: In this paper, we give an overview on the emerging trend of the digitized self, focusing on visual lifelogging through wearable cameras. This is about continuously recording our life from a first-person view by wearing a camera that passively captures images. On one hand, visual lifelogging has opened the door to a large number of applications, including health. On the other, it has also boosted new challenges in the field of data analysis as well as new ethical concerns. While currently increasing efforts are being devoted to exploit lifelogging data for the improvement of personal well-being, we believe there are still many interesting applications to explore, ranging from tourism to the digitization of human behavior.
|
Aura Hernandez-Sabate, Meritxell Joanpere, Nuria Gorgorio, & Lluis Albarracin. (2015). Mathematics learning opportunities when playing a Tower Defense Game. IJSG - International Journal of Serious Games, 57–71.
Abstract: A qualitative research study is presented herein with the purpose of identifying mathematics learning opportunities in students between 10 and 12 years old while playing a commercial version of a Tower Defense game. These learning opportunities are understood as mathematicisable moments of the game and involve the establishment of relationships between the game and mathematical problem solving. Based on the analysis of these mathematicisable moments, we conclude that the game can promote problem-solving processes and learning opportunities that can be associated with different mathematical contents that appears in mathematics curricula, thought it seems that teacher or new game elements might be needed to facilitate the processes.
Keywords: Tower Defense game; learning opportunities; mathematics; problem solving; game design
|
Dan Norton, Fernando Vilariño, & Onur Ferhat. (2015). Memory Field – Creative Engagement in Digital Collections. In Internet Librarian International Conference.
Abstract: “Memory Fields” is a trans-disciplinary project aiming at the (re)valorisation of digital collections.Its main deliverable is an interface for a dual screen installation, used to access and mix the public library digital collections. The collections being used in this case are a collection of digitised posters from the Spanish Civil War, belonging to the Arxiu General de Catalunya, and a collection of field recordings made by Dan Norton. The system generates visualisations, and the images and sounds are mixed together using narrative primitives of video dj. Users contribute to the digital collections by adding personal memories and observations. The comments and recollections appear as flowers growing in a “memory field” and memories remain public in a Twitter feed (@Memoryfields).
|
Fernando Vilariño, Dan Norton, & Onur Ferhat. (2015). Memory Fields: DJs in the Library. In 21 st Symposium of Electronic Arts.
|
Adriana Romero, Petia Radeva, & Carlo Gatta. (2015). Meta-parameter free unsupervised sparse feature learning. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(8), 1716–1722.
Abstract: We propose a meta-parameter free, off-the-shelf, simple and fast unsupervised feature learning algorithm, which exploits a new way of optimizing for sparsity. Experiments on CIFAR-10, STL- 10 and UCMerced show that the method achieves the state-of-theart performance, providing discriminative features that generalize well.
|