|
Gerard Lacey, & Fernando Vilariño. (2011). Endoscopy system with motion sensors.
Abstract: An endoscopy system (1) comprises an endoscope (2) with a camera (3) at its tip. The endoscope extends through an endoscope guide (4) for guiding movement of the endoscope and for measurement of its movement as it enters the body. The guide (4) comprises a generally conical body (5) having a through passage (105) through which the endoscope (2) extends. A motion sensor comprises an optical transmitter (7) and a detector (8) mounted alongside the passage (105) to measure the insertion-withdrawal linear motion and also rotation of the endoscope by the endoscopist's hand. The system (1) also comprises a flexure controller (10) having wheels operated by the endoscopist. The camera (3), the motion sensor (7/8), and the flexure controller (10) are all connected to a processor (11) which feeds a display.
|
|
|
Panagiota Spyridonos, Fernando Vilariño, Jordi Vitria, Petia Radeva, Fernando Azpiroz, & Juan Malagelada. (2011). Device, system and method for automatic detection of contractile activity in an image frame.
Abstract: A device, system and method for automatic detection of contractile activity of a body lumen in an image frame is provided, wherein image frames during contractile activity are captured and/or image frames including contractile activity are automatically detected, such as through pattern recognition and/or feature extraction to trace image frames including contractions, e.g., with wrinkle patterns. A manual procedure of annotation of contractions, e.g. tonic contractions in capsule endoscopy, may consist of the visualization of the whole video by a specialist, and the labeling of the contraction frames. Embodiments of the present invention may be suitable for implementation in an in vivo imaging system.
|
|
|
Enric Marti, Ferran Poveda, Antoni Gurgui, & Debora Gil. (2011). Aprendizaje Basado en Proyectos en Ingeniería Informática. Resultados y reflexiones de seis años de experiencia.
Abstract: In this workshop a 6 years experience in Project Based Learning (PBL) in Computer Graphics, Computer Engineering course at the Autonomous University of Barcelona (UAB) is presented. We use a Moodle environment suited to manage the documentation generated in PBL. The course is organized by means of two alternative routes: a classic itinerary of lectures and test-based evaluation and another with PBL. In the PBL itinerary we explain the organization in teamgroups, homework tutoring and monitoring and evaluation guidelines for students. We provide some of the work done by students, and the results of assessment surveys carried out to students during these years. We report the evolution of our PBL itinerary in terms of, both, organization and student surveys.
The workshop aims at discussing about on the advantages and disadvantages of using these active methodologies in technical degrees such as computer engineering, in order to debate about the most suitable way of organizing PBL and assessing students learning rate.
|
|
|
Diego Velazquez, Pau Rodriguez, Josep M. Gonfaus, Xavier Roca, & Jordi Gonzalez. (2022). A Closer Look at Embedding Propagation for Manifold Smoothing. JMLR - Journal of Machine Learning Research, 23(252), 1–27.
Abstract: Supervised training of neural networks requires a large amount of manually annotated data and the resulting networks tend to be sensitive to out-of-distribution (OOD) data.
Self- and semi-supervised training schemes reduce the amount of annotated data required during the training process. However, OOD generalization remains a major challenge for most methods. Strategies that promote smoother decision boundaries play an important role in out-of-distribution generalization. For example, embedding propagation (EP) for manifold smoothing has recently shown to considerably improve the OOD performance for few-shot classification. EP achieves smoother class manifolds by building a graph from sample embeddings and propagating information through the nodes in an unsupervised manner. In this work, we extend the original EP paper providing additional evidence and experiments showing that it attains smoother class embedding manifolds and improves results in settings beyond few-shot classification. Concretely, we show that EP improves the robustness of neural networks against multiple adversarial attacks as well as semi- and
self-supervised learning performance.
Keywords: Regularization; emi-supervised learning; self-supervised learning; adversarial robustness; few-shot classification
|
|
|
Sergio Escalera, Vassilis Athitsos, & Isabelle Guyon. (2016). Challenges in multimodal gesture recognition. JMLR - Journal of Machine Learning Research, 17, 1–54.
Abstract: This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011-2015. We began right at the start of the KinectTMrevolution when inexpensive infrared cameras providing image depth recordings became available. We published papers using this technology and other more conventional methods, including regular video cameras, to record data, thus providing a good overview of uses of machine learning and computer vision using multimodal data in this area of application. Notably, we organized a series of challenges and made available several datasets we recorded for that purpose, including tens of thousands
of videos, which are available to conduct further research. We also overview recent state of the art works on gesture recognition based on a proposed taxonomy for gesture recognition, discussing challenges and future lines of research.
Keywords: Gesture Recognition; Time Series Analysis; Multimodal Data Analysis; Computer Vision; Pattern Recognition; Wearable sensors; Infrared Cameras; KinectTM
|
|
|
Isabelle Guyon, Kristin Bennett, Gavin Cawley, Hugo Jair Escalante, Sergio Escalera, Tin Kam Ho, et al. (2015). AutoML Challenge 2015: Design and First Results. In 32nd International Conference on Machine Learning, ICML workshop, JMLR proceedings ICML15 (pp. 1–8).
Abstract: ChaLearn is organizing the Automatic Machine Learning (AutoML) contest 2015, which challenges participants to solve classication and regression problems without any human intervention. Participants' code is automatically run on the contest servers to train and test learning machines. However, there is no obligation to submit code; half of the prizes can be won by submitting prediction results only. Datasets of progressively increasing diculty are introduced throughout the six rounds of the challenge. (Participants can
enter the competition in any round.) The rounds alternate phases in which learners are tested on datasets participants have not seen (AutoML), and phases in which participants have limited time to tweak their algorithms on those datasets to improve performance (Tweakathon). This challenge will push the state of the art in fully automatic machine learning on a wide range of real-world problems. The platform will remain available beyond the termination of the challenge: http://codalab.org/AutoML.
Keywords: AutoML Challenge; machine learning; model selection; meta-learning; repre- sentation learning; active learning
|
|
|
Aura Hernandez-Sabate, Meritxell Joanpere, Nuria Gorgorio, & Lluis Albarracin. (2015). Mathematics learning opportunities when playing a Tower Defense Game. IJSG - International Journal of Serious Games, 57–71.
Abstract: A qualitative research study is presented herein with the purpose of identifying mathematics learning opportunities in students between 10 and 12 years old while playing a commercial version of a Tower Defense game. These learning opportunities are understood as mathematicisable moments of the game and involve the establishment of relationships between the game and mathematical problem solving. Based on the analysis of these mathematicisable moments, we conclude that the game can promote problem-solving processes and learning opportunities that can be associated with different mathematical contents that appears in mathematics curricula, thought it seems that teacher or new game elements might be needed to facilitate the processes.
Keywords: Tower Defense game; learning opportunities; mathematics; problem solving; game design
|
|
|
Debora Gil, Jaume Garcia, Aura Hernandez-Sabate, & Enric Marti. (2010). Manifold parametrization of the left ventricle for a statistical modelling of its complete anatomy. In 8th Medical Imaging (Vol. 7623, 304). SPIE.
Abstract: Distortion of Left Ventricle (LV) external anatomy is related to some dysfunctions, such as hypertrophy. The architecture of myocardial fibers determines LV electromechanical activation patterns as well as mechanics. Thus, their joined modelling would allow the design of specific interventions (such as peacemaker implantation and LV remodelling) and therapies (such as resynchronization). On one hand, accurate modelling of external anatomy requires either a dense sampling or a continuous infinite dimensional approach, which requires non-Euclidean statistics. On the other hand, computation of fiber models requires statistics on Riemannian spaces. Most approaches compute separate statistical models for external anatomy and fibers architecture. In this work we propose a general mathematical framework based on differential geometry concepts for computing a statistical model including, both, external and fiber anatomy. Our framework provides a continuous approach to external anatomy supporting standard statistics. We also provide a straightforward formula for the computation of the Riemannian fiber statistics. We have applied our methodology to the computation of complete anatomical atlas of canine hearts from diffusion tensor studies. The orientation of fibers over the average external geometry agrees with the segmental description of orientations reported in the literature.
|
|
|
Eloi Puertas, Sergio Escalera, & Oriol Pujol. (2015). Generalized Multi-scale Stacked Sequential Learning for Multi-class Classification. PAA - Pattern Analysis and Applications, 18(2), 247–261.
Abstract: In many classification problems, neighbor data labels have inherent sequential relationships. Sequential learning algorithms take benefit of these relationships in order to improve generalization. In this paper, we revise the multi-scale sequential learning approach (MSSL) for applying it in the multi-class case (MMSSL). We introduce the error-correcting output codesframework in the MSSL classifiers and propose a formulation for calculating confidence maps from the margins of the base classifiers. In addition, we propose a MMSSL compression approach which reduces the number of features in the extended data set without a loss in performance. The proposed methods are tested on several databases, showing significant performance improvement compared to classical approaches.
Keywords: Stacked sequential learning; Multi-scale; Error-correct output codes (ECOC); Contextual classification
|
|
|
Fahad Shahbaz Khan, Joost Van de Weijer, & Maria Vanrell. (2012). Modulating Shape Features by Color Attention for Object Recognition. IJCV - International Journal of Computer Vision, 98(1), 49–64.
Abstract: Bag-of-words based image representation is a successful approach for object recognition. Generally, the subsequent stages of the process: feature detection,feature description, vocabulary construction and image representation are performed independent of the intentioned object classes to be detected. In such a framework, it was found that the combination of different image cues, such as shape and color, often obtains below expected results. This paper presents a novel method for recognizing object categories when using ultiple cues by separately processing the shape and color cues and combining them by modulating the shape features by category specific color attention. Color is used to compute bottom up and top-down attention maps. Subsequently, these color attention maps are used to modulate the weights of the shape features. In regions with higher attention shape features are given more weight than in regions with low attention. We compare our approach with existing methods that combine color and shape cues on five data sets containing varied importance of both cues, namely, Soccer (color predominance), Flower (color and hape parity), PASCAL VOC 2007 and 2009 (shape predominance) and Caltech-101 (color co-interference). The experiments clearly demonstrate that in all five data sets our proposed framework significantly outperforms existing methods for combining color and shape information.
|
|
|
Sergio Vera, Miguel Angel Gonzalez Ballester, & Debora Gil. (2015). A Novel Cochlear Reference Frame Based On The Laplace Equation. In 29th international Congress and Exhibition on Computer Assisted Radiology and Surgery (Vol. 10, pp. 1–312).
|
|
|
Ernest Valveny, Oriol Ramos Terrades, Joan Mas, & Marçal Rusiñol. (2013). Interactive Document Retrieval and Classification. In Angel Sappa, & Jordi Vitria (Eds.), Multimodal Interaction in Image and Video Applications (Vol. 48, pp. 17–30). Springer Berlin Heidelberg.
Abstract: In this chapter we describe a system for document retrieval and classification following the interactive-predictive framework. In particular, the system addresses two different scenarios of document analysis: document classification based on visual appearance and logo detection. These two classical problems of document analysis are formulated following the interactive-predictive model, taking the user interaction into account to make easier the process of annotating and labelling the documents. A system implementing this model in a real scenario is presented and analyzed. This system also takes advantage of active learning techniques to speed up the task of labelling the documents.
|
|
|
Muhammad Muzzamil Luqman, Jean-Yves Ramel, & Josep Llados. (2013). Multilevel Analysis of Attributed Graphs for Explicit Graph Embedding in Vector Spaces. In Graph Embedding for Pattern Analysis (pp. 1–26). Springer New York.
Abstract: Ability to recognize patterns is among the most crucial capabilities of human beings for their survival, which enables them to employ their sophisticated neural and cognitive systems [1], for processing complex audio, visual, smell, touch, and taste signals. Man is the most complex and the best existing system of pattern recognition. Without any explicit thinking, we continuously compare, classify, and identify huge amount of signal data everyday [2], starting from the time we get up in the morning till the last second we fall asleep. This includes recognizing the face of a friend in a crowd, a spoken word embedded in noise, the proper key to lock the door, smell of coffee, the voice of a favorite singer, the recognition of alphabetic characters, and millions of more tasks that we perform on regular basis.
|
|
|
Juan Ignacio Toledo, Sebastian Sudholt, Alicia Fornes, Jordi Cucurull, A. Fink, & Josep Llados. (2016). Handwritten Word Image Categorization with Convolutional Neural Networks and Spatial Pyramid Pooling. In Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR) (Vol. 10029, pp. 543–552). LNCS. Springer International Publishing.
Abstract: The extraction of relevant information from historical document collections is one of the key steps in order to make these documents available for access and searches. The usual approach combines transcription and grammars in order to extract semantically meaningful entities. In this paper, we describe a new method to obtain word categories directly from non-preprocessed handwritten word images. The method can be used to directly extract information, being an alternative to the transcription. Thus it can be used as a first step in any kind of syntactical analysis. The approach is based on Convolutional Neural Networks with a Spatial Pyramid Pooling layer to deal with the different shapes of the input images. We performed the experiments on a historical marriage record dataset, obtaining promising results.
Keywords: Document image analysis; Word image categorization; Convolutional neural networks; Named entity detection
|
|
|
Iiris Lusi, Sergio Escalera, & Gholamreza Anbarjafari. (2016). SASE: RGB-Depth Database for Human Head Pose Estimation. In 14th European Conference on Computer Vision Workshops.
|
|