|
Andres Traumann, Gholamreza Anbarjafari, & Sergio Escalera. (2015). Accurate 3D Measurement Using Optical Depth Information. EL - Electronics Letters, 51(18), 1420–1422.
Abstract: A novel three-dimensional measurement technique is proposed. The methodology consists in mapping from the screen coordinates reported by the optical camera to the real world, and integrating distance gradients from the beginning to the end point, while also minimising the error through fitting pixel locations to a smooth curve. The results demonstrate accuracy of less than half a centimetre using Microsoft Kinect II.
|
|
|
Fosca De Iorio, C. Malagelada, Fernando Azpiroz, M. Maluenda, C. Violanti, Laura Igual, et al. (2009). Intestinal motor activity, endoluminal motion and transit. NEUMOT - Neurogastroenterology & Motility, 21(12), 1264–e119.
Abstract: A programme for evaluation of intestinal motility has been recently developed based on endoluminal image analysis using computer vision methodology and machine learning techniques. Our aim was to determine the effect of intestinal muscle inhibition on wall motion, dynamics of luminal content and transit in the small bowel. Fourteen healthy subjects ingested the endoscopic capsule (Pillcam, Given Imaging) in fasting conditions. Seven of them received glucagon (4.8 µg kg⁻¹ bolus followed by a 9.6 µg kg⁻¹ h⁻¹ infusion during 1 h) and in the other seven, fasting activity was recorded, as controls. This dose of glucagon has previously been shown to inhibit both tonic and phasic intestinal motor activity. Endoluminal image and displacement were analyzed by means of a computer vision programme specifically developed for the evaluation of muscular activity (contractile and non-contractile patterns), intestinal contents, endoluminal motion and transit. Thirty-minute periods before, during and after glucagon infusion were analyzed and compared with equivalent periods in controls. No differences were found in the parameters measured during the baseline (pretest) periods when comparing glucagon and control experiments. During glucagon infusion, there was a significant reduction in contractile activity (0.2 ± 0.1 vs 4.2 ± 0.9 luminal closures per min, P < 0.05; 0.4 ± 0.1 vs 3.4 ± 1.2% of images with radial wrinkles, P < 0.05) and a significant reduction of endoluminal motion (82 ± 9 vs 21 ± 10% of static images, P < 0.05). Endoluminal image analysis, by means of computer vision and machine learning techniques, can reliably detect reduced intestinal muscle activity and motion.
|
|
|
Aura Hernandez-Sabate, Meritxell Joanpere, Nuria Gorgorio, & Lluis Albarracin. (2015). Mathematics learning opportunities when playing a Tower Defense Game. IJSG - International Journal of Serious Games, 57–71.
Abstract: A qualitative research study is presented herein with the purpose of identifying mathematics learning opportunities in students between 10 and 12 years old while playing a commercial version of a Tower Defense game. These learning opportunities are understood as mathematicisable moments of the game and involve the establishment of relationships between the game and mathematical problem solving. Based on the analysis of these mathematicisable moments, we conclude that the game can promote problem-solving processes and learning opportunities that can be associated with different mathematical contents that appear in mathematics curricula, though it seems that a teacher or new game elements might be needed to facilitate the processes.
Keywords: Tower Defense game; learning opportunities; mathematics; problem solving; game design
|
|
|
J. Martinez, Eva Costa, P. Herreros, Antonio Lopez, & Juan J. Villanueva. (2003). TV-Screen Quality Inspection by Artificial Vision.
Abstract: A real-time vision system for TV screen quality inspection is introduced. The whole system consists of eight cameras and one processor per camera. It acquires and processes 112 images in 6 seconds. The defects to be inspected can be grouped into four main categories (bubble, line-out, line reduction and landing) although there exists a large variability among each particular type of defect. The complexity of the whole inspection process has been reduced by dividing images into smaller ones and grouping the defects into frequency and intensity relevant ones. Tools such as mathematical morphology, Fourier transform, profile analysis and classification have been used. The performance of the system has been successfully proved against human operators in normal production conditions.
|
|
|
Shun Yao, Fei Yang, Yongmei Cheng, & Mikhail Mozerov. (2021). 3D Shapes Local Geometry Codes Learning with SDF. In International Conference on Computer Vision Workshops (pp. 2110–2117).
Abstract: A signed distance function (SDF) as the 3D shape description is one of the most effective approaches to represent 3D geometry for rendering and reconstruction. Our work is inspired by the state-of-the-art method DeepSDF [17], which learns and analyzes the 3D shape as the iso-surface of its shell and has shown promising results, especially in the 3D shape reconstruction and compression domain. In this paper, we consider the degeneration problem of reconstruction coming from the capacity decrease of the DeepSDF model, which approximates the SDF with a neural network and a single latent code. We propose Local Geometry Code Learning (LGCL), a model that improves the original DeepSDF results by learning from the local shape geometry of the full 3D shape. We add an extra graph neural network to split the single transmittable latent code into a set of local latent codes distributed on the 3D shape. These latent codes are used to approximate the SDF in their local regions, which alleviates the complexity of the approximation compared to the original DeepSDF. Furthermore, we introduce a new geometric loss function to facilitate the training of these local latent codes. Note that other local shape adjusting methods use the 3D voxel representation, which in turn is a problem that is highly difficult or even impossible to solve. In contrast, our architecture is based on graph processing implicitly and performs the learning regression process directly in the latent code space, thus making the proposed architecture more flexible and also simpler to realize. Our experiments on 3D shape reconstruction demonstrate that our LGCL method can keep more details with a significantly smaller SDF decoder and considerably outperforms the original DeepSDF method under the most important quantitative metrics.
|
|
|
Senmao Li, Joost van de Weijer, Taihang Hu, Fahad Shahbaz Khan, Qibin Hou, Yaxing Wang, et al. (2023). StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing.
Abstract: A significant research effort is focused on exploiting the amazing capacities of pretrained diffusion models for the editing of images. Existing methods either finetune the model or invert the image in the latent space of the pretrained model. However, they suffer from two problems: (1) Unsatisfying results for selected regions, and unexpected changes in nonselected regions. (2) They require careful text prompt editing where the prompt should include all visual objects in the input image. To address this, we propose two improvements: (1) Only optimizing the input of the value linear network in the cross-attention layers is sufficiently powerful to reconstruct a real image. (2) We propose attention regularization to preserve the object-like attention maps after editing, enabling us to obtain accurate style editing without invoking significant structural changes. We further improve the editing technique which is used for the unconditional branch of classifier-free guidance, as well as the conditional one as used by P2P. Extensive experimental prompt-editing results on a variety of images demonstrate qualitatively and quantitatively that our method has editing capabilities superior to those of existing and concurrent works.
|
|
|
Abel Gonzalez-Garcia, Robert Benavente, Olivier Penacchio, Javier Vazquez, Maria Vanrell, & C. Alejandro Parraga. (2013). Coloresia: An Interactive Colour Perception Device for the Visually Impaired. In Multimodal Interaction in Image and Video Applications (Vol. 48, pp. 47–66). Springer Berlin Heidelberg.
Abstract: A significant percentage of the human population suffers from impairments in their capacity to distinguish or even see colours. For them, everyday tasks like navigating through a train or metro network map become demanding. We present a novel technique for extracting colour information from everyday natural stimuli and presenting it to visually impaired users as pleasant, non-invasive sound. This technique was implemented inside a Personal Digital Assistant (PDA) portable device. In this implementation, colour information is extracted from the input image and categorised according to how human observers segment the colour space. This information is subsequently converted into sound and sent to the user via speakers or headphones. In the original implementation, it is possible for the user to send feedback to reconfigure the system; however, several features such as this were not implemented because the current technology is limited. We are confident that the full implementation will be possible in the near future as PDA technology improves.
|
|
|
Sergio Escalera, Oriol Pujol, & Petia Radeva. (2010). Re-coding ECOCs without retraining. PRL - Pattern Recognition Letters, 31(7), 555–562.
Abstract: A standard way to deal with multi-class categorization problems is by the combination of binary classifiers in a pairwise voting procedure. Recently, this classical approach has been formalized in the Error-Correcting Output Codes (ECOC) framework. In the ECOC framework, the one-versus-one coding has been shown to achieve higher performance than the rest of the coding designs. The binary problems that we train in the one-versus-one strategy are significantly smaller than in the rest of the designs, and usually easier to learn, given the smaller overlap between classes. However, a high percentage of the positions of the coding matrix are coded by zero, which implies a high degree of sparseness, and these positions do not codify meta-class membership information. In this paper, we show that using the training data we can redefine without re-training, in a problem-dependent way, the one-versus-one coding matrix so that the new coded information helps the system to increase its generalization capability. Moreover, the new re-coding strategy is generalized to be applied over any binary code. The results over several UCI Machine Learning repository data sets and two real multi-class problems show that performance improvements can be obtained by re-coding the classical one-versus-one and Sparse random designs compared to different state-of-the-art ECOC configurations.
|
|
|
Josep Llados, Horst Bunke, & Enric Marti. (1996). Structural Recognition of hand drawn floor plans. In VI National Symposium on Pattern Recognition and Image Analysis. Cordoba.
Abstract: A system to recognize hand drawn architectural drawings in a CAD environment has been developed. In this paper we focus on its high level interpretation module. To interpret a floor plan, the system must identify several building elements, whose description is stored in a library of patterns, as well as their spatial relationships. We propose a structural approach based on subgraph isomorphism techniques to obtain a high-level interpretation of the document. The vectorized input document and the patterns to be recognized are represented by attributed graphs. Discrete relaxation techniques (AC4 algorithm) have been applied to develop the matching algorithm. The process has been divided into three steps: node labeling, local consistency and global consistency verification. Hand drawing produces distorted line drawings with several accuracy errors, which must be taken into account. Here we have identified them and the AC4 algorithm has been adapted to manage them.
Keywords: Rotational Symmetry; Reflectional Symmetry; String Matching.
|
|
|
Jaume Garcia, Debora Gil, Luis Badiella, Aura Hernandez-Sabate, Francesc Carreras, Sandra Pujades, et al. (2010). A Normalized Framework for the Design of Feature Spaces Assessing the Left Ventricular Function. TMI - IEEE Transactions on Medical Imaging, 29(3), 733–745.
Abstract: A thorough description of the left ventricle functionality requires combining complementary regional scores. A main limitation is the lack of multiparametric normality models oriented to the assessment of regional wall motion abnormalities (RWMA). This paper covers two main topics involved in RWMA assessment. We propose a general framework allowing the fusion and comparison across subjects of different regional scores. Our framework is used to explore which combination of regional scores (including 2-D motion and strains) is better suited for RWMA detection. Our statistical analysis indicates that for a proper (within interobserver variability) identification of RWMA, models should consider motion and extreme strains.
|
|
|
Antonio Lopez, Atsushi Imiya, Tomas Pajdla, & Jose Manuel Alvarez. Computer Vision in Vehicle Technology: Land, Sea & Air.
Abstract: A unified view of the use of computer vision technology for different types of vehicles
Computer Vision in Vehicle Technology focuses on computer vision as on-board technology, bringing together fields of research where computer vision is progressively penetrating: the automotive sector, unmanned aerial and underwater vehicles. It also serves as a reference for researchers of current developments and challenges in areas of the application of computer vision, involving vehicles such as advanced driver assistance (pedestrian detection, lane departure warning, traffic sign recognition), autonomous driving and robot navigation (with visual simultaneous localization and mapping) or unmanned aerial vehicles (obstacle avoidance, landscape classification and mapping, fire risk assessment).
The overall role of computer vision for the navigation of different vehicles, as well as technology to address on-board applications, is analysed.
|
|
|
Aura Hernandez-Sabate, Debora Gil, & Petia Radeva. (2005). A Deterministic-Statistical Strategy for Adventitia Segmentation in IVUS images.
Abstract: A useful tool for some specific studies in cardiac disease diagnosis is vessel plaque assessment by analysis of IVUS sequences. Manual detection of luminal (inner) and media-adventitia (external) vessel borders is the main activity of physicians in the process of lumen narrowing (plaque) quantification. The difficult definition of vessel border descriptors, as well as shades, artifacts and blurred signal response due to ultrasound physical properties, trouble automated adventitia segmentation. In order to efficiently approach such a complex problem, we propose blending advanced anisotropic filtering operators and statistical classification techniques into a vessel border modelling strategy. Our systematic statistical analysis shows that the reported adventitia detection achieves an accuracy in the range of inter-observer variability regardless of plaque nature, vessel geometry and incomplete vessel borders.
|
|
|
Aura Hernandez-Sabate. (2005). Automatic adventitia segmentation in IntraVascular UltraSound images. Master's thesis, 08193 Bellaterra, Barcelona (Spain).
Abstract: A usual tool in cardiac disease diagnosis is vessel plaque assessment by analysis of IVUS sequences. Manual detection of lumen-intima, intima-media and media-adventitia vessel borders is the main activity of physicians in the process of plaque quantification. The large variety in vessel border descriptors, as well as shades, artifacts and blurred response due to ultrasound physical properties, trouble automated media-adventitia segmentation. This experimental work presents a solution to such a complex problem. The process blends advanced anisotropic filtering operators and statistical classification techniques, achieving an efficient vessel border modelling strategy. First of all, we introduce the theoretical basis of the method. After that, we show the steps of the algorithm, validating the method with statistics that show that the media-adventitia border detection achieves an accuracy in the range of inter-observer variability regardless of plaque nature, vessel geometry and incomplete vessel borders. Finally, we present a small Matlab application for automatic media-adventitia border detection.
|
|
|
Fadi Dornaika, A. Assoum, & Bogdan Raducanu. (2012). Automatic Dimensionality Estimation for Manifold Learning through Optimal Feature Selection. In Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop (Vol. 7626, pp. 575–583). LNCS. Springer Berlin Heidelberg.
Abstract: A very important aspect in manifold learning is represented by automatic estimation of the intrinsic dimensionality. Unfortunately, this problem has received little attention in the literature of manifold learning. In this paper, we argue that the feature selection paradigm can be applied to the problem of automatic dimensionality estimation. Besides this, it also leads to improved recognition rates. Our approach for optimal feature selection is based on a Genetic Algorithm. As a case study for manifold learning, we have considered Laplacian Eigenmaps (LE) and Locally Linear Embedding (LLE). The effectiveness of the proposed framework was tested on the face recognition problem. Extensive experiments carried out on ORL, UMIST, Yale, and Extended Yale face data sets confirmed our hypothesis.
|
|
|
Muhammad Muzzamil Luqman, Jean-Yves Ramel, & Josep Llados. (2013). Multilevel Analysis of Attributed Graphs for Explicit Graph Embedding in Vector Spaces. In Graph Embedding for Pattern Analysis (pp. 1–26). Springer New York.
Abstract: The ability to recognize patterns is among the most crucial capabilities of human beings for their survival, which enables them to employ their sophisticated neural and cognitive systems [1], for processing complex audio, visual, smell, touch, and taste signals. Man is the most complex and the best existing system of pattern recognition. Without any explicit thinking, we continuously compare, classify, and identify huge amounts of signal data every day [2], starting from the time we get up in the morning till the last second before we fall asleep. This includes recognizing the face of a friend in a crowd, a spoken word embedded in noise, the proper key to lock the door, the smell of coffee, the voice of a favorite singer, the recognition of alphabetic characters, and millions more tasks that we perform on a regular basis.
|
|