|
Patricia Suarez, Angel Sappa, & Boris X. Vintimilla. (2018). Cross-spectral image dehaze through a dense stacked conditional GAN based approach. In 14th IEEE International Conference on Signal Image Technology & Internet Based System.
Abstract: This paper proposes a novel approach to remove haze from RGB images using a near infrared images based on a dense stacked conditional Generative Adversarial Network (CGAN). The architecture of the deep network implemented
receives, besides the images with haze, its corresponding image in the near infrared spectrum, which serve to accelerate the learning process of the details of the characteristics of the images. The model uses a triplet layer that allows the independence learning of each channel of the visible spectrum image to remove the haze on each color channel separately. A multiple loss function scheme is proposed, which ensures balanced learning between the colors
and the structure of the images. Experimental results have shown that the proposed method effectively removes the haze from the images. Additionally, the proposed approach is compared with a state of the art approach showing better results.
Keywords: Infrared imaging; Dense; Stacked CGAN; Crossspectral; Convolutional networks
|
|
|
Jorge Charco, Boris X. Vintimilla, & Angel Sappa. (2018). Deep learning based camera pose estimation in multi-view environment. In 14th IEEE International Conference on Signal Image Technology & Internet Based System.
Abstract: This paper proposes to use a deep learning network architecture for relative camera pose estimation on a multi-view environment. The proposed network is a variant architecture of AlexNet to use as regressor for prediction the relative translation and rotation as output. The proposed approach is trained from
scratch on a large data set that takes as input a pair of imagesfrom the same scene. This new architecture is compared with a previous approach using standard metrics, obtaining better results on the relative camera pose.
Keywords: Deep learning; Camera pose estimation; Multiview environment; Siamese architecture
|
|
|
Patricia Suarez, Dario Carpio, Angel Sappa, & Henry Velesaca. (2022). Transformer based Image Dehazing. In 16th IEEE International Conference on Signal Image Technology & Internet Based System.
Abstract: This paper presents a novel approach to remove non homogeneous haze from real images. The proposed method consists mainly of image feature extraction, haze removal, and image reconstruction. To accomplish this challenging task, we propose an architecture based on transformers, which have been recently introduced and have shown great potential in different computer vision tasks. Our model is based on the SwinIR an image restoration architecture based on a transformer, but by modifying the deep feature extraction module, the depth level of the model, and by applying a combined loss function that improves styling and adapts the model for the non-homogeneous haze removal present in images. The obtained results prove to be superior to those obtained by state-of-the-art models.
Keywords: atmospheric light; brightness component; computational cost; dehazing quality; haze-free image
|
|
|
Patricia Suarez, Dario Carpio, & Angel Sappa. (2023). Depth Map Estimation from a Single 2D Image. In 17th International Conference on Signal-Image Technology & Internet-Based Systems (pp. 347–353).
Abstract: This paper presents an innovative architecture based on a Cycle Generative Adversarial Network (CycleGAN) for the synthesis of high-quality depth maps from monocular images. The proposed architecture leverages a diverse set of loss functions, including cycle consistency, contrastive, identity, and least square losses, to facilitate the generation of depth maps that exhibit realism and high fidelity. A notable feature of the approach is its ability to synthesize depth maps from grayscale images without the need for paired training data. Extensive comparisons with different state-of-the-art methods show the superiority of the proposed approach in both quantitative metrics and visual quality. This work addresses the challenge of depth map synthesis and offers significant advancements in the field.
|
|
|
Rafael E. Rivadeneira, Henry Velesaca, & Angel Sappa. (2023). Object Detection in Very Low-Resolution Thermal Images through a Guided-Based Super-Resolution Approach. In 17th International Conference on Signal-Image Technology & Internet-Based Systems.
Abstract: This work proposes a novel approach that integrates super-resolution techniques with off-the-shelf object detection methods to tackle the problem of handling very low-resolution thermal images. The suggested approach begins by enhancing the low-resolution (LR) thermal images through a guided super-resolution strategy, leveraging a high-resolution (HR) visible spectrum image. Subsequently, object detection is performed on the high-resolution thermal image. The experimental results demonstrate tremendous improvements in comparison with both scenarios: when object detection is performed on the LR thermal image alone, as well as when object detection is conducted on the up-sampled LR thermal image. Moreover, the proposed approach proves highly valuable in camouflaged scenarios where objects might remain undetected in visible spectrum images.
|
|
|
Patricia Suarez, Dario Carpio, & Angel Sappa. (2023). Boosting Guided Super-Resolution Performance with Synthesized Images. In 17th International Conference on Signal-Image Technology & Internet-Based Systems (pp. 189–195).
Abstract: Guided image processing techniques are widely used for extracting information from a guiding image to aid in the processing of the guided one. These images may be sourced from different modalities, such as 2D and 3D, or different spectral bands, like visible and infrared. In the case of guided cross-spectral super-resolution, features from the two modal images are extracted and efficiently merged to migrate guidance information from one image, usually high-resolution (HR), toward the guided one, usually low-resolution (LR). Different approaches have been recently proposed focusing on the development of architectures for feature extraction and merging in the cross-spectral domains, but none of them care about the different nature of the given images. This paper focuses on the specific problem of guided thermal image super-resolution, where an LR thermal image is enhanced by an HR visible spectrum image. To improve existing guided super-resolution techniques, a novel scheme is proposed that maps the original guiding information to a thermal image-like representation that is similar to the output. Experimental results evaluating five different approaches demonstrate that the best results are achieved when the guiding and guided images share the same domain.
|
|
|
Antonio Esteban Lansaque, Carles Sanchez, Agnes Borras, Marta Diez-Ferrer, Antoni Rosell, & Debora Gil. (2016). Stable Airway Center Tracking for Bronchoscopic Navigation. In 28th Conference of the international Society for Medical Innovation and Technology.
Abstract: Bronchoscopists use X‐ray fluoroscopy to guide bronchoscopes to the lesion to be biopsied without any kind of incisions. Reducing exposure to X‐ray is important for both patients and doctors but alternatives like electromagnetic navigation require specific equipment and increase the cost of the clinical procedure. We propose a guiding system based on the extraction of airway centers from intra‐operative videos. Such anatomical landmarks could be
matched to the airway centerline extracted from a pre‐planned CT to indicate the best path to the lesion. We present an extraction of lumen centers
from intra‐operative videos based on tracking of maximal stable regions of energy maps.
|
|
|
Carles Sanchez, Debora Gil, T. Gache, N. Koufos, Marta Diez-Ferrer, & Antoni Rosell. (2016). SENSA: a System for Endoscopic Stenosis Assessment. In 28th Conference of the international Society for Medical Innovation and Technology.
Abstract: Documenting the severity of a static or dynamic Central Airway Obstruction (CAO) is crucial to establish proper diagnosis and treatment, predict possible treatment effects and better follow-up the patients. The subjective visual evaluation of a stenosis during video-bronchoscopy still remains the most common way to assess a CAO in spite of a consensus among experts for a need to standardize all calculations [1].
The Computer Vision Center in cooperation with the «Hospital de Bellvitge», has developed a System for Endoscopic Stenosis Assessment (SENSA), which computes CAO directly by analyzing standard bronchoscopic data without the need of using other imaging tecnologies.
|
|
|
Vishwesh Pillai, Pranav Mehar, Manisha Das, Deep Gupta, & Petia Radeva. (2022). Integrated Hierarchical and Flat Classifiers for Food Image Classification using Epistemic Uncertainty. In IEEE International Conference on Signal Processing and Communications.
Abstract: The problem of food image recognition is an essential one in today’s context because health conditions such as diabetes, obesity, and heart disease require constant monitoring of a person’s diet. To automate this process, several models are available to recognize food images. Due to a considerable number of unique food dishes and various cuisines, a traditional flat classifier ceases to perform well. To address this issue, prediction schemes consisting of both flat and hierarchical classifiers, with the analysis of epistemic uncertainty are used to switch between the classifiers. However, the accuracy of the predictions made using epistemic uncertainty data remains considerably low. Therefore, this paper presents a prediction scheme using three different threshold criteria that helps to increase the accuracy of epistemic uncertainty predictions. The performance of the proposed method is demonstrated using several experiments performed on the MAFood-121 dataset. The experimental results validate the proposal performance and show that the proposed threshold criteria help to increase the overall accuracy of the predictions by correctly classifying the uncertainty distribution of the samples.
|
|
|
Debora Gil, Jaume Garcia, Aura Hernandez-Sabate, & Enric Marti. (2010). Manifold parametrization of the left ventricle for a statistical modelling of its complete anatomy. In 8th Medical Imaging (Vol. 7623, 304). SPIE.
Abstract: Distortion of Left Ventricle (LV) external anatomy is related to some dysfunctions, such as hypertrophy. The architecture of myocardial fibers determines LV electromechanical activation patterns as well as mechanics. Thus, their joined modelling would allow the design of specific interventions (such as peacemaker implantation and LV remodelling) and therapies (such as resynchronization). On one hand, accurate modelling of external anatomy requires either a dense sampling or a continuous infinite dimensional approach, which requires non-Euclidean statistics. On the other hand, computation of fiber models requires statistics on Riemannian spaces. Most approaches compute separate statistical models for external anatomy and fibers architecture. In this work we propose a general mathematical framework based on differential geometry concepts for computing a statistical model including, both, external and fiber anatomy. Our framework provides a continuous approach to external anatomy supporting standard statistics. We also provide a straightforward formula for the computation of the Riemannian fiber statistics. We have applied our methodology to the computation of complete anatomical atlas of canine hearts from diffusion tensor studies. The orientation of fibers over the average external geometry agrees with the segmental description of orientations reported in the literature.
|
|
|
Hongxing Gao, Marçal Rusiñol, Dimosthenis Karatzas, & Josep Llados. (2014). Fast Structural Matching for Document Image Retrieval through Spatial Databases. In Document Recognition and Retrieval XXI (Vol. 9021).
Abstract: The structure of document images plays a signicant role in document analysis thus considerable eorts have been made towards extracting and understanding document structure, usually in the form of layout analysis approaches. In this paper, we rst employ Distance Transform based MSER (DTMSER) to eciently extract stable document structural elements in terms of a dendrogram of key-regions. Then a fast structural matching method is proposed to query the structure of document (dendrogram) based on a spatial database which facilitates the formulation of advanced spatial queries. The experiments demonstrate a signicant improvement in a document retrieval scenario when compared to the use of typical Bag of Words (BoW) and pyramidal BoW descriptors.
Keywords: Document image retrieval; distance transform; MSER; spatial database
|
|
|
Sergio Vera, Debora Gil, & Miguel Angel Gonzalez Ballester. (2014). Anatomical parameterization for volumetric meshing of the liver. In SPIE – Medical Imaging (Vol. 9036).
Abstract: A coordinate system describing the interior of organs is a powerful tool for a systematic localization of injured tissue. If the same coordinate values are assigned to specific anatomical landmarks, the coordinate system allows integration of data across different medical image modalities. Harmonic mappings have been used to produce parametric coordinate systems over the surface of anatomical shapes, given their flexibility to set values
at specific locations through boundary conditions. However, most of the existing implementations in medical imaging restrict to either anatomical surfaces, or the depth coordinate with boundary conditions is given at sites
of limited geometric diversity. In this paper we present a method for anatomical volumetric parameterization that extends current harmonic parameterizations to the interior anatomy using information provided by the
volume medial surface. We have applied the methodology to define a common reference system for the liver shape and functional anatomy. This reference system sets a solid base for creating anatomical models of the patient’s liver, and allows comparing livers from several patients in a common framework of reference.
Keywords: Coordinate System; Anatomy Modeling; Parameterization
|
|
|
Miquel Ferrer, Ernest Valveny, F. Serratosa, & Horst Bunke. (2008). Exact Median Graph Computation via Graph Embedding. In 12th International Workshop on Structural and Syntactic Pattern Recognition (Vol. 5324, 15–24). LNCS.
|
|
|
Muhammad Muzzamil Luqman, Jean-Yves Ramel, & Josep Llados. (2012). Improving Fuzzy Multilevel Graph Embedding through Feature Selection Technique. In Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop (Vol. 7626, pp. 243–253). LNCS. Springer Berlin Heidelberg.
Abstract: Graphs are the most powerful, expressive and convenient data structures but there is a lack of efficient computational tools and algorithms for processing them. The embedding of graphs into numeric vector spaces permits them to access the state-of-the-art computational efficient statistical models and tools. In this paper we take forward our work on explicit graph embedding and present an improvement to our earlier proposed method, named “fuzzy multilevel graph embedding – FMGE”, through feature selection technique. FMGE achieves the embedding of attributed graphs into low dimensional vector spaces by performing a multilevel analysis of graphs and extracting a set of global, structural and elementary level features. Feature selection permits FMGE to select the subset of most discriminating features and to discard the confusing ones for underlying graph dataset. Experimental results for graph classification experimentation on IAM letter, GREC and fingerprint graph databases, show improvement in the performance of FMGE.
|
|
|
Volkmar Frinken, Alicia Fornes, Josep Llados, & Jean-Marc Ogier. (2012). Bidirectional Language Model for Handwriting Recognition. In Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop (Vol. 7626, pp. 611–619). LNCS. Springer Berlin Heidelberg.
Abstract: In order to improve the results of automatically recognized handwritten text, information about the language is commonly included in the recognition process. A common approach is to represent a text line as a sequence. It is processed in one direction and the language information via n-grams is directly included in the decoding. This approach, however, only uses context on one side to estimate a word’s probability. Therefore, we propose a bidirectional recognition in this paper, using distinct forward and a backward language models. By combining decoding hypotheses from both directions, we achieve a significant increase in recognition accuracy for the off-line writer independent handwriting recognition task. Both language models are of the same type and can be estimated on the same corpus. Hence, the increase in recognition accuracy comes without any additional need for training data or language modeling complexity.
|
|