|
Iban Berganzo-Besga, Hector A. Orengo, Felipe Lumbreras, Aftab Alam, Rosie Campbell, Petrus J Gerrits, et al. (2023). Curriculum learning-based strategy for low-density archaeological mound detection from historical maps in India and Pakistan. ScR - Scientific Reports, 13, 11257.
Abstract: This paper presents two algorithms for the large-scale automatic detection and instance segmentation of potential archaeological mounds on historical maps. Historical maps present a unique source of information for the reconstruction of ancient landscapes. The last 100 years have seen unprecedented landscape modifications with the introduction and large-scale implementation of mechanised agriculture, channel-based irrigation schemes, and urban expansion to name but a few. Historical maps offer a window onto disappearing landscapes where many historical and archaeological elements that no longer exist today are depicted. The algorithms focus on the detection and shape extraction of mound features with high probability of being archaeological settlements, mounds being one of the most commonly documented archaeological features to be found in the Survey of India historical map series, although not necessarily recognised as such at the time of surveying. Mound features with high archaeological potential are most commonly depicted through hachures or contour-equivalent form-lines, therefore, an algorithm has been designed to detect each of those features. Our proposed approach addresses two of the most common issues in archaeological automated survey, the low-density of archaeological features to be detected, and the small amount of training data available. It has been applied to all types of maps available of the historic 1″ to 1-mile series, thus increasing the complexity of the detection. Moreover, the inclusion of synthetic data, along with a Curriculum Learning strategy, has allowed the algorithm to better understand what the mound features look like. Likewise, a series of filters based on topographic setting, form, and size have been applied to improve the accuracy of the models. The resulting algorithms have a recall value of 52.61% and a precision of 82.31% for the hachure mounds, and a recall value of 70.80% and a precision of 70.29% for the form-line mounds, which allowed the detection of nearly 6000 mound features over an area of 470,500 km2, the largest such approach to have ever been applied. If we restrict our focus to the maps most similar to those used in the algorithm training, we reach recall values greater than 60% and precision values greater than 90%. This approach has shown the potential to implement an adaptive algorithm that allows, after a small amount of retraining with data detected from a new map, a better general mound feature detection in the same map.
|
|
|
Xavier Soria, Angel Sappa, Patricio Humanante, & Arash Akbarinia. (2023). Dense extreme inception network for edge detection. PR - Pattern Recognition, 139, 109461.
Abstract: Edge detection is the basis of many computer vision applications. State of the art predominantly relies on deep learning with two decisive factors: dataset content and network architecture. Most of the publicly available datasets are not curated for edge detection tasks. Here, we address this limitation. First, we argue that edges, contours and boundaries, despite their overlaps, are three distinct visual features requiring separate benchmark datasets. To this end, we present a new dataset of edges. Second, we propose a novel architecture, termed Dense Extreme Inception Network for Edge Detection (DexiNed), that can be trained from scratch without any pre-trained weights. DexiNed outperforms other algorithms in the presented dataset. It also generalizes well to other datasets without any fine-tuning. The higher quality of DexiNed is also perceptually evident thanks to the sharper and finer edges it outputs.
|
|
|
Henry Velesaca, Gisel Bastidas-Guacho, Mohammad Rouhani, & Angel Sappa. (2024). Multimodal image registration techniques: a comprehensive survey. MTAP - Multimedia Tools and Applications, .
Abstract: This manuscript presents a review of state-of-the-art techniques proposed in the literature for multimodal image registration, addressing instances where images from different modalities need to be precisely aligned in the same reference system. This scenario arises when the images to be registered come from different modalities, among the visible and thermal spectral bands, 3D-RGB, or flash-no flash, or NIR-visible. The review spans different techniques from classical approaches to more modern ones based on deep learning, aiming to highlight the particularities required at each step in the registration pipeline when dealing with multimodal images. It is noteworthy that medical images are excluded from this review due to their specific characteristics, including the use of both active and passive sensors or the non-rigid nature of the body contained in the image.
|
|
|
Patricia Suarez, Dario Carpio, & Angel Sappa. (2024). Enhancement of guided thermal image super-resolution approaches. NEUCOM - Neurocomputing, 573(127197), 1–17.
Abstract: Guided image processing techniques are widely used to extract meaningful information from a guiding image and facilitate the enhancement of the guided one. This paper specifically addresses the challenge of guided thermal image super-resolution, where a low-resolution thermal image is enhanced using a high-resolution visible spectrum image. We propose a new strategy that enhances outcomes from current guided super-resolution methods. This is achieved by transforming the initial guiding data into a representation resembling a thermal-like image, which is more closely in sync with the intended output. Experimental results with upscale factors of 8 and 16, demonstrate the outstanding performance of our approach in guided thermal image super-resolution obtained by mapping the original guiding information to a thermal-like image representation.
|
|