|
Oscar Lopes, Miguel Reyes, Sergio Escalera, & Jordi Gonzalez. (2014). Spherical Blurred Shape Model for 3-D Object and Pose Recognition: Quantitative Analysis and HCI Applications in Smart Environments. TSMCB - IEEE Transactions on Systems, Man and Cybernetics (Part B), 44(12), 2379–2390.
Abstract: The use of depth maps is of increasing interest after the advent of cheap multisensor devices based on structured light, such as Kinect. In this context, there is a strong need of powerful 3-D shape descriptors able to generate rich object representations. Although several 3-D descriptors have been already proposed in the literature, the research of discriminative and computationally efficient descriptors is still an open issue. In this paper, we propose a novel point cloud descriptor called spherical blurred shape model (SBSM) that successfully encodes the structure density and local variabilities of an object based on shape voxel distances and a neighborhood propagation strategy. The proposed SBSM is proven to be rotation and scale invariant, robust to noise and occlusions, highly discriminative for multiple categories of complex objects like the human hand, and computationally efficient since the SBSM complexity is linear to the number of object voxels. Experimental evaluation in public depth multiclass object data, 3-D facial expressions data, and a novel hand poses data sets show significant performance improvements in relation to state-of-the-art approaches. Moreover, the effectiveness of the proposal is also proved for object spotting in 3-D scenes and for real-time automatic hand pose recognition in human computer interaction scenarios.
|
|
|
Parichehr Behjati Ardakani, Pau Rodriguez, Carles Fernandez, Armin Mehri, Xavier Roca, Seiichi Ozawa, et al. (2022). Frequency-based Enhancement Network for Efficient Super-Resolution. ACCESS - IEEE Access, 10, 57383–57397.
Abstract: Recently, deep convolutional neural networks (CNNs) have provided outstanding performance in single image super-resolution (SISR). Despite their remarkable performance, the lack of high-frequency information in the recovered images remains a core problem. Moreover, as the networks increase in depth and width, deep CNN-based SR methods are faced with the challenge of computational complexity in practice. A promising and under-explored solution is to adapt the amount of compute based on the different frequency bands of the input. To this end, we present a novel Frequency-based Enhancement Block (FEB) which explicitly enhances the information of high frequencies while forwarding low-frequencies to the output. In particular, this block efficiently decomposes features into low- and high-frequency and assigns more computation to high-frequency ones. Thus, it can help the network generate more discriminative representations by explicitly recovering finer details. Our FEB design is simple and generic and can be used as a direct replacement of commonly used SR blocks with no need to change network architectures. We experimentally show that when replacing SR blocks with FEB we consistently improve the reconstruction error, while reducing the number of parameters in the model. Moreover, we propose a lightweight SR model — Frequency-based Enhancement Network (FENet) — based on FEB that matches the performance of larger models. Extensive experiments demonstrate that our proposal performs favorably against the state-of-the-art SR algorithms in terms of visual quality, memory footprint, and inference time. The code is available at https://github.com/pbehjatii/FENet
Keywords: Deep learning; Frequency-based methods; Lightweight architectures; Single image super-resolution
|
|
|
A. Toet, M. Henselmans, M.P. Lucassen, & Theo Gevers. (2011). Emotional effects of dynamic textures. iPER - i-Perception, 969 – 991.
Abstract: This study explores the effects of various spatiotemporal dynamic texture characteristics on human emotions. The emotional experience of auditory (eg, music) and haptic repetitive patterns has been studied extensively. In contrast, the emotional experience of visual dynamic textures is still largely unknown, despite their natural ubiquity and increasing use in digital media. Participants watched a set of dynamic textures, representing either water or various different media, and self-reported their emotional experience. Motion complexity was found to have mildly relaxing and nondominant effects. In contrast, motion change complexity was found to be arousing and dominant. The speed of dynamics had arousing, dominant, and unpleasant effects. The amplitude of dynamics was also regarded as unpleasant. The regularity of the dynamics over the textures’ area was found to be uninteresting, nondominant, mildly relaxing, and mildly pleasant. The spatial scale of the dynamics had an unpleasant, arousing, and dominant effect, which was larger for textures with diverse content than for water textures. For water textures, the effects of spatial contrast were arousing, dominant, interesting, and mildly unpleasant. None of these effects were observed for textures of diverse content. The current findings are relevant for the design and synthesis of affective multimedia content and for affective scene indexing and retrieval.
|
|
|
Ana Garcia Rodriguez, Yael Tudela, Henry Cordova, S. Carballal, I. Ordas, L. Moreira, et al. (2022). First in Vivo Computer-Aided Diagnosis of Colorectal Polyps using White Light Endoscopy. END - Endoscopy, 54.
|
|
|
Ana Garcia Rodriguez, Yael Tudela, Henry Cordova, S. Carballal, I. Ordas, L. Moreira, et al. (2022). In vivo computer-aided diagnosis of colorectal polyps using white light endoscopy. ENDIO - Endoscopy International Open, 10(9), E1201–E1207.
Abstract: Background and study aims Artificial intelligence is currently able to accurately predict the histology of colorectal polyps. However, systems developed to date use complex optical technologies and have not been tested in vivo. The objective of this study was to evaluate the efficacy of a new deep learning-based optical diagnosis system, ATENEA, in a real clinical setting using only high-definition white light endoscopy (WLE) and to compare its performance with endoscopists. Methods ATENEA was prospectively tested in real life on consecutive polyps detected in colorectal cancer screening colonoscopies at Hospital Clínic. No images were discarded, and only WLE was used. The in vivo ATENEA's prediction (adenoma vs non-adenoma) was compared with the prediction of four staff endoscopists without specific training in optical diagnosis for the study purposes. Endoscopists were blind to the ATENEA output. Histology was the gold standard. Results Ninety polyps (median size: 5 mm, range: 2-25) from 31 patients were included of which 69 (76.7 %) were adenomas. ATENEA correctly predicted the histology in 63 of 69 (91.3 %, 95 % CI: 82 %-97 %) adenomas and 12 of 21 (57.1 %, 95 % CI: 34 %-78 %) non-adenomas while endoscopists made correct predictions in 52 of 69 (75.4 %, 95 % CI: 60 %-85 %) and 20 of 21 (95.2 %, 95 % CI: 76 %-100 %), respectively. The global accuracy was 83.3 % (95 % CI: 74%-90 %) and 80 % (95 % CI: 70 %-88 %) for ATENEA and endoscopists, respectively. Conclusion ATENEA can accurately be used for in vivo characterization of colorectal polyps, enabling the endoscopist to make direct decisions. ATENEA showed a global accuracy similar to that of endoscopists despite an unsatisfactory performance for non-adenomatous lesions.
|
|