|
Debora Gil, Sergio Vera, Agnes Borras, Albert Andaluz, & Miguel Angel Gonzalez Ballester. (2017). Anatomical Medial Surfaces with Efficient Resolution of Branches Singularities. MIA - Medical Image Analysis, 35, 390–402.
Abstract: Medial surfaces are powerful tools for shape description, but their use has been limited due to the sensitivity of existing methods to branching artifacts. Medial branching artifacts are associated with perturbations of the object boundary rather than with geometric features. Such instability is a main obstacle for a confident application in shape recognition and description. Medial branches correspond to singularities of the medial surface and, thus, they are problematic for existing morphological and energy-based algorithms. In this paper, we use algebraic geometry concepts in an energy-based approach to compute a medial surface presenting a stable branching topology. We also present an efficient GPU-CPU implementation using standard image processing tools. We show the method's computational efficiency and quality on a custom-made synthetic database. Finally, we present some results on a medical imaging application for localization of abdominal pathologies.
Keywords: Medial Representations; Shape Recognition; Medial Branching Stability; Singular Points
|
|
|
Dena Bazazian, Raul Gomez, Anguelos Nicolaou, Lluis Gomez, Dimosthenis Karatzas, & Andrew Bagdanov. (2019). FAST: Facilitated and accurate scene text proposals through FCN guided pruning. PRL - Pattern Recognition Letters, 119, 112–120.
Abstract: Class-specific text proposal algorithms can efficiently reduce the search space for possible text object locations in an image. In this paper we combine the Text Proposals algorithm with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same recall level and thus gaining a significant speed up. Our experiments demonstrate that such text proposal approaches yield significantly higher recall rates than state-of-the-art text localization techniques, while also producing better-quality localizations. Our results on the ICDAR 2015 Robust Reading Competition (Challenge 4) and the COCO-text datasets show that, when combined with strong word classifiers, this recall margin leads to state-of-the-art results in end-to-end scene text recognition.
|
|
|
Diana Ramirez Cifuentes, Ana Freire, Ricardo Baeza Yates, Joaquim Punti Vidal, Pilar Medina Bravo, Diego Velazquez, et al. (2020). Detection of Suicidal Ideation on Social Media: Multimodal, Relational, and Behavioral Analysis. JMIR - Journal of Medical Internet Research, 22(7), e17758.
Abstract: Background:
Suicide risk assessment usually involves an interaction between doctors and patients. However, a significant number of people with mental disorders receive no treatment for their condition due to the limited access to mental health care facilities; the reduced availability of clinicians; the lack of awareness; and stigma, neglect, and discrimination surrounding mental disorders. In contrast, internet access and social media usage have increased significantly, providing experts and patients with a means of communication that may contribute to the development of methods to detect mental health issues among social media users.
Objective:
This paper aimed to describe an approach for the suicide risk assessment of Spanish-speaking users on social media. We aimed to explore behavioral, relational, and multimodal data extracted from multiple social platforms and develop machine learning models to detect users at risk.
Methods:
We characterized users based on their writings, posting patterns, relations with other users, and images posted. We also evaluated statistical and deep learning approaches to handle multimodal data for the detection of users with signs of suicidal ideation (suicidal ideation risk group). Our methods were evaluated over a dataset of 252 users annotated by clinicians. To evaluate the performance of our models, we distinguished 2 control groups: users who make use of suicide-related vocabulary (focused control group) and generic random users (generic control group).
Results:
We identified significant statistical differences between the textual and behavioral attributes of each of the control groups compared with the suicidal ideation risk group. At a 95% CI, when comparing the suicidal ideation risk group and the focused control group, the number of friends (P=.04) and median tweet length (P=.04) were significantly different. The median number of friends for a focused control user (median 578.5) was higher than that for a user at risk (median 372.0). Similarly, the median tweet length was higher for focused control users, with 16 words against 13 words of suicidal ideation risk users. Our findings also show that the combination of textual, visual, relational, and behavioral data outperforms the accuracy of using each modality separately. We defined text-based baseline models based on bag of words and word embeddings, which were outperformed by our models, obtaining an increase in accuracy of up to 8% when distinguishing users at risk from both types of control users.
Conclusions:
The types of attributes analyzed are significant for detecting users at risk, and their combination outperforms the results provided by generic, exclusively text-based baseline models. After evaluating the contribution of image-based predictive models, we believe that our results can be improved by enhancing the models based on textual and relational features. These methods can be extended and applied to different use cases related to other mental disorders.
|
|
|
Diana Ramirez Cifuentes, Ana Freire, Ricardo Baeza Yates, Nadia Sanz Lamora, Aida Alvarez, Alexandre Gonzalez, et al. (2021). Characterization of Anorexia Nervosa on Social Media: Textual, Visual, Relational, Behavioral, and Demographical Analysis. JMIR - Journal of Medical Internet Research, 23(7), e25925.
Abstract: Background: Eating disorders are psychological conditions characterized by unhealthy eating habits. Anorexia nervosa (AN) is defined as the belief of being overweight despite being dangerously underweight. The psychological signs involve emotional and behavioral issues. There is evidence that signs and symptoms can manifest on social media, wherein both harmful and beneficial content is shared daily.
|
|
|
Diego Velazquez, Josep M. Gonfaus, Pau Rodriguez, Xavier Roca, Seiichi Ozawa, & Jordi Gonzalez. (2021). Logo Detection With No Priors. ACCESS - IEEE Access, 9, 106998–107011.
Abstract: In recent years, top-cited object detection methods such as R-CNN have implemented this task as a combination of proposal region generation and supervised classification on the proposed bounding boxes. Although this pipeline has achieved state-of-the-art results on multiple datasets, it has inherent limitations that make object detection a very complex and computationally inefficient task. Instead of following this standard strategy, in this paper we enhance Detection Transformers (DETR), which tackle object detection as a set-prediction problem directly in an end-to-end, fully differentiable pipeline without requiring priors. In particular, we incorporate Feature Pyramids (FP) into the DETR architecture and demonstrate the effectiveness of the resulting DETR-FP approach in improving logo detection results thanks to the improved detection of small logos. Thus, without requiring any domain-specific prior to be fed to the model, DETR-FP obtains competitive results on the OpenLogo and MS-COCO datasets, offering a relative improvement of up to 30% when compared to a Faster R-CNN baseline, which strongly depends on hand-designed priors.
|
|
|
Diego Velazquez, Pau Rodriguez, Josep M. Gonfaus, Xavier Roca, & Jordi Gonzalez. (2022). A Closer Look at Embedding Propagation for Manifold Smoothing. JMLR - Journal of Machine Learning Research, 23(252), 1–27.
Abstract: Supervised training of neural networks requires a large amount of manually annotated data and the resulting networks tend to be sensitive to out-of-distribution (OOD) data. Self- and semi-supervised training schemes reduce the amount of annotated data required during the training process. However, OOD generalization remains a major challenge for most methods. Strategies that promote smoother decision boundaries play an important role in out-of-distribution generalization. For example, embedding propagation (EP) for manifold smoothing has recently been shown to considerably improve the OOD performance for few-shot classification. EP achieves smoother class manifolds by building a graph from sample embeddings and propagating information through the nodes in an unsupervised manner. In this work, we extend the original EP paper providing additional evidence and experiments showing that it attains smoother class embedding manifolds and improves results in settings beyond few-shot classification. Concretely, we show that EP improves the robustness of neural networks against multiple adversarial attacks as well as semi- and self-supervised learning performance.
Keywords: Regularization; semi-supervised learning; self-supervised learning; adversarial robustness; few-shot classification
|
|
|
Domicele Jonauskaite, Lucia Camenzind, C. Alejandro Parraga, Cecile N Diouf, Mathieu Mercapide Ducommun, Lauriane Müller, et al. (2021). Colour-emotion associations in individuals with red-green colour blindness. PeerJ, 9, e11180.
Abstract: Colours and emotions are associated in languages and traditions. Some of us may convey sadness by saying feeling blue or by wearing black clothes at funerals. The first example is a conceptual experience of colour and the second example is an immediate perceptual experience of colour. To investigate whether one or the other type of experience more strongly drives colour-emotion associations, we tested 64 congenitally red-green colour-blind men and 66 non-colour-blind men. All participants associated 12 colours, presented as terms or patches, with 20 emotion concepts, and rated intensities of the associated emotions. We found that colour-blind and non-colour-blind men associated similar emotions with colours, irrespective of whether colours were conveyed via terms (r = .82) or patches (r = .80). The colour-emotion associations and the emotion intensities were not modulated by participants' severity of colour blindness. Hinting at some additional, although minor, role of actual colour perception, the consistencies in associations for colour terms and patches were higher in non-colour-blind than colour-blind men. Together, these results suggest that colour-emotion associations in adults do not require immediate perceptual colour experiences, as conceptual experiences are sufficient.
Keywords: Affect; Chromotherapy; Colour cognition; Colour vision deficiency; Cross-modal correspondences; Daltonism; Deuteranopia; Dichromatic; Emotion; Protanopia.
|
|
|
Domicele Jonauskaite, Nele Dael, C. Alejandro Parraga, Laetitia Chevre, Alejandro Garcia Sanchez, & Christine Mohr. (2018). Stripping #The Dress: The importance of contextual information on inter-individual differences in colour perception. PSYCHO R - Psychological Research, , 1–15.
Abstract: In 2015, a picture of a Dress (henceforth the Dress) triggered popular and scientific interest; some reported seeing the Dress in white and gold (W&G) and others in blue and black (B&B). We aimed to describe the phenomenon and investigate the role of contextualization. A few days after the Dress had appeared on the Internet, we projected it to 240 students on two large screens in the classroom. Participants reported seeing the Dress in B&B (48%), W&G (38%), or blue and brown (B&Br; 7%). Amongst numerous socio-demographic variables, we only observed that W&G viewers were most likely to have always seen the Dress as W&G. In the laboratory, we tested how much contextual information is necessary for the phenomenon to occur. Fifty-seven participants selected colours most precisely matching predominant colours of parts or the full Dress. We presented, in this order, small squares (a), vertical strips (b), and the full Dress (c). We found that (1) B&B, B&Br, and W&G viewers had selected colours differing in lightness and chroma levels for contextualized images only (b, c conditions) and in hue for the fully contextualized condition only (c) and (2) B&B viewers selected colours most closely matching displayed colours of the Dress. Thus, the Dress phenomenon emerges due to inter-individual differences in subjectively perceived lightness, chroma, and hue, at least when all aspects of the picture need to be integrated. Our results support the previous conclusions that contextual information is key to colour perception; it should be important to understand how this actually happens.
|
|
|
Dorota Kaminska, Kadir Aktas, Davit Rizhinashvili, Danila Kuklyanov, Abdallah Hussein Sham, Sergio Escalera, et al. (2021). Two-stage Recognition and Beyond for Compound Facial Emotion Recognition. ELEC - Electronics, 10(22), 2847.
Abstract: Facial emotion recognition is an inherently complex problem due to individual diversity in facial features and racial and cultural differences. Moreover, facial expressions typically reflect the mixture of people’s emotional statuses, which can be expressed using compound emotions. Compound facial emotion recognition makes the problem even more difficult because the discrimination between dominant and complementary emotions is usually weak. We have created a database that includes 31,250 facial images with different emotions of 115 subjects whose gender distribution is almost uniform to address compound emotion recognition. In addition, we have organized a competition based on the proposed dataset, held at FG workshop 2020. This paper analyzes the winner’s approach—a two-stage recognition method (1st stage, coarse recognition; 2nd stage, fine recognition), which enhances the classification of symmetrical emotion labels.
Keywords: compound emotion recognition; facial expression recognition; dominant and complementary emotion recognition; deep learning
|
|
|
Eduard Vazquez, Ramon Baldrich, Joost Van de Weijer, & Maria Vanrell. (2011). Describing Reflectances for Colour Segmentation Robust to Shadows, Highlights and Textures. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(5), 917–930.
Abstract: The segmentation of a single material reflectance is a challenging problem due to the considerable variation in image measurements caused by the geometry of the object, shadows, and specularities. The combination of these effects has been modeled by the dichromatic reflection model. However, the application of the model to real-world images is limited due to unknown acquisition parameters and compression artifacts. In this paper, we present a robust model for the shape of a single material reflectance in histogram space. The method is based on a multilocal creaseness analysis of the histogram which results in a set of ridges representing the material reflectances. The segmentation method derived from these ridges is robust to both shadow, shading and specularities, and texture in real-world images. We further complete the method by incorporating prior knowledge from image statistics, and incorporate spatial coherence by using multiscale color contrast information. Results obtained show that our method clearly outperforms state-of-the-art segmentation methods on a widely used segmentation benchmark, having as a main characteristic its excellent performance in the presence of shadows and highlights at low computational cost.
|
|
|
Eduard Vazquez, Theo Gevers, M. Lucassen, Joost Van de Weijer, & Ramon Baldrich. (2010). Saliency of Color Image Derivatives: A Comparison between Computational Models and Human Perception. JOSA A - Journal of the Optical Society of America A, 27(3), 613–621.
Abstract: In this paper, computational methods are proposed to compute color edge saliency based on the information content of color edges. The computational methods are evaluated on bottom-up saliency in a psychophysical experiment, and on a more complex task of salient object detection in real-world images. The psychophysical experiment demonstrates the relevance of using information theory as a saliency processing model and that the proposed methods are significantly better in predicting color saliency (with a human-method correspondence up to 74.75% and an observer agreement of 86.8%) than state-of-the-art models. Furthermore, results from salient object detection confirm that an early fusion of color and contrast provides accurate performance to compute visual saliency with a hit rate up to 95.2%.
|
|
|
Eduardo Aguilar, Beatriz Remeseiro, Marc Bolaños, & Petia Radeva. (2018). Grab, Pay, and Eat: Semantic Food Detection for Smart Restaurants. IEEE Transactions on Multimedia, 20(12), 3266–3275.
Abstract: The increase in people's awareness of their nutritional habits has drawn considerable attention to the field of automatic food analysis. In self-service restaurant environments, automatic food analysis is not only useful for extracting nutritional information from foods selected by customers, it is also of high interest for speeding up the service by solving the bottleneck produced at the cashiers in times of high demand. In this paper, we address the problem of automatic food tray analysis in canteen and restaurant environments, which consists in predicting multiple foods placed on a tray image. We propose a new approach for food analysis based on convolutional neural networks, which we name Semantic Food Detection, that integrates food localization, recognition and segmentation in the same framework. We demonstrate that our method improves the state of the art in food detection by a considerable margin on the public dataset UNIMIB2016, achieving about 90% in terms of F-measure, and thus provides a significant technological advance towards automatic billing in restaurant environments.
|
|
|
Eduardo Aguilar, Bhalaji Nagarajan, Beatriz Remeseiro, & Petia Radeva. (2022). Bayesian deep learning for semantic segmentation of food images. CEE - Computers and Electrical Engineering, 103, 108380.
Abstract: Deep learning has provided promising results in various applications; however, algorithms tend to be overconfident in their predictions, even though they may be entirely wrong. Particularly for critical applications, the model should provide answers only when it is very sure of them. This article presents a Bayesian version of two different state-of-the-art semantic segmentation methods to perform multi-class segmentation of foods and estimate the uncertainty about the given predictions. The proposed methods were evaluated on three public pixel-annotated food datasets. As a result, we can conclude that Bayesian methods improve the performance achieved by the baseline architectures and, in addition, provide information to improve decision-making. Furthermore, based on the extracted uncertainty map, we proposed three measures to rank the images according to the degree of noisy annotations they contained. Note that the top 135 images ranked by one of these measures include more than half of the worst-labeled food images.
Keywords: Deep learning; Uncertainty quantification; Bayesian inference; Image segmentation; Food analysis
|
|
|
Eduardo Aguilar, Marc Bolaños, & Petia Radeva. (2019). Regularized uncertainty-based multi-task learning model for food analysis. JVCIR - Journal of Visual Communication and Image Representation, 60, 360–370.
Abstract: Food plays an important role in several aspects of our daily life. Several computer vision approaches have been proposed for tackling food analysis problems, but very little effort has been made in developing methodologies that could take advantage of the existing correlation between tasks. In this paper, we propose a new multi-task model that is able to simultaneously predict different food-related tasks, e.g. dish, cuisine and food categories. Here, we extend the homoscedastic uncertainty modeling to allow single-label and multi-label classification and propose a regularization term, which jointly weighs the tasks as well as their correlations. Furthermore, we propose a new Multi-Attribute Food dataset and a new metric, Multi-Task Accuracy. We prove that using both our uncertainty-based loss and the class regularization term, we are able to improve the coherence of outputs between different tasks. Moreover, we outperform the use of task-specific models on classical measures like accuracy or .
Keywords: Multi-task models; Uncertainty modeling; Convolutional neural networks; Food image analysis; Food recognition; Food group recognition; Ingredients recognition; Cuisine recognition
|
|
|
Eduardo Aguilar, & Petia Radeva. (2020). Uncertainty-aware integration of local and flat classifiers for food recognition. PRL - Pattern Recognition Letters, 136, 237–243.
Abstract: Food image recognition has recently attracted the attention of many researchers, due to the challenging problem it poses, the ease of collecting food images, and its numerous applications to health and leisure. In real applications, it is necessary to analyze and recognize thousands of different foods. For this purpose, we propose a novel prediction scheme based on a class hierarchy that considers local classifiers, in addition to a flat classifier. In order to make a decision about which approach to use, we define different criteria that take into account both the analysis of the Epistemic Uncertainty estimated from the ‘children’ classifiers and the prediction from the ‘parent’ classifier. We evaluate our proposal using three Uncertainty estimation methods, tested on two public food datasets. The results show that the proposed method reduces parent-child error propagation in hierarchical schemes and improves classification results compared to the single flat classifier, while maintaining good performance regardless of the Uncertainty estimation method chosen.
|
|