M. Bressan, & Jordi Vitria. (2002). Feature Subset Selection in an ICA Space.
|
M. Bressan, & Jordi Vitria. (2002). Improving Naive Bayes using Class Conditional ICA.
|
M. Bressan, & Jordi Vitria. (2002). Independent Component Analysis and Naïve Bayes Classification. Proceedings of the Second IASTED International Conference on Visualization, Imaging and Image Processing (VIIP 2002), 496–501.
|
M. Bressan, & Jordi Vitria. (2003). Independent Feature Selection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 25(10), 1312–1317 (IF: 3.823).
|
M. Bressan, & Jordi Vitria. (2003). Nonparametric Discriminant Analysis and Nearest Neighbor Classification. PRL - Pattern Recognition Letters, 24(15), 2743–2749.
|
M. Bressan, David Guillamet, & Jordi Vitria. (2001). Using a Local ICA Representation of High Dimensional Data for Object Recognition and Classification.
|
M. Bressan, David Guillamet, & Jordi Vitria. (2000). Using an ICA Representation of Local Color Histograms for Object Recognition.
|
M. Bressan, David Guillamet, & Jordi Vitria. (2003). Using an ICA Representation of Local Color Histograms for Object Recognition. Pattern Recognition, 36(3), 691–701 (IF: 1.611).
|
M. Bressan, David Guillamet, & Jordi Vitria. (2004). Multiclass Object Recognition using Class-Conditional Independent Component Analysis. Cybernetics and Systems, 35(1), 35–61 (IF: 0.768).
|
M. Bressan. (2001). Un análisis de viabilidad para la confección semisupervisada de un mapa de usos del suelo de Catalunya [A feasibility analysis for the semi-supervised production of a land-use map of Catalonia].
|
M. Bressan. (2000). Independent Modes of Variation in Point Distribution Models.
|
M. Altillawi, S. Li, S.M. Prakhya, Z. Liu, & Joan Serrat. (2024). Implicit Learning of Scene Geometry From Poses for Global Localization. ROBOTAUTOMLET - IEEE Robotics and Automation Letters, 9(2), 955–962.
Abstract: Global visual localization estimates the absolute pose of a camera using a single image, in a previously mapped area. Obtaining the pose from a single image enables many robotics and augmented/virtual reality applications. Inspired by the latest advances in deep learning, many existing approaches directly learn and regress the 6 DoF pose from an input image. However, these methods do not fully utilize the underlying scene geometry for pose regression. The challenge in monocular relocalization is the minimal availability of supervised training data, which is just the corresponding 6 DoF poses of the images. In this letter, we propose to utilize these minimal available labels (i.e., poses) to learn the underlying 3D geometry of the scene and use the geometry to estimate the 6 DoF camera pose. We present a learning method that uses these pose labels and rigid alignment to learn two 3D geometric representations (X, Y, Z coordinates) of the scene, one in the camera coordinate frame and the other in the global coordinate frame. Given a single image, it estimates these two 3D scene representations, which are then aligned to estimate a pose that matches the pose label. This formulation allows for the active inclusion of additional learning constraints to minimize 3D alignment errors between the two 3D scene representations, and 2D re-projection errors between the 3D global scene representation and 2D image pixels, resulting in improved localization accuracy. During inference, our model estimates the 3D scene geometry in the camera and global frames and aligns them rigidly to obtain the pose in real time. We evaluate our work on three common visual localization datasets, conduct ablation studies, and show that our method exceeds the pose accuracy of state-of-the-art regression methods on all datasets.
Keywords: Localization; Localization and mapping; Deep learning for visual perception; Visual learning
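The rigid alignment step this abstract describes (recovering a pose from two corresponding 3D point sets, one in the camera frame and one in the global frame) is typically solved in closed form with the Kabsch/Procrustes method. A minimal sketch follows; the function name, array shapes, and variable names are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

def rigid_align(P_cam, P_glob):
    """Kabsch/Procrustes: find R, t such that P_glob ≈ P_cam @ R.T + t.

    P_cam, P_glob: (N, 3) arrays of corresponding 3D points in the
    camera and global frames. Returns a proper rotation R and
    translation t (the 6 DoF pose).
    """
    mu_c = P_cam.mean(axis=0)
    mu_g = P_glob.mean(axis=0)
    # Cross-covariance of the centred point sets
    H = (P_cam - mu_c).T @ (P_glob - mu_g)
    U, _, Vt = np.linalg.svd(H)
    # Reflection guard: force det(R) = +1 so R is a proper rotation
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    t = mu_g - R @ mu_c
    return R, t
```

Because this solution is differentiable (SVD has well-defined gradients away from degenerate singular values), alignment residuals of this kind can also serve as training losses, which is in the spirit of the alignment constraints the abstract mentions.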
|
Luis Herranz, Weiqing Min, & Shuqiang Jiang. (2018). Food recognition and recipe analysis: integrating visual content, context and external knowledge.
Abstract: The central role of food in our individual and social life, combined with recent technological advances, has motivated a growing interest in applications that help to better monitor dietary habits as well as the exploration and retrieval of food-related information. We review how visual content, context and external knowledge can be integrated effectively into food-oriented applications, with special focus on recipe analysis and retrieval, food recommendation and restaurant context as emerging directions.
|
Luis Herranz, Shuqiang Jiang, & Ruihan Xu. (2017). Modeling Restaurant Context for Food Recognition. TMM - IEEE Transactions on Multimedia, 19(2), 430–440.
Abstract: Food photos are widely used in food logs for diet monitoring and in social networks to share social and gastronomic experiences. A large number of these images are taken in restaurants. Dish recognition in general is very challenging, due to different cuisines, cooking styles, and the intrinsic difficulty of modeling food from its visual appearance. However, contextual knowledge can be crucial to improve recognition in such a scenario. In particular, geocontext has been widely exploited for outdoor landmark recognition. Similarly, we exploit knowledge about menus and the locations of restaurants and test images. We first adapt a framework based on discarding unlikely categories located far from the test image. Then, we reformulate the problem using a probabilistic model connecting dishes, restaurants, and locations. We apply that model in three different tasks: dish recognition, restaurant recognition, and location refinement. Experiments on six datasets show that by integrating multiple evidences (visual, location, and external knowledge) our system can boost the performance in all tasks.
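The probabilistic model the abstract describes, connecting dishes, restaurants, and locations, can be illustrated with a naive factorisation p(dish | image, location) ∝ p(image | dish) · Σ_r p(dish | restaurant r) · p(r | location). The toy menus, shapes, and names below are illustrative assumptions, not the paper's model or data.

```python
import numpy as np

# Toy setup (illustrative): 3 restaurants, 4 dishes.
# menu[r, d] = 1 if restaurant r serves dish d; rows normalised to p(d | r).
menu = np.array([[1, 1, 0, 0],
                 [0, 1, 1, 0],
                 [0, 0, 1, 1]], dtype=float)
p_dish_given_rest = menu / menu.sum(axis=1, keepdims=True)

def dish_posterior(visual_scores, p_rest_given_loc):
    """Combine visual evidence with a location-conditioned restaurant prior.

    visual_scores:    p(image | dish) up to a constant, shape (n_dishes,)
    p_rest_given_loc: p(restaurant | location), shape (n_restaurants,)
    Returns normalised p(dish | image, location).
    """
    prior = p_rest_given_loc @ p_dish_given_rest   # p(dish | location)
    post = visual_scores * prior
    return post / post.sum()
```

For example, if the location places the photo at restaurant 0 with certainty, dishes not on that menu get zero posterior even when the visual classifier scores them highly, which mirrors the "discard unlikely categories" behaviour the abstract mentions.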
|