|
Zhengying Liu, Adrien Pavao, Zhen Xu, Sergio Escalera, Isabelle Guyon, Julio C. S. Jacques Junior, et al. (2020). How far are we from true AutoML: reflection from winning solutions and results of AutoDL challenge. In 7th ICML Workshop on Automated Machine Learning.
Abstract: Following the completion of the AutoDL challenge (the final challenge in the ChaLearn AutoDL challenge series 2019), we investigate winning solutions and challenge results to answer an important motivational question: how far are we from achieving true AutoML? On one hand, the winning solutions achieve good (accurate and fast) classification performance on unseen datasets. On the other hand, all winning solutions still contain a considerable amount of hard-coded knowledge on the domain (or modality) such as image, video, text, speech and tabular. This form of ad-hoc meta-learning could be replaced by more automated forms of meta-learning in the future. Organizing a meta-learning challenge could help forge AutoML solutions that generalize to new unseen domains (e.g. new types of sensor data) as well as gain insights into the AutoML problem from a more fundamental point of view. The datasets of the AutoDL challenge are a resource that can be used for further benchmarks and the code of the winners has been open-sourced, which is a big step towards “democratizing” Deep Learning.
|
|
|
Jose Seabra, F. Javier Sanchez, Francesco Ciompi, & Petia Radeva. (2010). Ultrasonographic Plaque Characterization using a Rayleigh Mixture Model. In 7th IEEE International Symposium on Biomedical Imaging: From Nano to Macro (pp. 1–4).
Abstract: A correct modelling of tissue morphology is crucial for the identification of vulnerable plaques. This paper aims at describing the plaque composition by means of a Rayleigh Mixture Model applied to ultrasonic data. The effectiveness of using a mixture of distributions is established through synthetic and real ultrasonic data samples. Furthermore, the proposed mixture model is used in a plaque classification problem in Intravascular Ultrasound (IVUS) images of coronary plaques. A classifier tested on a set of 67 in-vitro plaques yields an overall accuracy of 86% and sensitivity of 92%, 94% and 82%, for fibrotic, calcified and lipidic tissues, respectively. These results strongly suggest that different plaque types can be distinguished by means of the coefficients and Rayleigh parameters of the mixture distribution.
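As a rough illustration of what fitting a Rayleigh mixture involves, the sketch below estimates the mixture coefficients and Rayleigh scale parameters with a textbook expectation-maximization loop. It is not the authors' implementation: the function names, the initialization, the fixed iteration count and the synthetic data are all assumptions made for this example.

```python
import math
import random


def rayleigh_pdf(x, sigma):
    """Rayleigh density with scale sigma, for x >= 0."""
    return (x / sigma**2) * math.exp(-x * x / (2.0 * sigma**2))


def fit_rayleigh_mixture(samples, n_components=2, n_iter=100):
    """Estimate mixture weights and Rayleigh scales with plain EM (hypothetical helper)."""
    # Assumed initialization: spread the scales around the single-Rayleigh ML estimate.
    base = math.sqrt(sum(x * x for x in samples) / (2.0 * len(samples)))
    sigmas = [base * (2 * k + 1) / n_components for k in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        resp_sum = [0.0] * n_components  # sum of responsibilities per component
        resp_x2 = [0.0] * n_components   # responsibility-weighted sum of x^2
        for x in samples:
            # E-step: posterior probability of each component for this sample.
            joint = [w * rayleigh_pdf(x, s) for w, s in zip(weights, sigmas)]
            total = sum(joint)
            if total <= 0.0:
                continue  # skip samples with zero density (e.g. x == 0)
            for k in range(n_components):
                r = joint[k] / total
                resp_sum[k] += r
                resp_x2[k] += r * x * x
        # M-step: closed-form updates for the weights and Rayleigh scales.
        n_used = sum(resp_sum)
        weights = [rs / n_used for rs in resp_sum]
        sigmas = [math.sqrt(rx / (2.0 * rs)) for rx, rs in zip(resp_x2, resp_sum)]
    return weights, sigmas


def sample_rayleigh(rng, sigma):
    """Draw one Rayleigh sample by inverse-transform sampling."""
    return sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random()))


# Synthetic data (an assumption for the demo): two components with scales 1 and 5.
rng = random.Random(0)
data = [sample_rayleigh(rng, 1.0) for _ in range(1000)]
data += [sample_rayleigh(rng, 5.0) for _ in range(1000)]
weights, sigmas = fit_rayleigh_mixture(data)
print(weights, sigmas)
```

The fitted weights and scales are the kind of quantities the abstract refers to as "coefficients and Rayleigh parameters"; how they are turned into classifier features is not specified here.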
|
|
|
Agata Lapedriza, David Masip, & Jordi Vitria. (2006). Face Verification using External Features.
|
|
|
David Aldavert, Ricardo Toledo, Arnau Ramisa, & Ramon Lopez de Mantaras. (2009). Visual Registration Method For A Low Cost Robot: Computer Vision Systems. In 7th International Conference on Computer Vision Systems (Vol. 5815, pp. 204–214). LNCS. Springer Berlin Heidelberg.
Abstract: An autonomous mobile robot must face the correspondence or data association problem in order to carry out tasks like place recognition or unknown environment mapping. To put two maps into correspondence, most methods estimate the transformation relating the maps from matches established between low-level features extracted from sensor data. However, finding explicit matches between features is a challenging and computationally expensive task. In this paper, we propose a new method to align obstacle maps without searching for explicit matches between features. The maps are obtained from a stereo pair. Then, we use a vocabulary tree approach to identify putative corresponding maps, followed by the Newton minimization algorithm to find the transformation that relates both maps. The proposed method is evaluated in a typical office environment, showing good performance.
|
|
|
Diego Cheda, Daniel Ponsa, & Antonio Lopez. (2012). Monocular Depth-based Background Estimation. In 7th International Conference on Computer Vision Theory and Applications (pp. 323–328).
Abstract: In this paper, we address the problem of reconstructing the background of a scene from a video sequence with occluding objects. The images are taken by hand-held cameras. Our method composes the background by selecting the appropriate pixels from previously aligned input images. To do that, we minimize a cost function that penalizes the deviations from the following assumptions: background represents objects whose distance to the camera is maximal, and background objects are stationary. Distance information is roughly obtained by a supervised learning approach that allows us to distinguish between close and distant image regions. Moving foreground objects are filtered out by using stationariness and motion boundary constancy measurements. The cost function is minimized by a graph cuts method. We demonstrate the applicability of our approach to recover an occlusion-free background in a set of sequences.
|
|
|
Pedro Martins, Carlo Gatta, & Paulo Carvalho. (2012). Feature-driven Maximally Stable Extremal Regions. In 7th International Conference on Computer Vision Theory and Applications (pp. 490–497).
|
|
|
Susana Alvarez, Anna Salvatella, Maria Vanrell, & Xavier Otazu. (2010). 3D Texton Spaces for color-texture retrieval. In A.C. Campilho and M.S. Kamel (Eds.), 7th International Conference on Image Analysis and Recognition (Vol. 6111, pp. 354–363). LNCS. Springer Berlin Heidelberg.
Abstract: Color and texture are visual cues of different nature; their integration into a useful visual descriptor is not an easy problem. One way to combine both features is to compute spatial texture descriptors independently on each color channel. Another way is to do the integration at the descriptor level. In this case the problem of normalizing both cues arises. In this paper we solve the latter problem by fusing color and texture through distances in texton spaces. Textons are the attributes of image blobs and they are responsible for texture discrimination as defined in Julesz’s Texton theory. We describe them in two low-dimensional and uniform spaces, namely, shape and color. The dissimilarity between color texture images is computed by combining the distances in these two spaces. Following this approach, we propose our TCD descriptor, which outperforms current state-of-the-art methods in the two different approaches mentioned above: early combination with LBP and late combination with MPEG-7. This is done on an image retrieval experiment over a highly diverse texture dataset from Corel.
|
|
|
Naveen Onkarappa, & Angel Sappa. (2010). On-Board Monocular Vision System Pose Estimation through a Dense Optical Flow. In 7th International Conference on Image Analysis and Recognition (Vol. 6111, pp. 230–239). LNCS. Springer Berlin Heidelberg.
Abstract: This paper presents a robust technique for estimating on-board monocular vision system pose. The proposed approach is based on a dense optical flow that is robust against shadows, reflections and illumination changes. A RANSAC-based scheme is used to cope with the outliers in the optical flow. The proposed technique is intended to be used in driver assistance systems for applications such as obstacle or pedestrian detection. Experimental results on different scenarios, from both synthetic and real sequences, show the usefulness of the proposed approach.
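To make the role of RANSAC concrete, here is a generic sketch of the sample-and-score loop on a deliberately simple model, a 2D line, rather than the camera pose model of the paper. The function names, the threshold, the trial count and the toy data are assumptions for this example; only the general idea of rejecting outliers by consensus comes from the abstract.

```python
import random


def line_through(p, q):
    """Slope and intercept of the line through two points (None if vertical)."""
    (x1, y1), (x2, y2) = p, q
    if x1 == x2:
        return None
    slope = (y2 - y1) / (x2 - x1)
    return slope, y1 - slope * x1


def least_squares_line(points):
    """Ordinary least-squares line fit; assumes the x values are not all equal."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ransac_line(points, n_trials=200, threshold=0.5, seed=0):
    """Fit a line robustly: keep the minimal-sample model with the most inliers."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_trials):
        model = line_through(*rng.sample(points, 2))
        if model is None:
            continue
        slope, intercept = model
        # Inliers are points whose vertical residual is within the threshold.
        inliers = [(x, y) for x, y in points
                   if abs(y - (slope * x + intercept)) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit on the consensus set only, so the outliers no longer bias the estimate.
    return least_squares_line(best_inliers), best_inliers


# Toy data (an assumption for the demo): 30 points on y = 2x + 1 plus 5 gross outliers.
points = [(float(x), 2.0 * x + 1.0) for x in range(30)]
points += [(3.0, 40.0), (7.0, -20.0), (12.0, 60.0), (18.0, -5.0), (25.0, 10.0)]
(slope, intercept), inliers = ransac_line(points)
print(slope, intercept, len(inliers))
```

In a flow-based setting the minimal sample and the residual would be defined on flow vectors and the motion model instead of points and a line; that substitution is not shown here.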
|
|
|
Xavier Soria, Angel Sappa, & Arash Akbarinia. (2017). Multispectral Single-Sensor RGB-NIR Imaging: New Challenges and Opportunities. In 7th International Conference on Image Processing Theory, Tools & Applications.
Abstract: Multispectral images captured with a single sensor camera have become an attractive alternative for numerous computer vision applications. However, in order to fully exploit their potential, the color restoration problem (RGB representation) should be addressed. This problem is more evident in outdoor scenarios containing vegetation, living beings, or specular materials. The problem of color distortion emerges from the sensitivity of sensors due to the overlap of visible and near infrared spectral bands. This paper empirically evaluates the variability of the near infrared (NIR) information with respect to the changes of light throughout the day. A tiny neural network is proposed to restore the RGB color representation from the given RGBN (Red, Green, Blue, NIR) images. In order to evaluate the proposed algorithm, different experiments on an RGBN outdoor dataset are conducted, which include various challenging cases. The obtained results show the challenge and the importance of addressing color restoration in single sensor multispectral images.
Keywords: Color restoration; Neural networks; Single-sensor cameras; Multispectral images; RGB-NIR dataset
|
|
|
Mohamed Ali Souibgui, Pau Torras, Jialuo Chen, & Alicia Fornes. (2023). An Evaluation of Handwritten Text Recognition Methods for Historical Ciphered Manuscripts. In 7th International Workshop on Historical Document Imaging and Processing (pp. 7–12).
Abstract: This paper investigates the effectiveness of different deep learning HTR families, including LSTM, Seq2Seq, and transformer-based approaches with self-supervised pretraining, in recognizing ciphered manuscripts from different historical periods and cultures. The goal is to identify the most suitable method or training techniques for recognizing ciphered manuscripts and to provide insights into the challenges and opportunities in this field of research. We evaluate the performance of these models on several datasets of ciphered manuscripts and discuss their results. This study contributes to the development of more accurate and efficient methods for recognizing historical manuscripts for the preservation and dissemination of our cultural heritage.
|
|
|
Javier Rodenas, Bhalaji Nagarajan, Marc Bolaños, & Petia Radeva. (2022). Learning Multi-Subset of Classes for Fine-Grained Food Recognition. In 7th International Workshop on Multimedia Assisted Dietary Management (pp. 17–26).
Abstract: Food image recognition is a complex computer vision task, because of the large number of fine-grained food classes. Fine-grained recognition tasks focus on learning subtle discriminative details to distinguish similar classes. In this paper, we introduce a new method to improve the classification of classes that are more difficult to discriminate based on Multi-Subsets learning. Using a pre-trained network, we organize classes in multiple subsets using a clustering technique. Later, we embed these subsets in a multi-head model structure. This structure has three distinguishable parts. First, we use several shared blocks to learn the generalized representation of the data. Second, we use multiple specialized blocks focusing on specific subsets that are difficult to distinguish. Lastly, we use a fully connected layer to weight the different subsets in an end-to-end manner by combining the neuron outputs. We validated our proposed method using two recent state-of-the-art vision transformers on three public food recognition datasets. Our method was successful in learning the confused classes better and we outperformed the state-of-the-art on the three datasets.
|
|
|
Antoni Gurgui, Debora Gil, Enric Marti, & Vicente Grau. (2016). Left-Ventricle Basal Region Constrained Parametric Mapping to Unitary Domain. In 7th International Workshop on Statistical Atlases & Computational Modelling of the Heart (Vol. 10124, pp. 163–171). LNCS.
Abstract: Due to its complex geometry, the basal ring is often omitted when putting different heart geometries into correspondence. In this paper, we present the first results on a new mapping of the left ventricle basal rings onto a normalized coordinate system using a fold-over free approach to the solution to the Laplacian. To guarantee correspondences between different basal rings, we imposed some internal constrained positions at anatomical landmarks in the normalized coordinate system. To prevent internal fold-overs, constraints are handled by cutting the volume into regions defined by anatomical features and mapping each piece of the volume separately. Initial results presented in this paper indicate that our method is able to handle internal constraints without introducing fold-overs and thus guarantees one-to-one mappings between different basal ring geometries.
Keywords: Laplacian; Constrained maps; Parameterization; Basal ring
|
|
|
Liu Wenyin, Josep Llados, & Jean-Marc Ogier. (2008). Graphics Recognition. Recent Advances and New Opportunities. (Vol. 5046). LNCS.
|
|
|
N. Zakaria, Jean-Marc Ogier, & Josep Llados. (2006). The Fuzzy-Spatial Descriptor for the Online Graphic Recognition: Overlapping Matrix Algorithm. In 7th International Workshop, Document Analysis Systems VII (DAS'06) (Vol. 3872, pp. 616–627). LNCS.
|
|
|
W. Niessen, Antonio Lopez, W. Van Enk, P. Van Roermund, Bart M. Ter Haar Romeny, & M. Viergever. (1997). Multiscale Trabecular Bone Orientation Analysis.
|
|