David Masip, Agata Lapedriza, & Jordi Vitria. (2009). Boosted Online Learning for Face Recognition. TSMCB - IEEE Transactions on Systems, Man, and Cybernetics, Part B, 39(2), 530–538.
Abstract: Face recognition applications commonly suffer from three main drawbacks: a reduced training set, information lying in high-dimensional subspaces, and the need to incorporate new people to recognize. In the recent literature, the extension of a face classifier to include new people in the model has been addressed using online feature extraction techniques, the most successful of which are extensions of principal component analysis or linear discriminant analysis. In the current paper, a new online boosting algorithm is introduced: a face recognition method that extends a boosting-based classifier by adding new classes while avoiding the need to retrain the classifier each time a new person joins the system. The classifier is learned using the multitask learning principle, where multiple verification tasks are trained together sharing the same feature space. New classes are added by taking advantage of the previously learned structure, so the addition of new classes is not computationally demanding. The proposal has been experimentally validated on two facial data sets by comparing our approach with current state-of-the-art techniques. The results show that the proposed online boosting algorithm fares better in terms of final accuracy. In addition, the global performance does not decrease drastically even when the number of classes of the base problem is multiplied by eight.
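To make the computational claim concrete, the following is a minimal sketch (Python with NumPy and scikit-learn) of the general idea of reusing a boosted, shared feature space when a class is added; the stump-based weak learners and the least-squares fit for the new class's weights are illustrative assumptions, not the authors' algorithm.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_shared_features(X, y, n_rounds=50):
    # AdaBoost-style rounds on a base verification task (y in {-1, +1});
    # the resulting weak learners define a feature space shared by all
    # classes. (For brevity a single task drives the rounds here; in the
    # multitask setting all base verification tasks would share them.)
    stumps, w = [], np.ones(len(X)) / len(X)
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        miss = stump.predict(X) != y
        err = np.clip(w[miss].sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        w = w * np.exp(alpha * miss)
        w /= w.sum()
        stumps.append(stump)
    return stumps

def shared_responses(stumps, X):
    # Project samples onto the shared space (one column per weak learner).
    return np.column_stack([s.predict(X) for s in stumps])

def fit_new_class(stumps, X_new, y_new):
    # Adding a person only solves a small least-squares problem over the
    # fixed shared features; no boosting rounds are re-run.
    H = shared_responses(stumps, X_new)
    beta, *_ = np.linalg.lstsq(H, y_new, rcond=None)
    return beta  # verification score for x: shared_responses(stumps, [x]) @ beta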
Jaume Amores, Nicu Sebe, Petia Radeva, Theo Gevers, & Arnold Smeulders. (2004). Boosting Contextual Information in Content-based Image Retrieval.
Patricia Suarez, Dario Carpio, & Angel Sappa. (2023). Boosting Guided Super-Resolution Performance with Synthesized Images. In 17th International Conference on Signal-Image Technology & Internet-Based Systems (pp. 189–195).
Abstract: Guided image processing techniques are widely used to extract information from a guiding image in order to aid the processing of the guided one. These images may come from different modalities, such as 2D and 3D, or from different spectral bands, like visible and infrared. In guided cross-spectral super-resolution, features from the two modalities are extracted and efficiently merged to migrate guidance information from one image, usually high-resolution (HR), toward the guided one, usually low-resolution (LR). Different approaches have recently been proposed that focus on architectures for feature extraction and merging in the cross-spectral domains, but none of them takes into account the different nature of the given images. This paper focuses on the specific problem of guided thermal image super-resolution, where an LR thermal image is enhanced by an HR visible spectrum image. To improve existing guided super-resolution techniques, a novel scheme is proposed that maps the original guiding information to a thermal image-like representation that is similar to the output. Experimental results evaluating five different approaches demonstrate that the best results are achieved when the guiding and guided images share the same domain.
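The core idea, mapping the guide into the target domain before fusing, can be sketched as follows (Python with PyTorch); the two tiny networks and their layer sizes are placeholders for illustration, not the architectures evaluated in the paper.

import torch
import torch.nn as nn

class VisibleToThermal(nn.Module):
    # Maps an HR visible image to a thermal-like representation.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))
    def forward(self, visible):
        return self.net(visible)

class GuidedSR(nn.Module):
    # Fuses the upsampled LR thermal image with the (now same-domain) guide.
    def __init__(self, scale=4):
        super().__init__()
        self.up = nn.Upsample(scale_factor=scale, mode='bilinear',
                              align_corners=False)
        self.fuse = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1))
    def forward(self, lr_thermal, guide_thermal_like):
        x = torch.cat([self.up(lr_thermal), guide_thermal_like], dim=1)
        return self.fuse(x)

translator, sr = VisibleToThermal(), GuidedSR(scale=4)
hr_visible = torch.rand(1, 3, 256, 256)   # HR guiding image
lr_thermal = torch.rand(1, 1, 64, 64)     # LR guided image
hr_thermal = sr(lr_thermal, translator(hr_visible))  # both inputs share a domain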
Jaume Amores, Nicu Sebe, & Petia Radeva. (2006). Boosting the distance estimation: Application to the K-Nearest Neighbor Classifier. PRL - Pattern Recognition Letters, 27(3), 201–209.
Marçal Rusiñol, & Josep Llados. (2014). Boosting the Handwritten Word Spotting Experience by Including the User in the Loop. PR - Pattern Recognition, 47(3), 1063–1072.
Abstract: In this paper, we study the effect of taking the user into account in a query-by-example handwritten word spotting framework. Several off-the-shelf query fusion and relevance feedback strategies have been tested in the handwritten word spotting context. The increase in precision when the user is included in the loop is assessed using two datasets of historical handwritten documents and two baseline word spotting approaches, both based on the bag-of-visual-words model. We finally present two alternative ways of presenting the results to the user that might be more attractive and suitable to the user's needs than the classic ranked list.
Keywords: Handwritten word spotting; Query by example; Relevance feedback; Query fusion; Multidimensional scaling
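The paper tests off-the-shelf strategies; purely as a generic illustration of relevance feedback in a query-by-example setting, the sketch below applies the classic Rocchio update to bag-of-visual-words descriptors (Python with NumPy; the weights alpha, beta, and gamma are textbook defaults, not values from the paper).

import numpy as np

def rocchio_update(query, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    # Move the query vector toward user-marked relevant word images and
    # away from non-relevant ones (all rows are BoVW descriptors).
    q = alpha * query
    if len(relevant):
        q += beta * np.mean(relevant, axis=0)
    if len(non_relevant):
        q -= gamma * np.mean(non_relevant, axis=0)
    return q

def rank(query, collection):
    # One feedback round: re-rank the collection by cosine similarity
    # against the updated query.
    sims = collection @ query / (
        np.linalg.norm(collection, axis=1) * np.linalg.norm(query) + 1e-12)
    return np.argsort(-sims)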
Marçal Rusiñol, Philippe Dosch, & Josep Llados. (2007). Boundary Shape Recognition Using Accumulated Length and Angle Information. In 3rd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA 2007), J. Marti et al. (Eds.) LNCS 4478:210–217.
Miquel Ferrer, Ernest Valveny, & Francesc Serratosa. (2007). Bounding the Size of the Median Graph. In 3rd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA 2007), J. Marti et al. (Eds.) LNCS 4478(2):491–498.
Ole Larsen, Petia Radeva, & Enric Marti. (1995). Bounds on the optimal elasticity parameters for a snake. Image Analysis and Processing, 37–42.
Abstract: This paper develops a formalism by which estimates of the upper and lower bounds on the elasticity parameters of a snake can be obtained. Objects different in size and shape give rise to different bounds. The bounds can be obtained based on an analysis of the shape of the object of interest. Experiments on synthetic images show a good correlation between the estimated behaviour of the snake and the one actually observed. Experiments on real X-ray images show that the parameters for optimal segmentation lie within the estimated bounds.
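For context, the elasticity parameter being bounded is the weight of the first-order term in the standard snake internal energy of Kass, Witkin, and Terzopoulos (the notation below is the textbook formulation, not reproduced from this paper):

    E_{\mathrm{int}} = \int_0^1 \Big( \alpha(s)\,\lvert v'(s)\rvert^{2} + \beta(s)\,\lvert v''(s)\rvert^{2} \Big)\, \mathrm{d}s

where v(s) is the contour, \alpha(s) penalizes stretching (elasticity) and \beta(s) penalizes bending (rigidity); the paper's analysis derives object-dependent upper and lower bounds on the elasticity weight.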
Antonio Hernandez, Miguel Angel Bautista, Xavier Perez Sala, Victor Ponce, Xavier Baro, Oriol Pujol, et al. (2012). BoVDW: Bag-of-Visual-and-Depth-Words for Gesture Recognition. In 21st International Conference on Pattern Recognition.
Abstract: We present a Bag-of-Visual-and-Depth-Words (BoVDW) model for gesture recognition, an extension of the Bag-of-Visual-Words (BoVW) model, that benefits from the multimodal fusion of visual and depth features. State-of-the-art RGB and depth features, including a newly proposed depth descriptor, are analysed and combined in a late fusion fashion. The method is integrated in a continuous gesture recognition pipeline, where the Dynamic Time Warping (DTW) algorithm is used to perform prior segmentation of gestures. Results of the method on public data sets, within our gesture recognition pipeline, show better performance than a standard BoVW model.
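As a generic illustration of the DTW step used for pre-segmentation, the sketch below implements the textbook dynamic-programming recurrence over per-frame descriptors (Python with NumPy; the threshold-based cutting mentioned in the final comment reflects common continuous-stream usage, not the paper's exact procedure).

import numpy as np

def dtw_distance(seq_a, seq_b):
    # Dynamic Time Warping cost between two feature sequences
    # (rows are per-frame descriptors); lower cost = better alignment.
    n, m = len(seq_a), len(seq_b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(seq_a[i - 1] - seq_b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

# Segmentation idea: slide a gesture model over the video stream and cut
# where the DTW cost against the model drops below a threshold.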
Joana Maria Pujadas-Mora, Alicia Fornes, Josep Llados, & Anna Cabre. (2016). Bridging the gap between historical demography and computing: tools for computer-assisted transcription and the analysis of demographic sources. In K. Matthijs, S. Hin, H. Matsuo, & J. Kok (Eds.), The future of historical demography. Upside down and inside out (pp. 127–131). Acco Publishers.
F. Javier Sanchez, Jorge Bernal, Cristina Sanchez Montes, Cristina Rodriguez de Miguel, & Gloria Fernandez Esparrach. (2017). Bright spot regions segmentation and classification for specular highlights detection in colonoscopy videos. MVAP - Machine Vision and Applications, 1–20.
Abstract: A novel specular highlights detection method in colonoscopy videos is presented. The method is based on a model of appearance defining specular highlights as bright spots which are highly contrasted with respect to adjacent regions. Our approach proposes two stages: segmentation, and then classification of bright spot regions. The former defines a set of candidate regions obtained through a region growing process with local maxima as initial region seeds. This process creates a tree structure which keeps track, at each growing iteration, of the region frontier contrast; the final regions provided depend on restrictions over the contrast value. Non-specular regions are filtered through a classification stage performed by a linear SVM classifier using model-based features from each region. We introduce a new validation database with more than 25,000 regions along with their corresponding pixel-wise annotations. We perform a comparative study against other approaches. Results show that our method is superior to other approaches, with our segmented regions being closer to the actual specular regions in the image. Finally, we also present how our methodology can be used to obtain an accurate prediction of polyp histology.
Keywords: Specular highlights; Bright spot regions segmentation; Region classification; Colonoscopy
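A heavily simplified sketch of the segmentation stage described above (Python with NumPy and SciPy); the 8-neighbour growth rule, the one-pixel-per-iteration absorption, and the contrast threshold value are illustrative choices, not the paper's exact procedure.

import numpy as np
from scipy.ndimage import maximum_filter

def local_maxima(gray):
    # Candidate seeds: pixels equal to the maximum of their 3x3 window.
    return [tuple(p) for p in np.argwhere(gray == maximum_filter(gray, size=3))]

def grow_region(gray, seed, min_contrast=30.0):
    # Grow a bright region from one seed, tracking the contrast between
    # the region mean and its frontier; stop when it drops too low.
    h, w = gray.shape
    def neighbours(p):
        y, x = p
        return [(y + dy, x + dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                if (dy or dx) and 0 <= y + dy < h and 0 <= x + dx < w]
    region = {seed}
    frontier = set(neighbours(seed))
    while frontier:
        region_mean = np.mean([gray[p] for p in region])
        frontier_mean = np.mean([gray[p] for p in frontier])
        if region_mean - frontier_mean < min_contrast:
            break  # region no longer stands out against its surroundings
        # Absorb the brightest frontier pixel and extend the frontier.
        nxt = max(frontier, key=lambda p: gray[p])
        frontier.remove(nxt)
        region.add(nxt)
        frontier |= {q for q in neighbours(nxt) if q not in region}
    return region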
Xavier Otazu, Olivier Penacchio, & Xim Cerda-Company. (2015). Brightness and colour induction through contextual influences in V1. In Scottish Vision Group 2015, SVG2015 (Vol. 12, pp. 1208–2012).
Xavier Otazu, Olivier Penacchio, & Laura Dempere-Marco. (2012). Brightness induction by contextual influences in V1: a neurodynamical account. In Journal of Vision (Vol. 12).
Abstract: Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas and reveals fundamental properties of neural organization in the visual system. Several phenomenological models have been proposed that successfully account for psychophysical data (Pessoa et al. 1995, Blakeslee and McCourt 2004, Barkan et al. 2008, Otazu et al. 2008).
Neurophysiological evidence suggests that brightness information is explicitly represented in V1, and neuronal response modulations have been observed following luminance changes outside the neurons' receptive fields (Rossi and Paradiso, 1999).
In this work we investigate possible neural mechanisms that offer a plausible explanation for such effects. To this end, we consider the model by Z. Li (1999), which is based on biological data and focuses on the part of V1 responsible for contextual influences, namely, layer 2–3 pyramidal cells, interneurons, and horizontal intracortical connections. This model has been shown to account for phenomena such as contour detection and preattentive segmentation, which share with brightness induction the relevant effect of contextual influences. In our model, the input to the network is derived from a complete multiscale and multiorientation wavelet decomposition, which makes it possible to recover an image reflecting the perceived intensity. The proposed model successfully accounts for well-known psychophysical effects (among them the White's and modified White's effects, the Todorović, Chevreul, and achromatic ring patterns, and the grating induction effect). Our work suggests that intracortical interactions in the primary visual cortex could partially explain perceptual brightness induction effects and reveals how a common general architecture may account for several different fundamental processes emerging early in the visual pathway.
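A minimal sketch of the input/output side of such a model (Python with PyWavelets); the uniform per-scale gains below merely stand in for the contextual modulation that the neurodynamical network computes, which is the actual substance of the model.

import numpy as np
import pywt

def perceived_intensity(image, gains):
    # Decompose into multiscale/multiorientation channels, modulate the
    # detail coefficients, and reconstruct a "perceived" image.
    # gains[level] is a (gH, gV, gD) triple, one gain per orientation.
    coeffs = pywt.wavedec2(image, 'db2', level=len(gains))
    modulated = [coeffs[0]]  # keep the low-pass residual unchanged
    for (cH, cV, cD), (gH, gV, gD) in zip(coeffs[1:], gains):
        modulated.append((gH * cH, gV * cV, gD * cD))
    return pywt.waverec2(modulated, 'db2')

img = np.random.rand(128, 128)
# Identity gains reproduce the input; a neurodynamical model would derive
# these modulations from contextual interactions instead.
out = perceived_intensity(img, gains=[(1.0, 1.0, 1.0)] * 3)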
Fernando Vilariño. (2017). Bringing and keeping all the stakeholders together: creating a catalog of models of governance for innovation.
Juan Borrego-Carazo, Carles Sanchez, David Castells, Jordi Carrabina, & Debora Gil. (2023). BronchoPose: an analysis of data and model configuration for vision-based bronchoscopy pose estimation. CMPB - Computer Methods and Programs in Biomedicine, 228, 107241.
Abstract: Vision-based bronchoscopy (VB) models require the registration of the virtual lung model with the frames from the video bronchoscopy to provide effective guidance during the biopsy. The registration can be achieved by either tracking the position and orientation of the bronchoscopy camera or by calibrating its deviation from the pose (position and orientation) simulated in the virtual lung model. Recent advances in neural networks and temporal image processing have provided new opportunities for guided bronchoscopy. However, such progress has been hindered by the lack of comparative experimental conditions.
In the present paper, we share a novel synthetic dataset allowing for a fair comparison of methods. Moreover, this paper investigates several neural network architectures for the learning of temporal information at different levels of subject personalization. In order to improve orientation measurement, we also present a standardized comparison framework and a novel metric for camera orientation learning. Results on the dataset show that the proposed metric and architectures, as well as the standardized conditions, provide notable improvements to current state-of-the-art camera pose estimation in video bronchoscopy.
Keywords: Videobronchoscopy guiding; Deep learning; Architecture optimization; Datasets; Standardized evaluation framework; Pose estimation
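The paper's novel orientation metric is not reproduced here; as a generic illustration of how camera orientation error is commonly measured, the sketch below computes the geodesic angle between predicted and ground-truth rotations given as unit quaternions (Python with NumPy).

import numpy as np

def orientation_error_deg(q_pred, q_true):
    # Geodesic angle (degrees) between two unit quaternions (w, x, y, z);
    # q and -q encode the same rotation, hence the absolute value.
    q_pred = q_pred / np.linalg.norm(q_pred)
    q_true = q_true / np.linalg.norm(q_true)
    dot = np.clip(abs(np.dot(q_pred, q_true)), -1.0, 1.0)
    return np.degrees(2.0 * np.arccos(dot))

# Example: a 10-degree rotation about the camera's x-axis.
half = np.radians(10.0) / 2.0
q_gt = np.array([1.0, 0.0, 0.0, 0.0])                # identity orientation
q_est = np.array([np.cos(half), np.sin(half), 0.0, 0.0])
print(orientation_error_deg(q_est, q_gt))             # ~10.0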