Author Lluis Gomez; Dimosthenis Karatzas
  Title TextProposals: a Text‐specific Selective Search Algorithm for Word Spotting in the Wild Type Journal Article
  Year 2017 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 70 Issue Pages 60-74  
  Keywords  
  Abstract Motivated by the success of powerful yet expensive techniques that recognize words in a holistic way (Goel et al., 2013; Almazán et al., 2014; Jaderberg et al., 2016), object proposal techniques have emerged as an alternative to traditional text detectors. In this paper we introduce a novel object proposals method that is specifically designed for text. We rely on a similarity-based region grouping algorithm that generates a hierarchy of word hypotheses. Over the nodes of this hierarchy it is possible to apply a holistic word recognition method in an efficient way.

Our experiments demonstrate that the presented method is superior in its ability to produce good-quality word proposals compared with class-independent algorithms. We show impressive recall rates with a few thousand proposals on different standard benchmarks, including focused and incidental text datasets, and multi-language scenarios. Moreover, the combination of our object proposals with existing whole-word recognizers (Almazán et al., 2014; Jaderberg et al., 2016) shows competitive performance in end-to-end word spotting and, on some benchmarks, outperforms previously published results. Concretely, on the challenging ICDAR2015 Incidental Text dataset, we surpass the best-performing method of the last ICDAR Robust Reading Competition (Karatzas, 2015) by more than 10% in F-score. Source code of the complete end-to-end system is available at https://github.com/lluisgomez/TextProposals.
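As a hedged illustration of the similarity-based grouping idea in this abstract, the sketch below builds a hierarchy over regions with off-the-shelf agglomerative clustering and emits one bounding-box proposal per merge node. The per-region descriptors and the linkage criterion are placeholders, not the features or grouping rules actually used in TextProposals.

```python
# Minimal sketch: hierarchical grouping of regions -> one box proposal per merge node.
import numpy as np
from scipy.cluster.hierarchy import linkage

def box_proposals(boxes, features):
    """boxes: (N,4) [x1,y1,x2,y2]; features: (N,D) per-region descriptors (assumed)."""
    Z = linkage(features, method="average")        # similarity-based agglomerative merging
    members = {i: [i] for i in range(len(boxes))}
    proposals = []
    for k, (a, b, _, _) in enumerate(Z, start=len(boxes)):
        members[k] = members[int(a)] + members[int(b)]
        group = boxes[members[k]]
        proposals.append([group[:, 0].min(), group[:, 1].min(),
                          group[:, 2].max(), group[:, 3].max()])
    return np.asarray(proposals)                   # one word hypothesis per hierarchy node

boxes = np.random.rand(20, 4); boxes[:, 2:] += boxes[:, :2]   # toy regions
feats = np.random.rand(20, 3)                                 # e.g. mean colour (stand-in)
print(box_proposals(boxes, feats).shape)                      # (19, 4): N-1 merge nodes
```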
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.084; 601.197; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ GoK2017 Serial 2886  
 

 
Author Naveen Onkarappa
  Title Optical Flow in Driver Assistance Systems Type Book Whole
  Year 2013 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Motion perception is one of the most important attributes of the human brain. Visual motion perception consists in inferring the speed and direction of elements in a scene based on visual inputs. Analogously, computer vision is assisted by motion cues in the scene. Motion detection in computer vision is useful for solving problems such as segmentation, depth from motion, structure from motion, compression, navigation and many others. These problems are common in several applications, for instance video surveillance, robot navigation and advanced driver assistance systems (ADAS). One of the most widely used techniques for motion detection is optical flow estimation. The work in this thesis attempts to make optical flow suitable for the requirements and conditions of driving scenarios. In this context, a novel space-variant representation called the reverse log-polar representation is proposed, which is shown to be better suited to ADAS than the traditional log-polar space-variant representation. Space-variant representations reduce the amount of data to be processed. Another major contribution of this research is the analysis of the influence of specific characteristics of driving scenarios, such as vehicle speed and road texture, on optical flow accuracy. From this study it is inferred that the regularization weight has to be adapted according to the required error measure and to the different speeds and road textures. It is also shown that polar-represented optical flow suits driving scenarios where the predominant motion is translation. Due to the requirements of such a study and the lack of suitable datasets, a new synthetic dataset is presented; it contains: i) sequences of different speeds and road textures in an urban scenario; ii) sequences with complex motion of an on-board camera; and iii) sequences with additional moving vehicles in the scene. The ground-truth optical flow is generated by ray tracing. Furthermore, a few applications of optical flow in ADAS are shown. First, a robust RANSAC-based technique to estimate the horizon line is proposed. Then, an egomotion estimation is presented to compare the proposed space-variant representation with the classical one. As a final contribution, a modification of the regularization term is proposed that notably improves the results in the ADAS applications. This adaptation is evaluated using a state-of-the-art optical flow technique. Experiments on a public dataset (KITTI) validate the advantages of the proposed modification.
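To make the role of the regularization weight concrete, here is a textbook Horn-Schunck solver; this is only a generic variational-flow sketch under the assumption that it illustrates the same trade-off, not the thesis's actual formulation or its modified regularization term. The parameter `alpha` is the regularization weight whose adaptation to speed and road texture the study above discusses.

```python
# Toy Horn-Schunck optical flow: larger alpha -> smoother (more regularized) flow field.
import numpy as np
from scipy.ndimage import convolve

def horn_schunck(I1, I2, alpha=10.0, n_iter=100):
    I1, I2 = I1.astype(float), I2.astype(float)
    kx = np.array([[-1, 1], [-1, 1]]) * 0.25
    ky = np.array([[-1, -1], [1, 1]]) * 0.25
    Ix = convolve(I1, kx) + convolve(I2, kx)       # spatial derivatives (averaged frames)
    Iy = convolve(I1, ky) + convolve(I2, ky)
    It = convolve(I2 - I1, np.ones((2, 2)) * 0.25) # temporal derivative
    avg = np.array([[1, 2, 1], [2, 0, 2], [1, 2, 1]]) / 12.0
    u = np.zeros_like(I1); v = np.zeros_like(I1)
    for _ in range(n_iter):
        u_bar, v_bar = convolve(u, avg), convolve(v, avg)
        num = Ix * u_bar + Iy * v_bar + It
        den = alpha ** 2 + Ix ** 2 + Iy ** 2       # alpha is the regularization weight
        u = u_bar - Ix * num / den
        v = v_bar - Iy * num / den
    return u, v

I1 = np.random.rand(64, 64)
I2 = np.roll(I1, 1, axis=1)                        # ~1 px horizontal shift
u, v = horn_schunck(I1, I2, alpha=15.0)
print(u.mean(), v.mean())                          # mean flow components
```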
 
  Address Bellaterra  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Angel Sappa  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-940902-1-9 Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number Admin @ si @ Nav2013 Serial 2447  
 

 
Author Ariel Amato
  Title Moving cast shadow detection Type Journal Article
  Year 2014 Publication Electronic Letters on Computer Vision and Image Analysis Abbreviated Journal ELCVIA  
  Volume 13 Issue 2 Pages 70-71  
  Keywords  
  Abstract Motion perception is an amazing innate ability of the creatures on the planet. This adroitness entails a functional advantage that enables species to compete better in the wild. The motion perception ability is employed at different levels, from the simplest interaction with the 'physis' up to the most transcendental survival tasks. Among the five classical perception systems, vision is the most widely used in the motion perception field. Millions of years of evolution have led to a highly specialized visual system in humans, characterized by tremendous accuracy as well as extraordinary robustness. Although humans and an immense diversity of species can distinguish moving objects with seeming simplicity, it has proven to be a difficult and non-trivial problem from a computational perspective. In the field of computer vision, the detection of moving objects is a challenging and fundamental research area. It can be referred to as the 'origin' of vast and numerous vision-based research sub-areas. Nevertheless, from the bottom to the top of this hierarchical analysis, the foundations still rely on when and where motion has occurred in an image. Pixels corresponding to moving objects in image sequences can be identified by measuring changes in their values. However, a pixel's value (representing a combination of color and brightness) can also vary due to other factors, such as variation in scene illumination, camera noise and nonlinear sensor responses, among others. The challenge lies in detecting whether changes in pixel values are caused by a genuine object movement or not. An additional challenging aspect of motion detection is represented by moving cast shadows. The paradox arises because a moving object and its cast shadow share similar motion patterns; however, a moving cast shadow is not a moving object. In fact, a shadow represents a photometric illumination effect caused by the relative position of the object with respect to the light sources. Shadow detection methods are mainly divided into two domains depending on the application field. The first normally consists of static images where shadows are cast by static objects, whereas the second refers to image sequences where shadows are cast by moving objects. In the first case, shadows can provide additional geometric and semantic cues about the shape and position of the casting object as well as the localization of the light source. Although this information can be extracted from static images as well as video sequences, the main focus in the second area is usually change detection, scene matching or surveillance. In this context, a shadow can severely interfere with the analysis and interpretation of the scene. The work done in the thesis focuses on the second case; it addresses the problem of detection and removal of moving cast shadows in video sequences in order to enhance the detection of moving objects.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ Ama2014 Serial 2870  
 

 
Author Naila Murray; Luca Marchesotti; Florent Perronnin
  Title Learning to Rank Images using Semantic and Aesthetic Labels Type Conference Article
  Year 2012 Publication 23rd British Machine Vision Conference Abbreviated Journal  
  Volume Issue Pages 110.1-110.10  
  Keywords  
  Abstract Most works on image retrieval from text queries have addressed the problem of retrieving semantically relevant images. However, the ability to assess the aesthetic quality of an image is an increasingly important differentiating factor for search engines. In this work, given a semantic query, we are interested in retrieving images which are semantically relevant and score highly in terms of aesthetics/visual quality. We use large-margin classifiers and rankers to learn statistical models capable of ordering images based on the aesthetic and semantic information. In particular, we compare two families of approaches: while the first one attempts to learn a single ranker which takes into account both semantic and aesthetic information, the second one learns separate semantic and aesthetic models. We carry out a quantitative and qualitative evaluation on a recently published large-scale dataset and we show that the second family of techniques significantly outperforms the first one.  
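For the large-margin ranking approach mentioned above, a common construction is a pairwise RankSVM trained on difference vectors. The sketch below uses scikit-learn's LinearSVC on synthetic features and relevance scores; those inputs and the regularization constant are assumptions, not the paper's semantic/aesthetic features or its exact setup.

```python
# Pairwise RankSVM sketch: learn a direction w such that X @ w orders images by relevance.
import numpy as np
from sklearn.svm import LinearSVC

def fit_ranker(X, scores, n_pairs=2000, seed=0):
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(X), n_pairs)
    j = rng.integers(0, len(X), n_pairs)
    keep = scores[i] != scores[j]                     # only informative pairs
    diff = X[i[keep]] - X[j[keep]]                    # pairwise difference vectors
    y = np.sign(scores[i[keep]] - scores[j[keep]])    # +1 / -1 preference labels
    clf = LinearSVC(C=1.0).fit(diff, y)
    return clf.coef_.ravel()

X = np.random.rand(200, 64)                           # stand-in for semantic+aesthetic features
s = X[:, 0] + 0.1 * np.random.rand(200)               # synthetic relevance scores
w = fit_ranker(X, s)
print(np.corrcoef(X @ w, s)[0, 1])                    # learned ranking correlates with scores
```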
  Address Guildford, UK  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 1-901725-46-4 Medium  
  Area Expedition Conference BMVC  
  Notes CIC Approved no  
  Call Number Admin @ si @ MMP2012b Serial 2027  
 

 
Author Noha Elfiky; Jordi Gonzalez; Xavier Roca
  Title Compact and Adaptive Spatial Pyramids for Scene Recognition Type Journal Article
  Year 2012 Publication Image and Vision Computing Abbreviated Journal IMAVIS  
  Volume 30 Issue 8 Pages 492–500  
  Keywords  
  Abstract Most successful approaches to scene recognition tend to efficiently combine global image features with spatial local appearance and shape cues. On the other hand, less attention has been devoted to studying spatial texture features within scenes. Our method is based on the insight that scenes can be seen as a composition of micro-texture patterns. This paper analyzes the role of texture along with its spatial layout for scene recognition. However, one main drawback of the resulting spatial representation is its huge dimensionality. Hence, we propose a technique that addresses this problem by presenting a compact Spatial Pyramid (SP) representation. The basis of our compact representation, namely the Compact Adaptive Spatial Pyramid (CASP), consists of a two-stage compression strategy. This strategy is based on the Agglomerative Information Bottleneck (AIB) theory for (i) compressing the least informative SP features and (ii) automatically learning the most appropriate shape for each category. Our method exceeds the state-of-the-art results on several challenging scene recognition data sets.  
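The dimensionality problem that CASP compresses comes from the standard spatial pyramid itself. The sketch below builds a plain SP histogram (no AIB compression, which is the paper's contribution and is not reproduced here); the grid levels and vocabulary size are arbitrary choices, only meant to show how quickly the representation grows.

```python
# Standard spatial pyramid: per-cell bag-of-words histograms concatenated across levels.
import numpy as np

def spatial_pyramid(points, words, vocab_size, levels=(1, 2, 4)):
    """points: (N,2) feature locations in [0,1]^2; words: (N,) visual-word index."""
    hists = []
    for g in levels:                                  # g x g grid at each pyramid level
        cell = np.minimum((points * g).astype(int), g - 1)
        for cx in range(g):
            for cy in range(g):
                m = (cell[:, 0] == cx) & (cell[:, 1] == cy)
                hists.append(np.bincount(words[m], minlength=vocab_size))
    return np.concatenate(hists)                      # length = vocab_size * (1 + 4 + 16)

pts = np.random.rand(500, 2)
w = np.random.randint(0, 200, 500)
print(spatial_pyramid(pts, w, 200).shape)             # (4200,) -> why compression is needed
```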
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ EGR2012 Serial 2004  
 

 
Author Bojana Gajic; Ariel Amato; Ramon Baldrich; Joost Van de Weijer; Carlo Gatta
  Title Area Under the ROC Curve Maximization for Metric Learning Type Conference Article
  Year 2022 Publication CVPR 2022 Workshop on Efficient Deep Learning for Computer Vision (ECV 2022, 5th Edition) Abbreviated Journal  
  Volume Issue Pages  
  Keywords Training; Computer vision; Conferences; Area measurement; Benchmark testing; Pattern recognition  
  Abstract Most popular metric learning losses have no direct relation with the evaluation metrics that are subsequently applied to evaluate their performance. We hypothesize that training a metric learning model by maximizing the area under the ROC curve (which is a typical performance measure of recognition systems) can induce an implicit ranking suitable for retrieval problems. This hypothesis is supported by previous work that proved that a curve dominates in ROC space if and only if it dominates in Precision-Recall space. To test this hypothesis, we design and maximize an approximated, differentiable relaxation of the area under the ROC curve. The proposed AUC loss achieves state-of-the-art results on two large-scale retrieval benchmark datasets (Stanford Online Products and DeepFashion In-Shop). Moreover, the AUC loss achieves comparable performance to more complex, domain-specific, state-of-the-art methods for vehicle re-identification.  
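One common way to make AUC differentiable is to replace the step function over every (positive pair, negative pair) comparison with a sigmoid. The PyTorch sketch below is that generic relaxation under the assumption of cosine similarities and an arbitrary temperature; the paper's exact relaxation may differ.

```python
# Sigmoid relaxation of 1 - AUC over pairwise embedding similarities (minimize it).
import torch

def auc_loss(embeddings, labels, tau=0.05):
    x = torch.nn.functional.normalize(embeddings, dim=1)
    sim = x @ x.t()                                       # cosine similarities
    same = labels[:, None] == labels[None, :]
    iu = torch.triu_indices(len(labels), len(labels), offset=1)
    sim, same = sim[iu[0], iu[1]], same[iu[0], iu[1]]     # unique pairs only
    pos, neg = sim[same], sim[~same]                      # batch must contain positive pairs
    diff = neg[None, :] - pos[:, None]                    # want every pos > every neg
    return torch.sigmoid(diff / tau).mean()               # smooth surrogate of 1 - AUC

emb = torch.randn(32, 64, requires_grad=True)
lbl = torch.randint(0, 4, (32,))
auc_loss(emb, lbl).backward()                             # gradients flow to the embeddings
```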
  Address New Orleans, USA; 20 June 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference CVPRW  
  Notes CIC; LAMP; Approved no  
  Call Number Admin @ si @ GAB2022 Serial 3700  
 

 
Author Carme Julia; Angel Sappa; Felipe Lumbreras; Joan Serrat; Antonio Lopez
  Title An Iterative Multiresolution Scheme for SFM with Missing Data: single and multiple object scenes Type Journal Article
  Year 2010 Publication Image and Vision Computing Abbreviated Journal IMAVIS  
  Volume 28 Issue 1 Pages 164-176  
  Keywords  
  Abstract Most of the techniques proposed for tackling the Structure from Motion (SFM) problem cannot deal with high percentages of missing data in the matrix of trajectories. Furthermore, an additional problem must be faced when working with multiple-object scenes: the rank of the matrix of trajectories has to be estimated. This paper presents an iterative multiresolution scheme for SFM with missing data to be used in both the single- and multiple-object cases. The proposed scheme aims at recovering missing entries in the original input matrix. The objective is to improve the results by applying a factorization technique to the partially or totally filled-in matrix instead of to the original input one. Experimental results obtained with synthetic and real data sequences, containing single and multiple objects, are presented to show the viability of the proposed approach.  
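The underlying "fill in, then factorize" idea can be illustrated with a low-rank completion of the trajectory matrix. The sketch below is a single-resolution alternating least squares pass on random data; the paper's iterative multiresolution scheme and rank estimation are not reproduced, and the rank and iteration counts here are arbitrary.

```python
# Rank-r completion of a trajectory matrix with missing entries via alternating least squares.
import numpy as np

def als_complete(W, M, r=4, n_iter=50):
    """W: (2F, P) matrix of trajectories; M: boolean mask of observed entries."""
    F2, P = W.shape
    A = np.random.randn(F2, r); B = np.random.randn(P, r)
    for _ in range(n_iter):
        for j in range(P):                                # update each structure column
            rows = M[:, j]
            B[j] = np.linalg.lstsq(A[rows], W[rows, j], rcond=None)[0]
        for i in range(F2):                               # update each motion row
            cols = M[i, :]
            A[i] = np.linalg.lstsq(B[cols], W[i, cols], rcond=None)[0]
    return A @ B.T                                        # filled-in matrix of trajectories

W = np.random.randn(20, 4) @ np.random.randn(4, 30)       # synthetic rank-4 trajectories
M = np.random.rand(*W.shape) > 0.4                        # ~40% of entries missing
print(np.abs(als_complete(W, M) - W).mean())              # reconstruction error
```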
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 0262-8856 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number ADAS @ adas @ JSL2010 Serial 1278  
 

 
Author Daniel Marczak; Grzegorz Rypesc; Sebastian Cygert; Tomasz Trzcinski; Bartłomiej Twardowski
  Title Generalized Continual Category Discovery Type Miscellaneous
  Year 2023 Publication arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Most Continual Learning (CL) methods push the limits of supervised learning settings, where an agent is expected to learn new labeled tasks without forgetting previous knowledge. However, these settings are not well aligned with real-life scenarios, where a learning agent has access to a vast amount of unlabeled data encompassing both novel (entirely unlabeled) classes and examples from known classes. Drawing inspiration from Generalized Category Discovery (GCD), we introduce a novel framework that relaxes this assumption. Precisely, in any task we allow for the existence of novel and known classes, and one must use continual versions of unsupervised learning methods to discover them. We call this setting Generalized Continual Category Discovery (GCCD). It unifies CL and GCD, bridging the gap between synthetic benchmarks and real-life scenarios. With a series of experiments, we show that existing methods fail to accumulate knowledge from subsequent tasks in which unlabeled samples of novel classes are present. In light of these limitations, we propose a method that incorporates both supervised and unsupervised signals and mitigates forgetting through the use of centroid adaptation. Our method surpasses strong CL methods adopted for GCD techniques and presents superior representation learning performance.  
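As a loose illustration of what "centroid adaptation" could mean in practice (this is an assumption about the mechanism, not the authors' method), the sketch below assigns unlabeled features to their nearest class centroid and then moves each centroid by an exponential moving average, so that representations learned on earlier tasks drift slowly rather than being overwritten.

```python
# Toy centroid adaptation: nearest-centroid assignment + EMA update of the centroids.
import numpy as np

def adapt_centroids(centroids, feats, momentum=0.9):
    """centroids: (K, D); feats: (N, D) unlabeled features from the current task."""
    d = ((feats[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)   # squared distances
    assign = d.argmin(1)                                             # pseudo-labels
    new = centroids.copy()
    for k in range(len(centroids)):
        if (assign == k).any():
            new[k] = momentum * centroids[k] + (1 - momentum) * feats[assign == k].mean(0)
    return new, assign

cents = np.random.randn(5, 16)                 # centroids carried over from previous tasks
x = np.random.randn(100, 16)                   # unlabeled features of the new task
cents, pseudo = adapt_centroids(cents, x)
print(np.bincount(pseudo, minlength=5))        # how the new data spreads over old classes
```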
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP Approved no  
  Call Number Admin @ si @ MRC2023 Serial 3985  
 

 
Author Gemma Roig; Xavier Boix; R. de Nijs; Sebastian Ramos; K. Kühnlenz; Luc Van Gool
  Title Active MAP Inference in CRFs for Efficient Semantic Segmentation Type Conference Article
  Year 2013 Publication 15th IEEE International Conference on Computer Vision Abbreviated Journal  
  Volume Issue Pages 2312 - 2319  
  Keywords Semantic Segmentation  
  Abstract Most MAP inference algorithms for CRFs optimize an energy function knowing all the potentials. In this paper, we focus on CRFs where the computational cost of instantiating the potentials is orders of magnitude higher than MAP inference. This is often the case in semantic image segmentation, where most potentials are instantiated by slow classifiers fed with costly features. We introduce Active MAP inference to 1) select on the fly a subset of potentials to be instantiated in the energy function, leaving the rest of the parameters of the potentials unknown, and 2) estimate the MAP labeling from such an incomplete energy function. Results for semantic segmentation benchmarks, namely PASCAL VOC 2010 [5] and MSRC-21 [19], show that Active MAP inference achieves similar levels of accuracy but with major efficiency gains.  
  Address Sydney; Australia; December 2013  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1550-5499 ISBN Medium  
  Area Expedition Conference ICCV  
  Notes ADAS; 600.057 Approved no  
  Call Number ADAS @ adas @ RBN2013 Serial 2377  
 

 
Author Xavier Soria; Yachuan Li; Mohammad Rouhani; Angel Sappa
  Title Tiny and Efficient Model for the Edge Detection Generalization Type Conference Article
  Year 2023 Publication Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Most high-level computer vision tasks rely on low-level image operations as their initial processes. Operations such as edge detection, image enhancement, and super-resolution provide the foundations for higher-level image analysis. In this work we address edge detection with three main objectives: simplicity, efficiency, and generalization, since current state-of-the-art (SOTA) edge detection models have grown in complexity in pursuit of better accuracy. To achieve this, we present the Tiny and Efficient Edge Detector (TEED), a light convolutional neural network with only 58K parameters, less than 0.2% of the size of the state-of-the-art models. Training on the BIPED dataset takes less than 30 minutes, with each epoch requiring less than 5 minutes. Our proposed model is easy to train and quickly converges within the very first few epochs, while the predicted edge maps are crisp and of high quality. Additionally, we propose a new dataset to test the generalization of edge detection, which comprises samples from popular images used in edge detection and image segmentation. The source code is available at https://github.com/xavysp/TEED.  
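TEED's actual architecture and weights live in the linked repository; the toy PyTorch network below is only meant to show what a "tiny" per-pixel edge predictor with a parameter count in the low thousands looks like, and is an assumption rather than the published model.

```python
# Deliberately tiny per-pixel edge predictor (illustrative only, not the TEED architecture).
import torch
import torch.nn as nn

class TinyEdge(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(16, 1, 1),              # per-pixel edge logit
        )

    def forward(self, x):
        return torch.sigmoid(self.body(x))    # edge-probability map, same H x W as input

model = TinyEdge()
print(sum(p.numel() for p in model.parameters()))   # a few thousand parameters
edges = model(torch.rand(1, 3, 224, 224))            # (1, 1, 224, 224)
```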
  Address Paris; France; October 2023  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICCVW  
  Notes MSIAU Approved no  
  Call Number Admin @ si @ SLR2023 Serial 3941  
 

 
Author Shida Beigpour; Christian Riess; Joost Van de Weijer; Elli Angelopoulou
  Title Multi-Illuminant Estimation with Conditional Random Fields Type Journal Article
  Year 2014 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 23 Issue 1 Pages 83-95  
  Keywords color constancy; CRF; multi-illuminant  
  Abstract Most existing color constancy algorithms assume uniform illumination. However, in real-world scenes this is often not the case. Thus, we propose a novel framework for estimating the colors of multiple illuminants and their spatial distribution in the scene. We formulate this problem as an energy minimization task within a conditional random field over a set of local illuminant estimates. In order to quantitatively evaluate the proposed method, we created a novel data set of two-dominant-illuminant images comprising laboratory, indoor, and outdoor scenes. Unlike prior work, our database includes accurate pixel-wise ground-truth illuminant information. The performance of our method is evaluated on multiple data sets. Experimental results show that our framework clearly outperforms single-illuminant estimators as well as a recently proposed multi-illuminant estimation approach.  
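A very crude stand-in for the CRF formulation above: compute a grey-world illuminant estimate per patch (the "local illuminant estimates"), then smooth the grid by repeated neighbour averaging as a surrogate for the pairwise smoothness terms. The patch size, smoothing weight and border handling are arbitrary assumptions; the paper minimizes a proper CRF energy instead.

```python
# Per-patch grey-world estimates + naive spatial smoothing of the illuminant map.
import numpy as np

def local_illuminants(img, patch=32, n_smooth=10, lam=0.5):
    H, W, _ = img.shape
    gh, gw = H // patch, W // patch
    est = np.zeros((gh, gw, 3))
    for i in range(gh):
        for j in range(gw):
            p = img[i*patch:(i+1)*patch, j*patch:(j+1)*patch]
            est[i, j] = p.reshape(-1, 3).mean(0)            # grey-world per patch
    est /= np.linalg.norm(est, axis=2, keepdims=True) + 1e-8 # keep only chromaticity
    for _ in range(n_smooth):                               # crude pairwise smoothing
        nb = (np.roll(est, 1, 0) + np.roll(est, -1, 0) +
              np.roll(est, 1, 1) + np.roll(est, -1, 1)) / 4.0
        est = (1 - lam) * est + lam * nb
    return est                                              # (gh, gw, 3) illuminant map

img = np.random.rand(256, 256, 3)
print(local_illuminants(img).shape)                         # (8, 8, 3)
```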
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1057-7149 ISBN Medium  
  Area Expedition Conference  
  Notes CIC; LAMP; 600.074; 600.079 Approved no  
  Call Number Admin @ si @ BRW2014 Serial 2451  
 

 
Author M. Visani; Oriol Ramos Terrades; Salvatore Tabbone
  Title A Protocol to Characterize the Descriptive Power and the Complementarity of Shape Descriptors Type Journal Article
  Year 2011 Publication International Journal on Document Analysis and Recognition Abbreviated Journal IJDAR  
  Volume 14 Issue 1 Pages 87-100  
  Keywords Document analysis; Shape descriptors; Symbol description; Performance characterization; Complementarity analysis  
  Abstract Most document analysis applications rely on the extraction of shape descriptors, which may be grouped into different categories, each category having its own advantages and drawbacks (O.R. Terrades et al. in Proceedings of ICDAR’07, pp. 227–231, 2007). In order to improve the richness of their description, many authors choose to combine multiple descriptors. Yet, most of the authors who propose a new descriptor content themselves with comparing its performance to the performance of a set of single state-of-the-art descriptors in a specific applicative context (e.g. symbol recognition, symbol spotting...). This results in a proliferation of the shape descriptors proposed in the literature. In this article, we propose an innovative protocol, the originality of which is to be as independent of the final application as possible and which relies on new quantitative and qualitative measures. We introduce two types of measures: while the measures of the first type are intended to characterize the descriptive power (in terms of uniqueness, distinctiveness and robustness towards noise) of a descriptor, the second type of measures characterizes the complementarity between multiple descriptors. Characterizing the complementarity of shape descriptors upstream is an alternative to the usual approach where the descriptors to be combined are selected by trial and error, considering the performance characteristics of the overall system. To illustrate the contribution of this protocol, we performed experimental studies using a set of descriptors and a set of symbols which are widely used by the community, namely the ART and SC descriptors and the GREC 2003 database.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; IF 1.091 Approved no  
  Call Number Admin @ si @VRT2011 Serial 1856  
 

 
Author Santiago Segui; Laura Igual; Jordi Vitria
  Title Weighted Bagging for Graph based One-Class Classifiers Type Conference Article
  Year 2010 Publication 9th International Workshop on Multiple Classifier Systems Abbreviated Journal  
  Volume 5997 Issue Pages 1-10  
  Keywords  
  Abstract Most conventional learning algorithms require both positive and negative training data for achieving accurate classification results. However, the problem of learning classifiers from only positive data arises in many applications where negative data are too costly, difficult to obtain, or not available at all. Minimum Spanning Tree Class Descriptor (MSTCD) was presented as a method that achieves better accuracies than other one-class classifiers in high dimensional data. However, the presence of outliers in the target class severely harms the performance of this classifier. In this paper we propose two bagging strategies for MSTCD that reduce the influence of outliers in training data. We show the improved performance on both real and artificially contaminated data.  
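To make the MSTCD idea concrete, here is a bare-bones, unweighted version with plain bagging: a test point is scored by its distance to the nearest edge of a minimum spanning tree fit on the target class, and bagging averages scores over MSTs fit on bootstrap samples. This is only a sketch of the baseline classifier; the paper's weighted bagging strategies are not reproduced.

```python
# Unweighted MSTCD with plain bagging: low score = close to the target-class MST.
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse.csgraph import minimum_spanning_tree

def point_to_segment(x, a, b):
    ab = b - a
    t = np.clip(np.dot(x - a, ab) / (np.dot(ab, ab) + 1e-12), 0.0, 1.0)
    return np.linalg.norm(x - (a + t * ab))

def mstcd_score(X_train, x):
    T = minimum_spanning_tree(cdist(X_train, X_train)).tocoo()
    return min(point_to_segment(x, X_train[i], X_train[j]) for i, j in zip(T.row, T.col))

def bagged_score(X_train, x, n_bags=10, seed=0):
    rng = np.random.default_rng(seed)
    idx = [rng.choice(len(X_train), len(X_train)) for _ in range(n_bags)]  # bootstraps
    return np.mean([mstcd_score(X_train[i], x) for i in idx])

X = np.random.randn(50, 5)                                  # positive (target) class only
print(bagged_score(X, np.zeros(5)), bagged_score(X, 10 * np.ones(5)))   # inlier vs outlier
```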
  Address Cairo, Egypt  
  Corporate Author Thesis  
  Publisher Springer Berlin Heidelberg Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN 0302-9743 ISBN 978-3-642-12126-5 Medium  
  Area Expedition Conference MCS  
  Notes MILAB;OR;MV Approved no  
  Call Number BCNPCL @ bcnpcl @ SIV2010 Serial 1284  
 

 
Author Fahad Shahbaz Khan; Joost Van de Weijer; Muhammad Anwer Rao; Andrew Bagdanov; Michael Felsberg; Jorma
  Title Scale coding bag of deep features for human attribute and action recognition Type Journal Article
  Year 2018 Publication Machine Vision and Applications Abbreviated Journal MVAP  
  Volume 29 Issue 1 Pages 55-71  
  Keywords Action recognition; Attribute recognition; Bag of deep features  
  Abstract Most approaches to human attribute and action recognition in still images are based on image representation in which multi-scale local features are pooled across scale into a single, scale-invariant encoding. Both in bag-of-words and the recently popular representations based on convolutional neural networks, local features are computed at multiple scales. However, these multi-scale convolutional features are pooled into a single scale-invariant representation. We argue that entirely scale-invariant image representations are sub-optimal and investigate approaches to scale coding within a bag of deep features framework. Our approach encodes multi-scale information explicitly during the image encoding stage. We propose two strategies to encode multi-scale information explicitly in the final image representation. We validate our two scale coding techniques on five datasets: Willow, PASCAL VOC 2010, PASCAL VOC 2012, Stanford-40 and Human Attributes (HAT-27). On all datasets, the proposed scale coding approaches outperform both the scale-invariant method and the standard deep features of the same network. Further, combining our scale coding approaches with standard deep features leads to consistent improvement over the state of the art.  
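The contrast between scale-invariant pooling and explicit scale coding can be shown in a few lines: features extracted at several scales are either pooled into one vector or pooled per coarse scale bin and concatenated. The number of scale bins and the pooling operator below are assumptions for illustration, not the paper's two specific coding strategies.

```python
# Scale-invariant pooling vs. explicit scale coding of multi-scale local features.
import numpy as np

def pool_scale_invariant(feats_per_scale):
    return np.mean(np.concatenate(feats_per_scale, axis=0), axis=0)      # one vector

def pool_scale_coded(feats_per_scale, bins=((0, 1), (1, 3), (3, 10))):
    coded = []
    for lo, hi in bins:                               # group scale indices into coarse bins
        chunk = [f for s, f in enumerate(feats_per_scale) if lo <= s < hi]
        coded.append(np.mean(np.concatenate(chunk, axis=0), axis=0))
    return np.concatenate(coded)                      # dimensionality = n_bins * feat_dim

feats = [np.random.rand(np.random.randint(5, 20), 128) for _ in range(6)]   # 6 scales
print(pool_scale_invariant(feats).shape, pool_scale_coded(feats).shape)     # (128,) (384,)
```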
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.068; 600.079; 600.106; 600.120 Approved no  
  Call Number Admin @ si @ KWR2018 Serial 3107  
 

 
Author Antonio Hernandez; Carlos Primo; Sergio Escalera
  Title Automatic user interaction correction via Multi-label Graph cuts Type Conference Article
  Year 2011 Publication ICCV 2011 1st IEEE International Workshop on Human Interaction in Computer Vision (HICV) Abbreviated Journal  
  Volume Issue Pages 1276-1281  
  Keywords  
  Abstract Most applications in image segmentation require user interaction in order to achieve accurate results. However, users want to achieve the desired segmentation accuracy while reducing the effort of manual labelling. In this work, we extend the standard multi-label α-expansion Graph Cut algorithm so that it analyzes the interaction of the user in order to modify the object model and improve the final segmentation of objects. The approach is inspired by the fact that fast user interactions may introduce some pixel errors, confusing object and background. Our results with different degrees of user interaction and input errors show the high performance of the proposed approach on a multi-label human limb segmentation problem compared with the classical α-expansion algorithm.  
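α-expansion requires a max-flow solver, so to stay self-contained the sketch below substitutes plain iterated conditional modes (ICM) on a multi-label Potts energy. It is a much weaker optimizer than the paper's Graph Cut, but it shows where user-derived unary potentials and the pairwise smoothness term would enter; the unary costs and the smoothness weight here are placeholders.

```python
# ICM on a multi-label Potts energy: unary costs per pixel/label + neighbour smoothness.
import numpy as np

def icm_multilabel(unary, lam=1.0, n_iter=5):
    """unary: (H, W, L) cost per pixel and label; returns an (H, W) labeling."""
    H, W, L = unary.shape
    labels = unary.argmin(axis=2)                      # initial labeling from unaries only
    for _ in range(n_iter):
        for y in range(H):
            for x in range(W):
                costs = unary[y, x].copy()
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < H and 0 <= nx < W:
                        costs += lam * (np.arange(L) != labels[ny, nx])   # Potts penalty
                labels[y, x] = costs.argmin()
    return labels

U = np.random.rand(40, 40, 3)                          # e.g. negative log-probs per limb label
print(np.bincount(icm_multilabel(U).ravel(), minlength=3))
```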
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-1-4673-0062-9 Medium  
  Area Expedition Conference HICV  
  Notes MILAB; HuPBA Approved no  
  Call Number Admin @ si @ HPE2011 Serial 1892  