Author Shida Beigpour; Joost Van de Weijer
Title Object Recoloring Based on Intrinsic Image Estimation Type Conference Article
Year 2011 Publication 13th IEEE International Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 327 - 334  
Keywords  
Abstract Object recoloring is one of the most popular photo-editing tasks. The problem of object recoloring is highly under-constrained, and existing recoloring methods limit their application to objects lit by a white illuminant. Applying these methods to real-world scenes lit by colored illuminants, multiple illuminants, or interreflections results in unrealistic recoloring of objects. In this paper, we focus on the recoloring of single-colored objects presegmented from their background. The single-color constraint allows us to fit a more comprehensive physical model to the object. We demonstrate that this permits realistic recoloring of objects lit by non-white illuminants and multiple illuminants. Moreover, the model allows for more realistic handling of illuminant alteration of the scene. Recoloring results on images captured by uncalibrated cameras demonstrate that the proposed framework obtains realistic recoloring for complex natural images. Furthermore, we use the model to transfer color between objects and show that the results are more realistic than those of existing color transfer methods.
Address Barcelona  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 1550-5499 ISBN 978-1-4577-1101-5 Medium  
Area Expedition Conference ICCV  
Notes CIC Approved no  
Call Number (up) Admin @ si @ BeW2011 Serial 1781  
 

 
Author Xavier Boix; Josep M. Gonfaus; Joost Van de Weijer; Andrew Bagdanov; Joan Serrat; Jordi Gonzalez
Title Harmony Potentials: Fusing Global and Local Scale for Semantic Image Segmentation Type Journal Article
Year 2012 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
Volume 96 Issue 1 Pages 83-102  
Keywords  
Abstract The Hierarchical Conditional Random Field (HCRF) model has been successfully applied to a number of image labeling problems, including image segmentation. However, existing HCRF models of image segmentation do not allow multiple classes to be assigned to a single region, which limits their ability to incorporate contextual information across multiple scales. At higher scales in the image, this representation yields an oversimplified model, since multiple classes can reasonably be expected to appear within large regions. This simplified model particularly limits the impact of information at higher scales. Since class-label information at these scales is usually more reliable than at lower, noisier scales, neglecting this information is undesirable. To address these issues, we propose a new consistency potential for image labeling problems, which we call the harmony potential. It can encode any possible combination of labels, penalizing only unlikely combinations of classes. We also propose an effective sampling strategy over this expanded label set that renders tractable the underlying optimization problem. Our approach obtains state-of-the-art results on two challenging, standard benchmark datasets for semantic image segmentation: PASCAL VOC 2010 and MSRC-21.
 
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 0920-5691 ISBN Medium  
Area Expedition Conference  
Notes ISE;CIC;ADAS Approved no  
Call Number (up) Admin @ si @ BGW2012 Serial 1718  
 

 
Author Xavier Boix
Title Learning Conditional Random Fields for Stereo Type Report
Year 2009 Publication CVC Technical Report Abbreviated Journal  
Volume 136 Issue Pages  
Keywords  
Abstract  
Address  
Corporate Author Computer Vision Center Thesis Master's thesis  
Publisher Place of Publication Bellaterra, Barcelona Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference  
Notes CIC Approved no  
Call Number (up) Admin @ si @ Boi2009 Serial 2395  
 

 
Author Shida Beigpour; Christian Riess; Joost Van de Weijer; Elli Angelopoulou
Title Multi-Illuminant Estimation with Conditional Random Fields Type Journal Article
Year 2014 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
Volume 23 Issue 1 Pages 83-95  
Keywords color constancy; CRF; multi-illuminant  
Abstract Most existing color constancy algorithms assume uniform illumination. However, in real-world scenes, this is not often the case. Thus, we propose a novel framework for estimating the colors of multiple illuminants and their spatial distribution in the scene. We formulate this problem as an energy minimization task within a conditional random field over a set of local illuminant estimates. In order to quantitatively evaluate the proposed method, we created a novel data set of two-dominant-illuminant images comprised of laboratory, indoor, and outdoor scenes. Unlike prior work, our database includes accurate pixel-wise ground truth illuminant information. The performance of our method is evaluated on multiple data sets. Experimental results show that our framework clearly outperforms single illuminant estimators as well as a recently proposed multi-illuminant estimation approach.  
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 1057-7149 ISBN Medium  
Area Expedition Conference  
Notes CIC; LAMP; 600.074; 600.079 Approved no  
Call Number (up) Admin @ si @ BRW2014 Serial 2451  
 

 
Author Shida Beigpour; Marc Serra; Joost Van de Weijer; Robert Benavente; Maria Vanrell; Olivier Penacchio; Dimitris Samaras
Title Intrinsic Image Evaluation On Synthetic Complex Scenes Type Conference Article
Year 2013 Publication 20th IEEE International Conference on Image Processing Abbreviated Journal  
Volume Issue Pages 285 - 289  
Keywords  
Abstract Scene decomposition into its illuminant, shading, and reflectance intrinsic images is an essential step for scene understanding. Collecting intrinsic image ground-truth data is a laborious task. The assumptions on which the ground-truth procedures are based limit their application to simple scenes with a single object taken in the absence of indirect lighting and interreflections. We investigate synthetic data for intrinsic image research, since the extraction of ground truth is straightforward and it allows for scenes in more realistic situations (e.g., multiple illuminants and interreflections). With this dataset we aim to motivate researchers to further explore intrinsic image decomposition in complex scenes.
 
Address Melbourne; Australia; September 2013  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference ICIP  
Notes CIC; 600.048; 600.052; 600.051 Approved no  
Call Number (up) Admin @ si @ BSW2013 Serial 2264  
 

 
Author Xim Cerda-Company; C. Alejandro Parraga; Xavier Otazu
Title Which tone-mapping is the best? A comparative study of tone-mapping perceived quality Type Abstract
Year 2014 Publication Perception Abbreviated Journal  
Volume 43 Issue Pages 106  
Keywords  
Abstract High-dynamic-range (HDR) imaging refers to methods designed to increase the brightness dynamic range available in standard digital imaging techniques. This increase is achieved by taking the same picture under different exposure values and mapping the intensity levels into a single image by way of a tone-mapping operator (TMO). Currently, there is no agreement on how to evaluate the quality of different TMOs. In this work we psychophysically evaluate 15 different TMOs, obtaining rankings based on the perceived properties of the resulting tone-mapped images. We performed two different experiments on a calibrated CRT display using 10 subjects: (1) a study of the internal relationships between grey-levels and (2) a pairwise comparison of the resulting 15 tone-mapped images. In (1), observers internally matched the grey-levels to a reference inside the tone-mapped images and in the real scene. In (2), observers performed a pairwise comparison of the tone-mapped images alongside the real scene. We obtained two rankings of the TMOs according to their performance. In (1) the best algorithm was iCAM by J. Kuang et al. (2007), and in (2) the best algorithm was the TMO by Krawczyk et al. (2005). Our results also show no correlation between these two rankings.
 
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference ECVP  
Notes CIC; NEUROBIT; 600.074 Approved no  
Call Number (up) Admin @ si @ CPO2014 Serial 2527  
 

 
Author Marcos V Conde; Javier Vazquez; Michael S Brown; Radu Timofte
Title NILUT: Conditional Neural Implicit 3D Lookup Tables for Image Enhancement Type Conference Article
Year 2024 Publication 38th AAAI Conference on Artificial Intelligence Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract 3D lookup tables (3D LUTs) are a key component for image enhancement. Modern image signal processors (ISPs) have dedicated support for these as part of the camera rendering pipeline. Cameras typically provide multiple options for picture styles, where each style is usually obtained by applying a unique handcrafted 3D LUT. Current approaches for learning and applying 3D LUTs are notably fast, yet not so memory-efficient, as storing multiple 3D LUTs is required. For this reason and other implementation limitations, their use on mobile devices is less popular. In this work, we propose a Neural Implicit LUT (NILUT), an implicitly defined continuous 3D color transformation parameterized by a neural network. We show that NILUTs are capable of accurately emulating real 3D LUTs. Moreover, a NILUT can be extended to incorporate multiple styles into a single network with the ability to blend styles implicitly. Our novel approach is memory-efficient, controllable and can complement previous methods, including learned ISPs.  
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference AAAI  
Notes CIC; MACO Approved no  
Call Number (up) Admin @ si @ CVB2024 Serial 3872  
 

 
Author Trevor Canham; Javier Vazquez; D Long; Richard F. Murray; Michael S Brown
Title Noise Prism: A Novel Multispectral Visualization Technique Type Conference Article
Year 2021 Publication 31st Color and Imaging Conference Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract A novel technique for visualizing multispectral images is proposed. Inspired by how prisms work, our method spreads spectral information over a chromatic noise pattern. This is accomplished by populating the pattern with pixels representing each measurement band at a count proportional to its measured intensity. The method is advantageous because it allows for lightweight encoding and visualization of spectral information while maintaining the color appearance of the stimulus. A four-alternative forced-choice (4AFC) experiment was conducted to validate the method’s information-carrying capacity in displaying metameric stimuli of varying colors and spectral basis functions. The scores ranged from 100% to 20% (less than chance given the 4AFC task), with many conditions falling somewhere in between at statistically significant intervals. Using this data, color and texture difference metrics can be evaluated and optimized to predict the legibility of the visualization technique.
 
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference CIC  
Notes MACO; CIC Approved no  
Call Number (up) Admin @ si @ CVL2021 Serial 4000  
 

 
Author Trevor Canham; Javier Vazquez; Elise Mathieu; Marcelo Bertalmío
Title Matching visual induction effects on screens of different size Type Journal Article
Year 2021 Publication Journal of Vision Abbreviated Journal JOV  
Volume 21 Issue 6(10) Pages 1-22  
Keywords  
Abstract In the film industry, the same movie is expected to be watched on displays of vastly different sizes, from cinema screens to mobile phones. But visual induction, the perceptual phenomenon by which the appearance of a scene region is affected by its surroundings, will be different for the same image shown on two displays of different dimensions. This phenomenon presents a practical challenge for the preservation of the artistic intentions of filmmakers, because it can lead to shifts in image appearance between viewing destinations. In this work, we show that a neural field model based on the efficient representation principle is able to predict induction effects and how, by regularizing its associated energy functional, the model is still able to represent induction but is now invertible. From this finding, we propose a method to preprocess an image in a screen–size dependent way so that its perception, in terms of visual induction, may remain constant across displays of different size. The potential of the method is demonstrated through psychophysical experiments on synthetic images and qualitative examples on natural images.  
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference  
Notes CIC Approved no  
Call Number (up) Admin @ si @ CVM2021 Serial 3595  
 

 
Author Marcos V Conde; Florin Vasluianu; Javier Vazquez; Radu Timofte
Title Perceptual image enhancement for smartphone real-time applications Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Abbreviated Journal  
Volume Issue Pages 1848-1858  
Keywords  
Abstract Recent advances in camera designs and imaging pipelines allow us to capture high-quality images using smartphones. However, due to the small size and lens limitations of smartphone cameras, we commonly find artifacts or degradation in the processed images. The most common unpleasant effects are noise artifacts, diffraction artifacts, blur, and HDR overexposure. Deep learning methods for image restoration can successfully remove these artifacts. However, most approaches are not suitable for real-time applications on mobile devices due to their heavy computation and memory requirements. In this paper, we propose LPIENet, a lightweight network for perceptual image enhancement, with the focus on deploying it on smartphones. Our experiments show that, with far fewer parameters and operations, our model can deal with the mentioned artifacts and achieve competitive performance compared with state-of-the-art methods on standard benchmarks. Moreover, to prove the efficiency and reliability of our approach, we deployed the model directly on commercial smartphones and evaluated its performance. Our model can process 2K-resolution images in under 1 second on mid-level commercial smartphones.
Address Waikoloa; Hawaii; USA; January 2023
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference WACV  
Notes MACO; CIC Approved no  
Call Number (up) Admin @ si @ CVV2023 Serial 3900  
 

 
Author Maria del Camp Davesa
Title Human action categorization in image sequences Type Report
Year 2011 Publication CVC Technical Report Abbreviated Journal  
Volume 169 Issue Pages  
Keywords  
Abstract  
Address Bellaterra (Spain)  
Corporate Author Computer Vision Center Thesis Master's thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference  
Notes CIC Approved no
Call Number (up) Admin @ si @ Dav2011 Serial 1934  
 

 
Author M. Danelljan; Fahad Shahbaz Khan; Michael Felsberg; Joost Van de Weijer
Title Adaptive color attributes for real-time visual tracking Type Conference Article
Year 2014 Publication 27th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal  
Volume Issue Pages 1090 - 1097  
Keywords  
Abstract Visual tracking is a challenging problem in computer vision. Most state-of-the-art visual trackers either rely on luminance information or use simple color representations for image description. Contrary to visual tracking, for object recognition and detection, sophisticated color features combined with luminance have been shown to provide excellent performance. Due to the complexity of the tracking problem, the desired color feature should be computationally efficient and possess a certain amount of photometric invariance while maintaining high discriminative power. This paper investigates the contribution of color in a tracking-by-detection framework. Our results suggest that color attributes provide superior performance for visual tracking. We further propose an adaptive low-dimensional variant of color attributes. Both quantitative and attribute-based evaluations are performed on 41 challenging benchmark color sequences. The proposed approach improves the baseline intensity-based tracker by 24% in median distance precision. Furthermore, we show that our approach outperforms state-of-the-art tracking methods while running at more than 100 frames per second.
 
Address Columbus; OH; USA; June 2014
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference CVPR  
Notes CIC; LAMP; 600.074; 600.079 Approved no  
Call Number (up) Admin @ si @ DKF2014 Serial 2509  
 

 
Author Sagnik Das; Hassan Ahmed Sial; Ke Ma; Ramon Baldrich; Maria Vanrell; Dimitris Samaras
Title Intrinsic Decomposition of Document Images In-the-Wild Type Conference Article
Year 2020 Publication 31st British Machine Vision Conference Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract Automatic document content processing is affected by artifacts caused by the shape of the paper and by non-uniform, diversely colored lighting conditions. Training fully supervised methods on real data is impossible due to the large amount of data needed. Hence, current state-of-the-art deep learning models are trained on fully or partially synthetic images. However, document shadow or shading removal results still suffer because: (a) prior methods rely on uniformity of local color statistics, which limits their application to real scenarios with complex document shapes and textures; and (b) synthetic or hybrid datasets with non-realistic, simulated lighting conditions are used to train the models. In this paper we tackle these problems with our two main contributions. First, a physically constrained learning-based method that directly estimates document reflectance based on intrinsic image formation, which generalizes to challenging illumination conditions. Second, a new dataset that clearly improves on previous synthetic ones by adding a large range of realistic shading and diverse multi-illuminant conditions, uniquely customized to deal with documents in-the-wild. The proposed architecture works in two steps. First, a white balancing module neutralizes the color of the illumination on the input image; based on the proposed multi-illuminant dataset, we achieve good white balancing in very difficult conditions. Second, the shading separation module accurately disentangles the shading and paper material in a self-supervised manner, where only the synthetic texture is used as a weak training signal (obviating the need for very costly ground truth with disentangled versions of shading and reflectance). The proposed approach leads to significant generalization of document reflectance estimation in real scenes with challenging illumination. We evaluate extensively on the real benchmark datasets available for intrinsic image decomposition and document shadow removal tasks. Our reflectance estimation scheme, when used as a pre-processing step of an OCR pipeline, shows a 21% improvement in character error rate (CER), proving its practical applicability. The data and code will be available at: https://github.com/cvlab-stonybrook/DocIIW.
 
Address Virtual; September 2020  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference BMVC  
Notes CIC; 600.087; 600.140; 600.118 Approved no  
Call Number (up) Admin @ si @ DSM2020 Serial 3461  
 

 
Author Noha Elfiky; Fahad Shahbaz Khan; Joost Van de Weijer; Jordi Gonzalez
Title Discriminative Compact Pyramids for Object and Scene Recognition Type Journal Article
Year 2012 Publication Pattern Recognition Abbreviated Journal PR  
Volume 45 Issue 4 Pages 1627-1636  
Keywords  
Abstract Spatial pyramids have been successfully applied to incorporating spatial information into bag-of-words based image representation. However, a major drawback is that it leads to high dimensional image representations. In this paper, we present a novel framework for obtaining compact pyramid representation. First, we investigate the usage of the divisive information theoretic feature clustering (DITC) algorithm in creating a compact pyramid representation. In many cases this method allows us to reduce the size of a high dimensional pyramid representation up to an order of magnitude with little or no loss in accuracy. Furthermore, comparison to clustering based on agglomerative information bottleneck (AIB) shows that our method obtains superior results at significantly lower computational costs. Moreover, we investigate the optimal combination of multiple features in the context of our compact pyramid representation. Finally, experiments show that the method can obtain state-of-the-art results on several challenging data sets.  
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 0031-3203 ISBN Medium  
Area Expedition Conference  
Notes ISE; CAT;CIC Approved no  
Call Number (up) Admin @ si @ EKW2012 Serial 1807  
 

 
Author Alicia Fornes; Xavier Otazu; Josep Llados
Title Show through cancellation and image enhancement by multiresolution contrast processing Type Conference Article
Year 2013 Publication 12th International Conference on Document Analysis and Recognition Abbreviated Journal  
Volume Issue Pages 200-204  
Keywords  
Abstract Historical documents suffer from different types of degradation and noise, such as background variation, uneven illumination, or dark spots. In double-sided documents, another common problem is that the back side of the document interferes with the front side because of the transparency of the paper or ink bleeding; this effect is called the show-through phenomenon. Many methods have been developed to solve these problems; in the case of show-through, they work by scanning and matching both the front and back sides of the document. In contrast, our approach is designed to use only one side of the scanned document. We hypothesize that show-through components have low contrast, while foreground components have high contrast. A Multiresolution Contrast (MC) decomposition is presented in order to estimate the contrast of features at different spatial scales. We cancel the show-through phenomenon by thresholding these low-contrast components. This decomposition is also able to enhance the image, removing shadowed areas by weighting spatial scales. Results show that the enhanced images improve the readability of the documents, allowing scholars both to recover unreadable words and to resolve ambiguities.
Address Washington; USA; August 2013  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 1520-5363 ISBN Medium  
Area Expedition Conference ICDAR  
Notes DAG; 602.006; 600.045; 600.061; 600.052;CIC Approved no  
Call Number (up) Admin @ si @ FOL2013 Serial 2241  
 

 
Author Graham D. Finlayson; Javier Vazquez; Sabine Süsstrunk; Maria Vanrell
Title Spectral sharpening by spherical sampling Type Journal Article
Year 2012 Publication Journal of the Optical Society of America A Abbreviated Journal JOSA A  
Volume 29 Issue 7 Pages 1199-1210  
Keywords  
Abstract There are many works in color that assume illumination change can be modeled by multiplying sensor responses by individual scaling factors. The early research in this area is sometimes grouped under the heading “von Kries adaptation”: the scaling factors are applied to the cone responses. In more recent studies, both in psychophysics and in computational analysis, it has been proposed that scaling factors should be applied to linear combinations of the cones that have narrower support: they should be applied to the so-called “sharp sensors.” In this paper, we generalize the computational approach to spectral sharpening in three important ways. First, we introduce spherical sampling as a tool that allows us to enumerate in a principled way all linear combinations of the cones. This allows us to, second, find the optimal sharp sensors that minimize a variety of error measures including CIE Delta E (previous work on spectral sharpening minimized RMS) and color ratio stability. Lastly, we extend the spherical sampling paradigm to the multispectral case. Here the objective is to model the interaction of light and surface in terms of color signal spectra. Spherical sampling is shown to improve on the state of the art.  
Address  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN 1084-7529 ISBN Medium  
Area Expedition Conference  
Notes CIC Approved no  
Call Number (up) Admin @ si @ FVS2012 Serial 2000  
 

 
Author Bojana Gajic; Ariel Amato; Ramon Baldrich; Carlo Gatta
Title Bag of Negatives for Siamese Architectures Type Conference Article
Year 2019 Publication 30th British Machine Vision Conference Abbreviated Journal  
Volume Issue Pages  
Keywords  
Abstract Training a Siamese architecture for re-identification with a large number of identities is a challenging task due to the difficulty of finding relevant negative samples efficiently. In this work we present Bag of Negatives (BoN), a method for accelerated and improved training of Siamese networks that scales well on datasets with a very large number of identities. BoN is an efficient and loss-independent method, able to select a bag of high quality negatives, based on a novel online hashing strategy.  
Address Cardiff; United Kingdom; September 2019  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference BMVC  
Notes CIC; 600.140; 600.118 Approved no  
Call Number (up) Admin @ si @ GAB2019b Serial 3263  
 

 
Author Bojana Gajic; Ariel Amato; Ramon Baldrich; Joost Van de Weijer; Carlo Gatta
Title Area Under the ROC Curve Maximization for Metric Learning Type Conference Article
Year 2022 Publication CVPR 2022 Workshop on Efficient Deep Learning for Computer Vision (ECV 2022, 5th Edition) Abbreviated Journal
Volume Issue Pages  
Keywords Training; Computer vision; Conferences; Area measurement; Benchmark testing; Pattern recognition  
Abstract Most popular metric learning losses have no direct relation with the evaluation metrics that are subsequently applied to evaluate their performance. We hypothesize that training a metric learning model by maximizing the area under the ROC curve (which is a typical performance measure of recognition systems) can induce an implicit ranking suitable for retrieval problems. This hypothesis is supported by previous work that proved that a curve dominates in ROC space if and only if it dominates in Precision-Recall space. To test this hypothesis, we design and maximize an approximated, derivable relaxation of the area under the ROC curve. The proposed AUC loss achieves state-of-the-art results on two large scale retrieval benchmark datasets (Stanford Online Products and DeepFashion In-Shop). Moreover, the AUC loss achieves comparable performance to more complex, domain specific, state-of-the-art methods for vehicle re-identification.  
Address New Orleans, USA; 20 June 2022  
Corporate Author Thesis  
Publisher Place of Publication Editor  
Language Summary Language Original Title  
Series Editor Series Title Abbreviated Series Title  
Series Volume Series Issue Edition  
ISSN ISBN Medium  
Area Expedition Conference CVPRW  
Notes CIC; LAMP; Approved no  
Call Number (up) Admin @ si @ GAB2022 Serial 3700  