Records | |||||
---|---|---|---|---|---|
Author | Olivier Penacchio; Laura Dempere-Marco; Xavier Otazu | ||||
Title | A Neurodynamical Model Of Brightness Induction In V1 Following Static And Dynamic Contextual Influences | Type | Abstract | ||
Year | 2012 | Publication | 8th Federation of European Neurosciences | Abbreviated Journal | |
Volume | 6 | Issue | Pages | 63-64 | |
Keywords | |||||
Abstract | Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas. Although striate cortex is traditionally regarded as an area mostly responsive to sensory (i.e. retinal) information, neurophysiological evidence suggests that perceived brightness information might be explicitly represented in V1. Such evidence has been observed both in anesthetised cats, where neuronal response modulations have been found to follow luminance changes outside the receptive fields, and in human fMRI measurements. In this work, possible neural mechanisms that offer a plausible explanation for this phenomenon are investigated. To this end, we consider the model proposed by Z. Li (Li, Network: Comput. Neural Syst., 10 (1999)), which is based on neurophysiological evidence and focuses on the part of V1 responsible for contextual influences, i.e. layer 2-3 pyramidal cells, interneurons, and horizontal intracortical connections. This model has reproduced other phenomena such as contour detection and preattentive segmentation, which share with brightness induction the relevant effect of contextual influences. We have extended the original model such that the input to the network is obtained from a complete multiscale and multiorientation wavelet decomposition, thereby allowing the recovery of an image reflecting the perceived intensity. The proposed model successfully accounts for well-known psychophysical effects for static contexts (among them White's and modified White's effects, the Todorovic, Chevreul, achromatic ring patterns, and grating induction effects) and also for brightness induction in dynamic contexts defined by modulating the luminance of surrounding areas (e.g. the brightness of a static central area is perceived to vary in antiphase to the sinusoidal luminance changes of its surroundings). This work thus suggests that intra-cortical interactions in V1 could partially explain perceptual brightness induction effects and reveals how a common general architecture may account for several different fundamental processes emerging early in the visual processing pathway. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | FENS | ||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ PDO2012b | Serial | 2181 | ||
Permanent link to this record | |||||
Author | Susana Alvarez; Anna Salvatella; Maria Vanrell; Xavier Otazu | ||||
Title | 3D Texton Spaces for color-texture retrieval | Type | Conference Article | ||
Year | 2010 | Publication | 7th International Conference on Image Analysis and Recognition | Abbreviated Journal | |
Volume | 6111 | Issue | Pages | 354–363 | |
Keywords | |||||
Abstract | Color and texture are visual cues of different nature, and their integration in a useful visual descriptor is not an easy problem. One way to combine both features is to compute spatial texture descriptors independently on each color channel. Another way is to do the integration at the descriptor level. In this case the problem of normalizing both cues arises. In this paper we solve the latter problem by fusing color and texture through distances in texton spaces. Textons are the attributes of image blobs and they are responsible for texture discrimination as defined in Julesz’s Texton theory. We describe them in two low-dimensional and uniform spaces, namely, shape and color. The dissimilarity between color texture images is computed by combining the distances in these two spaces. Following this approach, we propose our TCD descriptor, which outperforms current state-of-the-art methods in the two different approaches mentioned above, early combination with LBP and late combination with MPEG-7. This is done on an image retrieval experiment over a highly diverse texture dataset from Corel. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | A.C. Campilho and M.S. Kamel | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-13771-6 | Medium | |
Area | Expedition | Conference | ICIAR | ||
Notes | CIC | Approved | no | ||
Call Number | CAT @ cat @ ASV2010a | Serial | 1325 | ||
Permanent link to this record | |||||
Author | Jordi Roca; Maria Vanrell; C. Alejandro Parraga | ||||
Title | What is constant in colour constancy? | Type | Conference Article | ||
Year | 2012 | Publication | 6th European Conference on Colour in Graphics, Imaging and Vision | Abbreviated Journal | |
Volume | Issue | Pages | 337-343 | ||
Keywords | |||||
Abstract | Color constancy refers to the ability of the human visual system to stabilize the color appearance of surfaces under an illuminant change. In this work we studied how the interrelations among nine colors are perceived under illuminant changes, particularly whether they remain stable across 10 different conditions (5 illuminants and 2 backgrounds). To do so we have used a paradigm that measures several colors under an immersive state of adaptation. From our measures we defined a perceptual structure descriptor that is up to 87% stable over all conditions, suggesting that color category features could be used to predict color constancy. This is in agreement with previous results on the stability of border categories [1,2] and with computational color constancy algorithms [3] for estimating the scene illuminant. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 9781622767014 | Medium | ||
Area | Expedition | Conference | CGIV | ||
Notes | CIC | Approved | no | ||
Call Number | RVP2012 | Serial | 2189 | ||
Permanent link to this record | |||||
Author | C. Alejandro Parraga; Ramon Baldrich; Maria Vanrell | ||||
Title | Accurate Mapping of Natural Scenes Radiance to Cone Activation Space: A New Image Dataset | Type | Conference Article | ||
Year | 2010 | Publication | 5th European Conference on Colour in Graphics, Imaging and Vision and 12th International Symposium on Multispectral Colour Science | Abbreviated Journal | |
Volume | Issue | Pages | 50–57 | ||
Keywords | |||||
Abstract | The characterization of trichromatic cameras is usually done in terms of a device-independent color space, such as the CIE 1931 XYZ space. This is indeed convenient since it allows the testing of results against colorimetric measures. We have characterized our camera to represent human cone activation by mapping the camera sensor's (RGB) responses to human (LMS) through a polynomial transformation, which can be “customized” according to the types of scenes we want to represent. Here we present a method to test the accuracy of the camera measures and a study on how the choice of training reflectances for the polynomial may alter the results. | ||||
Address | Joensuu, Finland | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 9781617388897 | Medium | ||
Area | Expedition | Conference | CGIV/MCS | ||
Notes | CIC | Approved | no | ||
Call Number | CAT @ cat @ PBV2010a | Serial | 1322 | ||
Permanent link to this record | |||||
Author | Javier Vazquez; G. D. Finlayson; Maria Vanrell | ||||
Title | A compact singularity function to predict WCS data and unique hues | Type | Conference Article | ||
Year | 2010 | Publication | 5th European Conference on Colour in Graphics, Imaging and Vision and 12th International Symposium on Multispectral Colour Science | Abbreviated Journal | |
Volume | Issue | Pages | 33–38 | ||
Keywords | |||||
Abstract | Understanding how colour is used by the human vision system is a widely studied research field. The field, though quite advanced, still faces important unanswered questions. One of them is the explanation of the unique hues and the assignment of color names. This problem concerns the fact that different colors have a different perceptual status. Recently, Philipona and O'Regan have proposed a biological model that allows the reflection properties of any surface to be extracted independently of the lighting conditions. These invariant properties are the basis for computing a singularity index that predicts the asymmetries present in psychophysical data on unique hues and basic color categories, thereby taking a further step in their explanation. In this paper we build on their formulation and propose a new singularity index. This new formulation equally accounts for the location of the 4 peaks of the World Color Survey and has two main advantages. First, it is a simple, elegant numerical measure (the Philipona measurement is a rather cumbersome formula). Second, we develop a colour-based explanation for the measure. | ||||
Address | Joensuu, Finland | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 9781617388897 | Medium | ||
Area | Expedition | Conference | CGIV/MCS | ||
Notes | CIC | Approved | no | ||
Call Number | CAT @ cat @ VFV2010 | Serial | 1324 | ||
Permanent link to this record | |||||
Author | Jaime Moreno; Xavier Otazu; Maria Vanrell | ||||
Title | Local Perceptual Weighting in JPEG2000 for Color Images | Type | Conference Article | ||
Year | 2010 | Publication | 5th European Conference on Colour in Graphics, Imaging and Vision and 12th International Symposium on Multispectral Colour Science | Abbreviated Journal | |
Volume | Issue | Pages | 255–260 | ||
Keywords | |||||
Abstract | The aim of this work is to explain how to apply perceptual concepts to define a perceptual pre-quantizer and to improve the JPEG2000 compressor. The approach consists in quantizing wavelet transform coefficients using some properties of human visual system behavior. Noise is fatal to image compression performance, because it is both annoying for the observer and consumes excessive bandwidth when the imagery is transmitted. Perceptual pre-quantization reduces unperceivable details and thus improves both visual impression and transmission properties. The comparison between JPEG2000 without and with perceptual pre-quantization shows that the latter is not favorable in PSNR, but the recovered image is more compressed at the same or even better visual quality measured with a weighted PSNR. Perceptual criteria were taken from the CIWaM (Chromatic Induction Wavelet Model). | ||||
Address | Joensuu, Finland | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 9781617388897 | Medium | ||
Area | Expedition | Conference | CGIV/MCS | ||
Notes | CIC | Approved | no | ||
Call Number | CAT @ cat @ MOV2010a | Serial | 1307 | ||
Permanent link to this record | |||||
Author | Francesc Tous; Agnes Borras; Robert Benavente; Ramon Baldrich; Maria Vanrell; Josep Llados | ||||
Title | Textual Descriptors for browsing people by visual appearance | Type | Conference Article | ||
Year | 2002 | Publication | 5è. Congrés Català d’Intel·ligència Artificial CCIA | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Image retrieval, textual descriptors, colour naming, colour normalization, graph matching. | ||||
Abstract | This paper presents a first approach to building colour and structural descriptors for information retrieval on a people database. Queries are formulated in terms of appearance, which allows searching for people wearing specific clothes of a given colour name or texture. Descriptors are automatically computed in three essential steps: a colour naming step that labels pixels from their properties; a region segmentation step based on colour properties of pixels combined with edge information; and a high-level step that models the region arrangements in order to build the clothes structure. Results are tested on a large set of images from real scenes taken at the entrance desk of a building. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG;CIC | Approved | no | ||
Call Number | CAT @ cat @ TBB2002a | Serial | 287 | ||
Permanent link to this record | |||||
Author | Eduard Vazquez; Ramon Baldrich | ||||
Title | Colour Image Segmentation in Presence of Shadows | Type | Conference Article | ||
Year | 2008 | Publication | 4th European Conference on Colour in Graphics, Imaging and Vision Proceedings | Abbreviated Journal | |
Volume | Issue | Pages | 383–387 | ||
Keywords | |||||
Abstract | |||||
Address | Terrassa (Spain) | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CGIV08 | ||
Notes | CAT;CIC | Approved | no | ||
Call Number | CAT @ cat @ VaB2008 | Serial | 966 | ||
Permanent link to this record | |||||
Author | Javier Vazquez; Maria Vanrell; Ramon Baldrich | ||||
Title | Towards a Psychophysical Evaluation of Colour Constancy Algorithms | Type | Conference Article | ||
Year | 2008 | Publication | 4th European Conference on Colour in Graphics, Imaging and Vision Proceedings | Abbreviated Journal | |
Volume | Issue | Pages | 372–377 | ||
Keywords | |||||
Abstract | |||||
Address | Terrassa (Spain) | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CGIV08 | ||
Notes | CAT;CIC | Approved | no | ||
Call Number | CAT @ cat @ VVB2008a | Serial | 968 | ||
Permanent link to this record | |||||
Author | C. Alejandro Parraga; Robert Benavente; Maria Vanrell; Ramon Baldrich | ||||
Title | Modelling Inter-Colour Regions of Colour Naming Space | Type | Conference Article | ||
Year | 2008 | Publication | 4th European Conference on Colour in Graphics, Imaging and Vision Proceedings | Abbreviated Journal | |
Volume | Issue | Pages | 218–222 | ||
Keywords | |||||
Abstract | |||||
Address | Terrassa (Spain) | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CGIV08 | ||
Notes | CAT;CIC | Approved | no | ||
Call Number | CAT @ cat @ PBV2008 | Serial | 969 | ||
Permanent link to this record | |||||
Author | Joost Van de Weijer; Fahad Shahbaz Khan | ||||
Title | Fusing Color and Shape for Bag-of-Words Based Object Recognition | Type | Conference Article | ||
Year | 2013 | Publication | 4th Computational Color Imaging Workshop | Abbreviated Journal | |
Volume | 7786 | Issue | Pages | 25-34 | |
Keywords | Object Recognition; color features; bag-of-words; image classification | ||||
Abstract | In this article we provide an analysis of existing methods for the incorporation of color in bag-of-words based image representations. We propose a list of desired properties on the basis of which fusing methods can be compared. We discuss existing methods and indicate shortcomings of the two well-known fusing methods, namely early and late fusion. Several recent works have addressed these shortcomings by exploiting top-down information in the bag-of-words pipeline: color attention, which is motivated by human vision, and Portmanteau vocabularies, which are based on information-theoretic compression of product vocabularies. We point out several remaining challenges in cue fusion and provide directions for future research. | ||||
Address | Chiba; Japan; March 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-36699-4 | Medium | |
Area | Expedition | Conference | CCIW | ||
Notes | CIC; 600.048 | Approved | no | ||
Call Number | Admin @ si @ WeK2013 | Serial | 2283 | ||
Permanent link to this record | |||||
Author | Anna Salvatella; Maria Vanrell; Juan J. Villanueva | ||||
Title | Texture Description based on Subtexture Components | Type | Conference Article | ||
Year | 2003 | Publication | 3rd International Workshop on Texture Synthesis and Analysis | Abbreviated Journal | |
Volume | Issue | Pages | 77–82 | ||
Keywords | |||||
Abstract | |||||
Address | Nice | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 1-904410-11-1 | Medium | ||
Area | Expedition | Conference | |||
Notes | CIC | Approved | no | ||
Call Number | CAT @ cat @ SVV2003 | Serial | 422 | ||
Permanent link to this record | |||||
Author | Maria Vanrell; Naila Murray; Robert Benavente; C. Alejandro Parraga; Xavier Otazu; Ramon Baldrich | ||||
Title | Perception Based Representations for Computational Colour | Type | Conference Article | ||
Year | 2011 | Publication | 3rd International Workshop on Computational Color Imaging | Abbreviated Journal | |
Volume | 6626 | Issue | Pages | 16-30 | |
Keywords | colour perception, induction, naming, psychophysical data, saliency, segmentation | ||||
Abstract | The perceived colour of a stimulus is dependent on multiple factors stemming either from the context of the stimulus or from idiosyncrasies of the observer. The complexity involved in combining these multiple effects is the main reason for the gap between classical calibrated colour spaces from colour science and colour representations used in computer vision, where colour is just one more visual cue immersed in a digital image where surfaces, shadows and illuminants interact seemingly out of control. With the aim of advancing a few steps towards bridging this gap, we present some results on computational representations of colour for computer vision. They have been developed by introducing perceptual considerations derived from the interaction of the colour of a point with its context. We show some techniques to represent the colour of a point influenced by assimilation and contrast effects due to the image surround, and we show some results on how colour saliency can be derived in real images. We outline a model for automatic assignment of colour names to image points directly trained on psychophysical data. We show how colour segments can be perceptually grouped in the image by imposing shading coherence in the colour space. | ||||
Address | Milan, Italy | ||||
Corporate Author | Thesis | ||||
Publisher | Springer-Verlag | Place of Publication | Editor | Raimondo Schettini, Shoji Tominaga, Alain Trémeau | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-642-20403-6 | Medium | ||
Area | Expedition | Conference | CCIW | ||
Notes | CIC | Approved | no | ||
Call Number | Admin @ si @ VMB2011 | Serial | 1733 | ||
Permanent link to this record | |||||
Author | Jose Manuel Alvarez; Antonio Lopez; Ramon Baldrich | ||||
Title | Shadow Resistant Road Segmentation from a Mobile Monocular System | Type | Conference Article | ||
Year | 2007 | Publication | 3rd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA 2007), J. Marti et al. (Eds.) LNCS 4477:9–16 | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | road detection | ||||
Abstract | |||||
Address | Gerona (Spain) | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS;CIC | Approved | no | ||
Call Number | ADAS @ adas @ ALB2007 | Serial | 943 | ||
Permanent link to this record | |||||
Author | Eduard Vazquez; Ramon Baldrich; Javier Vazquez; Maria Vanrell | ||||
Title | Topological histogram reduction towards colour segmentation | Type | Book Chapter | ||
Year | 2007 | Publication | 3rd Iberian Conference on Pattern Recognition and Image Analysis (IbPRIA 2007), J. Marti et al. (Eds.) LNCS 4477:55–62 | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Gerona (Spain) | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | CIC | Approved | no | ||
Call Number | CAT @ cat @ VBV2007 | Serial | 809 | ||
Permanent link to this record | |||||
Author | Marcos V Conde; Javier Vazquez; Michael S Brown; Radu Timofte | ||||
Title | NILUT: Conditional Neural Implicit 3D Lookup Tables for Image Enhancement | Type | Conference Article | ||
Year | 2024 | Publication | 38th AAAI Conference on Artificial Intelligence | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | 3D lookup tables (3D LUTs) are a key component for image enhancement. Modern image signal processors (ISPs) have dedicated support for these as part of the camera rendering pipeline. Cameras typically provide multiple options for picture styles, where each style is usually obtained by applying a unique handcrafted 3D LUT. Current approaches for learning and applying 3D LUTs are notably fast, yet not so memory-efficient, as storing multiple 3D LUTs is required. For this reason and other implementation limitations, their use on mobile devices is less popular. In this work, we propose a Neural Implicit LUT (NILUT), an implicitly defined continuous 3D color transformation parameterized by a neural network. We show that NILUTs are capable of accurately emulating real 3D LUTs. Moreover, a NILUT can be extended to incorporate multiple styles into a single network with the ability to blend styles implicitly. Our novel approach is memory-efficient, controllable and can complement previous methods, including learned ISPs. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | AAAI | ||
Notes | CIC; MACO | Approved | no | ||
Call Number | Admin @ si @ CVB2024 | Serial | 3872 | ||
Permanent link to this record | |||||
Author | Trevor Canham; Javier Vazquez; D Long; Richard F. Murray; Michael S Brown | ||||
Title | Noise Prism: A Novel Multispectral Visualization Technique | Type | Journal Article | ||
Year | 2021 | Publication | 31st Color and Imaging Conference | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | A novel technique for visualizing multispectral images is proposed. Inspired by how prisms work, our method spreads spectral information over a chromatic noise pattern. This is accomplished by populating the pattern with pixels representing each measurement band at a count proportional to its measured intensity. The method is advantageous because it allows for lightweight encoding and visualization of spectral information while maintaining the color appearance of the stimulus. A four alternative forced choice (4AFC) experiment was conducted to validate the method’s information-carrying capacity in displaying metameric stimuli of varying colors and spectral basis functions. The scores ranged from 100% to 20% (less than chance given the 4AFC task), with many conditions falling somewhere in between at statistically significant intervals. Using this data, color and texture difference metrics can be evaluated and optimized to predict the legibility of the visualization technique. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CIC | ||
Notes | MACO; CIC | Approved | no | ||
Call Number | Admin @ si @ CVL2021 | Serial | 4000 | ||
Permanent link to this record | |||||
Author | Sagnik Das; Hassan Ahmed Sial; Ke Ma; Ramon Baldrich; Maria Vanrell; Dimitris Samaras | ||||
Title | Intrinsic Decomposition of Document Images In-the-Wild | Type | Conference Article | ||
Year | 2020 | Publication | 31st British Machine Vision Conference | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Automatic document content processing is affected by artifacts caused by the shape of the paper and by the non-uniform and diverse color of lighting conditions. Fully-supervised methods on real data are impossible due to the large amount of data needed. Hence, the current state-of-the-art deep learning models are trained on fully or partially synthetic images. However, document shadow or shading removal results still suffer because: (a) prior methods rely on uniformity of local color statistics, which limits their application to real scenarios with complex document shapes and textures; and (b) synthetic or hybrid datasets with non-realistic, simulated lighting conditions are used to train the models. In this paper we tackle these problems with our two main contributions. First, a physically constrained learning-based method that directly estimates document reflectance based on intrinsic image formation, which generalizes to challenging illumination conditions. Second, a new dataset that clearly improves on previous synthetic ones by adding a large range of realistic shading and diverse multi-illuminant conditions, uniquely customized to deal with documents in-the-wild. The proposed architecture works in two steps. First, a white balancing module neutralizes the color of the illumination on the input image. Based on the proposed multi-illuminant dataset we achieve good white balancing in really difficult conditions. Second, the shading separation module accurately disentangles the shading and paper material in a self-supervised manner where only the synthetic texture is used as a weak training signal (obviating the need for very costly ground truth with disentangled versions of shading and reflectance). The proposed approach leads to significant generalization of document reflectance estimation in real scenes with challenging illumination. We extensively evaluate on the real benchmark datasets available for intrinsic image decomposition and document shadow removal tasks. Our reflectance estimation scheme, when used as a pre-processing step of an OCR pipeline, shows a 21% improvement of character error rate (CER), thus proving its practical applicability. The data and code will be available at: https://github.com/cvlab-stonybrook/DocIIW. | ||||
Address | Virtual; September 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | BMVC | ||
Notes | CIC; 600.087; 600.140; 600.118 | Approved | no | ||
Call Number | Admin @ si @ DSM2020 | Serial | 3461 | ||
Permanent link to this record |