|
Records |
Links |
|
Author |
Kaida Xiao; Chenyang Fu; D.Mylonas; Dimosthenis Karatzas; S. Wuerger |
|
|
Title |
Unique Hue Data for Colour Appearance Models. Part ii: Chromatic Adaptation Transform |
Type |
Journal Article |
|
Year |
2013 |
Publication |
Color Research & Application |
Abbreviated Journal |
CRA |
|
|
Volume |
38 |
Issue |
1 |
Pages |
22-29 |
|
|
Keywords |
|
|
|
Abstract |
Unique hue settings of 185 observers under three room-lighting conditions were used to evaluate the accuracy of full and mixed chromatic adaptation transform models of CIECAM02 in terms of unique hue reproduction. Perceptual hue shifts in CIECAM02 were evaluated for both models with no clear difference using the current Commission Internationale de l'Éclairage (CIE) recommendation for mixed chromatic adaptation ratio. Using our large dataset of unique hue data as a benchmark, an optimised parameter is proposed for chromatic adaptation under mixed illumination conditions that produces more accurate results in unique hue reproduction. © 2011 Wiley Periodicals, Inc. Col Res Appl, 2013 |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ XFM2013 |
Serial |
1822 |
|
Permanent link to this record |
|
|
|
|
Author |
Ernest Valveny; Robert Benavente; Agata Lapedriza; Miquel Ferrer; Jaume Garcia; Gemma Sanchez |
|
|
Title |
Adaptation of a computer programming course to the EXHE requirements: evaluation five years later |
Type |
Miscellaneous |
|
Year |
2012 |
Publication |
European Journal of Engineering Education |
Abbreviated Journal |
|
|
|
Volume |
37 |
Issue |
3 |
Pages |
243-254 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; CIC; OR; invisible;MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ VBL2012 |
Serial |
2070 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; Josep Llados |
|
|
Title |
Flowchart Recognition in Patent Information Retrieval |
Type |
Book Chapter |
|
Year |
2017 |
Publication |
Current Challenges in Patent Information Retrieval |
Abbreviated Journal |
|
|
|
Volume |
37 |
Issue |
|
Pages |
351-368 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
M. Lupu; K. Mayer; N. Kando; A.J. Trippe |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.097; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RuL2017 |
Serial |
2896 |
|
Permanent link to this record |
|
|
|
|
Author |
Mohamed Ali Souibgui; Sanket Biswas; Andres Mafla; Ali Furkan Biten; Alicia Fornes; Yousri Kessentini; Josep Llados; Lluis Gomez; Dimosthenis Karatzas |
|
|
Title |
Text-DIAE: a self-supervised degradation invariant autoencoder for text recognition and document enhancement |
Type |
Conference Article |
|
Year |
2023 |
Publication |
Proceedings of the 37th AAAI Conference on Artificial Intelligence |
Abbreviated Journal |
|
|
|
Volume |
37 |
Issue |
2 |
Pages |
|
|
|
Keywords |
Representation Learning for Vision; CV Applications; CV Language and Vision; ML Unsupervised; Self-Supervised Learning |
|
|
Abstract |
In this paper, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE), a self-supervised model designed to tackle two tasks, text recognition (handwritten or scene-text) and document image enhancement. We start by employing a transformer-based architecture that incorporates three pretext tasks as learning objectives to be optimized during pre-training without the usage of labelled data. Each of the pretext objectives is specifically tailored for the final downstream tasks. We conduct several ablation experiments that confirm the design choice of the selected pretext tasks. Importantly, the proposed model does not exhibit limitations of previous state-of-the-art methods based on contrastive losses, while at the same time requiring substantially fewer data samples to converge. Finally, we demonstrate that our method surpasses the state-of-the-art in existing supervised and self-supervised settings in handwritten and scene text recognition and document image enhancement. Our code and trained models will be made publicly available at https://github.com/dali92002/SSL-OCR |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
AAAI |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ SBM2023 |
Serial |
3848 |
|
Permanent link to this record |
|
|
|
|
Author |
Khanh Nguyen; Ali Furkan Biten; Andres Mafla; Lluis Gomez; Dimosthenis Karatzas |
|
|
Title |
Show, Interpret and Tell: Entity-Aware Contextualised Image Captioning in Wikipedia |
Type |
Conference Article |
|
Year |
2023 |
Publication |
Proceedings of the 37th AAAI Conference on Artificial Intelligence |
Abbreviated Journal |
|
|
|
Volume |
37 |
Issue |
2 |
Pages |
1940-1948 |
|
|
Keywords |
|
|
|
Abstract |
Humans exploit prior knowledge to describe images, and are able to adapt their explanation to specific contextual information given, even to the extent of inventing plausible explanations when contextual information and images do not match. In this work, we propose the novel task of captioning Wikipedia images by integrating contextual knowledge. Specifically, we produce models that jointly reason over Wikipedia articles, Wikimedia images and their associated descriptions to produce contextualized captions. The same Wikimedia image can be used to illustrate different articles, and the produced caption needs to be adapted to the specific context allowing us to explore the limits of the model to adjust captions to different contextual information. Dealing with out-of-dictionary words and Named Entities is a challenging task in this domain. To address this, we propose a pre-training objective, Masked Named Entity Modeling (MNEM), and show that this pretext task results to significantly improved models. Furthermore, we verify that a model pre-trained in Wikipedia generalizes well to News Captioning datasets. We further define two different test splits according to the difficulty of the captioning task. We offer insights on the role and the importance of each modality and highlight the limitations of our model. |
|
|
Address |
Washington; USA; February 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
AAAI |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ NBM2023 |
Serial |
3860 |
|
Permanent link to this record |
|
|
|
|
Author |
Albert Gordo; Florent Perronnin; Yunchao Gong; Svetlana Lazebnik |
|
|
Title |
Asymmetric Distances for Binary Embeddings |
Type |
Journal Article |
|
Year |
2014 |
Publication |
IEEE Transactions on Pattern Analysis and Machine Intelligence |
Abbreviated Journal |
TPAMI |
|
|
Volume |
36 |
Issue |
1 |
Pages |
33-47 |
|
|
Keywords |
|
|
|
Abstract |
In large-scale query-by-example retrieval, embedding image signatures in a binary space offers two benefits: data compression and search efficiency. While most embedding algorithms binarize both query and database signatures, it has been noted that this is not strictly a requirement. Indeed, asymmetric schemes which binarize the database signatures but not the query still enjoy the same two benefits but may provide superior accuracy. In this work, we propose two general asymmetric distances which are applicable to a wide variety of embedding techniques including Locality Sensitive Hashing (LSH), Locality Sensitive Binary Codes (LSBC), Spectral Hashing (SH), PCA Embedding (PCAE), PCA Embedding with random rotations (PCAE-RR), and PCA Embedding with iterative quantization (PCAE-ITQ). We experiment on four public benchmarks containing up to 1M images and show that the proposed asymmetric distances consistently lead to large improvements over the symmetric Hamming distance for all binary embedding techniques. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0162-8828 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.045; 605.203; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GPG2014 |
Serial |
2272 |
|
Permanent link to this record |
|
|
|
|
Author |
Kaida Xiao; Sophie Wuerger; Chenyang Fu; Dimosthenis Karatzas |
|
|
Title |
Unique Hue Data for Colour Appearance Models. Part i: Loci of Unique Hues and Hue Uniformity |
Type |
Journal Article |
|
Year |
2011 |
Publication |
Color Research & Application |
Abbreviated Journal |
CRA |
|
|
Volume |
36 |
Issue |
5 |
Pages |
316-323 |
|
|
Keywords |
unique hues; colour appearance models; CIECAM02; hue uniformity |
|
|
Abstract |
Psychophysical experiments were conducted to assess unique hues on a CRT display for a large sample of colour-normal observers (n 1⁄4 185). These data were then used to evaluate the most commonly used colour appear- ance model, CIECAM02, by transforming the CIEXYZ tris- timulus values of the unique hues to the CIECAM02 colour appearance attributes, lightness, chroma and hue angle. We report two findings: (1) the hue angles derived from our unique hue data are inconsistent with the commonly used Natural Color System hues that are incorporated in the CIECAM02 model. We argue that our predicted unique hue angles (derived from our large dataset) provide a more reliable standard for colour management applications when the precise specification of these salient colours is im- portant. (2) We test hue uniformity for CIECAM02 in all four unique hues and show significant disagreements for all hues, except for unique red which seems to be invariant under lightness changes. Our dataset is useful to improve the CIECAM02 model as it provides reliable data for benchmarking. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Wiley Periodicals Inc |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ XWF2011 |
Serial |
1816 |
|
Permanent link to this record |
|
|
|
|
Author |
Jon Almazan; Albert Gordo; Alicia Fornes; Ernest Valveny |
|
|
Title |
Word Spotting and Recognition with Embedded Attributes |
Type |
Journal Article |
|
Year |
2014 |
Publication |
IEEE Transactions on Pattern Analysis and Machine Intelligence |
Abbreviated Journal |
TPAMI |
|
|
Volume |
36 |
Issue |
12 |
Pages |
2552 - 2566 |
|
|
Keywords |
|
|
|
Abstract |
This article addresses the problems of word spotting and word recognition on images. In word spotting, the goal is to find all instances of a query word in a dataset of images. In recognition, the goal is to recognize the content of the word image, usually aided by a dictionary or lexicon. We describe an approach in which both word images and text strings are embedded in a common vectorial subspace. This is achieved by a combination of label embedding and attributes learning, and a common subspace regression. In this subspace, images and strings that represent the same word are close together, allowing one to cast recognition and retrieval tasks as a nearest neighbor problem. Contrary to most other existing methods, our representation has a fixed length, is low dimensional, and is very fast to compute and, especially, to compare. We test our approach on four public datasets of both handwritten documents and natural images showing results comparable or better than the state-of-the-art on spotting and recognition tasks. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0162-8828 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.056; 600.045; 600.061; 602.006; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ AGF2014a |
Serial |
2483 |
|
Permanent link to this record |
|
|
|
|
Author |
Yunchao Gong; Svetlana Lazebnik; Albert Gordo; Florent Perronnin |
|
|
Title |
Iterative quantization: A procrustean approach to learning binary codes for Large-Scale Image Retrieval |
Type |
Journal Article |
|
Year |
2012 |
Publication |
IEEE Transactions on Pattern Analysis and Machine Intelligence |
Abbreviated Journal |
TPAMI |
|
|
Volume |
35 |
Issue |
12 |
Pages |
2916-2929 |
|
|
Keywords |
|
|
|
Abstract |
This paper addresses the problem of learning similarity-preserving binary codes for efficient similarity search in large-scale image collections. We formulate this problem in terms of finding a rotation of zero-centered data so as to minimize the quantization error of mapping this data to the vertices of a zero-centered binary hypercube, and propose a simple and efficient alternating minimization algorithm to accomplish this task. This algorithm, dubbed iterative quantization (ITQ), has connections to multi-class spectral clustering and to the orthogonal Procrustes problem, and it can be used both with unsupervised data embeddings such as PCA and supervised embeddings such as canonical correlation analysis (CCA). The resulting binary codes significantly outperform several other state-of-the-art methods. We also show that further performance improvements can result from transforming the data with a nonlinear kernel mapping prior to PCA or CCA. Finally, we demonstrate an application of ITQ to learning binary attributes or “classemes” on the ImageNet dataset. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0162-8828 |
ISBN |
978-1-4577-0394-2 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ GLG 2012b |
Serial |
2008 |
|
Permanent link to this record |
|
|
|
|
Author |
C. Alejandro Parraga; Jordi Roca; Dimosthenis Karatzas; Sophie Wuerger |
|
|
Title |
Limitations of visual gamma corrections in LCD displays |
Type |
Journal Article |
|
Year |
2014 |
Publication |
Displays |
Abbreviated Journal |
Dis |
|
|
Volume |
35 |
Issue |
5 |
Pages |
227–239 |
|
|
Keywords |
Display calibration; Psychophysics; Perceptual; Visual gamma correction; Luminance matching; Observer-based calibration |
|
|
Abstract |
A method for estimating the non-linear gamma transfer function of liquid–crystal displays (LCDs) without the need of a photometric measurement device was described by Xiao et al. (2011) [1]. It relies on observer’s judgments of visual luminance by presenting eight half-tone patterns with luminances from 1/9 to 8/9 of the maximum value of each colour channel. These half-tone patterns were distributed over the screen both over the vertical and horizontal viewing axes. We conducted a series of photometric and psychophysical measurements (consisting in the simultaneous presentation of half-tone patterns in each trial) to evaluate whether the angular dependency of the light generated by three different LCD technologies would bias the results of these gamma transfer function estimations. Our results show that there are significant differences between the gamma transfer functions measured and produced by observers at different viewing angles. We suggest appropriate modifications to the Xiao et al. paradigm to counterbalance these artefacts which also have the advantage of shortening the amount of time spent in collecting the psychophysical measurements. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
CIC; DAG; 600.052; 600.077; 600.074 |
Approved |
no |
|
|
Call Number |
Admin @ si @ PRK2014 |
Serial |
2511 |
|
Permanent link to this record |