|
Records |
Links |
|
Author |
Rahat Khan; Joost Van de Weijer; Dimosthenis Karatzas; Damien Muselet |
|
|
Title |
Towards multispectral data acquisition with hand-held devices |
Type |
Conference Article |
|
Year |
2013 |
Publication |
20th IEEE International Conference on Image Processing |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2053 - 2057 |
|
|
Keywords |
Multispectral; mobile devices; color measurements |
|
|
Abstract |
We propose a method to acquire multispectral data with handheld devices with front-mounted RGB cameras. We propose to use the display of the device as an illuminant while the camera captures images illuminated by the red, green and
blue primaries of the display. Three illuminants and three response functions of the camera lead to nine response values which are used for reflectance estimation. Results are promising and show that the accuracy of the spectral reconstruction improves in the range from 30-40% over the spectral
reconstruction based on a single illuminant. Furthermore, we propose to compute sensor-illuminant aware linear basis by discarding the part of the reflectances that falls in the sensorilluminant null-space. We show experimentally that optimizing reflectance estimation on these new basis functions decreases
the RMSE significantly over basis functions that are independent to sensor-illuminant. We conclude that, multispectral data acquisition is potentially possible with consumer hand-held devices such as tablets, mobiles, and laptops, opening up applications which are currently considered to be unrealistic. |
|
|
Address |
Melbourne; Australia; September 2013 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICIP |
|
|
Notes |
CIC; DAG; 600.048 |
Approved |
no |
|
|
Call Number |
Admin @ si @ KWK2013b |
Serial |
2265 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; J. Chazalon; Jean-Marc Ogier |
|
|
Title |
Normalisation et validation d'images de documents capturées en mobilité |
Type |
Conference Article |
|
Year |
2014 |
Publication |
Colloque International Francophone sur l'Écrit et le Document |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
109-124 |
|
|
Keywords |
mobile document image acquisition; perspective correction; illumination correction; quality assessment; focus measure; OCR accuracy prediction |
|
|
Abstract |
Mobile document image acquisition integrates many distortions which must be corrected or detected on the device, before the document becomes unavailable or paying data transmission fees. In this paper, we propose a system to correct perspective and illumination issues, and estimate the sharpness of the image for OCR recognition. The correction step relies on fast and accurate border detection followed by illumination normalization. Its evaluation on a private dataset shows a clear improvement on OCR accuracy. The quality assessment
step relies on a combination of focus measures. Its evaluation on a public dataset shows that this simple method compares well to state of the art, learning-based methods which cannot be embedded on a mobile, and outperforms metric-based methods. |
|
|
Address |
Nancy; France; March 2014 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CIFED |
|
|
Notes |
DAG; 601.223; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RCO2014b |
Serial |
2546 |
|
Permanent link to this record |
|
|
|
|
Author |
Miquel Ferrer; Ernest Valveny; F. Serratosa |
|
|
Title |
Median Graphs: A Genetic Approach based on New Theoretical Properties |
Type |
Journal Article |
|
Year |
2009 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
42 |
Issue |
9 |
Pages |
2003–2012 |
|
|
Keywords |
Median graph; Genetic search; Maximum common subgraph; Graph matching; Structural pattern recognition |
|
|
Abstract |
Given a set of graphs, the median graph has been theoretically presented as a useful concept to infer a representative of the set. However, the computation of the median graph is a highly complex task and its practical application has been very limited up to now. In this work we present two major contributions. On one side, and from a theoretical point of view, we show new theoretical properties of the median graph. On the other side, using these new properties, we present a new approximate algorithm based on the genetic search, that improves the computation of the median graph. Finally, we perform a set of experiments on real data, where none of the existing algorithms for the median graph computation could be applied up to now due to their computational complexity. With these results, we show how the concept of the median graph can be used in real applications and leaves the box of the only-theoretical concepts, demonstrating, from a practical point of view, that can be a useful tool to represent a set of graphs. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ FVS2009b |
Serial |
1167 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; Dimosthenis Karatzas; Joan Mas; Gemma Sanchez |
|
|
Title |
A Generic Architecture for the Conversion of Document Collections into Semantically Annotated Digital Archives |
Type |
Journal |
|
Year |
2008 |
Publication |
Journal of Universal Computer Science |
Abbreviated Journal |
|
|
|
Volume |
14 |
Issue |
18 |
Pages |
2912–2935 |
|
|
Keywords |
Median Graph, Graph Embedding, Graph Matching, Structural Pattern Recognition |
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ LKM2008 |
Serial |
1142 |
|
Permanent link to this record |
|
|
|
|
Author |
Miquel Ferrer; Dimosthenis Karatzas; Ernest Valveny; I. Bardaji; Horst Bunke |
|
|
Title |
A Generic Framework for Median Graph Computation based on a Recursive Embedding Approach |
Type |
Journal Article |
|
Year |
2011 |
Publication |
Computer Vision and Image Understanding |
Abbreviated Journal |
CVIU |
|
|
Volume |
115 |
Issue |
7 |
Pages |
919-928 |
|
|
Keywords |
Median Graph, Graph Embedding, Graph Matching, Structural Pattern Recognition |
|
|
Abstract |
The median graph has been shown to be a good choice to obtain a represen- tative of a set of graphs. However, its computation is a complex problem. Recently, graph embedding into vector spaces has been proposed to obtain approximations of the median graph. The problem with such an approach is how to go from a point in the vector space back to a graph in the graph space. The main contribution of this paper is the generalization of this previ- ous method, proposing a generic recursive procedure that permits to recover the graph corresponding to a point in the vector space, introducing only the amount of approximation inherent to the use of graph matching algorithms. In order to evaluate the proposed method, we compare it with the set me- dian and with the other state-of-the-art embedding-based methods for the median graph computation. The experiments are carried out using four dif- ferent databases (one semi-artificial and three containing real-world data). Results show that with the proposed approach we can obtain better medi- ans, in terms of the sum of distances to the training graphs, than with the previous existing methods. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
IAM @ iam @ FKV2011 |
Serial |
1831 |
|
Permanent link to this record |
|
|
|
|
Author |
Ali Furkan Biten; Lluis Gomez; Dimosthenis Karatzas |
|
|
Title |
Let there be a clock on the beach: Reducing Object Hallucination in Image Captioning |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1381-1390 |
|
|
Keywords |
Measurement; Training; Visualization; Analytical models; Computer vision; Computational modeling; Training data |
|
|
Abstract |
Explaining an image with missing or non-existent objects is known as object bias (hallucination) in image captioning. This behaviour is quite common in the state-of-the-art captioning models which is not desirable by humans. To decrease the object hallucination in captioning, we propose three simple yet efficient training augmentation method for sentences which requires no new training data or increase
in the model size. By extensive analysis, we show that the proposed methods can significantly diminish our models’ object bias on hallucination metrics. Moreover, we experimentally demonstrate that our methods decrease the dependency on the visual features. All of our code, configuration files and model weights are available online. |
|
|
Address |
Virtual; Waikoloa; Hawai; USA; January 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACV |
|
|
Notes |
DAG; 600.155; 302.105 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BGK2022 |
Serial |
3662 |
|
Permanent link to this record |
|
|
|
|
Author |
Ali Furkan Biten; Andres Mafla; Lluis Gomez; Dimosthenis Karatzas |
|
|
Title |
Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1391-1400 |
|
|
Keywords |
Measurement; Training; Integrated circuits; Annotations; Semantics; Training data; Semisupervised learning |
|
|
Abstract |
The task of image-text matching aims to map representations from different modalities into a common joint visual-textual embedding. However, the most widely used datasets for this task, MSCOCO and Flickr30K, are actually image captioning datasets that offer a very limited set of relationships between images and sentences in their ground-truth annotations. This limited ground truth information forces us to use evaluation metrics based on binary relevance: given a sentence query we consider only one image as relevant. However, many other relevant images or captions may be present in the dataset. In this work, we propose two metrics that evaluate the degree of semantic relevance of retrieved items, independently of their annotated binary relevance. Additionally, we incorporate a novel strategy that uses an image captioning metric, CIDEr, to define a Semantic Adaptive Margin (SAM) to be optimized in a standard triplet loss. By incorporating our formulation to existing models, a large improvement is obtained in scenarios where available training data is limited. We also demonstrate that the performance on the annotated image-caption pairs is maintained while improving on other non-annotated relevant items when employing the full training set. The code for our new metric can be found at github. com/furkanbiten/ncsmetric and the model implementation at github. com/andrespmd/semanticadaptive_margin. |
|
|
Address |
Virtual; Waikoloa; Hawai; USA; January 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACV |
|
|
Notes |
DAG; 600.155; 302.105; |
Approved |
no |
|
|
Call Number |
Admin @ si @ BMG2022 |
Serial |
3663 |
|
Permanent link to this record |
|
|
|
|
Author |
Carlos Boned Riera; Oriol Ramos Terrades |
|
|
Title |
Discriminative Neural Variational Model for Unbalanced Classification Tasks in Knowledge Graph |
Type |
Conference Article |
|
Year |
2022 |
Publication |
26th International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2186-2191 |
|
|
Keywords |
Measurement; Couplings; Semantics; Ear; Benchmark testing; Data models; Pattern recognition |
|
|
Abstract |
Nowadays the paradigm of link discovery problems has shown significant improvements on Knowledge Graphs. However, method performances are harmed by the unbalanced nature of this classification problem, since many methods are easily biased to not find proper links. In this paper we present a discriminative neural variational auto-encoder model, called DNVAE from now on, in which we have introduced latent variables to serve as embedding vectors. As a result, the learnt generative model approximate better the underlying distribution and, at the same time, it better differentiate the type of relations in the knowledge graph. We have evaluated this approach on benchmark knowledge graph and Census records. Results in this last data set are quite impressive since we reach the highest possible score in the evaluation metrics. However, further experiments are still needed to deeper evaluate the performance of the method in more challenging tasks. |
|
|
Address |
Montreal; Quebec; Canada; August 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
DAG; 600.121; 600.162 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BoR2022 |
Serial |
3741 |
|
Permanent link to this record |
|
|
|
|
Author |
A.Kesidis; Dimosthenis Karatzas |
|
|
Title |
Logo and Trademark Recognition |
Type |
Book Chapter |
|
Year |
2014 |
Publication |
Handbook of Document Image Processing and Recognition |
Abbreviated Journal |
|
|
|
Volume |
D |
Issue |
|
Pages |
591-646 |
|
|
Keywords |
Logo recognition; Logo removal; Logo spotting; Trademark registration; Trademark retrieval systems |
|
|
Abstract |
The importance of logos and trademarks in nowadays society is indisputable, variably seen under a positive light as a valuable service for consumers or a negative one as a catalyst of ever-increasing consumerism. This chapter discusses the technical approaches for enabling machines to work with logos, looking into the latest methodologies for logo detection, localization, representation, recognition, retrieval, and spotting in a variety of media. This analysis is presented in the context of three different applications covering the complete depth and breadth of state of the art techniques. These are trademark retrieval systems, logo recognition in document images, and logo detection and removal in images and videos. This chapter, due to the very nature of logos and trademarks, brings together various facets of document image analysis spanning graphical and textual content, while it links document image analysis to other computer vision domains, especially when it comes to the analysis of real-scene videos and images. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer London |
Place of Publication |
|
Editor |
D. Doermann; K. Tombre |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-0-85729-858-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ KeK2014 |
Serial |
2425 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; J. Chazalon; Jean-Marc Ogier |
|
|
Title |
Filtrage de descripteurs locaux pour l'amélioration de la détection de documents |
Type |
Conference Article |
|
Year |
2016 |
Publication |
Colloque International Francophone sur l'Écrit et le Document |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Local descriptors; mobile capture; document matching; keypoint selection |
|
|
Abstract |
In this paper we propose an effective method aimed at reducing the amount of local descriptors to be indexed in a document matching framework.In an off-line training stage, the matching between the model document and incoming images is computed retaining the local descriptors from the model that steadily produce good matches. We have evaluated this approach by using the ICDAR2015 SmartDOC dataset containing near 25000 images from documents to be captured by a mobile device. We have tested the performance of this filtering step by using ORB and SIFT local detectors and descriptors. The results show an important gain both in quality of the final matching as well as in time and space requirements. |
|
|
Address |
Toulouse; France; March 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CIFED |
|
|
Notes |
DAG; 600.084; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RCO2016 |
Serial |
2755 |
|
Permanent link to this record |