Publicacions CVC -- Query Results

[31–40] << 41 42 43 44 45 46 47 48 49 50 >> [51–60]

Details

	Records
	Author	Ali Furkan Biten; Ruben Tito; Andres Mafla; Lluis Gomez; Marçal Rusiñol; C.V. Jawahar; Ernest Valveny; Dimosthenis Karatzas
	Title	Scene Text Visual Question Answering			Type	Conference Article
	Year	2019	Publication	18th IEEE International Conference on Computer Vision	Abbreviated Journal
	Volume		Issue		Pages	4291-4301
	Keywords
	Abstract	Current visual question answering datasets do not consider the rich semantic information conveyed by text within an image. In this work, we present a new dataset, ST-VQA, that aims to highlight the importance of exploiting highlevel semantic information present in images as textual cues in the Visual Question Answering process. We use this dataset to define a series of tasks of increasing difficulty for which reading the scene text in the context provided by the visual information is necessary to reason and generate an appropriate answer. We propose a new evaluation metric for these tasks to account both for reasoning errors as well as shortcomings of the text recognition module. In addition we put forward a series of baseline methods, which provide further insight to the newly released dataset, and set the scene for further research.
	Address	Seul; Corea; October 2019
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICCV
	Notes	DAG; 600.129; 600.135; 601.338; 600.121			Approved	no
	Call Number	Admin @ si @ BTM2019b			Serial	3285
Permanent link to this record



	Author	Ruben Tito; Dimosthenis Karatzas; Ernest Valveny
	Title	Document Collection Visual Question Answering			Type	Conference Article
	Year	2021	Publication	16th International Conference on Document Analysis and Recognition	Abbreviated Journal
	Volume	12822	Issue		Pages	778-792
	Keywords	Document collection; Visual Question Answering
	Abstract	Current tasks and methods in Document Understanding aims to process documents as single elements. However, documents are usually organized in collections (historical records, purchase invoices), that provide context useful for their interpretation. To address this problem, we introduce Document Collection Visual Question Answering (DocCVQA) a new dataset and related task, where questions are posed over a whole collection of document images and the goal is not only to provide the answer to the given question, but also to retrieve the set of documents that contain the information needed to infer the answer. Along with the dataset we propose a new evaluation metric and baselines which provide further insights to the new dataset and task.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICDAR
	Notes	DAG; 600.121			Approved	no
	Call Number	Admin @ si @ TKV2021			Serial	3622
Permanent link to this record



	Author	Ali Furkan Biten; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas
	Title	Good News, Everyone! Context driven entity-aware captioning for news images			Type	Conference Article
	Year	2019	Publication	32nd IEEE Conference on Computer Vision and Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	12458-12467
	Keywords
	Abstract	Current image captioning systems perform at a merely descriptive level, essentially enumerating the objects in the scene and their relations. Humans, on the contrary, interpret images by integrating several sources of prior knowledge of the world. In this work, we aim to take a step closer to producing captions that offer a plausible interpretation of the scene, by integrating such contextual information into the captioning pipeline. For this we focus on the captioning of images used to illustrate news articles. We propose a novel captioning method that is able to leverage contextual information provided by the text of news articles associated with an image. Our model is able to selectively draw information from the article guided by visual cues, and to dynamically extend the output dictionary to out-of-vocabulary named entities that appear in the context source. Furthermore we introduce“ GoodNews”, the largest news image captioning dataset in the literature and demonstrate state-of-the-art results.
	Address	Long beach; California; USA; june 2019
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CVPR
	Notes	DAG; 600.129; 600.135; 601.338; 600.121			Approved	no
	Call Number	Admin @ si @ BGR2019			Serial	3289
Permanent link to this record



	Author	Y. Patel; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas; C.V. Jawahar
	Title	Self-Supervised Visual Representations for Cross-Modal Retrieval			Type	Conference Article
	Year	2019	Publication	ACM International Conference on Multimedia Retrieval	Abbreviated Journal
	Volume		Issue		Pages	182–186
	Keywords
	Abstract	Cross-modal retrieval methods have been significantly improved in last years with the use of deep neural networks and large-scale annotated datasets such as ImageNet and Places. However, collecting and annotating such datasets requires a tremendous amount of human effort and, besides, their annotations are limited to discrete sets of popular visual classes that may not be representative of the richer semantics found on large-scale cross-modal retrieval datasets. In this paper, we present a self-supervised cross-modal retrieval framework that leverages as training data the correlations between images and text on the entire set of Wikipedia articles. Our method consists in training a CNN to predict: (1) the semantic context of the article in which an image is more probable to appear as an illustration, and (2) the semantic context of its caption. Our experiments demonstrate that the proposed method is not only capable of learning discriminative visual representations for solving vision tasks like classification, but that the learned representations are better for cross-modal retrieval when compared to supervised pre-training of the network on the ImageNet dataset.
	Address	Otawa; Canada; june 2019
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICMR
	Notes	DAG; 600.121; 600.129			Approved	no
	Call Number	Admin @ si @ PGR2019			Serial	3288
Permanent link to this record



	Author	Albert Berenguel
	Title	Analysis of background textures in banknotes and identity documents for counterfeit detection			Type	Book Whole
	Year	2019	Publication	PhD Thesis, Universitat Autonoma de Barcelona-CVC	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Counterfeiting and piracy are a form of theft that has been steadily growing in recent years. A counterfeit is an unauthorized reproduction of an authentic/genuine object. Banknotes and identity documents are two common objects of counterfeiting. The former is used by organized criminal groups to finance a variety of illegal activities or even to destabilize entire countries due the inflation effect. Generally, in order to run their illicit businesses, counterfeiters establish companies and bank accounts using fraudulent identity documents. The illegal activities generated by counterfeit banknotes and identity documents has a damaging effect on business, the economy and the general population. To fight against counterfeiters, governments and authorities around the globe cooperate and develop security features to protect their security documents. Many of the security features in identity documents can also be found in banknotes. In this dissertation we focus our efforts in detecting the counterfeit banknotes and identity documents by analyzing the security features at the background printing. Background areas on secure documents contain fine-line patterns and designs that are difficult to reproduce without the manufacturers cutting-edge printing equipment. Our objective is to find the loose of resolution between the genuine security document and the printed counterfeit version with a publicly available commercial printer. We first present the most complete survey to date in identity and banknote security features. The compared algorithms and systems are based on computer vision and machine learning. Then we advance to present the banknote and identity counterfeit dataset we have built and use along all this thesis. Afterwards, we evaluate and adapt algorithms in the literature for the security background texture analysis. We study this problem from the point of view of robustness, computational efficiency and applicability into a real and non-controlled industrial scenario, proposing key insights to use these algorithms. Next, within the industrial environment of this thesis, we build a complete service oriented architecture to detect counterfeit documents. The mobile application and the server framework intends to be used even by non-expert document examiners to spot counterfeits. Later, we re-frame the problem of background texture counterfeit detection as a full-reference game of spotting the differences, by alternating glimpses between a counterfeit and a genuine background using recurrent neural networks. Finally, we deal with the lack of counterfeit samples, studying different approaches based on anomaly detection.
	Address	November 2019
	Corporate Author				Thesis	Ph.D. thesis
	Publisher	Ediciones Graficas Rey	Place of Publication		Editor	Oriol Ramos Terrades;Josep Llados
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN	978-84-121011-2-6	Medium
	Area		Expedition		Conference
	Notes	DAG; 600.140; 600.121			Approved	no
	Call Number	Admin @ si @ Ber2019			Serial	3395
Permanent link to this record



	Author	Christophe Rigaud; Dimosthenis Karatzas; Jean-Christophe Burie; Jean-Marc Ogier
	Title	Speech balloon contour classification in comics			Type	Conference Article
	Year	2013	Publication	10th IAPR International Workshop on Graphics Recognition	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Comic books digitization combined with subsequent comic book understanding create a variety of new applications, including mobile reading and data mining. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. In this work we detail a novel approach for classifying speech balloon in scanned comics book pages based on their contour time series.
	Address	Bethlehem; PA; USA; August 2013
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	GREC
	Notes	DAG; 600.056			Approved	no
	Call Number	Admin @ si @ RKB2013			Serial	2429
Permanent link to this record



	Author	Christophe Rigaud; Dimosthenis Karatzas; Joost Van de Weijer; Jean-Christophe Burie; Jean-Marc Ogier
	Title	Automatic text localisation in scanned comic books			Type	Conference Article
	Year	2013	Publication	Proceedings of the International Conference on Computer Vision Theory and Applications	Abbreviated Journal
	Volume		Issue		Pages	814-819
	Keywords	Text localization; comics; text/graphic separation; complex background; unstructured document
	Abstract	Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent document understanding enable direct content-based search as opposed to metadata only search (e.g. album title or author name). Few studies have been done in this direction. In this work we detail a novel approach for the automatic text localization in scanned comics book pages, an essential step towards a fully automatic comics book understanding. We focus on speech text as it is semantically important and represents the majority of the text present in comics. The approach is compared with existing methods of text localization found in the literature and results are presented.
	Address	Barcelona; February 2013
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	VISAPP
	Notes	DAG; CIC; 600.056			Approved	no
	Call Number	Admin @ si @ RKW2013b			Serial	2261
Permanent link to this record



	Author	Christophe Rigaud; Dimosthenis Karatzas; Joost Van de Weijer; Jean-Christophe Burie; Jean-Marc Ogier
	Title	An active contour model for speech balloon detection in comics			Type	Conference Article
	Year	2013	Publication	12th International Conference on Document Analysis and Recognition	Abbreviated Journal
	Volume		Issue		Pages	1240-1244
	Keywords
	Abstract	Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent comic book understanding would enable a variety of new applications, including content-based retrieval and content retargeting. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. Few studies have been done in this direction. In this work we detail a novel approach for closed and non-closed speech balloon localization in scanned comic book pages, an essential step towards a fully automatic comic book understanding. The approach is compared with existing methods for closed balloon localization found in the literature and results are presented.
	Address	washington; USA; August 2013
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1520-5363	ISBN		Medium
	Area		Expedition		Conference	ICDAR
	Notes	DAG; CIC; 600.056			Approved	no
	Call Number	Admin @ si @ RKW2013a			Serial	2260
Permanent link to this record



	Author	Chenyang Fu; Kaida Xiao; Dimosthenis Karatzas; Sophie Wuerger
	Title	Investigation of Unique Hue Setting Changes with Ageing			Type	Journal Article
	Year	2011	Publication	Chinese Optics Letters	Abbreviated Journal	COL
	Volume	9	Issue	5	Pages	053301-1-5
	Keywords
	Abstract	Clromatic sensitivity along the protan, deutan, and tritan lines and the loci of the unique hues (red, green, yellow, blue) for a very large sample (n = 185) of colour-normal observers ranging from 18 to 75 years of age are assessed. Visual judgments are obtained under normal viewing conditions using colour patches on self-luminous display under controlled adaptation conditions. Trivector discrimination thresholds show an increase as a function of age along the protan, deutan, and tritan axes, with the largest increase present along the tritan line, less pronounced shifts in unique hue settings are also observed. Based on the chromatic (protan, deutan, tritan) thresholds and using scaled cone signals, we predict the unique hue changes with ageing. A dependency on age for unique red and unique yellow for predicted hue angle is found. We conclude that the chromatic sensitivity deteriorates significantly with age, whereas the appearance of unique hues is much less affected, remaining almost constant despite the known changes in the ocular media.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ XFW2011			Serial	1818
Permanent link to this record



	Author	Dena Bazazian; Raul Gomez; Anguelos Nicolaou; Lluis Gomez; Dimosthenis Karatzas; Andrew Bagdanov
	Title	Fast: Facilitated and accurate scene text proposals through fcn guided pruning			Type	Journal Article
	Year	2019	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
	Volume	119	Issue		Pages	112-120
	Keywords
	Abstract	Class-specific text proposal algorithms can efficiently reduce the search space for possible text object locations in an image. In this paper we combine the Text Proposals algorithm with Fully Convolutional Networks to efficiently reduce the number of proposals while maintaining the same recall level and thus gaining a significant speed up. Our experiments demonstrate that such text proposal approaches yield significantly higher recall rates than state-of-the-art text localization techniques, while also producing better-quality localizations. Our results on the ICDAR 2015 Robust Reading Competition (Challenge 4) and the COCO-text datasets show that, when combined with strong word classifiers, this recall margin leads to state-of-the-art results in end-to-end scene text recognition.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.084; 600.121; 600.129			Approved	no
	Call Number	Admin @ si @ BGN2019			Serial	3342
Permanent link to this record

Select All Deselect All

[31–40] << 41 42 43 44 45 46 47 48 49 50 >> [51–60]

List View

Citations

Details

All Found Records Selected Records:

Save Citations: Format:

Export Records: Format: