Publicacions CVC -- Query Results

<< 1 2 3 4 5 6 7 8 9 10 >> [11–20]

Details

	Records
	Author	Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal
	Title	Beyond Document Object Detection: Instance-Level Segmentation of Complex Layouts			Type	Journal Article
	Year	2021	Publication	International Journal on Document Analysis and Recognition	Abbreviated Journal	IJDAR
	Volume	24	Issue		Pages	269–281
	Keywords
	Abstract	Information extraction is a fundamental task of many business intelligence services that entail massive document processing. Understanding a document page structure in terms of its layout provides contextual support which is helpful in the semantic interpretation of the document terms. In this paper, inspired by the progress of deep learning methodologies applied to the task of object recognition, we transfer these models to the specific case of document object detection, reformulating the traditional problem of document layout analysis. Moreover, we importantly contribute to prior arts by defining the task of instance segmentation on the document image domain. An instance segmentation paradigm is especially important in complex layouts whose contents should interact for the proper rendering of the page, i.e., the proper text wrapping around an image. Finally, we provide an extensive evaluation, both qualitative and quantitative, that demonstrates the superior performance of the proposed methodology over the current state of the art.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.121; 600.140; 110.312			Approved	no
	Call Number	Admin @ si @ BRL2021b			Serial	3574
Permanent link to this record



	Author	Kunal Biswas; Palaiahnakote Shivakumara; Umapada Pal; Tong Lu; Michel Blumenstein; Josep Llados
	Title	Classification of aesthetic natural scene images using statistical and semantic features			Type	Journal Article
	Year	2023	Publication	Multimedia Tools and Applications	Abbreviated Journal	MTAP
	Volume	82	Issue	9	Pages	13507-13532
	Keywords
	Abstract	Aesthetic image analysis is essential for improving the performance of multimedia image retrieval systems, especially from a repository of social media and multimedia content stored on mobile devices. This paper presents a novel method for classifying aesthetic natural scene images by studying the naturalness of image content using statistical features, and reading text in the images using semantic features. Unlike existing methods that focus only on image quality with human information, the proposed approach focuses on image features as well as text-based semantic features without human intervention to reduce the gap between subjectivity and objectivity in the classification. The aesthetic classes considered in this work are (i) Very Pleasant, (ii) Pleasant, (iii) Normal and (iv) Unpleasant. The naturalness is represented by features of focus, defocus, perceived brightness, perceived contrast, blurriness and noisiness, while semantics are represented by text recognition, description of the images and labels of images, profile pictures, and banner images. Furthermore, a deep learning model is proposed in a novel way to fuse statistical and semantic features for the classification of aesthetic natural scene images. Experiments on our own dataset and the standard datasets demonstrate that the proposed approach achieves 92.74%, 88.67% and 83.22% average classification rates on our own dataset, AVA dataset and CUHKPQ dataset, respectively. Furthermore, a comparative study of the proposed model with the existing methods shows that the proposed method is effective for the classification of aesthetic social media images.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ BSP2023			Serial	3873
Permanent link to this record



	Author	Manuel Carbonell; Alicia Fornes; Mauricio Villegas; Josep Llados
	Title	A Neural Model for Text Localization, Transcription and Named Entity Recognition in Full Pages			Type	Journal Article
	Year	2020	Publication	Pattern Recognition Letters	Abbreviated Journal	PRL
	Volume	136	Issue		Pages	219-227
	Keywords
	Abstract	In the last years, the consolidation of deep neural network architectures for information extraction in document images has brought big improvements in the performance of each of the tasks involved in this process, consisting of text localization, transcription, and named entity recognition. However, this process is traditionally performed with separate methods for each task. In this work we propose an end-to-end model that combines a one stage object detection network with branches for the recognition of text and named entities respectively in a way that shared features can be learned simultaneously from the training error of each of the tasks. By doing so the model jointly performs handwritten text detection, transcription, and named entity recognition at page level with a single feed forward step. We exhaustively evaluate our approach on different datasets, discussing its advantages and limitations compared to sequential approaches. The results show that the model is capable of benefiting from shared features by simultaneously solving interdependent tasks.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.140; 601.311; 600.121			Approved	no
	Call Number	Admin @ si @ CFV2020			Serial	3451
Permanent link to this record



	Author	Antonio Clavelli; Dimosthenis Karatzas; Josep Llados; Mario Ferraro; Giuseppe Boccignone
	Title	Modelling task-dependent eye guidance to objects in pictures			Type	Journal Article
	Year	2014	Publication	Cognitive Computation	Abbreviated Journal	CoCom
	Volume	6	Issue	3	Pages	558-584
	Keywords	Visual attention; Gaze guidance; Value; Payoff; Stochastic fixation prediction
	Abstract	5Y Impact Factor: 1.14 / 3rd (Computer Science, Artificial Intelligence) We introduce a model of attentional eye guidance based on the rationale that the deployment of gaze is to be considered in the context of a general action-perception loop relying on two strictly intertwined processes: sensory processing, depending on current gaze position, identifies sources of information that are most valuable under the given task; motor processing links such information with the oculomotor act by sampling the next gaze position and thus performing the gaze shift. In such a framework, the choice of where to look next is task-dependent and oriented to classes of objects embedded within pictures of complex scenes. The dependence on task is taken into account by exploiting the value and the payoff of gazing at certain image patches or proto-objects that provide a sparse representation of the scene objects. The different levels of the action-perception loop are represented in probabilistic form and eventually give rise to a stochastic process that generates the gaze sequence. This way the model also accounts for statistical properties of gaze shifts such as individual scan path variability. Results of the simulations are compared either with experimental data derived from publicly available datasets and from our own experiments.
	Address
	Corporate Author				Thesis
	Publisher	Springer US	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1866-9956	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG; 600.056; 600.045; 605.203; 601.212; 600.077			Approved	no
	Call Number	Admin @ si @ CKL2014			Serial	2419
Permanent link to this record



	Author	S. Chanda; Umapada Pal; Oriol Ramos Terrades
	Title	Word-Wise Thai and Roman Script Identification			Type	Journal
	Year	2009	Publication	ACM Transactions on Asian Language Information Processing	Abbreviated Journal	TALIP
	Volume	8	Issue	3	Pages	1-21
	Keywords
	Abstract	In some Thai documents, a single text line of a printed document page may contain words of both Thai and Roman scripts. For the Optical Character Recognition (OCR) of such a document page it is better to identify, at first, Thai and Roman script portions and then to use individual OCR systems of the respective scripts on these identified portions. In this article, an SVM-based method is proposed for identification of word-wise printed Roman and Thai scripts from a single line of a document page. Here, at first, the document is segmented into lines and then lines are segmented into character groups (words). In the proposed scheme, we identify the script of a character group combining different character features obtained from structural shape, profile behavior, component overlapping information, topological properties, and water reservoir concept, etc. Based on the experiment on 10,000 data (words) we obtained 99.62% script identification accuracy from the proposed scheme.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	1530-0226	ISBN		Medium
	Area		Expedition		Conference
	Notes	DAG			Approved	no
	Call Number	Admin @ si @ CPR2009f			Serial	1869
Permanent link to this record

Select All Deselect All

<< 1 2 3 4 5 6 7 8 9 10 >> [11–20]

List View

Citations

Details

All Found Records Selected Records:

Save Citations: Format:

Export Records: Format: