Publicacions CVC -- Query Results

[131–140] << 141 142 143 144 145 146 147 148 149 150 >> [151–160]

Details

Records
Author	Q. Bao; Marçal Rusiñol; M.Coustaty; Muhammad Muzzamil Luqman; C.D. Tran; Jean-Marc Ogier
Title	Delaunay triangulation-based features for Camera-based document image retrieval system			Type	Conference Article
Year	2016	Publication	12th IAPR Workshop on Document Analysis Systems	Abbreviated Journal
Volume		Issue		Pages	1-6
Keywords	Camera-based Document Image Retrieval; Delaunay Triangulation; Feature descriptors; Indexing
Abstract	In this paper, we propose a new feature vector, named DElaunay TRIangulation-based Features (DETRIF), for real-time camera-based document image retrieval. DETRIF is computed based on the geometrical constraints from each pair of adjacency triangles in delaunay triangulation which is constructed from centroids of connected components. Besides, we employ a hashing-based indexing system in order to evaluate the performance of DETRIF and to compare it with other systems such as LLAH and SRIF. The experimentation is carried out on two datasets comprising of 400 heterogeneous-content complex linguistic map images (huge size, 9800 X 11768 pixels resolution)and 700 textual document images.
Address	Santorini; Greece; April 2016
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	DAS
Notes	DAG; 600.061; 600.084; 600.077			Approved	no
Call Number	Admin @ si @ BRC2016			Serial	2757
Permanent link to this record



Author	Farshad Nourbakhsh; Dimosthenis Karatzas; Ernest Valveny
Title	A polar-based logo representation based on topological and colour features			Type	Conference Article
Year	2010	Publication	9th IAPR International Workshop on Document Analysis Systems	Abbreviated Journal
Volume		Issue		Pages	341–348
Keywords
Abstract	In this paper, we propose a novel rotation and scale invariant method for colour logo retrieval and classification, which involves performing a simple colour segmentation and subsequently describing each of the resultant colour components based on a set of topological and colour features. A polar representation is used to represent the logo and the subsequent logo matching is based on Cyclic Dynamic Time Warping (CDTW). We also show how combining information about the global distribution of the logo components and their local neighbourhood using the Delaunay triangulation allows to improve the results. All experiments are performed on a dataset of 2500 instances of 100 colour logo images in different rotations and scales.
Address	Boston; USA;
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-60558-773-8	Medium
Area		Expedition		Conference	DAS
Notes	DAG			Approved	no
Call Number	DAG @ dag @ NKV2010			Serial	1436
Permanent link to this record



Author	Cesar Isaza; Joaquin Salas; Bogdan Raducanu
Title	Toward the Detection of Urban Infrastructures Edge Shadows			Type	Conference Article
Year	2010	Publication	12th International Conference on Advanced Concepts for Intelligent Vision Systems	Abbreviated Journal
Volume	6474	Issue	I	Pages	30–37
Keywords
Abstract	In this paper, we propose a novel technique to detect the shadows cast by urban infrastructure, such as buildings, billboards, and traffic signs, using a sequence of images taken from a fixed camera. In our approach, we compute two different background models in parallel: one for the edges and one for the reflected light intensity. An algorithm is proposed to train the system to distinguish between moving edges in general and edges that belong to static objects, creating an edge background model. Then, during operation, a background intensity model allow us to separate between moving and static objects. Those edges included in the moving objects and those that belong to the edge background model are subtracted from the current image edges. The remaining edges are the ones cast by urban infrastructure. Our method is tested on a typical crossroad scene and the results show that the approach is sound and promising.
Address	Sydney, Australia
Corporate Author				Thesis
Publisher	Springer Berlin Heidelberg	Place of Publication		Editor	eds. Blanc–Talon et al
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN	0302-9743	ISBN	978-3-642-17687-6	Medium
Area		Expedition		Conference	ACIVS
Notes	OR;MV			Approved	no
Call Number	BCNPCL @ bcnpcl @ ISR2010			Serial	1458
Permanent link to this record



Author	Carlo Gatta; Adriana Romero; Joost Van de Weijer
Title	Unrolling loopy top-down semantic feedback in convolutional deep networks			Type	Conference Article
Year	2014	Publication	Workshop on Deep Vision: Deep Learning for Computer Vision	Abbreviated Journal
Volume		Issue		Pages	498-505
Keywords
Abstract	In this paper, we propose a novel way to perform top-down semantic feedback in convolutional deep networks for efficient and accurate image parsing. We also show how to add global appearance/semantic features, which have shown to improve image parsing performance in state-of-the-art methods, and was not present in previous convolutional approaches. The proposed method is characterised by an efficient training and a sufficiently fast testing. We use the well known SIFTflow dataset to numerically show the advantages provided by our contributions, and to compare with state-of-the-art image parsing convolutional based approaches.
Address	Columbus; Ohio; June 2014
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CVPRW
Notes	LAMP; MILAB; 601.160; 600.079			Approved	no
Call Number	Admin @ si @ GRW2014			Serial	2490
Permanent link to this record



Author	Cesar Isaza; Joaquin Salas; Bogdan Raducanu
Title	Synthetic ground truth dataset to detect shadow cast by static objects in outdoor			Type	Conference Article
Year	2012	Publication	1st International Workshop on Visual Interfaces for Ground Truth Collection in Computer Vision Applications	Abbreviated Journal
Volume		Issue		Pages	art. 11
Keywords
Abstract	In this paper, we propose a precise synthetic ground truth dataset to study the problem of detection of the shadows cast by static objects in outdoor environments during extended periods of time (days). For our dataset, we have created a virtual scenario using a rendering software. To increase the realism of the simulated environment, we have defined the scenario in a precise geographical location. In our dataset the sun is by far the main illumination source. The sun position during the simulation time takes into consideration factors related to the geographical location, such as the latitude, longitude, elevation above sea level, and precise image capturing day and time. In our simulation the camera remains fixed. The dataset consists of seven days of simulation, from 10:00am to 5:00pm. Images are captured every 10 seconds. The shadows' ground truth is automatically computed by the rendering software.
Address	Capri, Italy
Corporate Author				Thesis
Publisher	ACM	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-1-4503-1405-3	Medium
Area		Expedition		Conference	VIGTA
Notes	OR;MV			Approved	no
Call Number	Admin @ si @ ISR2012a			Serial	2037
Permanent link to this record



Author	Mohamed Ali Souibgui; Sanket Biswas; Andres Mafla; Ali Furkan Biten; Alicia Fornes; Yousri Kessentini; Josep Llados; Lluis Gomez; Dimosthenis Karatzas
Title	Text-DIAE: a self-supervised degradation invariant autoencoder for text recognition and document enhancement			Type	Conference Article
Year	2023	Publication	Proceedings of the 37th AAAI Conference on Artificial Intelligence	Abbreviated Journal
Volume	37	Issue	2	Pages
Keywords	Representation Learning for Vision; CV Applications; CV Language and Vision; ML Unsupervised; Self-Supervised Learning
Abstract	In this paper, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE), a self-supervised model designed to tackle two tasks, text recognition (handwritten or scene-text) and document image enhancement. We start by employing a transformer-based architecture that incorporates three pretext tasks as learning objectives to be optimized during pre-training without the usage of labelled data. Each of the pretext objectives is specifically tailored for the final downstream tasks. We conduct several ablation experiments that confirm the design choice of the selected pretext tasks. Importantly, the proposed model does not exhibit limitations of previous state-of-the-art methods based on contrastive losses, while at the same time requiring substantially fewer data samples to converge. Finally, we demonstrate that our method surpasses the state-of-the-art in existing supervised and self-supervised settings in handwritten and scene text recognition and document image enhancement. Our code and trained models will be made publicly available at https://github.com/dali92002/SSL-OCR
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	AAAI
Notes	DAG			Approved	no
Call Number	Admin @ si @ SBM2023			Serial	3848
Permanent link to this record



Author	C. Butakoff; Simone Balocco; F.M. Sukno; C. Hoogendoorn; C. Tobon-Gomez; G. Avegliano; A.F. Frangi
Title	Left-ventricular Epi- and Endocardium Extraction from 3D Ultrasound Images Using an Automatically Constructed 3D ASM			Type	Journal Article
Year	2016	Publication	Computer Methods in Biomechanics and Biomedical Engineering: Imaging and Visualization	Abbreviated Journal	CMBBE
Volume	4	Issue	5	Pages	265-280
Keywords	ASM; cardiac segmentation; statistical model; shape model; 3D ultrasound; cardiac segmentation
Abstract	In this paper, we propose an automatic method for constructing an active shape model (ASM) to segment the complete cardiac left ventricle in 3D ultrasound (3DUS) images, which avoids costly manual landmarking. The automatic construction of the ASM has already been addressed in the literature; however, the direct application of these methods to 3DUS is hampered by a high level of noise and artefacts. Therefore, we propose to construct the ASM by fusing the multidetector computed tomography data, to learn the shape, with the artificially generated 3DUS, in order to learn the neighbourhood of the boundaries. Our artificial images were generated by two approaches: a faster one that does not take into account the geometry of the transducer, and a more comprehensive one, implemented in Field II toolbox. The segmentation accuracy of our ASM was evaluated on 20 patients with left-ventricular asynchrony, demonstrating plausibility of the approach.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	2168-1163	ISBN		Medium
Area		Expedition		Conference
Notes	MILAB			Approved	no
Call Number	Admin @ si @ BBS2016			Serial	2449
Permanent link to this record



Author	Razieh Rastgoo; Kourosh Kiani; Sergio Escalera
Title	Video-based Isolated Hand Sign Language Recognition Using a Deep Cascaded Model			Type	Journal Article
Year	2020	Publication	Multimedia Tools and Applications	Abbreviated Journal	MTAP
Volume	79	Issue		Pages	22965–22987
Keywords
Abstract	In this paper, we propose an efficient cascaded model for sign language recognition taking benefit from spatio-temporal hand-based information using deep learning approaches, especially Single Shot Detector (SSD), Convolutional Neural Network (CNN), and Long Short Term Memory (LSTM), from videos. Our simple yet efficient and accurate model includes two main parts: hand detection and sign recognition. Three types of spatial features, including hand features, Extra Spatial Hand Relation (ESHR) features, and Hand Pose (HP) features, have been fused in the model to feed to LSTM for temporal features extraction. We train SSD model for hand detection using some videos collected from five online sign dictionaries. Our model is evaluated on our proposed dataset (Rastgoo et al., Expert Syst Appl 150: 113336, 2020), including 10’000 sign videos for 100 Persian sign using 10 contributors in 10 different backgrounds, and isoGD dataset. Using the 5-fold cross-validation method, our model outperforms state-of-the-art alternatives in sign language recognition
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HuPBA; no menciona			Approved	no
Call Number	Admin @ si @ RKE2020b			Serial	3442
Permanent link to this record



Author	Bogdan Raducanu; Fadi Dornaika
Title	Dynamic Facial Expression Recognition Using Laplacian Eigenmaps-Based Manifold Learning			Type	Conference Article
Year	2010	Publication	IEEE International Conference on Robotics and Automation	Abbreviated Journal
Volume		Issue		Pages	156–161
Keywords
Abstract	In this paper, we propose an integrated framework for tracking, modelling and recognition of facial expressions. The main contributions are: (i) a view- and texture independent scheme that exploits facial action parameters estimated by an appearance-based 3D face tracker; (ii) the complexity of the non-linear facial expression space is modelled through a manifold, whose structure is learned using Laplacian Eigenmaps. The projected facial expressions are afterwards recognized based on Nearest Neighbor classifier; (iii) with the proposed approach, we developed an application for an AIBO robot, in which it mirrors the perceived facial expression.
Address	Anchorage; AK; USA;
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1050-4729	ISBN	978-1-4244-5038-1	Medium
Area		Expedition		Conference	ICRA
Notes	OR; MV			Approved	no
Call Number	BCNPCL @ bcnpcl @ RaD2010			Serial	1310
Permanent link to this record



Author	David Aldavert; Arnau Ramisa; Ramon Lopez de Mantaras; Ricardo Toledo
Title	Real-time Object Segmentation using a Bag of Features Approach			Type	Conference Article
Year	2010	Publication	13th International Conference of the Catalan Association for Artificial Intelligence	Abbreviated Journal
Volume	220	Issue		Pages	321–329
Keywords	Object Segmentation; Bag Of Features; Feature Quantization; Densely sampled descriptors
Abstract	In this paper, we propose an object segmentation framework, based on the popular bag of features (BoF), which can process several images per second while achieving a good segmentation accuracy assigning an object category to every pixel of the image. We propose an efficient color descriptor to complement the information obtained by a typical gradient-based local descriptor. Results show that color proves to be a useful cue to increase the segmentation accuracy, specially in large homogeneous regions. Then, we extend the Hierarchical K-Means codebook using the recently proposed Vector of Locally Aggregated Descriptors method. Finally, we show that the BoF method can be easily parallelized since it is applied locally, thus the time necessary to process an image is further reduced. The performance of the proposed method is evaluated in the standard PASCAL 2007 Segmentation Challenge object segmentation dataset.
Address
Corporate Author				Thesis
Publisher	IOS Press Amsterdam,	Place of Publication		Editor	In R.Alquezar, A.Moreno, J.Aguilar.
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	9781607506423	Medium
Area		Expedition		Conference	CCIA
Notes	ADAS			Approved	no
Call Number	Admin @ si @ ARL2010b			Serial	1417
Permanent link to this record



Author	Francisco Javier Orozco; Ognjen Rudovic; Jordi Gonzalez; Maja Pantic
Title	Hierarchical On-line Appearance-Based Tracking for 3D Head Pose, Eyebrows, Lips, Eyelids and Irises			Type	Journal Article
Year	2013	Publication	Image and Vision Computing	Abbreviated Journal	IMAVIS
Volume	31	Issue	4	Pages	322-340
Keywords	On-line appearance models; Levenberg–Marquardt algorithm; Line-search optimization; 3D face tracking; Facial action tracking; Eyelid tracking; Iris tracking
Abstract	In this paper, we propose an On-line Appearance-Based Tracker (OABT) for simultaneous tracking of 3D head pose, lips, eyebrows, eyelids and irises in monocular video sequences. In contrast to previously proposed tracking approaches, which deal with face and gaze tracking separately, our OABT can also be used for eyelid and iris tracking, as well as 3D head pose, lips and eyebrows facial actions tracking. Furthermore, our approach applies an on-line learning of changes in the appearance of the tracked target. Hence, the prior training of appearance models, which usually requires a large amount of labeled facial images, is avoided. Moreover, the proposed method is built upon a hierarchical combination of three OABTs, which are optimized using a Levenberg–Marquardt Algorithm (LMA) enhanced with line-search procedures. This, in turn, makes the proposed method robust to changes in lighting conditions, occlusions and translucent textures, as evidenced by our experiments. Finally, the proposed method achieves head and facial actions tracking in real-time.
Address
Corporate Author				Thesis
Publisher	Elsevier	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ISE; 605.203; 302.012; 302.018; 600.049			Approved	no
Call Number	ORG2013			Serial	2221
Permanent link to this record



Author	Lluis Pere de las Heras; Oriol Ramos Terrades; Josep Llados
Title	Attributed Graph Grammar for floor plan analysis			Type	Conference Article
Year	2015	Publication	13th International Conference on Document Analysis and Recognition ICDAR2015	Abbreviated Journal
Volume		Issue		Pages	726 - 730
Keywords
Abstract	In this paper, we propose the use of an Attributed Graph Grammar as unique framework to model and recognize the structure of floor plans. This grammar represents a building as a hierarchical composition of structurally and semantically related elements, where common representations are learned stochastically from annotated data. Given an input image, the parsing consists on constructing that graph representation that better agrees with the probabilistic model defined by the grammar. The proposed method provides several advantages with respect to the traditional floor plan analysis techniques. It uses an unsupervised statistical approach for detecting walls that adapts to different graphical notations and relaxes strong structural assumptions such are straightness and orthogonality. Moreover, the independence between the knowledge model and the parsing implementation allows the method to learn automatically different building configurations and thus, to cope the existing variability. These advantages are clearly demonstrated by comparing it with the most recent floor plan interpretation techniques on 4 datasets of real floor plans with different notations.
Address	Nancy; France; August 2015
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICDAR
Notes	DAG; 600.077; 600.061			Approved	no
Call Number	Admin @ si @ HRL2015b			Serial	2727
Permanent link to this record



Author	Shanxin Yuan; Guillermo Garcia-Hernando; Bjorn Stenger; Gyeongsik Moon; Ju Yong Chang; Kyoung Mu Lee; Pavlo Molchanov; Jan Kautz; Sina Honari; Liuhao Ge; Junsong Yuan; Xinghao Chen; Guijin Wang; Fan Yang; Kai Akiyama; Yang Wu; Qingfu Wan; Meysam Madadi; Sergio Escalera; Shile Li; Dongheui Lee; Iason Oikonomidis; Antonis Argyros; Tae-Kyun Kim
Title	Depth-Based 3D Hand Pose Estimation: From Current Achievements to Future Goals			Type	Conference Article
Year	2018	Publication	31st IEEE Conference on Computer Vision and Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages	2636 - 2645
Keywords	Three-dimensional displays; Task analysis; Pose estimation; Two dimensional displays; Joints; Training; Solid modeling
Abstract	In this paper, we strive to answer two questions: What is the current state of 3D hand pose estimation from depth images? And, what are the next challenges that need to be tackled? Following the successful Hands In the Million Challenge (HIM2017), we investigate the top 10 state-of-the-art methods on three tasks: single frame 3D pose estimation, 3D hand tracking, and hand pose estimation during object interaction. We analyze the performance of different CNN structures with regard to hand shape, joint visibility, view point and articulation distributions. Our findings include: (1) isolated 3D hand pose estimation achieves low mean errors (10 mm) in the view point range of [70, 120] degrees, but it is far from being solved for extreme view points; (2) 3D volumetric representations outperform 2D CNNs, better capturing the spatial structure of the depth data; (3) Discriminative methods still generalize poorly to unseen hand shapes; (4) While joint occlusions pose a challenge for most methods, explicit modeling of structure constraints can significantly narrow the gap between errors on visible and occluded joints.
Address	Salt Lake City; USA; June 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CVPR
Notes	HUPBA; no proj			Approved	no
Call Number	Admin @ si @ YGS2018			Serial	3115
Permanent link to this record



Author	Marçal Rusiñol; Josep Llados
Title	Boosting the Handwritten Word Spotting Experience by Including the User in the Loop			Type	Journal Article
Year	2014	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	47	Issue	3	Pages	1063–1072
Keywords	Handwritten word spotting; Query by example; Relevance feedback; Query fusion; Multidimensional scaling
Abstract	In this paper, we study the effect of taking the user into account in a query-by-example handwritten word spotting framework. Several off-the-shelf query fusion and relevance feedback strategies have been tested in the handwritten word spotting context. The increase in terms of precision when the user is included in the loop is assessed using two datasets of historical handwritten documents and two baseline word spotting approaches both based on the bag-of-visual-words model. We finally present two alternative ways of presenting the results to the user that might be more attractive and suitable to the user's needs than the classic ranked list.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0031-3203	ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.045; 600.061; 600.077			Approved	no
Call Number	Admin @ si @ RuL2013			Serial	2343
Permanent link to this record



Author	Juan Ignacio Toledo; Jordi Cucurull; Jordi Puiggali; Alicia Fornes; Josep Llados
Title	Document Analysis Techniques for Automatic Electoral Document Processing: A Survey			Type	Conference Article
Year	2015	Publication	E-Voting and Identity, Proceedings of 5th international conference, VoteID 2015	Abbreviated Journal
Volume		Issue		Pages	139-141
Keywords	Document image analysis; Computer vision; Paper ballots; Paper based elections; Optical scan; Tally
Abstract	In this paper, we will discuss the most common challenges in electoral document processing and study the different solutions from the document analysis community that can be applied in each case. We will cover Optical Mark Recognition techniques to detect voter selections in the Australian Ballot, handwritten number recognition for preferential elections and handwriting recognition for write-in areas. We will also propose some particular adjustments that can be made to those general techniques in the specific context of electoral documents.
Address	Bern; Switzerland; September 2015
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	VoteID
Notes	DAG; 600.061; 602.006; 600.077			Approved	no
Call Number	Admin @ si @ TCP2015			Serial	2641
Permanent link to this record