Records |
Author |
Razieh Rastgoo; Kourosh Kiani; Sergio Escalera |
Title |
Video-based Isolated Hand Sign Language Recognition Using a Deep Cascaded Model |
Type |
Journal Article |
Year |
2020 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
79 |
Issue |
|
Pages |
22965–22987 |
Keywords |
|
Abstract |
In this paper, we propose an efficient cascaded model for sign language recognition that leverages spatio-temporal hand-based information using deep learning approaches, in particular the Single Shot Detector (SSD), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM), applied to videos. Our simple yet efficient and accurate model includes two main parts: hand detection and sign recognition. Three types of spatial features, including hand features, Extra Spatial Hand Relation (ESHR) features, and Hand Pose (HP) features, are fused in the model and fed to an LSTM for temporal feature extraction. We train the SSD model for hand detection using videos collected from five online sign dictionaries. Our model is evaluated on our proposed dataset (Rastgoo et al., Expert Syst Appl 150: 113336, 2020), which includes 10,000 sign videos of 100 Persian signs performed by 10 contributors against 10 different backgrounds, and on the isoGD dataset. Using 5-fold cross-validation, our model outperforms state-of-the-art alternatives in sign language recognition. |
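The 5-fold cross-validation protocol used for the evaluation above can be sketched as a plain fold splitter; this is a generic illustration of the technique, not the authors' code, and the name `k_fold` is hypothetical:

```python
def k_fold(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation:
    the data is split into k contiguous folds, and each fold in turn
    serves as the test set while the rest are used for training."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

# With 10 samples and k=5, each fold holds 2 test samples.
folds = list(k_fold(10, k=5))
```

Each sample appears in exactly one test fold, so the k per-fold scores can be averaged into a single estimate.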
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
HuPBA; not mentioned |
Approved |
no |
Call Number |
Admin @ si @ RKE2020b |
Serial |
3442 |
Permanent link to this record |
|
|
|
Author |
C. Butakoff; Simone Balocco; F.M. Sukno; C. Hoogendoorn; C. Tobon-Gomez; G. Avegliano; A.F. Frangi |
Title |
Left-ventricular Epi- and Endocardium Extraction from 3D Ultrasound Images Using an Automatically Constructed 3D ASM |
Type |
Journal Article |
Year |
2016 |
Publication |
Computer Methods in Biomechanics and Biomedical Engineering: Imaging and Visualization |
Abbreviated Journal |
CMBBE |
Volume |
4 |
Issue |
5 |
Pages |
265-280 |
Keywords |
ASM; cardiac segmentation; statistical model; shape model; 3D ultrasound |
Abstract |
In this paper, we propose an automatic method for constructing an active shape model (ASM) to segment the complete cardiac left ventricle in 3D ultrasound (3DUS) images, which avoids costly manual landmarking. The automatic construction of ASMs has already been addressed in the literature; however, the direct application of these methods to 3DUS is hampered by a high level of noise and artefacts. Therefore, we propose to construct the ASM by fusing multidetector computed tomography data, to learn the shape, with artificially generated 3DUS images, to learn the neighbourhood of the boundaries. Our artificial images were generated by two approaches: a faster one that does not take into account the geometry of the transducer, and a more comprehensive one, implemented in the Field II toolbox. The segmentation accuracy of our ASM was evaluated on 20 patients with left-ventricular asynchrony, demonstrating the plausibility of the approach. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
2168-1163 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MILAB |
Approved |
no |
Call Number |
Admin @ si @ BBS2016 |
Serial |
2449 |
Permanent link to this record |
|
|
|
Author |
Mohamed Ali Souibgui; Sanket Biswas; Andres Mafla; Ali Furkan Biten; Alicia Fornes; Yousri Kessentini; Josep Llados; Lluis Gomez; Dimosthenis Karatzas |
Title |
Text-DIAE: a self-supervised degradation invariant autoencoder for text recognition and document enhancement |
Type |
Conference Article |
Year |
2023 |
Publication |
Proceedings of the 37th AAAI Conference on Artificial Intelligence |
Abbreviated Journal |
|
Volume |
37 |
Issue |
2 |
Pages |
|
Keywords |
Representation Learning for Vision; CV Applications; CV Language and Vision; ML Unsupervised; Self-Supervised Learning |
Abstract |
In this paper, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE), a self-supervised model designed to tackle two tasks: text recognition (handwritten or scene text) and document image enhancement. We start by employing a transformer-based architecture that incorporates three pretext tasks as learning objectives to be optimized during pre-training without using labelled data. Each of the pretext objectives is specifically tailored to the final downstream tasks. We conduct several ablation experiments that confirm the design choice of the selected pretext tasks. Importantly, the proposed model does not exhibit the limitations of previous state-of-the-art methods based on contrastive losses, while at the same time requiring substantially fewer data samples to converge. Finally, we demonstrate that our method surpasses the state of the art in existing supervised and self-supervised settings for handwritten and scene-text recognition and for document image enhancement. Our code and trained models will be made publicly available at https://github.com/dali92002/SSL-OCR |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
AAAI |
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ SBM2023 |
Serial |
3848 |
Permanent link to this record |
|
|
|
Author |
Cesar Isaza; Joaquin Salas; Bogdan Raducanu |
Title |
Synthetic ground truth dataset to detect shadow cast by static objects in outdoor |
Type |
Conference Article |
Year |
2012 |
Publication |
1st International Workshop on Visual Interfaces for Ground Truth Collection in Computer Vision Applications |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
art. 11 |
Keywords |
|
Abstract |
In this paper, we propose a precise synthetic ground truth dataset to study the problem of detection of the shadows cast by static objects in outdoor environments during extended periods of time (days). For our dataset, we have created a virtual scenario using a rendering software. To increase the realism of the simulated environment, we have defined the scenario in a precise geographical location. In our dataset the sun is by far the main illumination source. The sun position during the simulation time takes into consideration factors related to the geographical location, such as the latitude, longitude, elevation above sea level, and precise image capturing day and time. In our simulation the camera remains fixed. The dataset consists of seven days of simulation, from 10:00am to 5:00pm. Images are captured every 10 seconds. The shadows' ground truth is automatically computed by the rendering software. |
Address |
Capri, Italy |
Corporate Author |
|
Thesis |
|
Publisher |
ACM |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
978-1-4503-1405-3 |
Medium |
|
Area |
|
Expedition |
|
Conference |
VIGTA |
Notes |
OR;MV |
Approved |
no |
Call Number |
Admin @ si @ ISR2012a |
Serial |
2037 |
Permanent link to this record |
|
|
|
Author |
Carlo Gatta; Adriana Romero; Joost Van de Weijer |
Title |
Unrolling loopy top-down semantic feedback in convolutional deep networks |
Type |
Conference Article |
Year |
2014 |
Publication |
Workshop on Deep Vision: Deep Learning for Computer Vision |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
498-505 |
Keywords |
|
Abstract |
In this paper, we propose a novel way to perform top-down semantic feedback in convolutional deep networks for efficient and accurate image parsing. We also show how to add global appearance/semantic features, which have been shown to improve image parsing performance in state-of-the-art methods but were not present in previous convolutional approaches. The proposed method is characterised by efficient training and sufficiently fast testing. We use the well-known SIFTflow dataset to numerically show the advantages provided by our contributions, and to compare with state-of-the-art convolutional image parsing approaches. |
Address |
Columbus; Ohio; June 2014 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
CVPRW |
Notes |
LAMP; MILAB; 601.160; 600.079 |
Approved |
no |
Call Number |
Admin @ si @ GRW2014 |
Serial |
2490 |
Permanent link to this record |
|
|
|
Author |
Cesar Isaza; Joaquin Salas; Bogdan Raducanu |
Title |
Toward the Detection of Urban Infrastructures Edge Shadows |
Type |
Conference Article |
Year |
2010 |
Publication |
12th International Conference on Advanced Concepts for Intelligent Vision Systems |
Abbreviated Journal |
|
Volume |
6474 |
Issue |
I |
Pages |
30–37 |
Keywords |
|
Abstract |
In this paper, we propose a novel technique to detect the shadows cast by urban infrastructure, such as buildings, billboards, and traffic signs, using a sequence of images taken from a fixed camera. In our approach, we compute two different background models in parallel: one for the edges and one for the reflected light intensity. An algorithm is proposed to train the system to distinguish between moving edges in general and edges that belong to static objects, creating an edge background model. Then, during operation, a background intensity model allows us to separate moving from static objects. The edges included in the moving objects and those that belong to the edge background model are subtracted from the current image edges. The remaining edges are the ones cast by urban infrastructure. Our method is tested on a typical crossroad scene, and the results show that the approach is sound and promising. |
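The intensity background model described above can be approximated by a per-pixel exponential running average, a standard background-subtraction technique; this is a minimal sketch under that assumption, not the authors' implementation, and the function names and threshold value are hypothetical:

```python
def update_background(bg, frame, alpha=0.05):
    """Exponential running average: slowly blend the new frame into the
    background model, so static scene changes are absorbed over time."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(bg, frame)]

def foreground_mask(bg, frame, thresh=30):
    """Pixels that differ strongly from the background are flagged as moving."""
    return [abs(f - b) > thresh for b, f in zip(bg, frame)]

# Toy 4-pixel scene: after many frames, a static bright object (pixel 2)
# is absorbed into the background and no longer flagged.
bg = [100.0, 100.0, 100.0, 100.0]
for _ in range(50):
    bg = update_background(bg, [100, 100, 180, 100])
mask = foreground_mask(bg, [100, 100, 180, 100])
```

A genuinely moving object (a sudden change in one pixel) would still exceed the threshold and be flagged, which is the separation between moving and static objects the abstract relies on.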
Address |
Sydney, Australia |
Corporate Author |
|
Thesis |
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
Blanc-Talon et al. (eds.) |
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
0302-9743 |
ISBN |
978-3-642-17687-6 |
Medium |
|
Area |
|
Expedition |
|
Conference |
ACIVS |
Notes |
OR;MV |
Approved |
no |
Call Number |
BCNPCL @ bcnpcl @ ISR2010 |
Serial |
1458 |
Permanent link to this record |
|
|
|
Author |
Farshad Nourbakhsh; Dimosthenis Karatzas; Ernest Valveny |
Title |
A polar-based logo representation based on topological and colour features |
Type |
Conference Article |
Year |
2010 |
Publication |
9th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
341–348 |
Keywords |
|
Abstract |
In this paper, we propose a novel rotation- and scale-invariant method for colour logo retrieval and classification, which involves performing a simple colour segmentation and subsequently describing each of the resultant colour components with a set of topological and colour features. A polar representation is used to represent the logo, and the subsequent logo matching is based on Cyclic Dynamic Time Warping (CDTW). We also show how combining information about the global distribution of the logo components and their local neighbourhood, using the Delaunay triangulation, allows us to improve the results. All experiments are performed on a dataset of 2500 instances of 100 colour logo images in different rotations and scales. |
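Cyclic DTW, the matching step named above, can be sketched as the minimum ordinary DTW distance over all cyclic rotations of one sequence, which makes the comparison independent of where the polar profile starts; a minimal 1-D version (function names are hypothetical, and practical CDTW implementations use a more efficient formulation than this brute-force loop):

```python
def dtw(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def cdtw(a, b):
    """Cyclic DTW: minimum DTW over all cyclic rotations of `b`, giving
    invariance to the angular starting point of a polar representation."""
    return min(dtw(a, b[k:] + b[:k]) for k in range(len(b)))

# A cyclically shifted copy of a polar profile matches perfectly under
# CDTW, while plain DTW sees it as a different sequence.
profile = [0.0, 1.0, 3.0, 2.0, 0.5, 0.2]
rotated = profile[2:] + profile[:2]
d_linear = dtw(profile, rotated)
d_cyclic = cdtw(profile, rotated)
```

This is exactly the rotation invariance the polar logo representation needs: a rotated logo only shifts the starting angle of its profile.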
Address |
Boston; USA |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
978-1-60558-773-8 |
Medium |
|
Area |
|
Expedition |
|
Conference |
DAS |
Notes |
DAG |
Approved |
no |
Call Number |
DAG @ dag @ NKV2010 |
Serial |
1436 |
Permanent link to this record |
|
|
|
Author |
Q. Bao; Marçal Rusiñol; M.Coustaty; Muhammad Muzzamil Luqman; C.D. Tran; Jean-Marc Ogier |
Title |
Delaunay triangulation-based features for Camera-based document image retrieval system |
Type |
Conference Article |
Year |
2016 |
Publication |
12th IAPR Workshop on Document Analysis Systems |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
1-6 |
Keywords |
Camera-based Document Image Retrieval; Delaunay Triangulation; Feature descriptors; Indexing |
Abstract |
In this paper, we propose a new feature vector, named DElaunay TRIangulation-based Features (DETRIF), for real-time camera-based document image retrieval. DETRIF is computed from the geometrical constraints of each pair of adjacent triangles in a Delaunay triangulation constructed from the centroids of connected components. In addition, we employ a hashing-based indexing system in order to evaluate the performance of DETRIF and to compare it with other systems such as LLAH and SRIF. The experiments are carried out on two datasets comprising 400 complex, heterogeneous-content linguistic map images (very large, at 9800 × 11768 pixels) and 700 textual document images. |
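The hashing-based indexing mentioned above can be sketched by quantizing a real-valued geometric feature vector into a discrete key and using it as a dictionary bucket, so a slightly perturbed query still lands in the same bucket; this is a generic simplification, not the paper's actual index, and all names and step sizes are hypothetical:

```python
def quantize(features, step=0.5):
    """Quantize a real-valued feature vector into a hashable key: nearby
    feature vectors map to the same tuple of integer bins."""
    return tuple(round(f / step) for f in features)

index = {}

def add(doc_id, features):
    """Index a document under the quantized key of its feature vector."""
    index.setdefault(quantize(features), []).append(doc_id)

def lookup(features):
    """Retrieve all documents whose features fall in the query's bucket."""
    return index.get(quantize(features), [])

add("map-001", [1.02, 2.48, 0.33])
add("doc-042", [3.10, 0.12, 1.95])
# A slightly perturbed query still hashes to map-001's bucket.
hits = lookup([0.98, 2.51, 0.30])
```

The lookup is a constant-time dictionary access, which is what makes this style of indexing suitable for real-time retrieval.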
Address |
Santorini; Greece; April 2016 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
DAS |
Notes |
DAG; 600.061; 600.084; 600.077 |
Approved |
no |
Call Number |
Admin @ si @ BRC2016 |
Serial |
2757 |
Permanent link to this record |
|
|
|
Author |
Youssef El Rhabi; Simon Loic; Brun Luc; Josep Llados; Felipe Lumbreras |
Title |
Information Theoretic Rotationwise Robust Binary Descriptor Learning |
Type |
Conference Article |
Year |
2016 |
Publication |
Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR) |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
368-378 |
Keywords |
|
Abstract |
In this paper, we propose a new data-driven approach for binary descriptor selection. In order to draw a clear analysis of common designs, we present a general information-theoretic selection paradigm. It encompasses several standard binary descriptor construction schemes, including a recent state-of-the-art one named BOLD. We pursue the same endeavour to increase the stability of the produced descriptors with respect to rotations. To achieve this goal, we have designed a novel offline selection criterion that is better adapted to the online matching procedure. The effectiveness of our approach is demonstrated on two standard datasets, where our descriptor is compared to BOLD and to several classical descriptors. In particular, it emerges that our approach achieves performance equivalent to, if not better than, that of BOLD while relying on descriptors half as long. Such an improvement can be influential for real-time applications. |
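The online matching procedure for binary descriptors such as BOLD is nearest-neighbour search under the Hamming distance, which is what makes short binary descriptors attractive for real-time use; a minimal sketch (descriptor values and the `match` helper are illustrative, not from the paper):

```python
def hamming(d1: int, d2: int) -> int:
    """Hamming distance between two binary descriptors stored as ints:
    XOR the bit strings and count the differing bits."""
    return bin(d1 ^ d2).count("1")

def match(query: int, database: list) -> int:
    """Index of the database descriptor closest to `query` under the
    Hamming distance (nearest-neighbour matching)."""
    return min(range(len(database)), key=lambda i: hamming(query, database[i]))

# Toy 16-bit descriptors; the query differs from entry 1 in a single bit.
db = [0b1010101010101010, 0b1111000011110000, 0b0000111100001111]
query = 0b1111000011110001
best = match(query, db)
```

Halving the descriptor length, as the paper reports, directly halves both storage and the cost of each XOR-and-popcount comparison.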
Address |
Mérida; Mexico; November 2016 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
S+SSPR |
Notes |
DAG; ADAS; 600.097; 600.086 |
Approved |
no |
Call Number |
Admin @ si @ RLL2016 |
Serial |
2871 |
Permanent link to this record |
|
|
|
Author |
Alejandro Cartas; Petia Radeva; Mariella Dimiccoli |
Title |
Modeling long-term interactions to enhance action recognition |
Type |
Conference Article |
Year |
2021 |
Publication |
25th International Conference on Pattern Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
10351-10358 |
Keywords |
|
Abstract |
In this paper, we propose a new approach to understand actions in egocentric videos that exploits the semantics of object interactions at both the frame and temporal levels. At the frame level, we use a region-based approach that takes as input a primary region roughly corresponding to the user's hands and a set of secondary regions potentially corresponding to the interacting objects, and calculates the action score through a CNN formulation. This information is then fed to a Hierarchical Long Short-Term Memory Network (HLSTM) that captures temporal dependencies between actions within and across shots. Ablation studies thoroughly validate the proposed approach, showing in particular that both levels of the HLSTM architecture contribute to the performance improvement. Furthermore, quantitative comparisons show that the proposed approach outperforms the state-of-the-art in action recognition on standard benchmarks, without relying on motion information. |
Address |
January 2021 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICPR |
Notes |
MILAB |
Approved |
no |
Call Number |
Admin @ si @ CRD2021 |
Serial |
3626 |
Permanent link to this record |
|
|
|
Author |
Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades |
Title |
Text/graphic separation using a sparse representation with multi-learned dictionaries |
Type |
Conference Article |
Year |
2012 |
Publication |
21st International Conference on Pattern Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
Graphics Recognition; Layout Analysis; Document Understanding |
Abstract |
In this paper, we propose a new approach to extract text regions from graphical documents. In our method, we first empirically construct two sequences of learned dictionaries, for the text and graphical parts respectively. Then, we compute the sparse representations of all non-overlapping document patches of different sizes in these learned dictionaries. Based on these representations, each patch can be classified into the text or graphic category by comparing its reconstruction errors. Same-sized patches in one category are then merged together to define the corresponding text or graphic layers, which are combined to create the final text/graphic layer. Finally, in a post-processing step, text regions are further filtered using learned thresholds. |
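The classification step above assigns each patch to whichever dictionary reconstructs it with lower error. A minimal sketch of that decision rule, using a nearest-atom approximation in place of full sparse coding (the dictionaries, patch values, and function names here are hypothetical illustrations, not learned data):

```python
def recon_error(patch, dictionary):
    """Error of approximating `patch` with its closest atom: a 1-sparse,
    unit-coefficient stand-in for the full sparse coding in the paper."""
    return min(sum((p - a) ** 2 for p, a in zip(patch, atom))
               for atom in dictionary)

def classify(patch, text_dict, graphic_dict):
    """Assign the patch to whichever dictionary reconstructs it better."""
    if recon_error(patch, text_dict) <= recon_error(patch, graphic_dict):
        return "text"
    return "graphic"

# Hypothetical 4-pixel patches: text atoms are high-contrast stroke
# patterns, graphic atoms are flat regions.
text_dict = [[1, 0, 1, 0], [0, 1, 0, 1]]
graphic_dict = [[1, 1, 1, 1], [0, 0, 0, 0]]
label = classify([0.9, 0.1, 0.8, 0.0], text_dict, graphic_dict)
```

A noisy stroke-like patch lands in the text category because a text atom explains it with far smaller residual than any flat graphic atom.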
Address |
Tsukuba |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICPR |
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ DTR2012a |
Serial |
2135 |
Permanent link to this record |
|
|
|
Author |
Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades |
Title |
New Approach for Symbol Recognition Combining Shape Context of Interest Points with Sparse Representation |
Type |
Conference Article |
Year |
2013 |
Publication |
12th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
265-269 |
Keywords |
|
Abstract |
In this paper, we propose a new approach for symbol description. Our method is built on the combination of a shape context of interest points descriptor and sparse representation. More specifically, we first learn a dictionary describing shape context of interest point descriptors. Then, based on information retrieval techniques, we build a vector model for each symbol from its sparse representation in a visual vocabulary whose visual words are the columns of the learned dictionary. The retrieval task is performed by ranking symbols according to the similarity between vector models. Evaluation of our method on benchmark datasets demonstrates the validity of our approach and shows that it outperforms related state-of-the-art methods. |
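The retrieval step above ranks symbols by the similarity between their coefficient vectors; cosine similarity is the standard choice for such vector models in information retrieval, used here as an assumption since the abstract does not name the measure. The vectors below are hypothetical 6-atom vocabularies:

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse-coefficient vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank(query, models):
    """Symbol indices sorted by decreasing similarity to the query vector;
    the sparse coding itself is assumed to have been done already."""
    return sorted(range(len(models)),
                  key=lambda i: cosine(query, models[i]), reverse=True)

# Symbol 2's model shares the query's dominant atom, so it ranks first.
models = [[0.9, 0, 0, 0.1, 0, 0],
          [0, 0.5, 0.5, 0, 0, 0],
          [0, 0, 0.2, 0, 0.8, 0.1]]
query = [0, 0, 0.1, 0, 0.9, 0]
ranking = rank(query, models)
```

Because cosine similarity is length-normalized, symbols with many active atoms are not unfairly favoured over sparse ones.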
Address |
Washington; USA; August 2013 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1520-5363 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICDAR |
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ DTR2013b |
Serial |
2331 |
Permanent link to this record |
|
|
|
Author |
Saiping Zhang; Luis Herranz; Marta Mrak; Marc Gorriz Blanch; Shuai Wan; Fuzheng Yang |
Title |
DCNGAN: A Deformable Convolution-Based GAN with QP Adaptation for Perceptual Quality Enhancement of Compressed Video |
Type |
Conference Article |
Year |
2022 |
Publication |
47th International Conference on Acoustics, Speech, and Signal Processing |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
In this paper, we propose a deformable convolution-based generative adversarial network (DCNGAN) for perceptual quality enhancement of compressed videos. DCNGAN is also adaptive to the quantization parameters (QPs). Compared with optical flows, deformable convolutions are more effective and efficient at aligning frames. Deformable convolutions can operate on multiple frames, thus leveraging more temporal information, which is beneficial for enhancing the perceptual quality of compressed videos. Instead of aligning frames in a pairwise manner, a deformable convolution can process multiple frames simultaneously, which leads to lower computational complexity. Experimental results demonstrate that the proposed DCNGAN outperforms other state-of-the-art compressed video quality enhancement algorithms. |
Address |
Virtual; May 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICASSP |
Notes |
MACO; 600.161; 601.379 |
Approved |
no |
Call Number |
Admin @ si @ ZHM2022a |
Serial |
3765 |
Permanent link to this record |
|
|
|
Author |
Guillem Martinez; Maya Aghaei; Martin Dijkstra; Bhalaji Nagarajan; Femke Jaarsma; Jaap van de Loosdrecht; Petia Radeva; Klaas Dijkstra |
Title |
Hyper-Spectral Imaging for Overlapping Plastic Flakes Segmentation |
Type |
Conference Article |
Year |
2022 |
Publication |
47th International Conference on Acoustics, Speech, and Signal Processing |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
Hyper-spectral imaging; plastic sorting; multi-label segmentation; bitfield encoding |
Abstract |
|
Address |
Singapore; May 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICASSP |
Notes |
MILAB; no proj |
Approved |
no |
Call Number |
Admin @ si @ MAD2022 |
Serial |
3767 |
Permanent link to this record |
|
|
|
Author |
Sergio Escalera; Alicia Fornes; Oriol Pujol; Josep Llados; Petia Radeva |
Title |
Circular Blurred Shape Model for Multiclass Symbol Recognition |
Type |
Journal Article |
Year |
2011 |
Publication |
IEEE Transactions on Systems, Man, and Cybernetics, Part B |
Abbreviated Journal |
TSMCB |
Volume |
41 |
Issue |
2 |
Pages |
497-506 |
Keywords |
|
Abstract |
In this paper, we propose a circular blurred shape model descriptor to deal with the problem of symbol detection and classification as a particular case of object recognition. The feature extraction is performed by capturing the spatial arrangement of significant object characteristics in a correlogram structure. The shape information from objects is shared among correlogram regions, where a prior blurring degree defines the level of distortion allowed in the symbol, making the descriptor tolerant to irregular deformations. Moreover, the descriptor is rotation invariant by definition. We validate the effectiveness of the proposed descriptor in both the multiclass symbol recognition and symbol detection domains. In order to perform the symbol detection, the descriptors are learned using a cascade of classifiers. In the case of multiclass categorization, the new feature space is learned using a set of binary classifiers which are embedded in an error-correcting output code design. The results over four symbol data sets show the significant improvements of the proposed descriptor compared to the state-of-the-art descriptors. In particular, the results are even more significant in those cases where the symbols suffer from elastic deformations. |
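The correlogram with a blurring degree described above can be illustrated with a 1-D angular toy version: each shape point votes into its angular bin and, with reduced weight, into the adjacent bins, and the histogram is cyclically shifted to a canonical start for rotation invariance. This is a deliberate simplification under stated assumptions (the real descriptor also bins radially, and `cbsm`, the bin count, and the blur weight are hypothetical):

```python
import math

def cbsm(points, n_bins=8, blur=0.5):
    """Toy circular blurred shape model: an angular histogram around the
    shape centroid where each point also votes, with weight `blur`, into
    its two neighbouring bins, tolerating small angular distortions."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    hist = [0.0] * n_bins
    for x, y in points:
        theta = math.atan2(y - cy, x - cx) % (2 * math.pi)
        b = int(theta / (2 * math.pi) * n_bins) % n_bins
        hist[b] += 1.0
        hist[(b - 1) % n_bins] += blur  # share mass with the
        hist[(b + 1) % n_bins] += blur  # adjacent angular regions
    # Rotation invariance: cyclically shift so the heaviest bin comes first.
    k = hist.index(max(hist))
    hist = hist[k:] + hist[:k]
    total = sum(hist)
    return [h / total for h in hist]

# The same point set rotated by 90 degrees yields an identical descriptor.
shape = [(2, 1), (-1, 2), (-1, -1)]
rotated = [(-y, x) for x, y in shape]
d1, d2 = cbsm(shape), cbsm(rotated)
```

The blur weight plays the role of the prior blurring degree in the abstract: it sets how much irregular deformation still maps to the same descriptor.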
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1083-4419 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MILAB; DAG;HuPBA |
Approved |
no |
Call Number |
Admin @ si @ EFP2011 |
Serial |
1784 |
Permanent link to this record |