Records |
Author |
Razieh Rastgoo; Kourosh Kiani; Sergio Escalera |
Title |
Video-based Isolated Hand Sign Language Recognition Using a Deep Cascaded Model |
Type |
Journal Article |
Year |
2020 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
Volume |
79 |
Issue |
|
Pages |
22965–22987 |
Keywords |
|
Abstract |
In this paper, we propose an efficient cascaded model for sign language recognition that leverages spatio-temporal hand-based information using deep learning approaches, in particular the Single Shot Detector (SSD), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM), applied to videos. Our simple yet efficient and accurate model includes two main parts: hand detection and sign recognition. Three types of spatial features, including hand features, Extra Spatial Hand Relation (ESHR) features, and Hand Pose (HP) features, are fused in the model and fed to an LSTM for temporal feature extraction. We train the SSD model for hand detection using videos collected from five online sign dictionaries. Our model is evaluated on our proposed dataset (Rastgoo et al., Expert Syst Appl 150: 113336, 2020), which includes 10,000 sign videos of 100 Persian signs performed by 10 contributors against 10 different backgrounds, and on the isoGD dataset. Using 5-fold cross-validation, our model outperforms state-of-the-art alternatives in sign language recognition. |
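The 5-fold cross-validation protocol used for the evaluation above can be sketched as a plain fold splitter; this is a generic illustration of the technique, not the authors' code, and the name `k_fold` is hypothetical:

```python
def k_fold(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation:
    the data is split into k contiguous folds, and each fold in turn
    serves as the test set while the rest are used for training."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

# With 10 samples and k=5, each fold holds 2 test samples.
folds = list(k_fold(10, k=5))
```

Each sample appears in exactly one test fold, so the k per-fold scores can be averaged into a single estimate.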
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
HuPBA; not mentioned |
Approved |
no |
Call Number |
Admin @ si @ RKE2020b |
Serial |
3442 |
Permanent link to this record |
|
|
|
Author |
C. Butakoff; Simone Balocco; F.M. Sukno; C. Hoogendoorn; C. Tobon-Gomez; G. Avegliano; A.F. Frangi |
Title |
Left-ventricular Epi- and Endocardium Extraction from 3D Ultrasound Images Using an Automatically Constructed 3D ASM |
Type |
Journal Article |
Year |
2016 |
Publication |
Computer Methods in Biomechanics and Biomedical Engineering: Imaging and Visualization |
Abbreviated Journal |
CMBBE |
Volume |
4 |
Issue |
5 |
Pages |
265-280 |
Keywords |
ASM; cardiac segmentation; statistical model; shape model; 3D ultrasound |
Abstract |
In this paper, we propose an automatic method for constructing an active shape model (ASM) to segment the complete cardiac left ventricle in 3D ultrasound (3DUS) images, which avoids costly manual landmarking. The automatic construction of ASMs has already been addressed in the literature; however, the direct application of these methods to 3DUS is hampered by a high level of noise and artefacts. Therefore, we propose to construct the ASM by fusing multidetector computed tomography data, to learn the shape, with artificially generated 3DUS images, to learn the neighbourhood of the boundaries. Our artificial images were generated by two approaches: a faster one that does not take into account the geometry of the transducer, and a more comprehensive one, implemented in the Field II toolbox. The segmentation accuracy of our ASM was evaluated on 20 patients with left-ventricular asynchrony, demonstrating the plausibility of the approach. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
2168-1163 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MILAB |
Approved |
no |
Call Number |
Admin @ si @ BBS2016 |
Serial |
2449 |
Permanent link to this record |
|
|
|
Author |
Mohamed Ali Souibgui; Sanket Biswas; Andres Mafla; Ali Furkan Biten; Alicia Fornes; Yousri Kessentini; Josep Llados; Lluis Gomez; Dimosthenis Karatzas |
Title |
Text-DIAE: a self-supervised degradation invariant autoencoder for text recognition and document enhancement |
Type |
Conference Article |
Year |
2023 |
Publication |
Proceedings of the 37th AAAI Conference on Artificial Intelligence |
Abbreviated Journal |
|
Volume |
37 |
Issue |
2 |
Pages |
|
Keywords |
Representation Learning for Vision; CV Applications; CV Language and Vision; ML Unsupervised; Self-Supervised Learning |
Abstract |
In this paper, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE), a self-supervised model designed to tackle two tasks: text recognition (handwritten or scene text) and document image enhancement. We start by employing a transformer-based architecture that incorporates three pretext tasks as learning objectives to be optimized during pre-training without using labelled data. Each of the pretext objectives is specifically tailored to the final downstream tasks. We conduct several ablation experiments that confirm the design choice of the selected pretext tasks. Importantly, the proposed model does not exhibit the limitations of previous state-of-the-art methods based on contrastive losses, while at the same time requiring substantially fewer data samples to converge. Finally, we demonstrate that our method surpasses the state of the art in existing supervised and self-supervised settings for handwritten and scene-text recognition and for document image enhancement. Our code and trained models will be made publicly available at https://github.com/dali92002/SSL-OCR |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
AAAI |
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ SBM2023 |
Serial |
3848 |
Permanent link to this record |
|
|
|
Author |
Cesar Isaza; Joaquin Salas; Bogdan Raducanu |
Title |
Synthetic ground truth dataset to detect shadow cast by static objects in outdoor |
Type |
Conference Article |
Year |
2012 |
Publication |
1st International Workshop on Visual Interfaces for Ground Truth Collection in Computer Vision Applications |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
art. 11 |
Keywords |
|
Abstract |
In this paper, we propose a precise synthetic ground truth dataset to study the problem of detection of the shadows cast by static objects in outdoor environments during extended periods of time (days). For our dataset, we have created a virtual scenario using a rendering software. To increase the realism of the simulated environment, we have defined the scenario in a precise geographical location. In our dataset the sun is by far the main illumination source. The sun position during the simulation time takes into consideration factors related to the geographical location, such as the latitude, longitude, elevation above sea level, and precise image capturing day and time. In our simulation the camera remains fixed. The dataset consists of seven days of simulation, from 10:00am to 5:00pm. Images are captured every 10 seconds. The shadows' ground truth is automatically computed by the rendering software. |
Address |
Capri, Italy |
Corporate Author |
|
Thesis |
|
Publisher |
ACM |
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
978-1-4503-1405-3 |
Medium |
|
Area |
|
Expedition |
|
Conference |
VIGTA |
Notes |
OR;MV |
Approved |
no |
Call Number |
Admin @ si @ ISR2012a |
Serial |
2037 |
Permanent link to this record |
|
|
|
Author |
Carlo Gatta; Adriana Romero; Joost Van de Weijer |
Title |
Unrolling loopy top-down semantic feedback in convolutional deep networks |
Type |
Conference Article |
Year |
2014 |
Publication |
Workshop on Deep Vision: Deep Learning for Computer Vision |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
498-505 |
Keywords |
|
Abstract |
In this paper, we propose a novel way to perform top-down semantic feedback in convolutional deep networks for efficient and accurate image parsing. We also show how to add global appearance/semantic features, which have been shown to improve image parsing performance in state-of-the-art methods but were not present in previous convolutional approaches. The proposed method is characterised by efficient training and sufficiently fast testing. We use the well-known SIFTflow dataset to numerically show the advantages provided by our contributions, and to compare with state-of-the-art convolutional image parsing approaches. |
Address |
Columbus; Ohio; June 2014 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
CVPRW |
Notes |
LAMP; MILAB; 601.160; 600.079 |
Approved |
no |
Call Number |
Admin @ si @ GRW2014 |
Serial |
2490 |
Permanent link to this record |
|
|
|
Author |
Cesar Isaza; Joaquin Salas; Bogdan Raducanu |
Title |
Toward the Detection of Urban Infrastructures Edge Shadows |
Type |
Conference Article |
Year |
2010 |
Publication |
12th International Conference on Advanced Concepts for Intelligent Vision Systems |
Abbreviated Journal |
|
Volume |
6474 |
Issue |
I |
Pages |
30–37 |
Keywords |
|
Abstract |
In this paper, we propose a novel technique to detect the shadows cast by urban infrastructure, such as buildings, billboards, and traffic signs, using a sequence of images taken from a fixed camera. In our approach, we compute two different background models in parallel: one for the edges and one for the reflected light intensity. An algorithm is proposed to train the system to distinguish between moving edges in general and edges that belong to static objects, creating an edge background model. Then, during operation, a background intensity model allows us to separate moving from static objects. The edges included in the moving objects and those that belong to the edge background model are subtracted from the current image edges. The remaining edges are the ones cast by urban infrastructure. Our method is tested on a typical crossroad scene, and the results show that the approach is sound and promising. |
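The intensity background model described above can be approximated by a per-pixel exponential running average, a standard background-subtraction technique; this is a minimal sketch under that assumption, not the authors' implementation, and the function names and threshold value are hypothetical:

```python
def update_background(bg, frame, alpha=0.05):
    """Exponential running average: slowly blend the new frame into the
    background model, so static scene changes are absorbed over time."""
    return [(1 - alpha) * b + alpha * f for b, f in zip(bg, frame)]

def foreground_mask(bg, frame, thresh=30):
    """Pixels that differ strongly from the background are flagged as moving."""
    return [abs(f - b) > thresh for b, f in zip(bg, frame)]

# Toy 4-pixel scene: after many frames, a static bright object (pixel 2)
# is absorbed into the background and no longer flagged.
bg = [100.0, 100.0, 100.0, 100.0]
for _ in range(50):
    bg = update_background(bg, [100, 100, 180, 100])
mask = foreground_mask(bg, [100, 100, 180, 100])
```

A genuinely moving object (a sudden change in one pixel) would still exceed the threshold and be flagged, which is the separation between moving and static objects the abstract relies on.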
Address |
Sydney, Australia |
Corporate Author |
|
Thesis |
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
Blanc-Talon et al. (eds.) |
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
0302-9743 |
ISBN |
978-3-642-17687-6 |
Medium |
|
Area |
|
Expedition |
|
Conference |
ACIVS |
Notes |
OR;MV |
Approved |
no |
Call Number |
BCNPCL @ bcnpcl @ ISR2010 |
Serial |
1458 |
Permanent link to this record |
|
|
|
Author |
Farshad Nourbakhsh; Dimosthenis Karatzas; Ernest Valveny |
Title |
A polar-based logo representation based on topological and colour features |
Type |
Conference Article |
Year |
2010 |
Publication |
9th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
341–348 |
Keywords |
|
Abstract |
In this paper, we propose a novel rotation- and scale-invariant method for colour logo retrieval and classification, which involves performing a simple colour segmentation and subsequently describing each of the resultant colour components with a set of topological and colour features. A polar representation is used to represent the logo, and the subsequent logo matching is based on Cyclic Dynamic Time Warping (CDTW). We also show how combining information about the global distribution of the logo components and their local neighbourhood, using the Delaunay triangulation, allows us to improve the results. All experiments are performed on a dataset of 2500 instances of 100 colour logo images in different rotations and scales. |
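Cyclic DTW, the matching step named above, can be sketched as the minimum ordinary DTW distance over all cyclic rotations of one sequence, which makes the comparison independent of where the polar profile starts; a minimal 1-D version (function names are hypothetical, and practical CDTW implementations use a more efficient formulation than this brute-force loop):

```python
def dtw(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def cdtw(a, b):
    """Cyclic DTW: minimum DTW over all cyclic rotations of `b`, giving
    invariance to the angular starting point of a polar representation."""
    return min(dtw(a, b[k:] + b[:k]) for k in range(len(b)))

# A cyclically shifted copy of a polar profile matches perfectly under
# CDTW, while plain DTW sees it as a different sequence.
profile = [0.0, 1.0, 3.0, 2.0, 0.5, 0.2]
rotated = profile[2:] + profile[:2]
d_linear = dtw(profile, rotated)
d_cyclic = cdtw(profile, rotated)
```

This is exactly the rotation invariance the polar logo representation needs: a rotated logo only shifts the starting angle of its profile.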
Address |
Boston; USA |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
978-1-60558-773-8 |
Medium |
|
Area |
|
Expedition |
|
Conference |
DAS |
Notes |
DAG |
Approved |
no |
Call Number |
DAG @ dag @ NKV2010 |
Serial |
1436 |
Permanent link to this record |
|
|
|
Author |
Q. Bao; Marçal Rusiñol; M.Coustaty; Muhammad Muzzamil Luqman; C.D. Tran; Jean-Marc Ogier |
Title |
Delaunay triangulation-based features for Camera-based document image retrieval system |
Type |
Conference Article |
Year |
2016 |
Publication |
12th IAPR Workshop on Document Analysis Systems |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
1-6 |
Keywords |
Camera-based Document Image Retrieval; Delaunay Triangulation; Feature descriptors; Indexing |
Abstract |
In this paper, we propose a new feature vector, named DElaunay TRIangulation-based Features (DETRIF), for real-time camera-based document image retrieval. DETRIF is computed from the geometrical constraints of each pair of adjacent triangles in a Delaunay triangulation constructed from the centroids of connected components. In addition, we employ a hashing-based indexing system in order to evaluate the performance of DETRIF and to compare it with other systems such as LLAH and SRIF. The experiments are carried out on two datasets comprising 400 complex, heterogeneous-content linguistic map images (very large, at 9800 × 11768 pixels) and 700 textual document images. |
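The hashing-based indexing mentioned above can be sketched by quantizing a real-valued geometric feature vector into a discrete key and using it as a dictionary bucket, so a slightly perturbed query still lands in the same bucket; this is a generic simplification, not the paper's actual index, and all names and step sizes are hypothetical:

```python
def quantize(features, step=0.5):
    """Quantize a real-valued feature vector into a hashable key: nearby
    feature vectors map to the same tuple of integer bins."""
    return tuple(round(f / step) for f in features)

index = {}

def add(doc_id, features):
    """Index a document under the quantized key of its feature vector."""
    index.setdefault(quantize(features), []).append(doc_id)

def lookup(features):
    """Retrieve all documents whose features fall in the query's bucket."""
    return index.get(quantize(features), [])

add("map-001", [1.02, 2.48, 0.33])
add("doc-042", [3.10, 0.12, 1.95])
# A slightly perturbed query still hashes to map-001's bucket.
hits = lookup([0.98, 2.51, 0.30])
```

The lookup is a constant-time dictionary access, which is what makes this style of indexing suitable for real-time retrieval.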
Address |
Santorini; Greece; April 2016 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
DAS |
Notes |
DAG; 600.061; 600.084; 600.077 |
Approved |
no |
Call Number |
Admin @ si @ BRC2016 |
Serial |
2757 |
Permanent link to this record |
|
|
|
Author |
Youssef El Rhabi; Simon Loic; Brun Luc; Josep Llados; Felipe Lumbreras |
Title |
Information Theoretic Rotationwise Robust Binary Descriptor Learning |
Type |
Conference Article |
Year |
2016 |
Publication |
Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR) |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
368-378 |
Keywords |
|
Abstract |
In this paper, we propose a new data-driven approach for binary descriptor selection. In order to draw a clear analysis of common designs, we present a general information-theoretic selection paradigm. It encompasses several standard binary descriptor construction schemes, including a recent state-of-the-art one named BOLD. We pursue the same endeavour to increase the stability of the produced descriptors with respect to rotations. To achieve this goal, we have designed a novel offline selection criterion that is better adapted to the online matching procedure. The effectiveness of our approach is demonstrated on two standard datasets, where our descriptor is compared to BOLD and to several classical descriptors. In particular, it emerges that our approach achieves performance equivalent to, if not better than, that of BOLD while relying on descriptors half as long. Such an improvement can be influential for real-time applications. |
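The online matching procedure for binary descriptors such as BOLD is nearest-neighbour search under the Hamming distance, which is what makes short binary descriptors attractive for real-time use; a minimal sketch (descriptor values and the `match` helper are illustrative, not from the paper):

```python
def hamming(d1: int, d2: int) -> int:
    """Hamming distance between two binary descriptors stored as ints:
    XOR the bit strings and count the differing bits."""
    return bin(d1 ^ d2).count("1")

def match(query: int, database: list) -> int:
    """Index of the database descriptor closest to `query` under the
    Hamming distance (nearest-neighbour matching)."""
    return min(range(len(database)), key=lambda i: hamming(query, database[i]))

# Toy 16-bit descriptors; the query differs from entry 1 in a single bit.
db = [0b1010101010101010, 0b1111000011110000, 0b0000111100001111]
query = 0b1111000011110001
best = match(query, db)
```

Halving the descriptor length, as the paper reports, directly halves both storage and the cost of each XOR-and-popcount comparison.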
Address |
Mérida; Mexico; November 2016 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
S+SSPR |
Notes |
DAG; ADAS; 600.097; 600.086 |
Approved |
no |
Call Number |
Admin @ si @ RLL2016 |
Serial |
2871 |
Permanent link to this record |
|
|
|
Author |
Alejandro Cartas; Petia Radeva; Mariella Dimiccoli |
Title |
Modeling long-term interactions to enhance action recognition |
Type |
Conference Article |
Year |
2021 |
Publication |
25th International Conference on Pattern Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
10351-10358 |
Keywords |
|
Abstract |
In this paper, we propose a new approach to understand actions in egocentric videos that exploits the semantics of object interactions at both the frame and temporal levels. At the frame level, we use a region-based approach that takes as input a primary region roughly corresponding to the user's hands and a set of secondary regions potentially corresponding to the interacting objects, and calculates the action score through a CNN formulation. This information is then fed to a Hierarchical Long Short-Term Memory Network (HLSTM) that captures temporal dependencies between actions within and across shots. Ablation studies thoroughly validate the proposed approach, showing in particular that both levels of the HLSTM architecture contribute to the performance improvement. Furthermore, quantitative comparisons show that the proposed approach outperforms the state-of-the-art in action recognition on standard benchmarks, without relying on motion information. |
Address |
January 2021 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICPR |
Notes |
MILAB |
Approved |
no |
Call Number |
Admin @ si @ CRD2021 |
Serial |
3626 |
Permanent link to this record |
|
|
|
Author |
Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades |
Title |
Text/graphic separation using a sparse representation with multi-learned dictionaries |
Type |
Conference Article |
Year |
2012 |
Publication |
21st International Conference on Pattern Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
Graphics Recognition; Layout Analysis; Document Understanding |
Abstract |
In this paper, we propose a new approach to extract text regions from graphical documents. In our method, we first empirically construct two sequences of learned dictionaries, for the text and graphical parts respectively. Then, we compute the sparse representations of all non-overlapping document patches of different sizes in these learned dictionaries. Based on these representations, each patch can be classified into the text or graphic category by comparing its reconstruction errors. Same-sized patches in one category are then merged together to define the corresponding text or graphic layers, which are combined to create the final text/graphic layer. Finally, in a post-processing step, text regions are further filtered using learned thresholds. |
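The classification step above assigns each patch to whichever dictionary reconstructs it with lower error. A minimal sketch of that decision rule, using a nearest-atom approximation in place of full sparse coding (the dictionaries, patch values, and function names here are hypothetical illustrations, not learned data):

```python
def recon_error(patch, dictionary):
    """Error of approximating `patch` with its closest atom: a 1-sparse,
    unit-coefficient stand-in for the full sparse coding in the paper."""
    return min(sum((p - a) ** 2 for p, a in zip(patch, atom))
               for atom in dictionary)

def classify(patch, text_dict, graphic_dict):
    """Assign the patch to whichever dictionary reconstructs it better."""
    if recon_error(patch, text_dict) <= recon_error(patch, graphic_dict):
        return "text"
    return "graphic"

# Hypothetical 4-pixel patches: text atoms are high-contrast stroke
# patterns, graphic atoms are flat regions.
text_dict = [[1, 0, 1, 0], [0, 1, 0, 1]]
graphic_dict = [[1, 1, 1, 1], [0, 0, 0, 0]]
label = classify([0.9, 0.1, 0.8, 0.0], text_dict, graphic_dict)
```

A noisy stroke-like patch lands in the text category because a text atom explains it with far smaller residual than any flat graphic atom.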
Address |
Tsukuba |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICPR |
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ DTR2012a |
Serial |
2135 |
Permanent link to this record |
|
|
|
Author |
Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades |
Title |
New Approach for Symbol Recognition Combining Shape Context of Interest Points with Sparse Representation |
Type |
Conference Article |
Year |
2013 |
Publication |
12th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
265-269 |
Keywords |
|
Abstract |
In this paper, we propose a new approach for symbol description. Our method is built on the combination of a shape context of interest points descriptor and sparse representation. More specifically, we first learn a dictionary describing shape context of interest point descriptors. Then, based on information retrieval techniques, we build a vector model for each symbol from its sparse representation in a visual vocabulary whose visual words are the columns of the learned dictionary. The retrieval task is performed by ranking symbols according to the similarity between vector models. Evaluation of our method on benchmark datasets demonstrates the validity of our approach and shows that it outperforms related state-of-the-art methods. |
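The retrieval step above ranks symbols by the similarity between their coefficient vectors; cosine similarity is the standard choice for such vector models in information retrieval, used here as an assumption since the abstract does not name the measure. The vectors below are hypothetical 6-atom vocabularies:

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse-coefficient vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank(query, models):
    """Symbol indices sorted by decreasing similarity to the query vector;
    the sparse coding itself is assumed to have been done already."""
    return sorted(range(len(models)),
                  key=lambda i: cosine(query, models[i]), reverse=True)

# Symbol 2's model shares the query's dominant atom, so it ranks first.
models = [[0.9, 0, 0, 0.1, 0, 0],
          [0, 0.5, 0.5, 0, 0, 0],
          [0, 0, 0.2, 0, 0.8, 0.1]]
query = [0, 0, 0.1, 0, 0.9, 0]
ranking = rank(query, models)
```

Because cosine similarity is length-normalized, symbols with many active atoms are not unfairly favoured over sparse ones.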
Address |
Washington; USA; August 2013 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1520-5363 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICDAR |
Notes |
DAG |
Approved |
no |
Call Number |
Admin @ si @ DTR2013b |
Serial |
2331 |
Permanent link to this record |
|
|
|
Author |
Saiping Zhang; Luis Herranz; Marta Mrak; Marc Gorriz Blanch; Shuai Wan; Fuzheng Yang |
Title |
DCNGAN: A Deformable Convolution-Based GAN with QP Adaptation for Perceptual Quality Enhancement of Compressed Video |
Type |
Conference Article |
Year |
2022 |
Publication |
47th International Conference on Acoustics, Speech, and Signal Processing |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
In this paper, we propose a deformable convolution-based generative adversarial network (DCNGAN) for perceptual quality enhancement of compressed videos. DCNGAN is also adaptive to the quantization parameters (QPs). Compared with optical flows, deformable convolutions are more effective and efficient at aligning frames. Deformable convolutions can operate on multiple frames, thus leveraging more temporal information, which is beneficial for enhancing the perceptual quality of compressed videos. Instead of aligning frames in a pairwise manner, a deformable convolution can process multiple frames simultaneously, which leads to lower computational complexity. Experimental results demonstrate that the proposed DCNGAN outperforms other state-of-the-art compressed video quality enhancement algorithms. |
Address |
Virtual; May 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICASSP |
Notes |
MACO; 600.161; 601.379 |
Approved |
no |
Call Number |
Admin @ si @ ZHM2022a |
Serial |
3765 |
Permanent link to this record |
|
|
|
Author |
Guillem Martinez; Maya Aghaei; Martin Dijkstra; Bhalaji Nagarajan; Femke Jaarsma; Jaap van de Loosdrecht; Petia Radeva; Klaas Dijkstra |
Title |
Hyper-Spectral Imaging for Overlapping Plastic Flakes Segmentation |
Type |
Conference Article |
Year |
2022 |
Publication |
47th International Conference on Acoustics, Speech, and Signal Processing |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
Hyper-spectral imaging; plastic sorting; multi-label segmentation; bitfield encoding |
Abstract |
|
Address |
Singapore; May 2022 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICASSP |
Notes |
MILAB; no proj |
Approved |
no |
Call Number |
Admin @ si @ MAD2022 |
Serial |
3767 |
Permanent link to this record |
|
|
|
Author |
Sergio Escalera; Alicia Fornes; Oriol Pujol; Josep Llados; Petia Radeva |
Title |
Circular Blurred Shape Model for Multiclass Symbol Recognition |
Type |
Journal Article |
Year |
2011 |
Publication |
IEEE Transactions on Systems, Man, and Cybernetics, Part B |
Abbreviated Journal |
TSMCB |
Volume |
41 |
Issue |
2 |
Pages |
497-506 |
Keywords |
|
Abstract |
In this paper, we propose a circular blurred shape model descriptor to deal with the problem of symbol detection and classification as a particular case of object recognition. The feature extraction is performed by capturing the spatial arrangement of significant object characteristics in a correlogram structure. The shape information from objects is shared among correlogram regions, where a prior blurring degree defines the level of distortion allowed in the symbol, making the descriptor tolerant to irregular deformations. Moreover, the descriptor is rotation invariant by definition. We validate the effectiveness of the proposed descriptor in both the multiclass symbol recognition and symbol detection domains. In order to perform the symbol detection, the descriptors are learned using a cascade of classifiers. In the case of multiclass categorization, the new feature space is learned using a set of binary classifiers which are embedded in an error-correcting output code design. The results over four symbol data sets show the significant improvements of the proposed descriptor compared to the state-of-the-art descriptors. In particular, the results are even more significant in those cases where the symbols suffer from elastic deformations. |
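The correlogram with a blurring degree described above can be illustrated with a 1-D angular toy version: each shape point votes into its angular bin and, with reduced weight, into the adjacent bins, and the histogram is cyclically shifted to a canonical start for rotation invariance. This is a deliberate simplification under stated assumptions (the real descriptor also bins radially, and `cbsm`, the bin count, and the blur weight are hypothetical):

```python
import math

def cbsm(points, n_bins=8, blur=0.5):
    """Toy circular blurred shape model: an angular histogram around the
    shape centroid where each point also votes, with weight `blur`, into
    its two neighbouring bins, tolerating small angular distortions."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    hist = [0.0] * n_bins
    for x, y in points:
        theta = math.atan2(y - cy, x - cx) % (2 * math.pi)
        b = int(theta / (2 * math.pi) * n_bins) % n_bins
        hist[b] += 1.0
        hist[(b - 1) % n_bins] += blur  # share mass with the
        hist[(b + 1) % n_bins] += blur  # adjacent angular regions
    # Rotation invariance: cyclically shift so the heaviest bin comes first.
    k = hist.index(max(hist))
    hist = hist[k:] + hist[:k]
    total = sum(hist)
    return [h / total for h in hist]

# The same point set rotated by 90 degrees yields an identical descriptor.
shape = [(2, 1), (-1, 2), (-1, -1)]
rotated = [(-y, x) for x, y in shape]
d1, d2 = cbsm(shape), cbsm(rotated)
```

The blur weight plays the role of the prior blurring degree in the abstract: it sets how much irregular deformation still maps to the same descriptor.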
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
1083-4419 |
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MILAB; DAG;HuPBA |
Approved |
no |
Call Number |
Admin @ si @ EFP2011 |
Serial |
1784 |
Permanent link to this record |