Author Alicia Fornes; Sergio Escalera; Josep Llados; Ernest Valveny
Title Symbol Classification using Dynamic Aligned Shape Descriptor Type Conference Article
Year 2010 Publication 20th International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 1957–1960
Keywords
Abstract Shape representation is a difficult task because of several symbol distortions, such as occlusions, elastic deformations, gaps or noise. In this paper, we propose a new descriptor and distance computation for coping with the problem of symbol recognition in the domain of Graphical Document Image Analysis. The proposed D-Shape descriptor encodes the arrangement information of object parts in a circular structure, allowing different levels of distortion. The classification is performed using a cyclic Dynamic Time Warping based method, allowing distortions and rotation. The methodology has been validated on different data sets, showing very high recognition rates.
Address Istanbul (Turkey)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1051-4651 ISBN 978-1-4244-7542-1 Medium
Area Expedition Conference ICPR
Notes DAG; HUPBA; MILAB Approved no
Call Number BCNPCL @ bcnpcl @ FEL2010 Serial 1421
 

 
Author Mohamed Ramzy Ibrahim; Robert Benavente; Daniel Ponsa; Felipe Lumbreras
Title SWViT-RRDB: Shifted Window Vision Transformer Integrating Residual in Residual Dense Block for Remote Sensing Super-Resolution Type Conference Article
Year 2024 Publication 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Remote sensing applications, impacted by acquisition season and sensor variety, require high-resolution images. Transformer-based models improve satellite image super-resolution but are less effective than convolutional neural networks (CNNs) at extracting local details, crucial for image clarity. This paper introduces SWViT-RRDB, a new deep learning model for satellite imagery super-resolution. The SWViT-RRDB, combining transformer with convolution and attention blocks, overcomes the limitations of existing models by better representing small objects in satellite images. In this model, a pipeline of residual fusion group (RFG) blocks is used to combine the multi-headed self-attention (MSA) with residual in residual dense block (RRDB). This combines global and local image data for better super-resolution. Additionally, an overlapping cross-attention block (OCAB) is used to enhance fusion and allow interaction between neighboring pixels to maintain long-range pixel dependencies across the image. The SWViT-RRDB model and its larger variants outperform state-of-the-art (SoTA) models on two different satellite datasets in terms of PSNR and SSIM.
Address Rome; Italy; February 2024
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MSIAU Approved no
Call Number Admin @ si @ RBP2024 Serial 4004
 

 
Author Olivier Penacchio; Laura Dempere-Marco; Xavier Otazu
Title Switching off brightness induction through induction-reversed images Type Abstract
Year 2012 Publication Perception Abbreviated Journal PER
Volume 41 Issue Pages 208
Keywords
Abstract Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas. Although V1 is traditionally regarded as an area mostly responsive to retinal information, neurophysiological evidence suggests that it may explicitly represent brightness information. In this work, we investigate possible neural mechanisms underlying brightness induction. To this end, we consider the model by Z Li (1999 Computation and Neural Systems 10 187-212) which is constrained by neurophysiological data and focuses on the part of V1 responsible for contextual influences. This model, which has proven to account for phenomena such as contour detection and preattentive segmentation, shares with brightness induction the relevant effect of contextual influences. Importantly, the input to our network model derives from a complete multiscale and multiorientation wavelet decomposition, which makes it possible to recover an image reflecting the perceived luminance and successfully accounts for well known psychophysical effects for both static and dynamic contexts. By further considering inverse problem techniques we define induction-reversed images: given a target image, we build an image whose perceived luminance matches the actual luminance of the original stimulus, thus effectively canceling out brightness induction effects. We suggest that induction-reversed images may help remove undesired perceptual effects and can find potential applications in fields such as radiological image interpretation.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes CIC Approved no
Call Number Admin @ si @ PDO2012a Serial 2180
 

 
Author Ayan Banerjee; Sanket Biswas; Josep Llados; Umapada Pal
Title SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation Type Conference Article
Year 2023 Publication 17th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 14187 Issue Pages 307–325
Keywords
Abstract Instance-level segmentation of documents consists in assigning a class-aware and instance-aware label to each pixel of the image. It is a key step in document parsing for their understanding. In this paper, we present a unified transformer encoder-decoder architecture for end-to-end instance segmentation of complex layouts in document images. The method adapts a contrastive training with a mixed query selection for anchor initialization in the decoder. Later on, it performs a dot product between the obtained query embeddings and the pixel embedding map (coming from the encoder) for semantic reasoning. Extensive experimentation on competitive benchmarks like PubLayNet, PRIMA, Historical Japanese (HJ), and TableBank demonstrates that our model with SwinL backbone achieves better segmentation performance than the existing state-of-the-art approaches with the average precision of 93.72, 54.39, 84.65 and 98.04 respectively under one billion parameters. The code is made publicly available at: github.com/ayanban011/SwinDocSegmenter .
Address San Jose; CA; USA; August 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number Admin @ si @ BBL2023 Serial 3893
 

 
Author S. Chanda; Oriol Ramos Terrades; Umapada Pal
Title SVM Based Scheme for Thai and English Script Identification Type Conference Article
Year 2007 Publication 9th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume 1 Issue Pages 551–555
Keywords
Abstract
Address Curitiba (Brazil)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG Approved no
Call Number DAG @ dag @ CRP2007a Serial 885
 

 
Author Ciprian Corneanu; Marc Oliu; Jeffrey F. Cohn; Sergio Escalera
Title Survey on RGB, 3D, Thermal, and Multimodal Approaches for Facial Expression Recognition: History Type Journal Article
Year 2016 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume 38 Issue 8 Pages 1548-1568
Keywords Facial expression; affect; emotion recognition; RGB; 3D; thermal; multimodal
Abstract Facial expressions are an important way through which humans interact socially. Building a system capable of automatically recognizing facial expressions from images and video has been an intense field of study in recent years. Interpreting such expressions remains challenging and much research is needed about the way they relate to human affect. This paper presents a general overview of automatic RGB, 3D, thermal and multimodal facial expression analysis. We define a new taxonomy for the field, encompassing all steps from face detection to facial expression recognition, and describe and classify the state of the art methods accordingly. We also present the important datasets and the bench-marking of most influential methods. We conclude with a general discussion about trends, important questions and future lines of research.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HuPBA;MILAB; Approved no
Call Number Admin @ si @ COC2016 Serial 2718
 

 
Author David Geronimo; Antonio Lopez; Angel Sappa; Thorsten Graf
Title Survey on Pedestrian Detection for Advanced Driver Assistance Systems Type Journal Article
Year 2010 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume 32 Issue 7 Pages 1239–1258
Keywords ADAS, pedestrian detection, on-board vision, survey
Abstract Advanced driver assistance systems (ADASs), and particularly pedestrian protection systems (PPSs), have become an active research area aimed at improving traffic safety. The major challenge of PPSs is the development of reliable on-board pedestrian detection systems. Due to the varying appearance of pedestrians (e.g., different clothes, changing size, aspect ratio, and dynamic shape) and the unstructured environment, it is very difficult to cope with the demanded robustness of this kind of system. Two problems arising in this research area are the lack of public benchmarks and the difficulty in reproducing many of the proposed methods, which makes it difficult to compare the approaches. As a result, surveying the literature by enumerating the proposals one-after-another is not the most useful way to provide a comparative point of view. Accordingly, we present a more convenient strategy to survey the different approaches. We divide the problem of detecting pedestrians from images into different processing steps, each with attached responsibilities. Then, the different proposed methods are analyzed and classified with respect to each processing stage, favoring a comparative viewpoint. Finally, discussion of the important topics is presented, putting special emphasis on the future needs and challenges.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0162-8828 ISBN Medium
Area Expedition Conference
Notes ADAS Approved no
Call Number ADAS @ adas @ GLS2010 Serial 1340
 

 
Author Fatemeh Noroozi; Ciprian Corneanu; Dorota Kamińska; Tomasz Sapiński; Sergio Escalera; Gholamreza Anbarjafari
Title Survey on Emotional Body Gesture Recognition Type Journal Article
Year 2021 Publication IEEE Transactions on Affective Computing Abbreviated Journal TAC
Volume 12 Issue 2 Pages 505 - 523
Keywords
Abstract Automatic emotion recognition has become a trending research topic in the past decade. While works based on facial expressions or speech abound, recognizing affect from body gestures remains a less explored topic. We present a new comprehensive survey hoping to boost research in the field. We first introduce emotional body gestures as a component of what is commonly known as “body language” and comment on general aspects such as gender differences and culture dependence. We then define a complete framework for automatic emotional body gesture recognition. We introduce person detection and comment on static and dynamic body pose estimation methods both in RGB and 3D. We then comment on the recent literature related to representation learning and emotion recognition from images of emotionally expressive gestures. We also discuss multi-modal approaches that combine speech or face with body gestures for improved emotion recognition. While pre-processing methodologies (e.g. human detection and pose estimation) are nowadays mature technologies fully developed for robust large scale analysis, we show that for emotion recognition the quantity of labelled data is scarce, there is no agreement on clearly defined output spaces and the representations are shallow and largely based on naive geometrical representations.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ NCK2021 Serial 3657
 

 
Author Angel Sappa; Niki Aifanti; Sotiris Malassiotis; N. Grammalidis
Title Survey of 3D Human Body Representations Type Book Chapter
Year 2005 Publication Encyclopedia of Information Science and Technology, 1(5):2696–2701 Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes Approved no
Call Number ADAS @ adas @ SAM2005a Serial 497
 

 
Author Hao Fang; Ajian Liu; Jun Wan; Sergio Escalera; Hugo Jair Escalante; Zhen Lei
Title Surveillance Face Presentation Attack Detection Challenge Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops Abbreviated Journal
Volume Issue Pages 6360-6370
Keywords
Abstract Face Anti-spoofing (FAS) is essential to secure face recognition systems from various physical attacks. However, most of the studies lacked consideration of long-distance scenarios. Specifically, compared with FAS in traditional scenes such as phone unlocking, face payment, and self-service security inspection, FAS in long-distance such as station squares, parks, and self-service supermarkets are equally important, but it has not been sufficiently explored yet. In order to fill this gap in the FAS community, we collect a large-scale Surveillance High-Fidelity Mask (SuHiFiMask). SuHiFiMask contains 10,195 videos from 101 subjects of different age groups, which are collected by 7 mainstream surveillance cameras. Based on this dataset and protocol-3 for evaluating the robustness of the algorithm under quality changes, we organized a face presentation attack detection challenge in surveillance scenarios. It attracted 180 teams for the development phase with a total of 37 teams qualifying for the final round. The organization team re-verified and re-ran the submitted code and used the results as the final ranking. In this paper, we present an overview of the challenge, including an introduction to the dataset used, the definition of the protocol, the evaluation metrics, and the announcement of the competition results. Finally, we present the top-ranked algorithms and the research ideas provided by the competition for attack detection in long-range surveillance scenarios.
Address Vancouver; Canada; June 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPRW
Notes HuPBA Approved no
Call Number Admin @ si @ FLW2023 Serial 3917
 

 
Author Hao Fang; Ajian Liu; Jun Wan; Sergio Escalera; Chenxu Zhao; Xu Zhang; Stan Z Li; Zhen Lei
Title Surveillance Face Anti-spoofing Type Journal Article
Year 2024 Publication IEEE Transactions on Information Forensics and Security Abbreviated Journal TIFS
Volume 19 Issue Pages 1535-1546
Keywords
Abstract Face Anti-spoofing (FAS) is essential to secure face recognition systems from various physical attacks. However, recent research generally focuses on short-distance applications (i.e., phone unlocking) while lacking consideration of long-distance scenes (i.e., surveillance security checks). In order to promote relevant research and fill this gap in the community, we collect a large-scale Surveillance High-Fidelity Mask (SuHiFiMask) dataset captured under 40 surveillance scenes, which has 101 subjects from different age groups with 232 3D attacks (high-fidelity masks), 200 2D attacks (posters, portraits, and screens), and 2 adversarial attacks. In this scene, low image resolution and noise interference are new challenges faced in surveillance FAS. Together with the SuHiFiMask dataset, we propose a Contrastive Quality-Invariance Learning (CQIL) network to alleviate the performance degradation caused by image quality from three aspects: (1) An Image Quality Variable module (IQV) is introduced to recover image information associated with discrimination by combining the super-resolution network. (2) Using generated sample pairs to simulate quality variance distributions to help contrastive learning strategies obtain robust feature representation under quality variation. (3) A Separate Quality Network (SQN) is designed to learn discriminative features independent of image quality. Finally, a large number of experiments verify the quality of the SuHiFiMask dataset and the superiority of the proposed CQIL.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA Approved no
Call Number Admin @ si @ FLW2024 Serial 3869
 

 
Author Angel Sappa
Title Surface Model Generation from Range Images of Industrial Environments Type Miscellaneous
Year 2004 Publication IEEE Int. Symp. on 3D Data Processing, Visualization and Transmission Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Thessaloniki (Greece)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes Approved no
Call Number ADAS @ adas @ Sap2004b Serial 455
 

 
Author Misael Rosales; Petia Radeva; Oriol Rodriguez; Debora Gil
Title Suppression of IVUS Image Rotation. A Kinematic Approach Type Book Chapter
Year 2005 Publication Functional Imaging and Modeling of the Heart Abbreviated Journal LNCS
Volume 3504 Issue Pages 889-892
Keywords
Abstract IntraVascular Ultrasound (IVUS) is an exploratory technique used in interventional procedures that shows cross section images of arteries and provides qualitative information about the causes and severity of the arterial lumen narrowing. Cross section analysis as well as visualization of plaque extension in a vessel segment during the catheter imaging pullback are the technique's main advantages. However, IVUS sequences exhibit a periodic rotation artifact that makes longitudinal lesion inspection difficult and hinders any segmentation algorithm. In this paper we propose a new kinematic method to estimate and remove the image rotation of IVUS image sequences. Results on several IVUS sequences are good and point to clinical applications in the study of vessel dynamics and its relation to vessel pathology.
Address
Corporate Author Thesis
Publisher Springer Berlin / Heidelberg Place of Publication Editor Frangi, Alejandro and Radeva, Petia and Santos, Andres and Hernandez, Monica
Language Summary Language Original Title
Series Editor Series Title Lecture Notes in Computer Science Abbreviated Series Title LNCS
Series Volume 3504 Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM;MILAB Approved no
Call Number IAM @ iam @ RRR2005 Serial 1645
 

 
Author Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera
Title Support Vector Machines with Time Series Distance Kernels for Action Classification Type Conference Article
Year 2016 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal
Volume Issue Pages 1-7
Keywords
Abstract Despite the outperformance of Support Vector Machine (SVM) on many practical classification problems, the algorithm is not directly applicable to multi-dimensional trajectories having different lengths. In this paper, a new class of SVM that is applicable to trajectory classification, such as action recognition, is developed by incorporating two efficient time-series distance measures into the kernel function. Dynamic Time Warping and Longest Common Subsequence distance measures along with their derivatives are employed as the SVM kernel. In addition, the pairwise proximity learning strategy is utilized in order to make use of non-positive semi-definite kernels in the SVM formulation. The proposed method is employed for a challenging classification problem: action recognition by depth cameras using only skeleton data; and evaluated on three benchmark action datasets. Experimental results demonstrate the outperformance of our methodology compared to the state-of-the-art on the considered datasets.
Address Lake Placid; NY (USA); March 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference WACV
Notes HuPBA;MILAB; Approved no
Call Number Admin @ si @ BGE2016a Serial 2773
 

 
Author A. Pujol; Jose Luis Alba; Juan J. Villanueva
Title Supervised Hausdorff-based measures for face recognition. Type Miscellaneous
Year 2001 Publication Proceedings of the IX Spanish Symposium on Pattern Recognition and Image Analysis, 1:255–261. Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes Approved no
Call Number ISE @ ise @ PAV2001 Serial 148