|
Records |
Links |
|
Author |
Victor Ponce; Mario Gorga; Xavier Baro; Sergio Escalera |
|
|
Title |
Human Behavior Analysis from Video Data Using Bag-of-Gestures |
Type |
Conference Article |
|
Year |
2011 |
Publication |
22nd International Joint Conference on Artificial Intelligence |
Abbreviated Journal |
|
|
|
Volume |
3 |
Issue |
|
Pages |
2836-2837 |
|
|
Keywords |
|
|
|
Abstract |
Human Behavior Analysis in Uncontrolled Environments can be categorized in two main challenges: 1) Feature extraction and 2) Behavior analysis from a set of corporal language vocabulary. In this work, we present our achievements characterizing some simple behaviors from visual data on different real applications and discuss our plan for future work: low level vocabulary definition from bag-of-gesture units and high level modelling and inference of human behaviors. |
|
|
Address |
Barcelona |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-57735-516-8 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IJCAI |
|
|
Notes |
HuPBA;MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ PGB2011b |
Serial |
1770 |
|
Permanent link to this record |
|
|
|
|
Author |
Dorota Kaminska; Kadir Aktas; Davit Rizhinashvili; Danila Kuklyanov; Abdallah Hussein Sham; Sergio Escalera; Kamal Nasrollahi; Thomas B. Moeslund; Gholamreza Anbarjafari |
|
|
Title |
Two-stage Recognition and Beyond for Compound Facial Emotion Recognition |
Type |
Journal Article |
|
Year |
2021 |
Publication |
Electronics |
Abbreviated Journal |
ELEC |
|
|
Volume |
10 |
Issue |
22 |
Pages |
2847 |
|
|
Keywords |
compound emotion recognition; facial expression recognition; dominant and complementary emotion recognition; deep learning |
|
|
Abstract |
Facial emotion recognition is an inherently complex problem due to individual diversity in facial features and racial and cultural differences. Moreover, facial expressions typically reflect the mixture of people’s emotional statuses, which can be expressed using compound emotions. Compound facial emotion recognition makes the problem even more difficult because the discrimination between dominant and complementary emotions is usually weak. We have created a database that includes 31,250 facial images with different emotions of 115 subjects whose gender distribution is almost uniform to address compound emotion recognition. In addition, we have organized a competition based on the proposed dataset, held at FG workshop 2020. This paper analyzes the winner’s approach—a two-stage recognition method (1st stage, coarse recognition; 2nd stage, fine recognition), which enhances the classification of symmetrical emotion labels. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HUPBA; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ KAR2021 |
Serial |
3642 |
|
Permanent link to this record |
|
|
|
|
Author |
Yaxing Wang; Luis Herranz; Joost Van de Weijer |
|
|
Title |
Mix and match networks: multi-domain alignment for unpaired image-to-image translation |
Type |
Journal Article |
|
Year |
2020 |
Publication |
International Journal of Computer Vision |
Abbreviated Journal |
IJCV |
|
|
Volume |
128 |
Issue |
|
Pages |
2849–2872 |
|
|
Keywords |
|
|
|
Abstract |
This paper addresses the problem of inferring unseen cross-modal image-to-image translations between multiple modalities. We assume that only some of the pairwise translations have been seen (i.e. trained) and infer the remaining unseen translations (where training pairs are not available). We propose mix and match networks, an approach where multiple encoders and decoders are aligned in such a way that the desired translation can be obtained by simply cascading the source encoder and the target decoder, even when they have not interacted during the training stage (i.e. unseen). The main challenge lies in the alignment of the latent representations at the bottlenecks of encoder-decoder pairs. We propose an architecture with several tools to encourage alignment, including autoencoders and robust side information and latent consistency losses. We show the benefits of our approach in terms of effectiveness and scalability compared with other pairwise image-to-image translation approaches. We also propose zero-pair cross-modal image translation, a challenging setting where the objective is inferring semantic segmentation from depth (and vice-versa) without explicit segmentation-depth pairs, and only from two (disjoint) segmentation-RGB and depth-RGB training sets. We observe that a certain part of the shared information between unseen modalities might not be reachable, so we further propose a variant that leverages pseudo-pairs which allows us to exploit this shared information between the unseen modalities |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
LAMP; 600.109; 600.106; 600.141; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ WHW2020 |
Serial |
3424 |
|
Permanent link to this record |
|
|
|
|
Author |
Ivo Everts; Jan van Gemert; Theo Gevers |
|
|
Title |
Evaluation of Color STIPs for Human Action Recognition |
Type |
Conference Article |
|
Year |
2013 |
Publication |
IEEE Conference on Computer Vision and Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2850-2857 |
|
|
Keywords |
|
|
|
Abstract |
This paper is concerned with recognizing realistic human actions in videos based on spatio-temporal interest points (STIPs). Existing STIP-based action recognition approaches operate on intensity representations of the image data. Because of this, these approaches are sensitive to disturbing photometric phenomena such as highlights and shadows. Moreover, valuable information is neglected by discarding chromaticity from the photometric representation. These issues are addressed by Color STIPs. Color STIPs are multi-channel reformulations of existing intensity-based STIP detectors and descriptors, for which we consider a number of chromatic representations derived from the opponent color space. This enhanced modeling of appearance improves the quality of subsequent STIP detection and description. Color STIPs are shown to substantially outperform their intensity-based counterparts on the challenging UCF~sports, UCF11 and UCF50 action recognition benchmarks. Moreover, the results show that color STIPs are currently the single best low-level feature choice for STIP-based approaches to human action recognition. |
|
|
Address |
Portland; oregon; June 2013 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1063-6919 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPR |
|
|
Notes |
ALTRES;ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ EGG2013 |
Serial |
2364 |
|
Permanent link to this record |
|
|
|
|
Author |
Ernest Valveny; Enric Marti |
|
|
Title |
A model for image generation and symbol recognition through the deformation of lineal shapes |
Type |
Journal Article |
|
Year |
2003 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
24 |
Issue |
15 |
Pages |
2857-2867 |
|
|
Keywords |
|
|
|
Abstract |
We describe a general framework for the recognition of distorted images of lineal shapes, which relies on three items: a model to represent lineal shapes and their deformations, a model for the generation of distorted binary images and the combination of both models in a common probabilistic framework, where the generation of deformations is related to an internal energy, and the generation of binary images to an external energy. Then, recognition consists in the minimization of a global energy function, performed by using the EM algorithm. This general framework has been applied to the recognition of hand-drawn lineal symbols in graphic documents. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier Science Inc. |
Place of Publication |
New York, NY, USA |
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0167-8655 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; IAM |
Approved |
no |
|
|
Call Number |
IAM @ iam @ VAM2003 |
Serial |
1653 |
|
Permanent link to this record |
|
|
|
|
Author |
Rahat Khan; Joost Van de Weijer; Fahad Shahbaz Khan; Damien Muselet; christophe Ducottet; Cecile Barat |
|
|
Title |
Discriminative Color Descriptors |
Type |
Conference Article |
|
Year |
2013 |
Publication |
IEEE Conference on Computer Vision and Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2866 - 2873 |
|
|
Keywords |
|
|
|
Abstract |
Color description is a challenging task because of large variations in RGB values which occur due to scene accidental events, such as shadows, shading, specularities, illuminant color changes, and changes in viewing geometry. Traditionally, this challenge has been addressed by capturing the variations in physics-based models, and deriving invariants for the undesired variations. The drawback of this approach is that sets of distinguishable colors in the original color space are mapped to the same value in the photometric invariant space. This results in a drop of discriminative power of the color description. In this paper we take an information theoretic approach to color description. We cluster color values together based on their discriminative power in a classification problem. The clustering has the explicit objective to minimize the drop of mutual information of the final representation. We show that such a color description automatically learns a certain degree of photometric invariance. We also show that a universal color representation, which is based on other data sets than the one at hand, can obtain competing performance. Experiments show that the proposed descriptor outperforms existing photometric invariants. Furthermore, we show that combined with shape description these color descriptors obtain excellent results on four challenging datasets, namely, PASCAL VOC 2007, Flowers-102, Stanford dogs-120 and Birds-200. |
|
|
Address |
Portland; Oregon; June 2013 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1063-6919 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPR |
|
|
Notes |
CIC; 600.048 |
Approved |
no |
|
|
Call Number |
Admin @ si @ KWK2013a |
Serial |
2262 |
|
Permanent link to this record |
|
|
|
|
Author |
Hongxing Gao; Marçal Rusiñol; Dimosthenis Karatzas; Josep Llados |
|
|
Title |
Embedding Document Structure to Bag-of-Words through Pair-wise Stable Key-regions |
Type |
Conference Article |
|
Year |
2014 |
Publication |
22nd International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2903 - 2908 |
|
|
Keywords |
|
|
|
Abstract |
Since the document structure carries valuable discriminative information, plenty of efforts have been made for extracting and understanding document structure among which layout analysis approaches are the most commonly used. In this paper, Distance Transform based MSER (DTMSER) is employed to efficiently extract the document structure as a dendrogram of key-regions which roughly correspond to structural elements such as characters, words and paragraphs. Inspired by the Bag
of Words (BoW) framework, we propose an efficient method for structural document matching by representing the document image as a histogram of key-region pairs encoding structural relationships.
Applied to the scenario of document image retrieval, experimental results demonstrate a remarkable improvement when comparing the proposed method with typical BoW and pyramidal BoW methods. |
|
|
Address |
Stockholm; Sweden; August 2014 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
DAG; 600.056; 600.061; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GRK2014b |
Serial |
2497 |
|
Permanent link to this record |
|
|
|
|
Author |
Ignasi Rius; Jordi Gonzalez; Javier Varona; Xavier Roca |
|
|
Title |
Action-specific motion prior for efficient bayesian 3D human body tracking |
Type |
Journal Article |
|
Year |
2009 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
42 |
Issue |
11 |
Pages |
2907–2921 |
|
|
Keywords |
|
|
|
Abstract |
In this paper, we aim to reconstruct the 3D motion parameters of a human body
model from the known 2D positions of a reduced set of joints in the image plane.
Towards this end, an action-specific motion model is trained from a database of real
motion-captured performances. The learnt motion model is used within a particle
filtering framework as a priori knowledge on human motion. First, our dynamic
model guides the particles according to similar situations previously learnt. Then, the solution space is constrained so only feasible human postures are accepted as valid solutions at each time step. As a result, we are able to track the 3D configuration of the full human body from several cycles of walking motion sequences using only the 2D positions of a very reduced set of joints from lateral or frontal viewpoints. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0031-3203 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
ISE @ ise @ RGV2009 |
Serial |
1159 |
|
Permanent link to this record |
|
|
|
|
Author |
Lu Yu; Vacit Oguz Yazici; Xialei Liu; Joost Van de Weijer; Yongmei Cheng; Arnau Ramisa |
|
|
Title |
Learning Metrics from Teachers: Compact Networks for Image Embedding |
Type |
Conference Article |
|
Year |
2019 |
Publication |
32nd IEEE Conference on Computer Vision and Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2907-2916 |
|
|
Keywords |
|
|
|
Abstract |
Metric learning networks are used to compute image embeddings, which are widely used in many applications such as image retrieval and face recognition. In this paper, we propose to use network distillation to efficiently compute image embeddings with small networks. Network distillation has been successfully applied to improve image classification, but has hardly been explored for metric learning. To do so, we propose two new loss functions that model the
communication of a deep teacher network to a small student network. We evaluate our system in several datasets, including CUB-200-2011, Cars-196, Stanford Online Products and show that embeddings computed using small student networks perform significantly better than those computed using standard networks of similar size. Results on a very compact network (MobileNet-0.25), which can be
used on mobile devices, show that the proposed method can greatly improve Recall@1 results from 27.5% to 44.6%. Furthermore, we investigate various aspects of distillation for embeddings, including hint and attention layers, semisupervised learning and cross quality distillation. |
|
|
Address |
Long beach; California; june 2019 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPR |
|
|
Notes |
LAMP; 600.109; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ YYL2019 |
Serial |
3281 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; Dimosthenis Karatzas; Joan Mas; Gemma Sanchez |
|
|
Title |
A Generic Architecture for the Conversion of Document Collections into Semantically Annotated Digital Archives |
Type |
Journal |
|
Year |
2008 |
Publication |
Journal of Universal Computer Science |
Abbreviated Journal |
|
|
|
Volume |
14 |
Issue |
18 |
Pages |
2912–2935 |
|
|
Keywords |
Median Graph, Graph Embedding, Graph Matching, Structural Pattern Recognition |
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ LKM2008 |
Serial |
1142 |
|
Permanent link to this record |
|
|
|
|
Author |
Yunchao Gong; Svetlana Lazebnik; Albert Gordo; Florent Perronnin |
|
|
Title |
Iterative quantization: A procrustean approach to learning binary codes for Large-Scale Image Retrieval |
Type |
Journal Article |
|
Year |
2012 |
Publication |
IEEE Transactions on Pattern Analysis and Machine Intelligence |
Abbreviated Journal |
TPAMI |
|
|
Volume |
35 |
Issue |
12 |
Pages |
2916-2929 |
|
|
Keywords |
|
|
|
Abstract |
This paper addresses the problem of learning similarity-preserving binary codes for efficient similarity search in large-scale image collections. We formulate this problem in terms of finding a rotation of zero-centered data so as to minimize the quantization error of mapping this data to the vertices of a zero-centered binary hypercube, and propose a simple and efficient alternating minimization algorithm to accomplish this task. This algorithm, dubbed iterative quantization (ITQ), has connections to multi-class spectral clustering and to the orthogonal Procrustes problem, and it can be used both with unsupervised data embeddings such as PCA and supervised embeddings such as canonical correlation analysis (CCA). The resulting binary codes significantly outperform several other state-of-the-art methods. We also show that further performance improvements can result from transforming the data with a nonlinear kernel mapping prior to PCA or CCA. Finally, we demonstrate an application of ITQ to learning binary attributes or “classemes” on the ImageNet dataset. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0162-8828 |
ISBN |
978-1-4577-0394-2 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ GLG 2012b |
Serial |
2008 |
|
Permanent link to this record |
|
|
|
|
Author |
Mohammed Al Rawi; Ernest Valveny |
|
|
Title |
Compact and Efficient Multitask Learning in Vision, Language and Speech |
Type |
Conference Article |
|
Year |
2019 |
Publication |
IEEE International Conference on Computer Vision Workshops |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2933-2942 |
|
|
Keywords |
|
|
|
Abstract |
Across-domain multitask learning is a challenging area of computer vision and machine learning due to the intra-similarities among class distributions. Addressing this problem to cope with the human cognition system by considering inter and intra-class categorization and recognition complicates the problem even further. We propose in this work an effective holistic and hierarchical learning by using a text embedding layer on top of a deep learning model. We also propose a novel sensory discriminator approach to resolve the collisions between different tasks and domains. We then train the model concurrently on textual sentiment analysis, speech recognition, image classification, action recognition from video, and handwriting word spotting of two different scripts (Arabic and English). The model we propose successfully learned different tasks across multiple domains. |
|
|
Address |
Seul; Korea; October 2019 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICCVW |
|
|
Notes |
DAG; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RaV2019 |
Serial |
3365 |
|
Permanent link to this record |
|
|
|
|
Author |
Mikhail Mozerov; Joost Van de Weijer |
|
|
Title |
One-view occlusion detection for stereo matching with a fully connected CRF model |
Type |
Journal Article |
|
Year |
2019 |
Publication |
IEEE Transactions on Image Processing |
Abbreviated Journal |
TIP |
|
|
Volume |
28 |
Issue |
6 |
Pages |
2936-2947 |
|
|
Keywords |
Stereo matching; energy minimization; fully connected MRF model; geodesic distance filter |
|
|
Abstract |
In this paper, we extend the standard belief propagation (BP) sequential technique proposed in the tree-reweighted sequential method [15] to the fully connected CRF models with the geodesic distance affinity. The proposed method has been applied to the stereo matching problem. Also a new approach to the BP marginal solution is proposed that we call one-view occlusion detection (OVOD). In contrast to the standard winner takes all (WTA) estimation, the proposed OVOD solution allows to find occluded regions in the disparity map and simultaneously improve the matching result. As a result we can perform only
one energy minimization process and avoid the cost calculation for the second view and the left-right check procedure. We show that the OVOD approach considerably improves results for cost augmentation and energy minimization techniques in comparison with the standard one-view affinity space implementation. We apply our method to the Middlebury data set and reach state-ofthe-art especially for median, average and mean squared error metrics. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
LAMP; 600.098; 600.109; 602.133; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ MoW2019 |
Serial |
3221 |
|
Permanent link to this record |
|
|
|
|
Author |
Ariel Amato; Mikhail Mozerov; Andrew Bagdanov; Jordi Gonzalez |
|
|
Title |
Accurate Moving Cast Shadow Suppression Based on Local Color Constancy detection |
Type |
Journal Article |
|
Year |
2011 |
Publication |
IEEE Transactions on Image Processing |
Abbreviated Journal |
TIP |
|
|
Volume |
20 |
Issue |
10 |
Pages |
2954 - 2966 |
|
|
Keywords |
|
|
|
Abstract |
This paper describes a novel framework for detection and suppression of properly shadowed regions for most possible scenarios occurring in real video sequences. Our approach requires no prior knowledge about the scene, nor is it restricted to specific scene structures. Furthermore, the technique can detect both achromatic and chromatic shadows even in the presence of camouflage that occurs when foreground regions are very similar in color to shadowed regions. The method exploits local color constancy properties due to reflectance suppression over shadowed regions. To detect shadowed regions in a scene, the values of the background image are divided by values of the current frame in the RGB color space. We show how this luminance ratio can be used to identify segments with low gradient constancy, which in turn distinguish shadows from foreground. Experimental results on a collection of publicly available datasets illustrate the superior performance of our method compared with the most sophisticated, state-of-the-art shadow detection algorithms. These results show that our approach is robust and accurate over a broad range of shadow types and challenging video conditions. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1057-7149 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ AMB2011 |
Serial |
1716 |
|
Permanent link to this record |
|
|
|
|
Author |
Lu Yu; Yongmei Cheng; Joost Van de Weijer |
|
|
Title |
Weakly Supervised Domain-Specific Color Naming Based on Attention |
Type |
Conference Article |
|
Year |
2018 |
Publication |
24th International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
3019 - 3024 |
|
|
Keywords |
|
|
|
Abstract |
The majority of existing color naming methods focuses on the eleven basic color terms of the English language. However, in many applications, different sets of color names are used for the accurate description of objects. Labeling data to learn these domain-specific color names is an expensive and laborious task. Therefore, in this article we aim to learn color names from weakly labeled data. For this purpose, we add an attention branch to the color naming network. The attention branch is used to modulate the pixel-wise color naming predictions of the network. In experiments, we illustrate that the attention branch correctly identifies the relevant regions. Furthermore, we show that our method obtains state-of-the-art results for pixel-wise and image-wise classification on the EBAY dataset and is able to learn color names for various domains. |
|
|
Address |
Beijing; August 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
LAMP; 600.109; 602.200; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ YCW2018 |
Serial |
3243 |
|
Permanent link to this record |