|
Records |
Links |
|
Author |
Ivet Rafegas; Javier Vazquez; Robert Benavente; Maria Vanrell; Susana Alvarez |
|
|
Title |
Enhancing spatio-chromatic representation with more-than-three color coding for image description |
Type |
Journal Article |
|
Year |
2017 |
Publication |
Journal of the Optical Society of America A |
Abbreviated Journal |
JOSA A |
|
|
Volume |
34 |
Issue |
5 |
Pages |
827-837 |
|
|
Keywords |
|
|
|
Abstract |
Extraction of spatio-chromatic features from color images is usually performed independently on each color channel. Usual 3D color spaces, such as RGB, present a high inter-channel correlation for natural images. This correlation can be reduced using color-opponent representations, but the spatial structure of regions with small color differences is not fully captured in two generic Red-Green and Blue-Yellow channels. To overcome these problems, we propose a new color coding that is adapted to the specific content of each image. Our proposal is based on two steps: (a) setting the number of channels to the number of distinctive colors we find in each image (avoiding the problem of channel correlation), and (b) building a channel representation that maximizes contrast differences within each color channel (avoiding the problem of low local contrast). We call this approach more-than-three color coding (MTT) to emphasize that the number of channels is adapted to the image content. The higher the color complexity of an image, the more channels can be used to represent it. Here we select distinctive colors as the most predominant in the image, which we call color pivots, and we build the new color coding using these color pivots as a basis. To evaluate the proposed approach we measure its efficiency in an image categorization task. We show how a generic descriptor improves its performance at the description level when applied on the MTT coding. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
CIC; 600.087 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RVB2017 |
Serial |
2892 |
|
Permanent link to this record |
|
|
|
|
Author |
Marçal Rusiñol; Josep Llados |
|
|
Title |
Flowchart Recognition in Patent Information Retrieval |
Type |
Book Chapter |
|
Year |
2017 |
Publication |
Current Challenges in Patent Information Retrieval |
Abbreviated Journal |
|
|
|
Volume |
37 |
Issue |
|
Pages |
351-368 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
M. Lupu; K. Mayer; N. Kando; A.J. Trippe |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.097; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RuL2017 |
Serial |
2896 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Riba; Josep Llados; Alicia Fornes |
|
|
Title |
Error-tolerant coarse-to-fine matching model for hierarchical graphs |
Type |
Conference Article |
|
Year |
2017 |
Publication |
11th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
10310 |
Issue |
|
Pages |
107-117 |
|
|
Keywords |
Graph matching; Hierarchical graph; Graph-based representation; Coarse-to-fine matching |
|
|
Abstract |
Graph-based representations are effective tools to capture structural information from visual elements. However, retrieving a query graph from a large database of graphs implies a high computational complexity. Moreover, these representations are very sensitive to noise or small changes. In this work, a novel hierarchical graph representation is designed. Using graph clustering techniques adapted from graph-based social media analysis, we propose to generate a hierarchy able to deal with different levels of abstraction while keeping information about the topology. For the proposed representations, a coarse-to-fine matching method is defined. These approaches are validated using real scenarios such as classification of colour images and handwritten word spotting. |
|
|
Address |
Anacapri; Italy; May 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer International Publishing |
Place of Publication |
|
Editor |
Pasquale Foggia; Cheng-Lin Liu; Mario Vento |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GbRPR |
|
|
Notes |
DAG; 600.097; 601.302; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RLF2017a |
Serial |
2951 |
|
Permanent link to this record |
|
|
|
|
Author |
Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Jorma Laaksonen |
|
|
Title |
Top-Down Deep Appearance Attention for Action Recognition |
Type |
Conference Article |
|
Year |
2017 |
Publication |
20th Scandinavian Conference on Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
10269 |
Issue |
|
Pages |
297-309 |
|
|
Keywords |
Action recognition; CNNs; Feature fusion |
|
|
Abstract |
Recognizing human actions in videos is a challenging problem in computer vision. Recently, convolutional neural network based deep features have shown promising results for action recognition. In this paper, we investigate the problem of fusing deep appearance and motion cues for action recognition. We propose a video representation which combines deep appearance and motion based local convolutional features within the bag-of-deep-features framework. Firstly, dense deep appearance and motion based local convolutional features are extracted from spatial (RGB) and temporal (flow) networks, respectively. Both visual cues are processed in parallel by constructing separate visual vocabularies for appearance and motion. A category-specific appearance map is then learned to modulate the weights of the deep motion features. The proposed representation is discriminative and binds the deep local convolutional features to their spatial locations. Experiments are performed on two challenging datasets: JHMDB dataset with 21 action classes and ACT dataset with 43 categories. The results clearly demonstrate that our approach outperforms both standard approaches of early and late feature fusion. Further, our approach employs only action labels, without exploiting body part information, yet achieves competitive performance compared to the state-of-the-art deep features based approaches. |
|
|
Address |
Tromso; June 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
SCIA |
|
|
Notes |
LAMP; 600.109; 600.068; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RKW2017b |
Serial |
3039 |
|
Permanent link to this record |
|
|
|
|
Author |
Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Jorma Laaksonen |
|
|
Title |
Tex-Nets: Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition |
Type |
Conference Article |
|
Year |
2017 |
Publication |
19th International Conference on Multimodal Interaction |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Convolutional Neural Networks; Texture Recognition; Local Binary Patterns |
|
|
Abstract |
Recognizing materials and textures in realistic imaging conditions is a challenging computer vision problem. For many years, local features based orderless representations were a dominant approach for texture recognition. Recently deep local features, extracted from the intermediate layers of a Convolutional Neural Network (CNN), are used as filter banks. These dense local descriptors from a deep model, when encoded with Fisher Vectors, have been shown to provide excellent results for texture recognition. The CNN models, employed in such approaches, take RGB patches as input and train on a large amount of labeled images. We show that CNN models, which we call TEX-Nets, trained using mapped coded images with explicit texture information provide complementary information to the standard deep models trained on RGB patches. We further investigate two deep architectures, namely early and late fusion, to combine the texture and color information. Experiments on benchmark texture datasets clearly demonstrate that TEX-Nets provide complementary information to the standard RGB deep network. Our approach provides a large gain of 4.8%, 3.5%, 2.6% and 4.1% respectively in accuracy on the DTD, KTH-TIPS-2a, KTH-TIPS-2b and Texture-10 datasets, compared to the standard RGB network of the same architecture. Further, our final combination leads to consistent improvements over the state-of-the-art on all four datasets. |
|
|
Address |
Glasgow; Scotland; November 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ACM |
|
|
Notes |
LAMP; 600.109; 600.068; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RKW2017 |
Serial |
3038 |
|
Permanent link to this record |
|
|
|
|
Author |
Adria Rico; Alicia Fornes |
|
|
Title |
Camera-based Optical Music Recognition using a Convolutional Neural Network |
Type |
Conference Article |
|
Year |
2017 |
Publication |
12th IAPR International Workshop on Graphics Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
27-28 |
|
|
Keywords |
optical music recognition; document analysis; convolutional neural network; deep learning |
|
|
Abstract |
Optical Music Recognition (OMR) consists in recognizing images of music scores. Contrary to expectation, the current OMR systems usually fail when recognizing images of scores captured by digital cameras and smartphones. In this work, we propose a camera-based OMR system based on Convolutional Neural Networks, showing promising preliminary results. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GREC |
|
|
Notes |
DAG; 600.097; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RiF2017 |
Serial |
3059 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Rodriguez; Jordi Gonzalez; Jordi Cucurull; Josep M. Gonfaus; Xavier Roca |
|
|
Title |
Regularizing CNNs with Locally Constrained Decorrelations |
Type |
Conference Article |
|
Year |
2017 |
Publication |
5th International Conference on Learning Representations |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Toulon; France; April 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICLR |
|
|
Notes |
ISE; 602.143; 600.119; 600.098 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RGC2017 |
Serial |
2927 |
|
Permanent link to this record |
|
|
|
|
Author |
Veronica Romero; Alicia Fornes; Enrique Vidal; Joan Andreu Sanchez |
|
|
Title |
Information Extraction in Handwritten Marriage Licenses Books Using the MGGI Methodology |
Type |
Conference Article |
|
Year |
2017 |
Publication |
8th Iberian Conference on Pattern Recognition and Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
10255 |
Issue |
|
Pages |
287-294 |
|
|
Keywords |
Handwritten Text Recognition; Information extraction; Language modeling; MGGI; Categories-based language model |
|
|
Abstract |
Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demographic and genealogical research. For example, marriage license books have been used for centuries by ecclesiastical and secular institutions to register marriages. The records in these books follow a simple text structure with an evolving vocabulary, mainly composed of proper names that change over time. This distinct vocabulary makes automatic transcription and semantic information extraction difficult tasks. In previous works we studied the use of category-based language models and how a Grammatical Inference technique known as MGGI could improve the accuracy of these tasks. In this work we analyze the main causes of the semantic errors observed in previous results and apply a better implementation of the MGGI technique to solve these problems. Using the resulting language model, transcription and information extraction experiments have been carried out, and the results support our proposed approach. |
|
|
Address |
Faro; Portugal; June 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
L.A. Alexandre; J. Salvador Sanchez; Joao M. F. Rodriguez |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-319-58837-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IbPRIA |
|
|
Notes |
DAG; 602.006; 600.097; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RFV2017 |
Serial |
2952 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Riba; Alicia Fornes; Josep Llados |
|
|
Title |
Towards the Alignment of Handwritten Music Scores |
Type |
Book Chapter |
|
Year |
2017 |
Publication |
International Workshop on Graphics Recognition. GREC 2015. Graphic Recognition. Current Trends and Challenges |
Abbreviated Journal |
|
|
|
Volume |
9657 |
Issue |
|
Pages |
103-116 |
|
|
Keywords |
Optical Music Recognition; Handwritten Music Scores; Dynamic Time Warping alignment |
|
|
Abstract |
It is very common to find different versions of the same music work in archives of Opera Theaters. These differences correspond to modifications and annotations from the musicians. From the musicologist point of view, these variations are very interesting and deserve study.
This paper explores the alignment of music scores as a tool for automatically detecting the passages that contain such differences. Given the difficulties in the recognition of handwritten music scores, our goal is to align the music scores and at the same time, avoid the recognition of music elements as much as possible. After removing the staff lines, braces and ties, the bar lines are detected. Then, the bar units are described as a whole using the Blurred Shape Model. The bar units alignment is performed by using Dynamic Time Warping. The analysis of the alignment path is used to detect the variations in the music scores. The method has been evaluated on a subset of the CVC-MUSCIMA dataset, showing encouraging results. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
Bart Lamiroy; R. Dueire Lins |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-319-52158-9 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.097; 602.006; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RFL2017 |
Serial |
2955 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Riba; Anjan Dutta; Josep Llados; Alicia Fornes |
|
|
Title |
Graph-based deep learning for graphics classification |
Type |
Conference Article |
|
Year |
2017 |
Publication |
12th IAPR International Workshop on Graphics Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
29-30 |
|
|
Keywords |
|
|
|
Abstract |
Graph-based representations are a common way to deal with graphics recognition problems. However, previous works were mainly focused on developing learning-free techniques. The success of deep learning frameworks has proved that learning is a powerful tool to solve many problems; however, it is not straightforward to extend these methodologies to non-Euclidean data such as graphs. On the other hand, graphs are a good representational structure for graphical entities. In this work, we present some deep learning techniques that have been proposed in the literature for graph-based representations and we show how they can be used in graphics recognition problems. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GREC |
|
|
Notes |
DAG; 600.097; 601.302; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RDL2017b |
Serial |
3058 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Riba; Anjan Dutta; Josep Llados; Alicia Fornes; Sounak Dey |
|
|
Title |
Improving Information Retrieval in Multiwriter Scenario by Exploiting the Similarity Graph of Document Terms |
Type |
Conference Article |
|
Year |
2017 |
Publication |
14th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
475-480 |
|
|
Keywords |
document terms; information retrieval; affinity graph; graph of document terms; multiwriter; graph diffusion |
|
|
Abstract |
Information Retrieval (IR) is the activity of obtaining information resources relevant to an information need. It usually retrieves a set of objects ranked by their relevance to that need. In document analysis, information retrieval receives a lot of attention in terms of symbol and word spotting. However, over the decades the community has mostly focused on either printed or single-writer scenarios, where the state-of-the-art results have achieved reasonable performance on the available datasets. Nevertheless, the existing algorithms do not perform accordingly in a multiwriter scenario. A graph representing relations between a set of objects is a structure where each node delineates an individual element and the similarity between them is represented as a weight on the connecting edge. In this paper, we explore different analytics of graphs constructed from words or graphical symbols, such as diffusion, shortest path, etc., to improve the performance of information retrieval methods in a multiwriter scenario. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.097; 601.302; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RDL2017a |
Serial |
3053 |
|
Permanent link to this record |
|
|
|
|
Author |
E. Royer; J. Chazalon; Marçal Rusiñol; F. Bouchara |
|
|
Title |
Benchmarking Keypoint Filtering Approaches for Document Image Matching |
Type |
Conference Article |
|
Year |
2017 |
Publication |
14th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Best Poster Award.
Reducing the number of keypoints used to index an image is particularly interesting to control processing time and memory usage in real-time document image matching applications, like augmented documents or smartphone applications. This paper benchmarks two keypoint selection methods on a task consisting of reducing keypoint sets extracted from document images, while preserving detection and segmentation accuracy. We first study the different forms of keypoint filtering, and we introduce the use of the CORE selection method on keypoints extracted from document images. Then, we extend a previously published benchmark by including evaluations of the new method, by adding the SURF-BRISK detection/description scheme, and by reporting processing speeds. Evaluations are conducted on the publicly available dataset of ICDAR2015 SmartDOC challenge 1. Finally, we prove that reducing the original keypoint set is always feasible and can be beneficial not only to processing speed but also to accuracy. |
|
|
Address |
Kyoto; Japan; November 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.084; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RCR2017 |
Serial |
3000 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Rodriguez; Guillem Cucurull; Josep M. Gonfaus; Xavier Roca; Jordi Gonzalez |
|
|
Title |
Age and gender recognition in the wild with deep attention |
Type |
Journal Article |
|
Year |
2017 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
72 |
Issue |
|
Pages |
563-571 |
|
|
Keywords |
Age recognition; Gender recognition; Deep neural networks; Attention mechanisms |
|
|
Abstract |
Face analysis in images in the wild still poses a challenge for automatic age and gender recognition tasks, mainly due to their high variability in resolution, deformation, and occlusion. Although the performance has highly increased thanks to Convolutional Neural Networks (CNNs), it is still far from optimal when compared to other image recognition tasks, mainly because of the high sensitivity of CNNs to facial variations. In this paper, inspired by biology and the recent success of attention mechanisms on visual question answering and fine-grained recognition, we propose a novel feedforward attention mechanism that is able to discover the most informative and reliable parts of a given face for improving age and gender classification. In particular, given a downsampled facial image, the proposed model is trained based on a novel end-to-end learning framework to extract the most discriminative patches from the original high-resolution image. Experimental validation on the standard Adience, Images of Groups, and MORPH II benchmarks shows that including attention mechanisms enhances the performance of CNNs in terms of robustness and accuracy. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE; 600.098; 602.133; 600.119 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RCG2017b |
Serial |
2962 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Rodriguez; Guillem Cucurull; Jordi Gonzalez; Josep M. Gonfaus; Kamal Nasrollahi; Thomas B. Moeslund; Xavier Roca |
|
|
Title |
Deep Pain: Exploiting Long Short-Term Memory Networks for Facial Expression Classification |
Type |
Journal Article |
|
Year |
2017 |
Publication |
IEEE Transactions on Cybernetics |
Abbreviated Journal |
Cyber |
|
|
Volume |
|
Issue |
|
Pages |
1-11 |
|
|
Keywords |
|
|
|
Abstract |
Pain is an unpleasant feeling that has been shown to be an important factor for the recovery of patients. Since assessing it is costly in human resources and difficult to do objectively, there is a need for automatic systems to measure it. In this paper, contrary to current state-of-the-art techniques in pain assessment, which are based on facial features only, we suggest that the performance can be enhanced by feeding the raw frames to deep learning models, outperforming the latest state-of-the-art results while also directly facing the problem of imbalanced data. As a baseline, our approach first uses convolutional neural networks (CNNs) to learn facial features from VGG_Faces, which are then linked to a long short-term memory to exploit the temporal relation between video frames. We further compare the performance of the popular schema based on the canonically normalized appearance with that of taking into account the whole image. As a result, we outperform current state-of-the-art area under the curve performance in the UNBC-McMaster Shoulder Pain Expression Archive Database. In addition, to evaluate the generalization properties of our proposed methodology on facial motion recognition, we also report competitive results in the Cohn Kanade+ facial expression database. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE; 600.119; 600.098 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RCG2017a |
Serial |
2926 |
|
Permanent link to this record |
|
|
|
|
Author |
Ivet Rafegas; Maria Vanrell |
|
|
Title |
Color representation in CNNs: parallelisms with biological vision |
Type |
Conference Article |
|
Year |
2017 |
Publication |
ICCV Workshop on Mutual Benefits of Cognitive and Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Convolutional Neural Networks (CNNs) trained for object recognition tasks present representational capabilities approaching those of primate visual systems [1]. This provides a computational framework to explore how image features are efficiently represented. Here, we dissect a trained CNN [2] to study how color is represented. We use a classical methodology from physiology: measuring the selectivity index of individual neurons to specific features. We use ImageNet Dataset [20] images and synthetic versions of them to quantify color tuning properties of artificial neurons to provide a classification of the network population. We identify three main levels of color representation showing some parallelisms with biological visual systems: (a) a decomposition in a circular hue space to represent single color regions with a wider hue sampling beyond the first layer (V2); (b) the emergence of opponent low-dimensional spaces in early stages to represent color edges (V1); and (c) a strong entanglement between color and shape patterns representing object-parts (e.g. wheel of a car), object-shapes (e.g. faces) or object-surrounds configurations (e.g. blue sky surrounding an object) in deeper layers (V4 or IT). |
|
|
Address |
Venice; Italy; October 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICCV-MBCC |
|
|
Notes |
CIC; 600.087; 600.051 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RaV2017 |
Serial |
2984 |
|
Permanent link to this record |