Records | |||||
---|---|---|---|---|---|
Author | Stefan Schurischuster; Beatriz Remeseiro; Petia Radeva; Martin Kampel | ||||
Title | A Preliminary Study of Image Analysis for Parasite Detection on Honey Bees | Type | Conference Article | ||
Year | 2018 | Publication | 15th International Conference on Image Analysis and Recognition | Abbreviated Journal | |
Volume | 10882 | Issue | Pages | 465-473 | |
Keywords | |||||
Abstract | Varroa destructor is a parasite harming bee colonies. As the worldwide bee population is in danger, beekeepers as well as researchers are looking for methods to monitor the health of bee hives. In this context, we present a preliminary study to detect parasites in bee videos by means of image analysis and machine learning techniques. For this purpose, each video frame is analyzed individually to extract bee image patches, which are then processed to compute image descriptors and finally classified as mite or no-mite bees. The experimental results demonstrate the adequacy of the proposed method, which provides a solid stepping stone toward a full bee monitoring system. | ||||
Address | Povoa de Varzim; Portugal; June 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICIAR | ||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ SRR2018a | Serial | 3110 | ||
Permanent link to this record | |||||
Author | Marc Bolaños; Alvaro Peris; Francisco Casacuberta; Sergi Solera; Petia Radeva | ||||
Title | Egocentric video description based on temporally-linked sequences | Type | Journal Article | ||
Year | 2018 | Publication | Journal of Visual Communication and Image Representation | Abbreviated Journal | JVCIR |
Volume | 50 | Issue | Pages | 205-216 | |
Keywords | egocentric vision; video description; deep learning; multi-modal learning | ||||
Abstract | Egocentric vision consists of acquiring images throughout the day from a first-person point of view using wearable cameras. The automatic analysis of this information makes it possible to discover daily patterns and improve the user's quality of life. A natural topic that arises in egocentric vision is storytelling, that is, how to understand and tell the story lying behind the pictures.
In this paper, we tackle storytelling as an egocentric sequence description problem. We propose a novel methodology that exploits information from temporally neighboring events, matching precisely the nature of egocentric sequences. Furthermore, we present a new method for multimodal data fusion consisting of a multi-input attention recurrent network. We also release the EDUB-SegDesc dataset, the first dataset for egocentric image sequence description, consisting of 1,339 events with 3,991 descriptions, from 55 days acquired by 11 people. Finally, we show that our proposal outperforms classical attentional encoder-decoder methods for video description. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ BPC2018 | Serial | 3109 | ||
Permanent link to this record | |||||
Author | Felipe Codevilla; Matthias Muller; Antonio Lopez; Vladlen Koltun; Alexey Dosovitskiy | ||||
Title | End-to-end Driving via Conditional Imitation Learning | Type | Conference Article | ||
Year | 2018 | Publication | IEEE International Conference on Robotics and Automation | Abbreviated Journal | |
Volume | Issue | Pages | 4693 - 4700 | ||
Keywords | |||||
Abstract | Deep networks trained on demonstrations of human driving have learned to follow roads and avoid obstacles. However, driving policies trained via imitation learning cannot be controlled at test time. A vehicle trained end-to-end to imitate an expert cannot be guided to take a specific turn at an upcoming intersection. This limits the utility of such systems. We propose to condition imitation learning on high-level command input. At test time, the learned driving policy functions as a chauffeur that handles sensorimotor coordination but continues to respond to navigational commands. We evaluate different architectures for conditional imitation learning in vision-based driving. We conduct experiments in realistic three-dimensional simulations of urban driving and on a 1/5 scale robotic truck that is trained to drive in a residential area. Both systems drive based on visual input yet remain responsive to high-level navigational commands. The supplementary video can be viewed at this https URL | ||||
Address | Brisbane; Australia; May 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICRA | ||
Notes | ADAS; 600.116; 600.124; 600.118 | Approved | no | ||
Call Number | Admin @ si @ CML2018 | Serial | 3108 | ||
Permanent link to this record | |||||
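The command-conditioned branching this abstract describes can be sketched in a few lines. The code below is an illustrative stand-in, not the authors' implementation: random matrices play the role of the learned per-command action heads, and the command names are hypothetical.

```python
import numpy as np

def branched_policy(features, command, heads):
    """Conditional imitation learning, branched variant: a shared perception
    module produces one feature vector, and the high-level navigational
    command selects which action head maps it to control outputs."""
    return heads[command] @ features

rng = np.random.default_rng(0)
features = rng.normal(size=(8,))  # stand-in for CNN perception features
# Hypothetical command set; each head predicts (steering, throttle).
heads = {c: rng.normal(size=(2, 8)) for c in ("left", "right", "straight")}

a_left = branched_policy(features, "left", heads)
a_straight = branched_policy(features, "straight", heads)
```

The same observation thus yields different actions depending on the command, which is exactly the controllability the abstract argues plain imitation learning lacks.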
Author | Fahad Shahbaz Khan; Joost Van de Weijer; Muhammad Anwer Rao; Andrew Bagdanov; Michael Felsberg; Jorma | ||||
Title | Scale coding bag of deep features for human attribute and action recognition | Type | Journal Article | ||
Year | 2018 | Publication | Machine Vision and Applications | Abbreviated Journal | MVAP |
Volume | 29 | Issue | 1 | Pages | 55-71 |
Keywords | Action recognition; Attribute recognition; Bag of deep features | ||||
Abstract | Most approaches to human attribute and action recognition in still images are based on image representations in which multi-scale local features are pooled across scale into a single, scale-invariant encoding. In both bag-of-words and the recently popular representations based on convolutional neural networks, local features are computed at multiple scales. However, these multi-scale convolutional features are pooled into a single scale-invariant representation. We argue that entirely scale-invariant image representations are sub-optimal, and we investigate approaches to scale coding within a bag of deep features framework. Our approach encodes multi-scale information explicitly during the image encoding stage. We propose two strategies to encode multi-scale information explicitly in the final image representation. We validate our two scale coding techniques on five datasets: Willow, PASCAL VOC 2010, PASCAL VOC 2012, Stanford-40 and Human Attributes (HAT-27). On all datasets, the proposed scale coding approaches outperform both the scale-invariant method and the standard deep features of the same network. Further, combining our scale coding approaches with standard deep features leads to consistent improvement over the state of the art. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.068; 600.079; 600.106; 600.120 | Approved | no | ||
Call Number | Admin @ si @ KWR2018 | Serial | 3107 | ||
Permanent link to this record | |||||
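The core idea of scale coding — pooling features from different scale ranges into separate parts of the final encoding instead of one scale-invariant pool — can be sketched as follows. This is a minimal illustration with random stand-in features and two hypothetical scale bins, not the paper's actual encoder.

```python
import numpy as np

def scale_coded_encoding(feats, scales,
                         bins=(("small", 0, 32), ("large", 32, 10**9))):
    """Pool each scale range separately and concatenate the pooled vectors,
    so the final representation keeps explicit multi-scale information."""
    parts = []
    for _, lo, hi in bins:
        mask = (scales >= lo) & (scales < hi)
        parts.append(feats[mask].mean(axis=0) if mask.any()
                     else np.zeros(feats.shape[1]))
    return np.concatenate(parts)

rng = np.random.default_rng(3)
feats = rng.normal(size=(100, 8))       # stand-in local deep features
scales = rng.integers(4, 64, size=100)  # patch scale of each feature

enc = scale_coded_encoding(feats, scales)  # small-scale pool + large-scale pool
```

A scale-invariant baseline would instead be `feats.mean(axis=0)`, collapsing both bins into one 8-dimensional vector.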
Author | V. Poulain d'Andecy; Emmanuel Hartmann; Marçal Rusiñol | ||||
Title | Field Extraction by hybrid incremental and a-priori structural templates | Type | Conference Article | ||
Year | 2018 | Publication | 13th IAPR International Workshop on Document Analysis Systems | Abbreviated Journal | |
Volume | Issue | Pages | 251 - 256 | ||
Keywords | Layout Analysis; information extraction; incremental learning | ||||
Abstract | In this paper, we present an incremental framework for extracting information fields from administrative documents. First, we demonstrate some limits of existing state-of-the-art methods, such as the delay before the system becomes efficient, which is a concern in an industrial context where only a few samples of each document class are available. Based on this analysis, we propose a hybrid system combining incremental learning by means of itf-df statistics and a-priori generic models. In the experimental section, we report results obtained on a dataset of real invoices. | ||||
Address | Vienna; Austria; April 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | DAS | ||
Notes | DAG; 600.084; 600.129; 600.121 | Approved | no | ||
Call Number | Admin @ si @ PHR2018 | Serial | 3106 | ||
Permanent link to this record | |||||
Author | David Aldavert; Marçal Rusiñol | ||||
Title | Synthetically generated semantic codebook for Bag-of-Visual-Words based word spotting | Type | Conference Article | ||
Year | 2018 | Publication | 13th IAPR International Workshop on Document Analysis Systems | Abbreviated Journal | |
Volume | Issue | Pages | 223 - 228 | ||
Keywords | Word Spotting; Bag of Visual Words; Synthetic Codebook; Semantic Information | ||||
Abstract | Word-spotting methods based on the Bag-of-Visual-Words framework have demonstrated good retrieval performance even when used in a completely unsupervised manner. Although unsupervised approaches are suitable for large document collections due to the cost of acquiring labeled data, these methods also present some drawbacks. For instance, training a suitable “codebook” for a certain dataset has a high computational cost. Therefore, in this paper we present a database-agnostic codebook trained from synthetic data. The aim of the proposed approach is to generate a codebook where the only information required is the type of script used in the document. The use of synthetic data also makes it easy to incorporate semantic information in the codebook generation, so the proposed method is able to determine which set of codewords has a semantic representation in the descriptor feature space. Experimental results show that the resulting codebook attains state-of-the-art performance while having a more compact representation. | ||||
Address | Vienna; Austria; April 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | DAS | ||
Notes | DAG; 600.084; 600.129; 600.121 | Approved | no | ||
Call Number | Admin @ si @ AlR2018b | Serial | 3105 | ||
Permanent link to this record | |||||
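Independently of how the codebook is obtained (below a random stand-in takes the place of the synthetically trained one), the Bag-of-Visual-Words encoding the paper builds on amounts to nearest-codeword assignment plus a normalized histogram:

```python
import numpy as np

def bovw_histogram(descriptors, codebook):
    """Assign each local descriptor to its nearest codeword (Euclidean
    distance) and build the normalized bag-of-visual-words histogram
    used to represent a word image for spotting."""
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d2.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / max(hist.sum(), 1.0)

rng = np.random.default_rng(4)
codebook = rng.normal(size=(32, 16))  # stand-in for the synthetic codebook
desc = rng.normal(size=(200, 16))     # local descriptors of one word image

h = bovw_histogram(desc, codebook)
```

Spotting then reduces to comparing such histograms between the query and the document word images.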
Author | David Aldavert; Marçal Rusiñol | ||||
Title | Manuscript text line detection and segmentation using second-order derivatives analysis | Type | Conference Article | ||
Year | 2018 | Publication | 13th IAPR International Workshop on Document Analysis Systems | Abbreviated Journal | |
Volume | Issue | Pages | 293 - 298 | ||
Keywords | text line detection; text line segmentation; text region detection; second-order derivatives | ||||
Abstract | In this paper, we explore the use of second-order derivatives to detect text lines in handwritten document images. Taking advantage of the fact that the second derivative gives a minimum response when a dark linear element over a bright background has the same orientation as the filter, we use this operator to create a map of the local orientation and strength of putative text lines in the document. We then detect line segments by selecting and merging the filter responses that have a similar orientation and scale. Finally, text lines are found by merging the segments that lie within the same text region. The proposed segmentation algorithm is learning-free while showing performance similar to state-of-the-art methods on publicly available datasets. | ||||
Address | Vienna; Austria; April 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | DAS | ||
Notes | DAG; 600.084; 600.129; 302.065; 600.121 | Approved | no | ||
Call Number | Admin @ si @ AlR2018a | Serial | 3104 | ||
Permanent link to this record | |||||
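The orientation-selective second-derivative response that this method builds on can be illustrated on a toy image. The sketch below applies a discrete 1-D second-derivative kernel to a synthetic page; it does not reproduce the paper's oriented, multi-scale filters.

```python
import numpy as np

def second_derivative_response(img, axis=0):
    """Discrete second derivative [1, -2, 1] along one axis. Differentiating
    across a dark line on a bright background gives a strong positive
    response at the line centre, while differentiating along the line gives
    almost none -- the orientation selectivity the detector exploits."""
    k = np.array([1.0, -2.0, 1.0])
    return np.apply_along_axis(
        lambda v: np.convolve(v, k, mode="same"), axis, img)

# Synthetic page: bright background with one dark horizontal "text line".
page = np.full((9, 20), 200.0)
page[4, :] = 50.0  # dark line at row 4

resp = second_derivative_response(page, axis=0)  # differentiate across rows
line_row = int(np.argmax(resp.sum(axis=1)))      # strongest row = text line
```

Thresholding such response maps per orientation yields the candidate segments that the method then merges into text lines.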
Author | Dimosthenis Karatzas; Lluis Gomez; Marçal Rusiñol; Anguelos Nicolaou | ||||
Title | The Robust Reading Competition Annotation and Evaluation Platform | Type | Conference Article | ||
Year | 2018 | Publication | 13th IAPR International Workshop on Document Analysis Systems | Abbreviated Journal | |
Volume | Issue | Pages | 61-66 | ||
Keywords | |||||
Abstract | The ICDAR Robust Reading Competition (RRC), initiated in 2003 and re-established in 2011, has become the de facto evaluation standard for the international community. Concurrent with its second incarnation in 2011, a continuous effort started to develop an online framework to facilitate the hosting and management of competitions. This short paper briefly outlines the Robust Reading Competition Annotation and Evaluation Platform, the backbone of the Robust Reading Competition, comprising a collection of tools and processes that aim to simplify the management and annotation of data, and to provide online and offline performance evaluation and analysis services. | ||||
Address | Vienna; Austria; April 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | DAS | ||
Notes | DAG; 600.084; 600.121 | Approved | no | ||
Call Number | KGR2018 | Serial | 3103 | ||
Permanent link to this record | |||||
Author | Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas | ||||
Title | Cutting Sayre's Knot: Reading Scene Text without Segmentation. Application to Utility Meters | Type | Conference Article | ||
Year | 2018 | Publication | 13th IAPR International Workshop on Document Analysis Systems | Abbreviated Journal | |
Volume | Issue | Pages | 97-102 | ||
Keywords | Robust Reading; End-to-end Systems; CNN; Utility Meters | ||||
Abstract | In this paper we present a segmentation-free system for reading text in natural scenes. A CNN architecture is trained in an end-to-end manner and is able to directly output readings without any explicit text localization step. In order to validate our proposal, we focus on the specific case of reading utility meters. We present our results on a large dataset of images acquired by different users and devices, so text appears in any location, with different sizes, fonts and lengths, and the images present several distortions such as dirt, illumination highlights or blur. | ||||
Address | Vienna; Austria; April 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | DAS | ||
Notes | DAG; 600.084; 600.121; 600.129 | Approved | no | ||
Call Number | Admin @ si @ GRK2018 | Serial | 3102 | ||
Permanent link to this record | |||||
Author | Sangheeta Roy; Palaiahnakote Shivakumara; Namita Jain; Vijeta Khare; Anjan Dutta; Umapada Pal; Tong Lu | ||||
Title | Rough-Fuzzy based Scene Categorization for Text Detection and Recognition in Video | Type | Journal Article | ||
Year | 2018 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 80 | Issue | Pages | 64-82 | |
Keywords | Rough set; Fuzzy set; Video categorization; Scene image classification; Video text detection; Video text recognition | ||||
Abstract | Scene image or video understanding is a challenging task, especially when the number of video types increases drastically with high variations in background and foreground. This paper proposes a new method for categorizing scene videos into different classes, namely Animation, Outlet, Sports, e-Learning, Medical, Weather, Defense, Economics, Animal Planet and Technology, in order to improve the performance of text detection and recognition, which is an effective approach to scene image or video understanding. For this purpose, we first present a new combination of rough and fuzzy concepts to study the irregular shapes of edge components in input scene videos, which helps to classify edge components into several groups. Next, the proposed method explores the gradient direction information of each pixel in each edge component group to extract stroke-based features by dividing each group into several intra and inter planes. We further extract correlation and covariance features to encode semantic features located inside planes or between planes. The features of the intra and inter planes of the groups are then concatenated to obtain a feature matrix. Finally, the feature matrix is verified with temporal frames and fed to a neural network for categorization. Experimental results show that the proposed method outperforms the existing state-of-the-art methods, while at the same time the performance of text detection and recognition methods also improves significantly thanks to categorization. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.097; 600.121 | Approved | no | ||
Call Number | Admin @ si @ RSJ2018 | Serial | 3096 | ||
Permanent link to this record | |||||
Author | Hugo Jair Escalante; Heysem Kaya; Albert Ali Salah; Sergio Escalera; Yagmur Gucluturk; Umut Guclu; Xavier Baro; Isabelle Guyon; Julio C. S. Jacques Junior; Meysam Madadi; Stephane Ayache; Evelyne Viegas; Furkan Gurpinar; Achmadnoer Sukma Wicaksana; Cynthia C. S. Liem; Marcel A. J. van Gerven; Rob van Lier | ||||
Title | Explaining First Impressions: Modeling, Recognizing, and Explaining Apparent Personality from Videos | Type | Miscellaneous | ||
Year | 2018 | Publication | Arxiv | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Explainability and interpretability are two critical aspects of decision support systems. Within computer vision, they are critical in certain tasks related to human behavior analysis such as in health care applications. Despite their importance, it is only recently that researchers are starting to explore these aspects. This paper provides an introduction to explainability and interpretability in the context of computer vision with an emphasis on looking at people tasks. Specifically, we review and study those mechanisms in the context of first impressions analysis. To the best of our knowledge, this is the first effort in this direction. Additionally, we describe a challenge we organized on explainability in first impressions analysis from video. We analyze in detail the newly introduced data set, the evaluation protocol, and summarize the results of the challenge. Finally, derived from our study, we outline research opportunities that we foresee will be decisive in the near future for the development of the explainable computer vision field. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ JKS2018 | Serial | 3095 | ||
Permanent link to this record | |||||
Author | Mohamed Ilyes Lakhal; Hakan Cevikalp; Sergio Escalera | ||||
Title | CRN: End-to-end Convolutional Recurrent Network Structure Applied to Vehicle Classification | Type | Conference Article | ||
Year | 2018 | Publication | 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications | Abbreviated Journal | |
Volume | 5 | Issue | Pages | 137-144 | |
Keywords | Vehicle Classification; Deep Learning; End-to-end Learning | ||||
Abstract | Vehicle type classification is considered a central part of Intelligent Traffic Systems. In recent years, deep learning methods have emerged as the state of the art in many computer vision tasks. In this paper, we present a novel yet simple deep learning framework for the vehicle type classification problem. We propose an end-to-end trainable system that combines a convolutional neural network for feature extraction with a recurrent neural network as a classifier. The recurrent network structure is used to handle various types of feature inputs, and at the same time can produce a single class prediction or a set of class predictions. In order to assess the effectiveness of our solution, we conducted a set of experiments on two public datasets, obtaining state-of-the-art results. In addition, we also report results on the newly released MIO-TCD dataset. | ||||
Address | Funchal; Madeira; Portugal; January 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | VISAPP | ||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ LCE2018a | Serial | 3094 | ||
Permanent link to this record | |||||
Author | Sergio Escalera; Jordi Gonzalez; Hugo Jair Escalante; Xavier Baro; Isabelle Guyon | ||||
Title | Looking at People Special Issue | Type | Journal Article | ||
Year | 2018 | Publication | International Journal of Computer Vision | Abbreviated Journal | IJCV |
Volume | 126 | Issue | 2-4 | Pages | 141-143 |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; ISE; 600.119 | Approved | no | ||
Call Number | Admin @ si @ EGJ2018 | Serial | 3093 | ||
Permanent link to this record | |||||
Author | Katerine Diaz; Jesus Martinez del Rincon; Aura Hernandez-Sabate; Debora Gil | ||||
Title | Continuous head pose estimation using manifold subspace embedding and multivariate regression | Type | Journal Article | ||
Year | 2018 | Publication | IEEE Access | Abbreviated Journal | ACCESS |
Volume | 6 | Issue | Pages | 18325 - 18334 | |
Keywords | Head Pose estimation; HOG features; Generalized Discriminative Common Vectors; B-splines; Multiple linear regression | ||||
Abstract | In this paper, a continuous head pose estimation system is proposed to estimate yaw and pitch head angles from raw facial images. Our approach is based on manifold learning-based methods, due to their promising generalization properties for face modelling from images. The method combines histograms of oriented gradients, generalized discriminative common vectors and continuous local regression to achieve successful performance. Our proposal was tested on multiple standard face datasets, as well as in a realistic scenario. The results show a considerable performance improvement and a higher consistency of our model in comparison with other state-of-the-art methods, with angular errors varying between 9 and 17 degrees. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 2169-3536 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 600.118 | Approved | no | ||
Call Number | Admin @ si @ DMH2018b | Serial | 3091 | ||
Permanent link to this record | |||||
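The final regression stage of such a system — mapping a feature vector to continuous yaw and pitch — reduces to multivariate least squares. In the sketch below, random features stand in for the HOG + generalized discriminative common vectors pipeline, and a single global linear fit stands in for the paper's continuous local regression, so only the multivariate regression step is demonstrated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for facial descriptors projected onto a discriminative subspace.
X = rng.normal(size=(200, 16))      # n_samples x n_features
W_true = rng.normal(size=(16, 2))   # maps features -> (yaw, pitch)
Y = X @ W_true                      # continuous head angles (noise-free toy)

# Multivariate linear regression: one least-squares solve predicts both
# angles jointly from the same feature vector.
Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append bias column
W, *_ = np.linalg.lstsq(Xb, Y, rcond=None)

pred = Xb @ W
err = float(np.max(np.abs(pred - Y)))
```

On this noise-free toy data the fit recovers the mapping to machine precision; real descriptors would of course leave a residual, which the paper reports as 9 to 17 degrees of angular error.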
Author | Katerine Diaz; Francesc J. Ferri; Aura Hernandez-Sabate | ||||
Title | An overview of incremental feature extraction methods based on linear subspaces | Type | Journal Article | ||
Year | 2018 | Publication | Knowledge-Based Systems | Abbreviated Journal | KBS |
Volume | 145 | Issue | Pages | 219-235 | |
Keywords | |||||
Abstract | With the massive explosion of machine learning in our day-to-day life, incremental and adaptive learning has become a major topic, crucial to keeping classification models and their corresponding feature extraction processes up to date. This paper presents a categorized overview of incremental feature extraction based on linear subspace methods, which aim at incorporating new information into the already acquired knowledge without accessing previous data. Specifically, this paper focuses on those linear dimensionality reduction methods with orthogonal matrix constraints based on a global loss function, due to the extensive use of their batch approaches versus other linear alternatives. Thus, we cover the approaches derived from Principal Component Analysis, Linear Discriminant Analysis and Discriminative Common Vector methods. For each basic method, its incremental approaches are differentiated according to the subspace model and matrix decomposition involved in the updating process. Besides this categorization, several updating strategies are distinguished according to the amount of data used in each update and to whether a static or dynamic number of classes is considered. Moreover, the specific role of the size/dimension ratio in each method is considered. Finally, computational complexity, experimental setup and the accuracy rates according to published results are compiled and analyzed, and an empirical evaluation is done to compare the best approach of each kind. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0950-7051 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 600.118 | Approved | no | ||
Call Number | Admin @ si @ DFH2018 | Serial | 3090 | ||
Permanent link to this record |
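The defining property surveyed in this record — updating a subspace model when new samples arrive without revisiting previous data — can be illustrated with the rank-one mean/scatter update that underlies incremental PCA variants. This is a minimal sketch of that ingredient, not any specific method from the survey:

```python
import numpy as np

def update_mean_scatter(mean, S, n, x):
    """Rank-one (Welford-style) update of the running mean and scatter
    matrix S = sum_i (x_i - mean)(x_i - mean)^T when sample x arrives,
    without storing or revisiting earlier samples."""
    n_new = n + 1
    delta = x - mean
    mean_new = mean + delta / n_new
    S_new = S + np.outer(delta, x - mean_new)
    return mean_new, S_new, n_new

rng = np.random.default_rng(1)
data = rng.normal(size=(50, 4))

mean, S, n = np.zeros(4), np.zeros((4, 4)), 0
for x in data:                      # samples arrive one at a time
    mean, S, n = update_mean_scatter(mean, S, n, x)

# Batch statistics computed with all data in memory, for comparison.
batch_mean = data.mean(axis=0)
batch_S = (data - batch_mean).T @ (data - batch_mean)
```

An eigendecomposition of `S` (or of `S / n`) then yields the principal subspace, which incremental PCA methods keep refreshed as the stream continues.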