Records | |||||
---|---|---|---|---|---|
Author | Mohammad Rouhani; Angel Sappa | ||||
Title | Non-Rigid Shape Registration: A Single Linear Least Squares Framework | Type | Conference Article | ||
Year | 2012 | Publication | 12th European Conference on Computer Vision | Abbreviated Journal | |
Volume | 7578 | Issue | Pages | 264-277 | |
Keywords | |||||
Abstract | This paper proposes a non-rigid registration formulation capturing both global and local deformations in a single framework. This formulation is based on a quadratic estimation of the registration distance together with a quadratic regularization term. Hence, the optimal transformation parameters are easily obtained by solving a linear system of equations, which guarantees fast convergence. Experimental results with challenging 2D and 3D shapes are presented to show the validity of the proposed framework. Furthermore, comparisons with the most relevant approaches are provided. | ||||
Address | Florence | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-33785-7 | Medium | |
Area | Expedition | Conference | ECCV | ||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ RoS2012a | Serial | 2158 | ||
Permanent link to this record | |||||
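The core point of the abstract above — that a quadratic estimate of the registration distance plus a quadratic regularizer lets the optimal transformation parameters drop out of a single linear solve — can be sketched in a few lines. The 2D affine setup, the identity-pulling regularizer, and all variable names below are illustrative stand-ins, not the authors' actual formulation:

```python
import numpy as np

# Sketch: when both the data term and the regularizer are quadratic in
# the parameter vector p, minimizing their sum reduces to one linear
# system (the normal equations). Toy 2D affine registration example.

rng = np.random.default_rng(0)

# Ground-truth warp x' = M x + t, with parameters p = [M.ravel(), t].
src = rng.normal(size=(50, 2))                    # source shape points
M_true = np.array([[1.1, 0.2], [-0.1, 0.9]])
t_true = np.array([0.5, -0.3])
dst = src @ M_true.T + t_true                     # target shape points

# Quadratic data term ||A p - b||^2: each point contributes two rows.
n = src.shape[0]
A = np.zeros((2 * n, 6))
b = dst.ravel()
A[0::2, 0:2] = src   # x' = M11*x1 + M12*x2 + t1
A[0::2, 4] = 1.0
A[1::2, 2:4] = src   # y' = M21*x1 + M22*x2 + t2
A[1::2, 5] = 1.0

# Quadratic regularizer lam * ||p - p_id||^2 pulling towards identity.
lam = 1e-3
p_id = np.array([1, 0, 0, 1, 0, 0], dtype=float)

# Normal equations of the combined quadratic cost: a single linear solve.
p = np.linalg.solve(A.T @ A + lam * np.eye(6), A.T @ b + lam * p_id)
M_est, t_est = p[:4].reshape(2, 2), p[4:]
```

With noiseless data the solve recovers the true parameters up to the (tiny) regularization bias; the paper's contribution is formulating a *non-rigid* distance and regularizer so that the same one-solve structure still holds.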
Author | Jose Manuel Alvarez; Felipe Lumbreras; Antonio Lopez; Theo Gevers | ||||
Title | Understanding Road Scenes using Visual Cues | Type | Miscellaneous | ||
Year | 2012 | Publication | European Conference on Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | DEMO | ||||
Address | Florence; Italy | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ ALL2012 | Serial | 2795 | ||
Author | Jose Manuel Alvarez; Theo Gevers; Y. LeCun; Antonio Lopez | ||||
Title | Road Scene Segmentation from a Single Image | Type | Conference Article | ||
Year | 2012 | Publication | 12th European Conference on Computer Vision | Abbreviated Journal | |
Volume | 7578 | Issue | VII | Pages | 376-389 |
Keywords | road detection | ||||
Abstract | Road scene segmentation is important in computer vision for different applications such as autonomous driving and pedestrian detection. Recovering the 3D structure of road scenes provides relevant contextual information to improve their understanding.
In this paper, we use a convolutional neural network based algorithm to learn features from noisy labels to recover the 3D scene layout of a road image. The novelty of the algorithm relies on generating training labels by applying an algorithm trained on a general image dataset to classify on-board images. Further, we propose a novel texture descriptor based on a learned color plane fusion to obtain maximal uniformity in road areas. Finally, acquired (off-line) and current (on-line) information are combined to detect road areas in single images. From quantitative and qualitative experiments, conducted on publicly available datasets, it is concluded that convolutional neural networks are suitable for learning 3D scene layout from noisy labels, providing a relative improvement of 7% compared to the baseline. Furthermore, combining color planes provides a statistical description of road areas that exhibits maximal uniformity, with a relative improvement of 8% compared to the baseline. Finally, the improvement is even larger when acquired and current information from a single image are combined. |
||||
Address | Florence, Italy | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-33785-7 | Medium | |
Area | Expedition | Conference | ECCV | ||
Notes | ADAS;ISE | Approved | no | ||
Call Number | Admin @ si @ AGL2012; ADAS @ adas @ agl2012a | Serial | 2022 | ||
Author | Ivo Everts; Jan van Gemert; Theo Gevers | ||||
Title | Per-patch Descriptor Selection using Surface and Scene Properties | Type | Conference Article | ||
Year | 2012 | Publication | 12th European Conference on Computer Vision | Abbreviated Journal | |
Volume | 7577 | Issue | VI | Pages | 172-186 |
Keywords | |||||
Abstract | Local image descriptors are generally designed for describing all possible image patches. Such patches may be subject to complex variations in appearance due to incidental object, scene and recording conditions. Because of this, a single best descriptor for accurate image representation under all conditions does not exist. Therefore, we propose to automatically select from a pool of descriptors the one best suited to the object surface and scene properties. These properties are measured on the fly from a single image patch through a set of attributes. Attributes are input to a classifier which selects the best descriptor. Our experiments on a large dataset of colored object patches show that the proposed selection method outperforms the best single descriptor and a priori combinations of the descriptor pool. | ||||
Address | Florence, Italy | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-33782-6 | Medium | |
Area | Expedition | Conference | ECCV | ||
Notes | ALTRES;ISE | Approved | no | ||
Call Number | Admin @ si @ EGG2012 | Serial | 2023 | ||
Permanent link to this record | |||||
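The selection pipeline the abstract above describes — cheap attributes measured on the fly from a patch, fed to a classifier that picks one descriptor from a pool — can be sketched as below. The attribute set, the descriptor pool names, and the nearest-centroid classifier are illustrative stand-ins for whatever the paper actually uses:

```python
import numpy as np

# Sketch of per-patch descriptor selection: measure cheap per-patch
# attributes, then let a classifier trained offline choose which
# descriptor from a pool to apply. Everything here is a toy stand-in.

DESCRIPTOR_POOL = ["sift_like", "color_hist", "lbp_like"]

def patch_attributes(patch):
    """Cheap per-patch measurements: mean intensity, contrast, edge energy."""
    gray = patch.mean(axis=2)
    gx, gy = np.gradient(gray)
    return np.array([gray.mean(), gray.std(), np.hypot(gx, gy).mean()])

class NearestCentroidSelector:
    """Picks the descriptor whose training centroid is closest in attribute space."""
    def fit(self, X, y):
        self.centroids_ = np.stack([X[y == k].mean(axis=0)
                                    for k in range(len(DESCRIPTOR_POOL))])
        return self

    def select(self, patch):
        a = patch_attributes(patch)
        d = np.linalg.norm(self.centroids_ - a, axis=1)
        return DESCRIPTOR_POOL[int(d.argmin())]

# Offline training: attribute vectors of patches labelled with their
# best-performing descriptor (here, synthetic clusters per descriptor).
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(loc=3.0 * k, scale=0.3, size=(20, 3))
                    for k in range(3)])
y = np.repeat(np.arange(3), 20)
selector = NearestCentroidSelector().fit(X, y)
```

At test time, `selector.select(patch)` returns the name of the descriptor to compute for that patch, so the expensive descriptors are only evaluated where the attributes predict they pay off.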
Author | Hamdi Dibeklioglu; Theo Gevers; Albert Ali Salah | ||||
Title | Are You Really Smiling at Me? Spontaneous versus Posed Enjoyment Smiles | Type | Conference Article | ||
Year | 2012 | Publication | 12th European Conference on Computer Vision | Abbreviated Journal | |
Volume | 7574 | Issue | III | Pages | 525-538 |
Keywords | |||||
Abstract | Smiling is an indispensable element of nonverbal social interaction. Moreover, the automatic distinction between spontaneous and posed expressions is important for visual analysis of social signals. Therefore, in this paper, we propose a method to distinguish between spontaneous and posed enjoyment smiles by using the dynamics of eyelid, cheek, and lip corner movements. The discriminative power of these movements, and the effect of different fusion levels, are investigated on multiple databases. Our results improve the state of the art. We also introduce the largest spontaneous/posed enjoyment smile database collected to date, and report new empirical and conceptual findings on smile dynamics. The collected database consists of 1240 samples of 400 subjects. Moreover, it has the unique property of covering an age range from 8 to 76 years. Large scale experiments on the new database indicate that eyelid dynamics are highly relevant for smile classification, and that there are age-related differences in smile dynamics. | ||||
Address | Florence, Italy | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-33711-6 | Medium | |
Area | Expedition | Conference | ECCV | ||
Notes | ALTRES;ISE | Approved | no | ||
Call Number | Admin @ si @ DGS2012 | Serial | 2024 | ||
Author | David Masip; Alexander Todorov; Jordi Vitria | ||||
Title | The Role of Facial Regions in Evaluating Social Dime | Type | Conference Article | ||
Year | 2012 | Publication | 12th European Conference on Computer Vision – Workshops and Demonstrations | Abbreviated Journal | |
Volume | 7584 | Issue | II | Pages | 210-219 |
Keywords | Workshops and Demonstrations | ||||
Abstract | Facial trait judgments are an important information cue for people. Recent works in the Psychology field have established the basis of face evaluation, defining a set of traits that we evaluate from faces (e.g. dominance, trustworthiness, aggressiveness, attractiveness, threat or intelligence, among others). We rapidly infer information from others' faces: usually within a short period of time (< 1000 ms) we perceive a certain degree of dominance or trustworthiness of another person from the face. Although these perceptions are not necessarily accurate, they influence many important social outcomes (such as election results or court decisions). This topic has also attracted the attention of Computer Vision scientists, and recently a computational model to automatically predict trait evaluations from faces has been proposed. These systems try to mimic human perception by applying machine learning classifiers to a set of labeled data. In this paper we perform an experimental study on the specific facial features that trigger the social inferences. Building on previous results from the literature, we propose to use simple similarity maps to evaluate which regions of the face most influence the trait inferences. The correlation analysis is performed using only appearance, and the results from the experiments suggest that each trait is correlated with specific facial characteristics. | ||||
Address | Florence, Italy | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | Andrea Fusiello, Vittorio Murino, Rita Cucchiara | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-33867-0 | Medium | |
Area | Expedition | Conference | ECCVW | ||
Notes | OR;MV | Approved | no | ||
Call Number | Admin @ si @ MTV2012 | Serial | 2171 | ||
Author | J.R. Serra; J.B. Subirana | ||||
Title | Adaptive non-cartesian networks for vision. | Type | Miscellaneous | ||
Year | 1997 | Publication | IX International Conference on Image Analysis and Processing. | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Florence | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | Approved | no | |||
Call Number | Admin @ si @ SeS1997 | Serial | 212 | ||
Author | Aura Hernandez-Sabate; Jose Elias Yauri; Pau Folch; Miquel Angel Piera; Debora Gil | ||||
Title | Recognition of the Mental Workloads of Pilots in the Cockpit Using EEG Signals | Type | Journal Article | ||
Year | 2022 | Publication | Applied Sciences | Abbreviated Journal | APPLSCI |
Volume | 12 | Issue | 5 | Pages | 2298 |
Keywords | Cognitive states; Mental workload; EEG analysis; Neural networks; Multimodal data fusion | ||||
Abstract | The commercial flightdeck is a naturally multi-tasking work environment, one in which interruptions are frequent and come in various forms, contributing in many cases to aviation incident reports. Automatic characterization of pilots’ workloads is essential to preventing these kinds of incidents. In addition, minimizing the physiological sensor network as much as possible remains both a challenge and a requirement. Electroencephalogram (EEG) signals have shown high correlations with specific cognitive and mental states, such as workload. However, there is not enough evidence in the literature to validate how well models generalize to new subjects performing tasks with workloads similar to the ones included during the model’s training. In this paper, we propose a convolutional neural network to classify EEG features across different mental workloads in a continuous performance task test that partly measures working memory and working memory capacity. Our model is valid at the general population level and is able to transfer task learning to pilot mental workload recognition in a simulated operational environment. | ||||
Address | February 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; ADAS; 600.139; 600.145; 600.118 | Approved | no | ||
Call Number | Admin @ si @ HYF2022 | Serial | 3720 | ||
Author | Gabriel Villalonga | ||||
Title | Leveraging Synthetic Data to Create Autonomous Driving Perception Systems | Type | Book Whole | ||
Year | 2021 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Manually annotating images to develop vision models has been a major bottleneck
since computer vision and machine learning started to walk together. This has become more evident since computer vision fell on the shoulders of data-hungry deep learning techniques. When addressing on-board perception for autonomous driving, the curse of data annotation is exacerbated due to the use of additional sensors such as LiDAR. Therefore, any approach aiming at reducing such time-consuming and costly work is of high interest for addressing autonomous driving and, in fact, for any application requiring some sort of artificial perception. In the last decade, it has been shown that leveraging synthetic data is a paradigm worth pursuing in order to minimize manual data annotation. The reason is that the automatic process of generating synthetic data can also produce different types of associated annotations (e.g. object bounding boxes for synthetic images and LiDAR pointclouds, pixel/point-wise semantic information, etc.). Directly using synthetic data for training deep perception models may not be the definitive solution in all circumstances, since a synth-to-real domain shift can appear. In this context, this work focuses on leveraging synthetic data to alleviate manual annotation for three perception tasks related to driving assistance and autonomous driving. In all cases, we assume the use of deep convolutional neural networks (CNNs) to develop our perception models. The first task addresses traffic sign recognition (TSR), a kind of multi-class classification problem. We assume that the number of sign classes to be recognized must be suddenly increased without having annotated samples to perform the corresponding TSR CNN re-training. We show that, by leveraging synthetic samples of such new classes and transforming them with a generative adversarial network (GAN) trained on the known classes (i.e.
without using samples from the new classes), it is possible to re-train the TSR CNN to properly classify all the signs for a ∼ 1/4 ratio of new/known sign classes. The second task addresses on-board 2D object detection, focusing on vehicles and pedestrians. In this case, we assume that we receive a set of images without the annotations required to train an object detector, i.e. without object bounding boxes. Therefore, our goal is to self-annotate these images so that they can later be used to train the desired object detector. To reach this goal, we leverage synthetic data and propose a semi-supervised learning approach based on the co-training idea. In fact, we use a GAN to reduce the synth-to-real domain shift before applying co-training. Our quantitative results show that co-training and GAN-based image-to-image translation complement each other, allowing the training of object detectors without manual annotation while almost reaching the upper-bound performance of detectors trained on human annotations. While in the previous tasks we focus on vision-based perception, the third task focuses on LiDAR pointclouds. Our initial goal was to develop a 3D object detector trained on synthetic LiDAR-style pointclouds. While for images we may expect synth/real-to-real domain shift due to differences in appearance (e.g. when source and target images come from different camera sensors), we did not expect so for LiDAR pointclouds, since these active sensors factor out appearance and provide sampled shapes. However, in practice, we have seen that there can be domain shift even among real-world LiDAR pointclouds. Factors such as the sampling parameters of the LiDARs, the sensor suite configuration on board the ego-vehicle, and the human annotation of 3D bounding boxes do induce a domain shift. We show it through comprehensive experiments with different publicly available datasets and 3D detectors.
This redirected our goal towards the design of a GAN for pointcloud-to-pointcloud translation, a relatively unexplored topic. Finally, it is worth mentioning that all the synthetic datasets used for these three tasks have been designed and generated in the context of this PhD work and will be publicly released. Overall, we think this PhD presents several steps forward to encourage leveraging synthetic data for developing deep perception models in the field of driving assistance and autonomous driving. |
||||
Address | February 2021 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Antonio Lopez;German Ros | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-122714-2-3 | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 600.118 | Approved | no | ||
Call Number | Admin @ si @ Vil2021 | Serial | 3599 | ||
Author | Edgar Riba | ||||
Title | Geometric Computer Vision Techniques for Scene Reconstruction | Type | Book Whole | ||
Year | 2021 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | From the early stages of Computer Vision, scene reconstruction has been one of the most studied topics, leading to a wide variety of new discoveries and applications. Object grasping and manipulation, localization and mapping, or even visual effect generation are different examples of applications in which scene reconstruction has taken an important role for industries such as robotics, factory automation, or audio-visual production. However, scene reconstruction is an extensive topic that can be approached in many different ways, with already existing solutions that work effectively in controlled environments. Formally, the problem of scene reconstruction can be formulated as a sequence of independent processes which compose a pipeline. In this thesis, we analyse some parts of the reconstruction pipeline and contribute with novel methods using Convolutional Neural Networks (CNN), proposing innovative solutions that consider the optimisation of the methods in an end-to-end fashion. First, we review the state of the art of classical local feature detectors and descriptors and contribute with two novel methods that inherently improve pre-existing solutions in the scene reconstruction pipeline.
It is a fact that computer science and software engineering are two fields that usually go hand in hand and evolve according to mutual needs, making the design of complex and efficient algorithms easier. For this reason, we contribute with Kornia, a library specifically designed to work with classical computer vision techniques along with deep neural networks. In essence, we created a framework that eases the design of complex pipelines for computer vision algorithms so that they can be included within neural networks and used to backpropagate gradients through a common optimisation framework. Finally, in the last chapter of this thesis we develop the aforementioned concept of designing end-to-end systems with classical projective geometry. Thus, we contribute with a solution to the problem of synthetic view generation by hallucinating novel views of highly deformable cloth objects using a geometry-aware end-to-end system. To summarize, in this thesis we demonstrate that a proper design combining classical geometric computer vision methods with deep learning techniques can improve pre-existing solutions for the problem of scene reconstruction. |
||||
Address | February 2021 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Place of Publication | Editor | Daniel Ponsa | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MSIAU | Approved | no | ||
Call Number | Admin @ si @ Rib2021 | Serial | 3610 | ||
Author | Saad Minhas; Aura Hernandez-Sabate; Shoaib Ehsan; Klaus McDonald Maier | ||||
Title | Effects of Non-Driving Related Tasks during Self-Driving mode | Type | Journal Article | ||
Year | 2022 | Publication | IEEE Transactions on Intelligent Transportation Systems | Abbreviated Journal | TITS |
Volume | 23 | Issue | 2 | Pages | 1391-1399 |
Keywords | |||||
Abstract | Perception reaction time and mental workload have proven to be crucial in manual driving. Moreover, in highly automated cars, where most of the research is focusing on Level 4 autonomous driving, take-over performance is also a key factor when taking road safety into account. This study aims to investigate how immersion in non-driving related tasks affects the take-over performance of drivers in given scenarios. The paper also highlights the use of virtual simulators to gather efficient data that can be crucial in easing the transition between manual and autonomous driving scenarios. The use of computer-aided simulations is of absolute importance in this day and age, since the automotive industry is rapidly moving towards autonomous technology. An experiment comprising 40 subjects was performed to examine the reaction times of drivers and the influence of other variables on the success of take-over performance in highly automated driving under different circumstances within a highway virtual environment. The results reflect the relationship between reaction times under the different scenarios that drivers might face under the circumstances stated above, as well as the importance of variables such as velocity in successfully regaining car control after automated driving. The implications of the results acquired are important for understanding the criteria needed for designing Human Machine Interfaces specifically aimed towards automated driving conditions. Understanding the need to keep drivers in the loop during automation, whilst allowing drivers to safely engage in other non-driving related tasks, is an important research area which can be aided by the proposed study. | ||||
Address | Feb. 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; 600.139; 600.145 | Approved | no | ||
Call Number | Admin @ si @ MHE2022 | Serial | 3468 | ||
Author | Miquel Angel Piera; Jose Luis Muñoz; Debora Gil; Gonzalo Martin; Jordi Manzano | ||||
Title | A Socio-Technical Simulation Model for the Design of the Future Single Pilot Cockpit: An Opportunity to Improve Pilot Performance | Type | Journal Article | ||
Year | 2022 | Publication | IEEE Access | Abbreviated Journal | ACCESS |
Volume | 10 | Issue | Pages | 22330-22343 | |
Keywords | Human factors ; Performance evaluation ; Simulation; Sociotechnical systems ; System performance | ||||
Abstract | The future deployment of single pilot operations must be supported by new cockpit computer services. Such services require an adaptive context-aware integration of technical functionalities with the concurrent tasks that a pilot must deal with. Advanced artificial intelligence supporting services and improved communication capabilities are the key enabling technologies that will render future cockpits more integrated with the present digitalized air traffic management system. However, an issue in the integration of such technologies is the lack of socio-technical analysis in the design of these teaming mechanisms. A key factor in determining how and when a service support should be provided is the dynamic evolution of pilot workload. This paper investigates how the socio-technical model-based systems engineering approach paves the way for the design of a digital assistant framework by formalizing this workload. The model was validated in an Airbus A-320 cockpit simulator, and the results confirmed the degraded pilot behavioral model and the performance impact according to different contextual flight deck information. This study contributes to practical knowledge for designing human-machine task-sharing systems. | ||||
Address | Feb 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; | Approved | no | ||
Call Number | Admin @ si @ PMG2022 | Serial | 3697 | ||
Author | Veronica Romero; Alicia Fornes; Enrique Vidal; Joan Andreu Sanchez | ||||
Title | Information Extraction in Handwritten Marriage Licenses Books Using the MGGI Methodology | Type | Conference Article | ||
Year | 2017 | Publication | 8th Iberian Conference on Pattern Recognition and Image Analysis | Abbreviated Journal | |
Volume | 10255 | Issue | Pages | 287-294 | |
Keywords | Handwritten Text Recognition; Information extraction; Language modeling; MGGI; Categories-based language model | ||||
Abstract | Historical records of daily activities provide intriguing insights into the life of our ancestors, useful for demographic and genealogical research. For example, marriage license books have been used for centuries by ecclesiastical and secular institutions to register marriages. These books follow a simple structure of the text in the records, with an evolutionary vocabulary mainly composed of proper names that change over time. This distinct vocabulary makes automatic transcription and semantic information extraction difficult tasks. In previous works we studied the use of category-based language models and how a Grammatical Inference technique known as MGGI could improve the accuracy of these tasks. In this work we analyze the main causes of the semantic errors observed in previous results and apply a better implementation of the MGGI technique to solve these problems. Using the resulting language model, transcription and information extraction experiments have been carried out, and the results support our proposed approach. | ||||
Address | Faro; Portugal; June 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | L.A. Alexandre; J.Salvador Sanchez; Joao M. F. Rodriguez | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-3-319-58837-7 | Medium | ||
Area | Expedition | Conference | IbPRIA | ||
Notes | DAG; 602.006; 600.097; 600.121 | Approved | no | ||
Call Number | Admin @ si @ RFV2017 | Serial | 2952 | ||
Permanent link to this record | |||||
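The category-based language-model idea the abstract above builds on — mapping the evolving vocabulary of proper names to category tags before estimating the model, so counts generalise across names — can be sketched as follows. The tiny lexicons, the sample records, and the plain bigram counts are invented for illustration; they are not the paper's MGGI implementation:

```python
import re
from collections import Counter

# Sketch: replace open-vocabulary tokens (names, occupations) with
# category tags, then estimate n-gram statistics over the tagged
# stream. Lexicons and records below are made up for the example.

NAME_LEXICON = {"josep", "maria", "joan", "anna"}
OCCUPATION_LEXICON = {"farmer", "weaver", "merchant"}

def categorise(token):
    """Map a token to its category tag, or keep it as a literal word."""
    t = token.lower()
    if t in NAME_LEXICON:
        return "<NAME>"
    if t in OCCUPATION_LEXICON:
        return "<OCCUPATION>"
    return t

def bigram_counts(records):
    """Bigram counts over category-tagged tokens, with sentence markers."""
    counts = Counter()
    for record in records:
        tokens = ["<s>"] + [categorise(w) for w in re.findall(r"\w+", record)] + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts

records = [
    "Josep farmer married Maria",
    "Joan weaver married Anna",
]
counts = bigram_counts(records)
```

Because both records collapse to the same tagged pattern `<s> <NAME> <OCCUPATION> married <NAME> </s>`, their counts pool together, which is exactly what lets such a model cope with names it has never seen.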
Author | Marc Bolaños; Alvaro Peris; Francisco Casacuberta; Petia Radeva | ||||
Title | VIBIKNet: Visual Bidirectional Kernelized Network for Visual Question Answering | Type | Conference Article | ||
Year | 2017 | Publication | 8th Iberian Conference on Pattern Recognition and Image Analysis | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Visual Question Answering; Convolutional Neural Networks; Long short-term memory networks | ||||
Abstract | In this paper, we address the problem of visual question answering by proposing a novel model, called VIBIKNet. Our model is based on integrating Kernelized Convolutional Neural Networks and Long-Short Term Memory units to generate an answer given a question about an image. We prove that VIBIKNet is an optimal trade-off between accuracy and computational load, in terms of memory and time consumption. We validate our method on the VQA challenge dataset and compare it to the top performing methods in order to illustrate its performance and speed. | ||||
Address | Faro; Portugal; June 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IbPRIA | ||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ BPC2017 | Serial | 2939 | ||
Author | Hana Jarraya; Oriol Ramos Terrades; Josep Llados | ||||
Title | Graph Embedding through Probabilistic Graphical Model applied to Symbolic Graphs | Type | Conference Article | ||
Year | 2017 | Publication | 8th Iberian Conference on Pattern Recognition and Image Analysis | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Attributed Graph; Probabilistic Graphical Model; Graph Embedding; Structured Support Vector Machines | ||||
Abstract | We propose a new Graph Embedding (GEM) method that takes advantage of structural pattern representation. It models an Attributed Graph (AG) as a Probabilistic Graphical Model (PGM). Then, it learns the parameters of this PGM, represented by a vector. This vector is a signature of the AG in a lower-dimensional vectorial space. We apply Structured Support Vector Machines (SSVM) to perform the classification task. As a first attempt, the results on the GREC dataset are encouraging enough to go further in this direction. | ||||
Address | Faro; Portugal; June 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IbPRIA | ||
Notes | DAG; 600.097; 600.121 | Approved | no | ||
Call Number | Admin @ si @ JRL2017a | Serial | 2953 | ||