Author Pau Riba; Josep Llados; Alicia Fornes; Anjan Dutta
  Title Large-scale graph indexing using binary embeddings of node contexts for information spotting in document image databases Type Journal Article
  Year 2017 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 87 Issue Pages 203-211  
  Abstract Graph-based representations are experiencing a growing usage in visual recognition and retrieval due to their representational power in front of classical appearance-based representations. However, retrieving a query graph from a large dataset of graphs implies a high computational complexity. The most important property for a large-scale retrieval is the search time complexity to be sub-linear in the number of database examples. With this aim, in this paper we propose a graph indexation formalism applied to visual retrieval. A binary embedding is defined as hashing keys for graph nodes. Given a database of labeled graphs, graph nodes are complemented with vectors of attributes representing their local context. Then, each attribute vector is converted to a binary code applying a binary-valued hash function. Therefore, graph retrieval is formulated in terms of finding target graphs in the database whose nodes have a small Hamming distance from the query nodes, easily computed with bitwise logical operators. As an application example, we validate the performance of the proposed methods in different real scenarios such as handwritten word spotting in images of historical documents or symbol spotting in architectural floor plans.  
  Notes DAG; 600.097; 602.006; 603.053; 600.121 Approved no  
  Call Number RLF2017b Serial 2873  

 
Author H. Martin Kjer; Jens Fagertun; Sergio Vera; Debora Gil; Miguel Angel Gonzalez Ballester; Rasmus R. Paulsen
  Title Free-form image registration of human cochlear µCT data using skeleton similarity as anatomical prior Type Journal Article
  Year 2016 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 76 Issue 1 Pages 76-82  
  Notes IAM; 600.060 Approved no  
  Call Number Admin @ si @ MFV2017b Serial 2941  

 
Author Lluis Gomez; Dimosthenis Karatzas
  Title TextProposals: a Text‐specific Selective Search Algorithm for Word Spotting in the Wild Type Journal Article
  Year 2017 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 70 Issue Pages 60-74  
  Abstract Motivated by the success of powerful while expensive techniques to recognize words in a holistic way (Goel et al., 2013; Almazán et al., 2014; Jaderberg et al., 2016) object proposals techniques emerge as an alternative to the traditional text detectors. In this paper we introduce a novel object proposals method that is specifically designed for text. We rely on a similarity based region grouping algorithm that generates a hierarchy of word hypotheses. Over the nodes of this hierarchy it is possible to apply a holistic word recognition method in an efficient way.

Our experiments demonstrate that the presented method is superior in its ability of producing good quality word proposals when compared with class-independent algorithms. We show impressive recall rates with a few thousand proposals in different standard benchmarks, including focused or incidental text datasets, and multi-language scenarios. Moreover, the combination of our object proposals with existing whole-word recognizers (Almazán et al., 2014; Jaderberg et al., 2016) shows competitive performance in end-to-end word spotting, and, in some benchmarks, outperforms previously published results. Concretely, in the challenging ICDAR2015 Incidental Text dataset, we overcome in more than 10% F-score the best-performing method in the last ICDAR Robust Reading Competition (Karatzas, 2015). Source code of the complete end-to-end system is available at https://github.com/lluisgomez/TextProposals.
 
  Notes DAG; 600.084; 601.197; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ GoK2017 Serial 2886  

 
Author Lluis Gomez; Anguelos Nicolaou; Dimosthenis Karatzas
  Title Improving patch‐based scene text script identification with ensembles of conjoined networks Type Journal Article
  Year 2017 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 67 Issue Pages 85-96  
  Notes DAG; 600.084; 600.121; 600.129 Approved no  
  Call Number Admin @ si @ GNK2017 Serial 2887  

 
Author Ivet Rafegas; Javier Vazquez; Robert Benavente; Maria Vanrell; Susana Alvarez
  Title Enhancing spatio-chromatic representation with more-than-three color coding for image description Type Journal Article
  Year 2017 Publication Journal of the Optical Society of America A Abbreviated Journal JOSA A  
  Volume 34 Issue 5 Pages 827-837  
  Abstract Extraction of spatio-chromatic features from color images is usually performed independently on each color channel. Usual 3D color spaces, such as RGB, present a high inter-channel correlation for natural images. This correlation can be reduced using color-opponent representations, but the spatial structure of regions with small color differences is not fully captured in two generic Red-Green and Blue-Yellow channels. To overcome these problems, we propose a new color coding that is adapted to the specific content of each image. Our proposal is based on two steps: (a) setting the number of channels to the number of distinctive colors we find in each image (avoiding the problem of channel correlation), and (b) building a channel representation that maximizes contrast differences within each color channel (avoiding the problem of low local contrast). We call this approach more-than-three color coding (MTT) to enhance the fact that the number of channels is adapted to the image content. The higher color complexity an image has, the more channels can be used to represent it. Here we select distinctive colors as the most predominant in the image, which we call color pivots, and we build the new color coding using these color pivots as a basis. To evaluate the proposed approach we measure its efficiency in an image categorization task. We show how a generic descriptor improves its performance at the description level when applied on the MTT coding.  
  Notes CIC; 600.087 Approved no  
  Call Number Admin @ si @ RVB2017 Serial 2892  

 
Author Cristhian A. Aguilera-Carrasco; Angel Sappa; Cristhian Aguilera; Ricardo Toledo
  Title Cross-Spectral Local Descriptors via Quadruplet Network Type Journal Article
  Year 2017 Publication Sensors Abbreviated Journal SENS  
  Volume 17 Issue 4 Pages 873  
  Abstract This paper presents a novel CNN-based architecture, referred to as Q-Net, to learn local feature descriptors that are useful for matching image patches from two different spectral bands. Given correctly matched and non-matching cross-spectral image pairs, a quadruplet network is trained to map input image patches to a common Euclidean space, regardless of the input spectral band. Our approach is inspired by the recent success of triplet networks in the visible spectrum, but adapted for cross-spectral scenarios, where, for each matching pair, there are always two possible non-matching patches: one for each spectrum. Experimental evaluations on a public cross-spectral VIS-NIR dataset show that the proposed approach improves the state-of-the-art. Moreover, the proposed technique can also be used in mono-spectral settings, obtaining a similar performance to triplet network descriptors, but requiring less training data.  
  Notes ADAS; 600.086; 600.118 Approved no  
  Call Number Admin @ si @ ASA2017 Serial 2914  

 
Author Pau Rodriguez; Guillem Cucurull; Jordi Gonzalez; Josep M. Gonfaus; Kamal Nasrollahi; Thomas B. Moeslund; Xavier Roca
  Title Deep Pain: Exploiting Long Short-Term Memory Networks for Facial Expression Classification Type Journal Article
  Year 2017 Publication IEEE Transactions on Cybernetics Abbreviated Journal Cyber  
  Volume Issue Pages 1-11  
  Abstract Pain is an unpleasant feeling that has been shown to be an important factor for the recovery of patients. Since this is costly in human resources and difficult to do objectively, there is the need for automatic systems to measure it. In this paper, contrary to current state-of-the-art techniques in pain assessment, which are based on facial features only, we suggest that the performance can be enhanced by feeding the raw frames to deep learning models, outperforming the latest state-of-the-art results while also directly facing the problem of imbalanced data. As a baseline, our approach first uses convolutional neural networks (CNNs) to learn facial features from VGG_Faces, which are then linked to a long short-term memory to exploit the temporal relation between video frames. We further compare the performances of using the so popular schema based on the canonically normalized appearance versus taking into account the whole image. As a result, we outperform current state-of-the-art area under the curve performance in the UNBC-McMaster Shoulder Pain Expression Archive Database. In addition, to evaluate the generalization properties of our proposed methodology on facial motion recognition, we also report competitive results in the Cohn Kanade+ facial expression database.  
  Notes ISE; 600.119; 600.098 Approved no  
  Call Number Admin @ si @ RCG2017a Serial 2926  

 
Author Anjan Dutta; Josep Llados; Horst Bunke; Umapada Pal
  Title Product graph-based higher order contextual similarities for inexact subgraph matching Type Journal Article
  Year 2018 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 76 Issue Pages 596-611  
  Abstract Many algorithms formulate graph matching as an optimization of an objective function of pairwise quantification of nodes and edges of two graphs to be matched. Pairwise measurements usually consider local attributes but disregard contextual information involved in graph structures. We address this issue by proposing contextual similarities between pairs of nodes. This is done by considering the tensor product graph (TPG) of two graphs to be matched, where each node is an ordered pair of nodes of the operand graphs. Contextual similarities between a pair of nodes are computed by accumulating weighted walks (normalized pairwise similarities) terminating at the corresponding paired node in TPG. Once the contextual similarities are obtained, we formulate subgraph matching as a node and edge selection problem in TPG. We use contextual similarities to construct an objective function and optimize it with a linear programming approach. Since random walk formulation through TPG takes into account higher order information, it is not a surprise that we obtain more reliable similarities and better discrimination among the nodes and edges. Experimental results shown on synthetic as well as real benchmarks illustrate that higher order contextual similarities increase discriminating power and allow one to find approximate solutions to the subgraph matching problem.  
  Notes DAG; 602.167; 600.097; 600.121 Approved no  
  Call Number Admin @ si @ DLB2018 Serial 3083  

 
Author Karim Lekadir; Alfiia Galimzianova; Angels Betriu; Maria del Mar Vila; Laura Igual; Daniel L. Rubin; Elvira Fernandez-Giraldez; Petia Radeva; Sandy Napel
  Title A Convolutional Neural Network for Automatic Characterization of Plaque Composition in Carotid Ultrasound Type Journal Article
  Year 2017 Publication IEEE Journal of Biomedical and Health Informatics Abbreviated Journal J-BHI  
  Volume 21 Issue 1 Pages 48-55  
  Abstract Characterization of carotid plaque composition, more specifically the amount of lipid core, fibrous tissue, and calcified tissue, is an important task for the identification of plaques that are prone to rupture, and thus for early risk estimation of cardiovascular and cerebrovascular events. Due to its low costs and wide availability, carotid ultrasound has the potential to become the modality of choice for plaque characterization in clinical practice. However, its significant image noise, coupled with the small size of the plaques and their complex appearance, makes it difficult for automated techniques to discriminate between the different plaque constituents. In this paper, we propose to address this challenging problem by exploiting the unique capabilities of the emerging deep learning framework. More specifically, and unlike existing works which require a priori definition of specific imaging features or thresholding values, we propose to build a convolutional neural network (CNN) that will automatically extract from the images the information that is optimal for the identification of the different plaque constituents. We used approximately 90 000 patches extracted from a database of images and corresponding expert plaque characterizations to train and to validate the proposed CNN. The results of cross-validation experiments show a correlation of about 0.90 with the clinical assessment for the estimation of lipid core, fibrous cap, and calcified tissue areas, indicating the potential of deep learning for the challenging task of automatic characterization of plaque composition in carotid ultrasound.  
  Notes MILAB; not mentioned Approved no  
  Call Number Admin @ si @ LGB2017 Serial 2931  

 
Author Anastasios Doulamis; Nikolaos Doulamis; Marco Bertini; Jordi Gonzalez; Thomas B. Moeslund
  Title Introduction to the Special Issue on the Analysis and Retrieval of Events/Actions and Workflows in Video Streams Type Journal Article
  Year 2016 Publication Multimedia Tools and Applications Abbreviated Journal MTAP  
  Volume 75 Issue 22 Pages 14985-14990  
  Notes ISE; HUPBA Approved no  
  Call Number Admin @ si @ DDB2016 Serial 2934  

 
Author Marta Diez-Ferrer; Debora Gil; Elena Carreño; Susana Padrones; Samantha Aso; Vanesa Vicens; Noelia Cubero de Frutos; Rosa Lopez Lisbona; Carles Sanchez; Agnes Borras; Antoni Rosell
  Title Positive Airway Pressure-Enhanced CT to Improve Virtual Bronchoscopic Navigation Type Journal Article
  Year 2017 Publication European Respiratory Journal Abbreviated Journal ERJ  
  Notes IAM Approved no  
  Call Number Admin @ si @ DGC2017b Serial 3632  

 
Author Sergio Escalera; Jordi Gonzalez; Hugo Jair Escalante; Xavier Baro; Isabelle Guyon
  Title Looking at People Special Issue Type Journal Article
  Year 2018 Publication International Journal of Computer Vision Abbreviated Journal IJCV  
  Volume 126 Issue 2-4 Pages 141-143  
  Notes HUPBA; ISE; 600.119 Approved no  
  Call Number Admin @ si @ EGJ2018 Serial 3093  

 
Author Xinhang Song; Shuqiang Jiang; Luis Herranz
  Title Multi-Scale Multi-Feature Context Modeling for Scene Recognition in the Semantic Manifold Type Journal Article
  Year 2017 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP  
  Volume 26 Issue 6 Pages 2721-2735  
  Abstract Before the big data era, scene recognition was often approached with two-step inference using localized intermediate representations (objects, topics, and so on). One of such approaches is the semantic manifold (SM), in which patches and images are modeled as points in a semantic probability simplex. Patch models are learned resorting to weak supervision via image labels, which leads to the problem of scene categories co-occurring in this semantic space. Fortunately, each category has its own co-occurrence patterns that are consistent across the images in that category. Thus, discovering and modeling these patterns are critical to improve the recognition performance in this representation. Since the emergence of large data sets, such as ImageNet and Places, these approaches have been relegated in favor of the much more powerful convolutional neural networks (CNNs), which can automatically learn multi-layered representations from the data. In this paper, we address many limitations of the original SM approach and related works. We propose discriminative patch representations using neural networks and further propose a hybrid architecture in which the semantic manifold is built on top of multiscale CNNs. Both representations can be computed significantly faster than the Gaussian mixture models of the original SM. To combine multiple scales, spatial relations, and multiple features, we formulate rich context models using Markov random fields. To solve the optimization problem, we analyze global and local approaches, where a top-down hierarchical algorithm has the best performance. Experimental results show that exploiting different types of contextual relations jointly consistently improves the recognition accuracy.  
  Notes LAMP; 600.120 Approved no  
  Call Number Admin @ si @ SJH2017a Serial 2963  

 
Author Weiqing Min; Shuqiang Jiang; Jitao Sang; Huayang Wang; Xinda Liu; Luis Herranz
  Title Being a Supercook: Joint Food Attributes and Multimodal Content Modeling for Recipe Retrieval and Exploration Type Journal Article
  Year 2017 Publication IEEE Transactions on Multimedia Abbreviated Journal TMM  
  Volume 19 Issue 5 Pages 1100-1113  
  Abstract This paper considers the problem of recipe-oriented image-ingredient correlation learning with multi-attributes for recipe retrieval and exploration. Existing methods mainly focus on food visual information for recognition while we model visual information, textual content (e.g., ingredients), and attributes (e.g., cuisine and course) together to solve extended recipe-oriented problems, such as multimodal cuisine classification and attribute-enhanced food image retrieval. As a solution, we propose a multimodal multitask deep belief network (M3TDBN) to learn joint image-ingredient representation regularized by different attributes. By grouping ingredients into visible ingredients (which are visible in the food image, e.g., “chicken” and “mushroom”) and nonvisible ingredients (e.g., “salt” and “oil”), M3TDBN is capable of learning both midlevel visual representation between images and visible ingredients and nonvisual representation. Furthermore, in order to utilize different attributes to improve the intermodality correlation, M3TDBN incorporates multitask learning to make different attributes collaborate each other. Based on the proposed M3TDBN, we exploit the derived deep features and the discovered correlations for three extended novel applications: 1) multimodal cuisine classification; 2) attribute-augmented cross-modal recipe image retrieval; and 3) ingredient and attribute inference from food images. The proposed approach is evaluated on the constructed Yummly dataset and the evaluation results have validated the effectiveness of the proposed approach.  
  Notes LAMP; 600.120 Approved no  
  Call Number Admin @ si @ MJS2017 Serial 2964  

 
Author Luis Herranz; Shuqiang Jiang; Ruihan Xu
  Title Modeling Restaurant Context for Food Recognition Type Journal Article
  Year 2017 Publication IEEE Transactions on Multimedia Abbreviated Journal TMM  
  Volume 19 Issue 2 Pages 430-440  
  Abstract Food photos are widely used in food logs for diet monitoring and in social networks to share social and gastronomic experiences. A large number of these images are taken in restaurants. Dish recognition in general is very challenging, due to different cuisines, cooking styles, and the intrinsic difficulty of modeling food from its visual appearance. However, contextual knowledge can be crucial to improve recognition in such scenario. In particular, geocontext has been widely exploited for outdoor landmark recognition. Similarly, we exploit knowledge about menus and location of restaurants and test images. We first adapt a framework based on discarding unlikely categories located far from the test image. Then, we reformulate the problem using a probabilistic model connecting dishes, restaurants, and locations. We apply that model in three different tasks: dish recognition, restaurant recognition, and location refinement. Experiments on six datasets show that by integrating multiple evidences (visual, location, and external knowledge) our system can boost the performance in all tasks.  
  Notes LAMP; 600.120 Approved no  
  Call Number Admin @ si @ HJX2017 Serial 2965  