Records
Author Marçal Rusiñol; Dimosthenis Karatzas; Andrew Bagdanov; Josep Llados
Title Multipage Document Retrieval by Textual and Visual Representations Type Conference Article
Year 2012 Publication 21st International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 521-524
Keywords
Abstract In this paper we present a multipage administrative document image retrieval system based on textual and visual representations of document pages. Individual pages are represented by textual or visual information using a bag-of-words framework. Different fusion strategies are evaluated which allow the system to perform multipage document retrieval on the basis of a single page retrieval system. Results are reported on a large dataset of document images sampled from a banking workflow.
Address Tsukuba Science City, Japan
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1051-4651 ISBN 978-1-4673-2216-4 Medium
Area Expedition Conference ICPR
Notes DAG Approved no
Call Number Admin @ si @ RKB2012 Serial 2053
 

 
Author Antonio Hernandez; Miguel Angel Bautista; Xavier Perez Sala; Victor Ponce; Xavier Baro; Oriol Pujol; Cecilio Angulo; Sergio Escalera
Title BoVDW: Bag-of-Visual-and-Depth-Words for Gesture Recognition Type Conference Article
Year 2012 Publication 21st International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract We present a Bag-of-Visual-and-Depth-Words (BoVDW) model for gesture recognition, an extension of the Bag-of-Visual-Words (BoVW) model that benefits from the multimodal fusion of visual and depth features. State-of-the-art RGB and depth features, including a newly proposed depth descriptor, are analysed and combined in a late fusion fashion. The method is integrated in a continuous gesture recognition pipeline, where the Dynamic Time Warping (DTW) algorithm is used to perform prior segmentation of gestures. Results of the method on public data sets, within our gesture recognition pipeline, show better performance in comparison to a standard BoVW model.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1051-4651 ISBN 978-1-4673-2216-4 Medium
Area Expedition Conference ICPR
Notes HuPBA;MV Approved no
Call Number Admin @ si @ HBP2012 Serial 2122
 

 
Author Anjan Dutta; Jaume Gibert; Josep Llados; Horst Bunke; Umapada Pal
Title Combination of Product Graph and Random Walk Kernel for Symbol Spotting in Graphical Documents Type Conference Article
Year 2012 Publication 21st International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 1663-1666
Keywords
Abstract This paper explores the utilization of product graph for spotting symbols on graphical documents. Product graph is intended to find the candidate subgraphs or components in the input graph containing the paths similar to the query graph. The acute angle between two edges and their length ratio are considered as the node labels. In a second step, each of the candidate subgraphs in the input graph is assigned with a distance measure computed by a random walk kernel. Actually it is the minimum of the distances of the component to all the components of the model graph. This distance measure is then used to eliminate dissimilar components. The remaining neighboring components are grouped and the grouped zone is considered as a retrieval zone of a symbol similar to the queried one. The entire method works online, i.e., it doesn't need any preprocessing step. The present paper reports the initial results of the method, which are very encouraging.
Address Tsukuba, Japan
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1051-4651 ISBN 978-1-4673-2216-4 Medium
Area Expedition Conference ICPR
Notes DAG Approved no
Call Number Admin @ si @ DGL2012 Serial 2125
 

 
Author Thanh Ha Do; Salvatore Tabbone; Oriol Ramos Terrades
Title Text/graphic separation using a sparse representation with multi-learned dictionaries Type Conference Article
Year 2012 Publication 21st International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages
Keywords Graphics Recognition; Layout Analysis; Document Understanding
Abstract In this paper, we propose a new approach to extract text regions from graphical documents. In our method, we first empirically construct two sequences of learned dictionaries for the text and graphical parts respectively. Then, we compute the sparse representations of non-overlapping document patches of different sizes in these learned dictionaries. Based on these representations, each patch can be classified into the text or graphic category by comparing its reconstruction errors. Same-sized patches in one category are then merged together to define the corresponding text or graphic layers, which are combined to create the final text/graphic layer. Finally, in a post-processing step, text regions are further filtered by using some learned thresholds.
Address Tsukuba
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICPR
Notes DAG Approved no
Call Number Admin @ si @ DTR2012a Serial 2135
 

 
Author Josep M. Gonfaus; Theo Gevers; Arjan Gijsenij; Xavier Roca; Jordi Gonzalez
Title Edge Classification using Photo-Geometric features Type Conference Article
Year 2012 Publication 21st International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 1497-1500
Keywords
Abstract Edges are caused by several imaging cues such as shadow, material and illumination transitions. Classification methods have been proposed which are solely based on photometric information, ignoring geometry to classify the physical nature of edges in images. In this paper, the aim is to present a novel strategy to handle both photometric and geometric information for edge classification. Photometric information is obtained through the use of quasi-invariants while geometric information is derived from the orientation and contrast of edges. Different combination frameworks are compared with a new principled approach that captures both information into the same descriptor. From large scale experiments on different datasets, it is shown that, in addition to photometric information, the geometry of edges is an important visual cue to distinguish between different edge types. It is concluded that by combining both cues the performance improves by more than 7% for shadows and highlights.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1051-4651 ISBN 978-1-4673-2216-4 Medium
Area Expedition Conference ICPR
Notes ISE Approved no
Call Number Admin @ si @ GGG2012b Serial 2142
 

 
Author Adela Barbulescu; Wenjuan Gong; Jordi Gonzalez; Thomas B. Moeslund; Xavier Roca
Title 3D Human Pose Estimation Using 2D Body Part Detectors Type Conference Article
Year 2012 Publication 21st International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 2484-2487
Keywords
Abstract Automatic 3D reconstruction of human poses from monocular images is a challenging and popular topic in the computer vision community, which provides a wide range of applications in multiple areas. Solutions for 3D pose estimation involve various learning approaches, such as support vector machines and Gaussian processes, but many encounter difficulties in cluttered scenarios and require additional input data, such as silhouettes, or controlled camera settings. We present a framework that is capable of estimating the 3D pose of a person from single images or monocular image sequences without requiring background information and which is robust to camera variations. The framework models the non-linearity present in human pose estimation as it benefits from flexible learning approaches, including a highly customizable 2D detector. Results on the HumanEva benchmark show how they perform and influence the quality of the 3D pose estimates.
Address Tsukuba, Japan
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1051-4651 ISBN 978-1-4673-2216-4 Medium
Area Expedition Conference ICPR
Notes ISE Approved no
Call Number Admin @ si @ BGG2012 Serial 2172
 

 
Author Md. Mostafa Kamal Sarker; Hatem A. Rashwan; Farhan Akram; Syeda Furruka Banu; Adel Saleh; Vivek Kumar Singh; Forhad U. H. Chowdhury; Saddam Abdulwahab; Santiago Romani; Petia Radeva; Domenec Puig
Title SLSDeep: Skin Lesion Segmentation Based on Dilated Residual and Pyramid Pooling Networks. Type Conference Article
Year 2018 Publication 21st International Conference on Medical Image Computing & Computer Assisted Intervention Abbreviated Journal
Volume 2 Issue Pages 21-29
Keywords
Abstract Skin lesion segmentation (SLS) in dermoscopic images is a crucial task for automated diagnosis of melanoma. In this paper, we present a robust deep learning SLS model, called SLSDeep, which is structured as an encoder-decoder network. The encoder network is built from dilated residual layers, while the decoder consists of a pyramid pooling network followed by three convolution layers. Unlike traditional methods that employ a cross-entropy loss, we investigated a loss function combining Negative Log Likelihood (NLL) and End Point Error (EPE) to accurately segment the melanoma regions with sharp boundaries. The robustness of the proposed model was evaluated on two public databases: the ISBI 2016 and 2017 challenges on skin lesion analysis towards melanoma detection. The proposed model outperforms the state-of-the-art methods in terms of segmentation accuracy. Moreover, it is capable of segmenting more than 100 images of size 384x384 per second on a recent GPU.
Address Granada; Spain; September 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MICCAI
Notes MILAB; no proj Approved no
Call Number Admin @ si @ SRA2018 Serial 3112
 

 
Author Md. Mostafa Kamal Sarker; Mohammed Jabreel; Hatem A. Rashwan; Syeda Furruka Banu; Petia Radeva; Domenec Puig
Title CuisineNet: Food Attributes Classification using Multi-scale Convolution Network Type Conference Article
Year 2018 Publication 21st International Conference of the Catalan Association for Artificial Intelligence Abbreviated Journal
Volume Issue Pages 365-372
Keywords
Abstract Diversity of food and its attributes represents the culinary habits of peoples from different countries. Thus, this paper addresses the problem of identifying the food culture of people around the world and its flavor by classifying two main food attributes, cuisine and flavor. A deep learning model based on multi-scale convolutional networks is proposed for extracting more accurate features from input images. The aggregation of multi-scale convolution layers with different kernel sizes is also used for weighting the features resulting from different scales. In addition, a joint loss function based on Negative Log Likelihood (NLL) is used to fit the model probability to multi-labeled classes for the multi-modal classification task. Furthermore, this work provides a new dataset for food attributes, called Yummly48K, extracted from the popular food website Yummly. Our model is assessed on the constructed Yummly48K dataset. The experimental results show that our proposed method yields 65% and 62% average F1 score on the validation and test sets, outperforming the state-of-the-art models.
Address Roses; Catalonia; October 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CCIA
Notes MILAB; no menciona Approved no
Call Number Admin @ si @ SJR2018 Serial 3113
 

 
Author Michal Drozdzal; Jordi Vitria; Santiago Segui; Carolina Malagelada; Fernando Azpiroz; Petia Radeva
Title Intestinal event segmentation for endoluminal video analysis Type Conference Article
Year 2014 Publication 21st IEEE International Conference on Image Processing Abbreviated Journal
Volume Issue Pages 3592-3596
Keywords
Abstract
Address Paris; France; October 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIP
Notes MILAB; OR;MV Approved no
Call Number Admin @ si @ DVS2014 Serial 2565
 

 
Author Juan A. Carvajal Ayala; Dennis Romero; Angel Sappa
Title Fine-tuning based deep convolutional networks for lepidopterous genus recognition Type Conference Article
Year 2016 Publication 21st Ibero American Congress on Pattern Recognition Abbreviated Journal
Volume Issue Pages 467-475
Keywords
Abstract This paper describes an image classification approach oriented to identify specimens of lepidopterous insects at Ecuadorian ecological reserves. This work seeks to contribute to studies in the area of biology about genera of butterflies and also to facilitate the registration of unrecognized specimens. The proposed approach is based on the fine-tuning of three widely used pre-trained Convolutional Neural Networks (CNNs). This strategy is intended to overcome the reduced number of labeled images. Experimental results with a dataset labeled by expert biologists are presented, reaching a recognition accuracy above 92%.
Address Lima; Peru; November 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CIARP
Notes ADAS; 600.086 Approved no
Call Number Admin @ si @ CRS2016 Serial 2913
 

 
Author H. Emrah Tasli; Jan van Gemert; Theo Gevers
Title Spot the differences: from a photograph burst to the single best picture Type Conference Article
Year 2013 Publication 21st ACM International Conference on Multimedia Abbreviated Journal
Volume Issue Pages 729-732
Keywords
Abstract With the rise of the digital camera, people nowadays typically take several near-identical photos of the same scene to maximize the chances of a good shot. This paper proposes a user-friendly tool for exploring a personal photo gallery to select, or even create, the best shot of a scene from its multiple alternatives. This functionality is realized through a graphical user interface where the best viewpoint can be selected from a generated panorama of the scene. Once the viewpoint is selected, the user is able to explore possible alternatives coming from the other images. Using this tool, one can explore a photo gallery efficiently. Moreover, additional compositions from other images are also possible. With such additional compositions, one can go from a burst of photographs to the single best one. Even funny compositions of images, where you can duplicate a person in the same image, are possible with our proposed tool.
Address Barcelona
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ACM-MM
Notes ALTRES;ISE Approved no
Call Number TGG2013 Serial 2368
 

 
Author Sezer Karaoglu; Jan van Gemert; Theo Gevers
Title Con-text: text detection using background connectivity for fine-grained object classification Type Conference Article
Year 2013 Publication 21st ACM International Conference on Multimedia Abbreviated Journal
Volume Issue Pages 757-760
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ACM-MM
Notes ALTRES;ISE Approved no
Call Number Admin @ si @ KGG2013 Serial 2369
 

 
Author Fernando Vilariño; Dan Norton; Onur Ferhat
Title Memory Fields: DJs in the Library Type Conference Article
Year 2015 Publication 21st Symposium of Electronic Arts Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Vancouver; Canada; August 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ISEA
Notes ;SIAI Approved no
Call Number Admin @ si @ VNF2015 Serial 2800
 

 
Author Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Jorma Laaksonen
Title Top-Down Deep Appearance Attention for Action Recognition Type Conference Article
Year 2017 Publication 20th Scandinavian Conference on Image Analysis Abbreviated Journal
Volume 10269 Issue Pages 297-309
Keywords Action recognition; CNNs; Feature fusion
Abstract Recognizing human actions in videos is a challenging problem in computer vision. Recently, convolutional neural network based deep features have shown promising results for action recognition. In this paper, we investigate the problem of fusing deep appearance and motion cues for action recognition. We propose a video representation which combines deep appearance and motion based local convolutional features within the bag-of-deep-features framework. Firstly, dense deep appearance and motion based local convolutional features are extracted from spatial (RGB) and temporal (flow) networks, respectively. Both visual cues are processed in parallel by constructing separate visual vocabularies for appearance and motion. A category-specific appearance map is then learned to modulate the weights of the deep motion features. The proposed representation is discriminative and binds the deep local convolutional features to their spatial locations. Experiments are performed on two challenging datasets: the JHMDB dataset with 21 action classes and the ACT dataset with 43 categories. The results clearly demonstrate that our approach outperforms both standard approaches of early and late feature fusion. Further, our approach employs only action labels, without exploiting body part information, yet achieves competitive performance compared to the state-of-the-art deep features based approaches.
Address Tromso; June 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference SCIA
Notes LAMP; 600.109; 600.068; 600.120 Approved no
Call Number Admin @ si @ RKW2017b Serial 3039
 

 
Author Carles Sanchez; Miguel Viñas; Coen Antens; Agnes Borras; Debora Gil
Title Back to Front Architecture for Diagnosis as a Service Type Conference Article
Year 2018 Publication 20th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing Abbreviated Journal
Volume Issue Pages 343-346
Keywords
Abstract Software as a Service (SaaS) is a cloud computing model in which a provider hosts applications on a server that customers use via the internet. Since SaaS does not require installing applications on customers' own computers, it allows multiple users to use highly specialized software without extra expenses for hardware acquisition or licensing. A SaaS tailored for clinical needs would not only alleviate licensing costs, but would also facilitate easy access to new methods for diagnosis assistance. This paper presents a SaaS client-server architecture for Diagnosis as a Service (DaaS). The server is based on Docker technology in order to allow execution of software implemented in different languages with the highest portability and scalability. The client is a content management system allowing the design of websites with multimedia content and interactive visualization of results allowing user editing. We explain a use case that employs our DaaS as a crowdsourcing platform in a multicentric pilot study carried out to evaluate the clinical benefits of software for the assessment of central airway obstruction.
Address Timisoara; Romania; September 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference SYNASC
Notes IAM; 600.145 Approved no
Call Number Admin @ si @ SVA2018 Serial 3360