Records
Author Joana Maria Pujadas-Mora; Alicia Fornes; Josep Llados; Gabriel Brea-Martinez; Miquel Valls-Figols
Title The Baix Llobregat (BALL) Demographic Database, between Historical Demography and Computer Vision (nineteenth–twentieth centuries) Type Book Chapter
Year 2019 Publication Nominative Data in Demographic Research in the East and the West: monograph Abbreviated Journal
Volume Issue Pages 29-61
Keywords
Abstract The Baix Llobregat (BALL) Demographic Database is an ongoing database project containing individual census data from the Catalan region of Baix Llobregat (Spain) during the nineteenth and twentieth centuries. The BALL Database is built within the project ‘NETWORKS: Technology and citizen innovation for building historical social networks to understand the demographic past’ directed by Alícia Fornés from the Center for Computer Vision and Joana Maria Pujadas-Mora from the Center for Demographic Studies, both at the Universitat Autònoma de Barcelona, funded by the Recercaixa program (2017–2019).
Its webpage is http://dag.cvc.uab.es/xarxes/. The aim of the project is to develop technologies facilitating massive digitalization of demographic sources, and more specifically the padrones (local censuses), in order to reconstruct historical ‘social’ networks employing computer vision technology. Such virtual networks can be created thanks to the linkage of nominative records compiled in the local censuses across time and space. Thus, digitized versions of individual and family lifespans are established, and individuals and families can be located spatially.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-5-7996-2656-3 Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ PFL2019 Serial 3351
 

 
Author Jialuo Chen; M.A.Souibgui; Alicia Fornes; Beata Megyesi
Title A Web-based Interactive Transcription Tool for Encrypted Manuscripts Type Conference Article
Year 2020 Publication 3rd International Conference on Historical Cryptology Abbreviated Journal
Volume Issue Pages 52-59
Keywords
Abstract Manual transcription of handwritten text is a time-consuming task. In the case of encrypted manuscripts, recognition is even more complex due to the huge variety of alphabets and symbol sets. To speed up and ease this process, we present a web-based tool that aims to (semi-)automatically transcribe encrypted sources. The user uploads one or several images of the desired encrypted document(s) as input, and the system returns the transcription(s). This process is carried out interactively with the user to obtain more accurate results. For discovery and testing, the developed web tool is freely available.
Address Virtual; June 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference HistoCrypt
Notes DAG; 600.140; 602.230; 600.121 Approved no
Call Number Admin @ si @ CSF2020 Serial 3447
 

 
Author Veronica Romero; Emilio Granell; Alicia Fornes; Enrique Vidal; Joan Andreu Sanchez
Title Information Extraction in Handwritten Marriage Licenses Books Type Conference Article
Year 2019 Publication 5th International Workshop on Historical Document Imaging and Processing Abbreviated Journal
Volume Issue Pages 66-71
Keywords
Abstract Handwritten marriage licenses books are characterized by a simple text structure in the records and an evolving vocabulary, mainly composed of proper names that change over time. This distinct vocabulary makes automatic transcription and semantic information extraction difficult tasks. Previous works have shown that the use of category-based language models and a Grammatical Inference technique known as MGGI can improve the accuracy of these tasks. However, applying the MGGI algorithm requires a priori knowledge to label the words of the training strings, which is not always easy to obtain. In this paper we study how to automatically obtain the information required by the MGGI algorithm using a technique based on Confusion Networks. Using the resulting language model, full handwritten text recognition and information extraction experiments have been carried out, with results supporting the proposed approach.
Address Sydney; Australia; September 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference HIP
Notes DAG; 600.140; 600.121 Approved no
Call Number Admin @ si @ RGF2019 Serial 3352
 

 
Author Pau Riba; Anjan Dutta; Lutz Goldmann; Alicia Fornes; Oriol Ramos Terrades; Josep Llados
Title Table Detection in Invoice Documents by Graph Neural Networks Type Conference Article
Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages 122-127
Keywords
Abstract Tabular structures in documents offer a complementary dimension to the raw textual data, representing logical or quantitative relationships among pieces of information. In digital mail room applications, where a large number of administrative documents must be processed with reasonable accuracy, the detection and interpretation of tables is crucial. Table recognition has gained interest in document image analysis, in particular in unconstrained formats (absence of rule lines, unknown information of rows and columns). In this work, we propose a graph-based approach for detecting tables in document images. Instead of using the raw content (recognized text), we make use of the location, context and content type; it is thus purely a structure perception approach, not dependent on the language or the quality of the text reading. Our framework makes use of Graph Neural Networks (GNNs) in order to describe the local repetitive structural information of tables in invoice documents. Our proposed model has been experimentally validated on two invoice datasets and achieved encouraging results. Additionally, due to the scarcity of benchmark datasets for this task, we have contributed to the community a novel dataset derived from the RVL-CDIP invoice data. It will be publicly released to facilitate future research.
Address Sydney; Australia; September 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.140; 601.302; 602.167; 600.121; 600.141 Approved no
Call Number Admin @ si @ RDG2019 Serial 3355
 

 
Author Debora Gil; Carles Sanchez; Agnes Borras; Marta Diez-Ferrer; Antoni Rosell
Title Segmentation of Distal Airways using Structural Analysis Type Journal Article
Year 2019 Publication PLoS ONE Abbreviated Journal Plos
Volume 14 Issue 12 Pages
Keywords
Abstract Segmentation of airways in Computed Tomography (CT) scans is a must for accurate support of diagnosis and intervention in many pulmonary disorders. In particular, lung cancer diagnosis would benefit from segmentations reaching the most distal airways. We present a method that combines descriptors of bronchi local appearance and graph global structural analysis to fine-tune thresholds on the descriptors adapted for each bronchial level. We have compared our method to the top performers of the EXACT09 challenge and to commercial software for biopsy planning, evaluated on our own database of high-resolution CT scans acquired under different breathing conditions. Results on EXACT09 data show that our method provides a high leakage reduction with minimum loss in airway detection. Results on our database show reliability across varying breathing conditions and competitive performance for biopsy planning compared to a commercial solution.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM; 600.139; 600.145 Approved no
Call Number Admin @ si @ GSB2019 Serial 3357
 

 
Author Marta Ligero; Guillermo Torres; Carles Sanchez; Katerine Diaz; Raquel Perez; Debora Gil
Title Selection of Radiomics Features based on their Reproducibility Type Conference Article
Year 2019 Publication 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society Abbreviated Journal
Volume Issue Pages 403-408
Keywords
Abstract Dimensionality reduction is key to alleviating machine learning artifacts in clinical applications with Small Sample Size (SSS) unbalanced datasets. Existing methods rely on either the probabilistic distribution of training data or the discriminant power of the reduced space, disregarding the impact of repeatability and uncertainty in features. The present study proposes using the reproducibility of radiomics features to select features with a high intra-class correlation coefficient (ICC). Reproducibility covers the variability introduced during image acquisition (such as scan acquisition parameters and convolution kernels), which affects intensity-based features, and the variability in tumor annotations made by physicians, which influences morphological descriptors of the lesion. To assess the reproducibility of radiomics features, three studies were conducted on cases collected at Vall Hebron Oncology Institute (VHIO) on responders to oncology treatment. The studies focused on the variability due to the convolution kernel, image acquisition parameters, and inter-observer lesion identification. The features selected were those with an ICC higher than 0.7 in all three studies. The features selected based on reproducibility were evaluated for lesion malignancy classification using a different database. Results show better performance compared to several state-of-the-art methods, including Principal Component Analysis (PCA), Kernel Discriminant Analysis via QR decomposition (KDAQR), LASSO, and a custom-built Convolutional Neural Network.
Address Berlin; Germany; July 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference EMBC
Notes IAM; 600.139; 600.145 Approved no
Call Number Admin @ si @ LTS2019 Serial 3358
 

 
Author Carles Sanchez; Miguel Viñas; Coen Antens; Agnes Borras; Debora Gil
Title Back to Front Architecture for Diagnosis as a Service Type Conference Article
Year 2018 Publication 20th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing Abbreviated Journal
Volume Issue Pages 343-346
Keywords
Abstract Software as a Service (SaaS) is a cloud computing model in which a provider hosts applications on a server that customers use via the internet. Since SaaS does not require installing applications on customers' own computers, it allows multiple users to use highly specialized software without extra expenses for hardware acquisition or licensing. A SaaS tailored for clinical needs would not only alleviate licensing costs, but also facilitate easy access to new methods for diagnosis assistance. This paper presents a SaaS client-server architecture for Diagnosis as a Service (DaaS). The server is based on docker technology in order to allow execution of software implemented in different languages with the highest portability and scalability. The client is a content management system allowing the design of websites with multimedia content and interactive visualization of results allowing user editing. We explain a use case that employs our DaaS as a crowdsourcing platform in a multicentric pilot study carried out to evaluate the clinical benefits of software for the assessment of central airway obstruction.
Address Timisoara; Romania; September 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference SYNASC
Notes IAM; 600.145 Approved no
Call Number Admin @ si @ SVA2018 Serial 3360
 

 
Author Debora Gil; Antoni Rosell
Title Advances in Artificial Intelligence – How Lung Cancer CT Screening Will Progress? Type Abstract
Year 2019 Publication World Lung Cancer Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Invited speaker
Address Barcelona; September 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IASLC WCLC
Notes IAM; 600.139; 600.145 Approved no
Call Number Admin @ si @ GiR2019 Serial 3361
 

 
Author Rada Deeb; Joost Van de Weijer; Damien Muselet; Mathieu Hebert; Alain Tremeau
Title Deep spectral reflectance and illuminant estimation from self-interreflections Type Journal Article
Year 2019 Publication Journal of the Optical Society of America A Abbreviated Journal JOSA A
Volume 31 Issue 1 Pages 105-114
Keywords
Abstract In this work, we propose a convolutional neural network based approach to estimate the spectral reflectance of a surface and spectral power distribution of light from a single RGB image of a V-shaped surface. Interreflections happening in a concave surface lead to gradients of RGB values over its area. These gradients carry a lot of information concerning the physical properties of the surface and the illuminant. Our network is trained with only simulated data constructed using a physics-based interreflection model. Coupling interreflection effects with deep learning helps to retrieve the spectral reflectance under an unknown light and to estimate spectral power distribution of this light as well. In addition, it is more robust to the presence of image noise than classical approaches. Our results show that the proposed approach outperforms state-of-the-art learning-based approaches on simulated data. In addition, it gives better results on real data compared to other interreflection-based approaches.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ DWM2019 Serial 3362
 

 
Author Yaxing Wang; Abel Gonzalez-Garcia; Joost Van de Weijer; Luis Herranz
Title SDIT: Scalable and Diverse Cross-domain Image Translation Type Conference Article
Year 2019 Publication 27th ACM International Conference on Multimedia Abbreviated Journal
Volume Issue Pages 1267–1276
Keywords
Abstract Recently, image-to-image translation research has witnessed remarkable progress. Although current approaches successfully generate diverse outputs or perform scalable image transfer, these properties have not been combined into a single method. To address this limitation, we propose SDIT: Scalable and Diverse image-to-image translation. These properties are combined into a single generator. The diversity is determined by a latent variable which is randomly sampled from a normal distribution. The scalability is obtained by conditioning the network on the domain attributes. We also exploit an attention mechanism that permits the generator to focus on the domain-specific attribute. We empirically demonstrate the performance of the proposed method on face mapping and other datasets beyond faces.
Address Nice; France; October 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ACM-MM
Notes LAMP; 600.106; 600.109; 600.141; 600.120 Approved no
Call Number Admin @ si @ WGW2019 Serial 3363
 

 
Author Arka Ujjal Dey; Suman Ghosh; Ernest Valveny; Gaurav Harit
Title Beyond Visual Semantics: Exploring the Role of Scene Text in Image Understanding Type Journal Article
Year 2021 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 149 Issue Pages 164-171
Keywords
Abstract Images with visual and scene text content are ubiquitous in everyday life. However, current image interpretation systems are mostly limited to using only the visual features, neglecting to leverage the scene text content. In this paper, we propose to jointly use scene text and visual channels for robust semantic interpretation of images. We not only extract and encode visual and scene text cues, but also model their interplay to generate a contextual joint embedding with richer semantics. The contextual embedding thus generated is applied to retrieval and classification tasks on multimedia images with scene text content to demonstrate its effectiveness. In the retrieval framework, we augment our learned text-visual semantic representation with scene text cues, to mitigate vocabulary misses that may have occurred during the semantic embedding. To deal with irrelevant or erroneous recognition of scene text, we also apply query-based attention to our text channel. We show how the multi-channel approach, involving visual semantics and scene text, improves upon the state of the art.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ DGV2021 Serial 3364
 

 
Author Mohammed Al Rawi; Ernest Valveny
Title Compact and Efficient Multitask Learning in Vision, Language and Speech Type Conference Article
Year 2019 Publication IEEE International Conference on Computer Vision Workshops Abbreviated Journal
Volume Issue Pages 2933-2942
Keywords
Abstract Across-domain multitask learning is a challenging area of computer vision and machine learning due to the intra-similarities among class distributions. Addressing this problem to cope with the human cognition system by considering inter and intra-class categorization and recognition complicates the problem even further. We propose in this work an effective holistic and hierarchical learning approach that uses a text embedding layer on top of a deep learning model. We also propose a novel sensory discriminator approach to resolve the collisions between different tasks and domains. We then train the model concurrently on textual sentiment analysis, speech recognition, image classification, action recognition from video, and handwriting word spotting of two different scripts (Arabic and English). The model we propose successfully learned different tasks across multiple domains.
Address Seoul; Korea; October 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCVW
Notes DAG; 600.121; 600.129 Approved no
Call Number Admin @ si @ RaV2019 Serial 3365
 

 
Author Md.Mostafa Kamal Sarker; Syeda Furruka Banu; Hatem A. Rashwan; Mohamed Abdel-Nasser; Vivek Kumar Singh; Sylvie Chambon; Petia Radeva; Domenec Puig
Title Food Places Classification in Egocentric Images Using Siamese Neural Networks Type Conference Article
Year 2019 Publication 22nd International Conference of the Catalan Association of Artificial Intelligence Abbreviated Journal
Volume Issue Pages 145-151
Keywords
Abstract Wearable cameras have become more popular in recent years for capturing unscripted first-person moments that help to analyze the user's lifestyle. In this work, we aim to recognize food-related places in egocentric images captured during a day in order to identify the user's daily food patterns. Such a system can help users improve their eating behavior and protect them against food-related diseases. In this paper, we use Siamese Neural Networks to learn the similarity between images from corresponding inputs for one-shot food places classification. We tested our proposed method on “MiniEgoFoodPlaces”, which contains 15 food-related places. The proposed Siamese Neural Networks model with MobileNet achieved an overall classification accuracy of 76.74% and 77.53% on the validation and test sets of the “MiniEgoFoodPlaces” dataset, respectively, outperforming base models such as ResNet50, InceptionV3, and InceptionResNetV2.
Address Illes Balears; October 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CCIA
Notes MILAB; no proj Approved no
Call Number Admin @ si @ SBR2019 Serial 3368
 

 
Author Eduardo Aguilar; Petia Radeva
Title Food Recognition by Integrating Local and Flat Classifiers Type Conference Article
Year 2019 Publication 9th Iberian Conference on Pattern Recognition and Image Analysis Abbreviated Journal
Volume 11867 Issue Pages 65-74
Keywords
Abstract Food image recognition is an interesting research topic, notable for its applicability to creating nutritional diaries aimed at improving the quality of life of people with a chronic disease (e.g. diabetes, heart disease) or prone to acquiring one (e.g. people who are overweight or obese). For a food recognition system to be useful in real applications, it must recognize a huge number of different foods. We argue that for very large scale classification, a traditional flat classifier is not enough to achieve an acceptable result. To address this, we propose a method that performs prediction either with local classifiers, based on a class hierarchy, or with a flat classifier. We decide which approach to use depending on the analysis of both the Epistemic Uncertainty obtained for the image in the children classifiers and the prediction of the parent classifier. When our criterion is met, the final prediction is obtained with the respective local classifier; otherwise, with the flat classifier. The results show that the proposed method improves the classification performance compared to the use of a single flat classifier.
Address Madrid; July 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IbPRIA
Notes MILAB; no proj Approved no
Call Number Admin @ si @ AgR2019b Serial 3369
 

 
Author Emanuel Sanchez Aimar; Petia Radeva; Mariella Dimiccoli
Title Social Relation Recognition in Egocentric Photostreams Type Conference Article
Year 2019 Publication 26th International Conference on Image Processing Abbreviated Journal
Volume Issue Pages 3227-3231
Keywords
Abstract This paper proposes an approach to automatically categorize the social interactions of a user wearing a photo-camera (2fpm), relying solely on what the camera is seeing. The problem is challenging due to the overwhelming complexity of social life and the extreme intra-class variability of social interactions captured under unconstrained conditions. We adopt the formalization proposed in Bugental's social theory, which groups human relations into five social domains with related categories. Our method is a new deep learning architecture that exploits the hierarchical structure of the label space and relies on a set of social attributes estimated at frame level to provide a semantic representation of social interactions. Experimental results on the new EgoSocialRelation dataset demonstrate the effectiveness of our proposal.
Address Taipei; Taiwan; September 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIP
Notes MILAB; not mentioned Approved no
Call Number Admin @ si @ SRD2019 Serial 3370