Records |
Author |
Lluis Gomez; Anguelos Nicolaou; Marçal Rusiñol; Dimosthenis Karatzas |
Title |
12 years of ICDAR Robust Reading Competitions: The evolution of reading systems for unconstrained text understanding |
Type |
Book Chapter |
Year |
2020 |
Publication |
Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
|
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
Springer |
Place of Publication |
|
Editor |
K. Alahari; C.V. Jawahar |
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
Series on Advances in Computer Vision and Pattern Recognition |
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG; 600.121 |
Approved |
no |
Call Number |
GNR2020 |
Serial |
3494 |
Permanent link to this record |
|
|
|
Author |
Mohamed Ali Souibgui; Y.Kessentini; Alicia Fornes |
Title |
A conditional GAN based approach for distorted camera captured documents recovery |
Type |
Conference Article |
Year |
2020 |
Publication |
4th Mediterranean Conference on Pattern Recognition and Artificial Intelligence |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
|
Address |
Virtual; December 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
MedPRAI |
Notes |
DAG; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ SKF2020 |
Serial |
3450 |
Permanent link to this record |
|
|
|
Author |
Oriol Ramos Terrades; Albert Berenguel; Debora Gil |
Title |
A flexible outlier detector based on a topology given by graph communities |
Type |
Miscellaneous |
Year |
2020 |
Publication |
Arxiv |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
Outlier, or anomaly, detection is essential for optimal performance of machine learning methods and statistical predictive models. It is not just a technical step in a data cleaning process but a key topic in many fields such as fraudulent document detection, in medical applications and assisted diagnosis systems or detecting security threats. In contrast to population-based methods, neighborhood based local approaches are simple flexible methods that have the potential to perform well in small sample size unbalanced problems. However, a main concern of local approaches is the impact that the computation of each sample neighborhood has on the method performance. Most approaches use a distance in the feature space to define a single neighborhood that requires careful selection of several parameters. This work presents a local approach based on a local measure of the heterogeneity of sample labels in the feature space considered as a topological manifold. Topology is computed using the communities of a weighted graph codifying mutual nearest neighbors in the feature space. This way, we provide with a set of multiple neighborhoods able to describe the structure of complex spaces without parameter fine tuning. The extensive experiments on real-world data sets show that our approach overall outperforms, both, local and global strategies in multi and single view settings. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
IAM; DAG; 600.139; 600.145; 600.140; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ RBG2020 |
Serial |
3475 |
Permanent link to this record |
|
|
|
Author |
Debora Gil; Guillermo Torres |
Title |
A multi-shape loss function with adaptive class balancing for the segmentation of lung structures |
Type |
Conference Article |
Year |
2020 |
Publication |
34th International Congress and Exhibition on Computer Assisted Radiology & Surgery |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
|
Address |
Virtual; June 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
CARS |
Notes |
IAM; 600.139; 600.145 |
Approved |
no |
Call Number |
Admin @ si @ GiT2020 |
Serial |
3472 |
Permanent link to this record |
|
|
|
Author |
Guillermo Torres; Debora Gil |
Title |
A multi-shape loss function with adaptive class balancing for the segmentation of lung structures |
Type |
Journal Article |
Year |
2020 |
Publication |
International Journal of Computer Assisted Radiology and Surgery |
Abbreviated Journal |
IJCAR |
Volume |
15 |
Issue |
1 |
Pages |
S154-55 |
Keywords |
|
Abstract |
|
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
IAM |
Approved |
no |
Call Number |
Admin @ si @ ToG2020 |
Serial |
3590 |
Permanent link to this record |
|
|
|
Author |
Manuel Carbonell; Alicia Fornes; Mauricio Villegas; Josep Llados |
Title |
A Neural Model for Text Localization, Transcription and Named Entity Recognition in Full Pages |
Type |
Journal Article |
Year |
2020 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
Volume |
136 |
Issue |
|
Pages |
219-227 |
Keywords |
|
Abstract |
In the last years, the consolidation of deep neural network architectures for information extraction in document images has brought big improvements in the performance of each of the tasks involved in this process, consisting of text localization, transcription, and named entity recognition. However, this process is traditionally performed with separate methods for each task. In this work we propose an end-to-end model that combines a one stage object detection network with branches for the recognition of text and named entities respectively in a way that shared features can be learned simultaneously from the training error of each of the tasks. By doing so the model jointly performs handwritten text detection, transcription, and named entity recognition at page level with a single feed forward step. We exhaustively evaluate our approach on different datasets, discussing its advantages and limitations compared to sequential approaches. The results show that the model is capable of benefiting from shared features by simultaneously solving interdependent tasks. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
DAG; 600.140; 601.311; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ CFV2020 |
Serial |
3451 |
Permanent link to this record |
|
|
|
Author |
Jialuo Chen; M.A.Souibgui; Alicia Fornes; Beata Megyesi |
Title |
A Web-based Interactive Transcription Tool for Encrypted Manuscripts |
Type |
Conference Article |
Year |
2020 |
Publication |
3rd International Conference on Historical Cryptology |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
52-59 |
Keywords |
|
Abstract |
Manual transcription of handwritten text is a time consuming task. In the case of encrypted manuscripts, the recognition is even more complex due to the huge variety of alphabets and symbol sets. To speed up and ease this process, we present a web-based tool aimed to (semi)-automatically transcribe the encrypted sources. The user uploads one or several images of the desired encrypted document(s) as input, and the system returns the transcription(s). This process is carried out in an interactive fashion with
the user to obtain more accurate results. For discovering and testing, the developed web tool is freely available. |
Address |
Virtual; June 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
HistoCrypt |
Notes |
DAG; 600.140; 602.230; 600.121 |
Approved |
no |
Call Number |
Admin @ si @ CSF2020 |
Serial |
3447 |
Permanent link to this record |
|
|
|
Author |
Yi Xiao; Felipe Codevilla; Christopher Pal; Antonio Lopez |
Title |
Action-Based Representation Learning for Autonomous Driving |
Type |
Conference Article |
Year |
2020 |
Publication |
Conference on Robot Learning |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
Human drivers produce a vast amount of data which could, in principle, be used to improve autonomous driving systems. Unfortunately, seemingly straightforward approaches for creating end-to-end driving models that map sensor data directly into driving actions are problematic in terms of interpretability, and typically have significant difficulty dealing with spurious correlations. Alternatively, we propose to use this kind of action-based driving data for learning representations. Our experiments show that an affordance-based driving model pre-trained with this approach can leverage a relatively small amount of weakly annotated imagery and outperform pure end-to-end driving models, while being more interpretable. Further, we demonstrate how this strategy outperforms previous methods based on learning inverse dynamics models as well as other methods based on heavy human supervision (ImageNet). |
Address |
virtual; November 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
CORL |
Notes |
ADAS; 600.118 |
Approved |
no |
Call Number |
Admin @ si @ XCP2020 |
Serial |
3487 |
Permanent link to this record |
|
|
|
Author |
Alejandro Cartas; Petia Radeva; Mariella Dimiccoli |
Title |
Activities of Daily Living Monitoring via a Wearable Camera: Toward Real-World Applications |
Type |
Journal Article |
Year |
2020 |
Publication |
IEEE Access |
Abbreviated Journal |
ACCESS |
Volume |
8 |
Issue |
|
Pages |
77344 - 77363 |
Keywords |
|
Abstract |
Activity recognition from wearable photo-cameras is crucial for lifestyle characterization and health monitoring. However, to enable its wide-spreading use in real-world applications, a high level of generalization needs to be ensured on unseen users. Currently, state-of-the-art methods have been tested only on relatively small datasets consisting of data collected by a few users that are partially seen during training. In this paper, we built a new egocentric dataset acquired by 15 people through a wearable photo-camera and used it to test the generalization capabilities of several state-of-the-art methods for egocentric activity recognition on unseen users and daily image sequences. In addition, we propose several variants to state-of-the-art deep learning architectures, and we show that it is possible to achieve 79.87% accuracy on users unseen during training. Furthermore, to show that the proposed dataset and approach can be useful in real-world applications, where data can be acquired by different wearable cameras and labeled data are scarcely available, we employed a domain adaptation strategy on two egocentric activity recognition benchmark datasets. These experiments show that the model learned with our dataset, can easily be transferred to other domains with a very small amount of labeled data. Taken together, those results show that activity recognition from wearable photo-cameras is mature enough to be tested in real-world applications. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MILAB; no proj |
Approved |
no |
Call Number |
Admin @ si @ CRD2020 |
Serial |
3436 |
Permanent link to this record |
|
|
|
Author |
Raquel Justo; Leila Ben Letaifa; Cristina Palmero; Eduardo Gonzalez-Fraile; Anna Torp Johansen; Alain Vazquez; Gennaro Cordasco; Stephan Schlogl; Begoña Fernandez-Ruanova; Micaela Silva; Sergio Escalera; Mikel de Velasco; Joffre Tenorio-Laranga; Anna Esposito; Maria Korsnes; M. Ines Torres |
Title |
Analysis of the Interaction between Elderly People and a Simulated Virtual Coach, Journal of Ambient Intelligence and Humanized Computing |
Type |
Journal Article |
Year |
2020 |
Publication |
Journal of Ambient Intelligence and Humanized Computing |
Abbreviated Journal |
AIHC |
Volume |
11 |
Issue |
12 |
Pages |
6125-6140 |
Keywords |
|
Abstract |
The EMPATHIC project develops and validates new interaction paradigms for personalized virtual coaches (VC) to promote healthy and independent aging. To this end, the work presented in this paper is aimed to analyze the interaction between the EMPATHIC-VC and the users. One of the goals of the project is to ensure an end-user driven design, involving senior users from the beginning and during each phase of the project. Thus, the paper focuses on some sessions where the seniors carried out interactions with a Wizard of Oz driven, simulated system. A coaching strategy based on the GROW model was used throughout these sessions so as to guide interactions and engage the elderly with the goals of the project. In this interaction framework, both the human and the system behavior were analyzed. The way the wizard implements the GROW coaching strategy is a key aspect of the system behavior during the interaction. The language used by the virtual agent as well as his or her physical aspect are also important cues that were analyzed. Regarding the user behavior, the vocal communication provides information about the speaker’s emotional status, that is closely related to human behavior and which can be extracted from the speech and language analysis. In the same way, the analysis of the facial expression, gazes and gestures can provide information on the non verbal human communication even when the user is not talking. In addition, in order to engage senior users, their preferences and likes had to be considered. To this end, the effect of the VC on the users was gathered by means of direct questionnaires. These analyses have shown a positive and calm behavior of users when interacting with the simulated virtual coach as well as some difficulties of the system to develop the proposed coaching strategy. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
HuPBA; no proj |
Approved |
no |
Call Number |
Admin @ si @ JLP2020 |
Serial |
3443 |
Permanent link to this record |
|
|
|
Author |
Reza Azad; Maryam Asadi-Aghbolaghi; Mahmood Fathy; Sergio Escalera |
Title |
Attention Deeplabv3+: Multi-level Context Attention Mechanism for Skin Lesion Segmentation |
Type |
Conference Article |
Year |
2020 |
Publication |
Bioimage computation workshop |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
|
Address |
Virtual; August 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ECCVW |
Notes |
HUPBA |
Approved |
no |
Call Number |
Admin @ si @ AAF2020 |
Serial |
3520 |
Permanent link to this record |
|
|
|
Author |
Mariona Caros; Maite Garolera; Petia Radeva; Xavier Giro |
Title |
Automatic Reminiscence Therapy for Dementia |
Type |
Conference Article |
Year |
2020 |
Publication |
10th ACM International Conference on Multimedia Retrieval |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
383-387 |
Keywords |
|
Abstract |
With people living longer than ever, the number of cases with dementia such as Alzheimer's disease increases steadily. It affects more than 46 million people worldwide, and it is estimated that in 2050 more than 100 million will be affected. While there are not effective treatments for these terminal diseases, therapies such as reminiscence, that stimulate memories from the past are recommended. Currently, reminiscence therapy takes place in care homes and is guided by a therapist or a carer. In this work, we present an AI-based solution to automatize the reminiscence therapy, which consists in a dialogue system that uses photos as input to generate questions. We run a usability case study with patients diagnosed of mild cognitive impairment that shows they found the system very entertaining and challenging. Overall, this paper presents how reminiscence therapy can be automatized by using machine learning, and deployed to smartphones and laptops, making the therapy more accessible to every person affected by dementia. |
Address |
Virtual; October 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ICRM |
Notes |
|
Approved |
no |
Call Number |
Admin @ si @ CGR2020 |
Serial |
3529 |
Permanent link to this record |
|
|
|
Author |
Khalid El Asnaoui; Petia Radeva |
Title |
Automatically Assess Day Similarity Using Visual Lifelogs |
Type |
Journal Article |
Year |
2020 |
Publication |
International Journal of Intelligent Systems |
Abbreviated Journal |
IJIS |
Volume |
29 |
Issue |
|
Pages |
298–310 |
Keywords |
|
Abstract |
Today, we witness the appearance of many lifelogging cameras that are able to capture the life of a person wearing the camera and which produce a large number of images everyday. Automatically characterizing the experience and extracting patterns of behavior of individuals from this huge collection of unlabeled and unstructured egocentric data present major challenges and require novel and efficient algorithmic solutions. The main goal of this work is to propose a new method to automatically assess day similarity from the lifelogging images of a person. We propose a technique to measure the similarity between images based on the Swain’s distance and generalize it to detect the similarity between daily visual data. To this purpose, we apply the dynamic time warping (DTW) combined with the Swain’s distance for final day similarity estimation. For validation, we apply our technique on the Egocentric Dataset of University of Barcelona (EDUB) of 4912 daily images acquired by four persons with preliminary encouraging results. |
Address |
|
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
|
Notes |
MILAB; no proj |
Approved |
no |
Call Number |
AsR2020 |
Serial |
3409 |
Permanent link to this record |
|
|
|
Author |
Martin Menchon; Estefania Talavera; Jose M. Massa; Petia Radeva |
Title |
Behavioural Pattern Discovery from Collections of Egocentric Photo-Streams |
Type |
Conference Article |
Year |
2020 |
Publication |
ECCV Workshops |
Abbreviated Journal |
|
Volume |
12538 |
Issue |
|
Pages |
469-484 |
Keywords |
|
Abstract |
The automatic discovery of behaviour is of high importance when aiming to assess and improve the quality of life of people. Egocentric images offer a rich and objective description of the daily life of the camera wearer. This work proposes a new method to identify a person’s patterns of behaviour from collected egocentric photo-streams. Our model characterizes time-frames based on the context (place, activities and environment objects) that define the images composition. Based on the similarity among the time-frames that describe the collected days for a user, we propose a new unsupervised greedy method to discover the behavioural pattern set based on a novel semantic clustering approach. Moreover, we present a new score metric to evaluate the performance of the proposed algorithm. We validate our method on 104 days and more than 100k images extracted from 7 users. Results show that behavioural patterns can be discovered to characterize the routine of individuals and consequently their lifestyle. |
Address |
Virtual; August 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ECCVW |
Notes |
MILAB; no proj |
Approved |
no |
Call Number |
Admin @ si @ MTM2020 |
Serial |
3528 |
Permanent link to this record |
|
|
|
Author |
Kai Wang; Luis Herranz; Anjan Dutta; Joost Van de Weijer |
Title |
Bookworm continual learning: beyond zero-shot learning and continual learning |
Type |
Conference Article |
Year |
2020 |
Publication |
Workshop TASK-CV 2020 |
Abbreviated Journal |
|
Volume |
|
Issue |
|
Pages |
|
Keywords |
|
Abstract |
We propose bookworm continual learning(BCL), a flexible setting where unseen classes can be inferred via a semantic model, and the visual model can be updated continually. Thus BCL generalizes both continual learning (CL) and zero-shot learning (ZSL). We also propose the bidirectional imagination (BImag) framework to address BCL where features of both past and future classes are generated. We observe that conditioning the feature generator on attributes can actually harm the continual learning ability, and propose two variants (joint class-attribute conditioning and asymmetric generation) to alleviate this problem. |
Address |
Virtual; August 2020 |
Corporate Author |
|
Thesis |
|
Publisher |
|
Place of Publication |
|
Editor |
|
Language |
|
Summary Language |
|
Original Title |
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
Series Volume |
|
Series Issue |
|
Edition |
|
ISSN |
|
ISBN |
|
Medium |
|
Area |
|
Expedition |
|
Conference |
ECCVW |
Notes |
LAMP; 600.141; 600.120 |
Approved |
no |
Call Number |
Admin @ si @ WHD2020 |
Serial |
3466 |
Permanent link to this record |