toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Manisha Das; Deep Gupta; Petia Radeva; Ashwini M. Bakde edit  url
doi  openurl
  Title Optimized CT-MR neurological image fusion framework using biologically inspired spiking neural model in hybrid ℓ1 - ℓ0 layer decomposition domain Type Journal Article
  Year 2021 Publication Biomedical Signal Processing and Control Abbreviated Journal BSPC  
  Volume 68 Issue Pages 102535  
  Keywords  
  Abstract Medical image fusion plays an important role in the clinical diagnosis of several critical neurological diseases by merging complementary information available in multimodal images. In this paper, a novel CT-MR neurological image fusion framework is proposed using an optimized biologically inspired feedforward neural model in two-scale hybrid ℓ1 − ℓ0 decomposition domain using gray wolf optimization to preserve the structural as well as texture information present in source CT and MR images. Initially, the source images are subjected to two-scale ℓ1 − ℓ0 decomposition with optimized parameters, giving a scale-1 detail layer, a scale-2 detail layer and a scale-2 base layer. Two detail layers at scale-1 and 2 are fused using an optimized biologically inspired neural model and weighted average scheme based on local energy and modified spatial frequency to maximize the preservation of edges and local textures, respectively, while the scale-2 base layer gets fused using choose max rule to preserve the background information. To optimize the hyper-parameters of hybrid ℓ1 − ℓ0 decomposition and biologically inspired neural model, a fitness function is evaluated based on spatial frequency and edge index of the resultant fused image obtained by adding all the fused components. The fusion performance is analyzed by conducting extensive experiments on different CT-MR neurological images. Experimental results indicate that the proposed method provides better-fused images and outperforms the other state-of-the-art fusion methods in both visual and quantitative assessments.  
  Address  
  Corporate Author Thesis (down)  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ DGR2021b Serial 3636  
Permanent link to this record
 

 
Author Andreea Glavan; Alina Matei; Petia Radeva; Estefania Talavera edit  url
openurl 
  Title Does our social life influence our nutritional behaviour? Understanding nutritional habits from egocentric photo-streams Type Journal Article
  Year 2021 Publication Expert Systems with Applications Abbreviated Journal ESWA  
  Volume 171 Issue Pages 114506  
  Keywords  
  Abstract Nutrition and social interactions are both key aspects of the daily lives of humans. In this work, we propose a system to evaluate the influence of social interaction in the nutritional habits of a person from a first-person perspective. In order to detect the routine of an individual, we construct a nutritional behaviour pattern discovery model, which outputs routines over a number of days. Our method evaluates similarity of routines with respect to visited food-related scenes over the collected days, making use of Dynamic Time Warping, as well as considering social engagement and its correlation with food-related activities. The nutritional and social descriptors of the collected days are evaluated and encoded using an LSTM Autoencoder. Later, the obtained latent space is clustered to find similar days unaffected by outliers using the Isolation Forest method. Moreover, we introduce a new score metric to evaluate the performance of the proposed algorithm. We validate our method on 104 days and more than 100 k egocentric images gathered by 7 users. Several different visualizations are evaluated for the understanding of the findings. Our results demonstrate good performance and applicability of our proposed model for social-related nutritional behaviour understanding. At the end, relevant applications of the model are discussed by analysing the discovered routine of particular individuals.  
  Address  
  Corporate Author Thesis (down)  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ GMR2021 Serial 3634  
Permanent link to this record
 

 
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) edit  doi
isbn  openurl
  Title 16th International Conference, 2021, Proceedings, Part III Type Book Whole
  Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal  
  Volume 12823 Issue Pages  
  Keywords  
  Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
 
  Address Lausanne, Switzerland, September 5-10, 2021  
  Corporate Author Thesis (down)  
  Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-030-86333-3 Medium  
  Area Expedition Conference ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ Serial 3727  
Permanent link to this record
 

 
Author Josep Llados; Daniel Lopresti; Seiichi Uchida (eds) edit  doi
isbn  openurl
  Title 16th International Conference, 2021, Proceedings, Part IV Type Book Whole
  Year 2021 Publication Document Analysis and Recognition – ICDAR 2021 Abbreviated Journal  
  Volume 12824 Issue Pages  
  Keywords  
  Abstract This four-volume set of LNCS 12821, LNCS 12822, LNCS 12823 and LNCS 12824, constitutes the refereed proceedings of the 16th International Conference on Document Analysis and Recognition, ICDAR 2021, held in Lausanne, Switzerland in September 2021. The 182 full papers were carefully reviewed and selected from 340 submissions, and are presented with 13 competition reports.

The papers are organized into the following topical sections: document analysis for literature search, document summarization and translation, multimedia document analysis, mobile text recognition, document analysis for social good, indexing and retrieval of documents, physical and logical layout analysis, recognition of tables and formulas, and natural language processing (NLP) for document understanding.
 
  Address Lausanne, Switzerland, September 5-10, 2021  
  Corporate Author Thesis (down)  
  Publisher Springer Cham Place of Publication Editor Josep Llados; Daniel Lopresti; Seiichi Uchida  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-030-86336-4 Medium  
  Area Expedition Conference ICDAR  
  Notes DAG Approved no  
  Call Number Admin @ si @ Serial 3728  
Permanent link to this record
 

 
Author Mohamed Ali Souibgui; Alicia Fornes; Y.Kessentini; C.Tudor edit   pdf
doi  openurl
  Title A Few-shot Learning Approach for Historical Encoded Manuscript Recognition Type Conference Article
  Year 2021 Publication 25th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 5413-5420  
  Keywords  
  Abstract Encoded (or ciphered) manuscripts are a special type of historical documents that contain encrypted text. The automatic recognition of this kind of documents is challenging because: 1) the cipher alphabet changes from one document to another, 2) there is a lack of annotated corpus for training and 3) touching symbols make the symbol segmentation difficult and complex. To overcome these difficulties, we propose a novel method for handwritten ciphers recognition based on few-shot object detection. Our method first detects all symbols of a given alphabet in a line image, and then a decoding step maps the symbol similarity scores to the final sequence of transcribed symbols. By training on synthetic data, we show that the proposed architecture is able to recognize handwritten ciphers with unseen alphabets. In addition, if few labeled pages with the same alphabet are used for fine tuning, our method surpasses existing unsupervised and supervised HTR methods for ciphers recognition.  
  Address Virtual; January 2021  
  Corporate Author Thesis (down)  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; 600.121; 600.140 Approved no  
  Call Number Admin @ si @ SFK2021 Serial 3449  
Permanent link to this record
 

 
Author Yaxing Wang; Abel Gonzalez-Garcia; Luis Herranz; Joost Van de Weijer edit   pdf
url  openurl
  Title Controlling biases and diversity in diverse image-to-image translation Type Journal Article
  Year 2021 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU  
  Volume 202 Issue Pages 103082  
  Keywords  
  Abstract JCR 2019 Q2, IF=3.121
The task of unpaired image-to-image translation is highly challenging due to the lack of explicit cross-domain pairs of instances. We consider here diverse image translation (DIT), an even more challenging setting in which an image can have multiple plausible translations. This is normally achieved by explicitly disentangling content and style in the latent representation and sampling different styles codes while maintaining the image content. Despite the success of current DIT models, they are prone to suffer from bias. In this paper, we study the problem of bias in image-to-image translation. Biased datasets may add undesired changes (e.g. change gender or race in face images) to the output translations as a consequence of the particular underlying visual distribution in the target domain. In order to alleviate the effects of this problem we propose the use of semantic constraints that enforce the preservation of desired image properties. Our proposed model is a step towards unbiased diverse image-to-image translation (UDIT), and results in less unwanted changes in the translated images while still performing the wanted transformation. Experiments on several heavily biased datasets show the effectiveness of the proposed techniques in different domains such as faces, objects, and scenes.
 
  Address  
  Corporate Author Thesis (down)  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.141; 600.109; 600.147 Approved no  
  Call Number Admin @ si @ WGH2021 Serial 3464  
Permanent link to this record
 

 
Author Akhil Gurram; Ahmet Faruk Tuna; Fengyi Shen; Onay Urfalioglu; Antonio Lopez edit   pdf
doi  openurl
  Title Monocular Depth Estimation through Virtual-world Supervision and Real-world SfM Self-Supervision Type Journal Article
  Year 2021 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS  
  Volume 23 Issue 8 Pages 12738-12751  
  Keywords  
  Abstract Depth information is essential for on-board perception in autonomous driving and driver assistance. Monocular depth estimation (MDE) is very appealing since it allows for appearance and depth being on direct pixelwise correspondence without further calibration. Best MDE models are based on Convolutional Neural Networks (CNNs) trained in a supervised manner, i.e., assuming pixelwise ground truth (GT). Usually, this GT is acquired at training time through a calibrated multi-modal suite of sensors. However, also using only a monocular system at training time is cheaper and more scalable. This is possible by relying on structure-from-motion (SfM) principles to generate self-supervision. Nevertheless, problems of camouflaged objects, visibility changes, static-camera intervals, textureless areas, and scale ambiguity, diminish the usefulness of such self-supervision. In this paper, we perform monocular depth estimation by virtual-world supervision (MonoDEVS) and real-world SfM self-supervision. We compensate the SfM self-supervision limitations by leveraging virtual-world images with accurate semantic and depth supervision and addressing the virtual-to-real domain gap. Our MonoDEVSNet outperforms previous MDE CNNs trained on monocular and even stereo sequences.  
  Address  
  Corporate Author Thesis (down)  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ GTS2021 Serial 3598  
Permanent link to this record
 

 
Author Diego Porres edit   pdf
url  openurl
  Title Discriminator Synthesis: On reusing the other half of Generative Adversarial Networks Type Conference Article
  Year 2021 Publication Machine Learning for Creativity and Design, Neurips Workshop Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Generative Adversarial Networks have long since revolutionized the world of computer vision and, tied to it, the world of art. Arduous efforts have gone into fully utilizing and stabilizing training so that outputs of the Generator network have the highest possible fidelity, but little has gone into using the Discriminator after training is complete. In this work, we propose to use the latter and show a way to use the features it has learned from the training dataset to both alter an image and generate one from scratch. We name this method Discriminator Dreaming, and the full code can be found at this https URL.  
  Address Virtual; December 2021  
  Corporate Author Thesis (down)  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference NEURIPSW  
  Notes ADAS; 601.365 Approved no  
  Call Number Admin @ si @ Por2021 Serial 3597  
Permanent link to this record
 

 
Author Andres Mafla; Sounak Dey; Ali Furkan Biten; Lluis Gomez; Dimosthenis Karatzas edit   pdf
doi  openurl
  Title Multi-modal reasoning graph for scene-text based fine-grained image classification and retrieval Type Conference Article
  Year 2021 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 4022-4032  
  Keywords  
  Abstract  
  Address Virtual; January 2021  
  Corporate Author Thesis (down)  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ MDB2021 Serial 3491  
Permanent link to this record
 

 
Author Andres Mafla; Rafael S. Rezende; Lluis Gomez; Diana Larlus; Dimosthenis Karatzas edit   pdf
doi  openurl
  Title StacMR: Scene-Text Aware Cross-Modal Retrieval Type Conference Article
  Year 2021 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 2219-2229  
  Keywords  
  Abstract  
  Address Virtual; January 2021  
  Corporate Author Thesis (down)  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ MRG2021a Serial 3492  
Permanent link to this record
 

 
Author Andres Mafla; Ruben Tito; Sounak Dey; Lluis Gomez; Marçal Rusiñol; Ernest Valveny; Dimosthenis Karatzas edit  url
openurl 
  Title Real-time Lexicon-free Scene Text Retrieval Type Journal Article
  Year 2021 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 110 Issue Pages 107656  
  Keywords  
  Abstract In this work, we address the task of scene text retrieval: given a text query, the system returns all images containing the queried text. The proposed model uses a single shot CNN architecture that predicts bounding boxes and builds a compact representation of spotted words. In this way, this problem can be modeled as a nearest neighbor search of the textual representation of a query over the outputs of the CNN collected from the totality of an image database. Our experiments demonstrate that the proposed model outperforms previous state-of-the-art, while offering a significant increase in processing speed and unmatched expressiveness with samples never seen at training time. Several experiments to assess the generalization capability of the model are conducted in a multilingual dataset, as well as an application of real-time text spotting in videos.  
  Address  
  Corporate Author Thesis (down)  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121; 600.129; 601.338 Approved no  
  Call Number Admin @ si @ MTD2021 Serial 3493  
Permanent link to this record
 

 
Author Minesh Mathew; Dimosthenis Karatzas; C.V. Jawahar edit   pdf
openurl 
  Title DocVQA: A Dataset for VQA on Document Images Type Conference Article
  Year 2021 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 2200-2209  
  Keywords  
  Abstract We present a new dataset for Visual Question Answering (VQA) on document images called DocVQA. The dataset consists of 50,000 questions defined on 12,000+ document images. Detailed analysis of the dataset in comparison with similar datasets for VQA and reading comprehension is presented. We report several baseline results by adopting existing VQA and reading comprehension models. Although the existing models perform reasonably well on certain types of questions, there is large performance gap compared to human performance (94.36% accuracy). The models need to improve specifically on questions where understanding structure of the document is crucial. The dataset, code and leaderboard are available at docvqa. org  
  Address Virtual; January 2021  
  Corporate Author Thesis (down)  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ MKJ2021 Serial 3498  
Permanent link to this record
 

 
Author Razieh Rastgoo; Kourosh Kiani; Sergio Escalera edit  url
openurl 
  Title Sign Language Recognition: A Deep Survey Type Journal Article
  Year 2021 Publication Expert Systems With Applications Abbreviated Journal ESWA  
  Volume 164 Issue Pages 113794  
  Keywords  
  Abstract Sign language, as a different form of the communication language, is important to large groups of people in society. There are different signs in each sign language with variability in hand shape, motion profile, and position of the hand, face, and body parts contributing to each sign. So, visual sign language recognition is a complex research area in computer vision. Many models have been proposed by different researchers with significant improvement by deep learning approaches in recent years. In this survey, we review the vision-based proposed models of sign language recognition using deep learning approaches from the last five years. While the overall trend of the proposed models indicates a significant improvement in recognition accuracy in sign language recognition, there are some challenges yet that need to be solved. We present a taxonomy to categorize the proposed models for isolated and continuous sign language recognition, discussing applications, datasets, hybrid models, complexity, and future lines of research in the field.  
  Address  
  Corporate Author Thesis (down)  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ RKE2021a Serial 3521  
Permanent link to this record
 

 
Author Giuseppe Pezzano; Vicent Ribas Ripoll; Petia Radeva edit  url
openurl 
  Title CoLe-CNN: Context-learning convolutional neural network with adaptive loss function for lung nodule segmentation Type Journal Article
  Year 2021 Publication Computer Methods and Programs in Biomedicine Abbreviated Journal CMPB  
  Volume 198 Issue Pages 105792  
  Keywords  
  Abstract Background and objective:An accurate segmentation of lung nodules in computed tomography images is a crucial step for the physical characterization of the tumour. Being often completely manually accomplished, nodule segmentation turns to be a tedious and time-consuming procedure and this represents a high obstacle in clinical practice. In this paper, we propose a novel Convolutional Neural Network for nodule segmentation that combines a light and efficient architecture with innovative loss function and segmentation strategy. Methods:In contrast to most of the standard end-to-end architectures for nodule segmentation, our network learns the context of the nodules by producing two masks representing all the background and secondary-important elements in the Computed Tomography scan. The nodule is detected by subtracting the context from the original scan image. Additionally, we introduce an asymmetric loss function that automatically compensates for potential errors in the nodule annotations. We trained and tested our Neural Network on the public LIDC-IDRI database, compared it with the state of the art and run a pseudo-Turing test between four radiologists and the network. Results:The results proved that the behaviour of the algorithm is very near to the human performance and its segmentation masks are almost indistinguishable from the ones made by the radiologists. Our method clearly outperforms the state of the art on CT nodule segmentation in terms of F1 score and IoU of and respectively. Conclusions: The main structure of the network ensures all the properties of the UNet architecture, while the Multi Convolutional Layers give a more accurate pattern recognition. The newly adopted solutions also increase the details on the border of the nodule, even under the noisiest conditions. This method can be applied now for single CT slice nodule segmentation and it represents a starting point for the future development of a fully automatic 3D segmentation software.  
  Address  
  Corporate Author Thesis (down)  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ PRR2021 Serial 3530  
Permanent link to this record
 

 
Author Cristina Palmero; Javier Selva; Sorina Smeureanu; Julio C. S. Jacques Junior; Albert Clapes; Alexa Mosegui; Zejian Zhang; David Gallardo; Georgina Guilera; David Leiva; Sergio Escalera edit   pdf
doi  openurl
  Title Context-Aware Personality Inference in Dyadic Scenarios: Introducing the UDIVA Dataset Type Conference Article
  Year 2021 Publication IEEE Winter Conference on Applications of Computer Vision Abbreviated Journal  
  Volume Issue Pages 1-12  
  Keywords  
  Abstract This paper introduces UDIVA, a new non-acted dataset of face-to-face dyadic interactions, where interlocutors perform competitive and collaborative tasks with different behavior elicitation and cognitive workload. The dataset consists of 90.5 hours of dyadic interactions among 147 participants distributed in 188 sessions, recorded using multiple audiovisual and physiological sensors. Currently, it includes sociodemographic, self- and peer-reported personality, internal state, and relationship profiling from participants. As an initial analysis on UDIVA, we propose a
transformer-based method for self-reported personality inference in dyadic scenarios, which uses audiovisual data and different sources of context from both interlocutors to
regress a target person’s personality traits. Preliminary results from an incremental study show consistent improvements when using all available context information.
 
  Address Virtual; January 2021  
  Corporate Author Thesis (down)  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference WACV  
  Notes HUPBA Approved no  
  Call Number Admin @ si @ PSS2021 Serial 3532  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: