Publicacions CVC -- Query Results

	631–645 of 3413 records found matching your query:

	Records
	Author	Alejandro Cartas; Petia Radeva; Mariella Dimiccoli
	Title	Modeling long-term interactions to enhance action recognition			Type	Conference Article
	Year	2021	Publication	25th International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	10351-10358
	Keywords
	Abstract	In this paper, we propose a new approach to understand actions in egocentric videos that exploits the semantics of object interactions at both frame and temporal levels. At the frame level, we use a region-based approach that takes as input a primary region roughly corresponding to the user's hands and a set of secondary regions potentially corresponding to the interacting objects, and calculates the action score through a CNN formulation. This information is then fed to a Hierarchical Long Short-Term Memory Network (HLSTM) that captures temporal dependencies between actions within and across shots. Ablation studies thoroughly validate the proposed approach, showing in particular that both levels of the HLSTM architecture contribute to performance improvement. Furthermore, quantitative comparisons show that the proposed approach outperforms the state-of-the-art in terms of action recognition on standard benchmarks, without relying on motion information.
	Address	January 2021
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	MILAB;			Approved	no
	Call Number	Admin @ si @ CRD2021			Serial	3626



	Author	Quentin Angermann; Jorge Bernal; Cristina Sanchez Montes; Maroua Hammami; Gloria Fernandez Esparrach; Xavier Dray; Olivier Romain; F. Javier Sanchez; Aymeric Histace
	Title	Clinical Usability Quantification Of a Real-Time Polyp Detection Method In Videocolonoscopy			Type	Conference Article
	Year	2017	Publication	25th United European Gastroenterology Week	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address	Barcelona, October 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ESGE
	Notes	MV; no menciona			Approved	no
	Call Number	Admin @ si @ ABS2017c			Serial	2978



	Author	Cristina Sanchez Montes; F. Javier Sanchez; Cristina Rodriguez de Miguel; Henry Cordova; Jorge Bernal; Maria Lopez Ceron; Josep Llach; Gloria Fernandez Esparrach
	Title	Histological Prediction Of Colonic Polyps By Computer Vision. Preliminary Results			Type	Conference Article
	Year	2017	Publication	25th United European Gastroenterology Week	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	polyps; histology; computer vision
	Abstract	During colonoscopy, clinicians perform visual inspection of the polyps to predict histology. Kudo’s pit pattern classification is one of the most commonly used for optical diagnosis. These surface patterns present a contrast with respect to their neighboring regions and they can be considered as bright regions in the image that can attract the attention of computational methods.
	Address	Barcelona; October 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ESGE
	Notes	MV; no menciona			Approved	no
	Call Number	Admin @ si @ SSR2017			Serial	2979



	Author	Xinhang Song; Shuqiang Jiang; Luis Herranz
	Title	Combining Models from Multiple Sources for RGB-D Scene Recognition			Type	Conference Article
	Year	2017	Publication	26th International Joint Conference on Artificial Intelligence	Abbreviated Journal
	Volume		Issue		Pages	4523-4529
	Keywords	Robotics and Vision; Vision and Perception
	Abstract	Depth can complement RGB with useful cues about object volumes and scene layout. However, RGB-D image datasets are still too small for directly training deep convolutional neural networks (CNNs), in contrast to the massive monomodal RGB datasets. Previous works in RGB-D recognition typically combine two separate networks for RGB and depth data, pretrained with a large RGB dataset and then fine-tuned to the respective target RGB and depth datasets. These approaches have several limitations: 1) they only use low-level filters learned from RGB data, and thus cannot properly exploit depth-specific patterns, and 2) RGB and depth features are only combined at high levels but rarely at lower levels. In this paper, we propose a framework that leverages both knowledge acquired from large RGB datasets and depth-specific cues learned from the limited depth data, obtaining more effective multi-source and multi-modal representations. We propose a multi-modal combination method that selects discriminative combinations of layers from the different source models and target modalities, capturing both high-level properties of the task and intrinsic low-level properties of both modalities.
	Address	Melbourne; Australia; August 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	IJCAI
	Notes	LAMP; 600.120			Approved	no
	Call Number	Admin @ si @ SJH2017b			Serial	2966



	Author	Victor Ponce; Hugo Jair Escalante; Sergio Escalera; Xavier Baro
	Title	Gesture and Action Recognition by Evolved Dynamic Subgestures			Type	Conference Article
	Year	2015	Publication	26th British Machine Vision Conference	Abbreviated Journal
	Volume		Issue		Pages	129.1-129.13
	Keywords
	Abstract	This paper introduces a framework for gesture and action recognition based on the evolution of temporal gesture primitives, or subgestures. Our work is inspired by the principle of producing genetic variations within a population of gesture subsequences, with the goal of obtaining a set of gesture units that enhance the generalization capability of standard gesture recognition approaches. In our context, gesture primitives are evolved over time using dynamic programming and generative models in order to recognize complex actions. In a few generations, the proposed subgesture-based representation of actions and gestures outperforms the state-of-the-art results on the MSRDaily3D and MSRAction3D datasets.
	Address	Swansea; UK; September 2015
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	BMVC
	Notes	HuPBA;MV			Approved	no
	Call Number	Admin @ si @ PEE2015			Serial	2657



	Author	Huamin Ren; Weifeng Liu; Soren Ingvor Olsen; Sergio Escalera; Thomas B. Moeslund
	Title	Unsupervised Behavior-Specific Dictionary Learning for Abnormal Event Detection			Type	Conference Article
	Year	2015	Publication	26th British Machine Vision Conference	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract
	Address	Swansea; UK; September 2015
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	BMVC
	Notes	HuPBA;MILAB			Approved	no
	Call Number	Admin @ si @ RLO2015			Serial	2658



	Author	Mohammad Ali Bagheri; Qigang Gao; Sergio Escalera
	Title	Logo recognition Based on the Dempster-Shafer Fusion of Multiple Classifiers			Type	Conference Article
	Year	2013	Publication	26th Canadian Conference on Artificial Intelligence	Abbreviated Journal
	Volume	7884	Issue		Pages	1-12
	Keywords	Logo recognition; ensemble classification; Dempster-Shafer fusion; Zernike moments; generic Fourier descriptor; shape signature
	Abstract	Best paper award. The performance of different feature extraction and shape description methods in trademark image recognition systems has been studied by several researchers. However, the potential improvement in classification through feature fusion by ensemble-based methods has remained unattended. In this work, we evaluate the performance of an ensemble of three classifiers, each trained on different feature sets. Three promising shape description techniques, including Zernike moments, generic Fourier descriptors, and shape signature, are used to extract informative features from logo images, and each set of features is fed into an individual classifier. In order to reduce recognition error, a powerful combination strategy based on the Dempster-Shafer theory is utilized to fuse the three classifiers trained on different sources of information. This combination strategy can effectively make use of the diversity of base learners generated with different sets of features. The recognition results of the individual classifiers are compared with those obtained from fusing the classifiers’ output, showing significant performance improvements of the proposed methodology.
	Address	Canada; May 2013
	Corporate Author				Thesis
	Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	0302-9743	ISBN	978-3-642-38456-1	Medium
	Area		Expedition		Conference	AI
	Notes	HuPBA;MILAB			Approved	no
	Call Number	Admin @ si @ BGE2013b			Serial	2249



	Author	Hassan Ahmed Sial; S. Sancho; Ramon Baldrich; Robert Benavente; Maria Vanrell
	Title	Color-based data augmentation for Reflectance Estimation			Type	Conference Article
	Year	2018	Publication	26th Color Imaging Conference	Abbreviated Journal
	Volume		Issue		Pages	284-289
	Keywords
	Abstract	Deep convolutional architectures have proven to be successful frameworks for solving generic computer vision problems. The estimation of intrinsic reflectance from a single image is not a solved problem yet. Encoder-decoder architectures are a good fit for pixel-wise reflectance estimation, although they usually suffer from the lack of large datasets. Lack of data can be partially solved with data augmentation; however, usual techniques focus on geometric changes, which do not help reflectance estimation. In this paper we propose a color-based data augmentation technique that extends the training data by increasing the variability of chromaticity. Rotations on the red-green blue-yellow plane of an opponent space make it possible to enlarge the training set in a coherent and sound way that improves network generalization capability for reflectance estimation. We perform some experiments on the Sintel dataset showing that our color-based augmentation increases performance and outperforms one of the state-of-the-art methods.
	Address	Vancouver; November 2018
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	CIC
	Notes	CIC			Approved	no
	Call Number	Admin @ si @ SSB2018a			Serial	3129



	Author	Emanuel Sanchez Aimar; Petia Radeva; Mariella Dimiccoli
	Title	Social Relation Recognition in Egocentric Photostreams			Type	Conference Article
	Year	2019	Publication	26th International Conference on Image Processing	Abbreviated Journal
	Volume		Issue		Pages	3227-3231
	Keywords
	Abstract	This paper proposes an approach to automatically categorize the social interactions of a user wearing a photo-camera (2fpm), by relying solely on what the camera is seeing. The problem is challenging due to the overwhelming complexity of social life and the extreme intra-class variability of social interactions captured under unconstrained conditions. We adopt the formalization proposed in Bugental's social theory, which groups human relations into five social domains with related categories. Our method is a new deep learning architecture that exploits the hierarchical structure of the label space and relies on a set of social attributes estimated at frame level to provide a semantic representation of social interactions. Experimental results on the new EgoSocialRelation dataset demonstrate the effectiveness of our proposal.
	Address	Taipei; Taiwan; September 2019
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICIP
	Notes	MILAB; no menciona			Approved	no
	Call Number	Admin @ si @ SRD2019			Serial	3370



	Author	Mohamed Ali Souibgui; Sanket Biswas; Sana Khamekhem Jemni; Yousri Kessentini; Alicia Fornes; Josep Llados; Umapada Pal
	Title	DocEnTr: An End-to-End Document Image Enhancement Transformer			Type	Conference Article
	Year	2022	Publication	26th International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	1699-1705
	Keywords	Degradation; Head; Optical character recognition; Self-supervised learning; Benchmark testing; Transformers; Magnetic heads
	Abstract	Document images can be affected by many degradation scenarios, which cause recognition and processing difficulties. In this age of digitization, it is important to denoise them for proper usage. To address this challenge, we present a new encoder-decoder architecture based on vision transformers to enhance both machine-printed and handwritten document images, in an end-to-end fashion. The encoder operates directly on the pixel patches with their positional information without the use of any convolutional layers, while the decoder reconstructs a clean image from the encoded patches. Conducted experiments show the superiority of the proposed model compared to the state-of-the-art methods on several DIBCO benchmarks. Code and models will be publicly available at: https://github.com/dali92002/DocEnTR
	Address	August 21-25, 2022, Montréal, Québec
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	DAG; 600.121; 600.162; 602.230; 600.140			Approved	no
	Call Number	Admin @ si @ SBJ2022			Serial	3730



	Author	Carlos Boned Riera; Oriol Ramos Terrades
	Title	Discriminative Neural Variational Model for Unbalanced Classification Tasks in Knowledge Graph			Type	Conference Article
	Year	2022	Publication	26th International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages	2186-2191
	Keywords	Measurement; Couplings; Semantics; Ear; Benchmark testing; Data models; Pattern recognition
	Abstract	Nowadays the paradigm of link discovery problems has shown significant improvements on Knowledge Graphs. However, method performance is harmed by the unbalanced nature of this classification problem, since many methods are easily biased to not find proper links. In this paper we present a discriminative neural variational auto-encoder model, called DNVAE from now on, in which we have introduced latent variables to serve as embedding vectors. As a result, the learnt generative model better approximates the underlying distribution and, at the same time, better differentiates the types of relations in the knowledge graph. We have evaluated this approach on benchmark knowledge graphs and Census records. Results on this last data set are quite impressive since we reach the highest possible score in the evaluation metrics. However, further experiments are still needed to evaluate more deeply the performance of the method in more challenging tasks.
	Address	Montreal; Quebec; Canada; August 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	DAG; 600.121; 600.162			Approved	no
	Call Number	Admin @ si @ BoR2022			Serial	3741



	Author	Vacit Oguz Yazici; Joost Van de Weijer; Longlong Yu
	Title	Visual Transformers with Primal Object Queries for Multi-Label Image Classification			Type	Conference Article
	Year	2022	Publication	26th International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Multi-label image classification is about predicting a set of class labels that can be considered as orderless sequential data. Transformers process the sequential data as a whole, therefore they are inherently good at set prediction. The first vision-based transformer model, which was proposed for the object detection task, introduced the concept of object queries. Object queries are learnable positional encodings that are used by attention modules in decoder layers to decode the object classes or bounding boxes using the regions of interest in an image. However, inputting the same set of object queries to different decoder layers hinders the training: it results in lower performance and delays convergence. In this paper, we propose the usage of primal object queries that are only provided at the start of the transformer decoder stack. In addition, we improve the mixup technique proposed for multi-label classification. The proposed transformer model with primal object queries improves the state-of-the-art class-wise F1 metric by 2.1% and 1.8%, and speeds up the convergence by 79.0% and 38.6% on the MS-COCO and NUS-WIDE datasets, respectively.
	Address	Montreal; Quebec; Canada; August 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	LAMP; 600.147; 601.309			Approved	no
	Call Number	Admin @ si @ YWY2022			Serial	3786



	Author	Ayan Banerjee; Palaiahnakote Shivakumara; Parikshit Acharya; Umapada Pal; Josep Llados
	Title	TWD: A New Deep E2E Model for Text Watermark Detection in Video Images			Type	Conference Article
	Year	2022	Publication	26th International Conference on Pattern Recognition	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	Deep learning; U-Net; FCENet; Scene text detection; Video text detection; Watermark text detection
	Abstract	Text watermark detection in video images is challenging because text watermark characteristics are different from caption and scene texts in the video images. Developing a successful model for detecting text watermark, caption, and scene texts is an open challenge. This study aims at developing a new Deep End-to-End model for Text Watermark Detection (TWD), caption and scene text in video images. To standardize non-uniform contrast, quality, and resolution, we explore the U-Net3+ model for enhancing poor quality text without affecting high-quality text. Similarly, to address the challenges of arbitrary orientation, text shapes and complex background, we explore Stacked Hourglass Encoded Fourier Contour Embedding Network (SFCENet) by feeding the output of the U-Net3+ model as input. Furthermore, the proposed work integrates enhancement and detection models as an end-to-end model for detecting multi-type text in video images. To validate the proposed model, we create our own dataset (named TW-866), which provides video images containing text watermark, caption (subtitles), as well as scene text. The proposed model is also evaluated on standard natural scene text detection datasets, namely, ICDAR 2019 MLT, CTW1500, Total-Text, and DAST1500. The results show that the proposed method outperforms the existing methods. To the best of our knowledge, this is the first work on text watermark detection in video images.
	Address	Montreal; Quebec; Canada; August 2022
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICPR
	Notes	DAG;			Approved	no
	Call Number	Admin @ si @ BSA2022			Serial	3788



	Author	Yaxing Wang; Abel Gonzalez-Garcia; Joost Van de Weijer; Luis Herranz
	Title	SDIT: Scalable and Diverse Cross-domain Image Translation			Type	Conference Article
	Year	2019	Publication	27th ACM International Conference on Multimedia	Abbreviated Journal
	Volume		Issue		Pages	1267–1276
	Keywords
	Abstract	Recently, image-to-image translation research has witnessed remarkable progress. Although current approaches successfully generate diverse outputs or perform scalable image transfer, these properties have not been combined into a single method. To address this limitation, we propose SDIT: Scalable and Diverse image-to-image translation. These properties are combined into a single generator. The diversity is determined by a latent variable which is randomly sampled from a normal distribution. The scalability is obtained by conditioning the network on the domain attributes. Additionally, we also exploit an attention mechanism that permits the generator to focus on the domain-specific attribute. We empirically demonstrate the performance of the proposed method on face mapping and other datasets beyond faces.
	Address	Nice; France; October 2019
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ACM-MM
	Notes	LAMP; 600.106; 600.109; 600.141; 600.120			Approved	no
	Call Number	Admin @ si @ WGW2019			Serial	3363



	Author	Joan Codina-Filba; Sergio Escalera; Joan Escudero; Coen Antens; Pau Buch-Cardona; Mireia Farrus
	Title	Mobile eHealth Platform for Home Monitoring of Bipolar Disorder			Type	Conference Article
	Year	2021	Publication	27th ACM International Conference on Multimedia Modeling	Abbreviated Journal
	Volume	12573	Issue		Pages	330-341
	Keywords
	Abstract	People suffering from Bipolar Disorder (BD) experience changes in mood status, having depressive or manic episodes with normal periods in between. BD is a chronic disease with a high level of non-adherence to medication that needs continuous monitoring of patients to detect when they relapse into an episode, so that physicians can take care of them. Here we present MoodRecord, an easy-to-use, non-intrusive, multilingual, robust and scalable platform suitable for home monitoring of patients with BD, which allows physicians and relatives to track the patient state and get alarms when abnormalities occur. MoodRecord takes advantage of the capabilities of smartphones as a communication and recording device to continuously monitor patients. It automatically records user activity, and asks the user to answer some questions or to record themselves on video, according to a predefined plan designed by physicians. The video is analysed, recognising the mood status from images, and bipolar assessment scores are extracted from speech parameters. The data obtained from the different sources are merged periodically to observe whether a relapse may be starting and, if so, raise the corresponding alarm. The application got a positive evaluation in a pilot with users from three different countries. During the pilot, the predictions of the voice and image modules showed a coherent correlation with the diagnosis performed by clinicians.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title	LNCS
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	MMM
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ CEE2021			Serial	3659