Publicacions CVC -- Query Results

Records
Author	Alejandro Cartas; Estefania Talavera; Petia Radeva; Mariella Dimiccoli
Title	On the Role of Event Boundaries in Egocentric Activity Recognition from Photostreams			Type	Miscellaneous
Year	2018	Publication	Arxiv	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Event boundaries play a crucial role as a pre-processing step for detection, localization, and recognition tasks of human activities in videos. Typically, despite their intrinsic subjectiveness, temporal bounds are provided manually as input for training action recognition algorithms. However, their role for activity recognition in the domain of egocentric photostreams has so far been neglected. In this paper, we provide insights into how automatically computed boundaries can impact activity recognition results in the emerging domain of egocentric photostreams. Furthermore, we collected a new annotated dataset acquired by 15 people with a wearable photo-camera and we used it to show the generalization capabilities of several deep learning based architectures to unseen users.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; no proj			Approved	no
Call Number	Admin @ si @ CTR2018			Serial	3184



Author	Md. Mostafa Kamal Sarker; Hatem A. Rashwan; Estefania Talavera; Syeda Furruka Banu; Petia Radeva; Domenec Puig
Title	MACNet: Multi-scale Atrous Convolution Networks for Food Places Classification in Egocentric Photo-streams			Type	Conference Article
Year	2018	Publication	European Conference on Computer Vision workshops	Abbreviated Journal
Volume		Issue		Pages	423-433
Keywords
Abstract	A first-person (wearable) camera continually captures unscripted interactions of the camera user with objects, people, and scenes, reflecting the user's personal and relational tendencies. One of the preferences of people is their interaction with food events. The regulation of food intake and its duration is of great importance to protect against diseases. Consequently, this work aims to develop a smart model that is able to determine the recurrences of a person at food places during a day. This model is based on a deep end-to-end model for automatic food places recognition by analyzing egocentric photo-streams. In this paper, we apply multi-scale Atrous convolution networks to extract the key features related to food places from the input images. The proposed model is evaluated on an in-house private dataset called “EgoFoodPlaces”. Experimental results show promising performance for food place classification in egocentric photo-streams.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECCVW
Notes	MILAB; no menciona			Approved	no
Call Number	Admin @ si @ SRR2018b			Serial	3185



Author	Mariella Dimiccoli; Cathal Gurrin; David J. Crandall; Xavier Giro; Petia Radeva
Title	Introduction to the special issue: Egocentric Vision and Lifelogging			Type	Journal Article
Year	2018	Publication	Journal of Visual Communication and Image Representation	Abbreviated Journal	JVCIR
Volume	55	Issue		Pages	352-353
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; no proj			Approved	no
Call Number	Admin @ si @ DGC2018			Serial	3187



Author	L. Rothacker; Marçal Rusiñol; Josep Llados; G.A. Fink
Title	A Two-stage Approach to Segmentation-Free Query-by-example Word Spotting			Type	Journal
Year	2014	Publication	Manuscript Cultures	Abbreviated Journal
Volume	7	Issue		Pages	47-58
Keywords
Abstract	With the ongoing progress in digitization, huge document collections and archives have become available to a broad audience. Scanned document images can be transmitted electronically and studied simultaneously throughout the world. While this is very beneficial, it is often impossible to perform automated searches on these document collections. Optical character recognition usually fails when it comes to handwritten or historic documents. In order to address the need for exploring document collections rapidly, researchers are working on word spotting. In query-by-example word spotting scenarios, the user selects an exemplary occurrence of the query word in a document image. The word spotting system then retrieves all regions in the collection that are visually similar to the given example of the query word. The best matching regions are presented to the user and no actual transcription is required. An important property of a word spotting system is the computational speed with which queries can be executed. In our previous work, we presented a relatively slow but high-precision method. In the present work, we will extend this baseline system to an integrated two-stage approach. In a coarse-grained first stage, we will filter document images efficiently in order to identify regions that are likely to contain the query word. In the fine-grained second stage, these regions will be analyzed with our previously presented high-precision method. Finally, we will report recognition results and query times for the well-known George Washington benchmark in our evaluation. We achieve state-of-the-art recognition results while the query times can be reduced to 50% in comparison with our baseline.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.061; 600.077			Approved	no
Call Number	Admin @ si @			Serial	3190
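
The coarse-to-fine retrieval idea described in the abstract of this record can be sketched generically in Python. This is only an illustration under stated assumptions, not the authors' system: `coarse_score` and `fine_score` are hypothetical callables standing in for a cheap filter and an expensive matcher, and `keep` and `top` are parameters invented for the example.

```python
import heapq


def two_stage_search(regions, coarse_score, fine_score, keep=10, top=3):
    """Rank regions with a cheap score first, then re-rank the survivors.

    Both scoring functions are hypothetical; higher scores mean better matches.
    """
    # Stage 1: cheap filter over every region; keep only the best `keep` candidates.
    candidates = heapq.nlargest(keep, regions, key=coarse_score)
    # Stage 2: the expensive score is computed only for the shortlisted candidates.
    return heapq.nlargest(top, candidates, key=fine_score)
```

The expensive scorer is called at most `keep` times per query instead of once per region, which is where a query-time saving of this kind would come from.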



Author	Patricia Suarez; Angel Sappa; Boris X. Vintimilla
Title	Vegetation Index Estimation from Monospectral Images			Type	Conference Article
Year	2018	Publication	15th International Conference on Image Analysis and Recognition	Abbreviated Journal
Volume	10882	Issue		Pages	353-362
Keywords
Abstract	This paper proposes a novel approach to estimate Normalized Difference Vegetation Index (NDVI) from just the red channel of an RGB image. The NDVI index is defined as the ratio of the difference of the red and infrared radiances over their sum. In other words, information from the red channel of an RGB image and the corresponding infrared spectral band are required for its computation. In the current work the NDVI index is estimated just from the red channel by training a Conditional Generative Adversarial Network (CGAN). The architecture proposed for the generative network consists of a single level structure, which combines at the final layer results from convolutional operations together with the given red channel with Gaussian noise to enhance details, resulting in a sharp NDVI image. Then, the discriminative model estimates the probability that the NDVI generated index came from the training dataset, rather than the index automatically generated. Experimental results with a large set of real images are provided showing that a Conditional GAN single level model represents an acceptable approach to estimate NDVI index.
Address	Povoa de Varzim; Portugal; June 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICIAR
Notes	MSIAU; 600.086; 600.130; 600.122			Approved	no
Call Number	Admin @ si @ SSV2018c			Serial	3196
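
The NDVI definition quoted in the abstract of this record can be written out in a few lines of Python. This is a minimal illustration, not the authors' CGAN method: it assumes that both red and near-infrared values are available for every pixel and uses the standard sign convention (near-infrared minus red). The function name `ndvi` and the fallback for a zero sum are assumptions made for the example.

```python
def ndvi(nir, red):
    """Per-pixel NDVI = (NIR - red) / (NIR + red) for two equal-length sequences."""
    if len(nir) != len(red):
        raise ValueError("nir and red must have the same length")
    out = []
    for n, r in zip(nir, red):
        total = n + r
        # Assumption: a pixel with zero total radiance maps to 0.0 instead of raising.
        out.append((n - r) / total if total != 0 else 0.0)
    return out


# Toy values: vegetation reflects strongly in near-infrared and weakly in red.
print(ndvi([3.0, 1.0, 0.0], [1.0, 1.0, 0.0]))
```

Values close to 1 indicate a strong near-infrared response relative to red, and values near 0 indicate similar radiance in both bands.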



Author	Md. Mostafa Kamal Sarker; Hatem A. Rashwan; Farhan Akram; Vivek Kumar Singh; Syeda Furruka Banu; Forhad U H Chowdhury; Kabir Ahmed Choudhury; Sylvie Chambon; Petia Radeva; Domenec Puig; Mohamed Abdel-Nasser
Title	SLSNet: Skin lesion segmentation using a lightweight generative adversarial network			Type	Journal Article
Year	2021	Publication	Expert Systems With Applications	Abbreviated Journal	ESWA
Volume	183	Issue		Pages	115433
Keywords
Abstract	The determination of precise skin lesion boundaries in dermoscopic images using automated methods faces many challenges, most importantly, the presence of hair, inconspicuous lesion edges and low contrast in dermoscopic images, and variability in the color, texture and shapes of skin lesions. Existing deep learning-based skin lesion segmentation algorithms are expensive in terms of computational time and memory. Consequently, running such segmentation algorithms requires a powerful GPU and high bandwidth memory, which are not available in dermoscopy devices. Thus, this article aims to achieve precise skin lesion segmentation with minimum resources: a lightweight, efficient generative adversarial network (GAN) model called SLSNet, which combines 1-D kernel factorized networks, position and channel attention, and multiscale aggregation mechanisms with a GAN model. The 1-D kernel factorized network reduces the computational cost of 2D filtering. The position and channel attention modules enhance the discriminative ability between the lesion and non-lesion feature representations in spatial and channel dimensions, respectively. A multiscale block is also used to aggregate the coarse-to-fine features of input skin images and reduce the effect of the artifacts. SLSNet is evaluated on two publicly available datasets: ISBI 2017 and the ISIC 2018. Although SLSNet has only 2.35 million parameters, the experimental results demonstrate that it achieves segmentation results on a par with the state-of-the-art skin lesion segmentation methods with an accuracy of 97.61%, and Dice and Jaccard similarity coefficients of 90.63% and 81.98%, respectively. SLSNet can run at more than 110 frames per second (FPS) in a single GTX1080Ti GPU, which is faster than well-known deep learning-based image segmentation models, such as FCN. Therefore, SLSNet can be used for practical dermoscopic applications.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; no proj			Approved	no
Call Number	Admin @ si @ SRA2021			Serial	3633



Author	Sergio Escalera; Markus Weimer; Mikhail Burtsev; Valentin Malykh; Varvara Logacheva; Ryan Lowe; Iulian Vlad Serban; Yoshua Bengio; Alexander Rudnicky; Alan W. Black; Shrimai Prabhumoye; Łukasz Kidzinski; Mohanty Sharada; Carmichael Ong; Jennifer Hicks; Sergey Levine; Marcel Salathe; Scott Delp; Iker Huerga; Alexander Grigorenko; Leifur Thorbergsson; Anasuya Das; Kyla Nemitz; Jenna Sandker; Stephen King; Alexander S. Ecker; Leon A. Gatys; Matthias Bethge; Jordan Boyd Graber; Shi Feng; Pedro Rodriguez; Mohit Iyyer; He He; Hal Daume III; Sean McGregor; Amir Banifatemi; Alexey Kurakin; Ian Goodfellow; Samy Bengio
Title	Introduction to NIPS 2017 Competition Track			Type	Book Chapter
Year	2018	Publication	The NIPS ’17 Competition: Building Intelligent Systems	Abbreviated Journal
Volume		Issue		Pages	1-23
Keywords
Abstract	Competitions have become a popular tool in the data science community to solve hard problems, assess the state of the art and spur new research directions. Companies like Kaggle and open source platforms like Codalab connect people with data and a data science problem to those with the skills and means to solve it. Hence, the question arises: What, if anything, could NIPS add to this rich ecosystem? In 2017, we set out to find out. We attracted 23 potential competitions, of which we selected five to be NIPS 2017 competitions. Our final selection features competitions advancing the state of the art in other sciences such as “Classifying Clinically Actionable Genetic Mutations” and “Learning to Run”. Others, like “The Conversational Intelligence Challenge” and “Adversarial Attacks and Defences” generated new data sets that we expect to impact the progress in their respective communities for years to come. And “Human-Computer Question Answering Competition” showed us just how far we as a field have come in ability and efficiency since the break-through performance of Watson in Jeopardy. Two additional competitions, DeepArt and AI XPRIZE Milestones, were also associated with the NIPS 2017 competition track, whose results are also presented within this chapter.
Address
Corporate Author				Thesis
Publisher	Springer	Place of Publication		Editor	Sergio Escalera; Markus Weimer
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN	978-3-319-94042-7	Medium
Area		Expedition		Conference
Notes	HUPBA; no proj			Approved	no
Call Number	Admin @ si @ EWB2018			Serial	3200



Author	Pau Riba; Lutz Goldmann; Oriol Ramos Terrades; Diede Rusticus; Alicia Fornes; Josep Llados
Title	Table detection in business document images by message passing networks			Type	Journal Article
Year	2022	Publication	Pattern Recognition	Abbreviated Journal	PR
Volume	127	Issue		Pages	108641
Keywords
Abstract	Tabular structures in business documents offer a complementary dimension to the raw textual data. For instance, there is information about the relationships among pieces of information. Nowadays, digital mailroom applications have become a key service for workflow automation. Therefore, the detection and interpretation of tables is crucial. With the recent advances in information extraction, table detection and recognition have gained interest in document image analysis, in particular, with the absence of rule lines and unknown information about rows and columns. However, business documents usually contain sensitive contents limiting the amount of public benchmarking datasets. In this paper, we propose a graph-based approach for detecting tables in document images which does not require the raw content of the document. Hence, the sensitive content can be previously removed and, instead of using the raw image or textual content, we propose a purely structural approach to keep sensitive data anonymous. Our framework uses graph neural networks (GNNs) to describe the local repetitive structures that constitute a table. In particular, our main application domain is business documents. We have carefully validated our approach in two invoice datasets and a modern document benchmark. Our experiments demonstrate that tables can be detected by purely structural approaches.
Address	July 2022
Corporate Author				Thesis
Publisher	Elsevier	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	DAG; 600.162; 600.121			Approved	no
Call Number	Admin @ si @ RGR2022			Serial	3729



Author	Meysam Madadi; Sergio Escalera; Alex Carruesco Llorens; Carlos Andujar; Xavier Baro; Jordi Gonzalez
Title	Top-down model fitting for hand pose recovery in sequences of depth images			Type	Journal Article
Year	2018	Publication	Image and Vision Computing	Abbreviated Journal	IMAVIS
Volume	79	Issue		Pages	63-75
Keywords
Abstract	State-of-the-art approaches on hand pose estimation from depth images have reported promising results under quite controlled considerations. In this paper we propose a two-step pipeline for recovering the hand pose from a sequence of depth images. The pipeline has been designed to deal with images taken from any viewpoint and exhibiting a high degree of finger occlusion. In a first step we initialize the hand pose using a part-based model, fitting a set of hand components in the depth images. In a second step we consider temporal data and estimate the parameters of a trained bilinear model consisting of shape and trajectory bases. We evaluate our approach on a newly created synthetic hand dataset along with NYU and MSRA real datasets. Results demonstrate that the proposed method outperforms the most recent pose recovering approaches, including those based on CNNs.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HUPBA; 600.098			Approved	no
Call Number	Admin @ si @ MEC2018			Serial	3203



Author	Marc Oliu; Javier Selva; Sergio Escalera
Title	Folded Recurrent Neural Networks for Future Video Prediction			Type	Conference Article
Year	2018	Publication	15th European Conference on Computer Vision	Abbreviated Journal
Volume	11218	Issue		Pages	745-761
Keywords
Abstract	Future video prediction is an ill-posed Computer Vision problem that recently received much attention. Its main challenges are the high variability in video content, the propagation of errors through time, and the non-specificity of the future frames: given a sequence of past frames there is a continuous distribution of possible futures. This work introduces bijective Gated Recurrent Units, a double mapping between the input and output of a GRU layer. This allows for recurrent auto-encoders with state sharing between encoder and decoder, stratifying the sequence representation and helping to prevent capacity problems. We show how with this topology only the encoder or decoder needs to be applied for input encoding and prediction, respectively. This reduces the computational cost and avoids re-encoding the predictions when generating a sequence of frames, mitigating the propagation of errors. Furthermore, it is possible to remove layers from an already trained model, giving an insight to the role performed by each layer and making the model more explainable. We evaluate our approach on three video datasets, outperforming state of the art prediction results on MMNIST and UCF101, and obtaining competitive results on KTH with 2 and 3 times less memory usage and computational cost than the best scored approach.
Address	Munich; September 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ECCV
Notes	HUPBA; no menciona			Approved	no
Call Number	Admin @ si @ OSE2018			Serial	3204



Author	Giuseppe Pezzano; Oliver Diaz; Vicent Ribas Ripoll; Petia Radeva
Title	CoLe-CNN+: Context learning – Convolutional neural network for COVID-19-Ground-Glass-Opacities detection and segmentation			Type	Journal Article
Year	2021	Publication	Computers in Biology and Medicine	Abbreviated Journal	CBM
Volume	136	Issue		Pages	104689
Keywords
Abstract	The most common tool for population-wide COVID-19 identification is the Reverse Transcription-Polymerase Chain Reaction test that detects the presence of the virus in the throat (or sputum) in swab samples. This test has a sensitivity between 59% and 71%. However, this test does not provide precise information regarding the extension of the pulmonary infection. Moreover, it has been proven that through the reading of a computed tomography (CT) scan, a clinician can provide a more complete perspective of the severity of the disease. Therefore, we propose a comprehensive system for fully-automated COVID-19 detection and lesion segmentation from CT scans, powered by deep learning strategies to support the decision-making process for the diagnosis of COVID-19.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; no menciona			Approved	no
Call Number	Admin @ si @ PDR2021			Serial	3635



Author	Cristina Palmero; Javier Selva; Mohammad Ali Bagueri; Sergio Escalera
Title	Recurrent CNN for 3D Gaze Estimation using Appearance and Shape Cues			Type	Conference Article
Year	2018	Publication	29th British Machine Vision Conference	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Gaze behavior is an important non-verbal cue in social signal processing and human-computer interaction. In this paper, we tackle the problem of person- and head pose-independent 3D gaze estimation from remote cameras, using a multi-modal recurrent convolutional neural network (CNN). We propose to combine face, eyes region, and face landmarks as individual streams in a CNN to estimate gaze in still images. Then, we exploit the dynamic nature of gaze by feeding the learned features of all the frames in a sequence to a many-to-one recurrent module that predicts the 3D gaze vector of the last frame. Our multi-modal static solution is evaluated on a wide range of head poses and gaze directions, achieving a significant improvement of 14.6% over the state of the art on EYEDIAP dataset, further improved by 4% when the temporal modality is included.
Address	Newcastle; UK; September 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	BMVC
Notes	HUPBA; no proj			Approved	no
Call Number	Admin @ si @ PSB2018			Serial	3208



Author	Yagmur Gucluturk; Umut Guclu; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon; Sergio Escalera; Marcel A. J. van Gerven; Rob van Lier
Title	Multimodal First Impression Analysis with Deep Residual Networks			Type	Journal Article
Year	2018	Publication	IEEE Transactions on Affective Computing	Abbreviated Journal	TAC
Volume	8	Issue	3	Pages	316-329
Keywords
Abstract	People form first impressions about the personalities of unfamiliar individuals even after very brief interactions with them. In this study we present and evaluate several models that mimic this automatic social behavior. Specifically, we present several models trained on a large dataset of short YouTube video blog posts for predicting apparent Big Five personality traits of people and whether they seem suitable to be recommended to a job interview. Along with presenting our audiovisual approach and results that won third place in the ChaLearn First Impressions Challenge, we investigate modeling in different modalities including audio only, visual only, language only, audiovisual, and combination of audiovisual and language. Our results demonstrate that the best performance could be obtained using a fusion of all data modalities. Finally, in order to promote explainability in machine learning and to provide an example for the upcoming ChaLearn challenges, we present a simple approach for explaining the predictions for job interview recommendations.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	HUPBA; no proj			Approved	no
Call Number	Admin @ si @ GGB2018			Serial	3210



Author	Gabriela Ramirez; Esau Villatoro; Bogdan Ionescu; Hugo Jair Escalante; Sergio Escalera; Martha Larson; Henning Muller; Isabelle Guyon
Title	Overview of the Multimedia Information Processing for Personality & Social Networks Analysis Contest			Type	Conference Article
Year	2018	Publication	Multimedia Information Processing for Personality and Social Networks Analysis (MIPPSNA 2018)	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Beijing; China; August 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICPRW
Notes	HUPBA			Approved	no
Call Number	Admin @ si @ RVI2018			Serial	3211



Author	Ester Fornells; Manuel De Armas; Maria Teresa Anguera; Sergio Escalera; Marcos Antonio Catalán; Josep Moya
Title	Desarrollo del proyecto del Consell Comarcal del Baix Llobregat “Buen Trato a las personas mayores y aquellas en situación de fragilidad con sufrimiento emocional: Hacia un envejecimiento saludable”			Type	Journal
Year	2018	Publication	Informaciones Psiquiatricas	Abbreviated Journal
Volume	232	Issue		Pages	47-59
Keywords
Abstract
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	0210-7279	ISBN		Medium
Area		Expedition		Conference
Notes	HUPBA; no menciona			Approved	no
Call Number	Admin @ si @ FAA2018			Serial	3214