Records
Author Xinhang Song; Shuqiang Jiang; Luis Herranz
Title Combining Models from Multiple Sources for RGB-D Scene Recognition Type Conference Article
Year 2017 Publication 26th International Joint Conference on Artificial Intelligence Abbreviated Journal
Volume Issue Pages 4523-4529
Keywords Robotics and Vision; Vision and Perception
Abstract Depth can complement RGB with useful cues about object volumes and scene layout. However, RGB-D image datasets are still too small for directly training deep convolutional neural networks (CNNs), in contrast to the massive monomodal RGB datasets. Previous works in RGB-D recognition typically combine two separate networks for RGB and depth data, pretrained with a large RGB dataset and then fine-tuned to the respective target RGB and depth datasets. These approaches have several limitations: 1) they only use low-level filters learned from RGB data, and thus cannot properly exploit depth-specific patterns, and 2) RGB and depth features are combined only at high levels but rarely at lower levels. In this paper, we propose a framework that leverages knowledge acquired from large RGB datasets together with depth-specific cues learned from the limited depth data, obtaining more effective multi-source and multi-modal representations. We propose a multi-modal combination method that selects discriminative combinations of layers from the different source models and target modalities, capturing both high-level properties of the task and intrinsic low-level properties of both modalities.
Address Melbourne; Australia; August 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference IJCAI
Notes LAMP; 600.120 Approved no
Call Number (up) Admin @ si @ SJH2017b Serial 2966
Permanent link to this record
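The layer-selection idea in the abstract above, choosing a discriminative combination of layers across source models and modalities, can be sketched as an exhaustive search. The layer names, the `score_fn` callback, and the `max_layers` cap below are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations

def select_layer_combination(layers, score_fn, max_layers=3):
    """Score every combination of up to max_layers candidate layers
    (drawn from the different source models and modalities) and keep
    the most discriminative one according to score_fn."""
    best, best_score = None, float("-inf")
    for r in range(1, max_layers + 1):
        for combo in combinations(layers, r):
            s = score_fn(combo)
            if s > best_score:
                best, best_score = combo, s
    return best
```

In practice `score_fn` would be, e.g., validation accuracy of a classifier trained on the concatenated layer features; here any callable over a tuple of layer names works.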
 

 
Author Xinhang Song; Shuqiang Jiang; Luis Herranz; Chengpeng Chen
Title Learning Effective RGB-D Representations for Scene Recognition Type Journal Article
Year 2019 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP
Volume 28 Issue 2 Pages 980-993
Keywords
Abstract Deep convolutional networks can achieve impressive results on RGB scene recognition thanks to large data sets such as Places. In contrast, RGB-D scene recognition is still underdeveloped, due to two limitations of RGB-D data that we address in this paper. The first limitation is the lack of depth data for training deep learning models. Rather than fine-tuning or transferring RGB-specific features, we address this limitation by proposing an architecture and a two-step training approach that directly learns effective depth-specific features using weak supervision via patches. The resulting RGB-D model also benefits from more complementary multimodal features. Another limitation is the short range of depth sensors (typically 0.5 m to 5.5 m), resulting in depth images not capturing the distant objects in the scenes that RGB images can. We show that this limitation can be addressed by using RGB-D videos, where more comprehensive depth information is accumulated as the camera travels across the scenes. Focusing on this scenario, we introduce the ISIA RGB-D video data set to evaluate RGB-D scene recognition with videos. Our video recognition architecture combines convolutional and recurrent neural networks that are trained in three steps with increasingly complex data to learn effective features (i.e., patches, frames, and sequences). Our approach obtains state-of-the-art performance on RGB-D image (NYUD2 and SUN RGB-D) and video (ISIA RGB-D) scene recognition.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 600.141; 600.120 Approved no
Call Number (up) Admin @ si @ SJH2019 Serial 3247
Permanent link to this record
 

 
Author Md. Mostafa Kamal Sarker; Mohammed Jabreel; Hatem A. Rashwan; Syeda Furruka Banu; Petia Radeva; Domenec Puig
Title CuisineNet: Food Attributes Classification using Multi-scale Convolution Network Type Conference Article
Year 2018 Publication 21st International Conference of the Catalan Association for Artificial Intelligence Abbreviated Journal
Volume Issue Pages 365-372
Keywords
Abstract Diversity of food and its attributes represents the culinary habits of peoples from different countries. Thus, this paper addresses the problem of identifying the food culture of people around the world and its flavor by classifying two main food attributes, cuisine and flavor. A deep learning model based on multi-scale convolutional networks is proposed for extracting more accurate features from input images. The aggregation of multi-scale convolution layers with different kernel sizes is also used to weight the features resulting from different scales. In addition, a joint loss function based on Negative Log-Likelihood (NLL) is used to fit the model probability to the multi-labeled classes for the multi-modal classification task. Furthermore, this work provides a new dataset of food attributes, called Yummly48K, extracted from the popular food website Yummly. Our model is assessed on the constructed Yummly48K dataset. The experimental results show that our proposed method yields 65% and 62% average F1 score on the validation and test sets, outperforming the state-of-the-art models.
Address Roses; Catalonia; October 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CCIA
Notes MILAB; not mentioned Approved no
Call Number (up) Admin @ si @ SJR2018 Serial 3113
Permanent link to this record
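The joint NLL loss mentioned in the abstract above, fitting two attribute heads (cuisine and flavor) at once, can be sketched in plain Python as the sum of per-head negative log-likelihoods. The two-logit-list interface is an assumption for illustration, not the paper's actual training code.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def nll(logits, target):
    # Negative log-likelihood of the target class.
    return -math.log(softmax(logits)[target])

def joint_nll(cuisine_logits, cuisine_target, flavor_logits, flavor_target):
    # Joint loss for the multi-attribute task: sum of per-head NLL terms.
    return nll(cuisine_logits, cuisine_target) + nll(flavor_logits, flavor_target)
```

With uniform logits each head contributes log(K) nats for K classes, which gives a quick sanity check on the loss scale at initialization.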
 

 
Author Mohamed Ali Souibgui; Y.Kessentini; Alicia Fornes
Title A conditional GAN based approach for distorted camera captured documents recovery Type Conference Article
Year 2020 Publication 4th Mediterranean Conference on Pattern Recognition and Artificial Intelligence Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Virtual; December 2020
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference MedPRAI
Notes DAG; 600.121 Approved no
Call Number (up) Admin @ si @ SKF2020 Serial 3450
Permanent link to this record
 

 
Author Ariel Amato; Angel Sappa; Alicia Fornes; Felipe Lumbreras; Josep Llados
Title Divide and Conquer: Atomizing and Parallelizing A Task in A Mobile Crowdsourcing Platform Type Conference Article
Year 2013 Publication 2nd International ACM Workshop on Crowdsourcing for Multimedia Abbreviated Journal
Volume Issue Pages 21-22
Keywords
Abstract In this paper we present some conclusions about the advantages of an efficient task formulation when a crowdsourcing platform is used. In particular, we show how task atomization and distribution can help to obtain results efficiently. Our proposal is based on recursively splitting the original task into a set of smaller and simpler tasks. As a result, both more accurate and faster solutions are obtained. Our evaluation is performed on a set of ancient documents that need to be digitized.
Address Barcelona; October 2013
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4503-2396-3 Medium
Area Expedition Conference CrowdMM
Notes ADAS; ISE; DAG; 600.054; 600.055; 600.045; 600.061; 602.006 Approved no
Call Number (up) Admin @ si @ SLA2013 Serial 2335
Permanent link to this record
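The recursive atomization described in the abstract above can be sketched as follows; representing a task as a page range and the `max_size` threshold are illustrative assumptions, not the paper's formulation.

```python
def atomize(task, max_size):
    """Recursively split a task (here: a half-open page range) until
    each piece spans at most max_size pages, yielding atomic subtasks
    that can be distributed to crowd workers in parallel."""
    start, end = task
    if end - start <= max_size:
        return [task]
    mid = (start + end) // 2
    return atomize((start, mid), max_size) + atomize((mid, end), max_size)
```

Each leaf of the recursion is an independent unit of work, so the resulting list can be posted to the platform as parallel micro-tasks.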
 

 
Author Joan Serrat; Felipe Lumbreras; Francisco Blanco; Manuel Valiente; Montserrat Lopez-Mesas
Title myStone: A system for automatic kidney stone classification Type Journal Article
Year 2017 Publication Expert Systems with Applications Abbreviated Journal ESA
Volume 89 Issue Pages 41-51
Keywords Kidney stone; Optical device; Computer vision; Image classification
Abstract Kidney stone formation is a common disease and the incidence rate is constantly increasing worldwide. It has been shown that the classification of kidney stones can lead to an important reduction of the recurrence rate. The classification of kidney stones by human experts on the basis of certain visual color and texture features is one of the most employed techniques. However, the knowledge of how to analyze kidney stones is not widespread, and the experts learn only after being trained on a large number of samples of the different classes. In this paper we describe a new device specifically designed for capturing images of expelled kidney stones, and a method to learn and apply the experts' knowledge with regard to their classification. We show that with off-the-shelf components, a carefully selected set of features and a state-of-the-art classifier it is possible to automate this difficult task to a good degree. We report results on a collection of 454 kidney stones, achieving an overall accuracy of 63% for a set of eight classes covering almost all of the kidney stone taxonomy. Moreover, for more than 80% of the samples the real class is the first or second most probable class according to the system, and the patient recommendations for the two top classes are then similar. This is the first attempt towards the automatic visual classification of kidney stones, and based on the current results we foresee better accuracies as the dataset size increases.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; MSIAU; 603.046; 600.122; 600.118 Approved no
Call Number (up) Admin @ si @ SLB2017 Serial 3026
Permanent link to this record
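The top-2 evaluation reported in the abstract above (real class among the two most probable classes for over 80% of samples) corresponds to a standard top-k accuracy, which can be sketched in plain Python; the list-of-probabilities interface is an assumption for illustration.

```python
def top_k_correct(probs, true_class, k=2):
    # True when the real class is among the k most probable classes.
    ranked = sorted(range(len(probs)), key=lambda c: probs[c], reverse=True)
    return true_class in ranked[:k]

def top_k_accuracy(prob_rows, labels, k=2):
    # Fraction of samples whose true class is within the top-k predictions.
    hits = sum(top_k_correct(p, y, k) for p, y in zip(prob_rows, labels))
    return hits / len(labels)
```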
 

 
Author Damian Sojka; Yuyang Liu; Dipam Goswami; Sebastian Cygert; Bartłomiej Twardowski; Joost van de Weijer
Title Technical Report for ICCV 2023 Visual Continual Learning Challenge: Continuous Test-time Adaptation for Semantic Segmentation Type Miscellaneous
Year 2023 Publication Arxiv Abbreviated Journal
Volume Issue Pages
Keywords
Abstract The goal of the challenge is to develop a test-time adaptation (TTA) method that can adapt the model to gradually changing domains in video sequences for the semantic segmentation task. It is based on a synthetic driving video dataset, SHIFT. The source model is trained on images taken during daytime in clear weather. Domain changes at test time are mainly caused by varying weather conditions and times of day. The TTA methods are evaluated on each image sequence (video) separately, meaning the model is reset to the source model state before the next sequence. Images come one by one and a prediction has to be made at the arrival of each frame. Each sequence is composed of 401 images and starts with the source domain, then gradually drifts to a different one (changing weather or time of day) until the middle of the sequence. In the second half of the sequence, the domain gradually shifts back to the source one. Ground truth data is available only for the validation split of the SHIFT dataset, in which there are only six sequences that start and end with the source domain. We conduct an analysis specifically on those sequences. Ground truth data for the test split, on which the developed TTA methods are evaluated for leaderboard ranking, is not publicly available.
The proposed solution secured 3rd place in the challenge and received an innovation award. Contrary to the solutions that scored better, we did not use any external pretrained models or specialized data augmentations, in order to keep the solution as general as possible. We focused on analyzing the distributional shift and developing a method that can adapt to changing data dynamics and generalize across different scenarios.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP Approved no
Call Number (up) Admin @ si @ SLG2023 Serial 3993
Permanent link to this record
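The per-sequence evaluation protocol in the abstract above (reset to the source model before each video, then predict frame by frame as images arrive) can be sketched generically; the `adapt_and_predict` callback and dict-shaped model are placeholders, not the challenge's actual API.

```python
import copy

def evaluate_sequences(source_model, sequences, adapt_and_predict):
    """Evaluate a test-time-adaptation method: the model is reset to
    the source state before each sequence, then frames arrive one by
    one and a prediction is made at each arrival (the callback may
    mutate the model, i.e. adapt it online)."""
    all_predictions = []
    for frames in sequences:
        model = copy.deepcopy(source_model)  # reset to the source state
        preds = [adapt_and_predict(model, f) for f in frames]
        all_predictions.append(preds)
    return all_predictions
```

The deep copy guarantees that adaptation in one sequence never leaks into the next, matching the reset rule of the challenge.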
 

 
Author Joan Serrat; Felipe Lumbreras; Antonio Lopez
Title Cost estimation of custom hoses from STL files and CAD drawings Type Journal Article
Year 2013 Publication Computers in Industry Abbreviated Journal COMPUTIND
Volume 64 Issue 3 Pages 299-309
Keywords On-line quotation; STL format; Regression; Gaussian process
Abstract We present a method for the cost estimation of custom hoses from CAD models. They can come in two formats, which are easy to generate: an STL file or the image of a CAD drawing showing several orthogonal projections. The challenges in either case are, first, to obtain from them a high-level 3D description of the shape, and second, to learn a regression function for the prediction of the manufacturing time, based on geometric features of the reconstructed shape. The chosen description is the 3D line along the medial axis of the tube and the diameter of the circular sections along it. In order to extract it from STL files, we have adapted RANSAC, a robust parametric fitting algorithm. As for CAD drawing images, we propose a new technique for 3D reconstruction from data entered on any number of orthogonal projections. The regression function is a Gaussian process, which does not constrain the function to adopt any specific form and is governed by just two parameters. We assess the accuracy of the manufacturing time estimation by k-fold cross validation on 171 STL file models for which the time is provided by an expert. The results show the feasibility of the method, whereby the relative error for 80% of the testing samples is below 15%.
Address
Corporate Author Thesis
Publisher Elsevier Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.057; 600.054; 605.203 Approved no
Call Number (up) Admin @ si @ SLL2013; ADAS @ adas @ Serial 2161
Permanent link to this record
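The evaluation protocol in the abstract above (k-fold cross validation plus the share of samples whose relative error stays under a tolerance) can be sketched in plain Python; the round-robin fold assignment and the 15% threshold as a parameter are illustrative choices, not the paper's exact setup.

```python
def k_fold_indices(n, k):
    # Partition sample indices 0..n-1 into k roughly equal folds
    # (round-robin assignment; each fold serves once as the test set).
    folds = [[] for _ in range(k)]
    for i in range(n):
        folds[i % k].append(i)
    return folds

def fraction_within(pred_times, true_times, tolerance=0.15):
    # Share of samples whose relative error is below the tolerance.
    ok = sum(abs(p - t) / t < tolerance for p, t in zip(pred_times, true_times))
    return ok / len(true_times)
```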
 

 
Author Hans Stadthagen-Gonzalez; Luis Lopez; M. Carmen Parafita; C. Alejandro Parraga
Title Using two-alternative forced choice tasks and Thurstone law of comparative judgments for code-switching research Type Book Chapter
Year 2018 Publication Linguistic Approaches to Bilingualism Abbreviated Journal
Volume Issue Pages 67-97
Keywords two-alternative forced choice and Thurstone's law; acceptability judgment; code-switching
Abstract This article argues that two-alternative forced choice tasks and Thurstone's law of comparative judgments (Thurstone, 1927) are well suited to investigating code-switching competence by means of acceptability judgments. We compare this method with commonly used Likert-scale judgments and find that the two-alternative forced choice task provides granular details that remain invisible in a Likert-scale experiment. In order to compare and contrast both methods, we examined the syntactic phenomenon usually referred to as the Adjacency Condition (AC) (apud Stowell, 1981), which imposes a condition of adjacency between verb and object. Our interest in the AC comes from the fact that it is a subtle feature of English grammar which is absent in Spanish, and this provides an excellent springboard to create minimal code-switched pairs that allow us to formulate a clear research question that can be tested using both methods.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes NEUROBIT; not mentioned Approved no
Call Number (up) Admin @ si @ SLP2018 Serial 2994
Permanent link to this record
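Thurstone's Case V scaling, the method the chapter above advocates, turns pairwise preference proportions from 2AFC trials into interval scale values. A minimal stdlib sketch (using `statistics.NormalDist` for the inverse normal CDF; the row-mean formulation is the standard Case V recipe, not code from the chapter):

```python
from statistics import NormalDist

def thurstone_case_v(pref_matrix):
    """Thurstone Case V scale values from pairwise preference
    proportions: pref_matrix[i][j] is the proportion of judges
    preferring item i over item j. Each item's scale value is the
    mean of the z-transformed proportions in its row (diagonal
    excluded); proportions of exactly 0 or 1 must be smoothed first."""
    z = NormalDist().inv_cdf
    n = len(pref_matrix)
    return [sum(z(pref_matrix[i][j]) for j in range(n) if j != i) / (n - 1)
            for i in range(n)]
```

The resulting values are defined only up to a shift, so for two items they come out symmetric around zero.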
 

 
Author Joan Serrat; Felipe Lumbreras; Idoia Ruiz
Title Learning to measure for preshipment garment sizing Type Journal Article
Year 2018 Publication Measurement Abbreviated Journal MEASURE
Volume 130 Issue Pages 327-339
Keywords Apparel; Computer vision; Structured prediction; Regression
Abstract Clothing is still manually manufactured for the most part nowadays, resulting in discrepancies between nominal and real dimensions, and potentially ill-fitting garments. Hence, it is common in the apparel industry to manually perform measures at preshipment time. We present an automatic method to obtain such measures from a single image of a garment that speeds up this task. It is generic and extensible in the sense that it does not depend explicitly on the garment shape or type. Instead, it learns through a probabilistic graphical model to identify the different contour parts. Subsequently, a set of Lasso regressors, one per desired measure, can predict the actual values of the measures. We present results on a dataset of 130 images of jackets and 98 of pants, of varying sizes and styles, obtaining 1.17 and 1.22 cm of mean absolute error, respectively.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; MSIAU; 600.122; 600.118 Approved no
Call Number (up) Admin @ si @ SLR2018 Serial 3128
Permanent link to this record
 

 
Author Xavier Soria; Yachuan Li; Mohammad Rouhani; Angel Sappa
Title Tiny and Efficient Model for the Edge Detection Generalization Type Conference Article
Year 2023 Publication Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Most high-level computer vision tasks rely on low-level image operations as their initial processes. Operations such as edge detection, image enhancement, and super-resolution provide the foundations for higher-level image analysis. In this work we address edge detection considering three main objectives: simplicity, efficiency, and generalization, since current state-of-the-art (SOTA) edge detection models grow in complexity for better accuracy. To achieve this, we present the Tiny and Efficient Edge Detector (TEED), a light convolutional neural network with only 58K parameters, less than 0.2% of the state-of-the-art models. Training on the BIPED dataset takes less than 30 minutes, with each epoch requiring less than 5 minutes. Our proposed model is easy to train and quickly converges within the first few epochs, while the predicted edge maps are crisp and of high quality. Additionally, we propose a new dataset to test the generalization of edge detection, which comprises samples from popular images used in edge detection and image segmentation. The source code is available in https://github.com/xavysp/TEED.
Address Paris; France; October 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICCVW
Notes MSIAU Approved no
Call Number (up) Admin @ si @ SLR2023 Serial 3941
Permanent link to this record
 

 
Author Aleksandr Setkov; Fabio Martinez Carillo; Michele Gouiffes; Christian Jacquemin; Maria Vanrell; Ramon Baldrich
Title DAcImPro: A Novel Database of Acquired Image Projections and Its Application to Object Recognition Type Conference Article
Year 2015 Publication Advances in Visual Computing. Proceedings of 11th International Symposium, ISVC 2015 Part II Abbreviated Journal
Volume 9475 Issue Pages 463-473
Keywords Projector-camera systems; Feature descriptors; Object recognition
Abstract Projector-camera systems are designed to improve the projection quality by comparing original images with their captured projections, which is usually complicated due to high photometric and geometric variations. Many research works address this problem using their own test data, which makes it extremely difficult to compare different proposals. This paper has two main contributions. Firstly, we introduce a new database of acquired image projections (DAcImPro) that, covering photometric and geometric conditions and providing data for ground-truth computation, can serve to evaluate different algorithms in projector-camera systems. Secondly, a new object recognition scenario from acquired projections is presented, which could be of great interest in such domains as home video projections and public presentations. We show that the task is more challenging than the classical recognition problem and thus requires additional pre-processing, such as color compensation or projection area selection.
Address
Corporate Author Thesis
Publisher Springer International Publishing Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-319-27862-9 Medium
Area Expedition Conference ISVC
Notes CIC Approved no
Call Number (up) Admin @ si @ SMG2015 Serial 2736
Permanent link to this record
 

 
Author D. Smith
Title Solving the mean string problem for 2D shapes Type Report
Year 1999 Publication CVC Technical Report #36 Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address CVC (UAB)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes Approved no
Call Number (up) Admin @ si @ Smi1999 Serial 195
Permanent link to this record
 

 
Author David Sanchez-Mendoza; David Masip; Agata Lapedriza
Title Emotion recognition from mid-level features Type Journal Article
Year 2015 Publication Pattern Recognition Letters Abbreviated Journal PRL
Volume 67 Issue Part 1 Pages 66–74
Keywords Facial expression; Emotion recognition; Action units; Computer vision
Abstract In this paper we present a study on the use of Action Units as mid-level features for automatically recognizing basic and subtle emotions. We propose a representation model based on mid-level facial muscular movement features. We encode these movements dynamically using the Facial Action Coding System, and propose to use these intermediate features based on Action Units (AUs) to classify emotions. AU activations are detected by fusing a set of spatiotemporal geometric and appearance features. The algorithm is validated in two applications: (i) the recognition of 7 basic emotions using the publicly available Cohn-Kanade database, and (ii) the inference of subtle emotional cues in the Newscast database. In this second scenario, we consider emotions that are perceived cumulatively over longer periods of time. In particular, we automatically classify whether video shots from public news TV channels refer to Good or Bad news. To deal with the different video lengths we propose a Histogram of Action Units and compute it using a sliding window strategy on the frame sequences. Our approach achieves accuracies close to human perception.
Address
Corporate Author Thesis
Publisher Elsevier B.V. Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0167-8655 ISBN Medium
Area Expedition Conference
Notes OR;MV Approved no
Call Number (up) Admin @ si @ SML2015 Serial 2746
Permanent link to this record
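The Histogram of Action Units with a sliding window, as described in the abstract above, can be sketched in plain Python; representing each frame as a set of active AU indices and normalizing by window length are illustrative assumptions, not the paper's exact definition.

```python
def au_histograms(frame_aus, num_aus, window, step):
    """Histogram of Action Units over a sliding window: frame_aus is
    a list of per-frame sets of active AU indices; each window yields
    the fraction of its frames in which every AU was active."""
    hists = []
    for start in range(0, len(frame_aus) - window + 1, step):
        counts = [0] * num_aus
        for frame in frame_aus[start:start + window]:
            for au in frame:
                counts[au] += 1
        hists.append([c / window for c in counts])
    return hists
```

Each histogram can then be fed to a per-window classifier, so videos of different lengths simply yield different numbers of windows.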
 

 
Author Daniel Sanchez; Meysam Madadi; Marc Oliu; Sergio Escalera
Title Multi-task human analysis in still images: 2D/3D pose, depth map, and multi-part segmentation Type Conference Article
Year 2019 Publication 14th IEEE International Conference on Automatic Face and Gesture Recognition Abbreviated Journal
Volume Issue Pages
Keywords
Abstract While many individual tasks in the domain of human analysis have recently received an accuracy boost from deep learning approaches, multi-task learning has mostly been ignored due to a lack of data. New synthetic datasets are being released, filling this gap with synthetically generated data. In this work, we analyze four related human analysis tasks in still images in a multi-task scenario by leveraging such datasets. Specifically, we study the correlation of 2D/3D pose estimation, body part segmentation and full-body depth estimation. These tasks are learned via the well-known Stacked Hourglass module such that each of the task-specific streams shares information with the others. The main goal is to analyze how training these four related tasks together can benefit each individual task for better generalization. Results on the newly released SURREAL dataset show that all four tasks benefit from the multi-task approach, but with different combinations of tasks: while combining all four tasks improves 2D pose estimation the most, 2D pose improves neither 3D pose nor full-body depth estimation. On the other hand, 2D part segmentation can benefit from 2D pose but not from 3D pose. In all cases, as expected, the maximum improvement is achieved on those human body parts that show more variability in terms of spatial distribution, appearance and shape, e.g. wrists and ankles.
Address Lille; France; May 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference FG
Notes HUPBA; no proj Approved no
Call Number (up) Admin @ si @ SMO2019 Serial 3326
Permanent link to this record