Publicacions CVC -- Query Results

	Publicacions CVC Home \| Show All \| Simple Search \| Advanced Search \| Add Record \| Import	Login Quick Search: Field: contains: ...
	16–30 of 157 records found matching your query (RSS \| history):

Search & Display Options

Select All Deselect All

<< 1 2 3 4 5 6 7 8 9 10 >> [11–11]

List View

Citations

Details

	Records
	Author	Weiqing Min; Shuqiang Jiang; Jitao Sang; Huayang Wang; Xinda Liu; Luis Herranz
	Title	Being a Supercook: Joint Food Attributes and Multimodal Content Modeling for Recipe Retrieval and Exploration			Type	Journal Article
	Year	2017	Publication	IEEE Transactions on Multimedia	Abbreviated Journal	TMM
	Volume	19	Issue	5	Pages	1100 - 1113
	Keywords
	Abstract	This paper considers the problem of recipe-oriented image-ingredient correlation learning with multi-attributes for recipe retrieval and exploration. Existing methods mainly focus on food visual information for recognition while we model visual information, textual content (e.g., ingredients), and attributes (e.g., cuisine and course) together to solve extended recipe-oriented problems, such as multimodal cuisine classification and attribute-enhanced food image retrieval. As a solution, we propose a multimodal multitask deep belief network (M3TDBN) to learn joint image-ingredient representation regularized by different attributes. By grouping ingredients into visible ingredients (which are visible in the food image, e.g., “chicken” and “mushroom”) and nonvisible ingredients (e.g., “salt” and “oil”), M3TDBN is capable of learning both midlevel visual representation between images and visible ingredients and nonvisual representation. Furthermore, in order to utilize different attributes to improve the intermodality correlation, M3TDBN incorporates multitask learning to make different attributes collaborate each other. Based on the proposed M3TDBN, we exploit the derived deep features and the discovered correlations for three extended novel applications: 1) multimodal cuisine classification; 2) attribute-augmented cross-modal recipe image retrieval; and 3) ingredient and attribute inference from food images. The proposed approach is evaluated on the constructed Yummly dataset and the evaluation results have validated the effectiveness of the proposed approach.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP; 600.120			Approved	no
	Call Number	Admin @ si @ MJS2017			Serial	2964
Permanent link to this record



	Author	Luis Herranz; Shuqiang Jiang; Ruihan Xu
	Title	Modeling Restaurant Context for Food Recognition			Type	Journal Article
	Year	2017	Publication	IEEE Transactions on Multimedia	Abbreviated Journal	TMM
	Volume	19	Issue	2	Pages	430 - 440
	Keywords
	Abstract	Food photos are widely used in food logs for diet monitoring and in social networks to share social and gastronomic experiences. A large number of these images are taken in restaurants. Dish recognition in general is very challenging, due to different cuisines, cooking styles, and the intrinsic difficulty of modeling food from its visual appearance. However, contextual knowledge can be crucial to improve recognition in such scenario. In particular, geocontext has been widely exploited for outdoor landmark recognition. Similarly, we exploit knowledge about menus and location of restaurants and test images. We first adapt a framework based on discarding unlikely categories located far from the test image. Then, we reformulate the problem using a probabilistic model connecting dishes, restaurants, and locations. We apply that model in three different tasks: dish recognition, restaurant recognition, and location refinement. Experiments on six datasets show that by integrating multiple evidences (visual, location, and external knowledge) our system can boost the performance in all tasks.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP; 600.120			Approved	no
	Call Number	Admin @ si @ HJX2017			Serial	2965
Permanent link to this record



	Author	Jorge Bernal; Nima Tajkbaksh; F. Javier Sanchez; Bogdan J. Matuszewski; Hao Chen; Lequan Yu; Quentin Angermann; Olivier Romain; Bjorn Rustad; Ilangko Balasingham; Konstantin Pogorelov; Sungbin Choi; Quentin Debard; Lena Maier Hein; Stefanie Speidel; Danail Stoyanov; Patrick Brandao; Henry Cordova; Cristina Sanchez Montes; Suryakanth R. Gurudu; Gloria Fernandez Esparrach; Xavier Dray; Jianming Liang; Aymeric Histace
	Title	Comparative Validation of Polyp Detection Methods in Video Colonoscopy: Results from the MICCAI 2015 Endoscopic Vision Challenge			Type	Journal Article
	Year	2017	Publication	IEEE Transactions on Medical Imaging	Abbreviated Journal	TMI
	Volume	36	Issue	6	Pages	1231 - 1249
	Keywords	Endoscopic vision; Polyp Detection; Handcrafted features; Machine Learning; Validation Framework
	Abstract	Colonoscopy is the gold standard for colon cancer screening though still some polyps are missed, thus preventing early disease detection and treatment. Several computational systems have been proposed to assist polyp detection during colonoscopy but so far without consistent evaluation. The lack of publicly available annotated databases has made it difficult to compare methods and to assess if they achieve performance levels acceptable for clinical use. The Automatic Polyp Detection subchallenge, conducted as part of the Endoscopic Vision Challenge (http://endovis.grand-challenge.org) at the international conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2015, was an effort to address this need. In this paper, we report the results of this comparative evaluation of polyp detection methods, as well as describe additional experiments to further explore differences between methods. We define performance metrics and provide evaluation databases that allow comparison of multiple methodologies. Results show that convolutional neural networks (CNNs) are the state of the art. Nevertheless it is also demonstrated that combining different methodologies can lead to an improved overall performance.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MV; 600.096; 600.075			Approved	no
	Call Number	Admin @ si @ BTS2017			Serial	2949
Permanent link to this record



	Author	Mikhail Mozerov; Joost Van de Weijer
	Title	Improved Recursive Geodesic Distance Computation for Edge Preserving Filter			Type	Journal Article
	Year	2017	Publication	IEEE Transactions on Image Processing	Abbreviated Journal	TIP
	Volume	26	Issue	8	Pages	3696 - 3706
	Keywords	Geodesic distance filter; color image filtering; image enhancement
	Abstract	All known recursive filters based on the geodesic distance affinity are realized by two 1D recursions applied in two orthogonal directions of the image plane. The 2D extension of the filter is not valid and has theoretically drawbacks, which lead to known artifacts. In this paper, a maximum influence propagation method is proposed to approximate the 2D extension for the geodesic distance-based recursive filter. The method allows to partially overcome the drawbacks of the 1D recursion approach. We show that our improved recursion better approximates the true geodesic distance filter, and the application of this improved filter for image denoising outperforms the existing recursive implementation of the geodesic distance. As an application, we consider a geodesic distance-based filter for image denoising. Experimental evaluation of our denoising method demonstrates comparable and for several test images better results, than stateof-the-art approaches, while our algorithm is considerably fasterwith computational complexity O(8P).
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP; ISE; 600.120; 600.098; 600.119			Approved	no
	Call Number	Admin @ si @ Moz2017			Serial	2921
Permanent link to this record



	Author	Xinhang Song; Shuqiang Jiang; Luis Herranz
	Title	Multi-Scale Multi-Feature Context Modeling for Scene Recognition in the Semantic Manifold			Type	Journal Article
	Year	2017	Publication	IEEE Transactions on Image Processing	Abbreviated Journal	TIP
	Volume	26	Issue	6	Pages	2721-2735
	Keywords
	Abstract	Before the big data era, scene recognition was often approached with two-step inference using localized intermediate representations (objects, topics, and so on). One of such approaches is the semantic manifold (SM), in which patches and images are modeled as points in a semantic probability simplex. Patch models are learned resorting to weak supervision via image labels, which leads to the problem of scene categories co-occurring in this semantic space. Fortunately, each category has its own co-occurrence patterns that are consistent across the images in that category. Thus, discovering and modeling these patterns are critical to improve the recognition performance in this representation. Since the emergence of large data sets, such as ImageNet and Places, these approaches have been relegated in favor of the much more powerful convolutional neural networks (CNNs), which can automatically learn multi-layered representations from the data. In this paper, we address many limitations of the original SM approach and related works. We propose discriminative patch representations using neural networks and further propose a hybrid architecture in which the semantic manifold is built on top of multiscale CNNs. Both representations can be computed significantly faster than the Gaussian mixture models of the original SM. To combine multiple scales, spatial relations, and multiple features, we formulate rich context models using Markov random fields. To solve the optimization problem, we analyze global and local approaches, where a top-down hierarchical algorithm has the best performance. Experimental results show that exploiting different types of contextual relations jointly consistently improves the recognition accuracy.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	LAMP; 600.120			Approved	no
	Call Number	Admin @ si @ SJH2017a			Serial	2963
Permanent link to this record



	Author	Marc Bolaños; Mariella Dimiccoli; Petia Radeva
	Title	Towards Storytelling from Visual Lifelogging: An Overview			Type	Journal Article
	Year	2017	Publication	IEEE Transactions on Human-Machine Systems	Abbreviated Journal	THMS
	Volume	47	Issue	1	Pages	77 - 90
	Keywords
	Abstract	Visual lifelogging consists of acquiring images that capture the daily experiences of the user by wearing a camera over a long period of time. The pictures taken offer considerable potential for knowledge mining concerning how people live their lives, hence, they open up new opportunities for many potential applications in fields including healthcare, security, leisure and the quantified self. However, automatically building a story from a huge collection of unstructured egocentric data presents major challenges. This paper provides a thorough review of advances made so far in egocentric data analysis, and in view of the current state of the art, indicates new lines of research to move us towards storytelling from visual lifelogging.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MILAB; 601.235			Approved	no
	Call Number	Admin @ si @ BDR2017			Serial	2712
Permanent link to this record



	Author	Pau Rodriguez; Guillem Cucurull; Jordi Gonzalez; Josep M. Gonfaus; Kamal Nasrollahi; Thomas B. Moeslund; Xavier Roca
	Title	Deep Pain: Exploiting Long Short-Term Memory Networks for Facial Expression Classification			Type	Journal Article
	Year	2017	Publication	IEEE Transactions on cybernetics	Abbreviated Journal	Cyber
	Volume		Issue		Pages	1-11
	Keywords
	Abstract	Pain is an unpleasant feeling that has been shown to be an important factor for the recovery of patients. Since this is costly in human resources and difficult to do objectively, there is the need for automatic systems to measure it. In this paper, contrary to current state-of-the-art techniques in pain assessment, which are based on facial features only, we suggest that the performance can be enhanced by feeding the raw frames to deep learning models, outperforming the latest state-of-the-art results while also directly facing the problem of imbalanced data. As a baseline, our approach first uses convolutional neural networks (CNNs) to learn facial features from VGG_Faces, which are then linked to a long short-term memory to exploit the temporal relation between video frames. We further compare the performances of using the so popular schema based on the canonically normalized appearance versus taking into account the whole image. As a result, we outperform current state-of-the-art area under the curve performance in the UNBC-McMaster Shoulder Pain Expression Archive Database. In addition, to evaluate the generalization properties of our proposed methodology on facial motion recognition, we also report competitive results in the Cohn Kanade+ facial expression database.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	ISE; 600.119; 600.098			Approved	no
	Call Number	Admin @ si @ RCG2017a			Serial	2926
Permanent link to this record



	Author	Alejandro Gonzalez Alzate; David Vazquez; Antonio Lopez; Jaume Amores
	Title	On-Board Object Detection: Multicue, Multimodal, and Multiview Random Forest of Local Experts			Type	Journal Article
	Year	2017	Publication	IEEE Transactions on cybernetics	Abbreviated Journal	Cyber
	Volume	47	Issue	11	Pages	3980 - 3990
	Keywords	Multicue; multimodal; multiview; object detection
	Abstract	Despite recent significant advances, object detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage upon multiple cues, multiple imaging modalities, and a strong multiview (MV) classifier that accounts for different object views and poses. In this paper, we provide an extensive evaluation that gives insight into how each of these aspects (multicue, multimodality, and strong MV classifier) affect accuracy both individually and when integrated together. In the multimodality component, we explore the fusion of RGB and depth maps obtained by high-definition light detection and ranging, a type of modality that is starting to receive increasing attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the accuracy, the fusion of visible spectrum and depth information allows to boost the accuracy by a much larger margin. The resulting detector not only ranks among the top best performers in the challenging KITTI benchmark, but it is built upon very simple blocks that are easy to implement and computationally efficient.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN	2168-2267	ISBN		Medium
	Area		Expedition		Conference
	Notes	ADAS; 600.085; 600.082; 600.076; 600.118			Approved	no
	Call Number	Admin @ si @			Serial	2810
Permanent link to this record



	Author	Debora Gil; Aura Hernandez-Sabate; David Castells; Jordi Carrabina
	Title	CYBERH: Cyber-Physical Systems in Health for Personalized Assistance			Type	Conference Article
	Year	2017	Publication	International Symposium on Symbolic and Numeric Algorithms for Scientific Computing	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	Assistance systems for e-Health applications have some specific requirements that demand of new methods for data gathering, analysis and modeling able to deal with SmallData: 1) systems should dynamically collect data from, both, the environment and the user to issue personalized recommendations; 2) data analysis should be able to tackle a limited number of samples prone to include non-informative data and possibly evolving in time due to changes in patient condition; 3) algorithms should run in real time with possibly limited computational resources and fluctuant internet access. Electronic medical devices (and CyberPhysical devices in general) can enhance the process of data gathering and analysis in several ways: (i) acquiring simultaneously multiple sensors data instead of single magnitudes (ii) filtering data; (iii) providing real-time implementations condition by isolating tasks in individual processors of multiprocessors Systems-on-chip (MPSoC) platforms and (iv) combining information through sensor fusion techniques. Our approach focus on both aspects of the complementary role of CyberPhysical devices and analysis of SmallData in the process of personalized models building for e-Health applications. In particular, we will address the design of Cyber-Physical Systems in Health for Personalized Assistance (CyberHealth) in two specific application cases: 1) A Smart Assisted Driving System (SADs) for dynamical assessment of the driving capabilities of Mild Cognitive Impaired (MCI) people; 2) An Intelligent Operating Room (iOR) for improving the yield of bronchoscopic interventions for in-vivo lung cancer diagnosis.
	Address	Timisoara; Rumania; September 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	SYNASC
	Notes	IAM; 600.085; 600.096; 600.075; 600.145			Approved	no
	Call Number	Admin @ si @ GHC2017			Serial	3045
Permanent link to this record



	Author	Karim Lekadir; Alfiia Galimzianova; Angels Betriu; Maria del Mar Vila; Laura Igual; Daniel L. Rubin; Elvira Fernandez-Giraldez; Petia Radeva; Sandy Napel
	Title	A Convolutional Neural Network for Automatic Characterization of Plaque Composition in Carotid Ultrasound			Type	Journal Article
	Year	2017	Publication	IEEE Journal Biomedical and Health Informatics	Abbreviated Journal	J-BHI
	Volume	21	Issue	1	Pages	48-55
	Keywords
	Abstract	Characterization of carotid plaque composition, more specifically the amount of lipid core, fibrous tissue, and calcified tissue, is an important task for the identification of plaques that are prone to rupture, and thus for early risk estimation of cardiovascular and cerebrovascular events. Due to its low costs and wide availability, carotid ultrasound has the potential to become the modality of choice for plaque characterization in clinical practice. However, its significant image noise, coupled with the small size of the plaques and their complex appearance, makes it difficult for automated techniques to discriminate between the different plaque constituents. In this paper, we propose to address this challenging problem by exploiting the unique capabilities of the emerging deep learning framework. More specifically, and unlike existing works which require a priori definition of specific imaging features or thresholding values, we propose to build a convolutional neural network (CNN) that will automatically extract from the images the information that is optimal for the identification of the different plaque constituents. We used approximately 90 000 patches extracted from a database of images and corresponding expert plaque characterizations to train and to validate the proposed CNN. The results of cross-validation experiments show a correlation of about 0.90 with the clinical assessment for the estimation of lipid core, fibrous cap, and calcified tissue areas, indicating the potential of deep learning for the challenging task of automatic characterization of plaque composition in carotid ultrasound.
	Address
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference
	Notes	MILAB; no menciona			Approved	no
	Call Number	Admin @ si @ LGB2017			Serial	2931
Permanent link to this record



	Author	Xavier Soria; Angel Sappa; Arash Akbarinia
	Title	Multispectral Single-Sensor RGB-NIR Imaging: New Challenges and Opportunities			Type	Conference Article
	Year	2017	Publication	7th International Conference on Image Processing Theory, Tools & Applications	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	Color restoration; Neural networks; Singlesensor cameras; Multispectral images; RGB-NIR dataset
	Abstract	Multispectral images captured with a single sensor camera have become an attractive alternative for numerous computer vision applications. However, in order to fully exploit their potentials, the color restoration problem (RGB representation) should be addressed. This problem is more evident in outdoor scenarios containing vegetation, living beings, or specular materials. The problem of color distortion emerges from the sensitivity of sensors due to the overlap of visible and near infrared spectral bands. This paper empirically evaluates the variability of the near infrared (NIR) information with respect to the changes of light throughout the day. A tiny neural network is proposed to restore the RGB color representation from the given RGBN (Red, Green, Blue, NIR) images. In order to evaluate the proposed algorithm, different experiments on a RGBN outdoor dataset are conducted, which include various challenging cases. The obtained result shows the challenge and the importance of addressing color restoration in single sensor multispectral images.
	Address	Montreal; Canada; November 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	IPTA
	Notes	NEUROBIT; MSIAU; 600.122			Approved	no
	Call Number	Admin @ si @ SSA2017			Serial	3074
Permanent link to this record



	Author	Hugo Jair Escalante; Isabelle Guyon; Sergio Escalera; Julio C. S. Jacques Junior; Xavier Baro; Evelyne Viegas; Yagmur Gucluturk; Umut Guclu; Marcel A. J. van Gerven; Rob van Lier; Meysam Madadi; Stephane Ayache
	Title	Design of an Explainable Machine Learning Challenge for Video Interviews			Type	Conference Article
	Year	2017	Publication	International Joint Conference on Neural Networks	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	This paper reviews and discusses research advances on “explainable machine learning” in computer vision. We focus on a particular area of the “Looking at People” (LAP) thematic domain: first impressions and personality analysis. Our aim is to make the computational intelligence and computer vision communities aware of the importance of developing explanatory mechanisms for computer-assisted decision making applications, such as automating recruitment. Judgments based on personality traits are being made routinely by human resource departments to evaluate the candidates' capacity of social insertion and their potential of career growth. However, inferring personality traits and, in general, the process by which we humans form a first impression of people, is highly subjective and may be biased. Previous studies have demonstrated that learning machines can learn to mimic human decisions. In this paper, we go one step further and formulate the problem of explaining the decisions of the models as a means of identifying what visual aspects are important, understanding how they relate to decisions suggested, and possibly gaining insight into undesirable negative biases. We design a new challenge on explainability of learning machines for first impressions analysis. We describe the setting, scenario, evaluation metrics and preliminary outcomes of the competition. To the best of our knowledge this is the first effort in terms of challenges for explainability in computer vision. In addition our challenge design comprises several other quantitative and qualitative elements of novelty, including a “coopetition” setting, which combines competition and collaboration.
	Address	Anchorage; Alaska; USA; May 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	IJCNN
	Notes	HUPBA; no proj			Approved	no
	Call Number	Admin @ si @ EGE2017			Serial	2922
Permanent link to this record



	Author	Sergio Escalera; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon
	Title	ChaLearn Looking at People: A Review of Events and Resources			Type	Conference Article
	Year	2017	Publication	30th International Joint Conference on Neural Networks	Abbreviated Journal
	Volume		Issue		Pages
	Keywords
	Abstract	This paper reviews the historic of ChaLearn Looking at People (LAP) events. We started in 2011 (with the release of the first Kinect device) to run challenges related to human action/activity and gesture recognition. Since then we have regularly organized events in a series of competitions covering all aspects of visual analysis of humans. So far we have organized more than 10 international challenges and events in this field. This paper reviews associated events, and introduces the ChaLearn LAP platform where public resources (including code, data and preprints of papers) related to the organized events are available. We also provide a discussion on perspectives of ChaLearn LAP activities.
	Address	Anchorage; Alaska; USA; May 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	IJCNN
	Notes	HuPBA; 602.143			Approved	no
	Call Number	Admin @ si @ EBE2017			Serial	3012
Permanent link to this record



	Author	Victor Vaquero; German Ros; Francesc Moreno-Noguer; Antonio Lopez; Alberto Sanfeliu
	Title	Joint coarse-and-fine reasoning for deep optical flow			Type	Conference Article
	Year	2017	Publication	24th International Conference on Image Processing	Abbreviated Journal
	Volume		Issue		Pages	2558-2562
	Keywords
	Abstract	We propose a novel representation for dense pixel-wise estimation tasks using CNNs that boosts accuracy and reduces training time, by explicitly exploiting joint coarse-and-fine reasoning. The coarse reasoning is performed over a discrete classification space to obtain a general rough solution, while the fine details of the solution are obtained over a continuous regression space. In our approach both components are jointly estimated, which proved to be beneficial for improving estimation accuracy. Additionally, we propose a new network architecture, which combines coarse and fine components by treating the fine estimation as a refinement built on top of the coarse solution, and therefore adding details to the general prediction. We apply our approach to the challenging problem of optical flow estimation and empirically validate it against state-of-the-art CNN-based solutions trained from scratch and tested on large optical flow datasets.
	Address	Beijing; China; September 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICIP
	Notes	ADAS; 600.118			Approved	no
	Call Number	Admin @ si @ VRM2017			Serial	2898
Permanent link to this record



	Author	Maedeh Aghaei; Mariella Dimiccoli; Petia Radeva
	Title	All the people around me: face clustering in egocentric photo streams			Type	Conference Article
	Year	2017	Publication	24th International Conference on Image Processing	Abbreviated Journal
	Volume		Issue		Pages
	Keywords	face discovery; face clustering; deepmatching; bag-of-tracklets; egocentric photo-streams
	Abstract	arxiv1703.01790 Given an unconstrained stream of images captured by a wearable photo-camera (2fpm), we propose an unsupervised bottom-up approach for automatic clustering appearing faces into the individual identities present in these data. The problem is challenging since images are acquired under real world conditions; hence the visible appearance of the people in the images undergoes intensive variations. Our proposed pipeline consists of first arranging the photo-stream into events, later, localizing the appearance of multiple people in them, and finally, grouping various appearances of the same person across different events. Experimental results performed on a dataset acquired by wearing a photo-camera during one month, demonstrate the effectiveness of the proposed approach for the considered purpose.
	Address	Beijing; China; September 2017
	Corporate Author				Thesis
	Publisher		Place of Publication		Editor
	Language		Summary Language		Original Title
	Series Editor		Series Title		Abbreviated Series Title
	Series Volume		Series Issue		Edition
	ISSN		ISBN		Medium
	Area		Expedition		Conference	ICIP
	Notes	MILAB; no menciona			Approved	no
	Call Number	Admin @ si @ EDR2017			Serial	3025
Permanent link to this record