toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Ayan Banerjee; Sanket Biswas; Josep Llados; Umapada Pal edit   pdf
url  openurl
  Title GraphKD: Exploring Knowledge Distillation Towards Document Object Detection with Structured Graph Creation Type Miscellaneous
  Year 2024 Publication Arxiv Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (down) Object detection in documents is a key step to automate the structural elements identification process in a digital or scanned document through understanding the hierarchical structure and relationships between different elements. Large and complex models, while achieving high accuracy, can be computationally expensive and memory-intensive, making them impractical for deployment on resource constrained devices. Knowledge distillation allows us to create small and more efficient models that retain much of the performance of their larger counterparts. Here we present a graph-based knowledge distillation framework to correctly identify and localize the document objects in a document image. Here, we design a structured graph with nodes containing proposal-level features and edges representing the relationship between the different proposal regions. Also, to reduce text bias an adaptive node sampling strategy is designed to prune the weight distribution and put more weightage on non-text nodes. We encode the complete graph as a knowledge representation and transfer it from the teacher to the student through the proposed distillation loss by effectively capturing both local and global information concurrently. Extensive experimentation on competitive benchmarks demonstrates that the proposed framework outperforms the current state-of-the-art approaches. The code will be available at: this https URL.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG Approved no  
  Call Number Admin @ si @ BBL2024b Serial 4023  
Permanent link to this record
 

 
Author Andreea Glavan; Alina Matei; Petia Radeva; Estefania Talavera edit  url
openurl 
  Title Does our social life influence our nutritional behaviour? Understanding nutritional habits from egocentric photo-streams Type Journal Article
  Year 2021 Publication Expert Systems with Applications Abbreviated Journal ESWA  
  Volume 171 Issue Pages 114506  
  Keywords  
  Abstract (down) Nutrition and social interactions are both key aspects of the daily lives of humans. In this work, we propose a system to evaluate the influence of social interaction in the nutritional habits of a person from a first-person perspective. In order to detect the routine of an individual, we construct a nutritional behaviour pattern discovery model, which outputs routines over a number of days. Our method evaluates similarity of routines with respect to visited food-related scenes over the collected days, making use of Dynamic Time Warping, as well as considering social engagement and its correlation with food-related activities. The nutritional and social descriptors of the collected days are evaluated and encoded using an LSTM Autoencoder. Later, the obtained latent space is clustered to find similar days unaffected by outliers using the Isolation Forest method. Moreover, we introduce a new score metric to evaluate the performance of the proposed algorithm. We validate our method on 104 days and more than 100 k egocentric images gathered by 7 users. Several different visualizations are evaluated for the understanding of the findings. Our results demonstrate good performance and applicability of our proposed model for social-related nutritional behaviour understanding. At the end, relevant applications of the model are discussed by analysing the discovered routine of particular individuals.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ GMR2021 Serial 3634  
Permanent link to this record
 

 
Author Mireia Forns-Nadal; Federico Sem; Anna Mane; Laura Igual; Dani Guinart; Oscar Vilarroya edit  url
doi  openurl
  Title Increased Nucleus Accumbens Volume in First-Episode Psychosis Type Journal Article
  Year 2017 Publication Psychiatry Research-Neuroimaging Abbreviated Journal PRN  
  Volume 263 Issue Pages 57-60  
  Keywords  
  Abstract (down) Nucleus accumbens has been reported as a key structure in the neurobiology of schizophrenia. Studies analyzing structural abnormalities have shown conflicting results, possibly related to confounding factors. We investigated the nucleus accumbens volume using manual delimitation in first-episode psychosis (FEP) controlling for age, cannabis use and medication. Thirty-one FEP subjects who were naive or minimally exposed to antipsychotics and a control group were MRI scanned and clinically assessed from baseline to 6 months of follow-up. FEP showed increased relative and total accumbens volumes. Clinical correlations with negative symptoms, duration of untreated psychosis and cannabis use were not significant.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no menciona Approved no  
  Call Number Admin @ si @ FSM2017 Serial 3028  
Permanent link to this record
 

 
Author Jialuo Chen; Pau Riba; Alicia Fornes; Juan Mas; Josep Llados; Joana Maria Pujadas-Mora edit   pdf
doi  openurl
  Title Word-Hunter: A Gamesourcing Experience to Validate the Transcription of Historical Manuscripts Type Conference Article
  Year 2018 Publication 16th International Conference on Frontiers in Handwriting Recognition Abbreviated Journal  
  Volume Issue Pages 528-533  
  Keywords Crowdsourcing; Gamification; Handwritten documents; Performance evaluation  
  Abstract (down) Nowadays, there are still many handwritten historical documents in archives waiting to be transcribed and indexed. Since manual transcription is tedious and time consuming, the automatic transcription seems the path to follow. However, the performance of current handwriting recognition techniques is not perfect, so a manual validation is mandatory. Crowdsourcing is a good strategy for manual validation, however it is a tedious task. In this paper we analyze experiences based in gamification
in order to propose and design a gamesourcing framework that increases the interest of users. Then, we describe and analyze our experience when validating the automatic transcription using the gamesourcing application. Moreover, thanks to the combination of clustering and handwriting recognition techniques, we can speed up the validation while maintaining the performance.
 
  Address Niagara Falls, USA; August 2018  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICFHR  
  Notes DAG; 600.097; 603.057; 600.121 Approved no  
  Call Number Admin @ si @ CRF2018 Serial 3169  
Permanent link to this record
 

 
Author B. Gautam; Oriol Ramos Terrades; Joana Maria Pujadas-Mora; Miquel Valls-Figols edit   pdf
url  openurl
  Title Knowledge graph based methods for record linkage Type Journal Article
  Year 2020 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 136 Issue Pages 127-133  
  Keywords  
  Abstract (down) Nowadays, it is common in Historical Demography the use of individual-level data as a consequence of a predominant life-course approach for the understanding of the demographic behaviour, family transition, mobility, etc. Advanced record linkage is key since it allows increasing the data complexity and its volume to be analyzed. However, current methods are constrained to link data from the same kind of sources. Knowledge graph are flexible semantic representations, which allow to encode data variability and semantic relations in a structured manner.

In this paper we propose the use of knowledge graph methods to tackle record linkage tasks. The proposed method, named WERL, takes advantage of the main knowledge graph properties and learns embedding vectors to encode census information. These embeddings are properly weighted to maximize the record linkage performance. We have evaluated this method on benchmark data sets and we have compared it to related methods with stimulating and satisfactory results.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.140; 600.121 Approved no  
  Call Number Admin @ si @ GRP2020 Serial 3453  
Permanent link to this record
 

 
Author R. Bertrand; Oriol Ramos Terrades; P. Gomez-Kramer; P. Franco; Jean-Marc Ogier edit  doi
openurl 
  Title A Conditional Random Field model for font forgery detection Type Conference Article
  Year 2015 Publication 13th International Conference on Document Analysis and Recognition ICDAR2015 Abbreviated Journal  
  Volume Issue Pages 576 - 580  
  Keywords  
  Abstract (down) Nowadays, document forgery is becoming a real issue. A large amount of documents that contain critical information as payment slips, invoices or contracts, are constantly subject to fraudster manipulation because of the lack of security regarding this kind of document. Previously, a system to detect fraudulent documents based on its intrinsic features has been presented. It was especially designed to retrieve copy-move forgery and imperfection due to fraudster manipulation. However, when a set of characters is not present in the original document, copy-move forgery is not feasible. Hence, the fraudster will use a text toolbox to add or modify information in the document by imitating the font or he will cut and paste characters from another document where the font properties are similar. This often results in font type errors. Thus, a clue to detect document forgery consists of finding characters, words or sentences in a document with font properties different from their surroundings. To this end, we present in this paper an automatic forgery detection method based on document font features. Using the Conditional Random Field a measurement of probability that a character belongs to a specific font is made by comparing the character font features to a knowledge database. Then, the character is classified as a genuine or a fake one by comparing its probability to belong to a certain font type with those of the neighboring characters.  
  Address Nancy; France; August 2015  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.077 Approved no  
  Call Number Admin @ si @ BRG2015 Serial 2725  
Permanent link to this record
 

 
Author Jesus Jaime Moreno Escobar edit  url
isbn  openurl
  Title Perceptual Criteria on Image Compresions Type Book Whole
  Year 2011 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (down) Nowadays, digital images are used in many areas in everyday life, but they tend to be big. This increases amount of information leads us to the problem of image data storage. For example, it is common to have a representation a color pixel as a 24-bit number, where the channels red, green, and blue employ 8 bits each. In consequence, this kind of color pixel can specify one of 224 ¼ 16:78 million colors. Therefore, an image at a resolution of 512 £ 512 that allocates 24 bits per pixel, occupies 786,432 bytes. That is why image compression is important. An important feature of image compression is that it can be lossy or lossless. A compressed image is acceptable provided these losses of image information are not perceived by the eye. It is possible to assume that a portion of this information is redundant. Lossless Image Compression is defined as to mathematically decode the same image which was encoded. In Lossy Image Compression needs to identify two features inside the image: the redundancy and the irrelevancy of information. Thus, lossy compression modifies the image data in such a way when they are encoded and decoded, the recovered image is similar enough to the original one. How similar is the recovered image in comparison to the original image is defined prior to the compression process, and it depends on the implementation to be performed. In lossy compression, current image compression schemes remove information considered irrelevant by using mathematical criteria. One of the problems of these schemes is that although the numerical quality of the compressed image is low, it shows a high visual image quality, e.g. it does not show a lot of visible artifacts. It is because these mathematical criteria, used to remove information, do not take into account if the viewed information is perceived by the Human Visual System. Therefore, the aim of an image compression scheme designed to obtain images that do not show artifacts although their numerical quality can be low, is to eliminate the information that is not visible by the Human Visual System. Hence, this Ph.D. thesis proposes to exploit the visual redundancy existing in an image by reducing those features that can be unperceivable for the Human Visual System. First, we define an image quality assessment, which is highly correlated with the psychophysical experiments performed by human observers. The proposed CwPSNR metrics weights the well-known PSNR by using a particular perceptual low level model of the Human Visual System, e.g. the Chromatic Induction Wavelet Model (CIWaM). Second, we propose an image compression algorithm (called Hi-SET), which exploits the high correlation and self-similarity of pixels in a given area or neighborhood by means of a fractal function. Hi-SET possesses the main features that modern image compressors have, that is, it is an embedded coder, which allows a progressive transmission. Third, we propose a perceptual quantizer (½SQ), which is a modification of the uniform scalar quantizer. The ½SQ is applied to a pixel set in a certain Wavelet sub-band, that is, a global quantization. Unlike this, the proposed modification allows to perform a local pixel-by-pixel forward and inverse quantization, introducing into this process a perceptual distortion which depends on the surround spatial information of the pixel. Combining ½SQ method with the Hi-SET image compressor, we define a perceptual image compressor, called ©SET. Finally, a coding method for Region of Interest areas is presented, ½GBbBShift, which perceptually weights pixels into these areas and maintains only the more important perceivable features in the rest of the image. Results presented in this report show that CwPSNR is the best-ranked image quality method when it is applied to the most common image compression distortions such as JPEG and JPEG2000. CwPSNR shows the best correlation with the judgement of human observers, which is based on the results of psychophysical experiments obtained for relevant image quality databases such as TID2008, LIVE, CSIQ and IVC. Furthermore, Hi-SET coder obtains better results both for compression ratios and perceptual image quality than the JPEG2000 coder and other coders that use a Hilbert Fractal for image compression. Hence, when the proposed perceptual quantization is introduced to Hi-SET coder, our compressor improves its numerical and perceptual e±ciency. When ½GBbBShift method applied to Hi-SET is compared against MaxShift method applied to the JPEG2000 standard and Hi-SET, the images coded by our ROI method get the best results when the overall image quality is estimated. Both the proposed perceptual quantization and the ½GBbBShift method are generalized algorithms that can be applied to other Wavelet based image compression algorithms such as JPEG2000, SPIHT or SPECK.  
  Address  
  Corporate Author Thesis Ph.D. thesis  
  Publisher Ediciones Graficas Rey Place of Publication Editor Xavier Otazu  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-938351-3-2 Medium  
  Area Expedition Conference  
  Notes CIC Approved no  
  Call Number Admin @ si @ Mor2011 Serial 1786  
Permanent link to this record
 

 
Author Jiaolong Xu; Sebastian Ramos; David Vazquez; Antonio Lopez edit   pdf
doi  openurl
  Title Incremental Domain Adaptation of Deformable Part-based Models Type Conference Article
  Year 2014 Publication 25th British Machine Vision Conference Abbreviated Journal  
  Volume Issue Pages  
  Keywords Pedestrian Detection; Part-based models; Domain Adaptation  
  Abstract (down) Nowadays, classifiers play a core role in many computer vision tasks. The underlying assumption for learning classifiers is that the training set and the deployment environment (testing) follow the same probability distribution regarding the features used by the classifiers. However, in practice, there are different reasons that can break this constancy assumption. Accordingly, reusing existing classifiers by adapting them from the previous training environment (source domain) to the new testing one (target domain)
is an approach with increasing acceptance in the computer vision community. In this paper we focus on the domain adaptation of deformable part-based models (DPMs) for object detection. In particular, we focus on a relatively unexplored scenario, i.e. incremental domain adaptation for object detection assuming weak-labeling. Therefore, our algorithm is ready to improve existing source-oriented DPM-based detectors as soon as a little amount of labeled target-domain training data is available, and keeps improving as more of such data arrives in a continuous fashion. For achieving this, we follow a multiple
instance learning (MIL) paradigm that operates in an incremental per-image basis. As proof of concept, we address the challenging scenario of adapting a DPM-based pedestrian detector trained with synthetic pedestrians to operate in real-world scenarios. The obtained results show that our incremental adaptive models obtain equally good accuracy results as the batch learned models, while being more flexible for handling continuously arriving target-domain data.
 
  Address Nottingham; uk; September 2014  
  Corporate Author Thesis  
  Publisher BMVA Press Place of Publication Editor Valstar, Michel and French, Andrew and Pridmore, Tony  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference BMVC  
  Notes ADAS; 600.057; 600.054; 600.076 Approved no  
  Call Number XRV2014c; ADAS @ adas @ xrv2014c Serial 2455  
Permanent link to this record
 

 
Author Lluis Pere de las Heras; Oriol Ramos Terrades; Josep Llados; David Fernandez; Cristina Cañero edit  doi
openurl 
  Title Use case visual Bag-of-Words techniques for camera based identity document classification Type Conference Article
  Year 2015 Publication 13th International Conference on Document Analysis and Recognition ICDAR2015 Abbreviated Journal  
  Volume Issue Pages 721 - 725  
  Keywords  
  Abstract (down) Nowadays, automatic identity document recognition, including passport and driving license recognition, is at the core of many applications within the administrative and service sectors, such as police, hospitality, car renting, etc. In former years, the document information was manually extracted whereas today this data is recognized automatically from images obtained by flat-bed scanners. Yet, since these scanners tend to be expensive and voluminous, companies in the sector have recently turned their attention to cheaper, small and yet computationally powerful scanners: the mobile devices. The document identity recognition from mobile images enclose several new difficulties w.r.t traditional scanned images, such as the loss of a controlled background, perspective, blurring, etc. In this paper we present a real application for identity document classification of images taken from mobile devices. This classification process is of extreme importance since a prior knowledge of the document type and origin strongly facilitates the subsequent information extraction. The proposed method is based on a traditional Bagof-Words in which we have taken into consideration several key aspects to enhance recognition rate. The method performance has been studied on three datasets containing more than 2000 images from 129 different document classes.  
  Address Nancy; France; August 2015  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICDAR  
  Notes DAG; 600.077; 600.061; Approved no  
  Call Number Admin @ si @ HRL2015a Serial 2726  
Permanent link to this record
 

 
Author Carlos Boned Riera; Oriol Ramos Terrades edit  doi
openurl 
  Title Discriminative Neural Variational Model for Unbalanced Classification Tasks in Knowledge Graph Type Conference Article
  Year 2022 Publication 26th International Conference on Pattern Recognition Abbreviated Journal  
  Volume Issue Pages 2186-2191  
  Keywords Measurement; Couplings; Semantics; Ear; Benchmark testing; Data models; Pattern recognition  
  Abstract (down) Nowadays the paradigm of link discovery problems has shown significant improvements on Knowledge Graphs. However, method performances are harmed by the unbalanced nature of this classification problem, since many methods are easily biased to not find proper links. In this paper we present a discriminative neural variational auto-encoder model, called DNVAE from now on, in which we have introduced latent variables to serve as embedding vectors. As a result, the learnt generative model approximate better the underlying distribution and, at the same time, it better differentiate the type of relations in the knowledge graph. We have evaluated this approach on benchmark knowledge graph and Census records. Results in this last data set are quite impressive since we reach the highest possible score in the evaluation metrics. However, further experiments are still needed to deeper evaluate the performance of the method in more challenging tasks.  
  Address Montreal; Quebec; Canada; August 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference ICPR  
  Notes DAG; 600.121; 600.162 Approved no  
  Call Number Admin @ si @ BoR2022 Serial 3741  
Permanent link to this record
 

 
Author Sandra Jimenez; Xavier Otazu; Valero Laparra; Jesus Malo edit   pdf
doi  openurl
  Title Chromatic induction and contrast masking: similar models, different goals? Type Conference Article
  Year 2013 Publication Human Vision and Electronic Imaging XVIII Abbreviated Journal  
  Volume 8651 Issue Pages  
  Keywords  
  Abstract (down) Normalization of signals coming from linear sensors is an ubiquitous mechanism of neural adaptation.1 Local interaction between sensors tuned to a particular feature at certain spatial position and neighbor sensors explains a wide range of psychophysical facts including (1) masking of spatial patterns, (2) non-linearities of motion sensors, (3) adaptation of color perception, (4) brightness and chromatic induction, and (5) image quality assessment. Although the above models have formal and qualitative similarities, it does not necessarily mean that the mechanisms involved are pursuing the same statistical goal. For instance, in the case of chromatic mechanisms (disregarding spatial information), different parameters in the normalization give rise to optimal discrimination or adaptation, and different non-linearities may give rise to error minimization or component independence. In the case of spatial sensors (disregarding color information), a number of studies have pointed out the benefits of masking in statistical independence terms. However, such statistical analysis has not been performed for spatio-chromatic induction models where chromatic perception depends on spatial configuration. In this work we investigate whether successful spatio-chromatic induction models,6 increase component independence similarly as previously reported for masking models. Mutual information analysis suggests that seeking an efficient chromatic representation may explain the prevalence of induction effects in spatially simple images. © (2013) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.  
  Address San Francisco CA; USA; February 2013  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference HVEI  
  Notes CIC Approved no  
  Call Number Admin @ si @ JOL2013 Serial 2240  
Permanent link to this record
 

 
Author Maria Elena Meza-de-Luna; Juan Ramon Terven Salinas; Bogdan Raducanu; Joaquin Salas edit   pdf
doi  openurl
  Title Assessing the Influence of Mirroring on the Perception of Professional Competence using Wearable Technology Type Journal Article
  Year 2016 Publication IEEE Transactions on Affective Computing Abbreviated Journal TAC  
  Volume 9 Issue 2 Pages 161-175  
  Keywords Mirroring; Nodding; Competence; Perception; Wearable Technology  
  Abstract (down) Nonverbal communication is an intrinsic part in daily face-to-face meetings. A frequently observed behavior during social interactions is mirroring, in which one person tends to mimic the attitude of the counterpart. This paper shows that a computer vision system could be used to predict the perception of competence in dyadic interactions through the automatic detection of mirroring
events. To prove our hypothesis, we developed: (1) A social assistant for mirroring detection, using a wearable device which includes a video camera and (2) an automatic classifier for the perception of competence, using the number of nodding gestures and mirroring events as predictors. For our study, we used a mixed-method approach in an experimental design where 48 participants acting as customers interacted with a confederated psychologist. We found that the number of nods or mirroring events has a significant influence on the perception of competence. Our results suggest that: (1) Customer mirroring is a better predictor than psychologist mirroring; (2) the number of psychologist’s nods is a better predictor than the number of customer’s nods; (3) except for the psychologist mirroring, the computer vision algorithm we used worked about equally well whether it was acquiring images from wearable smartglasses or fixed cameras.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes OR; 600.072;MV Approved no  
  Call Number Admin @ si @ MTR2016 Serial 2826  
Permanent link to this record
 

 
Author German Barquero; Johnny Nuñez; Sergio Escalera; Zhen Xu; Wei-Wei Tu; Isabelle Guyon edit  url
openurl 
  Title Didn’t see that coming: a survey on non-verbal social human behavior forecasting Type Conference Article
  Year 2022 Publication Understanding Social Behavior in Dyadic and Small Group Interactions Abbreviated Journal  
  Volume 173 Issue Pages 139-178  
  Keywords  
  Abstract (down) Non-verbal social human behavior forecasting has increasingly attracted the interest of the research community in recent years. Its direct applications to human-robot interaction and socially-aware human motion generation make it a very attractive field. In this survey, we define the behavior forecasting problem for multiple interactive agents in a generic way that aims at unifying the fields of social signals prediction and human motion forecasting, traditionally separated. We hold that both problem formulations refer to the same conceptual problem, and identify many shared fundamental challenges: future stochasticity, context awareness, history exploitation, etc. We also propose a taxonomy that comprises
methods published in the last 5 years in a very informative way and describes the current main concerns of the community with regard to this problem. In order to promote further research on this field, we also provide a summarized and friendly overview of audiovisual datasets featuring non-acted social interactions. Finally, we describe the most common metrics used in this task and their particular issues.
 
  Address Virtual; June 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference PMLR  
  Notes HuPBA; no proj Approved no  
  Call Number Admin @ si @ BNE2022 Serial 3766  
Permanent link to this record
 

 
Author G. Roig; Xavier Boix; F. de la Torre; Joan Serrat; C. Vilella edit  openurl
  Title Hierarchical CRF with product label spaces for parts-based Models Type Conference Article
  Year 2011 Publication IEEE Conference on Automatic Face and Gesture Recognition Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract (down) Non-rigid object detection is a challenging an open research problem in computer vision. It is a critical part in many applications such as image search, surveillance, human-computer interaction or image auto-annotation. Most successful approaches to non-rigid object detection make use of part-based models. In particular, Conditional Random Fields (CRF) have been successfully embedded into a discriminative parts-based model framework due to its effectiveness for learning and inference (usually based on a tree structure). However, CRF-based approaches do not incorporate global constraints and only model pairwise interactions. This is especially important when modeling object classes that may have complex parts interactions (e.g. facial features or body articulations), because neglecting them yields an oversimplified model with suboptimal performance. To overcome this limitation, this paper proposes a novel hierarchical CRF (HCRF). The main contribution is to build a hierarchy of part combinations by extending the label set to a hierarchy of product label spaces. In order to keep the inference computation tractable, we propose an effective method to reduce the new label set. We test our method on two applications: facial feature detection on the Multi-PIE database and human pose estimation on the Buffy dataset.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference FG  
  Notes ADAS Approved no  
  Call Number Admin @ si @ RBT2011 Serial 1862  
Permanent link to this record
 

 
Author Bogdan Raducanu; Fadi Dornaika edit   pdf
doi  openurl
  Title Embedding new observations via sparse-coding for non-linear manifold learning Type Journal Article
  Year 2014 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 47 Issue 1 Pages 480-492  
  Keywords  
  Abstract (down) Non-linear dimensionality reduction techniques are affected by two critical aspects: (i) the design of the adjacency graphs, and (ii) the embedding of new test data-the out-of-sample problem. For the first aspect, the proposed solutions, in general, were heuristically driven. For the second aspect, the difficulty resides in finding an accurate mapping that transfers unseen data samples into an existing manifold. Past works addressing these two aspects were heavily parametric in the sense that the optimal performance is only achieved for a suitable parameter choice that should be known in advance. In this paper, we demonstrate that the sparse representation theory not only serves for automatic graph construction as shown in recent works, but also represents an accurate alternative for out-of-sample embedding. Considering for a case study the Laplacian Eigenmaps, we applied our method to the face recognition problem. To evaluate the effectiveness of the proposed out-of-sample embedding, experiments are conducted using the K-nearest neighbor (KNN) and Kernel Support Vector Machines (KSVM) classifiers on six public face datasets. The experimental results show that the proposed model is able to achieve high categorization effectiveness as well as high consistency with non-linear embeddings/manifolds obtained in batch modes.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes OR;MV Approved no  
  Call Number Admin @ si @ RaD2013b Serial 2316  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: