|
Records |
Links |
|
Author |
Ayan Banerjee; Sanket Biswas; Josep Llados; Umapada Pal |
|
|
Title |
SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation |
Type |
Conference Article |
|
Year |
2023 |
Publication |
17th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
14187 |
Issue |
|
Pages |
307–325 |
|
|
Keywords |
|
|
|
Abstract |
Instance-level segmentation of documents consists of assigning a class-aware and instance-aware label to each pixel of the image. It is a key step in parsing documents for their understanding. In this paper, we present a unified transformer encoder-decoder architecture for end-to-end instance segmentation of complex layouts in document images. The method adapts contrastive training with mixed query selection for anchor initialization in the decoder. It then performs a dot product between the obtained query embeddings and the pixel embedding map (coming from the encoder) for semantic reasoning. Extensive experimentation on competitive benchmarks such as PubLayNet, PRIMA, Historical Japanese (HJ), and TableBank demonstrates that our model with a SwinL backbone achieves better segmentation performance than the existing state-of-the-art approaches, with average precision of 93.72, 54.39, 84.65 and 98.04, respectively, using under one billion parameters. The code is publicly available at github.com/ayanban011/SwinDocSegmenter. |
|
|
Address |
San Jose; CA; USA; August 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ BBL2023 |
Serial |
3893 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Riba; Anjan Dutta; Josep Llados; Alicia Fornes; Sounak Dey |
|
|
Title |
Improving Information Retrieval in Multiwriter Scenario by Exploiting the Similarity Graph of Document Terms |
Type |
Conference Article |
|
Year |
2017 |
Publication |
14th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
475-480 |
|
|
Keywords |
document terms; information retrieval; affinity graph; graph of document terms; multiwriter; graph diffusion |
|
|
Abstract |
Information Retrieval (IR) is the activity of obtaining information resources relevant to an information need. It usually retrieves a set of objects ranked according to their relevance to that need. In document analysis, information retrieval receives a lot of attention in the form of symbol and word spotting. However, for decades the community has mostly focused on either printed or single-writer scenarios, where state-of-the-art methods have achieved reasonable performance on the available datasets. Nevertheless, the existing algorithms do not perform comparably in multiwriter scenarios. A graph representing relations between a set of objects is a structure where each node denotes an individual element and the similarity between two elements is represented as a weight on the connecting edge. In this paper, we explore different analytics of graphs constructed from words or graphical symbols, such as diffusion and shortest paths, to improve the performance of information retrieval methods in multiwriter scenarios. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG; 600.097; 601.302; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RDL2017a |
Serial |
3053 |
|
Permanent link to this record |
|
|
|
|
Author |
Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal |
|
|
Title |
Beyond Document Object Detection: Instance-Level Segmentation of Complex Layouts |
Type |
Journal Article |
|
Year |
2021 |
Publication |
International Journal on Document Analysis and Recognition |
Abbreviated Journal |
IJDAR |
|
|
Volume |
24 |
Issue |
|
Pages |
269–281 |
|
|
Keywords |
|
|
|
Abstract |
Information extraction is a fundamental task of many business intelligence services that entail massive document processing. Understanding the structure of a document page in terms of its layout provides contextual support that helps in the semantic interpretation of the document terms. In this paper, inspired by the progress of deep learning methodologies applied to the task of object recognition, we transfer these models to the specific case of document object detection, reformulating the traditional problem of document layout analysis. Moreover, we contribute to the prior art by defining the task of instance segmentation in the document image domain. An instance segmentation paradigm is especially important in complex layouts whose contents should interact for the proper rendering of the page, e.g., proper text wrapping around an image. Finally, we provide an extensive evaluation, both qualitative and quantitative, that demonstrates the superior performance of the proposed methodology over the current state of the art. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.121; 600.140; 110.312 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BRL2021b |
Serial |
3574 |
|
Permanent link to this record |
|
|
|
|
Author |
Minesh Mathew; Viraj Bagal; Ruben Tito; Dimosthenis Karatzas; Ernest Valveny; C.V. Jawahar |
|
|
Title |
InfographicVQA |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1697-1706 |
|
|
Keywords |
Document Analysis Datasets; Evaluation and Comparison of Vision Algorithms; Vision and Languages |
|
|
Abstract |
Infographics communicate information using a combination of textual, graphical and visual elements. This work explores the automatic understanding of infographic images by using a Visual Question Answering technique. To this end, we present InfographicVQA, a new dataset comprising a diverse collection of infographics and question-answer annotations. The questions require methods that jointly reason over the document layout, textual content, graphical elements, and data visualizations. We curate the dataset with an emphasis on questions that require elementary reasoning and basic arithmetic skills. For VQA on the dataset, we evaluate two strong Transformer-based baselines. Both baselines yield unsatisfactory results compared to near-perfect human performance on the dataset. The results suggest that VQA on infographics (images that are designed to communicate information quickly and clearly to the human brain) is ideal for benchmarking machine understanding of complex document images. The dataset is available for download at docvqa.org. |
|
|
Address |
Virtual; Waikoloa; Hawaii; USA; January 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACV |
|
|
Notes |
DAG; 600.155 |
Approved |
no |
|
|
Call Number |
MBT2022 |
Serial |
3625 |
|
Permanent link to this record |
|
|
|
|
Author |
Ahmed M. A. Salih; Ilaria Boscolo Galazzo; Federica Cruciani; Lorenza Brusini; Petia Radeva |
|
|
Title |
Investigating Explainable Artificial Intelligence for MRI-based Classification of Dementia: a New Stability Criterion for Explainable Methods |
Type |
Conference Article |
|
Year |
2022 |
Publication |
29th IEEE International Conference on Image Processing |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Image processing; Stability criteria; Machine learning; Robustness; Alzheimer's disease; Monitoring |
|
|
Abstract |
Individuals diagnosed with Mild Cognitive Impairment (MCI) have shown an increased risk of developing Alzheimer’s Disease (AD). As such, early identification of dementia represents a key prognostic element, though it is hampered by complex disease patterns. Increasing efforts have focused on Machine Learning (ML) to build accurate classification models relying on a multitude of clinical/imaging variables. However, ML itself does not provide sensible explanations related to the model mechanism and feature contribution. Explainable Artificial Intelligence (XAI) represents the enabling technology in this framework, making it possible to understand ML outcomes and derive human-understandable explanations. In this study, we aimed at exploring ML combined with MRI-based features and XAI to solve this classification problem and interpret the outcome. In particular, we propose a new method to assess the robustness of feature rankings provided by XAI methods, especially when multicollinearity exists. Our findings indicate that our method was able to disentangle the list of the informative features underlying dementia, with important implications for aiding personalized monitoring plans. |
|
|
Address |
Bordeaux; France; October 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICIP |
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ SBC2022 |
Serial |
3789 |
|
Permanent link to this record |
|
|
|
|
Author |
Oscar Amoros; Sergio Escalera; Anna Puig |
|
|
Title |
Adaboost GPU-based Classifier for Direct Volume Rendering |
Type |
Conference Article |
|
Year |
2011 |
Publication |
International Conference on Computer Graphics Theory and Applications |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
215-219 |
|
|
Keywords |
|
|
|
Abstract |
In volume visualization, voxel visibility and materials are defined through interactive editing of a transfer function. In this paper, we present a two-level GPU-based labeling method that computes, at rendering time, a set of labeled structures using the Adaboost machine learning classifier. In a pre-processing step, Adaboost trains a binary classifier from a pre-labeled dataset, taking into account a set of features for each sample. This binary classifier is a weighted combination of weak classifiers, each of which can be expressed as a simple decision function estimated on the values of a single feature. Then, at the testing stage, each weak classifier is independently applied to the features of a set of unlabeled samples. We propose an alternative representation of these classifiers that allows a GPU-based parallelized testing stage embedded into the visualization pipeline. The empirical results confirm the OpenCL-based classification of biomedical datasets as a tough problem where an opportunity for further research emerges. |
|
|
Address |
Algarve, Portugal |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GRAPP |
|
|
Notes |
MILAB; HuPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ AEP2011 |
Serial |
1774 |
|
Permanent link to this record |
|
|
|
|
Author |
Chengyi Zou; Shuai Wan; Marta Mrak; Marc Gorriz Blanch; Luis Herranz; Tiannan Ji |
|
|
Title |
Towards Lightweight Neural Network-based Chroma Intra Prediction for Video Coding |
Type |
Conference Article |
|
Year |
2022 |
Publication |
29th IEEE International Conference on Image Processing |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Video coding; Quantization (signal); Computational modeling; Neural networks; Predictive models; Video compression; Syntactics |
|
|
Abstract |
In video compression the luma channel can be useful for predicting chroma channels (Cb, Cr), as has been demonstrated with the Cross-Component Linear Model (CCLM) used in the Versatile Video Coding (VVC) standard. More recently, it has been shown that neural networks can capture the relationship among different channels even better. In this paper, a new attention-based neural network is proposed for cross-component intra prediction. With the goal of simplifying neural network design, the new framework consists of four branches: a boundary branch and a luma branch for extracting features from reference samples, an attention branch for fusing the first two branches, and a prediction branch for computing the predicted chroma samples. The proposed scheme is integrated into the VVC test model together with one additional binary block-level syntax flag, which indicates whether a given block makes use of the proposed method. Experimental results demonstrate 0.31%/2.36%/2.00% BD-rate reductions on Y/Cb/Cr components, respectively, on top of the VVC Test Model (VTM) 7.0, which uses CCLM. |
|
|
Address |
Bordeaux; France; October 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICIP |
|
|
Notes |
MACO |
Approved |
no |
|
|
Call Number |
Admin @ si @ ZWM2022 |
Serial |
3790 |
|
Permanent link to this record |
|
|
|
|
Author |
C. Alejandro Parraga; Robert Benavente; Maria Vanrell |
|
|
Title |
Towards a general model of colour categorization which considers context |
Type |
Journal Article |
|
Year |
2010 |
Publication |
Perception. ECVP Abstract Supplement |
Abbreviated Journal |
PER |
|
|
Volume |
39 |
Issue |
|
Pages |
86 |
|
|
Keywords |
|
|
|
Abstract |
In two previous experiments [Parraga et al, 2009 J. of Im. Sci. and Tech 53(3) 031106; Benavente et al, 2009 Perception 38 ECVP Supplement, 36] the boundaries of basic colour categories were measured. In the first experiment, samples were presented in isolation (i.e., on a dark background) and boundaries were measured using a yes/no paradigm. In the second, subjects adjusted the chromaticity of a sample presented on a random Mondrian background to find the boundary between pairs of adjacent colours. Results from these experiments showed significant differences, but it was not possible to conclude whether this discrepancy was due to the absence/presence of a colourful background or to the differences in the paradigms used. In this work, we settle this question by repeating the first experiment (i.e., samples presented on a dark background) using the second paradigm. A comparison of results shows that although boundary locations are very similar, boundaries measured in context are significantly different (more diffuse) than those measured in isolation (confirmed by a Student’s t-test analysis of the statistical distributions of the subjects’ answers). In addition, we completed the mapping of colour name space by measuring the boundaries between chromatic colours and the achromatic centre. With these results we completed our parametric fuzzy-sets model of colour naming space. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
CAT @ cat @ PBV2010b |
Serial |
1326 |
|
Permanent link to this record |
|
|
|
|
Author |
Olivier Penacchio; C. Alejandro Parraga; Maria Vanrell |
|
|
Title |
Natural Scene Statistics account for Human Cones Ratios |
Type |
Journal Article |
|
Year |
2010 |
Publication |
Perception. ECVP Abstract Supplement |
Abbreviated Journal |
PER |
|
|
Volume |
39 |
Issue |
|
Pages |
101 |
|
|
Keywords |
|
|
|
Abstract |
In two previous experiments [Parraga et al, 2009 J. of Im. Sci. and Tech 53(3) 031106; Benavente et al, 2009 Perception 38 ECVP Supplement, 36] the boundaries of basic colour categories were measured. In the first experiment, samples were presented in isolation (i.e., on a dark background) and boundaries were measured using a yes/no paradigm. In the second, subjects adjusted the chromaticity of a sample presented on a random Mondrian background to find the boundary between pairs of adjacent colours. Results from these experiments showed significant differences, but it was not possible to conclude whether this discrepancy was due to the absence/presence of a colourful background or to the differences in the paradigms used. In this work, we settle this question by repeating the first experiment (i.e., samples presented on a dark background) using the second paradigm. A comparison of results shows that although boundary locations are very similar, boundaries measured in context are significantly different (more diffuse) than those measured in isolation (confirmed by a Student’s t-test analysis of the statistical distributions of the subjects’ answers). In addition, we completed the mapping of colour name space by measuring the boundaries between chromatic colours and the achromatic centre. With these results we completed our parametric fuzzy-sets model of colour naming space. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
CAT @ cat @ PPV2010 |
Serial |
1357 |
|
Permanent link to this record |
|
|
|
|
Author |
Enric Marti; Ferran Poveda; Antoni Gurgui; Debora Gil |
|
|
Title |
Aprendizaje Basado en Proyectos en Ingeniería Informática. Resultados y reflexiones de seis años de experiencia |
Type |
Miscellaneous |
|
Year |
2011 |
Publication |
Actas del Simposio-Taller JENUI 2011 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1-8 |
|
|
Keywords |
|
|
|
Abstract |
This workshop presents six years of experience with Project Based Learning (PBL) in Computer Graphics, a Computer Engineering course at the Autonomous University of Barcelona (UAB). We use a Moodle environment suited to managing the documentation generated in PBL. The course is organized as two alternative routes: a classic itinerary of lectures and test-based evaluation, and another based on PBL. For the PBL itinerary we explain the organization into teams, homework tutoring and monitoring, and the evaluation guidelines for students. We provide some of the work done by students, and the results of assessment surveys completed by students during these years. We report the evolution of our PBL itinerary in terms of both organization and student surveys.
The workshop aims to discuss the advantages and disadvantages of using these active methodologies in technical degrees such as computer engineering, in order to debate the most suitable way of organizing PBL and assessing students' learning rate. |
|
|
Address |
Sevilla, Spain |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
Spanish |
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-84-694-5440-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
JENUI |
|
|
Notes |
IAM |
Approved |
no |
|
|
Call Number |
IAM @ iam @ MPG2011 |
Serial |
1584 |
|
Permanent link to this record |
|
|
|
|
Author |
Yasuko Sugito; Trevor Canham; Javier Vazquez; Marcelo Bertalmio |
|
|
Title |
A Study of Objective Quality Metrics for HLG-Based HDR/WCG Image Coding |
Type |
Journal |
|
Year |
2021 |
Publication |
SMPTE Motion Imaging Journal |
Abbreviated Journal |
SMPTE |
|
|
Volume |
130 |
Issue |
4 |
Pages |
53-65 |
|
|
Keywords |
|
|
|
Abstract |
In this work, we study the suitability of high dynamic range, wide color gamut (HDR/WCG) objective quality metrics to assess the perceived deterioration of compressed images encoded using the hybrid log-gamma (HLG) method, which is the standard for HDR television. Several image quality metrics have been developed to deal specifically with HDR content, although in previous work we showed that the best results (i.e., better matches to the opinion of human expert observers) are obtained by an HDR metric that consists simply of applying a given standard dynamic range metric, called visual information fidelity (VIF), directly to HLG-encoded images. However, all these HDR metrics ignore the chroma components in their calculations, that is, they consider only the luminance channel. For this reason, in the current work, we conduct subjective evaluation experiments in a professional setting using compressed HDR/WCG images encoded with HLG and analyze the ability of the best HDR metric to detect perceivable distortions in the chroma components, as well as the suitability of popular color metrics (including ΔITPR, which supports parameters for HLG) to correlate with the opinion scores. Our first contribution is to show that there is a need to consider the chroma components in HDR metrics, as there are color distortions that subjects perceive but that the best HDR metric fails to detect. Our second contribution is the surprising result that VIF, which utilizes only the luminance channel, correlates much better with the subjective evaluation scores than the investigated metrics that do consider the color components. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
SCV2021 |
Serial |
3671 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Famadas; Meysam Madadi; Cristina Palmero; Sergio Escalera |
|
|
Title |
Generative Video Face Reenactment by AUs and Gaze Regularization |
Type |
Conference Article |
|
Year |
2020 |
Publication |
15th IEEE International Conference on Automatic Face and Gesture Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
444-451 |
|
|
Keywords |
|
|
|
Abstract |
In this work, we propose an encoder-decoder-like architecture to perform face reenactment in image sequences. Our goal is to transfer the training subject identity to a given test subject. We regularize face reenactment by facial action unit intensity and 3D gaze vector regression. This way, we force the network to transfer subtle facial expressions and eye dynamics, providing a more lifelike result. The proposed encoder-decoder receives as input the previous sequence frame stacked with the current frame's image of facial landmarks. Thus, the generated frames benefit from appearance and geometry, while keeping temporal coherence for the generated sequence. At the test stage, a new target subject with the facial performance of the source subject and the appearance of the training subject is reenacted. Principal component analysis is applied to project the test subject geometry to the closest training subject geometry before reenactment. Evaluation of our proposal shows faster convergence, and more accurate and realistic results, in comparison to other architectures without action units and gaze regularization. |
|
|
Address |
Virtual; November 2020 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
FG |
|
|
Notes |
HUPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ FMP2020 |
Serial |
3517 |
|
Permanent link to this record |
|
|
|
|
Author |
Mariella Dimiccoli; Jean-Pascal Jacob; Lionel Moisan |
|
|
Title |
Particle detection and tracking in fluorescence time-lapse imaging: a contrario approach |
Type |
Journal Article |
|
Year |
2016 |
Publication |
Journal of Machine Vision and Applications |
Abbreviated Journal |
MVAP |
|
|
Volume |
27 |
Issue |
|
Pages |
511-527 |
|
|
Keywords |
particle detection; particle tracking; a-contrario approach; time-lapse fluorescence imaging |
|
|
Abstract |
In this work, we propose a probabilistic approach for the detection and tracking of particles in biological images. In the presence of very noisy and poor-quality data, particles and trajectories can be characterized by an a-contrario model that estimates the probability of observing the structures of interest in random data. This approach, first introduced in the modeling of human visual perception and then successfully applied in many image processing tasks, leads to algorithms that require neither a prior learning stage nor tedious parameter tuning and are very robust to noise. Comparative evaluations against a well-established baseline show that the proposed approach outperforms the state of the art. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ DJM2016 |
Serial |
2735 |
|
Permanent link to this record |
|
|
|
|
Author |
Rada Deeb; Joost Van de Weijer; Damien Muselet; Mathieu Hebert; Alain Tremeau |
|
|
Title |
Deep spectral reflectance and illuminant estimation from self-interreflections |
Type |
Journal Article |
|
Year |
2019 |
Publication |
Journal of the Optical Society of America A |
Abbreviated Journal |
JOSA A |
|
|
Volume |
36 |
Issue |
1 |
Pages |
105-114 |
|
|
Keywords |
|
|
|
Abstract |
In this work, we propose a convolutional neural network based approach to estimate the spectral reflectance of a surface and spectral power distribution of light from a single RGB image of a V-shaped surface. Interreflections happening in a concave surface lead to gradients of RGB values over its area. These gradients carry a lot of information concerning the physical properties of the surface and the illuminant. Our network is trained with only simulated data constructed using a physics-based interreflection model. Coupling interreflection effects with deep learning helps to retrieve the spectral reflectance under an unknown light and to estimate spectral power distribution of this light as well. In addition, it is more robust to the presence of image noise than classical approaches. Our results show that the proposed approach outperforms state-of-the-art learning-based approaches on simulated data. In addition, it gives better results on real data compared to other interreflection-based approaches. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
LAMP; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ DWM2019 |
Serial |
3362 |
|
Permanent link to this record |
|
|
|
|
Author |
Oscar Camara; Estanislao Oubel; Gemma Piella; Simone Balocco; Mathieu De Craene; Alejandro F. Frangi |
|
|
Title |
Multi-sequence Registration of Cine, Tagged and Delay-Enhancement MRI with Shift Correction and Steerable Pyramid-Based Detagging |
Type |
Conference Article |
|
Year |
2009 |
Publication |
5th International Conference on Functional Imaging and Modeling of the Heart |
Abbreviated Journal |
|
|
|
Volume |
5528 |
Issue |
|
Pages |
330–338 |
|
|
Keywords |
|
|
|
Abstract |
In this work, we present a registration framework for cardiac cine MRI (cMRI), tagged MRI (tMRI) and delay-enhancement MRI (deMRI), in which the two main obstacles to accurate alignment between these images have been taken into account: the presence of tags in tMRI and respiration artifacts in all sequences. A steerable pyramid image decomposition is used for detagging, since it is suitable for extracting high-order oriented structures by directional adaptive filtering. Shift correction of cMRI is achieved by first maximizing the similarity between the Long Axis and Short Axis cMRI. Subsequently, these shift-corrected images are used as target images in a rigid registration procedure with their corresponding tMRI/deMRI in order to correct their shift. The proposed registration framework has been evaluated in 840 registration tests, considerably improving the alignment of the MR images (mean RMS error of 2.04 mm vs. 5.44 mm). |
|
|
Address |
Nice, France |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-01931-9 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
FIMH |
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
BCNPCL @ bcnpcl @ COP2009 |
Serial |
1255 |
|
Permanent link to this record |