|
Records |
Links |
|
Author |
Alex Gomez-Villa; Bartlomiej Twardowski; Lu Yu; Andrew Bagdanov; Joost Van de Weijer |
![download PDF file pdf](img/file_PDF.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Continually Learning Self-Supervised Representations With Projected Functional Regularization |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
CVPR 2022 Workshop on Continual Learning (CLVision, 3rd Edition) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
3866-3876 |
|
|
Keywords |
Computer vision; Conferences; Self-supervised learning; Image representation; Pattern recognition |
|
|
Abstract |
Recent self-supervised learning methods are able to learn high-quality image representations and are closing the gap with supervised approaches. However, these methods are unable to acquire new knowledge incrementally – they are, in fact, mostly used only as a pre-training phase over IID data. In this work we investigate self-supervised methods in continual learning regimes without any replay
mechanism. We show that naive functional regularization,also known as feature distillation, leads to lower plasticity and limits continual learning performance. Instead, we propose Projected Functional Regularization in which a separate temporal projection network ensures that the newly learned feature space preserves information of the previous one, while at the same time allowing for the learning of new features. This prevents forgetting while maintaining the plasticity of the learner. Comparison with other incremental learning approaches applied to self-supervision demonstrates that our method obtains competitive performance in
different scenarios and on multiple datasets. |
|
|
Address |
New Orleans, USA; 20 June 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPRW |
|
|
Notes |
LAMP: 600.147; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GTY2022 |
Serial |
3704 |
|
Permanent link to this record |
|
|
|
|
Author |
Giuseppe De Gregorio; Sanket Biswas; Mohamed Ali Souibgui; Asma Bensalah; Josep Llados; Alicia Fornes; Angelo Marcelli |
![download PDF file pdf](img/file_PDF.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
A Few Shot Multi-representation Approach for N-Gram Spotting in Historical Manuscripts |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022) |
Abbreviated Journal |
|
|
|
Volume |
13639 |
Issue |
|
Pages |
3-12 |
|
|
Keywords |
N-gram spotting; Few-shot learning; Multimodal understanding; Historical handwritten collections |
|
|
Abstract |
Despite recent advances in automatic text recognition, the performance remains moderate when it comes to historical manuscripts. This is mainly because of the scarcity of available labelled data to train the data-hungry Handwritten Text Recognition (HTR) models. The Keyword Spotting System (KWS) provides a valid alternative to HTR due to the reduction in error rate, but it is usually limited to a closed reference vocabulary. In this paper, we propose a few-shot learning paradigm for spotting sequences of a few characters (N-gram) that requires a small amount of labelled training data. We exhibit that recognition of important n-grams could reduce the system’s dependency on vocabulary. In this case, an out-of-vocabulary (OOV) word in an input handwritten line image could be a sequence of n-grams that belong to the lexicon. An extensive experimental evaluation of our proposed multi-representation approach was carried out on a subset of Bentham’s historical manuscript collections to obtain some really promising results in this direction. |
|
|
Address |
December 04 – 07, 2022; Hyderabad, India |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG; 600.121; 600.162; 602.230; 600.140 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GBS2022 |
Serial |
3733 |
|
Permanent link to this record |
|
|
|
|
Author |
Arnau Baro; Carles Badal; Pau Torras; Alicia Fornes |
![download PDF file pdf](img/file_PDF.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Handwritten Historical Music Recognition through Sequence-to-Sequence with Attention Mechanism |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
3rd International Workshop on Reading Music Systems (WoRMS2021) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
55-59 |
|
|
Keywords |
Optical Music Recognition; Digits; Image Classification |
|
|
Abstract |
Despite decades of research in Optical Music Recognition (OMR), the recognition of old handwritten music scores remains a challenge because of the variabilities in the handwriting styles, paper degradation, lack of standard notation, etc. Therefore, the research in OMR systems adapted to the particularities of old manuscripts is crucial to accelerate the conversion of music scores existing in archives into digital libraries, fostering the dissemination and preservation of our music heritage. In this paper we explore the adaptation of sequence-to-sequence models with attention mechanism (used in translation and handwritten text recognition) and the generation of specific synthetic data for recognizing old music scores. The experimental validation demonstrates that our approach is promising, especially when compared with long short-term memory neural networks. |
|
|
Address |
July 23, 2021, Alicante (Spain) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WoRMS |
|
|
Notes |
DAG; 600.121; 600.162; 602.230; 600.140 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BBT2022 |
Serial |
3734 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Torras; Arnau Baro; Alicia Fornes; Lei Kang |
![download PDF file pdf](img/file_PDF.gif)
|
|
Title |
Improving Handwritten Music Recognition through Language Model Integration |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
4th International Workshop on Reading Music Systems (WoRMS2022) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
42-46 |
|
|
Keywords |
optical music recognition; historical sources; diversity; music theory; digital humanities |
|
|
Abstract |
Handwritten Music Recognition, especially in the historical domain, is an inherently challenging endeavour; paper degradation artefacts and the ambiguous nature of handwriting make recognising such scores an error-prone process, even for the current state-of-the-art Sequence to Sequence models. In this work we propose a way of reducing the production of statistically implausible output sequences by fusing a Language Model into a recognition Sequence to Sequence model. The idea is leveraging visually-conditioned and context-conditioned output distributions in order to automatically find and correct any mistakes that would otherwise break context significantly. We have found this approach to improve recognition results to 25.15 SER (%) from a previous best of 31.79 SER (%) in the literature. |
|
|
Address |
November 18, 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WoRMS |
|
|
Notes |
DAG; 600.121; 600.162; 602.230 |
Approved |
no |
|
|
Call Number |
Admin @ si @ TBF2022 |
Serial |
3735 |
|
Permanent link to this record |
|
|
|
|
Author |
Asma Bensalah; Alicia Fornes; Cristina Carmona_Duarte; Josep Llados |
![download PDF file pdf](img/file_PDF.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Easing Automatic Neurorehabilitation via Classification and Smoothness Analysis |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022 |
Abbreviated Journal |
|
|
|
Volume |
13424 |
Issue |
|
Pages |
336-348 |
|
|
Keywords |
Neurorehabilitation; Upper-lim; Movement classification; Movement smoothness; Deep learning; Jerk |
|
|
Abstract |
Assessing the quality of movements for post-stroke patients during the rehabilitation phase is vital given that there is no standard stroke rehabilitation plan for all the patients. In fact, it depends basically on the patient’s functional independence and its progress along the rehabilitation sessions. To tackle this challenge and make neurorehabilitation more agile, we propose an automatic assessment pipeline that starts by recognising patients’ movements by means of a shallow deep learning architecture, then measuring the movement quality using jerk measure and related measures. A particularity of this work is that the dataset used is clinically relevant, since it represents movements inspired from Fugl-Meyer a well common upper-limb clinical stroke assessment scale for stroke patients. We show that it is possible to detect the contrast between healthy and patients movements in terms of smoothness, besides achieving conclusions about the patients’ progress during the rehabilitation sessions that correspond to the clinicians’ findings about each case. |
|
|
Address |
June 7-9, 2022, Las Palmas de Gran Canaria, Spain |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IGS |
|
|
Notes |
DAG; 600.121; 600.162; 602.230; 600.140 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BFC2022 |
Serial |
3738 |
|
Permanent link to this record |
|
|
|
|
Author |
Alicia Fornes; Asma Bensalah; Cristina Carmona_Duarte; Jialuo Chen; Miguel A. Ferrer; Andreas Fischer; Josep Llados; Cristina Martin; Eloy Opisso; Rejean Plamondon; Anna Scius-Bertrand; Josep Maria Tormos |
![download PDF file pdf](img/file_PDF.gif)
![goto web page (via DOI) doi](img/doi.gif)
|
|
Title |
The RPM3D Project: 3D Kinematics for Remote Patient Monitoring |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
Intertwining Graphonomics with Human Movements. 20th International Conference of the International Graphonomics Society, IGS 2022 |
Abbreviated Journal |
|
|
|
Volume |
13424 |
Issue |
|
Pages |
217-226 |
|
|
Keywords |
Healthcare applications; Kinematic; Theory of Rapid Human Movements; Human activity recognition; Stroke rehabilitation; 3D kinematics |
|
|
Abstract |
This project explores the feasibility of remote patient monitoring based on the analysis of 3D movements captured with smartwatches. We base our analysis on the Kinematic Theory of Rapid Human Movement. We have validated our research in a real case scenario for stroke rehabilitation at the Guttmann Institute (https://www.guttmann.com/en/) (neurorehabilitation hospital), showing promising results. Our work could have a great impact in remote healthcare applications, improving the medical efficiency and reducing the healthcare costs. Future steps include more clinical validation, developing multi-modal analysis architectures (analysing data from sensors, images, audio, etc.), and exploring the application of our technology to monitor other neurodegenerative diseases. |
|
|
Address |
June 7-9, 2022, Las Palmas de Gran Canaria, Spain |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IGS |
|
|
Notes |
DAG; 600.121; 600.162; 602.230; 600.140 |
Approved |
no |
|
|
Call Number |
Admin @ si @ FBC2022 |
Serial |
3739 |
|
Permanent link to this record |
|
|
|
|
Author |
Arnau Baro; Pau Riba; Alicia Fornes |
![goto web page (via DOI) doi](img/doi.gif)
|
|
Title |
Musigraph: Optical Music Recognition Through Object Detection and Graph Neural Network |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
Frontiers in Handwriting Recognition. International Conference on Frontiers in Handwriting Recognition (ICFHR2022) |
Abbreviated Journal |
|
|
|
Volume |
13639 |
Issue |
|
Pages |
171-184 |
|
|
Keywords |
Object detection; Optical music recognition; Graph neural network |
|
|
Abstract |
During the last decades, the performance of optical music recognition has been increasingly improving. However, and despite the 2-dimensional nature of music notation (e.g. notes have rhythm and pitch), most works treat musical scores as a sequence of symbols in one dimension, which make their recognition still a challenge. Thus, in this work we explore the use of graph neural networks for musical score recognition. First, because graphs are suited for n-dimensional representations, and second, because the combination of graphs with deep learning has shown a great performance in similar applications. Our methodology consists of: First, we will detect each isolated/atomic symbols (those that can not be decomposed in more graphical primitives) and the primitives that form a musical symbol. Then, we will build the graph taking as root node the notehead and as leaves those primitives or symbols that modify the note’s rhythm (stem, beam, flag) or pitch (flat, sharp, natural). Finally, the graph is translated into a human-readable character sequence for a final transcription and evaluation. Our method has been tested on more than five thousand measures, showing promising results. |
|
|
Address |
December 04 – 07, 2022; Hyderabad, India |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG; 600.162; 600.140; 602.230 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BRF2022b |
Serial |
3740 |
|
Permanent link to this record |
|
|
|
|
Author |
Carlos Boned Riera; Oriol Ramos Terrades |
![goto web page (via DOI) doi](img/doi.gif)
|
|
Title |
Discriminative Neural Variational Model for Unbalanced Classification Tasks in Knowledge Graph |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
26th International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2186-2191 |
|
|
Keywords |
Measurement; Couplings; Semantics; Ear; Benchmark testing; Data models; Pattern recognition |
|
|
Abstract |
Nowadays the paradigm of link discovery problems has shown significant improvements on Knowledge Graphs. However, method performances are harmed by the unbalanced nature of this classification problem, since many methods are easily biased to not find proper links. In this paper we present a discriminative neural variational auto-encoder model, called DNVAE from now on, in which we have introduced latent variables to serve as embedding vectors. As a result, the learnt generative model approximate better the underlying distribution and, at the same time, it better differentiate the type of relations in the knowledge graph. We have evaluated this approach on benchmark knowledge graph and Census records. Results in this last data set are quite impressive since we reach the highest possible score in the evaluation metrics. However, further experiments are still needed to deeper evaluate the performance of the method in more challenging tasks. |
|
|
Address |
Montreal; Quebec; Canada; August 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
DAG; 600.121; 600.162 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BoR2022 |
Serial |
3741 |
|
Permanent link to this record |
|
|
|
|
Author |
Danna Xue; Fei Yang; Pei Wang; Luis Herranz; Jinqiu Sun; Yu Zhu; Yanning Zhang |
![download PDF file pdf](img/file_PDF.gif)
![find book details (via ISBN) isbn](img/isbn.gif)
|
|
Title |
SlimSeg: Slimmable Semantic Segmentation with Boundary Supervision |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
30th ACM International Conference on Multimedia |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
6539-6548 |
|
|
Keywords |
|
|
|
Abstract |
Accurate semantic segmentation models typically require significant computational resources, inhibiting their use in practical applications. Recent works rely on well-crafted lightweight models to achieve fast inference. However, these models cannot flexibly adapt to varying accuracy and efficiency requirements. In this paper, we propose a simple but effective slimmable semantic segmentation (SlimSeg) method, which can be executed at different capacities during inference depending on the desired accuracy-efficiency tradeoff. More specifically, we employ parametrized channel slimming by stepwise downward knowledge distillation during training. Motivated by the observation that the differences between segmentation results of each submodel are mainly near the semantic borders, we introduce an additional boundary guided semantic segmentation loss to further improve the performance of each submodel. We show that our proposed SlimSeg with various mainstream networks can produce flexible models that provide dynamic adjustment of computational cost and better performance than independent models. Extensive experiments on semantic segmentation benchmarks, Cityscapes and CamVid, demonstrate the generalization ability of our framework. |
|
|
Address |
Lisboa, Portugal, October 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Association for Computing Machinery |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-4503-9203-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MM |
|
|
Notes |
MACO; 600.161; 601.400 |
Approved |
no |
|
|
Call Number |
Admin @ si @ XYW2022 |
Serial |
3758 |
|
Permanent link to this record |
|
|
|
|
Author |
Shiqi Yang; Yaxing Wang; Kai Wang; Shangling Jui; Joost Van de Weijer |
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Attracting and Dispersing: A Simple Approach for Source-free Domain Adaptation |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
36th Conference on Neural Information Processing Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
We propose a simple but effective source-free domain adaptation (SFDA) method.
Treating SFDA as an unsupervised clustering problem and following the intuition
that local neighbors in feature space should have more similar predictions than
other features, we propose to optimize an objective of prediction consistency. This
objective encourages local neighborhood features in feature space to have similar
predictions while features farther away in feature space have dissimilar predictions, leading to efficient feature clustering and cluster assignment simultaneously. For efficient training, we seek to optimize an upper-bound of the objective resulting in two simple terms. Furthermore, we relate popular existing methods in domain adaptation, source-free domain adaptation and contrastive learning via the perspective of discriminability and diversity. The experimental results prove the superiority of our method, and our method can be adopted as a simple but strong baseline for future research in SFDA. Our method can be also adapted to source-free open-set and partial-set DA which further shows the generalization ability of our method. |
|
|
Address |
Virtual; November 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
NEURIPS |
|
|
Notes |
LAMP; 600.147 |
Approved |
no |
|
|
Call Number |
Admin @ si @ YWW2022a |
Serial |
3792 |
|
Permanent link to this record |
|
|
|
|
Author |
Saiping Zhang; Luis Herranz; Marta Mrak; Marc Gorriz Blanch; Shuai Wan; Fuzheng Yang |
![download PDF file pdf](img/file_PDF.gif)
![goto web page (via DOI) doi](img/doi.gif)
|
|
Title |
DCNGAN: A Deformable Convolution-Based GAN with QP Adaptation for Perceptual Quality Enhancement of Compressed Video |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
47th International Conference on Acoustics, Speech, and Signal Processing |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
In this paper, we propose a deformable convolution-based generative adversarial network (DCNGAN) for perceptual quality enhancement of compressed videos. DCNGAN is also adaptive to the quantization parameters (QPs). Compared with optical flows, deformable convolutions are more effective and efficient to align frames. Deformable convolutions can operate on multiple frames, thus leveraging more temporal information, which is beneficial for enhancing the perceptual quality of compressed videos. Instead of aligning frames in a pairwise manner, the deformable convolution can process multiple frames simultaneously, which leads to lower computational complexity. Experimental results demonstrate that the proposed DCNGAN outperforms other state-of-the-art compressed video quality enhancement algorithms. |
|
|
Address |
Virtual; May 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICASSP |
|
|
Notes |
MACO; 600.161; 601.379 |
Approved |
no |
|
|
Call Number |
Admin @ si @ ZHM2022a |
Serial |
3765 |
|
Permanent link to this record |
|
|
|
|
Author |
German Barquero; Johnny Nuñez; Sergio Escalera; Zhen Xu; Wei-Wei Tu; Isabelle Guyon |
![goto web page url](img/www.gif)
|
|
Title |
Didn’t see that coming: a survey on non-verbal social human behavior forecasting |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
Understanding Social Behavior in Dyadic and Small Group Interactions |
Abbreviated Journal |
|
|
|
Volume |
173 |
Issue |
|
Pages |
139-178 |
|
|
Keywords |
|
|
|
Abstract |
Non-verbal social human behavior forecasting has increasingly attracted the interest of the research community in recent years. Its direct applications to human-robot interaction and socially-aware human motion generation make it a very attractive field. In this survey, we define the behavior forecasting problem for multiple interactive agents in a generic way that aims at unifying the fields of social signals prediction and human motion forecasting, traditionally separated. We hold that both problem formulations refer to the same conceptual problem, and identify many shared fundamental challenges: future stochasticity, context awareness, history exploitation, etc. We also propose a taxonomy that comprises
methods published in the last 5 years in a very informative way and describes the current main concerns of the community with regard to this problem. In order to promote further research on this field, we also provide a summarized and friendly overview of audiovisual datasets featuring non-acted social interactions. Finally, we describe the most common metrics used in this task and their particular issues. |
|
|
Address |
Virtual; June 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
PMLR |
|
|
Notes |
HuPBA; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ BNE2022 |
Serial |
3766 |
|
Permanent link to this record |
|
|
|
|
Author |
Guillem Martinez; Maya Aghaei; Martin Dijkstra; Bhalaji Nagarajan; Femke Jaarsma; Jaap van de Loosdrecht; Petia Radeva; Klaas Dijkstra |
![download PDF file pdf](img/file_PDF.gif)
![goto web page (via DOI) doi](img/doi.gif)
|
|
Title |
Hyper-Spectral Imaging for Overlapping Plastic Flakes Segmentation |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
47th International Conference on Acoustics, Speech, and Signal Processing |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Hyper-spectral imaging; plastic sorting; multi-label segmentation; bitfield encoding |
|
|
Abstract |
In this paper, we propose a deformable convolution-based generative adversarial network (DCNGAN) for perceptual quality enhancement of compressed videos. DCNGAN is also adaptive to the quantization parameters (QPs). Compared with optical flows, deformable convolutions are more effective and efficient to align frames. Deformable convolutions can operate on multiple frames, thus leveraging more temporal information, which is beneficial for enhancing the perceptual quality of compressed videos. Instead of aligning frames in a pairwise manner, the deformable convolution can process multiple frames simultaneously, which leads to lower computational complexity. Experimental results demonstrate that the proposed DCNGAN outperforms other state-of-the-art compressed video quality enhancement algorithms. |
|
|
Address |
Singapore; May 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICASSP |
|
|
Notes |
MILAB; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ MAD2022 |
Serial |
3767 |
|
Permanent link to this record |
|
|
|
|
Author |
Spencer Low; Oliver Nina; Angel Sappa; Erik Blasch; Nathan Inkawhich |
![download PDF file pdf](img/file_PDF.gif)
![goto web page (via DOI) doi](img/doi.gif)
|
|
Title |
Multi-Modal Aerial View Object Classification Challenge Results – PBVS 2022 |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
350-358 |
|
|
Keywords |
|
|
|
Abstract |
This paper details the results and main findings of the second iteration of the Multi-modal Aerial View Object Classification (MAVOC) challenge. The primary goal of both MAVOC challenges is to inspire research into methods for building recognition models that utilize both synthetic aperture radar (SAR) and electro-optical (EO) imagery. Teams are encouraged to develop multi-modal approaches that incorporate complementary information from both domains. While the 2021 challenge showed a proof of concept that both modalities could be used together, the 2022 challenge focuses on the detailed multi-modal methods. The 2022 challenge uses the same UNIfied Coincident Optical and Radar for recognitioN (UNICORN) dataset and competition format that was used in 2021. Specifically, the challenge focuses on two tasks, (1) SAR classification and (2) SAR + EO classification. The bulk of this document is dedicated to discussing the top performing methods and describing their performance on our blind test set. Notably, all of the top ten teams outperform a Resnet-18 baseline. For SAR classification, the top team showed a 129% improvement over baseline and an 8% average improvement from the 2021 winner. The top team for SAR + EO classification shows a 165% improvement with a 32% average improvement over 2021. |
|
|
Address |
New Orleans; USA; June 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPRW |
|
|
Notes |
MSIAU |
Approved |
no |
|
|
Call Number |
Admin @ si @ LNS2022 |
Serial |
3768 |
|
Permanent link to this record |
|
|
|
|
Author |
Adam Fodor; Rachid R. Saboundji; Julio C. S. Jacques Junior; Sergio Escalera; David Gallardo Pujol; Andras Lorincz |
![goto web page url](img/www.gif)
|
|
Title |
Multimodal Sentiment and Personality Perception Under Speech: A Comparison of Transformer-based Architectures |
Type ![sorted by Type field, descending order (down)](img/sort_desc.gif) |
Conference Article |
|
Year |
2022 |
Publication |
Understanding Social Behavior in Dyadic and Small Group Interactions |
Abbreviated Journal |
|
|
|
Volume |
173 |
Issue |
|
Pages |
218-241 |
|
|
Keywords |
|
|
|
Abstract |
Human-machine, human-robot interaction, and collaboration appear in diverse fields, from homecare to Cyber-Physical Systems. Technological development is fast, whereas real-time methods for social communication analysis that can measure small changes in sentiment and personality states, including visual, acoustic and language modalities are lagging, particularly when the goal is to build robust, appearance invariant, and fair methods. We study and compare methods capable of fusing modalities while satisfying real-time and invariant appearance conditions. We compare state-of-the-art transformer architectures in sentiment estimation and introduce them in the much less explored field of personality perception. We show that the architectures perform differently on automatic sentiment and personality perception, suggesting that each task may be better captured/modeled by a particular method. Our work calls attention to the attractive properties of the linear versions of the transformer architectures. In particular, we show that the best results are achieved by fusing the different architectures{’} preprocessing methods. However, they pose extreme conditions in computation power and energy consumption for real-time computations for quadratic transformers due to their memory requirements. In turn, linear transformers pave the way for quantifying small changes in sentiment estimation and personality perception for real-time social communications for machines and robots. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
PMLR |
|
|
Notes |
HuPBA; no menciona |
Approved |
no |
|
|
Call Number |
Admin @ si @ FSJ2022 |
Serial |
3769 |
|
Permanent link to this record |