|
Records |
Links |
|
Author |
Sergio Escalera; Jordi Gonzalez; Xavier Baro; Miguel Reyes; Oscar Lopes; Isabelle Guyon; V. Athitsos; Hugo Jair Escalante |
![download PDF file pdf](img/file_PDF.gif)
![find book details (via ISBN) isbn](img/isbn.gif)
|
|
Title |
Multi-modal Gesture Recognition Challenge 2013: Dataset and Results |
Type |
Conference Article |
|
Year |
2013 |
Publication |
15th ACM International Conference on Multimodal Interaction |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
445-452 |
|
|
Keywords |
|
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The recognition of continuous natural gestures is a complex and challenging problem due to the multi-modal nature of involved visual cues (e.g. fingers and lips movements, subtle facial expressions, body pose, etc.), as well as technical limitations such as spatial and temporal resolution and unreliable
depth cues. In order to promote the research advance on this field, we organized a challenge on multi-modal gesture recognition. We made available a large video database of 13; 858 gestures from a lexicon of 20 Italian gesture categories recorded with a KinectTM camera, providing the audio, skeletal model, user mask, RGB and depth images. The focus of the challenge was on user independent multiple gesture learning. There are no resting positions and the gestures are performed in continuous sequences lasting 1-2 minutes, containing between 8 and 20 gesture instances in each sequence. As a result, the dataset contains around 1:720:800 frames. In addition to the 20 main gesture categories, ‘distracter’ gestures are included, meaning that additional audio
and gestures out of the vocabulary are included. The final evaluation of the challenge was defined in terms of the Levenshtein edit distance, where the goal was to indicate the real order of gestures within the sequence. 54 international teams participated in the challenge, and outstanding results
were obtained by the first ranked participants. |
|
|
Address |
Sidney; Australia; December 2013 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-4503-2129-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICMI |
|
|
Notes |
HUPBA; ISE; 600.063;MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ EGB2013 |
Serial |
2373 |
|
Permanent link to this record |
|
|
|
|
Author |
Eduardo Aguilar; Petia Radeva |
![goto web page url](img/www.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Food Recognition by Integrating Local and Flat Classifiers |
Type |
Conference Article |
|
Year |
2019 |
Publication |
9th Iberian Conference on Pattern Recognition and Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
11867 |
Issue |
|
Pages |
65-74 |
|
|
Keywords |
|
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The recognition of food image is an interesting research topic, in which its applicability in the creation of nutritional diaries stands out with the aim of improving the quality of life of people with a chronic disease (e.g. diabetes, heart disease) or prone to acquire it (e.g. people with overweight or obese). For a food recognition system to be useful in real applications, it is necessary to recognize a huge number of different foods. We argue that for very large scale classification, a traditional flat classifier is not enough to acquire an acceptable result. To address this, we propose a method that performs prediction with local classifiers, based on a class hierarchy, or with flat classifier. We decide which approach to use, depending on the analysis of both the Epistemic Uncertainty obtained for the image in the children classifiers and the prediction of the parent classifier. When our criterion is met, the final prediction is obtained with the respective local classifier; otherwise, with the flat classifier. From the results, we can see that the proposed method improves the classification performance compared to the use of a single flat classifier. |
|
|
Address |
Madrid; July 2019 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IbPRIA |
|
|
Notes |
MILAB; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ AgR2019b |
Serial |
3369 |
|
Permanent link to this record |
|
|
|
|
Author |
Arnau Baro; Pau Riba; Alicia Fornes |
![download PDF file pdf](img/file_PDF.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Towards the recognition of compound music notes in handwritten music scores |
Type |
Conference Article |
|
Year |
2016 |
Publication |
15th international conference on Frontiers in Handwriting Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The recognition of handwritten music scores still remains an open problem. The existing approaches can only deal with very simple handwritten scores mainly because of the variability in the handwriting style and the variability in the composition of groups of music notes (i.e. compound music notes). In this work we focus on this second problem and propose a method based on perceptual grouping for the recognition of compound music notes. Our method has been tested using several handwritten music scores of the CVC-MUSCIMA database and compared with a commercial Optical Music Recognition (OMR) software. Given that our method is learning-free, the obtained results are promising. |
|
|
Address |
Shenzhen; China; October 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
2167-6445 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICFHR |
|
|
Notes |
DAG; 600.097 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BRF2016 |
Serial |
2903 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; Ernest Valveny; Gemma Sanchez; Enric Marti |
![download PDF file pdf](img/file_PDF.gif)
![goto web page (via DOI) doi](img/doi.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Symbol recognition: current advances and perspectives |
Type |
Book Chapter |
|
Year |
2002 |
Publication |
Graphics Recognition Algorithms And Applications |
Abbreviated Journal |
LNCS |
|
|
Volume |
2390 |
Issue |
|
Pages |
104-128 |
|
|
Keywords |
|
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The recognition of symbols in graphic documents is an intensive research activity in the community of pattern recognition and document analysis. A key issue in the interpretation of maps, engineering drawings, diagrams, etc. is the recognition of domain dependent symbols according to a symbol database. In this work we first review the most outstanding symbol recognition methods from two different points of view: application domains and pattern recognition methods. In the second part of the paper, open and unaddressed problems involved in symbol recognition are described, analyzing their current state of art and discussing future research challenges. Thus, issues such as symbol representation, matching, segmentation, learning, scalability of recognition methods and performance evaluation are addressed in this work. Finally, we discuss the perspectives of symbol recognition concerning to new paradigms such as user interfaces in handheld computers or document database and WWW indexing by graphical content. |
|
|
Address |
London, UK |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer-Verlag |
Place of Publication |
|
Editor |
Dorothea Blostein and Young- Bin Kwon |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
Lecture Notes in Computer Science |
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
3-540-44066-6 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GREC |
|
|
Notes |
DAG; IAM; |
Approved |
no |
|
|
Call Number |
IAM @ iam @ LVS2002 |
Serial |
1572 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; Enric Marti; Juan J.Villanueva |
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Symbol recognition by error-tolerant subgraph matching between region adjacency graphs |
Type |
Journal Article |
|
Year |
2001 |
Publication |
IEEE Transactions on Pattern Analysis and Machine Intelligence |
Abbreviated Journal |
|
|
|
Volume |
23 |
Issue |
10 |
Pages |
1137-1143 |
|
|
Keywords |
|
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The recognition of symbols in graphic documents is an intensive research activity in the community of pattern recognition and document analysis. A key issue in the interpretation of maps, engineering drawings, diagrams, etc. is the recognition of domain dependent symbols according to a symbol database. In this work we first review the most outstanding symbol recognition methods from two different points of view: application domains and pattern recognition methods. In the second part of the paper, open and unaddressed problems involved in symbol recognition are described, analyzing their current state of art and discussing future research challenges. Thus, issues such as symbol representation, matching, segmentation, learning, scalability of recognition methods and performance evaluation are addressed in this work. Finally, we discuss the perspectives of symbol recognition concerning to new paradigms such as user interfaces in handheld computers or document database and WWW indexing by graphical content. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG;IAM;ISE; |
Approved |
no |
|
|
Call Number |
IAM @ iam @ LMV2001 |
Serial |
1581 |
|
Permanent link to this record |
|
|
|
|
Author |
Andreas Fischer; Ching Y. Suen; Volkmar Frinken; Kaspar Riesen; Horst Bunke |
![download PDF file pdf](img/file_PDF.gif)
![find book details (via ISBN) isbn](img/isbn.gif)
|
|
Title |
A Fast Matching Algorithm for Graph-Based Handwriting Recognition |
Type |
Conference Article |
|
Year |
2013 |
Publication |
9th IAPR – TC15 Workshop on Graph-based Representation in Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
7877 |
Issue |
|
Pages |
194-203 |
|
|
Keywords |
|
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The recognition of unconstrained handwriting images is usually based on vectorial representation and statistical classification. Despite their high representational power, graphs are rarely used in this field due to a lack of efficient graph-based recognition methods. Recently, graph similarity features have been proposed to bridge the gap between structural representation and statistical classification by means of vector space embedding. This approach has shown a high performance in terms of accuracy but had shortcomings in terms of computational speed. The time complexity of the Hungarian algorithm that is used to approximate the edit distance between two handwriting graphs is demanding for a real-world scenario. In this paper, we propose a faster graph matching algorithm which is derived from the Hausdorff distance. On the historical Parzival database it is demonstrated that the proposed method achieves a speedup factor of 12.9 without significant loss in recognition accuracy. |
|
|
Address |
Vienna; Austria; May 2013 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-38220-8 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
GBR |
|
|
Notes |
DAG; 600.045; 605.203 |
Approved |
no |
|
|
Call Number |
Admin @ si @ FSF2013 |
Serial |
2294 |
|
Permanent link to this record |
|
|
|
|
Author |
Noha Elfiky |
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Compact, Adaptive and Discriminative Spatial Pyramids for Improved Object and Scene Classification |
Type |
Book Whole |
|
Year |
2012 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The release of challenging datasets with a vast number of images, requires the development of efficient image representations and algorithms which are able to manipulate these large-scale datasets efficiently. Nowadays the Bag-of-Words (BoW) is the most successful approach in the context of object and scene classification tasks. However, its main drawback is the absence of the important spatial information. Spatial pyramids (SP) have been successfully applied to incorporate spatial information into BoW-based image representation. Observing the remarkable performance of spatial pyramids, their growing number of applications to a broad range of vision problems, and finally its geometry inclusion, a question can be asked what are the limits of spatial pyramids. Within the SP framework, the optimal way for obtaining an image spatial representation, which is able to cope with it’s most foremost shortcomings, concretely, it’s high dimensionality and the rigidity of the resulting image representation, still remains an active research domain. In summary, the main concern of this thesis is to search for the limits of spatial pyramids and try to figure out solutions for them. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Jordi Gonzalez;Xavier Roca |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ Elf2012 |
Serial |
2202 |
|
Permanent link to this record |
|
|
|
|
Author |
Victor Ponce |
![goto web page url](img/www.gif)
|
|
Title |
Evolutionary Bags of Space-Time Features for Human Analysis |
Type |
Book Whole |
|
Year |
2016 |
Publication |
PhD Thesis Universitat de Barcelona, UOC and CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Computer algorithms; Digital image processing; Digital video; Analysis of variance; Dynamic programming; Evolutionary computation; Gesture |
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The representation (or feature) learning has been an emerging concept in the last years, since it collects a set of techniques that are present in any theoretical or practical methodology referring to artificial intelligence. In computer vision, a very common representation has adopted the form of the well-known Bag of Visual Words. This representation appears implicitly in most approaches where images are described, and is also present in a huge number of areas and domains: image content retrieval, pedestrian detection, human-computer interaction, surveillance, e-health, and social computing, amongst others. The early stages of this dissertation provide an approach for learning visual representations inside evolutionary algorithms, which consists of evolving weighting schemes to improve the BoVW representations for the task of recognizing categories of videos and images. Thus, we demonstrate the applicability of the most common weighting schemes, which are often used in text mining but are less frequently found in computer vision tasks. Beyond learning these visual representations, we provide an approach based on fusion strategies for learning spatiotemporal representations, from multimodal data obtained by depth sensors. Besides, we specially aim at the evolutionary and dynamic modelling, where the temporal factor is present in the nature of the data, such as video sequences of gestures and actions. Indeed, we explore the effects of probabilistic modelling for those approaches based on dynamic programming, so as to handle the temporal deformation and variance amongst video sequences of different categories. Finally, we integrate dynamic programming and generative models into an evolutionary computation framework, with the aim of learning Bags of SubGestures (BoSG) representations and hence to improve the generalization capability of standard gesture recognition approaches. The results obtained in the experimentation demonstrate, first, that evolutionary algorithms are useful for improving the representation of BoVW approaches in several datasets for recognizing categories in still images and video sequences. On the other hand, our experimentation reveals that both, the use of dynamic programming and generative models to align video sequences, and the representations obtained from applying fusion strategies in multimodal data, entail an enhancement on the performance when recognizing some gesture categories. Furthermore, the combination of evolutionary algorithms with models based on dynamic programming and generative approaches results, when aiming at the classification of video categories on large video datasets, in a considerable improvement over standard gesture and action recognition approaches. Finally, we demonstrate the applications of these representations in several domains for human analysis: classification of images where humans may be present, action and gesture recognition for general applications, and in particular for conversational settings within the field of restorative justice |
|
|
Address |
June 2016 |
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Sergio Escalera;Xavier Baro;Hugo Jair Escalante |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA |
Approved |
no |
|
|
Call Number |
Pon2016 |
Serial |
2814 |
|
Permanent link to this record |
|
|
|
|
Author |
Estefania Talavera; Nicolai Petkov; Petia Radeva |
![download PDF file pdf](img/file_PDF.gif)
![goto web page (via DOI) doi](img/doi.gif)
|
|
Title |
Unsupervised Routine Discovery in Egocentric Photo-Streams |
Type |
Conference Article |
|
Year |
2019 |
Publication |
18th International Conference on Computer Analysis of Images and Patterns |
Abbreviated Journal |
|
|
|
Volume |
11678 |
Issue |
|
Pages |
576-588 |
|
|
Keywords |
Routine discovery; Lifestyle; Egocentric vision; Behaviour analysis |
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The routine of a person is defined by the occurrence of activities throughout different days, and can directly affect the person’s health. In this work, we address the recognition of routine related days. To do so, we rely on egocentric images, which are recorded by a wearable camera and allow to monitor the life of the user from a first-person view perspective. We propose an unsupervised model that identifies routine related days, following an outlier detection approach. We test the proposed framework over a total of 72 days in the form of photo-streams covering around 2 weeks of the life of 5 different camera wearers. Our model achieves an average of 76% Accuracy and 68% Weighted F-Score for all the users. Thus, we show that our framework is able to recognise routine related days and opens the door to the understanding of the behaviour of people. |
|
|
Address |
Salermo; Italy; September 2019 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CAIP |
|
|
Notes |
MILAB; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ TPR2019a |
Serial |
3367 |
|
Permanent link to this record |
|
|
|
|
Author |
Eduard Vazquez; Ramon Baldrich; Joost Van de Weijer; Maria Vanrell |
![download PDF file pdf](img/file_PDF.gif)
![goto web page (via DOI) doi](img/doi.gif)
|
|
Title |
Describing Reflectances for Colour Segmentation Robust to Shadows, Highlights and Textures |
Type |
Journal Article |
|
Year |
2011 |
Publication |
IEEE Transactions on Pattern Analysis and Machine Intelligence |
Abbreviated Journal |
TPAMI |
|
|
Volume |
33 |
Issue |
5 |
Pages |
917-930 |
|
|
Keywords |
|
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The segmentation of a single material reflectance is a challenging problem due to the considerable variation in image measurements caused by the geometry of the object, shadows, and specularities. The combination of these effects has been modeled by the dichromatic reflection model. However, the application of the model to real-world images is limited due to unknown acquisition parameters and compression artifacts. In this paper, we present a robust model for the shape of a single material reflectance in histogram space. The method is based on a multilocal creaseness analysis of the histogram which results in a set of ridges representing the material reflectances. The segmentation method derived from these ridges is robust to both shadow, shading and specularities, and texture in real-world images. We further complete the method by incorporating prior knowledge from image statistics, and incorporate spatial coherence by using multiscale color contrast information. Results obtained show that our method clearly outperforms state-of-the-art segmentation methods on a widely used segmentation benchmark, having as a main characteristic its excellent performance in the presence of shadows and highlights at low computational cost. |
|
|
Address |
Los Alamitos; CA; USA; |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
IEEE Computer Society |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0162-8828 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
Admin @ si @ VBW2011 |
Serial |
1715 |
|
Permanent link to this record |
|
|
|
|
Author |
C. Alejandro Parraga; Arash Akbarinia |
![download PDF file pdf](img/file_PDF.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
NICE: A Computational Solution to Close the Gap from Colour Perception to Colour Categorization |
Type |
Journal Article |
|
Year |
2016 |
Publication |
PLoS One |
Abbreviated Journal |
Plos |
|
|
Volume |
11 |
Issue |
3 |
Pages |
e0149538 |
|
|
Keywords |
|
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The segmentation of visible electromagnetic radiation into chromatic categories by the human visual system has been extensively studied from a perceptual point of view, resulting in several colour appearance models. However, there is currently a void when it comes to relate these results to the physiological mechanisms that are known to shape the pre-cortical and cortical visual pathway. This work intends to begin to fill this void by proposing a new physiologically plausible model of colour categorization based on Neural Isoresponsive Colour Ellipsoids (NICE) in the cone-contrast space defined by the main directions of the visual signals entering the visual cortex. The model was adjusted to fit psychophysical measures that concentrate on the categorical boundaries and are consistent with the ellipsoidal isoresponse surfaces of visual cortical neurons. By revealing the shape of such categorical colour regions, our measures allow for a more precise and parsimonious description, connecting well-known early visual processing mechanisms to the less understood phenomenon of colour categorization. To test the feasibility of our method we applied it to exemplary images and a popular ground-truth chart obtaining labelling results that are better than those of current state-of-the-art algorithms. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
NEUROBIT; 600.068 |
Approved |
no |
|
|
Call Number |
Admin @ si @ PaA2016a |
Serial |
2747 |
|
Permanent link to this record |
|
|
|
|
Author |
Joakim Bruslund Haurum; Meysam Madadi; Sergio Escalera; Thomas B. Moeslund |
![download PDF file pdf](img/file_PDF.gif)
![goto web page (via DOI) doi](img/doi.gif)
|
|
Title |
Multi-Task Classification of Sewer Pipe Defects and Properties Using a Cross-Task Graph Neural Network Decoder |
Type |
Conference Article |
|
Year |
2022 |
Publication |
Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2806-2817 |
|
|
Keywords |
Vision Systems; Applications Multi-Task Classification |
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The sewerage infrastructure is one of the most important and expensive infrastructures in modern society. In order to efficiently manage the sewerage infrastructure, automated sewer inspection has to be utilized. However, while sewer
defect classification has been investigated for decades, little attention has been given to classifying sewer pipe properties such as water level, pipe material, and pipe shape, which are needed to evaluate the level of sewer pipe deterioration.
In this work we classify sewer pipe defects and properties concurrently and present a novel decoder-focused multi-task classification architecture Cross-Task Graph Neural Network (CT-GNN), which refines the disjointed per-task predictions using cross-task information. The CT-GNN architecture extends the traditional disjointed task-heads decoder, by utilizing a cross-task graph and unique class node embeddings. The cross-task graph can either be determined a priori based on the conditional probability between the task classes or determined dynamically using self-attention.
CT-GNN can be added to any backbone and trained end-toend at a small increase in the parameter count. We achieve state-of-the-art performance on all four classification tasks in the Sewer-ML dataset, improving defect classification and
water level classification by 5.3 and 8.0 percentage points, respectively. We also outperform the single task methods as well as other multi-task classification approaches while introducing 50 times fewer parameters than previous modelfocused approaches. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACV |
|
|
Notes |
HUPBA; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ BME2022 |
Serial |
3638 |
|
Permanent link to this record |
|
|
|
|
Author |
Cristhian A. Aguilera-Carrasco; F. Aguilera; Angel Sappa; C. Aguilera; Ricardo Toledo |
![download PDF file pdf](img/file_PDF.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Learning cross-spectral similarity measures with deep convolutional neural networks |
Type |
Conference Article |
|
Year |
2016 |
Publication |
29th IEEE Conference on Computer Vision and Pattern Recognition Worshops |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The simultaneous use of images from different spectracan be helpful to improve the performance of many computer vision tasks. The core idea behind the usage of crossspectral approaches is to take advantage of the strengths of each spectral band providing a richer representation of a scene, which cannot be obtained with just images from one spectral band. In this work we tackle the cross-spectral image similarity problem by using Convolutional Neural Networks (CNNs). We explore three different CNN architectures to compare the similarity of cross-spectral image patches. Specifically, we train each network with images from the visible and the near-infrared spectrum, and then test the result with two public cross-spectral datasets. Experimental results show that CNN approaches outperform the current state-of-art on both cross-spectral datasets. Additionally, our experiments show that some CNN architectures are capable of generalizing between different crossspectral domains. |
|
|
Address |
Las vegas; USA; June 2016 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPRW |
|
|
Notes |
ADAS; 600.086; 600.076 |
Approved |
no |
|
|
Call Number |
Admin @ si @AAS2016 |
Serial |
2809 |
|
Permanent link to this record |
|
|
|
|
Author |
Sergio Escalera; Jordi Gonzalez; Xavier Baro; Jamie Shotton |
![goto web page (via DOI) doi](img/doi.gif)
|
|
Title |
Guest Editor Introduction to the Special Issue on Multimodal Human Pose Recovery and Behavior Analysis |
Type |
Journal Article |
|
Year |
2016 |
Publication |
IEEE Transactions on Pattern Analysis and Machine Intelligence |
Abbreviated Journal |
TPAMI |
|
|
Volume |
28 |
Issue |
|
Pages |
1489 - 1491 |
|
|
Keywords |
|
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The sixteen papers in this special section focus on human pose recovery and behavior analysis (HuPBA). This is one of the most challenging topics in computer vision, pattern analysis, and machine learning. It is of critical importance for application areas that include gaming, computer interaction, human robot interaction, security, commerce, assistive technologies and rehabilitation, sports, sign language recognition, and driver assistance technology, to mention just a few. In essence, HuPBA requires dealing with the articulated nature of the human body, changes in appearance due to clothing, and the inherent problems of clutter scenes, such as background artifacts, occlusions, and illumination changes. These papers represent the most recent research in this field, including new methods considering still images, image sequences, depth data, stereo vision, 3D vision, audio, and IMUs, among others. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA; ISE;MV; |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
2851 |
|
Permanent link to this record |
|
|
|
|
Author |
Silvio Giancola; Anthony Cioppa; Adrien Deliege; Floriane Magera; Vladimir Somers; Le Kang; Xin Zhou; Olivier Barnich; Christophe De Vleeschouwer; Alexandre Alahi; Bernard Ghanem; Marc Van Droogenbroeck; Abdulrahman Darwish; Adrien Maglo; Albert Clapes; Andreas Luyts; Andrei Boiarov; Artur Xarles; Astrid Orcesi; Avijit Shah; Baoyu Fan; Bharath Comandur; Chen Chen; Chen Zhang; Chen Zhao; Chengzhi Lin; Cheuk-Yiu Chan; Chun Chuen Hui; Dengjie Li; Fan Yang; Fan Liang; Fang Da; Feng Yan; Fufu Yu; Guanshuo Wang; H. Anthony Chan; He Zhu; Hongwei Kan; Jiaming Chu; Jianming Hu; Jianyang Gu; Jin Chen; Joao V. B. Soares; Jonas Theiner; Jorge De Corte; Jose Henrique Brito; Jun Zhang; Junjie Li; Junwei Liang; Leqi Shen; Lin Ma; Lingchi Chen; Miguel Santos Marques; Mike Azatov; Nikita Kasatkin; Ning Wang; Qiong Jia; Quoc Cuong Pham; Ralph Ewerth; Ran Song; Rengang Li; Rikke Gade; Ruben Debien; Runze Zhang; Sangrok Lee; Sergio Escalera; Shan Jiang; Shigeyuki Odashima; Shimin Chen; Shoichi Masui; Shouhong Ding; Sin-wai Chan; Siyu Chen; Tallal El-Shabrawy; Tao He; Thomas B. Moeslund; Wan-Chi Siu; Wei Zhang; Wei Li; Xiangwei Wang; Xiao Tan; Xiaochuan Li; Xiaolin Wei; Xiaoqing Ye; Xing Liu; Xinying Wang; Yandong Guo; Yaqian Zhao; Yi Yu; Yingying Li; Yue He; Yujie Zhong; Zhenhua Guo; Zhiheng Li |
![goto web page url](img/www.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
SoccerNet 2022 Challenges Results |
Type |
Conference Article |
|
Year |
2022 |
Publication |
5th International ACM Workshop on Multimedia Content Analysis in Sports |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
75-86 |
|
|
Keywords |
|
|
|
Abstract ![sorted by Abstract field, ascending order (up)](img/sort_asc.gif) |
The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team. In 2022, the challenges were composed of 6 vision-based tasks: (1) action spotting, focusing on retrieving action timestamps in long untrimmed videos, (2) replay grounding, focusing on retrieving the live moment of an action shown in a replay, (3) pitch localization, focusing on detecting line and goal part elements, (4) camera calibration, dedicated to retrieving the intrinsic and extrinsic camera parameters, (5) player re-identification, focusing on retrieving the same players across multiple views, and (6) multiple object tracking, focusing on tracking players and the ball through unedited video streams. Compared to last year's challenges, tasks (1-2) had their evaluation metrics redefined to consider tighter temporal accuracies, and tasks (3-6) were novel, including their underlying data and annotations. More information on the tasks, challenges and leaderboards are available on this https URL. Baselines and development kits are available on this https URL. |
|
|
Address |
Lisboa; Portugal; October 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ACMW |
|
|
Notes |
HUPBA; no menciona |
Approved |
no |
|
|
Call Number |
Admin @ si @ GCD2022 |
Serial |
3801 |
|
Permanent link to this record |