|
Records |
Links |
|
Author |
Akshita Gupta; Sanath Narayan; Salman Khan; Fahad Shahbaz Khan; Ling Shao; Joost Van de Weijer |
|
|
Title |
Generative Multi-Label Zero-Shot Learning |
Type |
Journal Article |
|
Year |
2023 |
Publication |
IEEE Transactions on Pattern Analysis and Machine Intelligence |
Abbreviated Journal |
TPAMI |
|
|
Volume |
45 |
Issue |
12 |
Pages |
14611-14624 |
|
|
Keywords |
Generalized zero-shot learning; Multi-label classification; Zero-shot object detection; Feature synthesis |
|
|
Abstract |
Multi-label zero-shot learning strives to classify images into multiple unseen categories for which no data is available during training. The test samples can additionally contain seen categories in the generalized variant. Existing approaches rely on learning either shared or label-specific attention from the seen classes. Nevertheless, computing reliable attention maps for unseen classes during inference in a multi-label setting is still a challenge. In contrast, state-of-the-art single-label generative adversarial network (GAN) based approaches learn to directly synthesize the class-specific visual features from the corresponding class attribute embeddings. However, synthesizing multi-label features from GANs is still unexplored in the context of the zero-shot setting. When multiple objects occur jointly in a single image, a critical question is how to effectively fuse multi-class information. In this work, we introduce different fusion approaches at the attribute-level, feature-level and cross-level (across attribute and feature-levels) for synthesizing multi-label features from their corresponding multi-label class embeddings. To the best of our knowledge, our work is the first to tackle the problem of multi-label feature synthesis in the (generalized) zero-shot setting. Our cross-level fusion-based generative approach outperforms the state-of-the-art on three zero-shot benchmarks: NUS-WIDE, Open Images and MS COCO. Furthermore, we show the generalization capabilities of our fusion approach in the zero-shot detection task on MS COCO, achieving favorable performance against existing methods. |
|
|
Address |
December 2023 |
|
|
Notes |
LAMP; PID2021-128178OB-I00 |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3853 |
|
|
|
|
|
Author |
Laura Lopez-Fuentes; Joost Van de Weijer; Manuel Gonzalez-Hidalgo; Harald Skinnemoen; Andrew Bagdanov |
|
|
Title |
Review on computer vision techniques in emergency situations |
Type |
Journal Article |
|
Year |
2018 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
|
|
Volume |
77 |
Issue |
13 |
Pages |
17069–17107 |
|
|
Keywords |
Emergency management; Computer vision; Decision makers; Situational awareness; Critical situation |
|
|
Abstract |
In emergency situations, actions that save lives and limit the impact of hazards are crucial. In order to act, situational awareness is needed to decide what to do. Geolocalized photos and video of the situations as they evolve can be crucial in better understanding them and making decisions faster. Cameras are almost everywhere these days, either in terms of smartphones, installed CCTV cameras, UAVs or others. However, this poses challenges in big data and information overflow. Moreover, most of the time there are no disasters at any given location, so humans aiming to detect sudden situations may not be as alert as needed at any point in time. Consequently, computer vision tools can be an excellent decision support. The number of emergencies where computer vision tools have been considered or used is very wide, and there is a great overlap across related emergency research. Researchers tend to focus on state-of-the-art systems that cover the same emergency as they are studying, overlooking important research in other fields. In order to unveil this overlap, the survey is divided along four main axes: the types of emergencies that have been studied in computer vision, the objective that the algorithms can address, the type of hardware needed and the algorithms used. Therefore, this review provides a broad overview of the progress of computer vision covering all sorts of emergencies. |
|
|
Notes |
LAMP; 600.068; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ LWG2018 |
Serial |
3041 |
|
|
|
|
|
Author |
Francesco Ciompi; Oriol Pujol; Petia Radeva |
|
|
Title |
ECOC-DRF: Discriminative random fields based on error correcting output codes |
Type |
Journal Article |
|
Year |
2014 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
47 |
Issue |
6 |
Pages |
2193-2204 |
|
|
Keywords |
Discriminative random fields; Error-correcting output codes; Multi-class classification; Graphical models |
|
|
Abstract |
We present ECOC-DRF, a framework where potential functions for Discriminative Random Fields are formulated as an ensemble of classifiers. We introduce the label trick, a technique to express transitions in the pairwise potential as meta-classes. This allows any possible transition between labels to be learned independently, without assuming any pre-defined model. The Error Correcting Output Codes matrix is used as the ensemble framework for the combination of margin classifiers. We apply ECOC-DRF to a large set of classification problems, covering synthetic, natural and medical images for binary and multi-class cases, outperforming the state of the art in almost all the experiments. |
|
|
Notes |
LAMP; HuPBA; MILAB; 605.203; 600.046; 601.043; 600.079 |
Approved |
no |
|
|
Call Number |
Admin @ si @ CPR2014b |
Serial |
2470 |
|
|
|
|
|
Author |
Marçal Rusiñol; Volkmar Frinken; Dimosthenis Karatzas; Andrew Bagdanov; Josep Llados |
|
|
Title |
Multimodal page classification in administrative document image streams |
Type |
Journal Article |
|
Year |
2014 |
Publication |
International Journal on Document Analysis and Recognition |
Abbreviated Journal |
IJDAR |
|
|
Volume |
17 |
Issue |
4 |
Pages |
331-341 |
|
|
Keywords |
Digital mail room; Multimodal page classification; Visual and textual document description |
|
|
Abstract |
In this paper, we present a page classification application in a banking workflow. The proposed architecture represents administrative document images by merging visual and textual descriptions. The visual description is based on a hierarchical representation of the pixel intensity distribution. The textual description uses latent semantic analysis to represent document content as a mixture of topics. Several off-the-shelf classifiers and different strategies for combining visual and textual cues have been evaluated. A final step uses an n-gram model of the page stream allowing a finer-grained classification of pages. The proposed method has been tested in a real large-scale environment and we report results on a dataset of 70,000 pages. |
|
|
Publisher |
Springer Berlin Heidelberg |
ISSN |
1433-2833 |
Notes |
DAG; LAMP; 600.056; 600.061; 601.240; 601.223; 600.077; 600.079 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RFK2014 |
Serial |
2523 |
|
|
|
|
|
Author |
Svebor Karaman; Andrew Bagdanov; Lea Landucci; Gianpaolo D'Amico; Andrea Ferracani; Daniele Pezzatini; Alberto del Bimbo |
|
|
Title |
Personalized multimedia content delivery on an interactive table by passive observation of museum visitors |
Type |
Journal Article |
|
Year |
2016 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
|
|
Volume |
75 |
Issue |
7 |
Pages |
3787-3811 |
|
|
Keywords |
Computer vision; Video surveillance; Cultural heritage; Multimedia museum; Personalization; Natural interaction; Passive profiling |
|
|
Abstract |
The amount of multimedia data collected in museum databases is growing fast, while the capacity of museums to display information to visitors is acutely limited by physical space. Museums must seek the perfect balance of information given on individual pieces in order to provide sufficient information to aid visitor understanding while maintaining sparse usage of the walls and guaranteeing high appreciation of the exhibit. Moreover, museums often target the interests of average visitors instead of the entire spectrum of different interests each individual visitor might have. Finally, visiting a museum should not be an experience contained in the physical space of the museum but a door opened onto a broader context of related artworks, authors, artistic trends, etc. In this paper we describe the MNEMOSYNE system that attempts to address these issues through a new multimedia museum experience. Based on passive observation, the system builds a profile of the artworks of interest for each visitor. These profiles of interest are then used to drive an interactive table that personalizes multimedia content delivery. The natural user interface on the interactive table uses the visitor’s profile, an ontology of museum content and a recommendation system to personalize exploration of multimedia content. At the end of their visit, the visitor can take home a personalized summary of their visit on a custom mobile application. In this article we describe in detail each component of our approach as well as the first field trials of our prototype system built and deployed at our permanent exhibition space at LeMurate (http://www.lemurate.comune.fi.it/lemurate/) in Florence together with the first results of the evaluation process during the official installation in the National Museum of Bargello (http://www.uffizi.firenze.it/musei/?m=bargello). |
|
|
Publisher |
Springer US |
ISSN |
1380-7501 |
Notes |
LAMP; 601.240; 600.079 |
Approved |
no |
|
|
Call Number |
Admin @ si @ KBL2016 |
Serial |
2520 |
|