Home | [161–170] << 171 172 173 174 175 176 177 178 179 180 >> [181–190] |
Records | |||||
---|---|---|---|---|---|
Author | Razieh Rastgoo; Kourosh Kiani; Sergio Escalera; Vassilis Athitsos; Mohammad Sabokrou | ||||
Title | All You Need In Sign Language Production | Type | Miscellaneous | ||
Year | 2022 | Publication | Arxiv | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Sign Language Production; Sign Language Recog- nition; Sign Language Translation; Deep Learning; Survey; Deaf | ||||
Abstract | Sign Language is the dominant form of communication language used in the deaf and hearing-impaired community. To make an easy and mutual communication between the hearing-impaired and the hearing communities, building a robust system capable of translating the spoken language into sign language and vice versa is fundamental.
To this end, sign language recognition and production are two necessary parts for making such a two-way system. Signlanguage recognition and production need to cope with some critical challenges. In this survey, we review recent advances in Sign Language Production (SLP) and related areas using deep learning. To have more realistic perspectives to sign language, we present an introduction to the Deaf culture, Deaf centers, psychological perspective of sign language, the main differences between spoken language and sign language. Furthermore, we present the fundamental components of a bi-directional sign language translation system, discussing the main challenges in this area. Also, the backbone architectures and methods in SLP are briefly introduced and the proposed taxonomy on SLP is presented. Finally, a general framework for SLP and performance evaluation, and also a discussion on the recent developments, advantages, and limitations in SLP, commenting on possible lines for future research are presented. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA; no menciona | Approved | no | ||
Call Number | Admin @ si @ RKE2022c | Serial | 3698 | ||
Permanent link to this record | |||||
Author | Guillermo Torres; Sonia Baeza; Carles Sanchez; Ignasi Guasch; Antoni Rosell; Debora Gil | ||||
Title | An Intelligent Radiomic Approach for Lung Cancer Screening | Type | Journal Article | ||
Year | 2022 | Publication | Applied Sciences | Abbreviated Journal | APPLSCI |
Volume | 12 | Issue | 3 | Pages | 1568 |
Keywords | Lung cancer; Early diagnosis; Screening; Neural networks; Image embedding; Architecture optimization | ||||
Abstract | The efficiency of lung cancer screening for reducing mortality is hindered by the high rate of false positives. Artificial intelligence applied to radiomics could help to early discard benign cases from the analysis of CT scans. The available amount of data and the fact that benign cases are a minority, constitutes a main challenge for the successful use of state of the art methods (like deep learning), which can be biased, over-fitted and lack of clinical reproducibility. We present an hybrid approach combining the potential of radiomic features to characterize nodules in CT scans and the generalization of the feed forward networks. In order to obtain maximal reproducibility with minimal training data, we propose an embedding of nodules based on the statistical significance of radiomic features for malignancy detection. This representation space of lesions is the input to a feed
forward network, which architecture and hyperparameters are optimized using own-defined metrics of the diagnostic power of the whole system. Results of the best model on an independent set of patients achieve 100% of sensitivity and 83% of specificity (AUC = 0.94) for malignancy detection. |
||||
Address | Jan 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; 600.139; 600.145 | Approved | no | ||
Call Number | Admin @ si @ TBS2022 | Serial | 3699 | ||
Permanent link to this record | |||||
Author | Bojana Gajic; Ariel Amato; Ramon Baldrich; Joost Van de Weijer; Carlo Gatta | ||||
Title | Area Under the ROC Curve Maximization for Metric Learning | Type | Conference Article | ||
Year | 2022 | Publication | CVPR 2022 Workshop on Efficien Deep Learning for Computer Vision (ECV 2022, 5th Edition) | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Training; Computer vision; Conferences; Area measurement; Benchmark testing; Pattern recognition | ||||
Abstract | Most popular metric learning losses have no direct relation with the evaluation metrics that are subsequently applied to evaluate their performance. We hypothesize that training a metric learning model by maximizing the area under the ROC curve (which is a typical performance measure of recognition systems) can induce an implicit ranking suitable for retrieval problems. This hypothesis is supported by previous work that proved that a curve dominates in ROC space if and only if it dominates in Precision-Recall space. To test this hypothesis, we design and maximize an approximated, derivable relaxation of the area under the ROC curve. The proposed AUC loss achieves state-of-the-art results on two large scale retrieval benchmark datasets (Stanford Online Products and DeepFashion In-Shop). Moreover, the AUC loss achieves comparable performance to more complex, domain specific, state-of-the-art methods for vehicle re-identification. | ||||
Address | New Orleans, USA; 20 June 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | CIC; LAMP; | Approved | no | ||
Call Number | Admin @ si @ GAB2022 | Serial | 3700 | ||
Permanent link to this record | |||||
Author | Y. Mori; M.Misawa; Jorge Bernal; M. Bretthauer; S.Kudo; A. Rastogi; Gloria Fernandez Esparrach | ||||
Title | Artificial Intelligence for Disease Diagnosis-the Gold Standard Challenge | Type | Journal Article | ||
Year | 2022 | Publication | Gastrointestinal Endoscopy | Abbreviated Journal | |
Volume | 96 | Issue | 2 | Pages | 370-372 |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ MMB2022 | Serial | 3701 | ||
Permanent link to this record | |||||
Author | Javad Zolfaghari Bengar; Joost Van de Weijer; Laura Lopez-Fuentes; Bogdan Raducanu | ||||
Title | Class-Balanced Active Learning for Image Classification | Type | Conference Article | ||
Year | 2022 | Publication | Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Active learning aims to reduce the labeling effort that is required to train algorithms by learning an acquisition function selecting the most relevant data for which a label should be requested from a large unlabeled data pool. Active learning is generally studied on balanced datasets where an equal amount of images per class is available. However, real-world datasets suffer from severe imbalanced classes, the so called long-tail distribution. We argue that this further complicates the active learning process, since the imbalanced data pool can result in suboptimal classifiers. To address this problem in the context of active learning, we proposed a general optimization framework that explicitly takes class-balancing into account. Results on three datasets showed that the method is general (it can be combined with most existing active learning algorithms) and can be effectively applied to boost the performance of both informative and representative-based active learning methods. In addition, we showed that also on balanced datasets
our method 1 generally results in a performance gain. |
||||
Address | Virtual; Waikoloa; Hawai; USA; January 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | LAMP; 602.200; 600.147; 600.120 | Approved | no | ||
Call Number | Admin @ si @ ZWL2022 | Serial | 3703 | ||
Permanent link to this record | |||||
Author | Alex Gomez-Villa; Bartlomiej Twardowski; Lu Yu; Andrew Bagdanov; Joost Van de Weijer | ||||
Title | Continually Learning Self-Supervised Representations With Projected Functional Regularization | Type | Conference Article | ||
Year | 2022 | Publication | CVPR 2022 Workshop on Continual Learning (CLVision, 3rd Edition) | Abbreviated Journal | |
Volume | Issue | Pages | 3866-3876 | ||
Keywords | Computer vision; Conferences; Self-supervised learning; Image representation; Pattern recognition | ||||
Abstract | Recent self-supervised learning methods are able to learn high-quality image representations and are closing the gap with supervised approaches. However, these methods are unable to acquire new knowledge incrementally – they are, in fact, mostly used only as a pre-training phase over IID data. In this work we investigate self-supervised methods in continual learning regimes without any replay
mechanism. We show that naive functional regularization,also known as feature distillation, leads to lower plasticity and limits continual learning performance. Instead, we propose Projected Functional Regularization in which a separate temporal projection network ensures that the newly learned feature space preserves information of the previous one, while at the same time allowing for the learning of new features. This prevents forgetting while maintaining the plasticity of the learner. Comparison with other incremental learning approaches applied to self-supervision demonstrates that our method obtains competitive performance in different scenarios and on multiple datasets. |
||||
Address | New Orleans, USA; 20 June 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | LAMP: 600.147; 600.120 | Approved | no | ||
Call Number | Admin @ si @ GTY2022 | Serial | 3704 | ||
Permanent link to this record | |||||
Author | Jose Luis Gomez; Gabriel Villalonga; Antonio Lopez | ||||
Title | Co-Training for Unsupervised Domain Adaptation of Semantic Segmentation Models | Type | Journal Article | ||
Year | 2023 | Publication | Sensors – Special Issue on “Machine Learning for Autonomous Driving Perception and Prediction” | Abbreviated Journal | SENS |
Volume | 23 | Issue | 2 | Pages | 621 |
Keywords | Domain adaptation; semi-supervised learning; Semantic segmentation; Autonomous driving | ||||
Abstract | Semantic image segmentation is a central and challenging task in autonomous driving, addressed by training deep models. Since this training draws to a curse of human-based image labeling, using synthetic images with automatically generated labels together with unlabeled real-world images is a promising alternative. This implies to address an unsupervised domain adaptation (UDA) problem. In this paper, we propose a new co-training procedure for synth-to-real UDA of semantic
segmentation models. It consists of a self-training stage, which provides two domain-adapted models, and a model collaboration loop for the mutual improvement of these two models. These models are then used to provide the final semantic segmentation labels (pseudo-labels) for the real-world images. The overall procedure treats the deep models as black boxes and drives their collaboration at the level of pseudo-labeled target images, i.e., neither modifying loss functions is required, nor explicit feature alignment. We test our proposal on standard synthetic and real-world datasets for on-board semantic segmentation. Our procedure shows improvements ranging from ∼13 to ∼26 mIoU points over baselines, so establishing new state-of-the-art results. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS; no proj | Approved | no | ||
Call Number | Admin @ si @ GVL2023 | Serial | 3705 | ||
Permanent link to this record | |||||
Author | Reuben Dorent; Aaron Kujawa; Marina Ivory; Spyridon Bakas; Nikola Rieke; Samuel Joutard; Ben Glocker; Jorge Cardoso; Marc Modat; Kayhan Batmanghelich; Arseniy Belkov; Maria Baldeon Calisto; Jae Won Choi; Benoit M. Dawant; Hexin Dong; Sergio Escalera; Yubo Fan; Lasse Hansen; Mattias P. Heinrich; Smriti Joshi; Victoriya Kashtanova; Hyeon Gyu Kim; Satoshi Kondo; Christian N. Kruse; Susana K. Lai-Yuen; Hao Li; Han Liu; Buntheng Ly; Ipek Oguz; Hyungseob Shin; Boris Shirokikh; Zixian Su; Guotai Wang; Jianghao Wu; Yanwu Xu; Kai Yao; Li Zhang; Sebastien Ourselin, | ||||
Title | CrossMoDA 2021 challenge: Benchmark of Cross-Modality Domain Adaptation techniques for Vestibular Schwannoma and Cochlea Segmentation | Type | Journal Article | ||
Year | 2023 | Publication | Medical Image Analysis | Abbreviated Journal | MIA |
Volume | 83 | Issue | Pages | 102628 | |
Keywords | Domain Adaptation; Segmen tation; Vestibular Schwnannoma | ||||
Abstract | Domain Adaptation (DA) has recently raised strong interests in the medical imaging community. While a large variety of DA techniques has been proposed for image segmentation, most of these techniques have been validated either on private datasets or on small publicly available datasets. Moreover, these datasets mostly addressed single-class problems. To tackle these limitations, the Cross-Modality Domain Adaptation (crossMoDA) challenge was organised in conjunction with the 24th International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI 2021). CrossMoDA is the first large and multi-class benchmark for unsupervised cross-modality DA. The challenge's goal is to segment two key brain structures involved in the follow-up and treatment planning of vestibular schwannoma (VS): the VS and the cochleas. Currently, the diagnosis and surveillance in patients with VS are performed using contrast-enhanced T1 (ceT1) MRI. However, there is growing interest in using non-contrast sequences such as high-resolution T2 (hrT2) MRI. Therefore, we created an unsupervised cross-modality segmentation benchmark. The training set provides annotated ceT1 (N=105) and unpaired non-annotated hrT2 (N=105). The aim was to automatically perform unilateral VS and bilateral cochlea segmentation on hrT2 as provided in the testing set (N=137). A total of 16 teams submitted their algorithm for the evaluation phase. The level of performance reached by the top-performing teams is strikingly high (best median Dice – VS:88.4%; Cochleas:85.7%) and close to full supervision (median Dice – VS:92.5%; Cochleas:87.7%). All top-performing methods made use of an image-to-image translation approach to transform the source-domain images into pseudo-target-domain images. A segmentation network was then trained using these generated images and the manual annotations provided for the source image. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ DKI2023 | Serial | 3706 | ||
Permanent link to this record | |||||
Author | Julio C. S. Jacques Junior; Yagmur Gucluturk; Marc Perez; Umut Guçlu; Carlos Andujar; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon; Marcel A. J. van Gerven; Rob van Lier; Sergio Escalera | ||||
Title | First Impressions: A Survey on Vision-Based Apparent Personality Trait Analysis | Type | Journal Article | ||
Year | 2022 | Publication | IEEE Transactions on Affective Computing | Abbreviated Journal | TAC |
Volume | 13 | Issue | 1 | Pages | 75-95 |
Keywords | Personality computing; first impressions; person perception; big-five; subjective bias; computer vision; machine learning; nonverbal signals; facial expression; gesture; speech analysis; multi-modal recognition | ||||
Abstract | Personality analysis has been widely studied in psychology, neuropsychology, and signal processing fields, among others. From the past few years, it also became an attractive research area in visual computing. From the computational point of view, by far speech and text have been the most considered cues of information for analyzing personality. However, recently there has been an increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use these information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and of the potential impact that this sort of methods could have in society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting edge works on the subject, discussing and comparing their distinctive features and limitations. Future venues of research in the field are identified and discussed. Furthermore, aspects on the subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push the research on the field are reviewed. | ||||
Address | 1 Jan.-March 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA | Approved | no | ||
Call Number | Admin @ si @ JGP2022 | Serial | 3724 | ||
Permanent link to this record | |||||
Author | Gemma Rotger; Francesc Moreno-Noguer; Felipe Lumbreras; Antonio Agudo | ||||
Title | Single view facial hair 3D reconstruction | Type | Conference Article | ||
Year | 2019 | Publication | 9th Iberian Conference on Pattern Recognition and Image Analysis | Abbreviated Journal | |
Volume | 11867 | Issue | Pages | 423-436 | |
Keywords | 3D Vision; Shape Reconstruction; Facial Hair Modeling | ||||
Abstract | n this work, we introduce a novel energy-based framework that addresses the challenging problem of 3D reconstruction of facial hair from a single RGB image. To this end, we identify hair pixels over the image via texture analysis and then determine individual hair fibers that are modeled by means of a parametric hair model based on 3D helixes. We propose to minimize an energy composed of several terms, in order to adapt the hair parameters that better fit the image detections. The final hairs respond to the resulting fibers after a post-processing step where we encourage further realism. The resulting approach generates realistic facial hair fibers from solely an RGB image without assuming any training data nor user interaction. We provide an experimental evaluation on real-world pictures where several facial hair styles and image conditions are observed, showing consistent results and establishing a comparison with respect to competing approaches. | ||||
Address | Madrid; July 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IbPRIA | ||
Notes | MSIAU; 600.086; 600.130; 600.122 | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3707 | ||
Permanent link to this record | |||||
Author | Gemma Rotger; Francesc Moreno-Noguer; Felipe Lumbreras; Antonio Agudo | ||||
Title | Detailed 3D face reconstruction from a single RGB image | Type | Journal | ||
Year | 2019 | Publication | Journal of WSCG | Abbreviated Journal | JWSCG |
Volume | 27 | Issue | 2 | Pages | 103-112 |
Keywords | 3D Wrinkle Reconstruction; Face Analysis, Optimization. | ||||
Abstract | This paper introduces a method to obtain a detailed 3D reconstruction of facial skin from a single RGB image.
To this end, we propose the exclusive use of an input image without requiring any information about the observed material nor training data to model the wrinkle properties. They are detected and characterized directly from the image via a simple and effective parametric model, determining several features such as location, orientation, width, and height. With these ingredients, we propose to minimize a photometric error to retrieve the final detailed 3D map, which is initialized by current techniques based on deep learning. In contrast with other approaches, we only require estimating a depth parameter, making our approach fast and intuitive. Extensive experimental evaluation is presented in a wide variety of synthetic and real images, including different skin properties and facial expressions. In all cases, our method outperforms the current approaches regarding 3D reconstruction accuracy, providing striking results for both large and fine wrinkles. |
||||
Address | 2019/11 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MSIAU; 600.086; 600.130; 600.122 | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3708 | ||
Permanent link to this record | |||||
Author | Bojana Gajic; Ramon Baldrich | ||||
Title | Cross-domain fashion image retrieval | Type | Conference Article | ||
Year | 2018 | Publication | CVPR 2018 Workshop on Women in Computer Vision (WiCV 2018, 4th Edition) | Abbreviated Journal | |
Volume | Issue | Pages | 19500-19502 | ||
Keywords | |||||
Abstract | Cross domain image retrieval is a challenging task that implies matching images from one domain to their pairs from another domain. In this paper we focus on fashion image retrieval, which involves matching an image of a fashion item taken by users, to the images of the same item taken in controlled condition, usually by professional photographer. When facing this problem, we have different products
in train and test time, and we use triplet loss to train the network. We stress the importance of proper training of simple architecture, as well as adapting general models to the specific task. |
||||
Address | Salt Lake City, USA; 22 June 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | CIC; 600.087 | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3709 | ||
Permanent link to this record | |||||
Author | Bojana Gajic; Eduard Vazquez; Ramon Baldrich | ||||
Title | Evaluation of Deep Image Descriptors for Texture Retrieval | Type | Conference Article | ||
Year | 2017 | Publication | Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2017) | Abbreviated Journal | |
Volume | Issue | Pages | 251-257 | ||
Keywords | Texture Representation; Texture Retrieval; Convolutional Neural Networks; Psychophysical Evaluation | ||||
Abstract | The increasing complexity learnt in the layers of a Convolutional Neural Network has proven to be of great help for the task of classification. The topic has received great attention in recently published literature.
Nonetheless, just a handful of works study low-level representations, commonly associated with lower layers. In this paper, we explore recent findings which conclude, counterintuitively, the last layer of the VGG convolutional network is the best to describe a low-level property such as texture. To shed some light on this issue, we are proposing a psychophysical experiment to evaluate the adequacy of different layers of the VGG network for texture retrieval. Results obtained suggest that, whereas the last convolutional layer is a good choice for a specific task of classification, it might not be the best choice as a texture descriptor, showing a very poor performance on texture retrieval. Intermediate layers show the best performance, showing a good combination of basic filters, as in the primary visual cortex, and also a degree of higher level information to describe more complex textures. |
||||
Address | Porto, Portugal; 27 February – 1 March 2017 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | VISIGRAPP | ||
Notes | CIC; 600.087 | Approved | no | ||
Call Number | Admin @ si @ | Serial | 3710 | ||
Permanent link to this record | |||||
Author | Jon Almazan; Bojana Gajic; Naila Murray; Diane Larlus | ||||
Title | Re-ID done right: towards good practices for person re-identification | Type | Miscellaneous | ||
Year | 2018 | Publication | Arxiv | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Training a deep architecture using a ranking loss has become standard for the person re-identification task. Increasingly, these deep architectures include additional components that leverage part detections, attribute predictions, pose estimators and other auxiliary information, in order to more effectively localize and align discriminative image regions. In this paper we adopt a different approach and carefully design each component of a simple deep architecture and, critically, the strategy for training it effectively for person re-identification. We extensively evaluate each design choice, leading to a list of good practices for person re-identification. By following these practices, our approach outperforms the state of the art, including more complex methods with auxiliary components, by large margins on four benchmark datasets. We also provide a qualitative analysis of our trained representation which indicates that, while compact, it is able to capture information from localized and discriminative regions, in a manner akin to an implicit attention mechanism. | ||||
Address | January 2018 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | Approved | no | |||
Call Number | Admin @ si @ | Serial | 3711 | ||
Permanent link to this record | |||||
Author | Parichehr Behjati Ardakani | ||||
Title | Towards Efficient and Robust Convolutional Neural Networks for Single Image Super-Resolution | Type | Book Whole | ||
Year | 2022 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Single image super-resolution (SISR) is an important task in image processing which aims to enhance the resolution of imaging systems. Recently, SISR has witnessed great strides with the rapid development of deep learning. Recent advances in SISR are mostly devoted to designing deeper and wider networks to enhance their representation learning capacity. However, as the depth of networks increases, deep learning-based methods are faced with the challenge of computational complexity in practice. Moreover, most existing methods rarely leverage the intermediate features and also do not discriminate the computation of features by their frequencial components, thereby achieving relatively low performance. Aside from the aforementioned problems, another desired ability is to upsample images to arbitrary scales using a single model. Most current SISR methods train a dedicated model for each target resolution, losing generality and increasing memory requirements. In this thesis, we address the aforementioned issues and propose solutions to them: i) We present a novel frequency-based enhancement block which treats different frequencies in a heterogeneous way and also models inter-channel dependencies, which consequently enrich the output feature. Thus it helps the network generate more discriminative representations by explicitly recovering finer details. ii) We introduce OverNet which contains two main parts: a lightweight feature extractor that follows a novel recursive framework of skip and dense connections to reduce low-level feature degradation, and an overscaling module that generates an accurate SR image by internally constructing an overscaled intermediate representation of the output features. Then, to solve the problem of reconstruction at arbitrary scale factors, we introduce a novel multi-scale loss, that allows the simultaneous training of all scale factors using a single model. iii) We propose a directional variance attention network which leverages a novel attention mechanism to enhance features in different channels and spatial regions. Moreover, we introduce a novel procedure for using attention mechanisms together with residual blocks to facilitate the preservation of finer details. Finally, we demonstrate that our approaches achieve considerably better performance than previous state-of-the-art methods, in terms of both quantitative and visual quality. | ||||
Address | April, 2022 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Place of Publication | Editor | Jordi Gonzalez;Xavier Roca;Pau Rodriguez | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-124793-1-7 | Medium | ||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ Beh2022 | Serial | 3713 | ||
Permanent link to this record |