Records | |||||
---|---|---|---|---|---|
Author | Mohamed Ali Souibgui; Sanket Biswas; Sana Khamekhem Jemni; Yousri Kessentini; Alicia Fornes; Josep Llados; Umapada Pal | ||||
Title | DocEnTr: An End-to-End Document Image Enhancement Transformer | Type | Conference Article | ||
Year | 2022 | Publication | 26th International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 1699-1705 | ||
Keywords | Degradation; Head; Optical character recognition; Self-supervised learning; Benchmark testing; Transformers; Magnetic heads | ||||
Abstract | Document images can be affected by many degradation scenarios, which cause recognition and processing difficulties. In this age of digitization, it is important to denoise them for proper usage. To address this challenge, we present a new encoder-decoder architecture based on vision transformers to enhance both machine-printed and handwritten document images, in an end-to-end fashion. The encoder operates directly on the pixel patches with their positional information without the use of any convolutional layers, while the decoder reconstructs a clean image from the encoded patches. Conducted experiments show a superiority of the proposed model compared to the state-of-the-art methods on several DIBCO benchmarks. Code and models will be publicly available at: https://github.com/dali92002/DocEnTR | ||||
Address | August 21-25, 2022, Montréal, Québec | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICPR | ||
Notes | DAG; 600.121; 600.162; 602.230; 600.140 | Approved | no | ||
Call Number | Admin @ si @ SBJ2022 | Serial | 3730 | ||
Permanent link to this record | |||||
Author | J.M. Sanchez; X. Binefa; J.R. Kender | ||||
Title | Coupled Markov Chains for Video Contents Characterization. | Type | Miscellaneous | ||
Year | 2002 | Publication | Proceedings of the International Conference on Pattern Recognition ICPR 2002 | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | Approved | no | |||
Call Number | Admin @ si @ SBK2002a | Serial | 298 | ||
Permanent link to this record | |||||
Author | J.M. Sanchez; X. Binefa; J.R. Kender | ||||
Title | Multiple Feature Temporal Models for Object Detection in Video. | Type | Miscellaneous | ||
Year | 2002 | Publication | Proceedings of the International Conference on Multimedia and Expo ICME 2002 | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Lausanne | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | Approved | no | |||
Call Number | Admin @ si @ SBK2002b | Serial | 299 | ||
Permanent link to this record | |||||
Author | Mohamed Ali Souibgui; Sanket Biswas; Andres Mafla; Ali Furkan Biten; Alicia Fornes; Yousri Kessentini; Josep Llados; Lluis Gomez; Dimosthenis Karatzas | ||||
Title | Text-DIAE: a self-supervised degradation invariant autoencoder for text recognition and document enhancement | Type | Conference Article | ||
Year | 2023 | Publication | Proceedings of the 37th AAAI Conference on Artificial Intelligence | Abbreviated Journal | |
Volume | 37 | Issue | 2 | Pages | |
Keywords | Representation Learning for Vision; CV Applications; CV Language and Vision; ML Unsupervised; Self-Supervised Learning | ||||
Abstract | In this paper, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE), a self-supervised model designed to tackle two tasks, text recognition (handwritten or scene-text) and document image enhancement. We start by employing a transformer-based architecture that incorporates three pretext tasks as learning objectives to be optimized during pre-training without the usage of labelled data. Each of the pretext objectives is specifically tailored for the final downstream tasks. We conduct several ablation experiments that confirm the design choice of the selected pretext tasks. Importantly, the proposed model does not exhibit limitations of previous state-of-the-art methods based on contrastive losses, while at the same time requiring substantially fewer data samples to converge. Finally, we demonstrate that our method surpasses the state-of-the-art in existing supervised and self-supervised settings in handwritten and scene text recognition and document image enhancement. Our code and trained models will be made publicly available at https://github.com/dali92002/SSL-OCR | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | AAAI | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ SBM2023 | Serial | 3848 | ||
Permanent link to this record | |||||
Author | Md.Mostafa Kamal Sarker; Syeda Furruka Banu; Hatem A. Rashwan; Mohamed Abdel-Nasser; Vivek Kumar Singh; Sylvie Chambon; Petia Radeva; Domenec Puig | ||||
Title | Food Places Classification in Egocentric Images Using Siamese Neural Networks | Type | Conference Article | ||
Year | 2019 | Publication | 22nd International Conference of the Catalan Association of Artificial Intelligence | Abbreviated Journal | |
Volume | Issue | Pages | 145-151 | ||
Keywords | |||||
Abstract | Wearable cameras have become more popular in recent years for capturing unscripted first-person moments that help to analyze the user's lifestyle. In this work, we aim to recognize the places related to food in egocentric images during a day to identify the daily food patterns of the first person. Thus, this system can help users improve their eating behavior and protect them against food-related diseases. In this paper, we use Siamese Neural Networks to learn the similarity between images from corresponding inputs for one-shot food places classification. We tested our proposed method on “MiniEgoFoodPlaces” with 15 food-related places. The proposed Siamese Neural Networks model with MobileNet achieved an overall classification accuracy of 76.74% and 77.53% on the validation and test sets of the “MiniEgoFoodPlaces” dataset, respectively, outperforming base models such as ResNet50, InceptionV3, and InceptionResNetV2. | ||||
Address | Illes Balears; October 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CCIA | ||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ SBR2019 | Serial | 3368 | ||
Permanent link to this record | |||||
Author | C. Sbert; A.F. Sole | ||||
Title | Stereo reconstruction of 3D curves. | Type | Miscellaneous | ||
Year | 2000 | Publication | 15th International Conference on Pattern Recognition, 1:912–915. | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | Barcelona. | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | Approved | no | |||
Call Number | Admin @ si @ SbS2000 | Serial | 219 | ||
Permanent link to this record | |||||
Author | Carles Sanchez; Jorge Bernal; F. Javier Sanchez; Antoni Rosell; Marta Diez-Ferrer; Debora Gil | ||||
Title | Towards On-line Quantification of Tracheal Stenosis from Videobronchoscopy | Type | Journal Article | ||
Year | 2015 | Publication | International Journal of Computer Assisted Radiology and Surgery | Abbreviated Journal | IJCAR |
Volume | 10 | Issue | 6 | Pages | 935-945 |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM; MV; 600.075 | Approved | no | ||
Call Number | Admin @ si @ SBS2015a | Serial | 2611 | ||
Permanent link to this record | |||||
Author | Carles Sanchez; Jorge Bernal; F. Javier Sanchez; Marta Diez-Ferrer; Antoni Rosell; Debora Gil | ||||
Title | Towards On-line Quantification of Tracheal Stenosis from Videobronchoscopy | Type | Conference Article | ||
Year | 2015 | Publication | 6th International Conference on Information Processing in Computer-Assisted Interventions IPCAI2015 | Abbreviated Journal | |
Volume | 10 | Issue | 6 | Pages | 935-945 |
Keywords | |||||
Abstract | PURPOSE: Lack of objective measurement of tracheal obstruction degree has a negative impact on the chosen treatment prone to lead to unnecessary repeated explorations and other scanners. Accurate computation of tracheal stenosis in videobronchoscopy would constitute a breakthrough for this noninvasive technique and a reduction in operation cost for the public health service. METHODS: Stenosis calculation is based on the comparison of the region delimited by the lumen in an obstructed frame and the region delimited by the first visible ring in a healthy frame. We propose a parametric strategy for the extraction of lumen and tracheal ring regions based on models of their geometry and appearance that guide a deformable model. To ensure a systematic applicability, we present a statistical framework to choose optimal parametric values and a strategy to choose the frames that minimize the impact of scope optical distortion. RESULTS: Our method has been tested in 40 cases covering different stenosed tracheas. Experiments report a non-clinically relevant [Formula: see text] of discrepancy in the calculated stenotic area and a computational time allowing online implementation in the operating room. CONCLUSIONS: Our methodology allows reliable measurements of airway narrowing in the operating room. To fully assess its clinical impact, a prospective clinical trial should be done. | ||||
Address | Barcelona; Spain; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | IPCAI | ||
Notes | IAM; MV; 600.075 | Approved | no | ||
Call Number | Admin @ si @ SBS2015b | Serial | 2613 | ||
Permanent link to this record | |||||
Author | F. Javier Sanchez; Jorge Bernal; Cristina Sanchez Montes; Cristina Rodriguez de Miguel; Gloria Fernandez Esparrach | ||||
Title | Bright spot regions segmentation and classification for specular highlights detection in colonoscopy videos | Type | Journal Article | ||
Year | 2017 | Publication | Machine Vision and Applications | Abbreviated Journal | MVAP |
Volume | Issue | Pages | 1-20 | ||
Keywords | Specular highlights; bright spot regions segmentation; region classification; colonoscopy | ||||
Abstract | A novel specular highlights detection method in colonoscopy videos is presented. The method is based on a model of appearance defining specular highlights as bright spots which are highly contrasted with respect to adjacent regions. Our approach proposes two stages: segmentation, and then classification of bright spot regions. The former defines a set of candidate regions obtained through a region growing process with local maxima as initial region seeds. This process creates a tree structure which keeps track, at each growing iteration, of the region frontier contrast; final regions provided depend on restrictions over contrast value. Non-specular regions are filtered through a classification stage performed by a linear SVM classifier using model-based features from each region. We introduce a new validation database with more than 25,000 regions along with their corresponding pixel-wise annotations. We perform a comparative study against other approaches. Results show that our method is superior to other approaches, with our segmented regions being closer to actual specular regions in the image. Finally, we also present how our methodology can also be used to obtain an accurate prediction of polyp histology. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MV; 600.096; 600.175 | Approved | no | ||
Call Number | Admin @ si @ SBS2017 | Serial | 2975 | ||
Permanent link to this record | |||||
Author | Bonifaz Stuhr; Jurgen Brauer; Bernhard Schick; Jordi Gonzalez | ||||
Title | Masked Discriminators for Content-Consistent Unpaired Image-to-Image Translation | Type | Miscellaneous | ||
Year | 2023 | Publication | Arxiv | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | A common goal of unpaired image-to-image translation is to preserve content consistency between source images and translated images while mimicking the style of the target domain. Due to biases between the datasets of both domains, many methods suffer from inconsistencies caused by the translation process. Most approaches introduced to mitigate these inconsistencies do not constrain the discriminator, leading to an even more ill-posed training setup. Moreover, none of these approaches is designed for larger crop sizes. In this work, we show that masking the inputs of a global discriminator for both domains with a content-based mask is sufficient to reduce content inconsistencies significantly. However, this strategy leads to artifacts that can be traced back to the masking process. To reduce these artifacts, we introduce a local discriminator that operates on pairs of small crops selected with a similarity sampling strategy. Furthermore, we apply this sampling strategy to sample global input crops from the source and target dataset. In addition, we propose feature-attentive denormalization to selectively incorporate content-based statistics into the generator stream. In our experiments, we show that our method achieves state-of-the-art performance in photorealistic sim-to-real translation and weather translation and also performs well in day-to-night translation. Additionally, we propose the cKVD metric, which builds on the sKVD metric and enables the examination of translation quality at the class or category level. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ SBS2023 | Serial | 3863 | ||
Permanent link to this record | |||||
Author | Hassan Ahmed Sial; Ramon Baldrich; Maria Vanrell | ||||
Title | Deep intrinsic decomposition trained on surreal scenes yet with realistic light effects | Type | Journal Article | ||
Year | 2020 | Publication | Journal of the Optical Society of America A | Abbreviated Journal | JOSA A |
Volume | 37 | Issue | 1 | Pages | 1-15 |
Keywords | |||||
Abstract | Estimation of intrinsic images still remains a challenging task due to weaknesses of ground-truth datasets, which either are too small or present non-realistic issues. On the other hand, end-to-end deep learning architectures start to achieve interesting results that we believe could be improved if important physical hints were not ignored. In this work, we present a twofold framework: (a) a flexible generation of images overcoming some classical dataset problems such as larger size jointly with coherent lighting appearance; and (b) a flexible architecture tying physical properties through intrinsic losses. Our proposal is versatile, presents low computation time, and achieves state-of-the-art results. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | CIC; 600.140; 600.12; 600.118 | Approved | no | ||
Call Number | Admin @ si @ SBV2019 | Serial | 3311 | ||
Permanent link to this record | |||||
Author | Hassan Ahmed Sial; Ramon Baldrich; Maria Vanrell; Dimitris Samaras | ||||
Title | Light Direction and Color Estimation from Single Image with Deep Regression | Type | Conference Article | ||
Year | 2020 | Publication | London Imaging Conference | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | We present a method to estimate the direction and color of the scene light source from a single image. Our method is based on two main ideas: (a) we use a new synthetic dataset with strong shadow effects with similar constraints to the SID dataset; (b) we define a deep architecture trained on the mentioned dataset to estimate the direction and color of the scene light source. Apart from showing good performance on synthetic images, we additionally propose a preliminary procedure to obtain light positions of the Multi-Illumination dataset, and, in this way, we also prove that our trained model achieves good performance when it is applied to real scenes. | ||||
Address | Virtual; September 2020 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | LIM | ||
Notes | CIC; 600.118; 600.140; | Approved | no | ||
Call Number | Admin @ si @ SBV2020 | Serial | 3460 | ||
Permanent link to this record | |||||
Author | Ahmed M. A. Salih; Ilaria Boscolo Galazzo; Zahra Raisi-Estabragh; Steffen E. Petersen; Polyxeni Gkontra; Karim Lekadir; Gloria Menegaz; Petia Radeva | ||||
Title | A new scheme for the assessment of the robustness of Explainable Methods Applied to Brain Age estimation | Type | Conference Article | ||
Year | 2021 | Publication | 34th International Symposium on Computer-Based Medical Systems | Abbreviated Journal | |
Volume | Issue | Pages | 492-497 | ||
Keywords | |||||
Abstract | Deep learning methods show great promise in a range of settings including the biomedical field. Explainability of these models is important in these fields for building end-user trust and to facilitate their confident deployment. Although several Machine Learning Interpretability tools have been proposed so far, there is currently no recognized evaluation standard to transfer the explainability results into a quantitative score. Several measures have been proposed as proxies for quantitative assessment of explainability methods. However, the robustness of the list of significant features provided by the explainability methods has not been addressed. In this work, we propose a new proxy for assessing the robustness of the list of significant features provided by two explainability methods. Our validation is defined at functionality-grounded level based on the ranked correlation statistical index and demonstrates its successful application in the framework of brain aging estimation. We assessed our proxy to estimate brain age using neuroscience data. Our results indicate small variability and high robustness in the considered explainability methods using this new proxy. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CBMS | ||
Notes | MILAB; no proj | Approved | no | ||
Call Number | Admin @ si @ SBZ2021 | Serial | 3629 | ||
Permanent link to this record | |||||
Author | Albin Soutif; Antonio Carta; Andrea Cossu; Julio Hurtado; Hamed Hemati; Vincenzo Lomonaco; Joost Van de Weijer | ||||
Title | A Comprehensive Empirical Evaluation on Online Continual Learning | Type | Conference Article | ||
Year | 2023 | Publication | Visual Continual Learning (ICCV-W) | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Online continual learning aims to get closer to a live learning experience by learning directly on a stream of data with temporally shifting distribution and by storing a minimum amount of data from that stream. In this empirical evaluation, we evaluate various methods from the literature that tackle online continual learning. More specifically, we focus on the class-incremental setting in the context of image classification, where the learner must learn new classes incrementally from a stream of data. We compare these methods on the Split-CIFAR100 and Split-TinyImagenet benchmarks, and measure their average accuracy, forgetting, stability, and quality of the representations, to evaluate various aspects of the algorithm at the end but also during the whole training period. We find that most methods suffer from stability and underfitting issues. However, the learned representations are comparable to i.i.d. training under the same computational budget. No clear winner emerges from the results and basic experience replay, when properly tuned and implemented, is a very strong baseline. We release our modular and extensible codebase at this https URL based on the avalanche framework to reproduce our results and encourage future research. | ||||
Address | Paris; France; October 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCVW | ||
Notes | LAMP | Approved | no | ||
Call Number | Admin @ si @ SCC2023 | Serial | 3938 | ||
Permanent link to this record | |||||
Author | Marc Sunset Perez; Marc Comino Trinidad; Dimosthenis Karatzas; Antonio Chica Calaf; Pere Pau Vazquez Alcocer | ||||
Title | Development of general‐purpose projection‐based augmented reality systems | Type | Journal | ||
Year | 2016 | Publication | IADIS International Journal on Computer Science and Information Systems | Abbreviated Journal | IADIs |
Volume | 11 | Issue | 2 | Pages | 1-18 |
Keywords | |||||
Abstract | Despite the large number of methods and applications of augmented reality, there is little homogenization on the software platforms that support them. An exception may be the low level control software that is provided by some high profile vendors such as Qualcomm and Metaio. However, these provide fine grain modules for e.g. element tracking. We are more concerned with the application framework, which includes the control of the devices working together for the development of the AR experience. In this paper we describe the development of a software framework for AR setups. We concentrate on the modular design of the framework, but also on some hard problems such as the calibration stage, crucial for projection-based AR. The developed framework is suitable and has been tested in AR applications using camera-projector pairs, for both fixed and nomadic setups. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.084 | Approved | no | ||
Call Number | Admin @ si @ SCK2016 | Serial | 2890 | ||
Permanent link to this record |