Home | << 1 2 3 4 5 6 7 8 9 10 >> [11–14] |
Records | |||||
---|---|---|---|---|---|
Author | Lu Yu | ||||
Title | Semantic Representation: From Color to Deep Embeddings | Type | Book Whole | ||
Year | 2019 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | One of the fundamental problems of computer vision is to represent images with compact semantically relevant embeddings. These embeddings could then be used in a wide variety of applications, such as image retrieval, object detection, and video search. The main objective of this thesis is to study image embeddings from two aspects: color embeddings and deep embeddings.
In the first part of the thesis we start from hand-crafted color embeddings. We propose a method to order the additional color names according to their complementary nature with the basic eleven color names. This allows us to compute color name representations with high discriminative power of arbitrary length. Psychophysical experiments confirm that our proposed method outperforms baseline approaches. Secondly, we learn deep color embeddings from weakly labeled data by adding an attention strategy. The attention branch is able to correctly identify the relevant regions for each class. The advantage of our approach is that it can learn color names for specific domains for which no pixel-wise labels exists. In the second part of the thesis, we focus on deep embeddings. Firstly, we address the problem of compressing large embedding networks into small networks, while maintaining similar performance. We propose to distillate the metrics from a teacher network to a student network. Two new losses are introduced to model the communication of a deep teacher network to a small student network: one based on an absolute teacher, where the student aims to produce the same embeddings as the teacher, and one based on a relative teacher, where the distances between pairs of data points is communicated from the teacher to the student. In addition, various aspects of distillation have been investigated for embeddings, including hint and attention layers, semi-supervised learning and cross quality distillation. Finally, another aspect of deep metric learning, namely lifelong learning, is studied. We observed some drift occurs during training of new tasks for metric learning. A method to estimate the semantic drift based on the drift which is experienced by data of the current task during its training is introduced. Having this estimation, previous tasks can be compensated for this drift, thereby improving their performance. Furthermore, we show that embedding networks suffer significantly less from catastrophic forgetting compared to classification networks when learning new tasks. |
||||
Address | November 2019 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Joost Van de Weijer;Yongmei Cheng | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-121011-3-3 | Medium | ||
Area | Expedition | Conference | |||
Notes | LAMP; 600.120 | Approved | no | ||
Call Number | Admin @ si @ Yu2019 | Serial | 3394 | ||
Permanent link to this record | |||||
Author | Manuel Carbonell | ||||
Title | Neural Information Extraction from Semi-structured Documents A | Type | Book Whole | ||
Year | 2020 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Sectors as fintech, legaltech or insurance process an inflow of millions of forms, invoices, id documents, claims or similar every day. Together with these, historical archives provide gigantic amounts of digitized documents containing useful information that needs to be stored in machine encoded text with a meaningful structure. This procedure, known as information extraction (IE) comprises the steps of localizing and recognizing text, identifying named entities contained in it and optionally finding relationships among its elements. In this work we explore multi-task neural models at image and graph level to solve all steps in a unified way. While doing so we find benefits and limitations of these end-to-end approaches in comparison with sequential separate methods. More specifically, we first propose a method to produce textual as well as semantic labels with a unified model from handwritten text line images. We do so with the use of a convolutional recurrent neural model trained with connectionist temporal classification to predict the textual as well as semantic information encoded in the images. Secondly, motivated by the success of this approach we investigate the unification of the localization and recognition tasks of handwritten text in full pages with an end-to-end model, observing benefits in doing so. Having two models that tackle information extraction subsequent task pairs in an end-to-end to end manner, we lastly contribute with a method to put them all together in a single neural network to solve the whole information extraction pipeline in a unified way. Doing so we observe some benefits and some limitations in the approach, suggesting that in certain cases it is beneficial to train specialized models that excel at a single challenging task of the information extraction process, as it can be the recognition of named entities or the extraction of relationships between them. For this reason we lastly study the use of the recently arrived graph neural network architectures for the semantic tasks of the information extraction process, which are recognition of named entities and relation extraction, achieving promising results on the relation extraction part. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Alicia Fornes;Mauricio Villegas;Josep Llados | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-122714-1-6 | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ Car20 | Serial | 3483 | ||
Permanent link to this record | |||||
Author | Marçal Rusiñol | ||||
Title | Geometric and Structural-based Symbol Spotting. Application to Focused Retrieval in Graphic Document Collections | Type | Book Whole | ||
Year | 2009 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Usually, pattern recognition systems consist of two main parts. On the one hand, the data acquisition and, on the other hand, the classification of this data on a certain category. In order to recognize which category a certain query element belongs to, a set of pattern models must be provided beforehand. An off-line learning stage is needed to train the classifier and to offer a robust classification of the patterns. Within the pattern recognition field, we are interested in the recognition of graphics and, in particular, on the analysis of documents rich in graphical information. In this context, one of the main concerns is to see if the proposed systems remain scalable with respect to the data volume so as it can handle growing amounts of symbol models. In order to avoid to work with a database of reference symbols, symbol spotting and on-the-fly symbol recognition methods have been introduced in the past years.
Generally speaking, the symbol spotting problem can be defined as the identification of a set of regions of interest from a document image which are likely to contain an instance of a certain queriedn symbol without explicitly applying the whole pattern recognition scheme. Our application framework consists on indexing a collection of graphic-rich document images. This collection is queried by example with a single instance of the symbol to look for and, by means of symbol spotting methods we retrieve the regions of interest where the symbol is likely to appear within the documents. This kind of applications are known as focused retrieval methods. In order that the focused retrieval application can handle large collections of documents there is a need to provide an efficient access to the large volume of information that might be stored. We use indexing strategies in order to efficiently retrieve by similarity the locations where a certain part of the symbol appears. In that scenario, graphical patterns should be used as indices for accessing and navigating the collection of documents. These indexing mechanism allow the user to search for similar elements using graphical information rather than textual queries. Along this thesis we present a spotting architecture and different methods aiming to build a complete focused retrieval application dealing with a graphic-rich document collections. In addition, a protocol to evaluate the performance of symbol spotting systems in terms of recognition abilities, location accuracy and scalability is proposed. |
||||
Address | Barcelona (Spain) | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Josep Llados | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | DAG @ dag @ Rus2009 | Serial | 1264 | ||
Permanent link to this record | |||||
Author | Marçal Rusiñol; Josep Llados | ||||
Title | Symbol Spotting in Digital Libraries:Focused Retrieval over Graphic-rich Document Collections | Type | Book Whole | ||
Year | 2010 | Publication | Symbol Spotting in Digital Libraries:Focused Retrieval over Graphic-rich Document Collections | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Focused Retrieval , Graphical Pattern Indexation,Graphics Recognition ,Pattern Recognition , Performance Evaluation , Symbol Description ,Symbol Spotting | ||||
Abstract | The specific problem of symbol recognition in graphical documents requires additional techniques to those developed for character recognition. The most well-known obstacle is the so-called Sayre paradox: Correct recognition requires good segmentation, yet improvement in segmentation is achieved using information provided by the recognition process. This dilemma can be avoided by techniques that identify sets of regions containing useful information. Such symbol-spotting methods allow the detection of symbols in maps or technical drawings without having to fully segment or fully recognize the entire content.
This unique text/reference provides a complete, integrated and large-scale solution to the challenge of designing a robust symbol-spotting method for collections of graphic-rich documents. The book examines a number of features and descriptors, from basic photometric descriptors commonly used in computer vision techniques to those specific to graphical shapes, presenting a methodology which can be used in a wide variety of applications. Additionally, readers are supplied with an insight into the problem of performance evaluation of spotting methods. Some very basic knowledge of pattern recognition, document image analysis and graphics recognition is assumed. |
||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-84996-208-7 | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | DAG @ dag @ RuL2010a | Serial | 1292 | ||
Permanent link to this record | |||||
Author | Marc Masana | ||||
Title | Lifelong Learning of Neural Networks: Detecting Novelty and Adapting to New Domains without Forgetting | Type | Book Whole | ||
Year | 2020 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Computer vision has gone through considerable changes in the last decade as neural networks have come into common use. As available computational capabilities have grown, neural networks have achieved breakthroughs in many computer vision tasks, and have even surpassed human performance in others. With accuracy being so high, focus has shifted to other issues and challenges. One research direction that saw a notable increase in interest is on lifelong learning systems. Such systems should be capable of efficiently performing tasks, identifying and learning new ones, and should moreover be able to deploy smaller versions of themselves which are experts on specific tasks. In this thesis, we contribute to research on lifelong learning and address the compression and adaptation of networks to small target domains, the incremental learning of networks faced with a variety of tasks, and finally the detection of out-of-distribution samples at inference time.
We explore how knowledge can be transferred from large pretrained models to more task-specific networks capable of running on smaller devices by extracting the most relevant information. Using a pretrained model provides more robust representations and a more stable initialization when learning a smaller task, which leads to higher performance and is known as domain adaptation. However, those models are too large for certain applications that need to be deployed on devices with limited memory and computational capacity. In this thesis we show that, after performing domain adaptation, some learned activations barely contribute to the predictions of the model. Therefore, we propose to apply network compression based on low-rank matrix decomposition using the activation statistics. This results in a significant reduction of the model size and the computational cost. Like human intelligence, machine intelligence aims to have the ability to learn and remember knowledge. However, when a trained neural network is presented with learning a new task, it ends up forgetting previous ones. This is known as catastrophic forgetting and its avoidance is studied in continual learning. The work presented in this thesis extensively surveys continual learning techniques and presents an approach to avoid catastrophic forgetting in sequential task learning scenarios. Our technique is based on using ternary masks in order to update a network to new tasks, reusing the knowledge of previous ones while not forgetting anything about them. In contrast to earlier work, our masks are applied to the activations of each layer instead of the weights. This considerably reduces the number of parameters to be added for each new task. Furthermore, the analysis on a wide range of work on incremental learning without access to the task-ID, provides insight on current state-of-the-art approaches that focus on avoiding catastrophic forgetting by using regularization, rehearsal of previous tasks from a small memory, or compensating the task-recency bias. Neural networks trained with a cross-entropy loss force the outputs of the model to tend toward a one-hot encoded vector. This leads to models being too overly confident when presented with images or classes that were not present in the training distribution. The capacity of a system to be aware of the boundaries of the learned tasks and identify anomalies or classes which have not been learned yet is key to lifelong learning and autonomous systems. In this thesis, we present a metric learning approach to out-of-distribution detection that learns the task at hand on an embedding space. |
||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Joost Van de Weijer;Andrew Bagdanov | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-121011-9-5 | Medium | ||
Area | Expedition | Conference | |||
Notes | LAMP; 600.120 | Approved | no | ||
Call Number | Admin @ si @ Mas20 | Serial | 3481 | ||
Permanent link to this record | |||||
Author | Marc Serra | ||||
Title | Modeling, estimation and evaluation of intrinsic images considering color information | Type | Book Whole | ||
Year | 2015 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Image values are the result of a combination of visual information coming from multiple sources. Recovering information from the multiple factors thatproduced an image seems a hard and ill-posed problem. However, it is important to observe that humans develop the ability to interpret images and recognize and isolate specific physical properties of the scene.
Images describing a single physical characteristic of an scene are called intrinsic images. These images would benefit most computer vision tasks which are often affected by the multiple complex effects that are usually found in natural images (e.g. cast shadows, specularities, interreflections...). In this thesis we analyze the problem of intrinsic image estimation from different perspectives, including the theoretical formulation of the problem, the visual cues that can be used to estimate the intrinsic components and the evaluation mechanisms of the problem. |
||||
Address | September 2015 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Robert Benavente;Olivier Penacchio | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-943427-4-5 | Medium | ||
Area | Expedition | Conference | |||
Notes | CIC; 600.074 | Approved | no | ||
Call Number | Admin @ si @ Ser2015 | Serial | 2688 | ||
Permanent link to this record | |||||
Author | Marco Pedersoli | ||||
Title | Hierarchical Multiresolution Models for fast Object Detection | Type | Book Whole | ||
Year | 2012 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | The ability to automatically detect and recognize objects in unconstrained images is becoming more and more critical: from security systems and autonomous robots, to smart phones and augmented reality, intelligent devices need to understand the meaning of images as a composition of semantic objects. This Thesis tackles the problem of fast object detection based on template models. Detection consists of searching for an object in an image by evaluating the similarity between a template model and an image region at each possible location and scale. In this work, we show that using a template model representation based on a multiple resolution hierarchy is an optimal choice that can lead to excellent detection accuracy and fast computation. We implement two different approaches that make use of a hierarchy of multiresolution models: a multiresolution cascade and a coarse-to-fine search. Also, we extend the coarse-to-fine search by introducing a deformable part-based model that achieves state-of-the-art results together with a very reduced computational cost. Finally, we specialize our approach to the challenging task of pedestrian detection from moving vehicles and show that the overall quality of the system outperforms previous works in terms of speed and accuracy. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Jordi Gonzalez;Xavier Roca | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ Ped2012 | Serial | 2203 | ||
Permanent link to this record | |||||
Author | Marina Alberti | ||||
Title | Detection and Alignment of Vascular Structures in Intravascular Ultrasound using Pattern Recognition Techniques | Type | Book Whole | ||
Year | 2013 | Publication | PhD Thesis, Universitat de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | In this thesis, several methods for the automatic analysis of Intravascular Ultrasound
(IVUS) sequences are presented, aimed at assisting physicians in the diagnosis, the assessment of the intervention and the monitoring of the patients with coronary disease. The basis for the developed frameworks are machine learning, pattern recognition and image processing techniques. First, a novel approach for the automatic detection of vascular bifurcations in IVUS is presented. The task is addressed as a binary classication problem (identifying bifurcation and non-bifurcation angular sectors in the sequence images). The multiscale stacked sequential learning algorithm is applied, to take into account the spatial and temporal context in IVUS sequences, and the results are rened using a-priori information about branching dimensions and geometry. The achieved performance is comparable to intra- and inter-observer variability. Then, we propose a novel method for the automatic non-rigid alignment of IVUS sequences of the same patient, acquired at dierent moments (before and after percutaneous coronary intervention, or at baseline and follow-up examinations). The method is based on the description of the morphological content of the vessel, obtained by extracting temporal morphological proles from the IVUS acquisitions, by means of methods for segmentation, characterization and detection in IVUS. A technique for non-rigid sequence alignment – the Dynamic Time Warping algorithm - is applied to the proles and adapted to the specic clinical problem. Two dierent robust strategies are proposed to address the partial overlapping between frames of corresponding sequences, and a regularization term is introduced to compensate for possible errors in the prole extraction. The benets of the proposed strategy are demonstrated by extensive validation on synthetic and in-vivo data. The results show the interest of the proposed non-linear alignment and the clinical value of the method. Finally, a novel automatic approach for the extraction of the luminal border in IVUS images is presented. The method applies the multiscale stacked sequential learning algorithm and extends it to 2-D+T, in a rst classication phase (the identi- cation of lumen and non-lumen regions of the images), while an active contour model is used in a second phase, to identify the lumen contour. The method is extended to the longitudinal dimension of the sequences and it is validated on a challenging data-set. |
||||
Address | Barcelona | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Simone Balocco;Petia Radeva | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ Alb2013 | Serial | 2215 | ||
Permanent link to this record | |||||
Author | Mario Hernandez; Joao Sanchez; Jordi Vitria | ||||
Title | Selected papers from Iberian Conference on Pattern Recognition and Image Analysis | Type | Book Whole | ||
Year | 2012 | Publication | Pattern Recognition | Abbreviated Journal | |
Volume | 45 | Issue | 9 | Pages | 3047-3582 |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0031-3203 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | OR;MV | Approved | no | ||
Call Number | Admin @ si @ HSV2012 | Serial | 2069 | ||
Permanent link to this record | |||||
Author | Meysam Madadi | ||||
Title | Human Segmentation, Pose Estimation and Applications | Type | Book Whole | ||
Year | 2017 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Automatic analyzing humans in photographs or videos has great potential applications in computer vision, including medical diagnosis, sports, entertainment, movie editing and surveillance, just to name a few. Body, face and hand are the most studied components of humans. Body has many variabilities in shape and clothing along with high degrees of freedom in pose. Face has many muscles causing many visible deformity, beside variable shape and hair style. Hand is a small object, moving fast and has high degrees of freedom. Adding human characteristics to all aforementioned variabilities makes human analysis quite a challenging task.
In this thesis, we developed human segmentation in different modalities. In a first scenario, we segmented human body and hand in depth images using example-based shape warping. We developed a shape descriptor based on shape context and class probabilities of shape regions to extract nearest neighbors. We then considered rigid affine alignment vs. nonrigid iterative shape warping. In a second scenario, we segmented face in RGB images using convolutional neural networks (CNN). We modeled conditional random field with recurrent neural networks. In our model pair-wise kernels are not fixed and learned during training. We trained the network end-to-end using adversarial networks which improved hair segmentation by a high margin. We also worked on 3D hand pose estimation in depth images. In a generative approach, we fitted a finger model separately for each finger based on our example-based rigid hand segmentation. We minimized an energy function based on overlapping area, depth discrepancy and finger collisions. We also applied linear models in joint trajectory space to refine occluded joints based on visible joints error and invisible joints trajectory smoothness. In a CNN-based approach, we developed a tree-structure network to train specific features for each finger and fused them for global pose consistency. We also formulated physical and appearance constraints as loss functions. Finally, we developed a number of applications consisting of human soft biometrics measurement and garment retexturing. We also generated some datasets in this thesis consisting of human segmentation, synthetic hand pose, garment retexturing and Italian gestures. |
||||
Address | October 2017 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Sergio Escalera;Jordi Gonzalez | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-945373-3-2 | Medium | ||
Area | Expedition | Conference | |||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ Mad2017 | Serial | 3017 | ||
Permanent link to this record | |||||
Author | Michael Teutsch; Angel Sappa; Riad I. Hammoud | ||||
Title | Computer Vision in the Infrared Spectrum: Challenges and Approaches | Type | Book Whole | ||
Year | 2021 | Publication | Synthesis Lectures on Computer Vision | Abbreviated Journal | |
Volume | 10 | Issue | 2 | Pages | 1-138 |
Keywords | |||||
Abstract | Human visual perception is limited to the visual-optical spectrum. Machine vision is not. Cameras sensitive to the different infrared spectra can enhance the abilities of autonomous systems and visually perceive the environment in a holistic way. Relevant scene content can be made visible especially in situations, where sensors of other modalities face issues like a visual-optical camera that needs a source of illumination. As a consequence, not only human mistakes can be avoided by increasing the level of automation, but also machine-induced errors can be reduced that, for example, could make a self-driving car crash into a pedestrian under difficult illumination conditions. Furthermore, multi-spectral sensor systems with infrared imagery as one modality are a rich source of information and can provably increase the robustness of many autonomous systems. Applications that can benefit from utilizing infrared imagery range from robotics to automotive and from biometrics to surveillance. In this book, we provide a brief yet concise introduction to the current state-of-the-art of computer vision and machine learning in the infrared spectrum. Based on various popular computer vision tasks such as image enhancement, object detection, or object tracking, we first motivate each task starting from established literature in the visual-optical spectrum. Then, we discuss the differences between processing images and videos in the visual-optical spectrum and the various infrared spectra. An overview of the current literature is provided together with an outlook for each task. Furthermore, available and annotated public datasets and common evaluation methods and metrics are presented. In a separate chapter, popular applications that can greatly benefit from the use of infrared imagery as a data source are presented and discussed. Among them are automatic target recognition, video surveillance, or biometrics including face recognition. Finally, we conclude with recommendations for well-fitting sensor setups and data processing algorithms for certain computer vision tasks. We address this book to prospective researchers and engineers new to the field but also to anyone who wants to get introduced to the challenges and the approaches of computer vision using infrared images or videos. Readers will be able to start their work directly after reading the book supported by a highly comprehensive backlog of recent and relevant literature as well as related infrared datasets including existing evaluation frameworks. Together with consistently decreasing costs for infrared cameras, new fields of application appear and make computer vision in the infrared spectrum a great opportunity to face nowadays scientific and engineering challenges. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1636392431 | Medium | ||
Area | Expedition | Conference | |||
Notes | MSIAU | Approved | no | ||
Call Number | Admin @ si @ TSH2021 | Serial | 3666 | ||
Permanent link to this record | |||||
Author | Michal Drozdzal | ||||
Title | Sequential image analysis for computer-aided wireless endoscopy | Type | Book Whole | ||
Year | 2014 | Publication | PhD Thesis, Universitat de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Wireless Capsule Endoscopy (WCE) is a technique for inner-visualization of the entire small intestine and, thus, offers an interesting perspective on intestinal motility. The two major drawbacks of this technique are: 1) huge amount of data acquired by WCE makes the motility analysis tedious and 2) since the capsule is the first tool that offers complete inner-visualization of the small intestine,the exact importance of the observed events is still an open issue. Therefore, in this thesis, a novel computer-aided system for intestinal motility analysis is presented. The goal of the system is to provide an easily-comprehensible visual description of motility-related intestinal events to a physician. In order to do so, several tools based either on computer vision concepts or on machine learning techniques are presented. A method for transforming 3D video signal to a holistic image of intestinal motility, called motility bar, is proposed. The method calculates the optimal mapping from video into image from the intestinal motility point of view.
To characterize intestinal motility, methods for automatic extraction of motility information from WCE are presented. Two of them are based on the motility bar and two of them are based on frame-per-frame analysis. In particular, four algorithms dealing with the problems of intestinal contraction detection, lumen size estimation, intestinal content characterization and wrinkle frame detection are proposed and validated. The results of the algorithms are converted into sequential features using an online statistical test. This test is designed to work with multivariate data streams. To this end, we propose a novel formulation of concentration inequality that is introduced into a robust adaptive windowing algorithm for multivariate data streams. The algorithm is used to obtain robust representation of segments with constant intestinal motility activity. The obtained sequential features are shown to be discriminative in the problem of abnormal motility characterization. Finally, we tackle the problem of efficient labeling. To this end, we incorporate active learning concepts to the problems present in WCE data and propose two approaches. The first one is based the concepts of sequential learning and the second one adapts the partition-based active learning to an error-free labeling scheme. All these steps are sufficient to provide an extensive visual description of intestinal motility that can be used by an expert as decision support system. |
||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Petia Radeva | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-940902-3-3 | Medium | ||
Area | Expedition | Conference | |||
Notes | MILAB | Approved | no | ||
Call Number | Admin @ si @ Dro2014 | Serial | 2486 | ||
Permanent link to this record | |||||
Author | Mickael Coustaty; Alicia Fornes | ||||
Title | Document Analysis and Recognition – ICDAR 2023 Workshops | Type | Book Whole | ||
Year | 2023 | Publication | Document Analysis and Recognition – ICDAR 2023 Workshops | Abbreviated Journal | |
Volume | 14194 | Issue | 2 | Pages | |
Keywords | |||||
Abstract | |||||
Address | San Jose; USA; August 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ CoF2023 | Serial | 3852 | ||
Permanent link to this record | |||||
Author | Miquel Ferrer | ||||
Title | Theory and Algorithms on the Median Graph. Application to Graph-based Classification and Clustering | Type | Book Whole | ||
Year | 2008 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Place of Publication | Editor | Francesc Serratosa Casanelles;Ernest Valveny | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | 978-84-935251-7-0 | Conference | ||
Notes | Approved | no | |||
Call Number | Admin @ si @ Fer2008 | Serial | 1105 | ||
Permanent link to this record | |||||
Author | Misael Rosales | ||||
Title | A Physics-Based Image Modelling of IVUS as a Geometric and Kinematic System | Type | Book Whole | ||
Year | 2005 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | CVC (UAB) | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Place of Publication | Editor | Petia Radeva | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | 978-84-922529-8-7 | Conference | ||
Notes | Approved | no | |||
Call Number | Admin @ si @ Ros2005 | Serial | 603 | ||
Permanent link to this record |