Records | |||||
---|---|---|---|---|---|
Author | David Geronimo | ||||
Title | A Global Approach to Vision-Based Pedestrian Detection for Advanced Driver Assistance Systems | Type | Book Whole | ||
Year | 2010 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | At the beginning of the 21st century, traffic accidents have become a major problem not only for developed countries but also for emerging ones. As in other scientific areas in which Artificial Intelligence is becoming a key actor, advanced driver assistance systems, and concretely pedestrian protection systems based on Computer Vision, are becoming a strong topic of research aimed at improving the safety of pedestrians. However, the challenge is of considerable complexity due to the varying appearance of humans (e.g., clothes, size, aspect ratio, shape, etc.), the dynamic nature of on-board systems and the unstructured moving environments that urban scenarios represent. In addition, the required performance is demanding both in terms of computational time and detection rates. In this thesis, instead of focusing on improving specific tasks, as is frequent in the literature, we present a global approach to the problem. Such a global overview starts with the proposal of a generic architecture to be used as a framework both to review the literature and to organize the studied techniques throughout the thesis. We then focus the research on tasks such as foreground segmentation, object classification and refinement, following a general viewpoint and exploring aspects that are not usually analyzed. In order to perform the experiments, we also present a novel pedestrian dataset that consists of three subsets, each one addressed to the evaluation of a different specific task in the system. The results presented in this thesis not only end with a proposal of a pedestrian detection system but also go one step beyond by pointing out new insights, formalizing existing and proposed algorithms, introducing new techniques and evaluating their performance, which we hope will provide new foundations for future research in the area. | ||||
Address | Antonio Lopez;Krystian Mikolajczyk;Jaume Amores;Dariu M. Gavrila;Oriol Pujol;Felipe Lumbreras | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Antonio Lopez;Krystian Mikolajczyk;Jaume Amores;Dariu M. Gavrila;Oriol Pujol;Felipe Lumbreras | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-936529-5-1 | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | ADAS @ adas @ Ger2010 | Serial | 1279 | ||
Permanent link to this record | |||||
Author | Marçal Rusiñol; Josep Llados | ||||
Title | Symbol Spotting in Digital Libraries: Focused Retrieval over Graphic-rich Document Collections | Type | Book Whole | ||
Year | 2010 | Publication | Symbol Spotting in Digital Libraries: Focused Retrieval over Graphic-rich Document Collections | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | Focused Retrieval, Graphical Pattern Indexation, Graphics Recognition, Pattern Recognition, Performance Evaluation, Symbol Description, Symbol Spotting | ||||
Abstract | The specific problem of symbol recognition in graphical documents requires additional techniques to those developed for character recognition. The most well-known obstacle is the so-called Sayre paradox: Correct recognition requires good segmentation, yet improvement in segmentation is achieved using information provided by the recognition process. This dilemma can be avoided by techniques that identify sets of regions containing useful information. Such symbol-spotting methods allow the detection of symbols in maps or technical drawings without having to fully segment or fully recognize the entire content. This unique text/reference provides a complete, integrated and large-scale solution to the challenge of designing a robust symbol-spotting method for collections of graphic-rich documents. The book examines a number of features and descriptors, from basic photometric descriptors commonly used in computer vision techniques to those specific to graphical shapes, presenting a methodology which can be used in a wide variety of applications. Additionally, readers are supplied with an insight into the problem of performance evaluation of spotting methods. Some very basic knowledge of pattern recognition, document image analysis and graphics recognition is assumed. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-84996-208-7 | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | DAG @ dag @ RuL2010a | Serial | 1292 | ||
Permanent link to this record | |||||
Author | Ignasi Rius | ||||
Title | Motion Priors for Efficient Bayesian Tracking in Human Sequence Evaluation | Type | Book Whole | ||
Year | 2010 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Recovering human motion by visual analysis is a challenging computer vision research area with a lot of potential applications. Model-based tracking approaches, and in particular particle filters, formulate the problem as a Bayesian inference task whose aim is to sequentially estimate the distribution of the parameters of a human body model over time. These approaches strongly rely on good dynamical and observation models to predict and update configurations of the human body according to measurements from the image data. However, it is very difficult to design observation models which extract useful and reliable information from image sequences robustly. This is especially challenging in monocular tracking, given that only one viewpoint of the scene is available. Therefore, to overcome these limitations, strong motion priors are needed to guide the exploration of the state space. The work presented in this Thesis aims to retrieve the 3D motion parameters of a human body model from incomplete and noisy measurements of a monocular image sequence. These measurements consist of the 2D positions of a reduced set of joints in the image plane. Towards this end, we present a novel action-specific model of human motion which is trained from several databases of real motion-captured performances of an action, and is used as a priori knowledge within a particle filtering scheme. Body postures are represented by means of a simple and compact stick figure model which uses direction cosines to represent the direction of body limbs in the 3D Cartesian space. Then, for a given action, Principal Component Analysis is applied to the training data to perform dimensionality reduction over the highly correlated input data. Before the learning stage of the action model, the input motion performances are synchronized by means of a novel dense matching algorithm based on Dynamic Programming. The algorithm synchronizes all the motion sequences of the same action class, finding an optimal solution in real-time. Then, a probabilistic action model is learnt, based on the synchronized motion examples, which captures the variability and temporal evolution of full-body motion within a specific action. In particular, for each action, the parameters learnt are: a representative manifold for the action consisting of its mean performance, the standard deviation from the mean performance, the mean observed direction vectors from each motion subsequence of a given length and the expected error at a given time instant. Subsequently, the action-specific model is used as a priori knowledge on human motion which improves the efficiency and robustness of the overall particle filtering tracking framework. First, the dynamic model guides the particles according to similar situations previously learnt. Then, the state space is constrained so only feasible human postures are accepted as valid solutions at each time step. As a result, the state space is explored more efficiently as the particle set covers the most probable body postures. Finally, experiments are carried out using test sequences from several motion databases. Results point out that our tracker scheme is able to estimate the rough 3D configuration of a full-body model given only the 2D positions of a reduced set of joints. Separate tests on the sequence synchronization method and the subsequence probabilistic matching technique are also provided. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Jordi Gonzalez;Xavier Roca | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-937261-9-5 | Medium | ||
Area | Expedition | Conference | |||
Notes | Approved | no | |||
Call Number | Admin @ si @ Riu2010 | Serial | 1331 | ||
Permanent link to this record | |||||
Author | Ivan Huerta | ||||
Title | Foreground Object Segmentation and Shadow Detection for Video Sequences in Uncontrolled Environments | Type | Book Whole | ||
Year | 2010 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | This Thesis is mainly divided into two parts. The first one presents a study of motion segmentation problems. Based on this study, a novel algorithm for mobile-object segmentation from a static background scene is also presented. This approach is shown to be robust and accurate under most of the common problems in motion segmentation. The second one tackles the problem of shadows in depth. Firstly, a bottom-up approach based on a chromatic shadow detector is presented to deal with umbra shadows. Secondly, a top-down approach based on a tracking system has been developed in order to enhance the chromatic shadow detection. In our first contribution, a case analysis of motion segmentation problems is presented by taking into account the problems associated with different cues, namely colour, edge and intensity. Our second contribution is a hybrid architecture which handles the main problems observed in such a case analysis, by fusing (i) the knowledge from these three cues and (ii) a temporal difference algorithm. On the one hand, we enhance the colour and edge models to solve both global/local illumination changes (shadows and highlights) and camouflage in intensity. In addition, local information is exploited to cope with a very challenging problem such as the camouflage in chroma. On the other hand, the intensity cue is also applied when colour and edge cues are not available, such as when beyond the dynamic range. Additionally, temporal difference is included to segment motion when these three cues are not available, such as when the background is not visible during the training period. Lastly, the approach is enhanced to allow ghost detection. As a result, our approach obtains very accurate and robust motion segmentation in both indoor and outdoor scenarios, as quantitatively and qualitatively demonstrated in the experimental results, by comparing our approach with the best-known state-of-the-art approaches. Motion segmentation has to deal with shadows to avoid distortions when detecting moving objects. Most segmentation approaches dealing with shadow detection are typically restricted to penumbra shadows. Therefore, such techniques cannot cope well with umbra shadows. Consequently, umbra shadows are usually detected as part of moving objects. Firstly, a bottom-up approach for detection and removal of chromatic moving shadows in surveillance scenarios is proposed. Secondly, a top-down approach based on Kalman filters to detect and track shadows has been developed in order to enhance the chromatic shadow detection. In the bottom-up part, the shadow detection approach applies a novel technique based on gradient and colour models for separating chromatic moving shadows from moving objects. Well-known colour and gradient models are extended and improved into an invariant colour cone model and an invariant gradient model, respectively, to perform automatic segmentation while detecting potential shadows. Thereafter, the regions corresponding to potential shadows are grouped by considering "a bluish effect" and an edge partitioning. Lastly, (i) temporal similarities between local gradient structures and (ii) spatial similarities between chrominance angle and brightness distortions are analysed for all potential shadow regions in order to finally identify umbra shadows. In the top-down process, after detection of objects and shadows, both are tracked using Kalman filters in order to enhance the chromatic shadow detection when it fails to detect a shadow. Firstly, this implies a data association between the blobs (foreground and shadow) and Kalman filters. Secondly, an event analysis of the different data association cases is performed, and occlusion handling is managed by a Probabilistic Appearance Model (PAM). Based on this association, temporal consistency is sought in the association between foregrounds and shadows and their respective Kalman filters. From this association several cases are studied; as a result, lost chromatic shadows are correctly detected. Finally, the tracking results are used as feedback to improve the shadow and object detection. Unlike other approaches, our method does not make any a priori assumptions about camera location, surface geometries, surface textures, shapes and types of shadows, objects, and background. Experimental results show the performance and accuracy of our approach in different shadowed materials and illumination conditions. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Jordi Gonzalez;Xavier Roca | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-937261-3-3 | Medium | ||
Area | Expedition | Conference | |||
Notes | Approved | no | |||
Call Number | ISE @ ise @ Hue2010 | Serial | 1332 | ||
Permanent link to this record | |||||
Author | Carles Fernandez | ||||
Title | Understanding Image Sequences: the Role of Ontologies in Cognitive Vision | Type | Book Whole | ||
Year | 2010 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | The increasing ubiquity of digital information in our daily lives has positioned video as a favored information vehicle, and given rise to an astonishing generation of social media and surveillance footage. This raises a series of technological demands for automatic video understanding and management, which, together with the compromising attentional limitations of human operators, have motivated the research community to guide its steps towards a better attainment of such capabilities. As a result, current trends in cognitive vision promise to recognize complex events and self-adapt to different environments, while managing and integrating several types of knowledge. Future directions suggest reinforcing the multi-modal fusion of information sources and the communication with end-users. In this thesis we tackle the problem of recognizing and describing meaningful events in video sequences from different domains, and communicating the resulting knowledge to end-users by means of advanced interfaces for human–computer interaction. This problem is addressed by designing the high-level modules of a cognitive vision framework exploiting ontological knowledge. Ontologies allow us to define the relevant concepts in a domain and the relationships among them; we prove that the use of ontologies to organize, centralize, link, and reuse different types of knowledge is a key factor in the materialization of our objectives. The proposed framework contributes to: (i) automatically learn the characteristics of different scenarios in a domain; (ii) reason about uncertain, incomplete, or vague information from visual –camera’s– or linguistic –end-user’s– inputs; (iii) derive plausible interpretations of complex events from basic spatiotemporal developments; (iv) facilitate natural interfaces that adapt to the needs of end-users, and allow them to communicate efficiently with the system at different levels of interaction; and finally, (v) find mechanisms to guide modeling processes, maintain and extend the resulting models, and to exploit multimodal resources synergistically to enhance the former tasks. We describe a holistic methodology to achieve these goals. First, the use of prior taxonomical knowledge is proved useful to guide MAP-MRF inference processes in the automatic identification of semantic regions, independently of a particular scenario. Towards the recognition of complex video events, we combine fuzzy metric-temporal reasoning with SGTs, thus assessing high-level interpretations from spatiotemporal data. Here, ontological resources like T–Boxes, onomasticons, or factual databases become useful to derive video indexing and retrieval capabilities, and also to forward highlighted content to smart user interfaces. There, we explore the application of ontologies to discourse analysis and cognitive linguistic principles, or scene augmentation techniques towards advanced communication by means of natural language dialogs and synthetic visualizations. Ontologies become fundamental to coordinate, adapt, and reuse the different modules in the system. The suitability of our ontological framework is demonstrated by a series of applications that especially benefit the field of smart video surveillance, viz. automatic generation of linguistic reports about the content of video sequences in multiple natural languages; content-based filtering and summarization of these reports; dialogue-based interfaces to query and browse video contents; automatic learning of semantic regions in a scenario; and tools to evaluate the performance of components and models in the system, via simulation and augmented reality. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Jordi Gonzalez;Xavier Roca | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-937261-2-6 | Medium | ||
Area | Expedition | Conference | |||
Notes | Approved | no | |||
Call Number | Admin @ si @ Fer2010a | Serial | 1333 | ||
Permanent link to this record | |||||
Author | Joan Mas | ||||
Title | A Syntactic Pattern Recognition Approach based on a Distribution Tolerant Adjacency Grammar and a Spatial Indexed Parser. Application to Sketched Document Recognition | Type | Book Whole | ||
Year | 2010 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Sketch recognition is a discipline which has gained increasing interest in the last 20 years. This is due to the appearance of new devices such as PDAs, Tablet PCs or digital pen & paper protocols. From the wide range of sketched documents we focus on those that represent structured documents such as architectural floor-plans, engineering drawings, UML diagrams, etc. To recognize and understand these kinds of documents, first we have to recognize the different compounding symbols and then we have to identify the relations between these elements. Depending on the way a sketch is captured, there are two categories: on-line and off-line. On-line input modes refer to drawing directly on a PDA or a Tablet PC, while off-line input modes refer to scanning a previously drawn sketch. This thesis lies at the overlap of three different areas of Computer Science: Pattern Recognition, Document Analysis and Human-Computer Interaction. The aim of this thesis is to interpret sketched documents independently of whether they are captured on-line or off-line. For this reason, the proposed approach should have the following features. First, as we are working with sketches, the elements present in our input contain distortions. Second, as we work in on-line or off-line input modes, the order of the primitives in the input is irrelevant. Finally, since the proposed method should be applicable in real scenarios, its response time must be low. To interpret a sketched document we propose a syntactic approach. A syntactic approach is composed of two correlated components: a grammar and a parser. The grammar allows describing the different elements of the document as well as their relations. The parser, given a document, checks whether it belongs to the language generated by the grammar or not. Thus, the grammar should be able to cope with the distortions appearing in the instances of the elements. Moreover, it is necessary to define a symbol independently of the order of its primitives. Concerning the parser, when analyzing 2D sentences it does not assume an order in the primitives. Then, at each new primitive in the input, the parser searches among the previously analyzed symbols for candidates to produce a valid reduction. Taking into account these features, we have proposed a grammar based on Adjacency Grammars. This kind of grammar defines its productions as a multiset of symbols rather than a list. This allows describing a symbol without an order in its components. To cope with distortion we have proposed a distortion model. This distortion model is an attribute estimated over the constraints of the grammar and passed through the productions. This measure gives an idea of how far the symbol is from its ideal model. In addition to the distortion on the constraints, other distortions appear when working with sketches. These distortions are: overtracing, overlapping, gaps or spurious strokes. Some grammatical productions have been defined to cope with these errors. Concerning the recognition, we have proposed an incremental parser with an indexation mechanism. Incremental parsers analyze the input symbol by symbol, giving a response to the user when a primitive is analyzed. This makes incremental parsers suitable to work in on-line as well as off-line input modes. The parser has been adapted with an indexation mechanism based on a spatial division. This indexation mechanism allows setting the primitives in the space and reducing the search to a neighbourhood. A third contribution is a grammatical inference algorithm. Given a set of symbols, this method captures the production describing them. In the field of formal languages, different approaches have been proposed, but in the graphical domain little work has been done. The proposed method is able to capture the production from a set of symbols even though they are drawn in different order. A matching step based on the Hausdorff distance and the Hungarian method has been proposed to match the primitives of the different symbols. In addition, the proposed approach is able to capture the variability in the parameters of the constraints. From the experimental results, we may conclude that we have proposed a robust approach to describe and recognize sketches. Moreover, the addition of new symbols to the alphabet is not restricted to an expert. Finally, the proposed approach has been used in two real scenarios, obtaining a good performance. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Gemma Sanchez;Josep Llados | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-937261-4-0 | Medium | ||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | DAG @ dag @ Mas2010 | Serial | 1334 | ||
Permanent link to this record | |||||
Author | Francisco Javier Orozco | ||||
Title | Human Emotion Evaluation on Facial Image Sequences | Type | Book Whole | ||
Year | 2010 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Psychological evidence has emphasized the importance of affective behaviour understanding due to its high impact on today's interaction between humans and computers. All types of affective and behavioural patterns such as gestures, emotions and mental states are highly displayed through the face, head and body. Therefore, this thesis focuses on analysing affective behaviours of the head and face. To this end, head and facial movements are encoded by using appearance-based tracking methods. Specifically, a wise combination of deformable models captures rigid and non-rigid movements of different kinematics; 3D head pose, eyebrows, mouth, eyelids and irises are taken into account as a basis for extracting features from databases of video sequences. This approach combines the strengths of adaptive appearance models, optimization methods and backtracking techniques. For about thirty years, computer science has focused the investigation of human emotions on the automatic recognition of six prototypic emotions suggested by Darwin and systematized by Paul Ekman in the seventies. The Facial Action Coding System (FACS) uses discrete movements of the face (called Action Units or AUs) to code the six facial emotions, namely anger, disgust, fear, happiness-joy, sadness and surprise. However, human emotions are much more complex patterns that have not received the same attention from computer scientists. Simon Baron-Cohen proposed a new taxonomy of emotions and mental states without a coding system for the facial actions. These 426 affective behaviours are more challenging for the understanding of human emotions. Beyond classifying the six basic facial expressions, more subtle gestures, facial actions and spontaneous emotions are considered here. By assessing confidence in the recognition results and exploring spatial and temporal relationships of the features, some methods are combined and enhanced for developing a new taxonomy of expressions and emotions. The objective of this dissertation is to develop a computer vision system, including facial feature extraction, expression recognition and emotion understanding, by building a bottom-up reasoning process. Building a detailed taxonomy of human affective behaviours is an interesting challenge for head-face-based image analysis methods. In this work, we exploit the strengths of Canonical Correlation Analysis (CCA) to enhance an on-line head-face tracker. A relationship between head pose and local facial movements is studied according to their cognitive interpretation on affective expressions and emotions. Active Shape Models are synthesized for AAMs based on CCA-regression. Head pose and facial actions are fused into a maximally correlated space in order to assess expressiveness, confidence and classification in a CBR system. The CBR solutions are also correlated to the cognitive features, which allows avoiding exhaustive search when recognizing new head-face features. Subsequently, Support Vector Machines (SVMs) and Bayesian Networks are applied for learning the spatial relationships of facial expressions. Similarly, the temporal evolution of facial expressions, emotions and mental states is analysed based on Factorized Dynamic Bayesian Networks (FaDBN). As a result, the bottom-up system recognizes six facial expressions, six basic emotions and six mental states, and enhances this categorization with confidence assessment at each level, intensity of expressions and a complete taxonomy. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Jordi Gonzalez;Xavier Roca | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-936529-3-7 | Medium | ||
Area | Expedition | Conference | |||
Notes | Approved | no | |||
Call Number | Admin @ si @ Oro2010 | Serial | 1335 | ||
Permanent link to this record | |||||
Author | Jose Manuel Alvarez | ||||
Title | Combining Context and Appearance for Road Detection | Type | Book Whole | ||
Year | 2010 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Road traffic crashes have become a major cause of death and injury throughout the world. Hence, in order to improve road safety, automobile manufacturers are moving towards the development of vehicles with autonomous functionalities such as keeping in the right lane, keeping a safe distance between vehicles or regulating the speed of the vehicle according to the traffic conditions. A key component of these systems is vision–based road detection, which aims to detect the free road surface ahead of the moving vehicle. Detecting the road using a monocular vision system is very challenging since the road is an outdoor scenario imaged from a mobile platform. Hence, the detection algorithm must be able to deal with continuously changing imaging conditions such as the presence of different objects (vehicles, pedestrians), different environments (urban, highways, off–road), different road types (shape, color), and different imaging conditions (varying illumination, different viewpoints and changing weather conditions). Therefore, in this thesis, we focus on vision–based road detection using a single color camera. More precisely, we first focus on analyzing and grouping pixels according to their low–level properties. In this way, two different approaches are presented to exploit color and photometric invariance. Then, we focus the research of the thesis on exploiting context information. This information provides relevant knowledge about the road using not pixel features from road regions but semantic information from the analysis of the scene. In this way, we present two different approaches to infer the geometry of the road ahead of the moving vehicle. Finally, we focus on combining these context and appearance (color) approaches to improve the overall performance of road detection algorithms. The qualitative and quantitative results presented in this thesis on real–world driving sequences show that the proposed method is robust to varying imaging conditions, road types and scenarios, going beyond the state–of–the–art. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Antonio Lopez;Theo Gevers | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-937261-8-8 | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | Admin @ si @ Alv2010 | Serial | 1454 | ||
Permanent link to this record | |||||
Author | Partha Pratim Roy | ||||
Title | Multi-Oriented and Multi-Scaled Text Character Analysis and Recognition in Graphical Documents and their Applications to Document Image Retrieval | Type | Book Whole | ||
Year | 2010 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | With the advance of research in Document Image Analysis and Recognition (DIAR), an important line of research has emerged on the indexing and retrieval of graphics-rich documents. It aims at finding relevant documents by relying on the segmentation and recognition of text and graphics components laid out in non-standard ways, where commercial OCR systems cannot be applied due to the complexity. This thesis focuses on text information extraction approaches in graphical documents and on the retrieval of such documents using text information. Automatic text recognition in graphical documents (maps, engineering drawings, etc.) involves many challenges because text characters are usually printed in a multi-oriented and multi-scale way along with different graphical objects. Text characters are used to annotate the graphical curve lines and hence they often follow curvi-linear paths too. For OCR of such documents, individual text lines and their corresponding words/characters need to be extracted. For recognition of multi-font, multi-scale and multi-oriented characters, we have proposed a feature descriptor for character shape that uses angular information from contour pixels to achieve invariance. To improve the efficiency of OCR, an approach towards the segmentation of multi-oriented touching strings into individual characters is also discussed. Convex hull based background information is used to segment a touching string into possible primitive segments, and these primitive segments are later merged to obtain the optimum segmentation using dynamic programming. To overcome the touching/overlapping problem of text with graphical lines, a character spotting approach using SIFT and skeleton information is included. Afterwards, we propose a novel method to extract individual curvi-linear text lines using the foreground and background information of the characters of the text; a water reservoir concept is used to exploit the background information. We have also formulated methodologies for graphical document retrieval applications using query words and seals.
The retrieval approaches are performed using recognition results of individual components in the document. Given a query text, the system extracts positional knowledge from the query word and uses the same to generate hypothetical locations in the document. Indexing of documents is also performed based on automatic detection of seals from documents containing cluttered background. A seal is characterized by scale and rotation invariant spatial feature descriptors computed from labelled text characters and a concept based on the Generalized Hough Transform is used to locate the seal in documents. |
||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Josep Llados;Umapada Pal | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-937261-7-1 | Medium | ||
Area | Expedition | Conference | |||
Notes | Approved | no | |||
Call Number | Admin @ si @ Roy2010 | Serial | 1455 | ||
Permanent link to this record | |||||
Author | Angel Sappa (ed) | ||||
Title | Computer Graphics and Imaging | Type | Book Whole | ||
Year | 2010 | Publication | Computer Graphics and Imaging | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | Angel Sappa | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-0-88986-836-6 | Medium | ||
Area | Expedition | Conference | CGIM | ||
Notes | ADAS | Approved | no | ||
Call Number | ADAS @ adas @ Sap2010 | Serial | 1468 | ||
Permanent link to this record | |||||
Author | Debora Gil; Jordi Gonzalez; Gemma Sanchez (eds) | ||||
Title | Computer Vision: Advances in Research and Development | Type | Book Whole | ||
Year | 2007 | Publication | Proceedings of the 2nd CVC International Workshop | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | UAB | Place of Publication | Bellaterra (Spain) | Editor | Debora Gil; Jordi Gonzalez; Gemma Sanchez |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | 2 | Abbreviated Series Title | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-935251-4-9 | Medium | ||
Area | Expedition | Conference | |||
Notes | IAM; ISE; DAG | Approved | no | ||
Call Number | IAM @ iam @ GGS2007 | Serial | 1493 | ||
Permanent link to this record | |||||
Author | Jaume Garcia | ||||
Title | Statistical Models of the Architecture and Function of the Left Ventricle | Type | Book Whole | ||
Year | 2009 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Cardiovascular diseases, especially those affecting the Left Ventricle (LV), are the leading cause of death in developed countries, accounting for approximately 30% of all global deaths. In order to address this public health concern, physicians focus on diagnosis and therapy planning. On one hand, early and accurate detection of Regional Wall Motion Abnormalities (RWMA) significantly contributes to a quick diagnosis and prevents the patient from reaching more severe stages. On the other hand, a thorough knowledge of the normal gross anatomy of the LV, as well as the distribution of its muscular fibers, is crucial for designing specific interventions and therapies (such as pacemaker implantation). Statistical models obtained from the analysis of different imaging modalities allow the computation of the normal ranges of variation within a given population. Normality models are a valuable tool for the definition of objective criteria quantifying the degree of (anomalous) deviation of the LV function and anatomy for a given subject. The creation of statistical models involves addressing three main issues: extraction of data from images, definition of a common domain for comparison of data across patients, and design of appropriate statistical analysis schemes. In this PhD thesis we present generic image processing tools for the creation of statistical models of the LV anatomy and function. On one hand, we use differential geometry concepts to define a computational framework (the Normalized Parametric Domain, NPD) suitable for the comparison and fusion of several clinical scores obtained over the LV. On the other hand, we present a variational approach (the Harmonic Phase Flow, HPF) for the estimation of myocardial motion that provides dense and continuous vector fields without overestimating motion at injured areas. These tools are used for the creation of statistical models. 
Regarding anatomy, we obtain an atlas jointly modelling both LV gross anatomy and fiber architecture. Regarding function, we compute normality patterns of scores characterizing the (global and local) LV function and explore, for the first time, the configuration of local scores best suited for RWMA detection. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Debora Gil | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | IAM | Approved | no | ||
Call Number | IAM @ iam @ Gar2009a | Serial | 1499 | ||
Permanent link to this record | |||||
Author | Debora Gil | ||||
Title | Geometric Differential Operators for Shape Modelling | Type | Book Whole | ||
Year | 2004 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Medical imaging feeds research in many computer vision and image processing fields: image filtering, segmentation, shape recovery, registration, retrieval and pattern matching. Because of their low contrast changes and large variety of artifacts and noise, medical image processing techniques relying on an analysis of the geometry of image level sets rather than on intensity values result in more robust treatment. From the starting point of treatment of intravascular images, this PhD thesis addresses the design of differential image operators based on geometric principles for robust shape modelling and restoration. Among all fields applying shape recovery, we approach filtering and segmentation of image objects. For a successful use in real images, the segmentation process should go through three stages: noise removal, shape modelling and shape recovery. This PhD addresses all three topics, but for the sake of algorithms as automated as possible, techniques for image processing will be designed to satisfy three main principles: a) convergence of the iterative schemes to non-trivial states avoiding image degeneration to a constant image and representing smooth models of the originals; b) smooth asymptotic behavior ensuring stabilization of the iterative process; c) fixed parameter values ensuring equal (domain free) performance of the algorithms whatever the initial images/shapes. Our geometric approach to the generic equations that model the different processes approached enables defining techniques satisfying all the former requirements. First, we introduce a new curvature-based geometric flow for image filtering achieving a good compromise between noise removal and resemblance to original images. Second, we describe a new family of diffusion operators that restrict their scope to image level curves and serve to restore smooth closed models from unconnected sets of points. 
Finally, we design a regularization of snake (distance) maps that ensures its smooth convergence towards any closed shape. Experiments show that the performance of the proposed techniques surpasses that of state-of-the-art algorithms. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Barcelona (Spain) | Editor | Jordi Saludes i Closa;Petia Radeva |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 84-933652-0-3 | Medium | print | |
Area | Expedition | Conference | |||
Notes | IAM; | Approved | no | ||
Call Number | IAM @ iam @ GIL2004 | Serial | 1517 | ||
Permanent link to this record | |||||
Author | Aura Hernandez-Sabate | ||||
Title | Exploring Arterial Dynamics and Structures in IntraVascular Ultrasound Sequences | Type | Book Whole | ||
Year | 2009 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Cardiovascular diseases are a leading cause of death in developed countries. Most of them are caused by arterial (especially coronary) diseases, mainly caused by plaque accumulation. Such pathology narrows blood flow (stenosis) and affects artery bio-mechanical elastic properties (atherosclerosis). In the last decades, IntraVascular UltraSound (IVUS) has become a usual imaging technique for the diagnosis and follow-up of arterial diseases. IVUS is a catheter-based imaging technique which shows a sequence of cross sections of the artery under study. Inspection of a single image gives information about the percentage of stenosis. Meanwhile, inspection of longitudinal views provides information about artery bio-mechanical properties, which can prevent a fatal outcome of the cardiovascular disease. On one hand, dynamics of arteries (due to heart pumping among others) is a major artifact for exploring tissue bio-mechanical properties. On the other hand, manual stenosis measurements require a manual tracing of vessel borders, which is a time-consuming task and might suffer from inter-observer variations. This PhD thesis proposes several image processing tools for exploring vessel dynamics and structures. We present a physics-based model to extract, analyze and correct vessel in-plane rigid dynamics and to retrieve cardiac phase. Furthermore, we introduce a deterministic-statistical method for automatic vessel border detection. In particular, we address adventitia layer segmentation. An accurate validation protocol to ensure reliable clinical applicability of the methods is a crucial step in any proposal of an algorithm. In this thesis we take special care in designing a validation protocol for each approach proposed and we contribute to the in vivo dynamics validation with a quantitative and objective score to measure the amount of motion suppressed. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Debora Gil | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-937261-6-4 | Medium | ||
Area | Expedition | Conference | |||
Notes | IAM; | Approved | no | ||
Call Number | IAM @ iam @ Her2009 | Serial | 1543 | ||
Permanent link to this record | |||||
Author | Albert Clapes | ||||
Title | Learning to recognize human actions: from hand-crafted to deep-learning based visual representations | Type | Book Whole | ||
Year | 2019 | Publication | PhD Thesis, Universitat de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Action recognition is a very challenging and important problem in computer vision. Researchers working in this field aspire to provide computers with the ability to visually perceive human actions, that is, to observe, interpret, and understand human-related events that occur in the physical environment merely from visual data. The applications of this technology are numerous: human-machine interaction, e-health, monitoring/surveillance, and content-based video retrieval, among others. Hand-crafted methods dominated the field until the appearance of the first successful deep learning-based action recognition works. Although earlier deep-based methods underperformed with respect to hand-crafted approaches, these slowly but steadily improved to become state-of-the-art, eventually achieving better results than hand-crafted ones. Still, hand-crafted approaches can be advantageous in certain scenarios, especially when not enough data is available to train very large deep models, or simply to be combined with deep-based methods to further boost the performance. This shows how hand-crafted features can provide extra knowledge about human actions that deep networks are not able to learn easily.
This Thesis coincides in time with this change of paradigm and, hence, reflects it in two distinct parts. In the first part, we focus on improving current successful hand-crafted approaches for action recognition and we do so from three different perspectives. Using the dense trajectories framework as a backbone: first, we explore the use of multi-modal and multi-view input data to enrich the trajectory descriptors. Second, we focus on the classification part of action recognition pipelines and propose an ensemble learning approach, where each classifier learns from a different set of local spatiotemporal features and their outputs are then combined following a strategy based on the Dempster-Shafer Theory. And third, we propose a novel hand-crafted feature extraction method that constructs a mid-level feature description to better model long-term spatiotemporal dynamics within action videos. Moving to the second part of the Thesis, we start with a comprehensive study of the current deep-learning based action recognition methods. We review both fundamental and cutting-edge methodologies reported during the last few years and introduce a taxonomy of deep-learning methods dedicated to action recognition. In particular, we analyze and discuss how these handle the temporal dimension of data. Last but not least, we propose a residual recurrent network for action recognition that naturally integrates all our previous findings in a powerful and promising framework. |
||||
Address | January 2019 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Sergio Escalera | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-948531-2-8 | Medium | ||
Area | Expedition | Conference | |||
Notes | HUPBA | Approved | no | ||
Call Number | Admin @ si @ Cla2019 | Serial | 3219 | ||
Permanent link to this record |