Records | |||||
---|---|---|---|---|---|
Author | Ajian Liu; Jun Wan; Sergio Escalera; Hugo Jair Escalante; Zichang Tan; Qi Yuan; Kai Wang; Chi Lin; Guodong Guo; Isabelle Guyon; Stan Z. Li | ||||
Title | Multi-Modal Face Anti-Spoofing Attack Detection Challenge at CVPR2019 | Type | Conference Article | |
Year | 2019 | Publication | IEEE Conference on Computer Vision and Pattern Recognition Workshops | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Anti-spoofing attack detection is critical to guarantee the security of face-based authentication and facial analysis systems. Recently, a multi-modal face anti-spoofing dataset, CASIA-SURF, has been released with the goal of boosting research on this important topic. CASIA-SURF is the largest public dataset for facial anti-spoofing attack detection in terms of both diversity and modalities: it comprises 1,000 subjects and 21,000 video samples. We organized a challenge around this novel resource to boost research on the subject. The ChaLearn LAP multi-modal face anti-spoofing attack detection challenge attracted more than 300 teams for the development phase, with a total of 13 teams qualifying for the final round. This paper presents an overview of the challenge, including its design, evaluation protocol and a summary of results. We analyze the top ranked solutions and draw conclusions derived from the competition. In addition, we outline future work directions. | ||||
Address | California; June 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | HuPBA; no proj | Approved | no | ||
Call Number | Admin @ si @ LWE2019 | Serial | 3329 | ||
Permanent link to this record | |||||
Author | Jun Wan; Guodong Guo; Sergio Escalera; Hugo Jair Escalante; Stan Z. Li | ||||
Title | Multi-modal Face Presentation Attack Detection | Type | Book Whole | |
Year | 2020 | Publication | Synthesis Lectures on Computer Vision | Abbreviated Journal | |
Volume | 13 | Issue | Pages | ||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA | Approved | no | ||
Call Number | Admin @ si @ WGE2020 | Serial | 3440 | ||
Permanent link to this record | |||||
Author | Lichao Zhang; Martin Danelljan; Abel Gonzalez-Garcia; Joost Van de Weijer; Fahad Shahbaz Khan | ||||
Title | Multi-Modal Fusion for End-to-End RGB-T Tracking | Type | Conference Article | |
Year | 2019 | Publication | IEEE International Conference on Computer Vision Workshops | Abbreviated Journal | |
Volume | Issue | Pages | 2252-2261 | ||
Keywords | |||||
Abstract | We propose an end-to-end tracking framework for fusing the RGB and TIR modalities in RGB-T tracking. Our baseline tracker is DiMP (Discriminative Model Prediction), which employs a carefully designed target prediction network trained end-to-end using a discriminative loss. We analyze the effectiveness of modality fusion in each of the main components of DiMP, i.e. the feature extractor, the target estimation network, and the classifier. We consider several fusion mechanisms acting at different levels of the framework, including pixel-level, feature-level and response-level. Our tracker is trained in an end-to-end manner, enabling the components to learn how to fuse the information from both modalities. As data to train our model, we generate a large-scale RGB-T dataset by considering an annotated RGB tracking dataset (GOT-10k) and synthesizing paired TIR images using an image-to-image translation approach. We perform extensive experiments on the VOT-RGBT2019 and RGBT210 datasets, evaluating each type of modality fusion on each model component. The results show that the proposed fusion mechanisms improve the performance over the single-modality counterparts. We obtain our best results when fusing at the feature level on both the IoU-Net and the model predictor, obtaining an EAO score of 0.391 on the VOT-RGBT2019 dataset. With this fusion mechanism we achieve state-of-the-art performance on the RGBT210 dataset. | ||||
Address | Seoul; Korea; October 2019 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCVW | ||
Notes | LAMP; 600.109; 600.141; 600.120 | Approved | no | ||
Call Number | Admin @ si @ ZDG2019 | Serial | 3279 | ||
Permanent link to this record | |||||
Author | Sergio Escalera; Jordi Gonzalez; Xavier Baro; Miguel Reyes; Oscar Lopes; Isabelle Guyon; V. Athitsos; Hugo Jair Escalante | ||||
Title | Multi-modal Gesture Recognition Challenge 2013: Dataset and Results | Type | Conference Article | |
Year | 2013 | Publication | 15th ACM International Conference on Multimodal Interaction | Abbreviated Journal | |
Volume | Issue | Pages | 445-452 | ||
Keywords | |||||
Abstract | The recognition of continuous natural gestures is a complex and challenging problem due to the multi-modal nature of the involved visual cues (e.g. finger and lip movements, subtle facial expressions, body pose, etc.), as well as technical limitations such as spatial and temporal resolution and unreliable depth cues. In order to promote research advances in this field, we organized a challenge on multi-modal gesture recognition. We made available a large video database of 13,858 gestures from a lexicon of 20 Italian gesture categories recorded with a Kinect™ camera, providing the audio, skeletal model, user mask, RGB and depth images. The focus of the challenge was on user-independent multiple gesture learning. There are no resting positions and the gestures are performed in continuous sequences lasting 1-2 minutes, containing between 8 and 20 gesture instances in each sequence. As a result, the dataset contains around 1,720,800 frames. In addition to the 20 main gesture categories, ‘distracter’ gestures are included, meaning that additional audio and gestures outside the vocabulary also appear. The final evaluation of the challenge was defined in terms of the Levenshtein edit distance, where the goal was to indicate the real order of gestures within the sequence. 54 international teams participated in the challenge, and outstanding results were obtained by the first ranked participants. | ||||
Address | Sydney; Australia; December 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4503-2129-7 | Medium | ||
Area | Expedition | Conference | ICMI | ||
Notes | HUPBA; ISE; 600.063;MV | Approved | no | ||
Call Number | Admin @ si @ EGB2013 | Serial | 2373 | ||
Permanent link to this record | |||||
Author | Sergio Escalera | ||||
Title | Multi-Modal Human Behaviour Analysis from Visual Data Sources | Type | Journal | |
Year | 2013 | Publication | ERCIM News journal | Abbreviated Journal | ERCIM |
Volume | 95 | Issue | Pages | 21-22 | |
Keywords | |||||
Abstract | The Human Pose Recovery and Behaviour Analysis group (HuPBA), University of Barcelona, is developing a line of research on multi-modal analysis of humans in visual data. The novel technology is being applied in several scenarios with high social impact, including sign language recognition, assistive technology and supported diagnosis for the elderly and people with mental/physical disabilities, fitness conditioning, and Human Computer Interaction. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 0926-4981 | ISBN | Medium | ||
Area | Expedition | Conference | |||
Notes | HuPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ Esc2013 | Serial | 2361 | ||
Permanent link to this record | |||||
Author | Alejandro Gonzalez Alzate | ||||
Title | Multi-modal Pedestrian Detection | Type | Book Whole | |
Year | 2015 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | Pedestrian detection continues to be an extremely challenging problem in real scenarios, in which situations like illumination changes, noisy images, unexpected objects, uncontrolled scenarios and variant appearance of objects occur constantly. All these problems force the development of more robust detectors for relevant applications like vision-based autonomous vehicles, intelligent surveillance, and pedestrian tracking for behavior analysis. Most reliable vision-based pedestrian detectors base their decision on features extracted using a single sensor capturing complementary features, e.g., appearance and texture. These features are usually extracted from the current frame, ignoring temporal information, or including it in a post-processing step, e.g., tracking or temporal coherence. Taking into account these issues, we formulate the following question: can we generate more robust pedestrian detectors by introducing new information sources in the feature extraction step? In order to answer this question we develop different approaches for introducing new information sources to well-known pedestrian detectors. We start with the inclusion of temporal information following the Stacked Sequential Learning (SSL) paradigm, which suggests that information extracted from the neighboring samples in a sequence can improve the accuracy of a base classifier. We then focus on the inclusion of complementary information from different sensors like 3D point clouds (LIDAR – depth), far infrared images (FIR), or disparity maps (stereo pair cameras). To this end we develop a multi-modal framework in which information from different sensors is used to increase detection accuracy (by increasing information redundancy). Finally, we propose a multi-view pedestrian detector; this multi-view approach splits the detection problem into n sub-problems, each of which detects objects in a given specific view, reducing the variability problem faced when a single detector is used for the whole problem. We show that these approaches obtain results competitive with other state-of-the-art methods but, instead of designing new features, we reuse existing ones, boosting their performance. | ||||
Address | November 2015 | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | David Vazquez;Antonio Lopez; | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-943427-7-6 | Medium | ||
Area | Expedition | Conference | |||
Notes | ADAS; 600.076 | Approved | no | ||
Call Number | Admin @ si @ Gon2015 | Serial | 2706 | ||
Permanent link to this record | |||||
Author | Andres Mafla; Sounak Dey; Ali Furkan Biten; Lluis Gomez; Dimosthenis Karatzas | ||||
Title | Multi-modal reasoning graph for scene-text based fine-grained image classification and retrieval | Type | Conference Article | |
Year | 2021 | Publication | IEEE Winter Conference on Applications of Computer Vision | Abbreviated Journal | |
Volume | Issue | Pages | 4022-4032 | ||
Keywords | |||||
Abstract | |||||
Address | Virtual; January 2021 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | WACV | ||
Notes | DAG; 600.121 | Approved | no | ||
Call Number | Admin @ si @ MDB2021 | Serial | 3491 | ||
Permanent link to this record | |||||
Author | Cristina Palmero; Albert Clapes; Chris Bahnsen; Andreas Møgelmose; Thomas B. Moeslund; Sergio Escalera | ||||
Title | Multi-modal RGB-Depth-Thermal Human Body Segmentation | Type | Journal Article | |
Year | 2016 | Publication | International Journal of Computer Vision | Abbreviated Journal | IJCV |
Volume | 118 | Issue | 2 | Pages | 217-239 |
Keywords | Human body segmentation; RGB; Depth; Thermal | ||||
Abstract | This work addresses the problem of human body segmentation from multi-modal visual cues as a first stage of automatic human behavior analysis. We propose a novel RGB–depth–thermal dataset along with a multi-modal segmentation baseline. The several modalities are registered using a calibration device and a registration algorithm. Our baseline extracts regions of interest using background subtraction, defines a partitioning of the foreground regions into cells, computes a set of image features on those cells using different state-of-the-art feature extractions, and models the distribution of the descriptors per cell using probabilistic models. A supervised learning algorithm then fuses the output likelihoods over cells in a stacked feature vector representation. The baseline, using Gaussian mixture models for the probabilistic modeling and Random Forest for the stacked learning, is superior to other state-of-the-art methods, obtaining an overlap above 75 % on the novel dataset when compared to the manually annotated ground-truth of human segmentations. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Springer US | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HuPBA;MILAB; | Approved | no | ||
Call Number | Admin @ si @ PCB2016 | Serial | 2767 | ||
Permanent link to this record | |||||
Author | Victor Ponce; Sergio Escalera; Xavier Baro | ||||
Title | Multi-modal Social Signal Analysis for Predicting Agreement in Conversation Settings | Type | Conference Article | |
Year | 2013 | Publication | 15th ACM International Conference on Multimodal Interaction | Abbreviated Journal | |
Volume | Issue | Pages | 495-502 | ||
Keywords | |||||
Abstract | In this paper we present a non-invasive ambient intelligence framework for the analysis of non-verbal communication applied to conversational settings. In particular, we apply feature extraction techniques to multi-modal audio-RGB-depth data. We compute a set of behavioral indicators that define communicative cues coming from the fields of psychology and observational methodology. We test our methodology on data captured in victim-offender mediation scenarios. Using different state-of-the-art classification approaches, our system achieves over 75% recognition accuracy in predicting agreement among the parties involved in the conversations, using the experts' opinions as ground truth. | ||||
Address | Sydney; Australia; December 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4503-2129-7 | Medium | ||
Area | Expedition | Conference | ICMI | ||
Notes | HuPBA;MV | Approved | no | ||
Call Number | Admin @ si @ PEB2013 | Serial | 2488 | ||
Permanent link to this record | |||||
Author | Albert Clapes; Miguel Reyes; Sergio Escalera | ||||
Title | Multi-modal User Identification and Object Recognition Surveillance System | Type | Journal Article | |
Year | 2013 | Publication | Pattern Recognition Letters | Abbreviated Journal | PRL |
Volume | 34 | Issue | 7 | Pages | 799-808 |
Keywords | Multi-modal RGB-Depth data analysis; User identification; Object recognition; Intelligent surveillance; Visual features; Statistical learning | ||||
Abstract | We propose an automatic surveillance system for user identification and object recognition based on multi-modal RGB-Depth data analysis. We model an RGB-D environment by learning a pixel-based background Gaussian distribution. Then, user and object candidate regions are detected and recognized using robust statistical approaches. The system robustly recognizes users and updates itself in an online way, identifying and detecting new actors in the scene. Moreover, segmented objects are described, matched, recognized, and updated online using view-point 3D descriptions, being robust to partial occlusions and local 3D viewpoint rotations. Finally, the system saves the history of user–object assignments, which is especially useful for surveillance scenarios. The system has been evaluated on a novel data set containing different indoor/outdoor scenarios, objects, and users, showing accurate recognition and better performance than standard state-of-the-art approaches. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | HUPBA; 600.046; 605.203;MILAB | Approved | no | ||
Call Number | Admin @ si @ CRE2013 | Serial | 2248 | ||
Permanent link to this record | |||||
Author | Bogdan Raducanu; Alireza Bosaghzadeh; Fadi Dornaika | ||||
Title | Multi-observation Face Recognition in Videos based on Label Propagation | Type | Conference Article | |
Year | 2015 | Publication | 6th Workshop on Analysis and Modeling of Faces and Gestures AMFG2015 | Abbreviated Journal | |
Volume | Issue | Pages | 10-17 | ||
Keywords | |||||
Abstract | In order to deal with the huge amount of content generated by social media, especially for indexing and retrieval purposes, the focus shifted from single object recognition to multi-observation object recognition. Of particular interest is the problem of face recognition (used as primary cue for persons’ identity assessment), since it is highly required by popular social media search engines like Facebook and YouTube. Recently, several approaches for graph-based label propagation were proposed. However, the associated graphs were constructed in an ad-hoc manner (e.g., using the KNN graph) that cannot cope properly with the rapid and frequent changes in data appearance, a phenomenon intrinsically related with video sequences. In this paper, we propose a novel approach for efficient and adaptive graph construction, based on a two-phase scheme: (i) the first phase is used to adaptively find the neighbors of a sample and also to find the adequate weights for the minimization function of the second phase; (ii) in the second phase, the selected neighbors along with their corresponding weights are used to locally and collaboratively estimate the sparse affinity matrix weights. Experimental results performed on the Honda Video Database (HVDB) and a subset of video sequences extracted from the popular TV series ‘Friends’ show a distinct advantage of the proposed method over the existing standard graph construction methods. | ||||
Address | Boston; USA; June 2015 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CVPRW | ||
Notes | LAMP; 600.068; 600.072; | Approved | no | ||
Call Number | Admin @ si @ RBD2015 | Serial | 2627 | ||
Permanent link to this record | |||||
Author | Partha Pratim Roy | ||||
Title | Multi-Oriented and Multi-Scaled Text Character Analysis and Recognition in Graphical Documents and their Applications to Document Image Retrieval | Type | Book Whole | |
Year | 2010 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | With the advent of research on Document Image Analysis and Recognition (DIAR), an important line of research has been explored on the indexing and retrieval of graphics-rich documents. It aims at finding relevant documents by relying on the segmentation and recognition of the text and graphics components underlying non-standard layouts, where commercial OCRs cannot be applied due to their complexity. This thesis focuses on text information extraction approaches in graphical documents and on the retrieval of such documents using text information. Automatic text recognition in graphical documents (maps, engineering drawings, etc.) involves many challenges because text characters are usually printed in a multi-oriented and multi-scale way along with different graphical objects. Text characters are used to annotate the graphical curve lines and hence they often follow curvi-linear paths too. For OCR of such documents, individual text lines and their corresponding words/characters need to be extracted. For recognition of multi-font, multi-scale and multi-oriented characters, we have proposed a feature descriptor for character shape using angular information from contour pixels to provide the required invariance. To improve the efficiency of OCR, an approach towards the segmentation of multi-oriented touching strings into individual characters is also discussed. Convex hull based background information is used to segment a touching string into possible primitive segments, and later these primitive segments are merged to get the optimum segmentation using dynamic programming. To overcome the touching/overlapping problem of text with graphical lines, a character spotting approach using SIFT and skeleton information is included. Afterwards, we propose a novel method to extract individual curvi-linear text lines using the foreground and background information of the characters of the text, and a water reservoir concept is used to exploit the background information. We have also formulated methodologies for graphical document retrieval applications using query words and seals. The retrieval approaches are performed using the recognition results of individual components in the document. Given a query text, the system extracts positional knowledge from the query word and uses it to generate hypothetical locations in the document. Indexing of documents is also performed based on automatic detection of seals from documents containing cluttered backgrounds. A seal is characterized by scale and rotation invariant spatial feature descriptors computed from labelled text characters, and a concept based on the Generalized Hough Transform is used to locate the seal in documents. | ||||
Address | |||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Josep Llados;Umapada Pal | |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-84-937261-7-1 | Medium | ||
Area | Expedition | Conference | |||
Notes | Approved | no | |||
Call Number | Admin @ si @ Roy2010 | Serial | 1455 | ||
Permanent link to this record | |||||
Author | Partha Pratim Roy; Umapada Pal; Josep Llados; Mathieu Nicolas Delalandre | ||||
Title | Multi-Oriented and Multi-Sized Touching Character Segmentation using Dynamic Programming | Type | Conference Article | |
Year | 2009 | Publication | 10th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 11–15 | ||
Keywords | |||||
Abstract | In this paper, we present a scheme towards the segmentation of English multi-oriented touching strings into individual characters. When two or more characters touch, they generate a big cavity region in the background portion. Using convex hull information, we use this background information to find initial points to segment a touching string into possible primitive segments (a primitive segment consists of a single character or a part of a character). Next, these primitive segments are merged to get the optimum segmentation, and dynamic programming is applied using the total likelihood of characters as the objective function. An SVM classifier is used to find the likelihood of a character. To handle multi-oriented touching strings, the features used in the SVM are invariant to character orientation. A circular ring and convex hull ring based approach has been used along with angular information of the contour pixels of the character to make the features rotation invariant. From the experiments, we obtained encouraging results. | ||||
Address | Barcelona, Spain | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1520-5363 | ISBN | 978-1-4244-4500-4 | Medium | |
Area | Expedition | Conference | ICDAR | ||
Notes | DAG | Approved | no | ||
Call Number | DAG @ dag @ RPL2009a | Serial | 1240 | ||
Permanent link to this record | |||||
Author | Umapada Pal; Partha Pratim Roy; N. Tripathya; Josep Llados | ||||
Title | Multi-oriented Bangla and Devnagari text recognition | Type | Journal Article | |
Year | 2010 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 43 | Issue | 12 | Pages | 4124–4136 |
Keywords | |||||
Abstract | There are printed complex documents where the text lines of a single page may have different orientations or the text lines may be curved in shape. As a result, it is difficult to detect the skew of such documents, and hence character segmentation and recognition of such documents is a complex task. In this paper, using background and foreground information, we propose a novel scheme towards the recognition of Indian complex documents of Bangla and Devnagari script. In Bangla and Devnagari documents, characters in a word usually touch and form cavity regions. To take care of these cavity regions, the background information of such documents is used. Convex hull and water reservoir principles have been applied for this purpose. Here, at first, the characters are segmented from the documents using the background information of the text. Next, individual characters are recognized using rotation invariant features obtained from the foreground part of the characters. For character segmentation, at first, the writing mode of a touching component (word) is detected using water reservoir principle based features. Next, depending on the writing mode and the reservoir base-region of the touching component, a set of candidate envelope points is selected from the contour points of the component. Based on these candidate points, the touching component is finally segmented into individual characters. For recognition of multi-sized/multi-oriented characters, the features are computed from different angular information obtained from the external and internal contour pixels of the characters. This angular information is computed in such a way that it does not depend on the size and rotation of the characters. Circular and convex hull rings have been used to divide a character into smaller zones to get zone-wise features for higher recognition results. We combine circular and convex hull features to improve the results, and these features are fed to support vector machines (SVM) for recognition. From our experiments we obtained recognition results of 99.18% (98.86%) accuracy when tested on 7515 (7874) Devnagari (Bangla) characters. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Elsevier | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG | Approved | no | ||
Call Number | DAG @ dag @ PRT2010 | Serial | 1337 | ||
Permanent link to this record | |||||
Author | Partha Pratim Roy; Josep Llados | ||||
Title | Multi-Oriented Character Recognition from Graphical Documents | Type | Conference Article | |
Year | 2008 | Publication | 2nd International Conference on Cognition and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 30–35 | ||
Keywords | |||||
Abstract | |||||
Address | Mandya (India) | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICCR | ||
Notes | DAG | Approved | no | ||
Call Number | DAG @ dag @ RLP2008 | Serial | 965 | ||
Permanent link to this record |