Records | |||||
---|---|---|---|---|---|
Author | Vishwesh Pillai; Pranav Mehar; Manisha Das; Deep Gupta; Petia Radeva | ||||
Title | Integrated Hierarchical and Flat Classifiers for Food Image Classification using Epistemic Uncertainty | Type | Conference Article | ||
Year | 2022 | Publication | IEEE International Conference on Signal Processing and Communications | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | The problem of food image recognition is an essential one in today’s context because health conditions such as diabetes, obesity, and heart disease require constant monitoring of a person’s diet. To automate this process, several models are available to recognize food images. Due to the considerable number of unique food dishes across various cuisines, a traditional flat classifier ceases to perform well. To address this issue, prediction schemes that combine flat and hierarchical classifiers, using an analysis of epistemic uncertainty to switch between them, have been proposed. However, the accuracy of predictions made using epistemic uncertainty data remains considerably low. Therefore, this paper presents a prediction scheme with three different threshold criteria that helps to increase the accuracy of epistemic uncertainty predictions. The performance of the proposed method is demonstrated through several experiments on the MAFood-121 dataset. The experimental results validate the proposed method's performance and show that the proposed threshold criteria help to increase the overall accuracy of the predictions by correctly classifying the uncertainty distribution of the samples. | ||||
Address | Bangalore; India; July 2022 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | SPCOM | ||
Notes | MILAB; no menciona | Approved | no | ||
Call Number | Admin @ si @ PMD2022 | Serial | 3796 | ||
Permanent link to this record | |||||
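The abstract above describes switching between a flat and a hierarchical classifier based on an epistemic uncertainty threshold. The paper's actual threshold criteria are not reproduced here; the following is only a toy sketch of the general routing idea, assuming predictive entropy over Monte Carlo forward passes as the uncertainty measure (function names and the threshold are illustrative, not from the paper):

```python
import numpy as np

def predictive_entropy(probs):
    """Entropy of the mean softmax output across T stochastic forward
    passes (e.g., Monte Carlo dropout). probs: shape (T, num_classes)."""
    mean = probs.mean(axis=0)
    return float(-np.sum(mean * np.log(mean + 1e-12)))

def combined_predict(flat_probs, hier_probs, threshold):
    """Route a sample: trust the flat classifier when its uncertainty
    (here: predictive entropy) is below `threshold`, otherwise fall
    back to the hierarchical classifier's prediction."""
    if predictive_entropy(flat_probs) < threshold:
        return int(flat_probs.mean(axis=0).argmax())
    return int(hier_probs.mean(axis=0).argmax())
```

A sample the flat classifier is confident about is answered directly; an ambiguous one is deferred to the hierarchical classifier, which can exploit coarse-to-fine class structure.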
Author | Vitaliy Konovalov; Albert Clapes; Sergio Escalera | ||||
Title | Automatic Hand Detection in RGB-Depth Data Sequences | Type | Conference Article | ||
Year | 2013 | Publication | 16th Catalan Conference on Artificial Intelligence | Abbreviated Journal | |
Volume | Issue | Pages | 91-100 | ||
Keywords | |||||
Abstract | Detecting hands in multi-modal RGB-Depth visual data has become a challenging Computer Vision problem with several applications of interest. This task involves dealing with changes in illumination, viewpoint variations, the articulated nature of the human body, the high flexibility of the wrist articulation, and the deformability of the hand itself. In this work, we propose an accurate and efficient automatic hand detection scheme to be applied in Human-Computer Interaction (HCI) applications in which the user is seated at the desk and, thus, only the upper body is visible. Our main hypothesis is that hand landmarks remain at a nearly constant geodesic distance from an automatically located anatomical reference point. In a given frame, the human body is segmented first in the depth image. Then, a graph representation of the body is built in which the geodesic paths are computed from the reference point. The dense optical flow vectors on the corresponding RGB image are used to reduce ambiguities in the geodesic paths’ connectivity, allowing us to eliminate false edges interconnecting different body parts. Finally, we are able to detect the position of both hands based on invariant geodesic distances and optical flow within the body region, without involving costly learning procedures. | ||||
Address | Vic; October 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | CCIA | ||
Notes | HuPBA;MILAB | Approved | no | ||
Call Number | Admin @ si @ KCE2013 | Serial | 2323 | ||
Permanent link to this record | |||||
Author | Volkmar Frinken; Alicia Fornes; Josep Llados; Jean-Marc Ogier | ||||
Title | Bidirectional Language Model for Handwriting Recognition | Type | Conference Article | ||
Year | 2012 | Publication | Structural, Syntactic, and Statistical Pattern Recognition, Joint IAPR International Workshop | Abbreviated Journal | |
Volume | 7626 | Issue | Pages | 611-619 | |
Keywords | |||||
Abstract | In order to improve the results of automatically recognized handwritten text, information about the language is commonly included in the recognition process. A common approach is to represent a text line as a sequence that is processed in one direction, with the language information included directly in the decoding via n-grams. This approach, however, only uses context on one side to estimate a word’s probability. Therefore, we propose bidirectional recognition in this paper, using distinct forward and backward language models. By combining decoding hypotheses from both directions, we achieve a significant increase in recognition accuracy for the off-line, writer-independent handwriting recognition task. Both language models are of the same type and can be estimated on the same corpus. Hence, the increase in recognition accuracy comes without any additional need for training data or language modeling complexity. | ||||
Address | Japan | ||||
Corporate Author | Thesis | ||||
Publisher | Springer Berlin Heidelberg | Place of Publication | Editor | ||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | 0302-9743 | ISBN | 978-3-642-34165-6 | Medium | |
Area | Expedition | Conference | SSPR&SPR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ FFL2012 | Serial | 2057 | ||
Permanent link to this record | |||||
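The record above combines decoding hypotheses scored by forward and backward language models. As a rough illustration of that idea only (the paper's actual decoder and smoothing are not reproduced; the interpolation weights and dictionaries below are hypothetical), one can score a candidate word sequence under both directions and average the log-probabilities:

```python
import math

def bigram_logprob(sequence, bigrams, unigrams, alpha=0.4):
    """Toy log-probability of a word sequence under a bigram model,
    with naive additive back-off to unigram estimates."""
    lp = 0.0
    for prev, word in zip(sequence, sequence[1:]):
        p = bigrams.get((prev, word), 0.0)
        lp += math.log(p + alpha * unigrams.get(word, 1e-6))
    return lp

def bidirectional_score(sequence, fwd_bigrams, bwd_bigrams, unigrams, lam=0.5):
    """Combine a forward model over the sequence with a backward model
    over the reversed sequence, as in bidirectional decoding."""
    fwd = bigram_logprob(sequence, fwd_bigrams, unigrams)
    bwd = bigram_logprob(list(reversed(sequence)), bwd_bigrams, unigrams)
    return lam * fwd + (1.0 - lam) * bwd
```

Hypotheses that are plausible in both reading directions score higher than ones supported by only one-sided context, which is the intuition behind the reported accuracy gain.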
Author | Volkmar Frinken; Andreas Fischer; Carlos David Martinez Hinarejos | ||||
Title | Handwriting Recognition in Historical Documents using Very Large Vocabularies | Type | Conference Article | ||
Year | 2013 | Publication | 2nd International Workshop on Historical Document Imaging and Processing | Abbreviated Journal | |
Volume | Issue | Pages | 67-72 | ||
Keywords | |||||
Abstract | Language models are used in automatic transcription systems to resolve ambiguities. This is done by limiting the vocabulary of words that can be recognized as well as by estimating the n-gram probabilities of the words in the given text. In the context of historical documents, non-unified spelling and the limited amount of written text pose a substantial problem for the selection of the recognizable vocabulary as well as for the computation of the word probabilities. In this paper, for the transcription of historical Spanish text, we propose to keep the corpus for n-gram estimation limited to a sample of the target text, but to expand the vocabulary with words gathered from external resources. We analyze the performance of such a transcription system with different sizes of external vocabularies and demonstrate its applicability and the significant increase in recognition accuracy obtained by using up to 300 thousand external words. | ||||
Address | Washington; USA; August 2013 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | 978-1-4503-2115-0 | Medium | ||
Area | Expedition | Conference | HIP | ||
Notes | DAG; 600.056; 600.045; 600.061; 602.006; 602.101 | Approved | no | ||
Call Number | Admin @ si @ FFM2013 | Serial | 2296 | ||
Permanent link to this record | |||||
Author | Volkmar Frinken; Andreas Fischer; Horst Bunke; Alicia Fornes | ||||
Title | Co-training for Handwritten Word Recognition | Type | Conference Article | ||
Year | 2011 | Publication | 11th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 314-318 | ||
Keywords | |||||
Abstract | To cope with the tremendous variations of writing styles encountered between different individuals, unconstrained automatic handwriting recognition systems need to be trained on large sets of labeled data. Traditionally, the training data has to be labeled manually, which is a laborious and costly process. Semi-supervised learning techniques offer methods to utilize unlabeled data, which can be obtained cheaply in large amounts, in order to reduce the need for labeled data. In this paper, we propose the use of Co-Training for improving the recognition accuracy of two weakly trained handwriting recognition systems. The first one is based on Recurrent Neural Networks while the second one is based on Hidden Markov Models. On the IAM off-line handwriting database we demonstrate that a significant increase in recognition accuracy can be achieved with Co-Training for single word recognition. | ||||
Address | Beijing, China | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ FFB2011 | Serial | 1789 | ||
Permanent link to this record | |||||
Author | Volkmar Frinken; Andreas Fischer; Markus Baumgartner; Horst Bunke | ||||
Title | Keyword spotting for self-training of BLSTM NN based handwriting recognition systems | Type | Journal Article | ||
Year | 2014 | Publication | Pattern Recognition | Abbreviated Journal | PR |
Volume | 47 | Issue | 3 | Pages | 1073-1082 |
Keywords | Document retrieval; Keyword spotting; Handwriting recognition; Neural networks; Semi-supervised learning | ||||
Abstract | The automatic transcription of unconstrained continuous handwritten text requires well-trained recognition systems. The semi-supervised paradigm introduces the concept of using not only labeled data but also unlabeled data in the learning process. Unlabeled data can be gathered at little or no cost; hence it has the potential to reduce the need for labeling training data, a tedious and costly process. Given a weak initial recognizer trained on labeled data, self-training can be used to recognize unlabeled data and add words that were recognized with high confidence to the training set for re-training. This process is not trivial and requires great care as far as selecting the elements to be added to the training set is concerned. In this paper, we propose to use a bidirectional long short-term memory neural network handwriting recognition system for keyword spotting in order to select new elements. A set of experiments shows the high potential of self-training for bootstrapping handwriting recognition systems, for both modern and historical handwriting, and demonstrates the benefits of using keyword spotting over previously published self-training schemes. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | DAG; 600.077; 602.101 | Approved | no | ||
Call Number | Admin @ si @ FFB2014 | Serial | 2297 | ||
Permanent link to this record | |||||
Author | Volkmar Frinken; Francisco Zamora; Salvador España; Maria Jose Castro; Andreas Fischer; Horst Bunke | ||||
Title | Long-Short Term Memory Neural Networks Language Modeling for Handwriting Recognition | Type | Conference Article | ||
Year | 2012 | Publication | 21st International Conference on Pattern Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 701-704 | ||
Keywords | |||||
Abstract | Unconstrained handwritten text recognition systems maximize the combination of two separate probability scores. The first one is the observation probability that indicates how well the returned word sequence matches the input image. The second score is the probability that reflects how likely a word sequence is according to a language model. Current state-of-the-art recognition systems use statistical language models in the form of bigram word probabilities. This paper proposes to model the target language by means of a recurrent neural network with long short-term memory cells. Because the network is recurrent, the considered context is not limited to a fixed size, especially as the memory cells are designed to deal with long-term dependencies. In a set of experiments conducted on the IAM off-line database, we show the superiority of the proposed language model over statistical n-gram models. | ||||
Address | Tsukuba Science City, Japan | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 1051-4651 | ISBN | 978-1-4673-2216-4 | Medium | |
Area | Expedition | Conference | ICPR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ FZE2012 | Serial | 2052 | ||
Permanent link to this record | |||||
Author | Volkmar Frinken; Markus Baumgartner; Andreas Fischer; Horst Bunke | ||||
Title | Semi-Supervised Learning for Cursive Handwriting Recognition using Keyword Spotting | Type | Conference Article | ||
Year | 2012 | Publication | 13th International Conference on Frontiers in Handwriting Recognition | Abbreviated Journal | |
Volume | Issue | Pages | 49-54 | ||
Keywords | |||||
Abstract | State-of-the-art handwriting recognition systems are learning-based systems that require large sets of training data. The creation of training data, and consequently the creation of a well-performing recognition system, therefore requires a substantial amount of human work. This can be reduced with semi-supervised learning, which uses unlabeled text lines for training as well. Current approaches estimate the correct transcription of the unlabeled data via handwriting recognition, which is not only extremely demanding as far as computational costs are concerned but also requires a good model of the target language. In this paper, we propose a different approach that makes use of keyword spotting, which is significantly faster and does not need any language model. In a set of experiments we demonstrate its superiority over existing approaches. | ||||
Address | Bari, Italy | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | 10.1109/ICFHR.2012.268 | ISBN | 978-1-4673-2262-1 | Medium | |
Area | Expedition | Conference | ICFHR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ FBF2012 | Serial | 2055 | ||
Permanent link to this record | |||||
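The two records above (self-training with keyword spotting, and semi-supervised learning for cursive handwriting) share a common loop: recognize unlabeled samples, keep only high-confidence transcriptions, and retrain. The following is a generic sketch of that loop under stated assumptions — `recognize` is a hypothetical callback standing in for the recognizer plus keyword-spotting confidence, not the papers' actual systems:

```python
def self_train(recognize, labeled, unlabeled, confidence_threshold, rounds=3):
    """Toy self-training loop. `recognize(sample, train)` returns a
    (transcript, confidence) pair given the current training set;
    samples whose confidence clears the threshold join the training
    set for the next round, the rest stay in the unlabeled pool."""
    train = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        kept = []
        for sample in pool:
            transcript, conf = recognize(sample, train)
            if conf >= confidence_threshold:
                train.append((sample, transcript))
            else:
                kept.append(sample)
        if len(kept) == len(pool):  # nothing newly accepted: stop early
            break
        pool = kept
    return train, pool
```

The choice of confidence measure is the crux: the papers argue that keyword-spotting scores select new training elements both faster and more reliably than full-recognition confidences.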
Author | W. Liu; Josep Llados | ||||
Title | Graphics Recognition. Ten Years Review and Future Perspectives | Type | Book Whole | ||
Year | 2006 | Publication | 6th International Workshop | Abbreviated Journal | |
Volume | 3926 | Issue | Pages | ||
Keywords | |||||
Abstract | |||||
Address | Hong Kong (China) | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | GREC | ||
Notes | DAG | Approved | no | ||
Call Number | DAG @ dag @ LiL2006 | Serial | 800 | ||
Permanent link to this record | |||||
Author | W. Niessen; Antonio Lopez; W. Van Enk; P. Van Roermund; Bart M. Ter Haar Romeny; M. Viergever | ||||
Title | Multiscale Trabecular Bone Orientation Analysis. | Type | Miscellaneous | ||
Year | 1997 | Publication | 7th Spanish National Symposium on Pattern Recognition and Image Analysis | Abbreviated Journal |
Volume | Issue | Pages | 19–24 ||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | ADAS @ adas @ NLE1997a | Serial | 66 | ||
Permanent link to this record | |||||
Author | W. Niessen; Antonio Lopez; W. Van Enk; P. Van Roermund; Bart M. Ter Haar Romeny; M. Viergever | ||||
Title | In Vivo Analysis of Trabecular Bone Architecture. | Type | Miscellaneous | ||
Year | 1997 | Publication | Information Processing in Medical Imaging | Abbreviated Journal |
Volume | Issue | Pages | 435–440 ||
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ADAS | Approved | no | ||
Call Number | ADAS @ adas @ NLE1997b | Serial | 67 | ||
Permanent link to this record | |||||
Author | W. Win; B. Bao; Q. Xu; Luis Herranz; Shuqiang Jiang | ||||
Title | Editorial Note: Efficient Multimedia Processing Methods and Applications | Type | Miscellaneous | ||
Year | 2019 | Publication | Multimedia Tools and Applications | Abbreviated Journal | MTAP |
Volume | 78 | Issue | 1 | Pages | |
Keywords | |||||
Abstract | |||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.141; 600.120 | Approved | no | ||
Call Number | Admin @ si @ WBX2019 | Serial | 3257 | ||
Permanent link to this record | |||||
Author | Weijia Wu; Yuzhong Zhao; Zhuang Li; Jiahong Li; Mike Zheng Shou; Umapada Pal; Dimosthenis Karatzas; Xiang Bai | ||||
Title | ICDAR 2023 Competition on Video Text Reading for Dense and Small Text | Type | Conference Article | ||
Year | 2023 | Publication | 17th International Conference on Document Analysis and Recognition | Abbreviated Journal | |
Volume | 14188 | Issue | Pages | 405–419 | |
Keywords | Video Text Spotting; Small Text; Text Tracking; Dense Text | ||||
Abstract | Recently, video text detection, tracking, and recognition in natural scenes have become very popular in the computer vision community. However, most existing algorithms and benchmarks focus on common text cases (e.g., normal size and density) and single scenarios, while ignoring extreme video text challenges, i.e., dense and small text in various scenarios. In this competition report, we establish a video text reading benchmark, named DSText, which focuses on the dense and small text reading challenge in videos with various scenarios. Compared with previous datasets, the proposed dataset mainly introduces three new challenges: 1) dense video texts, a new challenge for video text spotters; 2) high-proportioned small texts; 3) various new scenarios, e.g., ‘Game’, ‘Sports’, etc. The proposed DSText includes 100 video clips from 12 open scenarios, supporting two tasks (i.e., video text tracking (Task 1) and end-to-end video text spotting (Task 2)). During the competition period (opened on 15th February, 2023 and closed on 20th March, 2023), a total of 24 teams participated in the proposed tasks with around 30 valid submissions. In this article, we describe detailed statistical information of the dataset, the tasks, the evaluation protocols, and the results summaries of the ICDAR 2023 DSText competition. Moreover, we hope the benchmark will promote video text research in the community. | ||||
Address | San Jose; CA; USA; August 2023 | ||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | LNCS | ||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | ICDAR | ||
Notes | DAG | Approved | no | ||
Call Number | Admin @ si @ WZL2023 | Serial | 3898 | ||
Permanent link to this record | |||||
Author | Weiqing Min; Shuqiang Jiang; Jitao Sang; Huayang Wang; Xinda Liu; Luis Herranz | ||||
Title | Being a Supercook: Joint Food Attributes and Multimodal Content Modeling for Recipe Retrieval and Exploration | Type | Journal Article | ||
Year | 2017 | Publication | IEEE Transactions on Multimedia | Abbreviated Journal | TMM |
Volume | 19 | Issue | 5 | Pages | 1100 - 1113 |
Keywords | |||||
Abstract | This paper considers the problem of recipe-oriented image-ingredient correlation learning with multi-attributes for recipe retrieval and exploration. Existing methods mainly focus on food visual information for recognition, while we model visual information, textual content (e.g., ingredients), and attributes (e.g., cuisine and course) together to solve extended recipe-oriented problems, such as multimodal cuisine classification and attribute-enhanced food image retrieval. As a solution, we propose a multimodal multitask deep belief network (M3TDBN) to learn a joint image-ingredient representation regularized by different attributes. By grouping ingredients into visible ingredients (which are visible in the food image, e.g., “chicken” and “mushroom”) and nonvisible ingredients (e.g., “salt” and “oil”), M3TDBN is capable of learning both the midlevel visual representation between images and visible ingredients and the nonvisual representation. Furthermore, in order to utilize different attributes to improve the intermodality correlation, M3TDBN incorporates multitask learning to make different attributes collaborate with each other. Based on the proposed M3TDBN, we exploit the derived deep features and the discovered correlations for three novel extended applications: 1) multimodal cuisine classification; 2) attribute-augmented cross-modal recipe image retrieval; and 3) ingredient and attribute inference from food images. The proposed approach is evaluated on the constructed Yummly dataset and the evaluation results have validated its effectiveness. | ||||
Address | |||||
Corporate Author | Thesis | ||||
Publisher | Place of Publication | Editor | |||
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | LAMP; 600.120 | Approved | no | ||
Call Number | Admin @ si @ MJS2017 | Serial | 2964 | ||
Permanent link to this record | |||||
Author | Wenjuan Gong | ||||
Title | 3D Motion Data aided Human Action Recognition and Pose Estimation | Type | Book Whole | ||
Year | 2013 | Publication | PhD Thesis, Universitat Autonoma de Barcelona-CVC | Abbreviated Journal | |
Volume | Issue | Pages | |||
Keywords | |||||
Abstract | In this work, we explore human action recognition and pose estimation problems. Different from traditional works that learn from 2D images or video sequences and their annotated output, we seek to solve the problems with additional 3D motion capture information, which helps to fill the gap between 2D image features and human interpretations. We first compare two different schools of approaches commonly used for 3D pose estimation from a 2D pose configuration: modeling methods and learning methods. By looking into the experimental results and considering our problems, we settled on a learning method for the subsequent pose estimation approaches. We then establish a framework by adding a module for detecting the 2D pose configuration in images with varied backgrounds, which widely extends the applicability of the approach. We also seek to estimate 3D poses directly from image features, instead of estimating 2D poses as an intermediate module. We explore a robust input feature which, combined with the proposed distance measure, provides a solution for noisy or corrupted inputs. We further utilize the above method to estimate weak poses, a concise representation of the original poses obtained using dimensionality reduction technologies, from image features. The weak pose space is where we compute the vocabulary and label action types using a bag-of-words pipeline. Temporal information of an action is taken into consideration by treating several consecutive frames as a single unit for computing vocabulary and histogram assignments. | ||||
Address | Barcelona | ||||
Corporate Author | Thesis | Ph.D. thesis | |||
Publisher | Ediciones Graficas Rey | Place of Publication | Editor | Jordi Gonzalez; Xavier Roca |
Language | Summary Language | Original Title | |||
Series Editor | Series Title | Abbreviated Series Title | |||
Series Volume | Series Issue | Edition | |||
ISSN | ISBN | Medium | |||
Area | Expedition | Conference | |||
Notes | ISE | Approved | no | ||
Call Number | Admin @ si @ Gon2013 | Serial | 2279 | ||
Permanent link to this record |