Author Fahad Shahbaz Khan; Muhammad Anwer Rao; Joost Van de Weijer; Michael Felsberg; J. Laaksonen
  Title Compact color texture description for texture classification Type Journal Article
  Year 2015 Publication Pattern Recognition Letters Abbreviated Journal PRL  
  Volume 51 Issue Pages 16-22  
  Keywords  
  Abstract (up) Describing textures is a challenging problem in computer vision and pattern recognition. The classification problem involves assigning a category label to the texture class it belongs to. Several factors such as variations in scale, illumination and viewpoint make the problem of texture description extremely challenging. A variety of histogram based texture representations exists in literature. However, combining multiple texture descriptors and assessing their complementarity is still an open research problem. In this paper, we first show that combining multiple local texture descriptors significantly improves the recognition performance compared to using a single best method alone. This gain in performance is achieved at the cost of a high-dimensional final image representation. To counter this problem, we propose to use an information-theoretic compression technique to obtain a compact texture description without any significant loss in accuracy. In addition, we perform a comprehensive evaluation of pure color descriptors, popular in object recognition, for the problem of texture classification. Experiments are performed on four challenging texture datasets, namely KTH-TIPS-2a, KTH-TIPS-2b, FMD and Texture-10. The experiments clearly demonstrate that our proposed compact multi-texture approach outperforms the single best texture method alone. In all cases, discriminative color names outperform other color features for texture classification. Finally, we show that combining discriminative color names with compact texture representation outperforms state-of-the-art methods by 7.8%, 4.3% and 5.0% on KTH-TIPS-2a, KTH-TIPS-2b and Texture-10 datasets respectively.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.068; 600.079;ADAS Approved no  
  Call Number Admin @ si @ KRW2015a Serial 2587  
Permanent link to this record
 

 
Author Muhammad Anwer Rao; Fahad Shahbaz Khan; Joost Van de Weijer; Matthieu Molinier; Jorma Laaksonen
  Title Binary patterns encoded convolutional neural networks for texture recognition and remote sensing scene classification Type Journal Article
  Year 2018 Publication ISPRS Journal of Photogrammetry and Remote Sensing Abbreviated Journal ISPRS J  
  Volume 138 Issue Pages 74-85  
  Keywords Remote sensing; Deep learning; Scene classification; Local Binary Patterns; Texture analysis  
  Abstract (up) Designing discriminative powerful texture features robust to realistic imaging conditions is a challenging computer vision problem with many applications, including material recognition and analysis of satellite or aerial imagery. In the past, most texture description approaches were based on dense orderless statistical distribution of local features. However, most recent approaches to texture recognition and remote sensing scene classification are based on Convolutional Neural Networks (CNNs). The de facto practice when learning these CNN models is to use RGB patches as input with training performed on large amounts of labeled data (ImageNet). In this paper, we show that Local Binary Patterns (LBP) encoded CNN models, codenamed TEX-Nets, trained using mapped coded images with explicit LBP based texture information provide complementary information to the standard RGB deep models. Additionally, two deep architectures, namely early and late fusion, are investigated to combine the texture and color information. To the best of our knowledge, we are the first to investigate Binary Patterns encoded CNNs and different deep network fusion architectures for texture recognition and remote sensing scene classification. We perform comprehensive experiments on four texture recognition datasets and four remote sensing scene classification benchmarks: UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with 7 categories and the recently introduced large scale aerial image dataset (AID) with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary information to standard RGB deep model of the same network architecture. Our late fusion TEX-Net architecture always improves the overall performance compared to the standard RGB network on both recognition problems. Furthermore, our final combination leads to consistent improvement over the state-of-the-art for remote sensing scene  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes LAMP; 600.109; 600.106; 600.120 Approved no  
  Call Number Admin @ si @ RKW2018 Serial 3158  
Permanent link to this record
 

 
Author Alejandro Gonzalez Alzate; Zhijie Fang; Yainuvis Socarras; Joan Serrat; David Vazquez; Jiaolong Xu; Antonio Lopez
  Title Pedestrian Detection at Day/Night Time with Visible and FIR Cameras: A Comparison Type Journal Article
  Year 2016 Publication Sensors Abbreviated Journal SENS  
  Volume 16 Issue 6 Pages 820  
  Keywords Pedestrian Detection; FIR  
  Abstract (up) Despite all the significant advances in pedestrian detection brought by computer vision for driving assistance, it is still a challenging problem. One reason is the extremely varying lighting conditions under which such a detector should operate, namely day and night time. Recent research has shown that the combination of visible and non-visible imaging modalities may increase detection accuracy, where the infrared spectrum plays a critical role. The goal of this paper is to assess the accuracy gain of different pedestrian models (holistic, part-based, patch-based) when training with images in the far infrared spectrum. Specifically, we want to compare detection accuracy on test images recorded at day and nighttime if trained (and tested) using (a) plain color images, (b) just infrared images and (c) both of them. In order to obtain results for the last item we propose an early fusion approach to combine features from both modalities. We base the evaluation on a new dataset we have built for this purpose as well as on the publicly available KAIST multispectral dataset.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1424-8220 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.085; 600.076; 600.082; 601.281 Approved no  
  Call Number ADAS @ adas @ GFS2016 Serial 2754  
Permanent link to this record
 

 
Author Anjan Dutta; Pau Riba; Josep Llados; Alicia Fornes
  Title Hierarchical Stochastic Graphlet Embedding for Graph-based Pattern Recognition Type Journal Article
  Year 2020 Publication Neural Computing and Applications Abbreviated Journal NEUCOMA  
  Volume 32 Issue Pages 11579–11596  
  Keywords  
  Abstract (up) Despite being very successful within the pattern recognition and machine learning community, graph-based methods are often unusable because of the lack of mathematical operations defined in graph domain. Graph embedding, which maps graphs to a vectorial space, has been proposed as a way to tackle these difficulties enabling the use of standard machine learning techniques. However, it is well known that graph embedding functions usually suffer from the loss of structural information. In this paper, we consider the hierarchical structure of a graph as a way to mitigate this loss of information. The hierarchical structure is constructed by topologically clustering the graph nodes and considering each cluster as a node in the upper hierarchical level. Once this hierarchical structure is constructed, we consider several configurations to define the mapping into a vector space given a classical graph embedding; in particular, we propose to make use of the stochastic graphlet embedding (SGE). Broadly speaking, SGE produces a distribution of uniformly sampled low-to-high-order graphlets as a way to embed graphs into the vector space. In what follows, the coarse-to-fine structure of a graph hierarchy and the statistics fetched by the SGE complement each other and include important structural information with varied contexts. Altogether, these two techniques substantially cope with the usual information loss involved in graph embedding techniques, obtaining a more robust graph representation. This fact has been corroborated through a detailed experimental evaluation on various benchmark graph datasets, where we outperform the state-of-the-art methods.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.140; 600.121; 600.141 Approved no  
  Call Number Admin @ si @ DRL2020 Serial 3348  
Permanent link to this record
 

 
Author Wenjuan Gong; Zhang Yue; Wei Wang; Cheng Peng; Jordi Gonzalez
  Title Meta-MMFNet: Meta-Learning Based Multi-Model Fusion Network for Micro-Expression Recognition Type Journal Article
  Year 2022 Publication ACM Transactions on Multimedia Computing, Communications, and Applications Abbreviated Journal ACMTMC  
  Volume Issue Pages  
  Keywords Feature Fusion; Model Fusion; Meta-Learning; Micro-Expression Recognition  
  Abstract (up) Despite its wide applications in criminal investigations and clinical communications with patients suffering from autism, automatic micro-expression recognition remains a challenging problem because of the lack of training data and imbalanced classes problems. In this study, we proposed a meta-learning based multi-model fusion network (Meta-MMFNet) to solve the existing problems. The proposed method is based on the metric-based meta-learning pipeline, which is specifically designed for few-shot learning and is suitable for model-level fusion. The frame difference and optical flow features were fused, deep features were extracted from the fused feature, and finally in the meta-learning-based framework, weighted sum model fusion method was applied for micro-expression classification. Meta-MMFNet achieved better results than state-of-the-art methods on four datasets. The code is available at https://github.com/wenjgong/meta-fusion-based-method.  
  Address May 2022  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE; 600.157 Approved no  
  Call Number Admin @ si @ GYW2022 Serial 3692  
Permanent link to this record
 

 
Author Wenjuan Gong; Yue Zhang; Wei Wang; Peng Cheng; Jordi Gonzalez
  Title Meta-MMFNet: Meta-learning-based Multi-model Fusion Network for Micro-expression Recognition Type Journal Article
  Year 2023 Publication ACM Transactions on Multimedia Computing, Communications, and Applications Abbreviated Journal TMCCA  
  Volume 20 Issue 2 Pages 1–20  
  Keywords  
  Abstract (up) Despite its wide applications in criminal investigations and clinical communications with patients suffering from autism, automatic micro-expression recognition remains a challenging problem because of the lack of training data and imbalanced classes problems. In this study, we proposed a meta-learning-based multi-model fusion network (Meta-MMFNet) to solve the existing problems. The proposed method is based on the metric-based meta-learning pipeline, which is specifically designed for few-shot learning and is suitable for model-level fusion. The frame difference and optical flow features were fused, deep features were extracted from the fused feature, and finally in the meta-learning-based framework, weighted sum model fusion method was applied for micro-expression classification. Meta-MMFNet achieved better results than state-of-the-art methods on four datasets. The code is available at https://github.com/wenjgong/meta-fusion-based-method.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ISE Approved no  
  Call Number Admin @ si @ GZW2023 Serial 3862  
Permanent link to this record
 

 
Author Meysam Madadi; Sergio Escalera; Xavier Baro; Jordi Gonzalez
  Title End-to-end Global to Local CNN Learning for Hand Pose Recovery in Depth data Type Journal Article
  Year 2022 Publication IET Computer Vision Abbreviated Journal IETCV  
  Volume 16 Issue 1 Pages 50-66  
  Keywords Computer vision; data acquisition; human computer interaction; learning (artificial intelligence); pose estimation  
  Abstract (up) Despite recent advances in 3D pose estimation of human hands, especially thanks to the advent of CNNs and depth cameras, this task is still far from being solved. This is mainly due to the highly non-linear dynamics of fingers, which make hand model training a challenging task. In this paper, we exploit a novel hierarchical tree-like structured CNN, in which branches are trained to become specialized in predefined subsets of hand joints, called local poses. We further fuse local pose features, extracted from hierarchical CNN branches, to learn higher order dependencies among joints in the final pose by end-to-end training. Lastly, the loss function used is also defined to incorporate appearance and physical constraints about doable hand motion and deformation. Finally, we introduce a non-rigid data augmentation approach to increase the amount of training depth data. Experimental results suggest that feeding a tree-shaped CNN, specialized in local poses, into a fusion network for modeling joints correlations and dependencies, helps to increase the precision of final estimations, outperforming state-of-the-art results on NYU and SyntheticHand datasets.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes HUPBA; ISE; 600.098; 600.119 Approved no  
  Call Number Admin @ si @ MEB2022 Serial 3652  
Permanent link to this record
 

 
Author Alejandro Gonzalez Alzate; David Vazquez; Antonio Lopez; Jaume Amores
  Title On-Board Object Detection: Multicue, Multimodal, and Multiview Random Forest of Local Experts Type Journal Article
  Year 2017 Publication IEEE Transactions on cybernetics Abbreviated Journal Cyber  
  Volume 47 Issue 11 Pages 3980-3990  
  Keywords Multicue; multimodal; multiview; object detection  
  Abstract (up) Despite recent significant advances, object detection continues to be an extremely challenging problem in real scenarios. In order to develop a detector that successfully operates under these conditions, it becomes critical to leverage upon multiple cues, multiple imaging modalities, and a strong multiview (MV) classifier that accounts for different object views and poses. In this paper, we provide an extensive evaluation that gives insight into how each of these aspects (multicue, multimodality, and strong MV classifier) affect accuracy both individually and when integrated together. In the multimodality component, we explore the fusion of RGB and depth maps obtained by high-definition light detection and ranging, a type of modality that is starting to receive increasing attention. As our analysis reveals, although all the aforementioned aspects significantly help in improving the accuracy, the fusion of visible spectrum and depth information allows to boost the accuracy by a much larger margin. The resulting detector not only ranks among the top best performers in the challenging KITTI benchmark, but it is built upon very simple blocks that are easy to implement and computationally efficient.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 2168-2267 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.085; 600.082; 600.076; 600.118 Approved no  
  Call Number Admin @ si @ Serial 2810  
Permanent link to this record
 

 
Author Debora Gil; Oriol Rodriguez-Leor; Petia Radeva; J. Mauri
  Title Myocardial Perfusion Characterization From Contrast Angiography Spectral Distribution Type Journal Article
  Year 2008 Publication IEEE Transactions on Medical Imaging Abbreviated Journal  
  Volume 27 Issue 5 Pages 641-649  
  Keywords Contrast angiography; myocardial perfusion; spectral analysis.  
  Abstract (up) Despite recovering a normal coronary flow after acute myocardial infarction, percutaneous coronary intervention does not guarantee a proper perfusion (irrigation) of the infarcted area. This damage in microcirculation integrity may detrimentally affect the patient survival. Visual assessment of the myocardium opacification in contrast angiography serves to define a subjective score of the microcirculation integrity, myocardial blush analysis (MBA). Although MBA correlates with patient prognosis, its visual assessment is a very difficult task that requires highly expert training in order to achieve a good intraobserver and interobserver agreement. In this paper, we provide objective descriptors of the myocardium staining pattern by analyzing the spectrum of the image local statistics. The descriptors proposed discriminate among the different phenomena observed in the angiographic sequence and allow defining an objective score of the myocardial perfusion.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes IAM;MILAB Approved no  
  Call Number IAM @ iam @ GRR2008 Serial 1541  
Permanent link to this record
 

 
Author Onur Ferhat; Fernando Vilariño
  Title Low Cost Eye Tracking: The Current Panorama Type Journal Article
  Year 2016 Publication Computational Intelligence and Neuroscience Abbreviated Journal CIN  
  Volume Issue Pages Article ID 8680541  
  Keywords  
  Abstract (up) Despite the availability of accurate, commercial gaze tracker devices working with infrared (IR) technology, visible light gaze tracking constitutes an interesting alternative by allowing scalability and removing hardware requirements. Over the last years, this field has seen examples of research showing performance comparable to the IR alternatives. In this work, we survey the previous work on remote, visible light gaze trackers and analyze the explored techniques from various perspectives such as calibration strategies, head pose invariance, and gaze estimation techniques. We also provide information on related aspects of research such as public datasets to test against, open source projects to build upon, and gaze tracking services to directly use in applications. With all this information, we aim to provide the contemporary and future researchers with a map detailing previously explored ideas and the required tools.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MV; 605.103; 600.047; 600.097;SIAI Approved no  
  Call Number Admin @ si @ FeV2016 Serial 2744  
Permanent link to this record
 

 
Author Jiaolong Xu; David Vazquez; Antonio Lopez; Javier Marin; Daniel Ponsa
  Title Learning a Part-based Pedestrian Detector in Virtual World Type Journal Article
  Year 2014 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS  
  Volume 15 Issue 5 Pages 2121-2131  
  Keywords Domain Adaptation; Pedestrian Detection; Virtual Worlds  
  Abstract (up) Detecting pedestrians with on-board vision systems is of paramount interest for assisting drivers to prevent vehicle-to-pedestrian accidents. The core of a pedestrian detector is its classification module, which aims at deciding if a given image window contains a pedestrian. Given the difficulty of this task, many classifiers have been proposed during the last fifteen years. Among them, the so-called (deformable) part-based classifiers including multi-view modeling are usually top ranked in accuracy. Training such classifiers is not trivial since a proper aspect clustering and spatial part alignment of the pedestrian training samples are crucial for obtaining an accurate classifier. In this paper, first we perform automatic aspect clustering and part alignment by using virtual-world pedestrians, i.e., human annotations are not required. Second, we use a mixture-of-parts approach that allows part sharing among different aspects. Third, these proposals are integrated in a learning framework which also allows to incorporate real-world training data to perform domain adaptation between virtual- and real-world cameras. Overall, the obtained results on four popular on-board datasets show that our proposal clearly outperforms the state-of-the-art deformable part-based detector known as latent SVM.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1931-0587 ISBN 978-1-4673-2754-1 Medium  
  Area Expedition Conference  
  Notes ADAS; 600.076 Approved no  
  Call Number ADAS @ adas @ XVL2014 Serial 2433  
Permanent link to this record
 

 
Author Jose Manuel Alvarez; Antonio Lopez; Theo Gevers; Felipe Lumbreras
  Title Combining Priors, Appearance and Context for Road Detection Type Journal Article
  Year 2014 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS  
  Volume 15 Issue 3 Pages 1168-1178  
  Keywords Illuminant invariance; lane markings; road detection; road prior; road scene understanding; vanishing point; 3-D scene layout  
  Abstract (up) Detecting the free road surface ahead of a moving vehicle is an important research topic in different areas of computer vision, such as autonomous driving or car collision warning.
Current vision-based road detection methods are usually based solely on low-level features. Furthermore, they generally assume structured roads, road homogeneity, and uniform lighting conditions, constraining their applicability in real-world scenarios. In this paper, road priors and contextual information are introduced for road detection. First, we propose an algorithm to estimate road priors online using geographical information, providing relevant initial information about the road location. Then, contextual cues, including horizon lines, vanishing points, lane markings, 3-D scene layout, and road geometry, are used in addition to low-level cues derived from the appearance of roads. Finally, a generative model is used to combine these cues and priors, leading to a road detection method that is, to a large degree, robust to varying imaging conditions, road types, and scenarios.
 
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1524-9050 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.076;ISE Approved no  
  Call Number Admin @ si @ ALG2014 Serial 2501  
Permanent link to this record
 

 
Author Antonio Lopez; Joan Serrat; Cristina Cañero; Felipe Lumbreras; T. Graf
  Title Robust lane markings detection and road geometry computation Type Journal Article
  Year 2010 Publication International Journal of Automotive Technology Abbreviated Journal IJAT  
  Volume 11 Issue 3 Pages 395–407  
  Keywords lane markings  
  Abstract (up) Detection of lane markings based on a camera sensor can be a low-cost solution to lane departure and curve-over-speed warnings. A number of methods and implementations have been reported in the literature. However, reliable detection is still an issue because of, for example, cast shadows, worn and occluded markings, and variable ambient lighting conditions. We focus on increasing detection reliability in two ways. First, we employed an image feature other than the commonly used edges: ridges, which we claim addresses this problem better. Second, we adapted RANSAC, a generic robust estimation method, to fit a parametric model of a pair of lane lines to the image features, based on both ridgeness and ridge orientation. In addition, the model was fitted for the left and right lane lines simultaneously to enforce a consistent result. Four measures of interest for driver assistance applications were directly computed from the fitted parametric model at each frame: lane width, lane curvature, and vehicle yaw angle and lateral offset with regard to the lane medial axis. We qualitatively assessed our method in video sequences captured on several road types and under very different lighting conditions. We also quantitatively assessed it on synthetic but realistic video sequences for which road geometry and vehicle trajectory ground truth are known.  
  Address  
  Corporate Author Thesis  
  Publisher The Korean Society of Automotive Engineers Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN 1229-9138 ISBN Medium  
  Area Expedition Conference  
  Notes ADAS Approved no  
  Call Number ADAS @ adas @ LSC2010 Serial 1300  
Permanent link to this record
 

 
Author Sounak Dey; Palaiahnakote Shivakumara; K.S. Raghunanda; Umapada Pal; Tong Lu; G. Hemantha Kumar; Chee Seng Chan
  Title Script independent approach for multi-oriented text detection in scene image Type Journal Article
  Year 2017 Publication Neurocomputing Abbreviated Journal NEUCOM  
  Volume 242 Issue Pages 96-112  
  Keywords  
  Abstract (up) Developing a text detection method which is invariant to scripts in natural scene images is a challenging task due to different geometrical structures of various scripts. Besides, multi-oriented of text lines in natural scene images make the problem more challenging. This paper proposes to explore ring radius transform (RRT) for text detection in multi-oriented and multi-script environments. The method finds component regions based on convex hull to generate radius matrices using RRT. It is a fact that RRT provides low radius values for the pixels that are near to edges, constant radius values for the pixels that represent stroke width, and high radius values that represent holes created in background and convex hull because of the regular structures of text components. We apply k-means clustering on the radius matrices to group such spatially coherent regions into individual clusters. Then the proposed method studies the radius values of such cluster components that are close to the centroid and far from the centroid to detect text components. Furthermore, we have developed a Bangla dataset (named as ISI-UM dataset) and propose a semi-automatic system for generating its ground truth for text detection of arbitrary orientations, which can be used by the researchers for text detection and recognition in the future. The ground truth will be released to public. Experimental results on our ISI-UM data and other standard datasets, namely, ICDAR 2013 scene, SVT and MSRA data, show that the proposed method outperforms the existing methods in terms of multi-lingual and multi-oriented text detection ability.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ DSR2017 Serial 3260  
Permanent link to this record
 

 
Author Estefania Talavera; Carolin Wuerich; Nicolai Petkov; Petia Radeva
  Title Topic modelling for routine discovery from egocentric photo-streams Type Journal Article
  Year 2020 Publication Pattern Recognition Abbreviated Journal PR  
  Volume 104 Issue Pages 107330  
  Keywords Routine; Egocentric vision; Lifestyle; Behaviour analysis; Topic modelling  
  Abstract (up) Developing tools to understand and visualize lifestyle is of high interest when addressing the improvement of habits and well-being of people. Routine, defined as the usual things that a person does daily, helps describe the individuals’ lifestyle. With this paper, we are the first ones to address the development of novel tools for automatic discovery of routine days of an individual from his/her egocentric images. In the proposed model, sequences of images are firstly characterized by semantic labels detected by pre-trained CNNs. Then, these features are organized in temporal-semantic documents to later be embedded into a topic models space. Finally, Dynamic-Time-Warping and Spectral-Clustering methods are used for final day routine/non-routine discrimination. Moreover, we introduce a new EgoRoutine-dataset, a collection of 104 egocentric days with more than 100.000 images recorded by 7 users. Results show that routine can be discovered and behavioural patterns can be observed.  
  Address  
  Corporate Author Thesis  
  Publisher Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes MILAB; no proj Approved no  
  Call Number Admin @ si @ TWP2020 Serial 3435  
Permanent link to this record