|   | 
Details
   web
Records
Author (down) Tadashi Araki; Nobutaka Ikeda; Nilanjan Dey; Sayan Chakraborty; Luca Saba; Dinesh Kumar; Elisa Cuadrado Godia; Xiaoyi Jiang; Ajay Gupta; Petia Radeva; John R. Laird; Andrew Nicolaides; Jasjit S. Suri
Title A comparative approach of four different image registration techniques for quantitative assessment of coronary artery calcium lesions using intravascular ultrasound Type Journal Article
Year 2015 Publication Computer Methods and Programs in Biomedicine Abbreviated Journal CMPB
Volume 118 Issue 2 Pages 158-172
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB Approved no
Call Number Admin @ si @ AID2015 Serial 2640
Permanent link to this record
 

 
Author (down) T.Chauhan; E.Perales; Kaida Xiao; E.Hird ; Dimosthenis Karatzas; Sophie Wuerger
Title The achromatic locus: Effect of navigation direction in color space Type Journal Article
Year 2014 Publication Journal of Vision Abbreviated Journal VSS
Volume 14 (1) Issue 25 Pages 1-11
Keywords achromatic; unique hues; color constancy; luminance; color space
Abstract 5Y Impact Factor: 2.99 / 1st (Ophthalmology)
An achromatic stimulus is defined as a patch of light that is devoid of any hue. This is usually achieved by asking observers to adjust the stimulus such that it looks neither red nor green and at the same time neither yellow nor blue. Despite the theoretical and practical importance of the achromatic locus, little is known about the variability in these settings. The main purpose of the current study was to evaluate whether achromatic settings were dependent on the task of the observers, namely the navigation direction in color space. Observers could either adjust the test patch along the two chromatic axes in the CIE u*v* diagram or, alternatively, navigate along the unique-hue lines. Our main result is that the navigation method affects the reliability of these achromatic settings. Observers are able to make more reliable achromatic settings when adjusting the test patch along the directions defined by the four unique hues as opposed to navigating along the main axes in the commonly used CIE u*v* chromaticity plane. This result holds across different ambient viewing conditions (Dark, Daylight, Cool White Fluorescent) and different test luminance levels (5, 20, and 50 cd/m2). The reduced variability in the achromatic settings is consistent with the idea that internal color representations are more aligned with the unique-hue lines than the u* and v* axes.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.077 Approved no
Call Number Admin @ si @ CPX2014 Serial 2418
Permanent link to this record
 

 
Author (down) T. Widemann; Xavier Otazu
Title Titanias radius and an upper limit on its atmosphere from the September 8, 2001 stellar occultation Type Journal Article
Year 2009 Publication International Journal of Solar System Studies Abbreviated Journal
Volume 199 Issue 2 Pages 458–476
Keywords Occultations; Uranus, satellites; Satellites, shapes; Satellites, dynamics; Ices; Satellites, atmospheres
Abstract On September 8, 2001 around 2 h UT, the largest uranian moon, Titania, occulted Hipparcos star 106829 (alias SAO 164538, a V=7.2, K0 III star). This was the first-ever observed occultation by this satellite, a rare event as Titania subtends only 0.11 arcsec on the sky. The star's unusual brightness allowed many observers, both amateurs or professionals, to monitor this unique event, providing fifty-seven occultations chords over three continents, all reported here. Selecting the best 27 occultation chords, and assuming a circular limb, we derive Titania's radius: View the MathML source (1-σ error bar). This implies a density of View the MathML source using the value View the MathML source derived by Taylor [Taylor, D.B., 1998. Astron. Astrophys. 330, 362–374]. We do not detect any significant difference between equatorial and polar radii, in the limit View the MathML source, in agreement with Voyager limb image retrieval during the 1986 flyby. Titania's offset with respect to the DE405 + URA027 (based on GUST86 theory) ephemeris is derived: ΔαTcos(δT)=−108±13 mas and ΔδT=−62±7 mas (ICRF J2000.0 system). Most of this offset is attributable to a Uranus' barycentric offset with respect to DE405, that we estimate to be: View the MathML source and ΔδU=−85±25 mas at the moment of occultation. This offset is confirmed by another Titania stellar occultation observed on August 1st, 2003, which provides an offset of ΔαTcos(δT)=−127±20 mas and ΔδT=−97±13 mas for the satellite. The combined ingress and egress data do not show any significant hint for atmospheric refraction, allowing us to set surface pressure limits at the level of 10–20 nbar. More specifically, we find an upper limit of 13 nbar (1-σ level) at 70 K and 17 nbar at 80 K, for a putative isothermal CO2 atmosphere. We also provide an upper limit of 8 nbar for a possible CH4 atmosphere, and 22 nbar for pure N2, again at the 1-σ level. We finally constrain the stellar size using the time-resolved star disappearance and reappearance at ingress and egress. We find an angular diameter of 0.54±0.03 mas (corresponding to View the MathML source projected at Titania). With a distance of 170±25 parsecs, this corresponds to a radius of 9.8±0.2 solar radii for HIP 106829, typical of a K0 III giant.
Address
Corporate Author Thesis
Publisher ELSEVIER Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0019-1035 ISBN Medium
Area Expedition Conference
Notes CIC Approved no
Call Number CAT @ cat @ Wid2009 Serial 1052
Permanent link to this record
 

 
Author (down) T. Mouats; N. Aouf; Angel Sappa; Cristhian A. Aguilera-Carrasco; Ricardo Toledo
Title Multi-Spectral Stereo Odometry Type Journal Article
Year 2015 Publication IEEE Transactions on Intelligent Transportation Systems Abbreviated Journal TITS
Volume 16 Issue 3 Pages 1210-1224
Keywords Egomotion estimation; feature matching; multispectral odometry (MO); optical flow; stereo odometry; thermal imagery
Abstract In this paper, we investigate the problem of visual odometry for ground vehicles based on the simultaneous utilization of multispectral cameras. It encompasses a stereo rig composed of an optical (visible) and thermal sensors. The novelty resides in the localization of the cameras as a stereo setup rather
than two monocular cameras of different spectrums. To the best of our knowledge, this is the first time such task is attempted. Log-Gabor wavelets at different orientations and scales are used to extract interest points from both images. These are then described using a combination of frequency and spatial information within the local neighborhood. Matches between the pairs of multimodal images are computed using the cosine similarity function based
on the descriptors. Pyramidal Lucas–Kanade tracker is also introduced to tackle temporal feature matching within challenging sequences of the data sets. The vehicle egomotion is computed from the triangulated 3-D points corresponding to the matched features. A windowed version of bundle adjustment incorporating
Gauss–Newton optimization is utilized for motion estimation. An outlier removal scheme is also included within the framework to deal with outliers. Multispectral data sets were generated and used as test bed. They correspond to real outdoor scenarios captured using our multimodal setup. Finally, detailed results validating the proposed strategy are illustrated.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1524-9050 ISBN Medium
Area Expedition Conference
Notes ADAS; 600.055; 600.076 Approved no
Call Number Admin @ si @ MAS2015a Serial 2533
Permanent link to this record
 

 
Author (down) Swathikiran Sudhakaran; Sergio Escalera;Oswald Lanz
Title Learning to Recognize Actions on Objects in Egocentric Video with Attention Dictionaries Type Journal Article
Year 2021 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume Issue Pages
Keywords
Abstract We present EgoACO, a deep neural architecture for video action recognition that learns to pool action-context-object descriptors from frame level features by leveraging the verb-noun structure of action labels in egocentric video datasets. The core component of EgoACO is class activation pooling (CAP), a differentiable pooling operation that combines ideas from bilinear pooling for fine-grained recognition and from feature learning for discriminative localization. CAP uses self-attention with a dictionary of learnable weights to pool from the most relevant feature regions. Through CAP, EgoACO learns to decode object and scene context descriptors from video frame features. For temporal modeling in EgoACO, we design a recurrent version of class activation pooling termed Long Short-Term Attention (LSTA). LSTA extends convolutional gated LSTM with built-in spatial attention and a re-designed output gate. Action, object and context descriptors are fused by a multi-head prediction that accounts for the inter-dependencies between noun-verb-action structured labels in egocentric video datasets. EgoACO features built-in visual explanations, helping learning and interpretation. Results on the two largest egocentric action recognition datasets currently available, EPIC-KITCHENS and EGTEA, show that by explicitly decoding action-context-object descriptors, EgoACO achieves state-of-the-art recognition performance.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no proj Approved no
Call Number Admin @ si @ SEL2021 Serial 3656
Permanent link to this record
 

 
Author (down) Swathikiran Sudhakaran; Sergio Escalera; Oswald Lanz
Title Gate-Shift-Fuse for Video Action Recognition Type Journal Article
Year 2023 Publication IEEE Transactions on Pattern Analysis and Machine Intelligence Abbreviated Journal TPAMI
Volume 45 Issue 9 Pages 10913-10928
Keywords Action Recognition; Video Classification; Spatial Gating; Channel Fusion
Abstract Convolutional Neural Networks are the de facto models for image recognition. However 3D CNNs, the straight forward extension of 2D CNNs for video recognition, have not achieved the same success on standard action recognition benchmarks. One of the main reasons for this reduced performance of 3D CNNs is the increased computational complexity requiring large scale annotated datasets to train them in scale. 3D kernel factorization approaches have been proposed to reduce the complexity of 3D CNNs. Existing kernel factorization approaches follow hand-designed and hard-wired techniques. In this paper we propose Gate-Shift-Fuse (GSF), a novel spatio-temporal feature extraction module which controls interactions in spatio-temporal decomposition and learns to adaptively route features through time and combine them in a data dependent manner. GSF leverages grouped spatial gating to decompose input tensor and channel weighting to fuse the decomposed tensors. GSF can be inserted into existing 2D CNNs to convert them into an efficient and high performing spatio-temporal feature extractor, with negligible parameter and compute overhead. We perform an extensive analysis of GSF using two popular 2D CNN families and achieve state-of-the-art or competitive performance on five standard action recognition benchmarks.
Address 1 Sept. 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes HUPBA; no menciona Approved no
Call Number Admin @ si @ SEL2023 Serial 3814
Permanent link to this record
 

 
Author (down) Svebor Karaman; Giuseppe Lisanti; Andrew Bagdanov; Alberto del Bimbo
Title Leveraging local neighborhood topology for large scale person re-identification Type Journal Article
Year 2014 Publication Pattern Recognition Abbreviated Journal PR
Volume 47 Issue 12 Pages 3767–3778
Keywords Re-identification; Conditional random field; Semi-supervised; ETHZ; CAVIAR; 3DPeS; CMV100
Abstract In this paper we describe a semi-supervised approach to person re-identification that combines discriminative models of person identity with a Conditional Random Field (CRF) to exploit the local manifold approximation induced by the nearest neighbor graph in feature space. The linear discriminative models learned on few gallery images provides coarse separation of probe images into identities, while a graph topology defined by distances between all person images in feature space leverages local support for label propagation in the CRF. We evaluate our approach using multiple scenarios on several publicly available datasets, where the number of identities varies from 28 to 191 and the number of images ranges between 1003 and 36 171. We demonstrate that the discriminative model and the CRF are complementary and that the combination of both leads to significant improvement over state-of-the-art approaches. We further demonstrate how the performance of our approach improves with increasing test data and also with increasing amounts of additional unlabeled data.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; 601.240; 600.079 Approved no
Call Number Admin @ si @ KLB2014a Serial 2522
Permanent link to this record
 

 
Author (down) Svebor Karaman; Andrew Bagdanov; Lea Landucci; Gianpaolo D'Amico; Andrea Ferracani; Daniele Pezzatini; Alberto del Bimbo
Title Personalized multimedia content delivery on an interactive table by passive observation of museum visitors Type Journal Article
Year 2016 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 75 Issue 7 Pages 3787-3811
Keywords Computer vision; Video surveillance; Cultural heritage; Multimedia museum; Personalization; Natural interaction; Passive profiling
Abstract The amount of multimedia data collected in museum databases is growing fast, while the capacity of museums to display information to visitors is acutely limited by physical space. Museums must seek the perfect balance of information given on individual pieces in order to provide sufficient information to aid visitor understanding while maintaining sparse usage of the walls and guaranteeing high appreciation of the exhibit. Moreover, museums often target the interests of average visitors instead of the entire spectrum of different interests each individual visitor might have. Finally, visiting a museum should not be an experience contained in the physical space of the museum but a door opened onto a broader context of related artworks, authors, artistic trends, etc. In this paper we describe the MNEMOSYNE system that attempts to address these issues through a new multimedia museum experience. Based on passive observation, the system builds a profile of the artworks of interest for each visitor. These profiles of interest are then used to drive an interactive table that personalizes multimedia content delivery. The natural user interface on the interactive table uses the visitor’s profile, an ontology of museum content and a recommendation system to personalize exploration of multimedia content. At the end of their visit, the visitor can take home a personalized summary of their visit on a custom mobile application. In this article we describe in detail each component of our approach as well as the first field trials of our prototype system built and deployed at our permanent exhibition space at LeMurate (http://www.lemurate.comune.fi.it/lemurate/) in Florence together with the first results of the evaluation process during the official installation in the National Museum of Bargello (http://www.uffizi.firenze.it/musei/?m=bargello).
Address
Corporate Author Thesis
Publisher Springer US Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1380-7501 ISBN Medium
Area Expedition Conference
Notes LAMP; 601.240; 600.079 Approved no
Call Number Admin @ si @ KBL2016 Serial 2520
Permanent link to this record
 

 
Author (down) Susana Alvarez; Maria Vanrell
Title Texton theory revisited: a bag-of-words approach to combine textons Type Journal Article
Year 2012 Publication Pattern Recognition Abbreviated Journal PR
Volume 45 Issue 12 Pages 4312-4325
Keywords
Abstract The aim of this paper is to revisit an old theory of texture perception and
update its computational implementation by extending it to colour. With this in mind we try to capture the optimality of perceptual systems. This is achieved in the proposed approach by sharing well-known early stages of the visual processes and extracting low-dimensional features that perfectly encode adequate properties for a large variety of textures without needing further learning stages. We propose several descriptors in a bag-of-words framework that are derived from different quantisation models on to the feature spaces. Our perceptual features are directly given by the shape and colour attributes of image blobs, which are the textons. In this way we avoid learning visual words and directly build the vocabularies on these lowdimensionaltexton spaces. Main differences between proposed descriptors rely on how co-occurrence of blob attributes is represented in the vocabularies. Our approach overcomes current state-of-art in colour texture description which is proved in several experiments on large texture datasets.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 0031-3203 ISBN Medium
Area Expedition Conference
Notes CIC Approved no
Call Number Admin @ si @ AlV2012a Serial 2130
Permanent link to this record
 

 
Author (down) Susana Alvarez; Anna Salvatella; Maria Vanrell; Xavier Otazu
Title Low-dimensional and Comprehensive Color Texture Description Type Journal Article
Year 2012 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU
Volume 116 Issue I Pages 54-67
Keywords
Abstract Image retrieval can be dealt by combining standard descriptors, such as those of MPEG-7, which are defined independently for each visual cue (e.g. SCD or CLD for Color, HTD for texture or EHD for edges).
A common problem is to combine similarities coming from descriptors representing different concepts in different spaces. In this paper we propose a color texture description that bypasses this problem from its inherent definition. It is based on a low dimensional space with 6 perceptual axes. Texture is described in a 3D space derived from a direct implementation of the original Julesz’s Texton theory and color is described in a 3D perceptual space. This early fusion through the blob concept in these two bounded spaces avoids the problem and allows us to derive a sparse color-texture descriptor that achieves similar performance compared to MPEG-7 in image retrieval. Moreover, our descriptor presents comprehensive qualities since it can also be applied either in segmentation or browsing: (a) a dense image representation is defined from the descriptor showing a reasonable performance in locating texture patterns included in complex images; and (b) a vocabulary of basic terms is derived to build an intermediate level descriptor in natural language improving browsing by bridging semantic gap
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1077-3142 ISBN Medium
Area Expedition Conference
Notes CAT;CIC Approved no
Call Number Admin @ si @ ASV2012 Serial 1827
Permanent link to this record
 

 
Author (down) Sumit K. Banchhor; Tadashi Araki; Narendra D. Londhe; Nobutaka Ikeda; Petia Radeva; Ayman El-Baz; Luca Saba; Andrew Nicolaides; Shoaib Shafique; John R. Laird; Jasjit S. Suri
Title Five multiresolution-based calcium volume measurement techniques from coronary IVUS videos: A comparative approach Type Journal Article
Year 2016 Publication Computer Methods and Programs in Biomedicine Abbreviated Journal CMPB
Volume 134 Issue Pages 237-258
Keywords
Abstract BACKGROUND AND OBJECTIVE:
Fast intravascular ultrasound (IVUS) video processing is required for calcium volume computation during the planning phase of percutaneous coronary interventional (PCI) procedures. Nonlinear multiresolution techniques are generally applied to improve the processing time by down-sampling the video frames.
METHODS:
This paper presents four different segmentation methods for calcium volume measurement, namely Threshold-based, Fuzzy c-Means (FCM), K-means, and Hidden Markov Random Field (HMRF) embedded with five different kinds of multiresolution techniques (bilinear, bicubic, wavelet, Lanczos, and Gaussian pyramid). This leads to 20 different kinds of combinations. IVUS image data sets consisting of 38,760 IVUS frames taken from 19 patients were collected using 40 MHz IVUS catheter (Atlantis® SR Pro, Boston Scientific®, pullback speed of 0.5 mm/sec.). The performance of these 20 systems is compared with and without multiresolution using the following metrics: (a) computational time; (b) calcium volume; (c) image quality degradation ratio; and (d) quality assessment ratio.
RESULTS:
Among the four segmentation methods embedded with five kinds of multiresolution techniques, FCM segmentation combined with wavelet-based multiresolution gave the best performance. FCM and wavelet experienced the highest percentage mean improvement in computational time of 77.15% and 74.07%, respectively. Wavelet interpolation experiences the highest mean precision-of-merit (PoM) of 94.06 ± 3.64% and 81.34 ± 16.29% as compared to other multiresolution techniques for volume level and frame level respectively. Wavelet multiresolution technique also experiences the highest Jaccard Index and Dice Similarity of 0.7 and 0.8, respectively. Multiresolution is a nonlinear operation which introduces bias and thus degrades the image. The proposed system also provides a bias correction approach to enrich the system, giving a better mean calcium volume similarity for all the multiresolution-based segmentation methods. After including the bias correction, bicubic interpolation gives the largest increase in mean calcium volume similarity of 4.13% compared to the rest of the multiresolution techniques. The system is automated and can be adapted in clinical settings.
CONCLUSIONS:
We demonstrated the time improvement in calcium volume computation without compromising the quality of IVUS image. Among the 20 different combinations of multiresolution with calcium volume segmentation methods, the FCM embedded with wavelet-based multiresolution gave the best performance.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; Approved no
Call Number Admin @ si @ BAL2016 Serial 2830
Permanent link to this record
 

 
Author (down) Sumit K. Banchhor; Narendra D. Londhe; Tadashi Araki; Luca Saba; Petia Radeva; Narendra N. Khanna; Jasjit S. Suri
Title Calcium detection, its quantification, and grayscale morphology-based risk stratification using machine learning in multimodality big data coronary and carotid scans: A review. Type Journal Article
Year 2018 Publication Computers in Biology and Medicine Abbreviated Journal CBM
Volume 101 Issue Pages 184-198
Keywords Heart disease; Stroke; Atherosclerosis; Intravascular; Coronary; Carotid; Calcium; Morphology; Risk stratification
Abstract Purpose of review

Atherosclerosis is the leading cause of cardiovascular disease (CVD) and stroke. Typically, atherosclerotic calcium is found during the mature stage of the atherosclerosis disease. It is therefore often a challenge to identify and quantify the calcium. This is due to the presence of multiple components of plaque buildup in the arterial walls. The American College of Cardiology/American Heart Association guidelines point to the importance of calcium in the coronary and carotid arteries and further recommend its quantification for the prevention of heart disease. It is therefore essential to stratify the CVD risk of the patient into low- and high-risk bins.
Recent finding

Calcium formation in the artery walls is multifocal in nature with sizes at the micrometer level. Thus, its detection requires high-resolution imaging. Clinical experience has shown that even though optical coherence tomography offers better resolution, intravascular ultrasound still remains an important imaging modality for coronary wall imaging. For a computer-based analysis system to be complete, it must be scientifically and clinically validated. This study presents a state-of-the-art review (condensation of 152 publications after examining 200 articles) covering the methods for calcium detection and its quantification for coronary and carotid arteries, the pros and cons of these methods, and the risk stratification strategies. The review also presents different kinds of statistical models and gold standard solutions for the evaluation of software systems useful for calcium detection and quantification. Finally, the review concludes with a possible vision for designing the next-generation system for better clinical outcomes.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no proj Approved no
Call Number Admin @ si @ BLA2018 Serial 3188
Permanent link to this record
 

 
Author (down) Sudeep Katakol; Basem Elbarashy; Luis Herranz; Joost Van de Weijer; Antonio Lopez
Title Distributed Learning and Inference with Compressed Images Type Journal Article
Year 2021 Publication IEEE Transactions on Image Processing Abbreviated Journal TIP
Volume 30 Issue Pages 3069 - 3083
Keywords
Abstract Modern computer vision requires processing large amounts of data, both while training the model and/or during inference, once the model is deployed. Scenarios where images are captured and processed in physically separated locations are increasingly common (e.g. autonomous vehicles, cloud computing). In addition, many devices suffer from limited resources to store or transmit data (e.g. storage space, channel capacity). In these scenarios, lossy image compression plays a crucial role to effectively increase the number of images collected under such constraints. However, lossy compression entails some undesired degradation of the data that may harm the performance of the downstream analysis task at hand, since important semantic information may be lost in the process. Moreover, we may only have compressed images at training time but are able to use original images at inference time, or vice versa, and in such a case, the downstream model suffers from covariate shift. In this paper, we analyze this phenomenon, with a special focus on vision-based perception for autonomous driving as a paradigmatic scenario. We see that loss of semantic information and covariate shift do indeed exist, resulting in a drop in performance that depends on the compression rate. In order to address the problem, we propose dataset restoration, based on image restoration with generative adversarial networks (GANs). Our method is agnostic to both the particular image compression method and the downstream task; and has the advantage of not adding additional cost to the deployed models, which is particularly important in resource-limited devices. The presented experiments focus on semantic segmentation as a challenging use case, cover a broad range of compression rates and diverse datasets, and show how our method is able to significantly alleviate the negative effects of compression on the downstream visual task.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes LAMP; ADAS; 600.120; 600.118 Approved no
Call Number Admin @ si @ KEH2021 Serial 3543
Permanent link to this record
 

 
Author (down) Stefan Lonn; Petia Radeva; Mariella Dimiccoli
Title Smartphone picture organization: A hierarchical approach Type Journal Article
Year 2019 Publication Computer Vision and Image Understanding Abbreviated Journal CVIU
Volume 187 Issue Pages 102789
Keywords
Abstract We live in a society where the large majority of the population has a camera-equipped smartphone. In addition, hard drives and cloud storage are getting cheaper and cheaper, leading to a tremendous growth in stored personal photos. Unlike photo collections captured by a digital camera, which typically are pre-processed by the user who organizes them into event-related folders, smartphone pictures are automatically stored in the cloud. As a consequence, photo collections captured by a smartphone are highly unstructured and because smartphones are ubiquitous, they present a larger variability compared to pictures captured by a digital camera. To solve the need of organizing large smartphone photo collections automatically, we propose here a new methodology for hierarchical photo organization into topics and topic-related categories. Our approach successfully estimates latent topics in the pictures by applying probabilistic Latent Semantic Analysis, and automatically assigns a name to each topic by relying on a lexical database. Topic-related categories are then estimated by using a set of topic-specific Convolutional Neuronal Networks. To validate our approach, we ensemble and make public a large dataset of more than 8,000 smartphone pictures from 40 persons. Experimental results demonstrate major user satisfaction with respect to state of the art solutions in terms of organization.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; no proj Approved no
Call Number Admin @ si @ LRD2019 Serial 3297
Permanent link to this record
 

 
Author (down) Sounak Dey; Palaiahnakote Shivakumara; K.S. Raghunanda; Umapada Pal; Tong Lu; G. Hemantha Kumar; Chee Seng Chan
Title Script independent approach for multi-oriented text detection in scene image Type Journal Article
Year 2017 Publication Neurocomputing Abbreviated Journal NEUCOM
Volume 242 Issue Pages 96-112
Keywords
Abstract Developing a text detection method which is invariant to scripts in natural scene images is a challeng- ing task due to different geometrical structures of various scripts. Besides, multi-oriented of text lines in natural scene images make the problem more challenging. This paper proposes to explore ring radius transform (RRT) for text detection in multi-oriented and multi-script environments. The method finds component regions based on convex hull to generate radius matrices using RRT. It is a fact that RRT pro- vides low radius values for the pixels that are near to edges, constant radius values for the pixels that represent stroke width, and high radius values that represent holes created in background and convex hull because of the regular structures of text components. We apply k -means clustering on the radius matrices to group such spatially coherent regions into individual clusters. Then the proposed method studies the radius values of such cluster components that are close to the centroid and far from the cen- troid to detect text components. Furthermore, we have developed a Bangla dataset (named as ISI-UM dataset) and propose a semi-automatic system for generating its ground truth for text detection of arbi- trary orientations, which can be used by the researchers for text detection and recognition in the future. The ground truth will be released to public. Experimental results on our ISI-UM data and other standard datasets, namely, ICDAR 2013 scene, SVT and MSRA data, show that the proposed method outperforms the existing methods in terms of multi-lingual and multi-oriented text detection ability.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes DAG; 600.121 Approved no
Call Number Admin @ si @ DSR2017 Serial 3260
Permanent link to this record