|
Records |
Links |
|
Author |
Joakim Bruslund Haurum; Meysam Madadi; Sergio Escalera; Thomas B. Moeslund |
|
|
Title |
Multi-scale hybrid vision transformer and Sinkhorn tokenizer for sewer defect classification |
Type |
Journal Article |
|
Year |
2022 |
Publication |
Automation in Construction |
Abbreviated Journal |
AC |
|
|
Volume |
144 |
Issue |
|
Pages |
104614 |
|
|
Keywords |
Sewer Defect Classification; Vision Transformers; Sinkhorn-Knopp; Convolutional Neural Networks; Closed-Circuit Television; Sewer Inspection |
|
|
Abstract |
A crucial part of image classification consists of capturing non-local spatial semantics of image content. This paper describes the multi-scale hybrid vision transformer (MSHViT), an extension of the classical convolutional neural network (CNN) backbone, for multi-label sewer defect classification. To better model spatial semantics in the images, features are aggregated at different scales non-locally through the use of a lightweight vision transformer, and a smaller set of tokens is produced through a novel Sinkhorn clustering-based tokenizer using distinct cluster centers. The proposed MSHViT and Sinkhorn tokenizer were evaluated on the Sewer-ML multi-label sewer defect classification dataset, showing consistent performance improvements of up to 2.53 percentage points. |
|
|
Address |
Dec 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ BME2022c |
Serial |
3780 |
|
Permanent link to this record |
|
|
|
|
Author |
Manisha Das; Deep Gupta; Petia Radeva; Ashwini M. Bakde |
|
|
Title |
Multi-scale decomposition-based CT-MR neurological image fusion using optimized bio-inspired spiking neural model with meta-heuristic optimization |
Type |
Journal Article |
|
Year |
2021 |
Publication |
International Journal of Imaging Systems and Technology |
Abbreviated Journal |
IMA |
|
|
Volume |
31 |
Issue |
4 |
Pages |
2170-2188 |
|
|
Keywords |
|
|
|
Abstract |
Multi-modal medical image fusion plays an important role in clinical diagnosis and serves as an assistance model for clinicians. In this paper, a computed tomography-magnetic resonance (CT-MR) image fusion model is proposed using an optimized bio-inspired spiking feedforward neural network in different decomposition domains. First, source images are decomposed into base (low-frequency) and detail (high-frequency) layer components. Low-frequency subbands are fused using texture energy measures to capture the local energy, contrast, and small edges in the fused image. High-frequency coefficients are fused using firing maps obtained by a pixel-activated neural model whose parameters are optimized with three different techniques, applied individually: differential evolution, cuckoo search, and gray wolf optimization. In the optimization model, a fitness function is computed based on the edge index of the resultant fused images, which helps to extract and preserve sharp edges available in the source CT and MR images. To validate the fusion performance, a detailed comparative analysis is presented between the proposed and state-of-the-art methods in terms of quantitative and qualitative measures along with computational complexity. Experimental results show that the proposed method produces a significantly better visual quality of fused images while outperforming the existing methods. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB; does not mention |
Approved |
no |
|
|
Call Number |
Admin @ si @ DGR2021a |
Serial |
3630 |
|
Permanent link to this record |
|
|
|
|
Author |
Ferran Poveda; Debora Gil; Enric Marti |
|
|
Title |
Multi-resolution DT-MRI cardiac tractography |
Type |
Conference Article |
|
Year |
2012 |
Publication |
Statistical Atlases And Computational Models Of The Heart: Imaging and Modelling Challenges |
Abbreviated Journal |
|
|
|
Volume |
7746 |
Issue |
|
Pages |
270-277 |
|
|
Keywords |
|
|
|
Abstract |
Even using objective measures from DT-MRI, no consensus about myocardial architecture has been achieved so far. Streamlining provides good reconstructions at a low level of detail, but falls short of giving global abstract interpretations. In this paper, we present a multi-resolution methodology that is able to produce simplified representations of cardiac architecture. Our approach produces a reduced set of tracts that are representative of the main geometric features of myocardial anatomical structure. Experiments show that fiber geometry is preserved along reductions, which validates the simplified model for interpretation of cardiac architecture. |
|
|
Address |
Nice, France |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-36960-5 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
STACOM |
|
|
Notes |
IAM |
Approved |
no |
|
|
Call Number |
IAM @ iam @ PGM2012 |
Serial |
1986 |
|
Permanent link to this record |
|
|
|
|
Author |
Cristina Palmero; Oleg V Komogortsev; Sergio Escalera; Sachin S Talathi |
|
|
Title |
Multi-Rate Sensor Fusion for Unconstrained Near-Eye Gaze Estimation |
Type |
Conference Article |
|
Year |
2023 |
Publication |
Proceedings of the 2023 Symposium on Eye Tracking Research and Applications |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1-8 |
|
|
Keywords |
|
|
|
Abstract |
The power requirements of video-oculography systems can be prohibitive for high-speed operation on portable devices. Recently, low-power alternatives such as photosensors have been evaluated, providing gaze estimates at high frequency with a trade-off in accuracy and robustness. Potentially, an approach combining slow/high-fidelity and fast/low-fidelity sensors should be able to exploit their complementarity to track fast eye motion accurately and robustly. To foster research on this topic, we introduce OpenSFEDS, a near-eye gaze estimation dataset containing approximately 2M synthetic camera-photosensor image pairs sampled at 500 Hz under varied appearance and camera position. We also formulate the task of sensor fusion for gaze estimation, proposing a deep learning framework consisting of appearance-based encoding and temporal eye-state dynamics. We evaluate several single- and multi-rate fusion baselines on OpenSFEDS, achieving an 8.7% error decrease when tracking fast eye movements with a multi-rate approach vs. a gaze forecasting approach operating with a low-speed sensor alone. |
|
|
Address |
Tubingen; Germany; May 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ETRA |
|
|
Notes |
HUPBA |
Approved |
no |
|
|
Call Number |
Admin @ si @ PKE2023 |
Serial |
3923 |
|
Permanent link to this record |
|
|
|
|
Author |
Meysam Madadi; Sergio Escalera; Jordi Gonzalez; Xavier Roca; Felipe Lumbreras |
|
|
Title |
Multi-part body segmentation based on depth maps for soft biometry analysis |
Type |
Journal Article |
|
Year |
2015 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
56 |
Issue |
|
Pages |
14-21 |
|
|
Keywords |
3D shape context; 3D point cloud alignment; Depth maps; Human body segmentation; Soft biometry analysis |
|
|
Abstract |
This paper presents a novel method for extracting biometric measures using depth sensors. Given multi-part labeled training data, a new subject is aligned to the best model of the dataset, and soft biometrics such as lengths or circumference sizes of limbs and body are computed. The process is performed by training relevant pose clusters, defining a representative model, and fitting a 3D shape context descriptor within an iterative matching procedure. We show robust measures by applying orthogonal plates to the body hull. We test our approach on a novel full-body RGB-Depth data set, showing accurate estimation of soft biometrics and better segmentation accuracy in comparison with a random forest approach, without requiring large training data. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA; ISE; ADAS; 600.076; 600.049; 600.063; 600.054; 302.018; MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ MEG2015 |
Serial |
2588 |
|
Permanent link to this record |
|
|
|
|
Author |
Partha Pratim Roy; Umapada Pal; Josep Llados; Mathieu Nicolas Delalandre |
|
|
Title |
Multi-oriented touching text character segmentation in graphical documents using dynamic programming |
Type |
Journal Article |
|
Year |
2012 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
45 |
Issue |
5 |
Pages |
1972-1983 |
|
|
Keywords |
|
|
|
Abstract |
2,292 JCR
The touching character segmentation problem becomes complex when touching strings are multi-oriented. Moreover, in graphical documents, characters in a single touching string sometimes have different orientations. Segmentation of such complex touching strings is more challenging. In this paper, we present a scheme towards the segmentation of English multi-oriented touching strings into individual characters. When two or more characters touch, they generate a big cavity region in the background portion. Based on the convex hull information, we first use this background information to find some initial points for segmentation of a touching string into possible primitives (a primitive consists of a single character or part of a character). Next, the primitives are merged to get optimum segmentation. A dynamic programming algorithm is applied for this purpose using the total likelihood of characters as the objective function. An SVM classifier is used to find the likelihood of a character. To handle multi-oriented touching strings, the features used in the SVM are invariant to character orientation. Experiments were performed on different databases of real and synthetic touching characters and the results show that the method is efficient in segmenting touching characters of arbitrary orientations and sizes. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0031-3203 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ RPL2012a |
Serial |
2133 |
|
Permanent link to this record |
|
|
|
|
Author |
Palaiahnakote Shivakumara; Anjan Dutta; Chew Lim Tan; Umapada Pal |
|
|
Title |
Multi-oriented scene text detection in video based on wavelet and angle projection boundary growing |
Type |
Journal Article |
|
Year |
2014 |
Publication |
Multimedia Tools and Applications |
Abbreviated Journal |
MTAP |
|
|
Volume |
72 |
Issue |
1 |
Pages |
515-539 |
|
|
Keywords |
|
|
|
Abstract |
In this paper, we address two complex issues: 1) Text frame classification and 2) Multi-oriented text detection in video text frames. We first divide a video frame into 16 blocks and propose a combination of wavelet and median-moments with k-means clustering at the block level to identify probable text blocks. For each probable text block, the method applies the same combination of features with k-means clustering over a sliding window running through the blocks to identify potential text candidates. We introduce a new idea of symmetry on text candidates in each block based on the observation that pixel distribution in text exhibits a symmetric pattern. The method integrates all blocks containing text candidates in the frame and then all text candidates are mapped onto a Sobel edge map of the original frame to obtain text representatives. To tackle the multi-orientation problem, we present a new method called Angle Projection Boundary Growing (APBG), which is an iterative algorithm based on a nearest neighbor concept. APBG is then applied on the text representatives to fix the bounding box for multi-oriented text lines in the video frame. Directional information is used to eliminate false positives. Experimental results on a variety of datasets such as non-horizontal, horizontal, publicly available data (Hua’s data) and ICDAR-03 competition data (camera images) show that the proposed method outperforms existing methods proposed for video and the state-of-the-art methods for scene text as well. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer US |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1380-7501 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ SDT2014 |
Serial |
2357 |
|
Permanent link to this record |
|
|
|
|
Author |
Partha Pratim Roy; Umapada Pal; Josep Llados |
|
|
Title |
Multi-oriented English Text Line Extraction using Background and Foreground Information |
Type |
Conference Article |
|
Year |
2008 |
Publication |
Proceedings of the 8th IAPR International Workshop on Document Analysis Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
315–322 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Nara (Japan) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
DAS |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ RPL2008b |
Serial |
1047 |
|
Permanent link to this record |
|
|
|
|
Author |
Partha Pratim Roy; Josep Llados |
|
|
Title |
Multi-Oriented Character Recognition from Graphical Documents |
Type |
Conference Article |
|
Year |
2008 |
Publication |
2nd International Conference on Cognition and Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
30–35 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Mandya (India) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICCR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ RLP2008 |
Serial |
965 |
|
Permanent link to this record |
|
|
|
|
Author |
Umapada Pal; Partha Pratim Roy; N. Tripathya; Josep Llados |
|
|
Title |
Multi-oriented Bangla and Devnagari text recognition |
Type |
Journal Article |
|
Year |
2010 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
43 |
Issue |
12 |
Pages |
4124–4136 |
|
|
Keywords |
|
|
|
Abstract |
There are printed complex documents where text lines of a single page may have different orientations or the text lines may be curved in shape. As a result, it is difficult to detect the skew of such documents and hence character segmentation and recognition of such documents are complex tasks. In this paper, using background and foreground information we propose a novel scheme towards the recognition of Indian complex documents of Bangla and Devnagari script. In Bangla and Devnagari documents, characters in a word usually touch and form cavity regions. To take care of these cavity regions, background information of such documents is used. The convex hull and water reservoir principles have been applied for this purpose. Here, the characters are first segmented from the documents using the background information of the text. Next, individual characters are recognized using rotation invariant features obtained from the foreground part of the characters.
For character segmentation, the writing mode of a touching component (word) is first detected using water reservoir principle based features. Next, depending on the writing mode and the reservoir base-region of the touching component, a set of candidate envelope points is selected from the contour points of the component. Based on these candidate points, the touching component is finally segmented into individual characters. For recognition of multi-sized/multi-oriented characters, the features are computed from different angular information obtained from the external and internal contour pixels of the characters. This angular information is computed in such a way that it does not depend on the size and rotation of the characters. Circular and convex hull rings have been used to divide a character into smaller zones to get zone-wise features for higher recognition results. We combine circular and convex hull features to improve the results and these features are fed to support vector machines (SVM) for recognition. From our experiment we obtained recognition results of 99.18% (98.86%) accuracy when tested on 7515 (7874) Devnagari (Bangla) characters. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ PRT2010 |
Serial |
1337 |
|
Permanent link to this record |
|
|
|
|
Author |
Partha Pratim Roy; Umapada Pal; Josep Llados; Mathieu Nicolas Delalandre |
|
|
Title |
Multi-Oriented and Multi-Sized Touching Character Segmentation using Dynamic Programming |
Type |
Conference Article |
|
Year |
2009 |
Publication |
10th International Conference on Document Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
11–15 |
|
|
Keywords |
|
|
|
Abstract |
In this paper, we present a scheme towards the segmentation of English multi-oriented touching strings into individual characters. When two or more characters touch, they generate a big cavity region in the background portion. Using convex hull information, we use this background information to find some initial points to segment a touching string into possible primitive segments (a primitive segment consists of a single character or a part of a character). Next, these primitive segments are merged to get optimum segmentation, and dynamic programming is applied using the total likelihood of characters as the objective function. An SVM classifier is used to find the likelihood of a character. To handle multi-oriented touching strings, the features used in the SVM are invariant to character orientation. A circular ring and convex hull ring based approach has been used along with angular information of the contour pixels of the character to make the features rotation invariant. From the experiment, we obtained encouraging results. |
|
|
Address |
Barcelona, Spain |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1520-5363 |
ISBN |
978-1-4244-4500-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICDAR |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ RPL2009a |
Serial |
1240 |
|
Permanent link to this record |
|
|
|
|
Author |
Partha Pratim Roy |
|
|
Title |
Multi-Oriented and Multi-Scaled Text Character Analysis and Recognition in Graphical Documents and their Applications to Document Image Retrieval |
Type |
Book Whole |
|
Year |
2010 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
With the advance of research in Document Image Analysis and Recognition (DIAR), an important line of research addresses the indexing and retrieval of graphics-rich documents. It aims at finding relevant documents by relying on the segmentation and recognition of text and graphics components in non-standard layouts, where commercial OCR systems cannot be applied due to their complexity. This thesis focuses on text information extraction approaches for graphical documents and on the retrieval of such documents using text information.
Automatic text recognition in graphical documents (maps, engineering drawings, etc.) involves many challenges because text characters are usually printed in a multi-oriented and multi-scale way along with different graphical objects. Text characters are used to annotate graphical curve lines and hence they often follow curvi-linear paths too. For OCR of such documents, individual text lines and their corresponding words/characters need to be extracted.
For recognition of multi-font, multi-scale and multi-oriented characters, we have proposed a feature descriptor for character shape that uses angular information from contour pixels to provide invariance. To improve the efficiency of OCR, an approach towards the segmentation of multi-oriented touching strings into individual characters is also discussed. Convex hull based background information is used to segment a touching string into possible primitive segments, and these primitive segments are later merged to get optimum segmentation using dynamic programming. To overcome the touching/overlapping problem of text with graphical lines, a character spotting approach using SIFT and skeleton information is included. Afterwards, we propose a novel method to extract individual curvi-linear text lines using the foreground and background information of the characters of the text, where a water reservoir concept is used to exploit the background information.
We have also formulated methodologies for graphical document retrieval applications using query words and seals. The retrieval approaches are performed using the recognition results of individual components in the document. Given a query text, the system extracts positional knowledge from the query word and uses it to generate hypothetical locations in the document. Indexing of documents is also performed based on automatic detection of seals in documents containing cluttered background. A seal is characterized by scale and rotation invariant spatial feature descriptors computed from labelled text characters, and a concept based on the Generalized Hough Transform is used to locate the seal in documents. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Josep Llados; Umapada Pal |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-84-937261-7-1 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ Roy2010 |
Serial |
1455 |
|
Permanent link to this record |
|
|
|
|
Author |
Bogdan Raducanu; Alireza Bosaghzadeh; Fadi Dornaika |
|
|
Title |
Multi-observation Face Recognition in Videos based on Label Propagation |
Type |
Conference Article |
|
Year |
2015 |
Publication |
6th Workshop on Analysis and Modeling of Faces and Gestures AMFG2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
10-17 |
|
|
Keywords |
|
|
|
Abstract |
In order to deal with the huge amount of content generated by social media, especially for indexing and retrieval purposes, the focus has shifted from single object recognition to multi-observation object recognition. Of particular interest is the problem of face recognition (used as the primary cue for assessing a person's identity), since it is in high demand on popular social media platforms such as Facebook and YouTube. Recently, several approaches for graph-based label propagation were proposed. However, the associated graphs were constructed in an ad-hoc manner (e.g., using the KNN graph) that cannot cope properly with the rapid and frequent changes in data appearance, a phenomenon intrinsically related to video sequences. In this paper, we propose a novel approach for efficient and adaptive graph construction, based on a two-phase scheme: (i) the first phase is used to adaptively find the neighbors of a sample and also to find the adequate weights for the minimization function of the second phase; (ii) in the second phase, the selected neighbors along with their corresponding weights are used to locally and collaboratively estimate the sparse affinity matrix weights. Experimental results on the Honda Video Database (HVDB) and a subset of video sequences extracted from the popular TV series ‘Friends’ show a distinct advantage of the proposed method over the existing standard graph construction methods. |
|
|
Address |
Boston; USA; June 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPRW |
|
|
Notes |
OR; 600.068; 600.072; MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ RBD2015 |
Serial |
2627 |
|
Permanent link to this record |
|
|
|
|
Author |
Albert Clapes; Miguel Reyes; Sergio Escalera |
|
|
Title |
Multi-modal User Identification and Object Recognition Surveillance System |
Type |
Journal Article |
|
Year |
2013 |
Publication |
Pattern Recognition Letters |
Abbreviated Journal |
PRL |
|
|
Volume |
34 |
Issue |
7 |
Pages |
799-808 |
|
|
Keywords |
Multi-modal RGB-Depth data analysis; User identification; Object recognition; Intelligent surveillance; Visual features; Statistical learning |
|
|
Abstract |
We propose an automatic surveillance system for user identification and object recognition based on multi-modal RGB-Depth data analysis. We model an RGBD environment by learning a pixel-based background Gaussian distribution. Then, user and object candidate regions are detected and recognized using robust statistical approaches. The system robustly recognizes users and updates itself online, identifying and detecting new actors in the scene. Moreover, segmented objects are described, matched, recognized, and updated online using view-point 3D descriptions, which are robust to partial occlusions and local 3D viewpoint rotations. Finally, the system saves the history of user–object assignments, which is especially useful for surveillance scenarios. The system has been evaluated on a novel data set containing different indoor/outdoor scenarios, objects, and users, showing accurate recognition and better performance than standard state-of-the-art approaches. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Elsevier |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HUPBA; 600.046; 605.203; MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ CRE2013 |
Serial |
2248 |
|
Permanent link to this record |
|
|
|
|
Author |
Victor Ponce; Sergio Escalera; Xavier Baro |
|
|
Title |
Multi-modal Social Signal Analysis for Predicting Agreement in Conversation Settings |
Type |
Conference Article |
|
Year |
2013 |
Publication |
15th ACM International Conference on Multimodal Interaction |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
495-502 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we present a non-invasive ambient intelligence framework for the analysis of non-verbal communication applied to conversational settings. In particular, we apply feature extraction techniques to multi-modal audio-RGB-depth data. We compute a set of behavioral indicators that define communicative cues coming from the fields of psychology and observational methodology. We test our methodology on data captured in victim-offender mediation scenarios. Using different state-of-the-art classification approaches, our system achieves upon 75% recognition when predicting agreement among the parties involved in the conversations, using the experts' opinions as ground truth. |
|
|
Address |
Sydney; Australia; December 2013 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-4503-2129-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICMI |
|
|
Notes |
HuPBA; MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ PEB2013 |
Serial |
2488 |
|
Permanent link to this record |