|
Records |
Links |
|
Author |
Adela Barbulescu; Wenjuan Gong; Jordi Gonzalez; Thomas B. Moeslund; Xavier Roca |
|
|
Title |
3D Human Pose Estimation Using 2D Body Part Detectors |
Type |
Conference Article |
|
Year |
2012 |
Publication |
21st International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2484 - 2487 |
|
|
Keywords |
|
|
|
Abstract |
Automatic 3D reconstruction of human poses from monocular images is a challenging and popular topic in the computer vision community, which provides a wide range of applications in multiple areas. Solutions for 3D pose estimation involve various learning approaches, such as support vector machines and Gaussian processes, but many encounter difficulties in cluttered scenarios and require additional input data, such as silhouettes, or controlled camera settings. We present a framework that is capable of estimating the 3D pose of a person from single images or monocular image sequences without requiring background information and which is robust to camera variations. The framework models the non-linearity present in human pose estimation as it benefits from flexible learning approaches, including a highly customizable 2D detector. Results on the HumanEva benchmark show how they perform and influence the quality of the 3D pose estimates. |
|
|
Address |
Tsukuba, Japan |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1051-4651 |
ISBN |
978-1-4673-2216-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ BGG2012 |
Serial |
2172 |
|
Permanent link to this record |
|
|
|
|
Author |
Mikhail Mozerov |
|
|
Title |
Constrained Optical Flow Estimation as a Matching Problem |
Type |
Journal Article |
|
Year |
2013 |
Publication |
IEEE Transactions on Image Processing |
Abbreviated Journal |
TIP |
|
|
Volume |
22 |
Issue |
5 |
Pages |
2044-2055 |
|
|
Keywords |
|
|
|
Abstract |
In general, discretization in the motion vector domain yields an intractable number of labels. In this paper we propose an approach that reduces general optical flow to a constrained matching problem by pre-estimating a 2D disparity labeling map of the desired discrete motion vector function. One goal of the paper is to estimate a coarse distribution of motion vectors and then use this distribution as a global constraint for discrete optical flow estimation. This pre-estimation is done with a simple frame-to-frame correlation technique also known as the digital symmetric-phase-only-filter (SPOF). We discover a strong correlation between the output of the SPOF and the motion vector distribution of the related optical flow. A two-step matching paradigm for optical flow estimation is applied: pixel-accuracy (integer flow) estimation, followed by subpixel-accuracy estimation. The matching problem is solved by global optimization. Experiments on the Middlebury optical flow datasets confirm our intuitive assumptions about the strong correlation between the motion vector distribution of optical flow and the maximal peaks of SPOF outputs. The overall performance of the proposed method is promising and achieves state-of-the-art results on the Middlebury benchmark. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1057-7149 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ Moz2013 |
Serial |
2191 |
|
Permanent link to this record |
|
|
|
|
Author |
Ariel Amato |
|
|
Title |
Environment-Independent Moving Cast Shadow Suppression in Video Surveillance |
Type |
Book Whole |
|
Year |
2012 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
This thesis is devoted to moving shadow detection and suppression. Shadows can be defined as the parts of a scene that are not directly illuminated by a light source because of one or more obstructing objects. Moving shadows in image sequences are often undesirable, since they can degrade the results of image processing for object detection, segmentation, scene surveillance or similar purposes. The thesis first gives an exhaustive overview of moving shadow detection methods. To compensate for the limitations of the methods in the literature, a new moving shadow detection method is proposed. It requires no prior knowledge about the scene, nor is it restricted to assumptions about specific scene structures. Furthermore, the technique can detect both achromatic and chromatic shadows even in the presence of camouflage, which occurs when foreground regions are very similar in color to shadowed regions. The method exploits local color constancy properties due to reflectance suppression over shadowed regions. To detect shadowed regions in a scene, the values of the background image are divided by the values of the current frame in the RGB color space. The thesis shows how this luminance ratio can be used to identify segments with low gradient constancy, which in turn distinguish shadows from foreground. Experimental results on a collection of publicly available datasets illustrate the superior performance of the proposed method compared with the most sophisticated state-of-the-art shadow detection algorithms. These results show that the proposed approach is robust and accurate over a broad range of shadow types and challenging video conditions. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Mikhail Mozerov;Jordi Gonzalez |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ Ama2012 |
Serial |
2201 |
|
Permanent link to this record |
|
|
|
|
Author |
Noha Elfiky |
|
|
Title |
Compact, Adaptive and Discriminative Spatial Pyramids for Improved Object and Scene Classification |
Type |
Book Whole |
|
Year |
2012 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
The release of challenging datasets with a vast number of images requires the development of efficient image representations and algorithms that can handle these large-scale datasets efficiently. Nowadays the Bag-of-Words (BoW) is the most successful approach in the context of object and scene classification tasks. However, its main drawback is the absence of important spatial information. Spatial pyramids (SP) have been successfully applied to incorporate spatial information into BoW-based image representation. Given the remarkable performance of spatial pyramids, their growing number of applications to a broad range of vision problems, and their inclusion of geometry, one may ask what the limits of spatial pyramids are. Within the SP framework, the optimal way of obtaining an image spatial representation that copes with its foremost shortcomings, namely its high dimensionality and the rigidity of the resulting image representation, remains an active research domain. In summary, the main concern of this thesis is to search for the limits of spatial pyramids and to find solutions for them. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Jordi Gonzalez;Xavier Roca |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ Elf2012 |
Serial |
2202 |
|
Permanent link to this record |
|
|
|
|
Author |
Marco Pedersoli |
|
|
Title |
Hierarchical Multiresolution Models for fast Object Detection |
Type |
Book Whole |
|
Year |
2012 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
The ability to automatically detect and recognize objects in unconstrained images is becoming more and more critical: from security systems and autonomous robots, to smart phones and augmented reality, intelligent devices need to understand the meaning of images as a composition of semantic objects. This Thesis tackles the problem of fast object detection based on template models. Detection consists of searching for an object in an image by evaluating the similarity between a template model and an image region at each possible location and scale. In this work, we show that using a template model representation based on a multiple resolution hierarchy is an optimal choice that can lead to excellent detection accuracy and fast computation. We implement two different approaches that make use of a hierarchy of multiresolution models: a multiresolution cascade and a coarse-to-fine search. Also, we extend the coarse-to-fine search by introducing a deformable part-based model that achieves state-of-the-art results together with a very reduced computational cost. Finally, we specialize our approach to the challenging task of pedestrian detection from moving vehicles and show that the overall quality of the system outperforms previous works in terms of speed and accuracy. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Jordi Gonzalez;Xavier Roca |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ Ped2012 |
Serial |
2203 |
|
Permanent link to this record |
|
|
|
|
Author |
Bhaskar Chakraborty |
|
|
Title |
Model free approach to human action recognition |
Type |
Book Whole |
|
Year |
2012 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Automatic understanding of human activity and action is a very important and challenging research area of Computer Vision with wide applications in video surveillance, motion analysis, virtual reality interfaces, video indexing, content-based video retrieval, HCI and health care. This thesis presents a series of techniques to solve the problem of human action recognition in video. The first approach towards this goal is based on a probabilistic optimization model of body parts using a Hidden Markov Model. This strong model-based approach is able to distinguish between similar actions by considering only the body parts that contribute most to the actions. In the next approach, we apply a weak model-based human detector, and actions are represented by a Bag-of-key-poses model to capture the human pose changes during the actions. To tackle the problem of human action recognition in complex scenes, a selective spatio-temporal interest point (STIP) detector is proposed that uses a mechanism similar to the non-classical receptive field inhibition exhibited by most orientation-selective neurons in the primary visual cortex. An extension of the selective STIP detector is applied to a multi-view action recognition system by introducing novel 4D STIPs (3D space + time). Finally, we use our STIP detector on the large-scale continuous visual event recognition problem and propose a novel generalized max-margin Hough transformation framework for activity detection. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Jordi Gonzalez;Xavier Roca |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ Cha2012 |
Serial |
2207 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep M. Gonfaus |
|
|
Title |
Towards Deep Image Understanding: From pixels to semantics |
Type |
Book Whole |
|
Year |
2012 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Understanding the content of the images is one of the greatest challenges of computer vision. Recognition of objects appearing in images, identifying and interpreting their actions are the main purposes of Image Understanding. This thesis seeks to identify what is present in a picture by categorizing and locating all the objects in the scene.
Images are composed of pixels, and one possibility consists of assigning an object category to each pixel, which is commonly known as semantic segmentation. By incorporating information as a contextual cue, we are able to resolve the ambiguity within categories at the pixel level. We propose three levels of scale in order to resolve such ambiguity.
Another way to represent the objects is the object detection task. In this case, the aim is to recognize and localize the whole object by accurately placing a bounding box around it. We present two new approaches. The first one is focused on improving the object representation of deformable part models with the concept of factorized appearances. The second approach addresses the issue of reducing the computational cost for multi-class recognition. The results have been validated on several commonly used datasets, reaching international recognition and state-of-the-art performance within the field. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Jordi Gonzalez;Theo Gevers |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ Gon2012 |
Serial |
2208 |
|
Permanent link to this record |
|
|
|
|
Author |
Wenjuan Gong |
|
|
Title |
Action priors for human pose tracking by particle filter |
Type |
Report |
|
Year |
2009 |
Publication |
CVC Technical Report |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
Computer Vision Center |
Thesis |
Master's thesis |
|
|
Publisher |
|
Place of Publication |
Bellaterra, Barcelona |
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ Gon2009 |
Serial |
2401 |
|
Permanent link to this record |
|
|
|
|
Author |
Bhaskar Chakraborty; Jordi Gonzalez; Xavier Roca |
|
|
Title |
Large scale continuous visual event recognition using max-margin Hough transformation framework |
Type |
Journal Article |
|
Year |
2013 |
Publication |
Computer Vision and Image Understanding |
Abbreviated Journal |
CVIU |
|
|
Volume |
117 |
Issue |
10 |
Pages |
1356–1368 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we propose a novel method for continuous visual event recognition (CVER) on a large-scale video dataset using a max-margin Hough transformation framework. Due to high scalability, diverse real environmental states and wide scene variability, direct application of action recognition/detection methods, such as spatio-temporal interest point (STIP) local-feature-based techniques, to the whole dataset is practically infeasible. To address this problem, we apply a motion region extraction technique, based on motion segmentation and region clustering, to identify possible candidate “events of interest” as a preprocessing step. On these candidate regions a STIP detector is applied and local motion features are computed. For activity representation we use a generalized Hough transform framework in which each feature point casts a weighted vote for a possible activity class centre. A max-margin framework is applied to learn the feature codebook weights. For activity detection, peaks in the Hough voting space are taken into account and an initial event hypothesis is generated using the spatio-temporal information of the participating STIPs. For event recognition a verification Support Vector Machine is used. An extensive evaluation on a benchmark large-scale video surveillance dataset (VIRAT), as well as on a small-scale benchmark dataset (MSR), shows that the proposed method is applicable to a wide range of continuous visual event recognition applications with extremely challenging conditions. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1077-3142 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ CGR2013 |
Serial |
2413 |
|
Permanent link to this record |
|
|
|
|
Author |
Ariel Amato |
|
|
Title |
Moving cast shadow detection |
Type |
Journal Article |
|
Year |
2014 |
Publication |
Electronic letters on computer vision and image analysis |
Abbreviated Journal |
ELCVIA |
|
|
Volume |
13 |
Issue |
2 |
Pages |
70-71 |
|
|
Keywords |
|
|
|
Abstract |
Motion perception is an amazing innate ability of the creatures on the planet. This adroitness entails a functional advantage that enables species to compete better in the wild. The motion perception ability is usually employed at different levels, ranging from the simplest interaction with the ’physis’ up to the most transcendental survival tasks. Among the five classical perception systems, vision is the most widely used in the motion perception field. Millions of years of evolution have led to a highly specialized visual system in humans, which is characterized by tremendous accuracy as well as extraordinary robustness. Although humans and an immense diversity of species can distinguish moving objects with seeming simplicity, doing so has proven to be a difficult and non-trivial problem from a computational perspective. In the field of Computer Vision, the detection of moving objects is a challenging and fundamental research area. This can be referred to as the ’origin’ of vast and numerous vision-based research sub-areas. Nevertheless, from the bottom to the top of this hierarchical analysis, the foundations still rely on when and where motion has occurred in an image. Pixels corresponding to moving objects in image sequences can be identified by measuring changes in their values. However, a pixel’s value (representing a combination of color and brightness) can also vary due to other factors, such as variation in scene illumination, camera noise and nonlinear sensor responses, among others. The challenge lies in detecting whether the changes in pixel values are caused by genuine object movement or not. An additional challenging aspect in motion detection is represented by moving cast shadows. The paradox arises because a moving object and its cast shadow share similar motion patterns. However, a moving cast shadow is not a moving object.
In fact, a shadow represents a photometric illumination effect caused by the relative position of the object with respect to the light sources. Shadow detection methods are mainly divided into two domains depending on the application field. The first normally consists of static images where shadows are cast by static objects, whereas the second refers to image sequences where shadows are cast by moving objects. For the first case, shadows can provide additional geometric and semantic cues about the shape and position of the casting object as well as the localization of the light source. Although this information can be extracted from static images as well as video sequences, the main focus in the second area is usually change detection, scene matching or surveillance. In this context, a shadow can severely interfere with the analysis and interpretation of the scene. The work done in the thesis is focused on the second case; thus it addresses the problem of detection and removal of moving cast shadows in video sequences in order to enhance the detection of moving objects. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ Ama2014 |
Serial |
2870 |
|
Permanent link to this record |
|
|
|
|
Author |
Parichehr Behjati Ardakani; Pau Rodriguez; Carles Fernandez; Armin Mehri; Xavier Roca; Seiichi Ozawa; Jordi Gonzalez |
|
|
Title |
Frequency-based Enhancement Network for Efficient Super-Resolution |
Type |
Journal Article |
|
Year |
2022 |
Publication |
IEEE Access |
Abbreviated Journal |
ACCESS |
|
|
Volume |
10 |
Issue |
|
Pages |
57383-57397 |
|
|
Keywords |
Deep learning; Frequency-based methods; Lightweight architectures; Single image super-resolution |
|
|
Abstract |
Recently, deep convolutional neural networks (CNNs) have provided outstanding performance in single image super-resolution (SISR). Despite their remarkable performance, the lack of high-frequency information in the recovered images remains a core problem. Moreover, as the networks increase in depth and width, deep CNN-based SR methods are faced with the challenge of computational complexity in practice. A promising and under-explored solution is to adapt the amount of compute based on the different frequency bands of the input. To this end, we present a novel Frequency-based Enhancement Block (FEB) which explicitly enhances the information of high frequencies while forwarding low-frequencies to the output. In particular, this block efficiently decomposes features into low- and high-frequency and assigns more computation to high-frequency ones. Thus, it can help the network generate more discriminative representations by explicitly recovering finer details. Our FEB design is simple and generic and can be used as a direct replacement of commonly used SR blocks with no need to change network architectures. We experimentally show that when replacing SR blocks with FEB we consistently improve the reconstruction error, while reducing the number of parameters in the model. Moreover, we propose a lightweight SR model — Frequency-based Enhancement Network (FENet) — based on FEB that matches the performance of larger models. Extensive experiments demonstrate that our proposal performs favorably against the state-of-the-art SR algorithms in terms of visual quality, memory footprint, and inference time. The code is available at https://github.com/pbehjatii/FENet |
|
|
Address |
18 May 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
IEEE |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ BRF2022a |
Serial |
3747 |
|
Permanent link to this record |
|
|
|
|
Author |
Ana Garcia Rodriguez; Yael Tudela; Henry Cordova; S. Carballal; I. Ordas; L. Moreira; E. Vaquero; O. Ortiz; L. Rivero; F. Javier Sanchez; Miriam Cuatrecasas; Maria Pellise; Jorge Bernal; Gloria Fernandez Esparrach |
|
|
Title |
First in Vivo Computer-Aided Diagnosis of Colorectal Polyps using White Light Endoscopy |
Type |
Journal Article |
|
Year |
2022 |
Publication |
Endoscopy |
Abbreviated Journal |
END |
|
|
Volume |
54 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
2022/04/14 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Georg Thieme Verlag KG |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ GTC2022a |
Serial |
3746 |
|
Permanent link to this record |
|
|
|
|
Author |
Diego Velazquez; Josep M. Gonfaus; Pau Rodriguez; Xavier Roca; Seiichi Ozawa; Jordi Gonzalez |
|
|
Title |
Logo Detection With No Priors |
Type |
Journal Article |
|
Year |
2021 |
Publication |
IEEE Access |
Abbreviated Journal |
ACCESS |
|
|
Volume |
9 |
Issue |
|
Pages |
106998-107011 |
|
|
Keywords |
|
|
|
Abstract |
In recent years, top-cited object detection methods such as R-CNN have implemented this task as a combination of region proposal generation and supervised classification of the proposed bounding boxes. Although this pipeline has achieved state-of-the-art results on multiple datasets, it has inherent limitations that make object detection a very complex and computationally inefficient task. Instead of following this standard strategy, in this paper we enhance Detection Transformers (DETR), which tackle object detection as a set-prediction problem directly in an end-to-end, fully differentiable pipeline without requiring priors. In particular, we incorporate Feature Pyramids (FP) into the DETR architecture and demonstrate the effectiveness of the resulting DETR-FP approach at improving logo detection results thanks to the improved detection of small logos. Thus, without requiring any domain-specific prior to be fed to the model, DETR-FP obtains competitive results on the OpenLogo and MS-COCO datasets, offering a relative improvement of up to 30% compared to a Faster R-CNN baseline, which strongly depends on hand-designed priors. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ VGR2021 |
Serial |
3664 |
|
Permanent link to this record |
|
|
|
|
Author |
Diana Ramirez Cifuentes; Ana Freire; Ricardo Baeza Yates; Nadia Sanz Lamora; Aida Alvarez; Alexandre Gonzalez; Meritxell Lozano; Roger Llobet; Diego Velazquez; Josep M. Gonfaus; Jordi Gonzalez |
|
|
Title |
Characterization of Anorexia Nervosa on Social Media: Textual, Visual, Relational, Behavioral, and Demographical Analysis |
Type |
Journal Article |
|
Year |
2021 |
Publication |
Journal of Medical Internet Research |
Abbreviated Journal |
JMIR |
|
|
Volume |
23 |
Issue |
7 |
Pages |
e25925 |
|
|
Keywords |
|
|
|
Abstract |
Background: Eating disorders are psychological conditions characterized by unhealthy eating habits. Anorexia nervosa (AN) is defined as the belief of being overweight despite being dangerously underweight. The psychological signs involve emotional and behavioral issues. There is evidence that signs and symptoms can manifest on social media, wherein both harmful and beneficial content is shared daily. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ RFB2021 |
Serial |
3665 |
|
Permanent link to this record |
|
|
|
|
Author |
F. Negin; Pau Rodriguez; M. Koperski; A. Kerboua; Jordi Gonzalez; J. Bourgeois; E. Chapoulie; P. Robert; F. Bremond |
|
|
Title |
PRAXIS: Towards automatic cognitive assessment using gesture recognition |
Type |
Journal Article |
|
Year |
2018 |
Publication |
Expert Systems with Applications |
Abbreviated Journal |
ESWA |
|
|
Volume |
106 |
Issue |
|
Pages |
21-35 |
|
|
Keywords |
|
|
|
Abstract |
The Praxis test is a gesture-based diagnostic test which has been accepted as diagnostically indicative of cortical pathologies such as Alzheimer’s disease. Despite being simple, this test is often skipped by clinicians. In this paper, we propose a novel framework to investigate the potential of static and dynamic upper-body gestures based on the Praxis test, and their potential in a medical framework to automate the test procedures for computer-assisted cognitive assessment of older adults.
In order to carry out gesture recognition as well as correctness assessment of the performances, we have collected a novel and challenging RGB-D gesture video dataset recorded with Kinect v2, which contains 29 specific gestures suggested by clinicians and recorded from both experts and patients performing the gesture set. Moreover, we propose a framework to learn the dynamics of upper-body gestures, considering the videos as sequences of short-term clips of gestures. Our approach first uses body part detection to extract image patches surrounding the hands and then, by means of a fine-tuned convolutional neural network (CNN) model, learns deep hand features, which are then linked to a long short-term memory to capture the temporal dependencies between video frames.
We report the results of four developed methods using different modalities. The experiments show the effectiveness of our deep-learning-based approach in gesture recognition and performance assessment tasks. The clinicians' satisfaction with the assessment reports indicates the impact of the framework on diagnosis. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ NRK2018 |
Serial |
3669 |
|
Permanent link to this record |