Records
Author Chengyi Zou; Shuai Wan; Tiannan Ji; Marc Gorriz Blanch; Marta Mrak; Luis Herranz
Title Chroma Intra Prediction with Lightweight Attention-Based Neural Networks Type Journal Article
Year 2023 Publication IEEE Transactions on Circuits and Systems for Video Technology Abbreviated Journal TCSVT
Volume 34 Issue 1 Pages 549-560
Keywords
Abstract Neural networks can be successfully used for cross-component prediction in video coding. In particular, attention-based architectures are suitable for chroma intra prediction using luma information because of their capability to model relations between different channels. However, the complexity of such methods is still very high and should be further reduced, especially for decoding. In this paper, a cost-effective attention-based neural network is designed for chroma intra prediction. Moreover, with the goal of further improving coding performance, a novel approach is introduced to utilize more boundary information effectively. In addition to improving prediction, a simplification methodology is also proposed to reduce inference complexity by simplifying convolutions. The proposed schemes are integrated into the H.266/Versatile Video Coding (VVC) pipeline, and only one additional binary block-level syntax flag is introduced to indicate whether a given block makes use of the proposed method. Experimental results demonstrate that the proposed scheme achieves up to −0.46%/−2.29%/−2.17% BD-rate reduction on the Y/Cb/Cr components, respectively, compared with the H.266/VVC anchor. Reductions in encoding and decoding complexity of up to 22% and 61%, respectively, are achieved by the proposed scheme with respect to the previous attention-based chroma intra prediction method while maintaining coding performance.
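The BD-rate numbers above are Bjøntegaard delta-rate figures. As an illustration of how such figures are computed (this is the standard metric, not code from the paper; the rate/PSNR operating points below are hypothetical), a minimal numpy sketch:

    # Illustrative sketch of the Bjontegaard delta-rate (BD-rate) metric used
    # to report figures like the -0.46%/-2.29%/-2.17% above; not the paper's code.
    import numpy as np

    def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
        """Average bitrate change (%) of the test codec vs. the anchor,
        from four rate/PSNR points per codec (classic BD-rate)."""
        lr_a, lr_t = np.log(rate_anchor), np.log(rate_test)
        # Fit log-rate as a cubic polynomial of PSNR for each codec.
        p_a = np.polyfit(psnr_anchor, lr_a, 3)
        p_t = np.polyfit(psnr_test, lr_t, 3)
        # Integrate both fits over the overlapping PSNR interval.
        lo = max(min(psnr_anchor), min(psnr_test))
        hi = min(max(psnr_anchor), max(psnr_test))
        int_a = np.polyval(np.polyint(p_a), hi) - np.polyval(np.polyint(p_a), lo)
        int_t = np.polyval(np.polyint(p_t), hi) - np.polyval(np.polyint(p_t), lo)
        # Average log-rate difference, converted back to a percentage.
        avg_diff = (int_t - int_a) / (hi - lo)
        return (np.exp(avg_diff) - 1) * 100

    # Hypothetical rate (kbps) / PSNR (dB) points, e.g. four QP operating points.
    print(bd_rate([100, 200, 400, 800], [32.0, 34.5, 37.0, 39.5],
                  [ 98, 195, 390, 780], [32.1, 34.6, 37.1, 39.6]))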
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MACO; LAMP Approved no
Call Number Admin @ si @ ZWJ2023 Serial 3875
Permanent link to this record
 

 
Author Chengyi Zou; Shuai Wan; Marta Mrak; Marc Gorriz Blanch; Luis Herranz; Tiannan Ji
Title Towards Lightweight Neural Network-based Chroma Intra Prediction for Video Coding Type Conference Article
Year 2022 Publication 29th IEEE International Conference on Image Processing Abbreviated Journal
Volume Issue Pages
Keywords Video coding; Quantization (signal); Computational modeling; Neural networks; Predictive models; Video compression; Syntactics
Abstract In video compression, the luma channel can be useful for predicting the chroma channels (Cb, Cr), as has been demonstrated with the Cross-Component Linear Model (CCLM) used in the Versatile Video Coding (VVC) standard. More recently, it has been shown that neural networks can capture the relationship among different channels even better. In this paper, a new attention-based neural network is proposed for cross-component intra prediction. To simplify neural network design, the new framework consists of four branches: a boundary branch and a luma branch for extracting features from reference samples, an attention branch for fusing the first two branches, and a prediction branch for computing the predicted chroma samples. The proposed scheme is integrated into the VVC test model together with one additional binary block-level syntax flag which indicates whether a given block makes use of the proposed method. Experimental results demonstrate 0.31%/2.36%/2.00% BD-rate reductions on the Y/Cb/Cr components, respectively, on top of the VVC Test Model (VTM) 7.0, which uses CCLM.
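As an illustration of the four-branch layout this abstract describes, a minimal PyTorch sketch follows; the layer sizes, tensor shapes, and attention form are assumptions made for the sake of a runnable example, not the authors' implementation:

    # Sketch of a four-branch cross-component predictor in the spirit of the
    # abstract above; all layer sizes and shapes are assumptions.
    import torch
    import torch.nn as nn

    class ChromaPredictor(nn.Module):
        def __init__(self, d=32):
            super().__init__()
            self.boundary = nn.Conv1d(1, d, kernel_size=3, padding=1)  # reference-sample branch
            self.luma = nn.Conv2d(1, d, kernel_size=3, padding=1)      # co-located luma branch
            self.pred = nn.Conv2d(d, 2, kernel_size=3, padding=1)      # prediction branch: Cb, Cr

        def forward(self, boundary, luma):
            b = self.boundary(boundary)          # (N, d, L): L boundary samples
            f = self.luma(luma)                  # (N, d, H, W)
            n, d, h, w = f.shape
            q = f.flatten(2)                     # (N, d, H*W)
            # Attention branch: every block position attends over the L boundary samples.
            attn = torch.softmax(q.transpose(1, 2) @ b / d**0.5, dim=-1)   # (N, H*W, L)
            fused = (attn @ b.transpose(1, 2)).transpose(1, 2).reshape(n, d, h, w)
            fused = fused + f                    # keep luma features alongside attended boundary info
            return self.pred(fused)              # predicted chroma block (N, 2, H, W)

    # Usage with a hypothetical 8x8 luma block and 33 boundary samples:
    model = ChromaPredictor()
    cbcr = model(torch.randn(1, 1, 33), torch.randn(1, 1, 8, 8))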
Address Bordeaux; France; October 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICIP
Notes MACO Approved no
Call Number Admin @ si @ ZWM2022 Serial 3790
Permanent link to this record
 

 
Author Chen Zhang; Maria del Mar Vila Muñoz; Petia Radeva; Roberto Elosua; Maria Grau; Angels Betriu; Elvira Fernandez-Giraldez; Laura Igual
Title Carotid Artery Segmentation in Ultrasound Images Type Conference Article
Year 2015 Publication Computing and Visualization for Intravascular Imaging and Computer Assisted Stenting (CVII-STENT2015), Joint MICCAI Workshops Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Munich; Germany; October 2015
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVII-STENT
Notes MILAB Approved no
Call Number Admin @ si @ ZVR2015 Serial 2675
Permanent link to this record
 

 
Author Chee-Kheng Chng; Yuliang Liu; Yipeng Sun; Chun Chet Ng; Canjie Luo; Zihan Ni; ChuanMing Fang; Shuaitao Zhang; Junyu Han; Errui Ding; Jingtuo Liu; Dimosthenis Karatzas; Chee Seng Chan; Lianwen Jin
Title ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text – RRC-ArT Type Conference Article
Year 2019 Publication 15th International Conference on Document Analysis and Recognition Abbreviated Journal
Volume Issue Pages 1571-1576
Keywords
Abstract This paper reports the ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text – RRC-ArT, which consists of three major challenges: i) scene text detection, ii) scene text recognition, and iii) scene text spotting. A total of 78 submissions from 46 unique teams/individuals were received for this competition. The top-performing score of each challenge is as follows: i) T1 – 82.65%, ii) T2.1 – 74.3%, iii) T2.2 – 85.32%, iv) T3.1 – 53.86%, and v) T3.2 – 54.91%. Apart from the results, this paper also details the ArT dataset, task descriptions, evaluation metrics and participants' methods. The dataset, the evaluation kit as well as the results are publicly available at the challenge website.
Address Sydney; Australia; September 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICDAR
Notes DAG; 600.121; 600.129 Approved no
Call Number Admin @ si @ CLS2019 Serial 3340
Permanent link to this record
 

 
Author Cesar Isaza; Joaquin Salas; Bogdan Raducanu
Title Toward the Detection of Urban Infrastructures Edge Shadows Type Conference Article
Year 2010 Publication 12th International Conference on Advanced Concepts for Intelligent Vision Systems Abbreviated Journal
Volume 6474 Issue I Pages 30–37
Keywords
Abstract In this paper, we propose a novel technique to detect the shadows cast by urban infrastructure, such as buildings, billboards, and traffic signs, using a sequence of images taken from a fixed camera. In our approach, we compute two different background models in parallel: one for the edges and one for the reflected light intensity. An algorithm is proposed to train the system to distinguish between moving edges in general and edges that belong to static objects, creating an edge background model. Then, during operation, a background intensity model allows us to separate moving from static objects. Those edges included in the moving objects and those that belong to the edge background model are subtracted from the current image edges. The remaining edges are the ones cast by urban infrastructure. Our method is tested on a typical crossroad scene and the results show that the approach is sound and promising.
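A rough sketch of the two parallel background models described above (one for edges, one for intensity); the learning rate, thresholds, and input file are assumptions, not values from the paper:

    # Sketch of the dual background-model idea: a running edge-frequency model
    # and a running intensity model, updated in parallel from a fixed camera.
    import cv2
    import numpy as np

    ALPHA = 0.02          # running-average learning rate (assumed)
    EDGE_BG_THRESH = 0.8  # edge persistence needed to count as "static" (assumed)

    cap = cv2.VideoCapture("fixed_camera.mp4")  # hypothetical input sequence
    edge_acc = None       # edge background model: per-pixel edge frequency
    bg_intensity = None   # intensity background model

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32)
        edges = (cv2.Canny(gray.astype(np.uint8), 80, 160) > 0).astype(np.float32)
        if edge_acc is None:
            edge_acc, bg_intensity = edges.copy(), gray.copy()
            continue
        # Update both models in parallel.
        edge_acc = (1 - ALPHA) * edge_acc + ALPHA * edges
        bg_intensity = (1 - ALPHA) * bg_intensity + ALPHA * gray
        # Moving pixels: large deviation from the intensity background.
        moving = np.abs(gray - bg_intensity) > 25
        static_edges = edge_acc > EDGE_BG_THRESH
        # Candidate shadow edges: current edges that are neither on moving
        # objects nor part of the static-edge background model.
        shadow_edges = edges.astype(bool) & ~moving & ~static_edges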
Address Sydney; Australia
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor Blanc-Talon et al.
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-17687-6 Medium
Area Expedition Conference ACIVS
Notes OR;MV Approved no
Call Number BCNPCL @ bcnpcl @ ISR2010 Serial 1458
Permanent link to this record
 

 
Author Cesar Isaza; Joaquin Salas; Bogdan Raducanu
Title Synthetic ground truth dataset to detect shadows cast by static objects in outdoors Type Conference Article
Year 2012 Publication 1st International Workshop on Visual Interfaces for Ground Truth Collection in Computer Vision Applications Abbreviated Journal
Volume Issue Pages art. 11
Keywords
Abstract In this paper, we propose a precise synthetic ground truth dataset to study the problem of detecting the shadows cast by static objects in outdoor environments over extended periods of time (days). For our dataset, we have created a virtual scenario using rendering software. To increase the realism of the simulated environment, we have placed the scenario at a precise geographical location. In our dataset the sun is by far the main illumination source. The sun position during the simulation takes into consideration factors related to the geographical location, such as the latitude, longitude, elevation above sea level, and the precise day and time of image capture. In our simulation the camera remains fixed. The dataset consists of seven days of simulation, from 10:00am to 5:00pm, with images captured every 10 seconds. The shadows' ground truth is computed automatically by the rendering software.
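To illustrate how latitude, longitude, day, and time determine the sun's position, here is the textbook low-precision solar-elevation approximation (a sketch; the paper's renderer presumably uses its own, more exact model):

    # Back-of-the-envelope solar elevation for a given place and time, showing
    # the geographic inputs the abstract mentions; textbook approximation only.
    import math

    def sun_elevation_deg(lat_deg, lon_deg, day_of_year, utc_hour):
        # Approximate solar declination for the given day of the year.
        decl = -23.44 * math.cos(math.radians(360.0 / 365.0 * (day_of_year + 10)))
        # Local solar time from UTC and longitude (15 degrees per hour),
        # ignoring the equation of time.
        solar_time = utc_hour + lon_deg / 15.0
        hour_angle = math.radians(15.0 * (solar_time - 12.0))
        lat, d = math.radians(lat_deg), math.radians(decl)
        sin_elev = (math.sin(lat) * math.sin(d)
                    + math.cos(lat) * math.cos(d) * math.cos(hour_angle))
        return math.degrees(math.asin(sin_elev))

    # Example: roughly Capri, Italy at 10:00 UTC near the summer solstice (day 172).
    print(sun_elevation_deg(40.55, 14.24, 172, 10.0))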
Address Capri; Italy
Corporate Author Thesis
Publisher ACM Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN 978-1-4503-1405-3 Medium
Area Expedition Conference VIGTA
Notes OR;MV Approved no
Call Number Admin @ si @ ISR2012a Serial 2037
Permanent link to this record
 

 
Author Cesar Isaza; Joaquin Salas; Bogdan Raducanu
Title Rendering ground truth data sets to detect shadows cast by static objects in outdoors Type Journal Article
Year 2014 Publication Multimedia Tools and Applications Abbreviated Journal MTAP
Volume 70 Issue 1 Pages 557-571
Keywords Synthetic ground truth data set; Sun position; Shadow detection; Static objects shadow detection
Abstract In our work, we are particularly interested in studying the shadows cast by static objects in outdoor environments during daytime. To assess the accuracy of a shadow detection algorithm, we need ground truth information. Collecting such information is a very tedious task because it requires manual annotation. To overcome this severe limitation, we propose in this paper a methodology to automatically render ground truth using a virtual environment. To increase the degree of realism and usefulness of the simulated environment, we incorporate into the scenario the precise longitude, latitude and elevation of the actual location of the object, as well as the sun's position for a given time and day. To evaluate our method, we consider both a qualitative and a quantitative comparison. In the quantitative one, we analyze the shadow cast by a real object in a particular geographical location and its corresponding rendered model. To evaluate the methodology qualitatively, we use ground truth images obtained both manually and automatically.
Address
Corporate Author Thesis
Publisher Springer US Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1380-7501 ISBN Medium
Area Expedition Conference
Notes LAMP Approved no
Call Number Admin @ si @ ISR2014 Serial 2229
Permanent link to this record
 

 
Author Cesar Isaza; Joaquin Salas; Bogdan Raducanu
Title Evaluation of Intrinsic Image Algorithms to Detect the Shadows Cast by Static Objects Outdoors Type Journal Article
Year 2012 Publication Sensors Abbreviated Journal SENS
Volume 12 Issue 10 Pages 13333-13348
Keywords
Abstract In some automatic scene analysis applications, the presence of shadows becomes a nuisance that must be dealt with. As a consequence, a preliminary stage in many computer vision algorithms is to attenuate their effect. In this paper, we focus our attention on the detection of shadows cast by static objects outdoors, as the scene is viewed for extended periods of time (days, weeks) from a fixed camera, considering daylight intervals where the main source of light is the sun. In this context, we report two contributions. First, we introduce the use of synthetic images for which ground truth can be generated automatically, avoiding the tedious effort of manual annotation. Second, we report a novel application of the intrinsic image concept to the automatic detection of shadows cast by static objects outdoors. We make both a quantitative and a qualitative evaluation of several algorithms based on this image representation. For the quantitative evaluation we used the synthetic dataset, while for the qualitative evaluation we used both datasets. Our experimental results show that the evaluated methods can partially solve the problem of shadow detection.
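For context, the intrinsic image concept referenced above factors the observed image into an illumination-invariant reflectance and a shading component; in LaTeX notation:

    % Intrinsic image decomposition: the observed image I at pixel x as the
    % product of reflectance R and shading S. Cast shadows live in S, which
    % is what makes the decomposition useful for shadow detection.
    I(x) = R(x) \cdot S(x)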
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes OR;MV Approved no
Call Number Admin @ si @ ISR2012b Serial 2173
Permanent link to this record
 

 
Author Cesar de Souza; Adrien Gaidon; Yohann Cabon; Naila Murray; Antonio Lopez
Title Generating Human Action Videos by Coupling 3D Game Engines and Probabilistic Graphical Models Type Journal Article
Year 2020 Publication International Journal of Computer Vision Abbreviated Journal IJCV
Volume 128 Issue Pages 1505–1536
Keywords Procedural generation; Human action recognition; Synthetic data; Physics
Abstract Deep video action recognition models have been highly successful in recent years but require large quantities of manually-annotated data, which are expensive and laborious to obtain. In this work, we investigate the generation of synthetic training data for video action recognition, as synthetic data have been successfully used to supervise models for a variety of other computer vision tasks. We propose an interpretable parametric generative model of human action videos that relies on procedural generation, physics models and other components of modern game engines. With this model we generate a diverse, realistic, and physically plausible dataset of human action videos, called PHAV for “Procedural Human Action Videos”. PHAV contains a total of 39,982 videos, with more than 1000 examples for each of 35 action categories. Our video generation approach is not limited to existing motion capture sequences: 14 of these 35 categories are procedurally-defined synthetic actions. In addition, each video is represented with 6 different data modalities, including RGB, optical flow and pixel-level semantic labels. These modalities are generated almost simultaneously using the Multiple Render Targets feature of modern GPUs. In order to leverage PHAV, we introduce a deep multi-task (i.e. that considers action classes from multiple datasets) representation learning architecture that is able to simultaneously learn from synthetic and real video datasets, even when their action categories differ. Our experiments on the UCF-101 and HMDB-51 benchmarks suggest that combining our large set of synthetic videos with small real-world datasets can boost recognition performance. Our approach also significantly outperforms video representations produced by fine-tuning state-of-the-art unsupervised generative models of videos.
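The multi-task idea mentioned in this abstract (one shared representation, one classification head per dataset, so synthetic and real label spaces may differ) can be sketched as follows; all sizes and the feature input are assumptions, not the paper's architecture:

    # Sketch of a multi-task video classifier with a shared trunk and one head
    # per dataset (e.g., PHAV vs. UCF-101); sizes are assumptions.
    import torch
    import torch.nn as nn

    class MultiTaskNet(nn.Module):
        def __init__(self, feat_dim=512, classes_per_dataset=(35, 101)):
            super().__init__()
            self.trunk = nn.Sequential(            # shared representation
                nn.Linear(feat_dim, 256), nn.ReLU(),
                nn.Linear(256, 256), nn.ReLU())
            self.heads = nn.ModuleList(            # one classifier per dataset
                [nn.Linear(256, c) for c in classes_per_dataset])

        def forward(self, x, dataset_id):
            return self.heads[dataset_id](self.trunk(x))

    # A batch from the synthetic dataset (id 0) and one from the real dataset
    # (id 1) share the trunk but are scored by their own heads.
    net = MultiTaskNet()
    logits_syn = net(torch.randn(8, 512), dataset_id=0)   # (8, 35)
    logits_real = net(torch.randn(8, 512), dataset_id=1)  # (8, 101)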
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.124; 600.118 Approved no
Call Number Admin @ si @ SGC2019 Serial 3303
Permanent link to this record
 

 
Author Cesar de Souza; Adrien Gaidon; Yohann Cabon; Antonio Lopez
Title Procedural Generation of Videos to Train Deep Action Recognition Networks Type Conference Article
Year 2017 Publication 30th IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 2594-2604
Keywords
Abstract Deep learning for human action recognition in videos is making significant progress, but is slowed down by its dependency on expensive manual labeling of large video collections. In this work, we investigate the generation of synthetic training data for action recognition, as it has recently shown promising results for a variety of other computer vision tasks. We propose an interpretable parametric generative model of human action videos that relies on procedural generation and other computer graphics techniques of modern game engines. We generate a diverse, realistic, and physically plausible dataset of human action videos, called PHAV for “Procedural Human Action Videos”. It contains a total of 39,982 videos, with more than 1,000 examples for each action of 35 categories. Our approach is not limited to existing motion capture sequences, and we procedurally define 14 synthetic actions. We introduce a deep multi-task representation learning architecture to mix synthetic and real videos, even if the action categories differ. Our experiments on the UCF101 and HMDB51 benchmarks suggest that combining our large set of synthetic videos with small real-world datasets can boost recognition performance, significantly outperforming fine-tuning state-of-the-art unsupervised generative models of videos.
Address Honolulu; Hawaii; July 2017
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes ADAS; 600.076; 600.085; 600.118 Approved no
Call Number Admin @ si @ SGC2017 Serial 3051
Permanent link to this record
 

 
Author Cesar de Souza; Adrien Gaidon; Eleonora Vig; Antonio Lopez
Title Sympathy for the Details: Dense Trajectories and Hybrid Classification Architectures for Action Recognition Type Conference Article
Year 2016 Publication 14th European Conference on Computer Vision Abbreviated Journal
Volume Issue Pages 697-716
Keywords
Abstract Action recognition in videos is a challenging task due to the complexity of the spatio-temporal patterns to model and the difficulty to acquire and learn on large quantities of video data. Deep learning, although a breakthrough for image classification and showing promise for videos, has still not clearly superseded action recognition methods using hand-crafted features, even when training on massive datasets. In this paper, we introduce hybrid video classification architectures based on carefully designed unsupervised representations of hand-crafted spatio-temporal features classified by supervised deep networks. As we show in our experiments on five popular benchmarks for action recognition, our hybrid model combines the best of both worlds: it is data efficient (trained on 150 to 10,000 short clips) and yet improves significantly on the state of the art, including recent deep models trained on millions of manually labelled images and videos.
Address Amsterdam; The Netherlands; October 2016
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ECCV
Notes ADAS; 600.076; 600.085 Approved no
Call Number Admin @ si @ SGV2016 Serial 2824
Permanent link to this record
 

 
Author Cesar de Souza; Adrien Gaidon; Eleonora Vig; Antonio Lopez
Title System and method for video classification using a hybrid unsupervised and supervised multi-layer architecture Type Patent
Year 2018 Publication US9946933B2 Abbreviated Journal
Volume Issue Pages
Keywords US9946933B2
Abstract A computer-implemented video classification method and system are disclosed. The method includes receiving an input video including a sequence of frames. At least one transformation of the input video is generated, each transformation including a sequence of frames. For the input video and each transformation, local descriptors are extracted from the respective sequence of frames. The local descriptors of the input video and each transformation are aggregated to form an aggregated feature vector with a first set of processing layers learned using unsupervised learning. An output classification value is generated for the input video, based on the aggregated feature vector with a second set of processing layers learned using supervised learning.
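The claimed pipeline (local descriptors, unsupervised aggregation layers, supervised classification layers) can be approximated with off-the-shelf components. Below, k-means bag-of-words stands in for the unsupervised aggregation and a small MLP for the supervised layers; the data, sizes, and both stand-ins are assumptions, not the patented method:

    # Sketch of the hybrid unsupervised/supervised pipeline in the claim above,
    # with k-means bag-of-words as the unsupervised stage and an MLP on top.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)
    # Hypothetical data: 100 videos, each with 200 local descriptors of dim 64.
    videos = [rng.normal(size=(200, 64)) for _ in range(100)]
    labels = rng.integers(0, 5, size=100)  # 5 hypothetical action classes

    # Unsupervised stage: learn a codebook, aggregate each video into a histogram.
    codebook = KMeans(n_clusters=32, n_init=10, random_state=0)
    codebook.fit(np.vstack(videos))
    def aggregate(v):
        counts = np.bincount(codebook.predict(v), minlength=32)
        return counts / counts.sum()
    X = np.stack([aggregate(v) for v in videos])

    # Supervised stage: classify the aggregated feature vectors.
    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
    clf.fit(X[:80], labels[:80])
    print(clf.score(X[80:], labels[80:]))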
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ SGV2018 Serial 3255
Permanent link to this record
 

 
Author Cesar de Souza
Title Action Recognition in Videos: Data-efficient approaches for supervised learning of human action classification models for video Type Book Whole
Year 2018 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal
Volume Issue Pages
Keywords
Abstract In this dissertation, we explore different ways to perform human action recognition in video clips. We focus on data efficiency, proposing new approaches that alleviate the need for laborious and time-consuming manual data annotation. In the first part of this dissertation, we start by analyzing previous state-of-the-art models, comparing their differences and similarities in order to pinpoint where their real strengths come from. Leveraging this information, we then proceed to boost the classification accuracy of shallow models to levels that rival deep neural networks. We introduce hybrid video classification architectures based on carefully designed unsupervised representations of handcrafted spatiotemporal features classified by supervised deep networks. We show in our experiments that our hybrid model combines the best of both worlds: it is data efficient (trained on 150 to 10,000 short clips) and yet improves significantly on the state of the art, including deep models trained on millions of manually labeled images and videos. In the second part of this research, we investigate the generation of synthetic training data for action recognition, as it has recently shown promising results for a variety of other computer vision tasks. We propose an interpretable parametric generative model of human action videos that relies on procedural generation and other computer graphics techniques of modern game engines. We generate a diverse, realistic, and physically plausible dataset of human action videos, called PHAV for “Procedural Human Action Videos”. It contains a total of 39,982 videos, with more than 1,000 examples for each action of 35 categories. Our approach is not limited to existing motion capture sequences, and we procedurally define 14 synthetic actions. We then introduce deep multi-task representation learning architectures to mix synthetic and real videos, even if the action categories differ. Our experiments on the UCF-101 and HMDB-51 benchmarks suggest that combining our large set of synthetic videos with small real-world datasets can boost recognition performance, outperforming fine-tuning state-of-the-art unsupervised generative models of videos.
Address April 2018
Corporate Author Thesis Ph.D. thesis
Publisher Ediciones Graficas Rey Place of Publication Editor Antonio Lopez; Naila Murray
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.118 Approved no
Call Number Admin @ si @ Sou2018 Serial 3127
Permanent link to this record
 

 
Author Carolina Malagelada; Michal Drozdzal; Santiago Segui; Sara Mendez; Jordi Vitria; Petia Radeva; Javier Santos; Anna Accarino; Juan R. Malagelada; Fernando Azpiroz
Title Classification of functional bowel disorders by objective physiological criteria based on endoluminal image analysis Type Journal Article
Year 2015 Publication American Journal of Physiology-Gastrointestinal and Liver Physiology Abbreviated Journal AJPGI
Volume 309 Issue 6 Pages G413-G419
Keywords capsule endoscopy; computer vision analysis; functional bowel disorders; intestinal motility; machine learning
Abstract We have previously developed an original method to evaluate small bowel motor function based on computer vision analysis of endoluminal images obtained by capsule endoscopy. Our aim was to demonstrate intestinal motor abnormalities in patients with functional bowel disorders by endoluminal vision analysis. Patients with functional bowel disorders (n = 205) and healthy subjects (n = 136) ingested the endoscopic capsule (Pillcam-SB2, Given-Imaging) after an overnight fast, and 45 min after gastric exit of the capsule a liquid meal (300 ml, 1 kcal/ml) was administered. Endoluminal image analysis was performed with computer vision and machine learning techniques to define the normal range and to identify clusters of abnormal function. After training the algorithm, we used 196 patients and 48 healthy subjects, completely naive to the algorithm, as the test set. In the test set, 51 patients (26%) were detected outside the normal range (P < 0.001 vs. 3 healthy subjects) and clustered into hypo- and hyperdynamic subgroups compared with healthy subjects. Patients with hypodynamic behavior (n = 38) exhibited fewer luminal closure sequences (41 ± 2% of the recording time vs. 61 ± 2%; P < 0.001) and more static sequences (38 ± 3 vs. 20 ± 2%; P < 0.001); in contrast, patients with hyperdynamic behavior (n = 13) had an increased proportion of luminal closure sequences (73 ± 4 vs. 61 ± 2%; P = 0.029) and more high-motion sequences (3 ± 1 vs. 0.5 ± 0.1%; P < 0.001). Applying an original methodology, we have developed a novel classification of functional gut disorders based on objective, physiological criteria of small bowel function.
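The analysis logic described here (fit a normal range on healthy subjects, flag patients outside it, cluster the abnormal cases) can be sketched with standard tools; the single feature and all numbers below are synthetic placeholders, not the study's data:

    # Sketch of the study's analysis logic: normal range from healthy subjects,
    # then clustering of abnormal patients into hypo-/hyperdynamic groups.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(1)
    # One hypothetical feature per subject: fraction of luminal-closure time.
    healthy = rng.normal(0.61, 0.05, size=136)
    patients = rng.normal(0.55, 0.15, size=205)

    # Normal range from healthy subjects (e.g., central 95%).
    lo, hi = np.percentile(healthy, [2.5, 97.5])
    abnormal = patients[(patients < lo) | (patients > hi)]

    # Cluster the abnormal patients into two motility phenotypes.
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(abnormal.reshape(-1, 1))
    for c in range(2):
        group = abnormal[km.labels_ == c]
        print(f"cluster {c}: n={group.size}, mean closure fraction={group.mean():.2f}")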
Address
Corporate Author Thesis
Publisher American Physiological Society Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; OR;MV Approved no
Call Number Admin @ si @ MDS2015 Serial 2666
Permanent link to this record
 

 
Author Carolina Malagelada; Fosca De Iorio; Fernando Azpiroz; Anna Accarino; Santiago Segui; Petia Radeva; Juan R. Malagelada
Title New Insight Into Intestinal Motor Function via Noninvasive Endoluminal Image Analysis Type Journal Article
Year 2008 Publication Gastroenterology Abbreviated Journal
Volume 135 Issue 4 Pages 1155–1162
Keywords
Abstract
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB Approved no
Call Number BCNPCL @ bcnpcl @ MDA2008 Serial 1040
Permanent link to this record