toggle visibility Search & Display Options

Select All    Deselect All
 |   | 
Details
   print
  Records Links
Author Cesar de Souza edit  openurl
  Title Action Recognition in Videos: Data-efficient approaches for supervised learning of human action classification models for video Type Book Whole
  Year 2018 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract In this dissertation, we explore different ways to perform human action recognition in video clips. We focus on data efficiency, proposing new approaches that alleviate the need for laborious and time-consuming manual data annotation. In the first part of this dissertation, we start by analyzing previous state-of-the-art models, comparing their differences and similarities in order to pinpoint where their real strengths come from. Leveraging this information, we then proceed to boost the classification accuracy of shallow models to levels that rival deep neural networks. We introduce hybrid video classification architectures based on carefully designed unsupervised representations of handcrafted spatiotemporal features classified by supervised deep networks. We show in our experiments that our hybrid model combine the best of both worlds: it is data efficient (trained on 150 to 10,000 short clips) and yet improved significantly on the state of the art, including deep models trained on millions of manually labeled images and videos. In the second part of this research, we investigate the generation of synthetic training data for action recognition, as it has recently shown promising results for a variety of other computer vision tasks. We propose an interpretable parametric generative model of human action videos that relies on procedural generation and other computer graphics techniques of modern game engines. We generate a diverse, realistic, and physically plausible dataset of human action videos, called PHAV for “Procedural Human Action Videos”. It contains a total of 39,982 videos, with more than 1,000 examples for each action of 35 categories. Our approach is not limited to existing motion capture sequences, and we procedurally define 14 synthetic actions. We then introduce deep multi-task representation learning architectures to mix synthetic and real videos, even if the action categories differ. Our experiments on the UCF-101 and HMDB-51 benchmarks suggest that combining our large set of synthetic videos with small real-world datasets can boost recognition performance, outperforming fine-tuning state-of-the-art unsupervised generative models of videos.  
  Address April 2018  
  Corporate Author Thesis Ph.D. thesis  
  Publisher (up) Ediciones Graficas Rey Place of Publication Editor Antonio Lopez;Naila Murray  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN Medium  
  Area Expedition Conference  
  Notes ADAS; 600.118 Approved no  
  Call Number Admin @ si @ Sou2018 Serial 3127  
Permanent link to this record
 

 
Author Suman Ghosh edit  isbn
openurl 
  Title Word Spotting and Recognition in Images from Heterogeneous Sources A Type Book Whole
  Year 2018 Publication PhD Thesis, Universitat Autonoma de Barcelona-CVC Abbreviated Journal  
  Volume Issue Pages  
  Keywords  
  Abstract Text is the most common way of information sharing from ages. With recent development of personal images databases and handwritten historic manuscripts the demand for algorithms to make these databases accessible for browsing and indexing are in rise. Enabling search or understanding large collection of manuscripts or image databases needs fast and robust methods. Researchers have found different ways to represent cropped words for understanding and matching, which works well when words are already segmented. However there is no trivial way to extend these for non-segmented documents. In this thesis we explore different methods for text retrieval and recognition from unsegmented document and scene images. Two different ways of representation exist in literature, one uses a fixed length representation learned from cropped words and another a sequence of features of variable length. Throughout this thesis, we have studied both these representation for their suitability in segmentation free understanding of text. In the first part we are focused on segmentation free word spotting using a fixed length representation. We extended the use of the successful PHOC (Pyramidal Histogram of Character) representation to segmentation free retrieval. In the second part of the thesis, we explore sequence based features and finally, we propose a unified solution where the same framework can generate both kind of representations.  
  Address November 2018  
  Corporate Author Thesis Ph.D. thesis  
  Publisher (up) Ediciones Graficas Rey Place of Publication Editor Ernest Valveny  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-84-948531-0-4 Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ Gho2018 Serial 3217  
Permanent link to this record
 

 
Author Sergio Escalera; Markus Weimer; Mikhail Burtsev; Valentin Malykh; Varvara Logacheva; Ryan Lowe; Iulian Vlad Serban; Yoshua Bengio; Alexander Rudnicky; Alan W. Black; Shrimai Prabhumoye; Łukasz Kidzinski; Mohanty Sharada; Carmichael Ong; Jennifer Hicks; Sergey Levine; Marcel Salathe; Scott Delp; Iker Huerga; Alexander Grigorenko; Leifur Thorbergsson; Anasuya Das; Kyla Nemitz; Jenna Sandker; Stephen King; Alexander S. Ecker; Leon A. Gatys; Matthias Bethge; Jordan Boyd Graber; Shi Feng; Pedro Rodriguez; Mohit Iyyer; He He; Hal Daume III; Sean McGregor; Amir Banifatemi; Alexey Kurakin; Ian Goodfellow; Samy Bengio edit  url
isbn  openurl
  Title Introduction to NIPS 2017 Competition Track Type Book Chapter
  Year 2018 Publication The NIPS ’17 Competition: Building Intelligent Systems Abbreviated Journal  
  Volume Issue Pages 1-23  
  Keywords  
  Abstract Competitions have become a popular tool in the data science community to solve hard problems, assess the state of the art and spur new research directions. Companies like Kaggle and open source platforms like Codalab connect people with data and a data science problem to those with the skills and means to solve it. Hence, the question arises: What, if anything, could NIPS add to this rich ecosystem?

In 2017, we embarked to find out. We attracted 23 potential competitions, of which we selected five to be NIPS 2017 competitions. Our final selection features competitions advancing the state of the art in other sciences such as “Classifying Clinically Actionable Genetic Mutations” and “Learning to Run”. Others, like “The Conversational Intelligence Challenge” and “Adversarial Attacks and Defences” generated new data sets that we expect to impact the progress in their respective communities for years to come. And “Human-Computer Question Answering Competition” showed us just how far we as a field have come in ability and efficiency since the break-through performance of Watson in Jeopardy. Two additional competitions, DeepArt and AI XPRIZE Milestions, were also associated to the NIPS 2017 competition track, whose results are also presented within this chapter.
 
  Address  
  Corporate Author Thesis  
  Publisher (up) Springer Place of Publication Editor Sergio Escalera; Markus Weimer  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-319-94042-7 Medium  
  Area Expedition Conference  
  Notes HUPBA; no proj Approved no  
  Call Number Admin @ si @ EWB2018 Serial 3200  
Permanent link to this record
 

 
Author Arnau Baro; Pau Riba; Jorge Calvo-Zaragoza; Alicia Fornes edit   pdf
doi  isbn
openurl 
  Title Optical Music Recognition by Long Short-Term Memory Networks Type Book Chapter
  Year 2018 Publication Graphics Recognition. Current Trends and Evolutions Abbreviated Journal  
  Volume 11009 Issue Pages 81-95  
  Keywords Optical Music Recognition; Recurrent Neural Network; Long ShortTerm Memory  
  Abstract Optical Music Recognition refers to the task of transcribing the image of a music score into a machine-readable format. Many music scores are written in a single staff, and therefore, they could be treated as a sequence. Therefore, this work explores the use of Long Short-Term Memory (LSTM) Recurrent Neural Networks for reading the music score sequentially, where the LSTM helps in keeping the context. For training, we have used a synthetic dataset of more than 40000 images, labeled at primitive level. The experimental results are promising, showing the benefits of our approach.  
  Address  
  Corporate Author Thesis  
  Publisher (up) Springer Place of Publication Editor A. Fornes, B. Lamiroy  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-030-02283-9 Medium  
  Area Expedition Conference GREC  
  Notes DAG; 600.097; 601.302; 601.330; 600.121 Approved no  
  Call Number Admin @ si @ BRC2018 Serial 3227  
Permanent link to this record
 

 
Author Alicia Fornes; Bart Lamiroy edit  url
isbn  openurl
  Title Graphics Recognition, Current Trends and Evolutions Type Book Whole
  Year 2018 Publication Graphics Recognition, Current Trends and Evolutions Abbreviated Journal  
  Volume 11009 Issue Pages  
  Keywords  
  Abstract This book constitutes the thoroughly refereed post-conference proceedings of the 12th International Workshop on Graphics Recognition, GREC 2017, held in Kyoto, Japan, in November 2017.
The 10 revised full papers presented were carefully reviewed and selected from 14 initial submissions. They contain both classical and emerging topics of graphics rcognition, namely analysis and detection of diagrams, search and classification, optical music recognition, interpretation of engineering drawings and maps.
 
  Address  
  Corporate Author Thesis  
  Publisher (up) Springer International Publishing Place of Publication Editor  
  Language Summary Language Original Title  
  Series Editor Series Title Abbreviated Series Title LNCS  
  Series Volume Series Issue Edition  
  ISSN ISBN 978-3-030-02283-9 Medium  
  Area Expedition Conference  
  Notes DAG; 600.121 Approved no  
  Call Number Admin @ si @ FoL2018 Serial 3171  
Permanent link to this record
Select All    Deselect All
 |   | 
Details
   print

Save Citations:
Export Records: