|
Records |
Links |
|
Author |
Sergio Escalera; Vassilis Athitsos; Isabelle Guyon |
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Challenges in Multi-modal Gesture Recognition |
Type |
Book Chapter |
|
Year |
2017 |
Publication |
|
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1-60 |
|
|
Keywords |
Gesture recognition; Time series analysis; Multimodal data analysis; Computer vision; Pattern recognition; Wearable sensors; Infrared cameras; Kinect TMTM |
|
|
Abstract |
This paper surveys the state of the art on multimodal gesture recognition and introduces the JMLR special topic on gesture recognition 2011–2015. We began right at the start of the Kinect TMTM revolution when inexpensive infrared cameras providing image depth recordings became available. We published papers using this technology and other more conventional methods, including regular video cameras, to record data, thus providing a good overview of uses of machine learning and computer vision using multimodal data in this area of application. Notably, we organized a series of challenges and made available several datasets we recorded for that purpose, including tens of thousands of videos, which are available to conduct further research. We also overview recent state of the art works on gesture recognition based on a proposed taxonomy for gesture recognition, discussing challenges and future lines of research. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA; no proj |
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ EAG2017 |
Serial |
3008 |
|
Permanent link to this record |
|
|
|
|
Author |
J. Elder; Fadi Dornaika; Y. Hou; R. Goldstein |
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Attentive wide-field sensing for visual telepresence and surveillance |
Type |
Book Chapter |
|
Year |
2005 |
Publication |
L. Itti, G. Rees and J. Tsotsos (editors), Neurobiology of Attention, Academic Press / Elsevier |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
|
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ EDH2005 |
Serial |
604 |
|
Permanent link to this record |
|
|
|
|
Author |
Sergio Escalera; Marti Soler; Stephane Ayache; Umut Guçlu; Jun Wan; Meysam Madadi; Xavier Baro; Hugo Jair Escalante; Isabelle Guyon |
![goto web page url](img/www.gif)
|
|
Title |
ChaLearn Looking at People: Inpainting and Denoising Challenges |
Type |
Book Chapter |
|
Year |
2019 |
Publication |
The Springer Series on Challenges in Machine Learning |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
23-44 |
|
|
Keywords |
|
|
|
Abstract |
Dealing with incomplete information is a well studied problem in the context of machine learning and computational intelligence. However, in the context of computer vision, the problem has only been studied in specific scenarios (e.g., certain types of occlusions in specific types of images), although it is common to have incomplete information in visual data. This chapter describes the design of an academic competition focusing on inpainting of images and video sequences that was part of the competition program of WCCI2018 and had a satellite event collocated with ECCV2018. The ChaLearn Looking at People Inpainting Challenge aimed at advancing the state of the art on visual inpainting by promoting the development of methods for recovering missing and occluded information from images and video. Three tracks were proposed in which visual inpainting might be helpful but still challenging: human body pose estimation, text overlays removal and fingerprint denoising. This chapter describes the design of the challenge, which includes the release of three novel datasets, and the description of evaluation metrics, baselines and evaluation protocol. The results of the challenge are analyzed and discussed in detail and conclusions derived from this event are outlined. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HuPBA; no proj |
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ ESA2019 |
Serial |
3327 |
|
Permanent link to this record |
|
|
|
|
Author |
Sergio Escalera; David M.J. Tax; Oriol Pujol; Petia Radeva; Robert P.W. Duin |
![goto web page (via DOI) doi](img/doi.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Multi-Class Classification in Image Analysis Via Error-Correcting Output Codes |
Type |
Book Chapter |
|
Year |
2011 |
Publication |
Innovations in Intelligent Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
339 |
Issue |
|
Pages |
7-29 |
|
|
Keywords |
|
|
|
Abstract |
A common way to model multi-class classification problems is by means of Error-Correcting Output Codes (ECOC). Given a multi-class problem, the ECOC technique designs a codeword for each class, where each position of the code identifies the membership of the class for a given binary problem.A classification decision is obtained by assigning the label of the class with the closest code. In this paper, we overview the state-of-the-art on ECOC designs and test them in real applications. Results on different multi-class data sets show the benefits of using the ensemble of classifiers when categorizing objects in images. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
Berlin |
Editor |
H. Kawasnicka; L.Jain |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1860-949X |
ISBN |
978-3-642-17933-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB;HuPBA |
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ ETP2011 |
Serial |
1746 |
|
Permanent link to this record |
|
|
|
|
Author |
Sergio Escalera; Markus Weimer; Mikhail Burtsev; Valentin Malykh; Varvara Logacheva; Ryan Lowe; Iulian Vlad Serban; Yoshua Bengio; Alexander Rudnicky; Alan W. Black; Shrimai Prabhumoye; Łukasz Kidzinski; Mohanty Sharada; Carmichael Ong; Jennifer Hicks; Sergey Levine; Marcel Salathe; Scott Delp; Iker Huerga; Alexander Grigorenko; Leifur Thorbergsson; Anasuya Das; Kyla Nemitz; Jenna Sandker; Stephen King; Alexander S. Ecker; Leon A. Gatys; Matthias Bethge; Jordan Boyd Graber; Shi Feng; Pedro Rodriguez; Mohit Iyyer; He He; Hal Daume III; Sean McGregor; Amir Banifatemi; Alexey Kurakin; Ian Goodfellow; Samy Bengio |
![goto web page url](img/www.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Introduction to NIPS 2017 Competition Track |
Type |
Book Chapter |
|
Year |
2018 |
Publication |
The NIPS ’17 Competition: Building Intelligent Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1-23 |
|
|
Keywords |
|
|
|
Abstract |
Competitions have become a popular tool in the data science community to solve hard problems, assess the state of the art and spur new research directions. Companies like Kaggle and open source platforms like Codalab connect people with data and a data science problem to those with the skills and means to solve it. Hence, the question arises: What, if anything, could NIPS add to this rich ecosystem?
In 2017, we embarked to find out. We attracted 23 potential competitions, of which we selected five to be NIPS 2017 competitions. Our final selection features competitions advancing the state of the art in other sciences such as “Classifying Clinically Actionable Genetic Mutations” and “Learning to Run”. Others, like “The Conversational Intelligence Challenge” and “Adversarial Attacks and Defences” generated new data sets that we expect to impact the progress in their respective communities for years to come. And “Human-Computer Question Answering Competition” showed us just how far we as a field have come in ability and efficiency since the break-through performance of Watson in Jeopardy. Two additional competitions, DeepArt and AI XPRIZE Milestions, were also associated to the NIPS 2017 competition track, whose results are also presented within this chapter. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer |
Place of Publication |
|
Editor |
Sergio Escalera; Markus Weimer |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-319-94042-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
HUPBA; no proj |
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ EWB2018 |
Serial |
3200 |
|
Permanent link to this record |
|
|
|
|
Author |
Miquel Ferrer; I. Bardaji; Ernest Valveny; Dimosthenis Karatzas; Horst Bunke |
![goto web page (via DOI) doi](img/doi.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Median Graph Computation by Means of Graph Embedding into Vector Spaces |
Type |
Book Chapter |
|
Year |
2013 |
Publication |
Graph Embedding for Pattern Analysis |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
45-72 |
|
|
Keywords |
|
|
|
Abstract |
In pattern recognition [8, 14], a key issue to be addressed when designing a system is how to represent input patterns. Feature vectors is a common option. That is, a set of numerical features describing relevant properties of the pattern are computed and arranged in a vector form. The main advantages of this kind of representation are computational simplicity and a well sound mathematical foundation. Thus, a large number of operations are available to work with vectors and a large repository of algorithms for pattern analysis and classification exist. However, the simple structure of feature vectors might not be the best option for complex patterns where nonnumerical features or relations between different parts of the pattern become relevant. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer New York |
Place of Publication |
|
Editor |
Yun Fu; Yungian Ma |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-4614-4456-5 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ FBV2013 |
Serial |
2421 |
|
Permanent link to this record |
|
|
|
|
Author |
Carles Fernandez; Jordi Gonzalez; Joao Manuel R. S. Taveres; Xavier Roca |
![download PDF file pdf](img/file_PDF.gif)
![find book details (via ISBN) isbn](img/isbn.gif)
|
|
Title |
Towards Ontological Cognitive System |
Type |
Book Chapter |
|
Year |
2013 |
Publication |
Topics in Medical Image Processing and Computational Vision |
Abbreviated Journal |
|
|
|
Volume |
8 |
Issue |
|
Pages |
87-99 |
|
|
Keywords |
|
|
|
Abstract |
The increasing ubiquitousness of digital information in our daily lives has positioned video as a favored information vehicle, and given rise to an astonishing generation of social media and surveillance footage. This raises a series of technological demands for automatic video understanding and management, which together with the compromising attentional limitations of human operators, have motivated the research community to guide its steps towards a better attainment of such capabilities. As a result, current trends on cognitive vision promise to recognize complex events and self-adapt to different environments, while managing and integrating several types of knowledge. Future directions suggest to reinforce the multi-modal fusion of information sources and the communication with end-users. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Netherlands |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
2212-9391 |
ISBN |
978-94-007-0725-2 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE; 605.203; 302.018; 600.049 |
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ FGT2013 |
Serial |
2287 |
|
Permanent link to this record |
|
|
|
|
Author |
Alicia Fornes; V.C.Kieu; M. Visani; N.Journet; Anjan Dutta |
![goto web page (via DOI) doi](img/doi.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
The ICDAR/GREC 2013 Music Scores Competition: Staff Removal |
Type |
Book Chapter |
|
Year |
2014 |
Publication |
Graphics Recognition. Current Trends and Challenges |
Abbreviated Journal |
|
|
|
Volume |
8746 |
Issue |
|
Pages |
207-220 |
|
|
Keywords |
Competition; Graphics recognition; Music scores; Writer identification; Staff removal |
|
|
Abstract |
The first competition on music scores that was organized at ICDAR and GREC in 2011 awoke the interest of researchers, who participated in both staff removal and writer identification tasks. In this second edition, we focus on the staff removal task and simulate a real case scenario concerning old and degraded music scores. For this purpose, we have generated a new set of semi-synthetic images using two degradation models that we previously introduced: local noise and 3D distortions. In this extended paper we provide an extended description of the dataset, degradation models, evaluation metrics, the participant’s methods and the obtained results that could not be presented at ICDAR and GREC proceedings due to page limitations. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
B.Lamiroy; J.-M. Ogier |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-662-44853-3 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.077; 600.061 |
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ FKV2014 |
Serial |
2581 |
|
Permanent link to this record |
|
|
|
|
Author |
Alicia Fornes; Josep Llados; Joana Maria Pujadas-Mora |
![goto web page url](img/www.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Browsing of the Social Network of the Past: Information Extraction from Population Manuscript Images |
Type |
Book Chapter |
|
Year |
2020 |
Publication |
Handwritten Historical Document Analysis, Recognition, and Retrieval – State of the Art and Future Trends |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
World Scientific |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-981-120-323-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.140; 600.121 |
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ FLP2020 |
Serial |
3350 |
|
Permanent link to this record |
|
|
|
|
Author |
Alicia Fornes; Josep Llados; Gemma Sanchez; Horst Bunke |
![goto web page (via DOI) doi](img/doi.gif)
|
|
Title |
Writer Identification in Old Handwritten Music Scores |
Type |
Book Chapter |
|
Year |
2012 |
Publication |
Pattern Recognition and Signal Processing in Archaeometry: Mathematical and Computational Solutions for Archaeology |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
27-63 |
|
|
Keywords |
|
|
|
Abstract |
The aim of writer identification is determining the writer of a piece of handwriting from a set of writers. In this paper we present a system for writer identification in old handwritten music scores. Even though an important amount of compositions contains handwritten text in the music scores, the aim of our work is to use only music notation to determine the author. The steps of the system proposed are the following. First of all, the music sheet is preprocessed and normalized for obtaining a single binarized music line, without the staff lines. Afterwards, 100 features are extracted for every music line, which are subsequently used in a k-NN classifier that compares every feature vector with prototypes stored in a database. By applying feature selection and extraction methods on the original feature set, the performance is increased. The proposed method has been tested on a database of old music scores from the 17th to 19th centuries, achieving a recognition rate of about 95%. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
IGI-Global |
Place of Publication |
|
Editor |
Copnstantin Papaodysseus |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ FLS2012 |
Serial |
1828 |
|
Permanent link to this record |
|
|
|
|
Author |
Alicia Fornes; Gemma Sanchez |
![goto web page (via DOI) doi](img/doi.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Analysis and Recognition of Music Scores |
Type |
Book Chapter |
|
Year |
2014 |
Publication |
Handbook of Document Image Processing and Recognition |
Abbreviated Journal |
|
|
|
Volume |
E |
Issue |
|
Pages |
749-774 |
|
|
Keywords |
|
|
|
Abstract |
The analysis and recognition of music scores has attracted the interest of researchers for decades. Optical Music Recognition (OMR) is a classical research field of Document Image Analysis and Recognition (DIAR), whose aim is to extract information from music scores. Music scores contain both graphical and textual information, and for this reason, techniques are closely related to graphics recognition and text recognition. Since music scores use a particular diagrammatic notation that follow the rules of music theory, many approaches make use of context information to guide the recognition and solve ambiguities. This chapter overviews the main Optical Music Recognition (OMR) approaches. Firstly, the different methods are grouped according to the OMR stages, namely, staff removal, music symbol recognition, and syntactical analysis. Secondly, specific approaches for old and handwritten music scores are reviewed. Finally, online approaches and commercial systems are also commented. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer London |
Place of Publication |
|
Editor |
D. Doermann; K. Tombre |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-0-85729-860-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; ADAS; 600.076; 600.077 |
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ FoS2014 |
Serial |
2484 |
|
Permanent link to this record |
|
|
|
|
Author |
Miquel Ferrer; F. Serratosa; A. Sanfeliu |
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Synthesis of median spectral graph |
Type |
Book Chapter |
|
Year |
2005 |
Publication |
Pattern Recognition and Image Analysis (IbPRIA´05), LNCS, 3523: 139 146 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Estoril (Portugal) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
|
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ FSS2005 |
Serial |
656 |
|
Permanent link to this record |
|
|
|
|
Author |
Lluis Gomez; Dena Bazazian; Dimosthenis Karatzas |
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Historical review of scene text detection research |
Type |
Book Chapter |
|
Year |
2020 |
Publication |
Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer |
Place of Publication |
|
Editor |
K. Alahari; C.V. Jawahar |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
Series on Advances in Computer Vision and Pattern Recognition |
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.121 |
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ GBK2020 |
Serial |
3495 |
|
Permanent link to this record |
|
|
|
|
Author |
Abel Gonzalez-Garcia; Robert Benavente; Olivier Penacchio; Javier Vazquez; Maria Vanrell; C. Alejandro Parraga |
![download PDF file pdf](img/file_PDF.gif)
![find book details (via ISBN) isbn](img/isbn.gif)
|
|
Title |
Coloresia: An Interactive Colour Perception Device for the Visually Impaired |
Type |
Book Chapter |
|
Year |
2013 |
Publication |
Multimodal Interaction in Image and Video Applications |
Abbreviated Journal |
|
|
|
Volume |
48 |
Issue |
|
Pages |
47-66 |
|
|
Keywords |
|
|
|
Abstract |
A significative percentage of the human population suffer from impairments in their capacity to distinguish or even see colours. For them, everyday tasks like navigating through a train or metro network map becomes demanding. We present a novel technique for extracting colour information from everyday natural stimuli and presenting it to visually impaired users as pleasant, non-invasive sound. This technique was implemented inside a Personal Digital Assistant (PDA) portable device. In this implementation, colour information is extracted from the input image and categorised according to how human observers segment the colour space. This information is subsequently converted into sound and sent to the user via speakers or headphones. In the original implementation, it is possible for the user to send its feedback to reconfigure the system, however several features such as these were not implemented because the current technology is limited.We are confident that the full implementation will be possible in the near future as PDA technology improves. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1868-4394 |
ISBN |
978-3-642-35931-6 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
CIC; 600.052; 605.203 |
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ GBP2013 |
Serial |
2266 |
|
Permanent link to this record |
|
|
|
|
Author |
Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas |
![download PDF file pdf](img/file_PDF.gif)
![find record details (via OpenURL) openurl](img/xref.gif)
|
|
Title |
Self-Supervised Learning from Web Data for Multimodal Retrieval |
Type |
Book Chapter |
|
Year |
2019 |
Publication |
Multi-Modal Scene Understanding Book |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
279-306 |
|
|
Keywords |
self-supervised learning; webly supervised learning; text embeddings; multimodal retrieval; multimodal embedding |
|
|
Abstract |
Self-Supervised learning from multimodal image and text data allows deep neural networks to learn powerful features with no need of human annotated data. Web and Social Media platforms provide a virtually unlimited amount of this multimodal data. In this work we propose to exploit this free available data to learn a multimodal image and text embedding, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the proposed pipeline can learn from images with associated text without supervision and analyze the semantic structure of the learnt joint image and text embeddingspace. Weperformathoroughanalysisandperformancecomparisonoffivedifferentstateof the art text embeddings in three different benchmarks. We show that the embeddings learnt with Web and Social Media data have competitive performances over supervised methods in the text basedimageretrievaltask,andweclearlyoutperformstateoftheartintheMIRFlickrdatasetwhen training in the target data. Further, we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed by Instagram images and their associated texts that can be used for fair comparison of image-text embeddings. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.129; 601.338; 601.310 |
Approved |
no |
|
|
Call Number ![sorted by Call Number field, ascending order (up)](img/sort_asc.gif) |
Admin @ si @ GGG2019 |
Serial |
3266 |
|
Permanent link to this record |