|
Antonio Lopez, David Vazquez, & Gabriel Villalonga. (2018). Data for Training Models, Domain Adaptation. In Intelligent Vehicles. Enabling Technologies and Future Developments (395–436).
Abstract: Simulation can enable several developments in the field of intelligent vehicles. This chapter is divided into three main subsections. The first one deals with driving simulators. The continuous improvement of hardware performance is a well-known fact that is allowing the development of more complex driving simulators. The immersion in the simulation scene is increased by high fidelity feedback to the driver. In the second subsection, traffic simulation is explained as well as how it can be used for intelligent transport systems. Finally, it is rather clear that sensor-based perception and action must be based on data-driven algorithms. Simulation could provide data to train and test algorithms that are afterwards implemented in vehicles. These tools are explained in the third subsection.
Keywords: Driving simulator; hardware; software; interface; traffic simulation; macroscopic simulation; microscopic simulation; virtual data; training data
|
|
|
Sergio Escalera, Markus Weimer, Mikhail Burtsev, Valentin Malykh, Varvara Logacheva, Ryan Lowe, et al. (2018). Introduction to NIPS 2017 Competition Track. In Sergio Escalera, & Markus Weimer (Eds.), The NIPS ’17 Competition: Building Intelligent Systems (pp. 1–23). Springer.
Abstract: Competitions have become a popular tool in the data science community to solve hard problems, assess the state of the art and spur new research directions. Companies like Kaggle and open source platforms like Codalab connect people with data and a data science problem to those with the skills and means to solve it. Hence, the question arises: What, if anything, could NIPS add to this rich ecosystem?
In 2017, we embarked to find out. We attracted 23 potential competitions, of which we selected five to be NIPS 2017 competitions. Our final selection features competitions advancing the state of the art in other sciences such as “Classifying Clinically Actionable Genetic Mutations” and “Learning to Run”. Others, like “The Conversational Intelligence Challenge” and “Adversarial Attacks and Defences” generated new data sets that we expect to impact the progress in their respective communities for years to come. And “Human-Computer Question Answering Competition” showed us just how far we as a field have come in ability and efficiency since the break-through performance of Watson in Jeopardy. Two additional competitions, DeepArt and AI XPRIZE Milestions, were also associated to the NIPS 2017 competition track, whose results are also presented within this chapter.
|
|
|
Rain Eric Haamer, Eka Rusadze, Iiris Lusi, Tauseef Ahmed, Sergio Escalera, & Gholamreza Anbarjafari. (2018). Review on Emotion Recognition Databases. In Human-Robot Interaction: Theory and Application.
Abstract: Over the past few decades human-computer interaction has become more important in our daily lives and research has developed in many directions: memory research, depression detection, and behavioural deficiency detection, lie detection, (hidden) emotion recognition etc. Because of that, the number of generic emotion and face databases or those tailored to specific needs have grown immensely large. Thus, a comprehensive yet compact guide is needed to help researchers find the most suitable database and understand what types of databases already exist. In this paper, different elicitation methods are discussed and the databases are primarily organized into neat and informative tables based on the format.
Keywords: emotion; computer vision; databases
|
|
|
Antonio Lopez. (2018). Pedestrian Detection Systems. In Wiley Encyclopedia of Electrical and Electronics Engineering.
Abstract: Pedestrian detection is a highly relevant topic for both advanced driver assistance systems (ADAS) and autonomous driving. In this entry, we review the ideas behind pedestrian detection systems from the point of view of perception based on computer vision and machine learning.
|
|
|
Raul Gomez, Lluis Gomez, Jaume Gibert, & Dimosthenis Karatzas. (2019). Self-Supervised Learning from Web Data for Multimodal Retrieval. In Multi-Modal Scene Understanding Book (pp. 279–306).
Abstract: Self-Supervised learning from multimodal image and text data allows deep neural networks to learn powerful features with no need of human annotated data. Web and Social Media platforms provide a virtually unlimited amount of this multimodal data. In this work we propose to exploit this free available data to learn a multimodal image and text embedding, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the proposed pipeline can learn from images with associated text without supervision and analyze the semantic structure of the learnt joint image and text embeddingspace. Weperformathoroughanalysisandperformancecomparisonoffivedifferentstateof the art text embeddings in three different benchmarks. We show that the embeddings learnt with Web and Social Media data have competitive performances over supervised methods in the text basedimageretrievaltask,andweclearlyoutperformstateoftheartintheMIRFlickrdatasetwhen training in the target data. Further, we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed by Instagram images and their associated texts that can be used for fair comparison of image-text embeddings.
Keywords: self-supervised learning; webly supervised learning; text embeddings; multimodal retrieval; multimodal embedding
|
|
|
Sergio Escalera, Marti Soler, Stephane Ayache, Umut Guçlu, Jun Wan, Meysam Madadi, et al. (2019). ChaLearn Looking at People: Inpainting and Denoising Challenges. In The Springer Series on Challenges in Machine Learning (pp. 23–44).
Abstract: Dealing with incomplete information is a well studied problem in the context of machine learning and computational intelligence. However, in the context of computer vision, the problem has only been studied in specific scenarios (e.g., certain types of occlusions in specific types of images), although it is common to have incomplete information in visual data. This chapter describes the design of an academic competition focusing on inpainting of images and video sequences that was part of the competition program of WCCI2018 and had a satellite event collocated with ECCV2018. The ChaLearn Looking at People Inpainting Challenge aimed at advancing the state of the art on visual inpainting by promoting the development of methods for recovering missing and occluded information from images and video. Three tracks were proposed in which visual inpainting might be helpful but still challenging: human body pose estimation, text overlays removal and fingerprint denoising. This chapter describes the design of the challenge, which includes the release of three novel datasets, and the description of evaluation metrics, baselines and evaluation protocol. The results of the challenge are analyzed and discussed in detail and conclusions derived from this event are outlined.
|
|
|
Alicia Fornes, Josep Llados, & Joana Maria Pujadas-Mora. (2020). Browsing of the Social Network of the Past: Information Extraction from Population Manuscript Images. In Handwritten Historical Document Analysis, Recognition, and Retrieval – State of the Art and Future Trends. World Scientific.
|
|
|
Joana Maria Pujadas-Mora, Alicia Fornes, Josep Llados, Gabriel Brea-Martinez, & Miquel Valls-Figols. (2019). The Baix Llobregat (BALL) Demographic Database, between Historical Demography and Computer Vision (nineteenth–twentieth centuries. In Nominative Data in Demographic Research in the East and the West: monograph (pp. 29–61).
Abstract: The Baix Llobregat (BALL) Demographic Database is an ongoing database project containing individual census data from the Catalan region of Baix Llobregat (Spain) during the nineteenth and twentieth centuries. The BALL Database is built within the project ‘NETWORKS: Technology and citizen innovation for building historical social networks to understand the demographic past’ directed by Alícia Fornés from the Center for Computer Vision and Joana Maria Pujadas-Mora from the Center for Demographic Studies, both at the Universitat Autònoma de Barcelona, funded by the Recercaixa program (2017–2019).
Its webpage is http://dag.cvc.uab.es/xarxes/.The aim of the project is to develop technologies facilitating massive digitalization of demographic sources, and more specifically the padrones (local censuses), in order to reconstruct historical ‘social’ networks employing computer vision technology. Such virtual networks can be created thanks to the linkage of nominative records compiled in the local censuses across time and space. Thus, digitized versions of individual and family lifespans are established, and individuals and families can be located spatially.
|
|
|
Estefania Talavera, Alexandre Cola, Nicolai Petkov, & Petia Radeva. (2019). Towards Egocentric Person Re-identification and Social Pattern Analysis. In Frontiers in Artificial Intelligence and Applications (Vol. 310, pp. 203–211).
Abstract: CoRR abs/1905.04073
Wearable cameras capture a first-person view of the daily activities of the camera wearer, offering a visual diary of the user behaviour. Detection of the appearance of people the camera user interacts with for social interactions analysis is of high interest. Generally speaking, social events, lifestyle and health are highly correlated, but there is a lack of tools to monitor and analyse them. We consider that egocentric vision provides a tool to obtain information and understand users social interactions. We propose a model that enables us to evaluate and visualize social traits obtained by analysing social interactions appearance within egocentric photostreams. Given sets of egocentric images, we detect the appearance of faces within the days of the camera wearer, and rely on clustering algorithms to group their feature descriptors in order to re-identify persons. Recurrence of detected faces within photostreams allows us to shape an idea of the social pattern of behaviour of the user. We validated our model over several weeks recorded by different camera wearers. Our findings indicate that social profiles are potentially useful for social behaviour interpretation.
|
|
|
Lluis Gomez, Anguelos Nicolaou, Marçal Rusiñol, & Dimosthenis Karatzas. (2020). 12 years of ICDAR Robust Reading Competitions: The evolution of reading systems for unconstrained text understanding. In K. Alahari, & C.V. Jawahar (Eds.), Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis. Series on Advances in Computer Vision and Pattern Recognition. Springer.
|
|
|
Lluis Gomez, Dena Bazazian, & Dimosthenis Karatzas. (2020). Historical review of scene text detection research. In K. Alahari, & C.V. Jawahar (Eds.), Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis. Series on Advances in Computer Vision and Pattern Recognition. Springer.
|
|
|
Jon Almazan, Lluis Gomez, Suman Ghosh, Ernest Valveny, & Dimosthenis Karatzas. (2020). WATTS: A common representation of word images and strings using embedded attributes for text recognition and retrieval. In K. A. Analysis”, & C.V. Jawahar (Eds.), Visual Text Interpretation – Algorithms and Applications in Scene Understanding and Document Analysis. Series on Advances in Computer Vision and Pattern Recognition. Springer.
|
|
|
Patricia Suarez, Angel Sappa, & Boris X. Vintimilla. (2021). Deep learning-based vegetation index estimation. In A.Solanki, A.Nayyar, & M.Naved (Eds.), Generative Adversarial Networks for Image-to-Image Translation (pp. 205–234). Elsevier.
|
|
|
Debora Gil, Oriol Ramos Terrades, & Raquel Perez. (2021). Topological Radiomics (TOPiomics): Early Detection of Genetic Abnormalities in Cancer Treatment Evolution. In Extended Abstracts GEOMVAP 2019, Trends in Mathematics 15 (Vol. 15, 89–93). Springer Nature.
Abstract: Abnormalities in radiomic measures correlate to genomic alterations prone to alter the outcome of personalized anti-cancer treatments. TOPiomics is a new method for the early detection of variations in tumor imaging phenotype from a topological structure in multi-view radiomic spaces.
|
|
|
Jun Wan, Guodong Guo, Sergio Escalera, Hugo Jair Escalante, & Stan Z Li. (2023). Best Solutions Proposed in the Context of the Face Anti-spoofing Challenge Series. In Advances in Face Presentation Attack Detection (37–78).
Abstract: The PAD competitions we organized attracted more than 835 teams from home and abroad, most of them from the industry, which shows that the topic of face anti-spoofing is closely related to daily life, and there is an urgent need for advanced algorithms to solve its application needs. Specifically, the Chalearn LAP multi-modal face anti-spoofing attack detection challenge attracted more than 300 teams for the development phase with a total of 13 teams qualifying for the final round; the Chalearn Face Anti-spoofing Attack Detection Challenge attracted 340 teams in the development stage, and finally, 11 and 8 teams have submitted their codes in the single-modal and multi-modal face anti-spoofing recognition challenges, respectively; the 3D High-Fidelity Mask Face Presentation Attack Detection Challenge attracted 195 teams for the development phase with a total of 18 teams qualifying for the final round. All the results were verified and re-run by the organizing team, and the results were used for the final ranking. In this chapter, we briefly the methods developed by the teams participating in each competition, and introduce the algorithm details of the top-three ranked teams in detail.
|
|