|
Jose Luis Gomez, Manuel Silva, Antonio Seoane, Agnes Borras, Mario Noriega, German Ros, et al. (2023). All for One, and One for All: UrbanSyn Dataset, the third Musketeer of Synthetic Driving Scenes.
Abstract: We introduce UrbanSyn, a photorealistic dataset acquired through semi-procedurally generated synthetic urban driving scenarios. Developed using high-quality geometry and materials, UrbanSyn provides pixel-level ground truth, including depth, semantic segmentation, and instance segmentation with object bounding boxes and occlusion degree. It complements the GTAV and Synscapes datasets to form what we coin the 'Three Musketeers'. We demonstrate the value of the Three Musketeers in unsupervised domain adaptation for image semantic segmentation. Results on the real-world datasets Cityscapes, Mapillary Vistas, and BDD100K establish new benchmarks, largely attributed to UrbanSyn. We make UrbanSyn openly and freely accessible.
|
|
|
Maedeh Aghaei, Mariella Dimiccoli, & Petia Radeva. (2017). All the people around me: face clustering in egocentric photo streams. In 24th International Conference on Image Processing.
Abstract: arXiv:1703.01790
Given an unconstrained stream of images captured by a wearable photo-camera (2fpm), we propose an unsupervised bottom-up approach for automatically clustering the faces that appear into the individual identities present in these data. The problem is challenging since images are acquired under real-world conditions; hence the visible appearance of the people in the images undergoes intensive variations. Our proposed pipeline consists of first arranging the photo-stream into events, then localizing the appearances of multiple people in them, and finally grouping the various appearances of the same person across different events. Experimental results on a dataset acquired by wearing a photo-camera for one month demonstrate the effectiveness of the proposed approach for the considered purpose.
Keywords: face discovery; face clustering; deepmatching; bag-of-tracklets; egocentric photo-streams
|
|
|
Razieh Rastgoo, Kourosh Kiani, Sergio Escalera, Vassilis Athitsos, & Mohammad Sabokrou. (2022). All You Need In Sign Language Production.
Abstract: Sign language is the dominant form of communication in the deaf and hearing-impaired community. To enable easy, mutual communication between the hearing-impaired and hearing communities, building a robust system capable of translating spoken language into sign language and vice versa is fundamental. To this end, sign language recognition and production are two necessary parts of such a two-way system, and both need to cope with some critical challenges. In this survey, we review recent advances in Sign Language Production (SLP) and related areas using deep learning. To give a more realistic perspective on sign language, we present an introduction to Deaf culture, Deaf centers, the psychological perspective of sign language, and the main differences between spoken language and sign language. Furthermore, we present the fundamental components of a bi-directional sign language translation system and discuss the main challenges in this area. The backbone architectures and methods in SLP are also briefly introduced, and the proposed taxonomy of SLP is presented. Finally, we present a general framework for SLP and its performance evaluation, together with a discussion of recent developments, advantages, and limitations in SLP and possible lines for future research.
Keywords: Sign Language Production; Sign Language Recognition; Sign Language Translation; Deep Learning; Survey; Deaf
|
|
|
Christophe Rigaud, Dimosthenis Karatzas, Joost Van de Weijer, Jean-Christophe Burie, & Jean-Marc Ogier. (2013). An active contour model for speech balloon detection in comics. In 12th International Conference on Document Analysis and Recognition (pp. 1240–1244).
Abstract: Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent comic book understanding would enable a variety of new applications, including content-based retrieval and content retargeting. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. Few studies have been done in this direction. In this work we detail a novel approach for closed and non-closed speech balloon localization in scanned comic book pages, an essential step towards a fully automatic comic book understanding. The approach is compared with existing methods for closed balloon localization found in the literature and results are presented.
|
|
|
Carme Julia, Angel Sappa, Felipe Lumbreras, Joan Serrat, & Antonio Lopez. (2008). An Adapted Alternation Approach for Recommender Systems. In IEEE International Conference on e–Business Engineering (pp. 128–135).
Abstract: This paper presents an adaptation of the alternation technique to tackle the prediction task in recommender systems. These systems are widely used in electronic commerce to help customers find products they will probably like or dislike. Like the SVD-based approaches, the proposed adapted alternation technique uses all the information stored in the system to compute the predictions. The main advantage of this technique with respect to the SVD-based ones is that it can deal with missing data. Furthermore, it has a smaller computational cost. Experimental results with public data sets are provided in order to show the viability of the proposed adapted alternation approach.
|
|
|
Joan Mas, Gemma Sanchez, & Josep Llados. (2005). An Adjacency Grammar to Recognize Symbols and Gestures in a Digital Pen Framework. In Pattern Recognition and Image Analysis (IbPRIA 2005), LNCS 3523: 115–122.
|
|
|
Gemma Sanchez, Josep Llados, & K. Tombre. (2001). An Algorithm to Recognize Graphical Textured Symbols using String Representations.
|
|
|
X. Orriols, Lluis Barcelo, & X. Binefa. (2001). An Appearance-Based Method for Parametric Video Registration.
|
|
|
Michal Drozdzal, Santiago Segui, Petia Radeva, Carolina Malagelada, Fernando Azpiroz, & Jordi Vitria. (2013). An Application for Efficient Error-Free Labeling of Medical Images. In Multimodal Interaction in Image and Video Applications (Vol. 48, pp. 1–16). Springer Berlin Heidelberg.
Abstract: In this chapter we describe an application for efficient error-free labeling of medical images. In this scenario, the compilation of a complete training set for building a realistic model of a given class of samples is not an easy task, making the process tedious and time consuming. For this reason, there is a need for interactive labeling applications that minimize the effort of the user while providing error-free labeling. We propose a new algorithm that is based on data similarity in feature space. This method actively explores data in order to find the best label-aligned clustering and exploits it to reduce the labeler effort, which is measured by the number of “clicks”. Moreover, error-free labeling is guaranteed by the fact that all data and their label proposals are visually reviewed by an expert.
|
|
|
A. Sanfeliu, & Juan J. Villanueva. (2005). An approach of visual motion analysis. PRL - Pattern Recognition Letters, 26(3), 355–368.
|
|
|
Partha Pratim Roy. (2007). An Approach to Text/Graphics Separation in Color Maps.
|
|
|
Miquel Ferrer, Ernest Valveny, F. Serratosa, K. Riesen, & Horst Bunke. (2008). An Approximate Algorithm for Median Graph Computation using Graph Embedding. In 19th International Conference on Pattern Recognition.
|
|
|
Yuhua Luo, Francisco Jose Perales, & Juan J. Villanueva. (1992). An automatic Rotoscopy System for Human Motion Based on a Biomedical Graphical Model. Computers & Graphics, 16(4), 355–362.
|
|
|
Francisco Jose Perales, Juan J. Villanueva, & Yuhua Luo. (1991). An automatic two-camera human motion perception system based on biomechanical model matching. In IEEE International Conference on Systems, Man and Cybernetics (Vol. 2, pp. 856–858).
|
|
|
Angel Sappa, & Fadi Dornaika. (2006). An Edge-Based Approach to Motion Detection. In 6th International Conference on Computational Science (ICCS'06), LNCS 3991: 563–570.
|
|