X. Orriols, Lluis Barcelo, & X. Binefa. (2001). An Appearance-Based Method for Parametric Video Registration.
|
Gemma Sanchez, Josep Llados, & K. Tombre. (2001). An Algorithm to Recognize Graphical Textured Symbols using String Representations.
|
Joan Mas, Gemma Sanchez, & Josep Llados. (2005). An Adjacency Grammar to Recognize Symbols and Gestures in a Digital Pen Framework. In Pattern Recognition and Image Analysis (IbPRIA 2005), LNCS 3523: 115–122.
|
Carme Julia, Angel Sappa, Felipe Lumbreras, Joan Serrat, & Antonio Lopez. (2008). An Adapted Alternation Approach for Recommender Systems. In IEEE International Conference on e-Business Engineering (pp. 128–135).
Abstract: This paper presents an adaptation of the alternation technique to tackle the prediction task in recommender systems. These systems are widely used in electronic commerce to help customers find products they will probably like or dislike. Like the SVD-based approaches, the proposed adapted alternation technique uses all the information stored in the system to find the predictions. The main advantage of this technique with respect to the SVD-based ones is that it can deal with missing data. Furthermore, it has a smaller computational cost. Experimental results with public data sets are provided in order to show the viability of the proposed adapted alternation approach.
|
Christophe Rigaud, Dimosthenis Karatzas, Joost Van de Weijer, Jean-Christophe Burie, & Jean-Marc Ogier. (2013). An active contour model for speech balloon detection in comics. In 12th International Conference on Document Analysis and Recognition (pp. 1240–1244).
Abstract: Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent comic book understanding would enable a variety of new applications, including content-based retrieval and content retargeting. Document understanding in this domain is challenging as comics are semi-structured documents, combining semantically important graphical and textual parts. Few studies have been done in this direction. In this work we detail a novel approach for closed and non-closed speech balloon localization in scanned comic book pages, an essential step towards a fully automatic comic book understanding. The approach is compared with existing methods for closed balloon localization found in the literature and results are presented.
|
Razieh Rastgoo, Kourosh Kiani, Sergio Escalera, Vassilis Athitsos, & Mohammad Sabokrou. (2022). All You Need In Sign Language Production.
Abstract: Sign language is the dominant form of communication used in the deaf and hearing-impaired community. To enable easy, mutual communication between the hearing-impaired and hearing communities, building a robust system capable of translating spoken language into sign language and vice versa is fundamental. To this end, sign language recognition and production are the two necessary parts of such a two-way system, and both need to cope with some critical challenges. In this survey, we review recent advances in Sign Language Production (SLP) and related areas using deep learning. To give a more realistic perspective on sign language, we present an introduction to Deaf culture, Deaf centers, the psychological perspective of sign language, and the main differences between spoken language and sign language. Furthermore, we present the fundamental components of a bi-directional sign language translation system, discussing the main challenges in this area. The backbone architectures and methods in SLP are also briefly introduced, and the proposed taxonomy of SLP is presented. Finally, we present a general framework for SLP and performance evaluation, together with a discussion of recent developments, advantages, and limitations in SLP, and comment on possible lines for future research.
Keywords: Sign Language Production; Sign Language Recognition; Sign Language Translation; Deep Learning; Survey; Deaf
|
Maedeh Aghaei, Mariella Dimiccoli, & Petia Radeva. (2017). All the people around me: face clustering in egocentric photo streams. In 24th International Conference on Image Processing.
Abstract: Given an unconstrained stream of images captured by a wearable photo-camera (2 fpm), we propose an unsupervised bottom-up approach for automatically clustering the faces that appear into the individual identities present in these data. The problem is challenging since images are acquired under real-world conditions; hence the visible appearance of the people in the images undergoes intensive variations. Our proposed pipeline consists of first arranging the photo-stream into events, then localizing the appearance of multiple people in them, and finally grouping various appearances of the same person across different events. Experimental results on a dataset acquired by wearing a photo-camera for one month demonstrate the effectiveness of the proposed approach for the considered purpose. arXiv:1703.01790
Keywords: face discovery; face clustering; deepmatching; bag-of-tracklets; egocentric photo-streams
|
Jose Luis Gomez, Manuel Silva, Antonio Seoane, Agnes Borras, Mario Noriega, German Ros, et al. (2023). All for One, and One for All: UrbanSyn Dataset, the third Musketeer of Synthetic Driving Scenes.
Abstract: We introduce UrbanSyn, a photorealistic dataset acquired through semi-procedurally generated synthetic urban driving scenarios. Developed using high-quality geometry and materials, UrbanSyn provides pixel-level ground truth, including depth, semantic segmentation, and instance segmentation with object bounding boxes and occlusion degree. It complements GTAV and Synscapes datasets to form what we coin as the 'Three Musketeers'. We demonstrate the value of the Three Musketeers in unsupervised domain adaptation for image semantic segmentation. Results on real-world datasets, Cityscapes, Mapillary Vistas, and BDD100K, establish new benchmarks, largely attributed to UrbanSyn. We make UrbanSyn openly and freely accessible.
|
Ferran Diego. (2007). Alignment of Videos Recorded from Moving Vehicles.
|
Joan Serrat, Ferran Diego, Jose Manuel Alvarez, & Felipe Lumbreras. (2007). Alignment of Videos Recorded from Moving Vehicles. In 14th International Conference on Image Analysis and Processing (pp. 512–517).
|
Sounak Dey, Anjan Dutta, Suman Ghosh, Ernest Valveny, & Josep Llados. (2018). Aligning Salient Objects to Queries: A Multi-modal and Multi-object Image Retrieval Framework. In 14th Asian Conference on Computer Vision.
Abstract: In this paper we propose an approach for multi-modal image retrieval in multi-labelled images. A multi-modal deep network architecture is formulated to jointly model sketches and text as input query modalities into a common embedding space, which is then further aligned with the image feature space. Our architecture also relies on salient object detection through a supervised LSTM-based visual attention model learned from convolutional features. Both the alignment between the queries and the image and the supervision of the attention on the images are obtained by generalizing the Hungarian Algorithm using different loss functions. This permits encoding the object-based features and their alignment with the query irrespective of the availability of the co-occurrence of different objects in the training set. We validate the performance of our approach on standard single/multi-object datasets, showing state-of-the-art performance in every dataset.
|
Michal Drozdzal, Laura Igual, Petia Radeva, Jordi Vitria, Carolina Malagelada, & Fernando Azpiroz. (2010). Aligning Endoluminal Scene Sequences in Wireless Capsule Endoscopy. In IEEE Computer Society Workshop on Mathematical Methods in Biomedical Image Analysis (pp. 117–124).
Abstract: Intestinal motility analysis is an important examination in the detection of various intestinal malfunctions. One of the big challenges of automatic motility analysis is how to compare sequences of images and extract dynamic patterns, taking into account the high deformability of the intestine wall as well as the capsule motion. From a clinical point of view, the ability to align endoluminal scene sequences will help to find regions of similar intestinal activity and in this way will provide valuable information on intestinal motility problems. This work, for the first time, addresses the problem of aligning endoluminal sequences taking into account motion and structure of the intestine. To describe motility in the sequence, we propose different descriptors based on the SIFT Flow algorithm, namely: (1) Histograms of SIFT Flow Directions to describe the flow course, (2) SIFT Descriptors to represent image intestine structure and (3) SIFT Flow Magnitude to quantify intestine deformation. We show that merging all three descriptors provides robust information on sequence description in terms of motility. Moreover, we develop a novel methodology to rank the intestinal sequences based on expert feedback about the relevance of the results. The experimental results show that the selected descriptors are useful in the alignment and similarity description and that the proposed method allows the analysis of the WCE.
|
Xavier Otazu, & J. Nuñez. (2001). Algoritmo de Clasificacion no Supervisada Basado en Wavelets.
|
V. Kober, Mikhail Mozerov, Josue Albarez, & I.A. Ovseyevich. (2007). Algorithms for Impulse Noise Removal from Corrupted Color Images.
|
Margarita Torre, & Petia Radeva. (2000). Agricultural-Field Extraction on Aerial Images by Region Competition Algorithm. In 15th International Conference on Pattern Recognition (Vol. 1, pp. 313–316).
|