|
Maria Elena Meza-de-Luna, Juan Ramon Terven Salinas, Bogdan Raducanu, & Joaquin Salas. (2019). A Social-Aware Assistant to support individuals with visual impairments during social interaction: A systematic requirements analysis. IJHC - International Journal of Human-Computer Studies, 122, 50–60.
Abstract: Visual impairment affects the normal course of activities in everyday life including mobility, education, employment, and social interaction. Most of the existing technical solutions devoted to empowering the visually impaired people are in the areas of navigation (obstacle avoidance), access to printed information and object recognition. Less effort has been dedicated so far in developing solutions to support social interactions. In this paper, we introduce a Social-Aware Assistant (SAA) that provides visually impaired people with cues to enhance their face-to-face conversations. The system consists of a perceptive component (represented by smartglasses with an embedded video camera) and a feedback component (represented by a haptic belt). When the vision system detects a head nodding, the belt vibrates, thus suggesting the user to replicate (mirror) the gesture. In our experiments, sighted persons interacted with blind people wearing the SAA. We instructed the former to mirror the noddings according to the vibratory signal, while the latter interacted naturally. After the face-to-face conversation, the participants had an interview to express their experience regarding the use of this new technological assistant. With the data collected during the experiment, we have assessed quantitatively and qualitatively the device usefulness and user satisfaction.
|
|
|
Petia Radeva, Joan Serrat, & Enric Marti. (1995). A snake for model-based segmentation. In Proc. Conf. Fifth Int Computer Vision (pp. 816–821).
Abstract: Despite the promising results of numerous applications, the hitherto proposed snake techniques share some common problems: snake attraction by spurious edge points, snake degeneration (shrinking and attening), convergence and stability of the deformation process, snake initialization and local determination of the parameters of elasticity. We argue here that these problems can be solved only when all the snake aspects are considered. The snakes proposed here implement a new potential eld and external force in order to provide a deformation convergence, attraction by both near and far edges as well as snake behaviour selective according to the edge orientation. Furthermore, we conclude that in the case of model-based seg mentation, the internal force should include structural information about the expected snake shape. Experiments using this kind of snakes for segmenting bones in complex hand radiographs show a signicant improvement.
Keywords: snakes; elastic matching; model-based segmenta tion
|
|
|
Suman Ghosh, & Ernest Valveny. (2015). A Sliding Window Framework for Word Spotting Based on Word Attributes. In Pattern Recognition and Image Analysis, Proceedings of 7th Iberian Conference , ibPRIA 2015 (Vol. 9117, pp. 652–661). LNCS. Springer International Publishing.
Abstract: In this paper we propose a segmentation-free approach to word spotting. Word images are first encoded into feature vectors using Fisher Vector. Then, these feature vectors are used together with pyramidal histogram of characters labels (PHOC) to learn SVM-based attribute models. Documents are represented by these PHOC based word attributes. To efficiently compute the word attributes over a sliding window, we propose to use an integral image representation of the document using a simplified version of the attribute model. Finally we re-rank the top word candidates using the more discriminative full version of the word attributes. We show state-of-the-art results for segmentation-free query-by-example word spotting in single-writer and multi-writer standard datasets.
Keywords: Word spotting; Sliding window; Word attributes
|
|
|
Daniel Rato, Miguel Oliveira, Vitor Santos, Manuel Gomes, & Angel Sappa. (2022). A sensor-to-pattern calibration framework for multi-modal industrial collaborative cells. JMANUFSYST - Journal of Manufacturing Systems, 64, 497–507.
Abstract: Collaborative robotic industrial cells are workspaces where robots collaborate with human operators. In this context, safety is paramount, and for that a complete perception of the space where the collaborative robot is inserted is necessary. To ensure this, collaborative cells are equipped with a large set of sensors of multiple modalities, covering the entire work volume. However, the fusion of information from all these sensors requires an accurate extrinsic calibration. The calibration of such complex systems is challenging, due to the number of sensors and modalities, and also due to the small overlapping fields of view between the sensors, which are positioned to capture different viewpoints of the cell. This paper proposes a sensor to pattern methodology that can calibrate a complex system such as a collaborative cell in a single optimization procedure. Our methodology can tackle RGB and Depth cameras, as well as LiDARs. Results show that our methodology is able to accurately calibrate a collaborative cell containing three RGB cameras, a depth camera and three 3D LiDARs.
Keywords: Calibration; Collaborative cell; Multi-modal; Multi-sensor
|
|
|
Oriol Pujol. (2004). A semi-Supervised Statistical Framework and Generative Snakes for IVUS Analysis (Petia Radeva, Ed.). Ph.D. thesis, , .
|
|
|
Santiago Segui, Laura Igual, Petia Radeva, Carolina Malagelada, Fernando Azpiroz, & Jordi Vitria. (2007). A Semi-Supervised Learning Method for Motility Disease Diagnostic. In Progress in Pattern Recognition, Image Analysis and Applications, 12th Iberoamerican Congress on Pattern (CIARP 2007), LCNS 4756:773–782, ISBN 978–3–540–76724–4.
|
|
|
J. Chazalon, Marçal Rusiñol, Jean-Marc Ogier, & Josep Llados. (2015). A Semi-Automatic Groundtruthing Tool for Mobile-Captured Document Segmentation. In 13th International Conference on Document Analysis and Recognition ICDAR2015 (pp. 621–625).
Abstract: This paper presents a novel way to generate groundtruth data for the evaluation of mobile document capture systems, focusing on the first stage of the image processing pipeline involved: document object detection and segmentation in lowquality preview frames. We introduce and describe a simple, robust and fast technique based on color markers which enables a semi-automated annotation of page corners. We also detail a technique for marker removal. Methods and tools presented in the paper were successfully used to annotate, in few hours, 24889
frames in 150 video files for the smartDOC competition at ICDAR 2015
|
|
|
Albert Suso, Pau Riba, Oriol Ramos Terrades, & Josep Llados. (2021). A Self-supervised Inverse Graphics Approach for Sketch Parametrization. In 16th International Conference on Document Analysis and Recognition (Vol. 12916, pp. 28–42). LNCS.
Abstract: The study of neural generative models of handwritten text and human sketches is a hot topic in the computer vision field. The landmark SketchRNN provided a breakthrough by sequentially generating sketches as a sequence of waypoints, and more recent articles have managed to generate fully vector sketches by coding the strokes as Bézier curves. However, the previous attempts with this approach need them all a ground truth consisting in the sequence of points that make up each stroke, which seriously limits the datasets the model is able to train in. In this work, we present a self-supervised end-to-end inverse graphics approach that learns to embed each image to its best fit of Bézier curves. The self-supervised nature of the training process allows us to train the model in a wider range of datasets, but also to perform better after-training predictions by applying an overfitting process on the input binary image. We report qualitative an quantitative evaluations on the MNIST and the Quick, Draw! datasets.
|
|
|
Bhaskar Chakraborty, Michael Holte, Thomas B. Moeslund, Jordi Gonzalez, & Xavier Roca. (2011). A Selective Spatio-Temporal Interest Point Detector for Human Action Recognition in Complex Scenes. In 13th IEEE International Conference on Computer Vision (pp. 1776–1783).
Abstract: Recent progress in the field of human action recognition points towards the use of Spatio-Temporal Interest Points (STIPs) for local descriptor-based recognition strategies. In this paper we present a new approach for STIP detection by applying surround suppression combined with local and temporal constraints. Our method is significantly different from existing STIP detectors and improves the performance by detecting more repeatable, stable and distinctive STIPs for human actors, while suppressing unwanted background STIPs. For action representation we use a bag-of-visual words (BoV) model of local N-jet features to build a vocabulary of visual-words. To this end, we introduce a novel vocabulary building strategy by combining spatial pyramid and vocabulary compression techniques, resulting in improved performance and efficiency. Action class specific Support Vector Machine (SVM) classifiers are trained for categorization of human actions. A comprehensive set of experiments on existing benchmark datasets, and more challenging datasets of complex scenes, validate our approach and show state-of-the-art performance.
|
|
|
Anders Hast, & Alicia Fornes. (2016). A Segmentation-free Handwritten Word Spotting Approach by Relaxed Feature Matching. In 12th IAPR Workshop on Document Analysis Systems (pp. 150–155).
Abstract: The automatic recognition of historical handwritten documents is still considered challenging task. For this reason, word spotting emerges as a good alternative for making the information contained in these documents available to the user. Word spotting is defined as the task of retrieving all instances of the query word in a document collection, becoming a useful tool for information retrieval. In this paper we propose a segmentation-free word spotting approach able to deal with large document collections. Our method is inspired on feature matching algorithms that have been applied to image matching and retrieval. Since handwritten words have different shape, there is no exact transformation to be obtained. However, the sufficient degree of relaxation is achieved by using a Fourier based descriptor and an alternative approach to RANSAC called PUMA. The proposed approach is evaluated on historical marriage records, achieving promising results.
|
|
|
Ekaterina Zaytseva, & Jordi Vitria. (2012). A search based approach to non maximum suppression in face detection. In 19th IEEE International Conference on Image Processing.
Abstract: Poster
paper TA.P5.12
Face detectors typically produce a large number of false positives and this leads to the need to have a further non maximum suppression stage to eliminate multiple and spurious responses. This stage is based on considering spatial heuristics: true positive responses are selected by implicitly considering several restrictions on the spatial distribution of detector responses in natural images. In this paper we analyze the limitations of this approach and propose an efficient search method to overcome them. Results show how the application of this new non-maximum suppression approach to a simple face detector boosts its performance to state of the art results.
|
|
|
Petia Radeva. (1993). A Rule-Based Approach to Hand X-Ray image Segmentation..
|
|
|
Albert Gordo, & Ernest Valveny. (2009). A rotation invariant page layout descriptor for document classification and retrieval. In 10th International Conference on Document Analysis and Recognition (481–485).
Abstract: Document classification usually requires of structural features such as the physical layout to obtain good accuracy rates on complex documents. This paper introduces a descriptor of the layout and a distance measure based on the cyclic dynamic time warping which can be computed in O(n2). This descriptor is translation invariant and can be easily modified to be scale and rotation invariant. Experiments with this descriptor and its rotation invariant modification are performed on the Girona archives database and compared against another common layout distance, the minimum weight edge cover. The experiments show that these methods outperform the MWEC both in accuracy and speed, particularly on rotated documents.
|
|
|
Bogdan Raducanu, & Jordi Vitria. (2006). A Robust Particle Filter-Based Face Tracker Using Combination of Color and Geometric Information. In International Conference on Image Analysis and Recognition (ICIAR´06), LNCS 4141 (A. Campilho et al., eds.), 1: 922–933.
|
|
|
Bogdan Raducanu, & Jordi Vitria. (2005). A Robust Particle Filter-based Face Tracker Using a Combination of Color and Geometric Information.
|
|