|
Niki Aifanti, Angel Sappa, N. Grammalidis, & Sotiris Malassiotis. (2009). Advances in Tracking and Recognition of Human Motion. In Encyclopedia of Information Science and Technology (Vol. I, pp. 65–71).
|
|
|
Debora Gil, Aura Hernandez-Sabate, Antoni Carol, Oriol Rodriguez, & Petia Radeva. (2005). A Deterministic-Statistic Adventitia Detection in IVUS Images. In 3rd International Workshop on Functional Imaging and Modeling of the Heart (pp. 65–74).
Abstract: Plaque analysis in IVUS planes needs accurate intima and adventitia models. The large variety of adventitia descriptors makes its detection difficult and motivates using a classification strategy for selecting points on the structure. Whatever the set of descriptors used, the selection stage suffers from false responses due to noise and incomplete true curves. In order to smooth background noise while strengthening responses, we apply a restricted anisotropic filter that homogenizes grey levels along the significant structures of the image. Candidate points are extracted by means of a simple semi-supervised adaptive classification of the filtered image response to edge and calcium detectors. The final model is obtained by interpolating the former line segments with an anisotropic contour closing technique based on functional extension principles.
Keywords: Electron microscopy; Unbending; 2D crystal; Interpolation; Approximation
|
|
|
David Rotger, Misael Rosales, Jaume Garcia, Oriol Pujol, Josefina Mauri, & Petia Radeva. (2003). Active Vessel: A New Multimedia Workstation for Intravascular Ultrasound and Angiography Fusion. Computers in Cardiology, 30, 65–68.
Abstract: ActiveVessel is a new multimedia workstation that enables the visualization, acquisition and handling of both image modalities, online and offline. It enables DICOM v3.0 decompression and browsing; video acquisition, reproduction and storage for IntraVascular UltraSound (IVUS) and angiograms with their corresponding ECG; automatic catheter segmentation in angiography images (using a fast marching algorithm); B-spline model definition for vessel layers in IVUS image sequences; and an extensively validated tool to fuse information. This approach defines the correspondence of every IVUS image with its corresponding point in the angiogram and vice versa. The 3D reconstruction of the IVUS catheter/vessel enables real distance measurements as well as three-dimensional visualization showing vessel tortuosity in space.
|
|
|
Gemma Sanchez, Ernest Valveny, Josep Llados, Enric Marti, Oriol Ramos Terrades, N. Lozano, et al. (2003). A system for virtual prototyping of architectural projects. In Proceedings of Fifth IAPR International Workshop on Pattern Recognition (pp. 65–74).
|
|
|
Gerard Canal, Sergio Escalera, & Cecilio Angulo. (2016). A Real-time Human-Robot Interaction system based on gestures for assistive scenarios. CVIU - Computer Vision and Image Understanding, 149, 65–77.
Abstract: Natural and intuitive human interaction with robotic systems is key to developing robots that assist people in an easy and effective way. In this paper, a Human Robot Interaction (HRI) system able to recognize gestures usually employed in human non-verbal communication is introduced, and an in-depth study of its usability is performed. The system deals with dynamic gestures such as waving or nodding, which are recognized using a Dynamic Time Warping approach based on gesture-specific features computed from depth maps. A static gesture consisting of pointing at an object is also recognized. The pointed location is then estimated in order to detect candidate objects the user may refer to. When the pointed object is unclear to the robot, a disambiguation procedure by means of either a verbal or gestural dialogue is performed. This skill would lead to the robot picking up an object on behalf of the user, who may have difficulty doing so themselves. The overall system (composed of a NAO robot, a Wifibot robot, a Kinect™ v2 sensor and two laptops) is first evaluated in a structured lab setup. Then, a broad set of user tests is completed, which allows assessing correct performance in terms of recognition rates, ease of use and response times.
Keywords: Gesture recognition; Human Robot Interaction; Dynamic Time Warping; Pointing location estimation
|
|
|
Eduardo Aguilar, & Petia Radeva. (2019). Food Recognition by Integrating Local and Flat Classifiers. In 9th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 11867, pp. 65–74). LNCS.
Abstract: The recognition of food images is an interesting research topic, whose applicability to the creation of nutritional diaries stands out, with the aim of improving the quality of life of people who have a chronic disease (e.g. diabetes, heart disease) or are prone to acquiring one (e.g. people who are overweight or obese). For a food recognition system to be useful in real applications, it is necessary to recognize a huge number of different foods. We argue that for very large scale classification, a traditional flat classifier is not enough to achieve an acceptable result. To address this, we propose a method that performs prediction either with local classifiers, based on a class hierarchy, or with a flat classifier. We decide which approach to use depending on the analysis of both the Epistemic Uncertainty obtained for the image in the children classifiers and the prediction of the parent classifier. When our criterion is met, the final prediction is obtained with the respective local classifier; otherwise, with the flat classifier. From the results, we can see that the proposed method improves the classification performance compared to the use of a single flat classifier.
|
|
|
Josep Brugues Pujolras, Lluis Gomez, & Dimosthenis Karatzas. (2022). A Multilingual Approach to Scene Text Visual Question Answering. In Document Analysis Systems. 15th IAPR International Workshop (DAS2022) (pp. 65–79).
Abstract: Scene Text Visual Question Answering (ST-VQA) has recently emerged as a hot research topic in Computer Vision. Current ST-VQA models have great potential for many types of applications but cannot perform well on more than one language at a time, owing to the lack of multilingual data and the use of monolingual word embeddings for training. In this work, we explore the possibility of obtaining bilingual and multilingual VQA models. To that end, we use an already established VQA model that uses monolingual word embeddings as part of its pipeline and replace them with FastText and BPEmb multilingual word embeddings that have been aligned to English. Our experiments demonstrate that it is possible to obtain bilingual and multilingual VQA models with a minimal loss in performance in languages not used during training, as well as a multilingual model trained in multiple languages that matches the performance of the respective monolingual baselines.
Keywords: Scene text; Visual question answering; Multilingual word embeddings; Vision and language; Deep learning
|
|
|
Sangheeta Roy, Palaiahnakote Shivakumara, Namita Jain, Vijeta Khare, Anjan Dutta, Umapada Pal, et al. (2018). Rough-Fuzzy based Scene Categorization for Text Detection and Recognition in Video. PR - Pattern Recognition, 80, 64–82.
Abstract: Scene image or video understanding is a challenging task, especially when the number of video types increases drastically with high variations in background and foreground. This paper proposes a new method for categorizing scene videos into different classes, namely, Animation, Outlet, Sports, e-Learning, Medical, Weather, Defense, Economics, Animal Planet and Technology, to improve the performance of text detection and recognition, which is an effective approach for scene image or video understanding. For this purpose, we first present a new combination of rough and fuzzy concepts to study irregular shapes of edge components in input scene videos, which helps to classify edge components into several groups. Next, the proposed method explores gradient direction information of each pixel in each edge component group to extract stroke-based features by dividing each group into several intra and inter planes. We further extract correlation and covariance features to encode semantic features located inside planes or between planes. Features of intra and inter planes of groups are then concatenated to get a feature matrix. Finally, the feature matrix is verified with temporal frames and fed to a neural network for categorization. Experimental results show that the proposed method outperforms the existing state-of-the-art methods; at the same time, the performance of text detection and recognition methods is also improved significantly due to categorization.
Keywords: Rough set; Fuzzy set; Video categorization; Scene image classification; Video text detection; Video text recognition
|
|
|
Vacit Oguz Yazici, Joost Van de Weijer, & Arnau Ramisa. (2018). Color Naming for Multi-Color Fashion Items. In 6th World Conference on Information Systems and Technologies (Vol. 747, pp. 64–73).
Abstract: There exists a significant amount of research on color naming of single-colored objects. However, in reality, many fashion objects consist of multiple colors. Currently, searching in fashion datasets for multi-colored objects can be a laborious task. Therefore, in this paper we focus on color naming for images with multi-color fashion items. We collect a dataset of images that may have from one up to four colors. We annotate the images with the 11 basic colors of the English language. We experiment with several designs for deep neural networks with different losses. We show that explicitly estimating the number of colors in the fashion item leads to improved results.
Keywords: Deep learning; Color; Multi-label
|
|
|
Parichehr Behjati Ardakani, Diego Velazquez, Josep M. Gonfaus, Pau Rodriguez, Xavier Roca, & Jordi Gonzalez. (2019). Catastrophic interference in Disguised Face Recognition. In 9th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 11868, pp. 64–75). LNCS.
Abstract: Artificial neural networks are commonly known to have a natural tendency to completely and abruptly forget previously known information when learning new information. We explore this behaviour in the context of Face Verification on the recently proposed Disguised Faces in the Wild dataset (DFW). We empirically evaluate several commonly used DCNN architectures on Face Recognition and distill some insights about the effect of sequential learning on distinct identities from different datasets, showing that the catastrophic forgetting phenomenon is present even in feature embeddings fine-tuned on different tasks from the original domain.
Keywords: Neural network forgetness; Face recognition; Disguised Faces
|
|
|
Jose Antonio Rodriguez, Gemma Sanchez, & Josep Llados. (2007). Categorization of Digital Ink Elements using Spectral Features. In Seventh IAPR International Workshop on Graphics Recognition (pp. 63–64).
|
|
|
Agata Lapedriza, Jaume Garcia, Ernest Valveny, Robert Benavente, Miquel Ferrer, & Gemma Sanchez. (2008). Una experiencia de aprenentatge basada en projectes en el ambit de la informatica.
|
|
|
Marçal Rusiñol, David Aldavert, Ricardo Toledo, & Josep Llados. (2011). Browsing Heterogeneous Document Collections by a Segmentation-Free Word Spotting Method. In 11th International Conference on Document Analysis and Recognition (pp. 63–67).
Abstract: In this paper, we present a segmentation-free word spotting method that is able to deal with heterogeneous document image collections. We propose a patch-based framework where patches are represented by a bag-of-visual-words model powered by SIFT descriptors. A later refinement of the feature vectors is performed by applying the latent semantic indexing technique. The proposed method performs well on both handwritten and typewritten historical document images. We have also tested our method on documents written in non-Latin scripts.
|
|
|
Olivier Penacchio, Laura Dempere-Marco, & Xavier Otazu. (2012). A Neurodynamical Model Of Brightness Induction In V1 Following Static And Dynamic Contextual Influences. In 8th Federation of European Neurosciences (Vol. 6, pp. 63–64).
Abstract: Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas. Although striate cortex is traditionally regarded as an area mostly responsive to sensory (i.e. retinal) information, neurophysiological evidence suggests that perceived brightness information might be explicitly represented in V1. Such evidence has been observed both in anesthetised cats, where neuronal response modulations have been found to follow luminance changes outside the receptive fields, and in human fMRI measurements. In this work, possible neural mechanisms that offer a plausible explanation for this phenomenon are investigated. To this end, we consider the model proposed by Z. Li (Li, Network: Comput. Neural Syst., 10 (1999)), which is based on neurophysiological evidence and focuses on the part of V1 responsible for contextual influences, i.e. layer 2-3 pyramidal cells, interneurons, and horizontal intracortical connections. This model has reproduced other phenomena such as contour detection and preattentive segmentation, which share with brightness induction the relevant effect of contextual influences. We have extended the original model such that the input to the network is obtained from a complete multiscale and multiorientation wavelet decomposition, thereby allowing the recovery of an image reflecting the perceived intensity. The proposed model successfully accounts for well-known psychophysical effects for static contexts (among them: White's and modified White's effects, the Todorovic, Chevreul, achromatic ring patterns, and grating induction effects) and also for brightness induction in dynamic contexts defined by modulating the luminance of surrounding areas (e.g. the brightness of a static central area is perceived to vary in antiphase to the sinusoidal luminance changes of its surroundings). This work thus suggests that intra-cortical interactions in V1 could partially explain perceptual brightness induction effects and reveals how a common general architecture may account for several different fundamental processes emerging early in the visual processing pathway.
|
|
|
Debora Gil, Rosa Maria Ortiz, Carles Sanchez, & Antoni Rosell. (2018). Objective endoscopic measurements of central airway stenosis. A pilot study. RES - Respiration, 95, 63–69.
Abstract: Endoscopic estimation of the degree of stenosis in central airway obstruction is subjective and highly variable. Objective: To determine the benefits of using SENSA (System for Endoscopic Stenosis Assessment), an image-based computational software, for obtaining objective stenosis index (SI) measurements among a group of expert bronchoscopists and general pulmonologists. Methods: A total of 7 expert bronchoscopists and 7 general pulmonologists were enrolled to validate SENSA usage. The SI obtained by the physicians and by SENSA were compared with a reference SI to set their precision in SI computation. We used SENSA to efficiently obtain this reference SI in 11 selected cases of benign stenosis. A Web platform with three user-friendly microtasks was designed to gather the data. The users had to visually estimate the SI from videos with and without contours of the normal and the obstructed area provided by SENSA. The users were able to modify the SENSA contours to define the reference SI using morphometric bronchoscopy. Results: Visual SI estimation accuracy was associated with neither bronchoscopic experience (p = 0.71) nor the contours of the normal and the obstructed area provided by the system (p = 0.13). The precision of the SI by SENSA was 97.7% (95% CI: 92.4-103.7), which is significantly better than the precision of the SI by visual estimation (p < 0.001), with an improvement by at least 15%. Conclusion: SENSA provides objective SI measurements with a precision of up to 99.5%, which can be calculated from any bronchoscope using an affordable scalable interface. Providing normal and obstructed contours on bronchoscopic videos does not improve physicians' visual estimation of the SI.
Keywords: Bronchoscopy; Tracheal stenosis; Airway stenosis; Computer-assisted analysis
|
|