Ivan Huerta, Michael Holte, Thomas B. Moeslund, & Jordi Gonzalez. (2015). Chromatic shadow detection and tracking for moving foreground segmentation. IMAVIS - Image and Vision Computing, 41, 42–53.
Abstract: Advanced segmentation techniques in the surveillance domain deal with shadows to avoid distortions when detecting moving objects. Most approaches for shadow detection are still typically restricted to penumbra shadows and cannot cope well with umbra shadows. Consequently, umbra shadow regions are usually detected as part of moving objects, thus affecting the performance of the final detection. In this paper we address the detection of both penumbra and umbra shadow regions. First, a novel bottom-up approach is presented based on gradient and colour models, which successfully discriminates between chromatic moving cast shadow regions and those regions detected as moving objects. In essence, those regions corresponding to potential shadows are detected based on edge partitioning and colour statistics. Subsequently (i) temporal similarities between textures and (ii) spatial similarities between chrominance angle and brightness distortions are analysed for each potential shadow region for detecting the umbra shadow regions. Our second contribution refines even further the segmentation results: a tracking-based top-down approach increases the performance of our bottom-up chromatic shadow detection algorithm by properly correcting non-detected shadows. To do so, a combination of motion filters in a data association framework exploits the temporal consistency between objects and shadows to increase the shadow detection rate. Experimental results exceed current state-of-the-art in shadow accuracy for multiple well-known surveillance image databases which contain different shadowed materials and illumination conditions.
Keywords: Detecting moving objects; Chromatic shadow detection; Temporal local gradient; Spatial and Temporal brightness and angle distortions; Shadow tracking
|
Pau Torras, Arnau Baro, Alicia Fornes, & Lei Kang. (2022). Improving Handwritten Music Recognition through Language Model Integration. In 4th International Workshop on Reading Music Systems (WoRMS2022) (pp. 42–46).
Abstract: Handwritten Music Recognition, especially in the historical domain, is an inherently challenging endeavour; paper degradation artefacts and the ambiguous nature of handwriting make recognising such scores an error-prone process, even for the current state-of-the-art Sequence to Sequence models. In this work we propose a way of reducing the production of statistically implausible output sequences by fusing a Language Model into a recognition Sequence to Sequence model. The idea is to leverage visually-conditioned and context-conditioned output distributions in order to automatically find and correct any mistakes that would otherwise break context significantly. We have found this approach to improve recognition results to 25.15 SER (%) from a previous best of 31.79 SER (%) in the literature.
Keywords: optical music recognition; historical sources; diversity; music theory; digital humanities
|
Marçal Rusiñol, & Josep Llados. (2007). A Region-Based Hashing Approach for Symbol Spotting in Technical Documents. In J.M. Ogier W. L. J. Llados (Ed.), Seventh IAPR International Workshop on Graphics Recognition (41–42).
|
Sergio Escalera, R. M. Martinez, Jordi Vitria, Petia Radeva, & Maria Teresa Anguera. (2010). Deteccion automatica de la dominancia en conversaciones diadicas. EP - Escritos de Psicologia, 3(2), 41–45.
Abstract: Dominance refers to the level of influence that a person has in a conversation. Dominance is an important research area in social psychology, but the problem of its automatic estimation is a very recent topic in the contexts of social and wearable computing. In this paper, we focus on the dominance detection of visual cues. We estimate the correlation among observers by categorizing the dominant people in a set of face-to-face conversations. Different dominance indicators from gestural communication are defined, manually annotated, and compared to the observers' opinion. Moreover, these indicators are automatically extracted from video sequences and learnt by using binary classifiers. Results from the three analyses show a high correlation and allow the categorization of dominant people in public discussion video sequences.
Keywords: Dominance detection; Non-verbal communication; Visual features
|
Oriol Rodriguez-Leon, A. Carol, H. Tizon, Eduard Fernandez-Nofrerias, Josefina Mauri, Vicente del Valle, Debora Gil, et al. (2005). Model estadístic-determinístic per la segmentació de l'adventicia en imatges d'ecografía intracoronaria. Rev Societat Catalana Cardiologia, 5, 41.
|
Jorge Bernal, Joan M. Nuñez, F. Javier Sanchez, & Fernando Vilariño. (2014). Polyp Segmentation Method in Colonoscopy Videos by means of MSA-DOVA Energy Maps Calculation. In 3rd MICCAI Workshop on Clinical Image-based Procedures: Translational Research in Medical Imaging (Vol. 8680, pp. 41–49).
Abstract: In this paper we present a novel polyp region segmentation method for colonoscopy videos. Our method uses valley information associated with polyp boundaries in order to provide an initial segmentation. This first segmentation is refined to eliminate boundary discontinuities caused by image artifacts or other elements of the scene. Experimental results over a publicly annotated database show that our method outperforms both general and specific segmentation methods by providing more accurate regions rich in polyp content. We also show that image preprocessing is needed to improve the final polyp region segmentation.
Keywords: Image segmentation; Polyps; Colonoscopy; Valley information; Energy maps
|
Joan Serrat, Felipe Lumbreras, Francisco Blanco, Manuel Valiente, & Montserrat Lopez-Mesas. (2017). myStone: A system for automatic kidney stone classification. ESA - Expert Systems with Applications, 89, 41–51.
Abstract: Kidney stone formation is a common disease and the incidence rate is constantly increasing worldwide. It has been shown that the classification of kidney stones can lead to an important reduction of the recurrence rate. The classification of kidney stones by human experts on the basis of certain visual color and texture features is one of the most employed techniques. However, the knowledge of how to analyze kidney stones is not widespread, and the experts learn only after being trained on a large number of samples of the different classes. In this paper we describe a new device specifically designed for capturing images of expelled kidney stones, and a method to learn and apply the experts' knowledge with regard to their classification. We show that with off-the-shelf components, a carefully selected set of features and a state-of-the-art classifier it is possible to automate this difficult task to a good degree. We report results on a collection of 454 kidney stones, achieving an overall accuracy of 63% for a set of eight classes covering almost all of the kidney stone taxonomy. Moreover, for more than 80% of samples the real class is the first or the second most probable class according to the system, and the patient recommendations for the two top classes are similar. This is the first attempt towards the automatic visual classification of kidney stones, and based on the current results we foresee better accuracies with the increase of the dataset size.
Keywords: Kidney stone; Optical device; Computer vision; Image classification
|
Angel Sappa, David Geronimo, Fadi Dornaika, & Antonio Lopez. (2007). Stereo Vision Camera Pose Estimation for On-Board Applications. In Scene Reconstruction, Pose Estimation and Tracking (pp. 39–50). Rustam Stolkin.
|
Aura Hernandez-Sabate, Petia Radeva, Antonio Tovar, & Debora Gil. (2006). Vessel structures alignment by spectral analysis of IVUS sequences. In Proc. of CVII, MICCAI Workshop (pp. 39–36). 1st International Workshop on Computer Vision for Intravascular and Intracardiac Imaging (CVII’06). Copenhagen (Denmark).
Abstract: Three-dimensional intravascular ultrasound (IVUS) allows the visualization and volumetric measurement of coronary lesions through an exploration of the cross sections and longitudinal views of arteries. However, the visualization and subsequent morpho-geometric measurements in IVUS longitudinal cuts are subject to distortion caused by periodic image/vessel motion around the IVUS catheter. Usually, to overcome the image motion artifact, ECG-gating and image-gated approaches are proposed, which slow down the pullback acquisition or disregard part of the IVUS data. In this paper, we argue that the image motion is due to 3-D vessel geometry as well as cardiac dynamics, and propose a dynamic model based on the tracking of an elliptical vessel approximation to recover the rigid transformation and align IVUS images without losing any IVUS data. We report an extensive validation with synthetic simulated data and in vivo IVUS sequences of 30 patients achieving an average reduction of the image artifact of 97% in synthetic data and 79% in real data. Our study shows that IVUS alignment improves longitudinal analysis of the IVUS data and is a necessary step towards accurate reconstruction and volumetric measurements of 3-D IVUS.
|
Firat Ismailoglu, Ida G. Sprinkhuizen-Kuyper, Evgueni Smirnov, Sergio Escalera, & Ralf Peeters. (2015). Fractional Programming Weighted Decoding for Error-Correcting Output Codes. In Multiple Classifier Systems, Proceedings of 12th International Workshop, MCS 2015 (pp. 38–50). Springer International Publishing.
Abstract: In order to increase the classification performance obtained using Error-Correcting Output Codes designs (ECOC), introducing weights in the decoding phase of the ECOC has attracted a lot of interest. In this work, we present a method for ECOC designs that focuses on increasing the hypothesis margin on the data samples given a base classifier. While achieving this, we implicitly reward the base classifiers with high performance and penalize those with low performance. The resulting objective function is of the fractional programming type and we deal with this problem through Dinkelbach's algorithm. The conducted tests over well-known UCI datasets show that the presented method is superior to the unweighted decoding and that it outperforms the results of the state-of-the-art weighted decoding methods in most of the performed experiments.
|
Salim Jouili, Salvatore Tabbone, & Ernest Valveny. (2010). Comparing Graph Similarity Measures for Graphical Recognition. In Graphics Recognition. Achievements, Challenges, and Evolution. 8th International Workshop, GREC 2009. Selected Papers (Vol. 6020, pp. 37–48). LNCS. Springer Berlin Heidelberg.
Abstract: In this paper we evaluate four graph distance measures. The analysis is performed for document retrieval tasks. To this end, different kinds of documents are used, including line drawings (symbols), ancient documents (ornamental letters), shapes and trademark-logos. The experimental results show that the performance of each graph distance measure depends on the kind of data and the graph representation technique.
|
Jaume Garcia, Petia Radeva, & Francesc Carreras. (2004). Combining Spectral and Active Shape methods to Track Tagged MRI. In Recent Advances in Artificial Intelligence Research and Development (pp. 37–44). IOS Press.
Abstract: Tagged magnetic resonance is a very useful and unique tool that provides a complete local and global knowledge of the left ventricle (LV) motion. In this article we introduce a method capable of tracking and segmenting the LV. Spectral methods are applied in order to obtain the so-called HARP images, which encode information about movement and are the base for LV point-tracking. For segmentation we use Active Shapes (ASM) that model LV shape variation in order to overcome possible local misplacements of the boundary. We finally show experiments on both synthetic and real data which appear to be very promising.
Keywords: MR; tagged MR; ASM; LV segmentation; motion estimation.
|
Ole Larsen, Petia Radeva, & Enric Marti. (1995). Bounds on the optimal elasticity parameters for a snake. Image Analysis and Processing, 37–42.
Abstract: This paper develops a formalism by which an estimate for the upper and lower bounds for the elasticity parameters for a snake can be obtained. Objects different in size and shape give rise to different bounds. The bounds can be obtained based on an analysis of the shape of the object of interest. Experiments on synthetic images show a good correlation between the estimated behaviour of the snake and the one actually observed. Experiments on real X-ray images show that the parameters for optimal segmentation lie within the estimated bounds.
|
Xim Cerda-Company, Olivier Penacchio, & Xavier Otazu. (2021). Chromatic Induction in Migraine. VISION, 37.
Abstract: The human visual system is not a colorimeter. The perceived colour of a region does not only depend on its colour spectrum, but also on the colour spectra and geometric arrangement of neighbouring regions, a phenomenon called chromatic induction. Chromatic induction is thought to be driven by lateral interactions: the activity of a central neuron is modified by stimuli outside its classical receptive field through excitatory–inhibitory mechanisms. As there is growing evidence of an excitation/inhibition imbalance in migraine, we compared chromatic induction in migraine and control groups. As hypothesised, we found a difference in the strength of induction between the two groups, with stronger induction effects in migraine. On the other hand, given the increased prevalence of visual phenomena in migraine with aura, we also hypothesised that the difference between migraine and control would be more important in migraine with aura than in migraine without aura. Our experiments did not support this hypothesis. Taken together, our results suggest a link between excitation/inhibition imbalance and increased induction effects.
Keywords: migraine; vision; colour; colour perception; chromatic induction; psychophysics
|
Jun Wan, Guodong Guo, Sergio Escalera, Hugo Jair Escalante, & Stan Z Li. (2023). Best Solutions Proposed in the Context of the Face Anti-spoofing Challenge Series. In Advances in Face Presentation Attack Detection (37–78).
Abstract: The PAD competitions we organized attracted more than 835 teams from home and abroad, most of them from the industry, which shows that the topic of face anti-spoofing is closely related to daily life, and there is an urgent need for advanced algorithms to solve its application needs. Specifically, the Chalearn LAP multi-modal face anti-spoofing attack detection challenge attracted more than 300 teams for the development phase with a total of 13 teams qualifying for the final round; the Chalearn Face Anti-spoofing Attack Detection Challenge attracted 340 teams in the development stage, and finally, 11 and 8 teams submitted their codes in the single-modal and multi-modal face anti-spoofing recognition challenges, respectively; the 3D High-Fidelity Mask Face Presentation Attack Detection Challenge attracted 195 teams for the development phase with a total of 18 teams qualifying for the final round. All the results were verified and re-run by the organizing team, and the results were used for the final ranking. In this chapter, we briefly describe the methods developed by the teams participating in each competition, and present the algorithm details of the top-three ranked teams.
|