|
Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, et al. (2023). SoccerNet 2023 Challenges Results.
Abstract: The SoccerNet 2023 challenges were the third annual video understanding challenges organized by the SoccerNet team. For this third edition, the challenges were composed of seven vision-based tasks split into three main themes. The first theme, broadcast video understanding, is composed of three high-level tasks related to describing events occurring in the video broadcasts: (1) action spotting, focusing on retrieving all timestamps related to global actions in soccer, (2) ball action spotting, focusing on retrieving all timestamps related to the soccer ball change of state, and (3) dense video captioning, focusing on describing the broadcast with natural language and anchored timestamps. The second theme, field understanding, relates to the single task of (4) camera calibration, focusing on retrieving the intrinsic and extrinsic camera parameters from images. The third and last theme, player understanding, is composed of three low-level tasks related to extracting information about the players: (5) re-identification, focusing on retrieving the same players across multiple views, (6) multiple object tracking, focusing on tracking players and the ball through unedited video streams, and (7) jersey number recognition, focusing on recognizing the jersey number of players from tracklets. Compared to the previous editions of the SoccerNet challenges, tasks (2-3-7) are novel, including new annotations and data, task (4) was enhanced with more data and annotations, and task (6) now focuses on end-to-end approaches. More information on the tasks, challenges, and leaderboards are available on this https URL. Baselines and development kits can be found on this https URL.
|
|
|
Anton Cervantes. (2005). Biometric Newborn Identification.
|
|
|
Anton Cervantes, Gemma Sanchez, Josep Llados, Agnes Borras, & A. Rodriguez. (2005). Biometric Recognition Based on Line Shape Descriptors. In Sixth IAPR International Workshop on Graphics Recognition (GREC 2005) (335–344).
|
|
|
Anton Cervantes, Gemma Sanchez, Josep Llados, Agnes Borras, & Ana Rodriguez. (2006). Biometric Recognition Based on Line Shape Descriptors. In Lecture Notes in Computer Science (Vol. 3926, 346–357,). Springer Link.
Abstract: Abstract. In this paper we propose biometric descriptors inspired by shape signatures traditionally used in graphics recognition approaches. In particular several methods based on line shape descriptors used to iden- tify newborns from the biometric information of the ears are developed. The process steps are the following: image acquisition, ear segmentation, ear normalization, feature extraction and identification. Several shape signatures are defined from contour images. These are formulated in terms of zoning and contour crossings descriptors. Experimental results are presented to demonstrate the effectiveness of the used techniques.
|
|
|
Antoni Gurgui, Debora Gil, & Enric Marti. (2015). Laplacian Unitary Domain for Texture Morphing. In Proceedings of the 10th International Conference on Computer Vision Theory and Applications VISIGRAPP2015 (Vol. 1, pp. 693–699). SciTePress.
Abstract: Deformation of expressive textures is the gateway to realistic computer synthesis of expressions. By their good mathematical properties and flexible formulation on irregular meshes, most texture mappings rely on solutions to the Laplacian in the cartesian space. In the context of facial expression morphing, this approximation can be seen from the opposite point of view by neglecting the metric. In this paper, we use the properties of the Laplacian in manifolds to present a novel approach to warping expressive facial images in order to generate a morphing between them.
Keywords: Facial; metamorphosis;LaplacianMorphing
|
|
|
Antoni Gurgui, Debora Gil, Enric Marti, & Vicente Grau. (2016). Left-Ventricle Basal Region Constrained Parametric Mapping to Unitary Domain. In 7th International Workshop on Statistical Atlases & Computational Modelling of the Heart (Vol. 10124, pp. 163–171). LNCS.
Abstract: Due to its complex geometry, the basal ring is often omitted when putting different heart geometries into correspondence. In this paper, we present the first results on a new mapping of the left ventricle basal rings onto a normalized coordinate system using a fold-over free approach to the solution to the Laplacian. To guarantee correspondences between different basal rings, we imposed some internal constrained positions at anatomical landmarks in the normalized coordinate system. To prevent internal fold-overs, constraints are handled by cutting the volume into regions defined by anatomical features and mapping each piece of the volume separately. Initial results presented in this paper indicate that our method is able to handle internal constrains without introducing fold-overs and thus guarantees one-to-one mappings between different basal ring geometries.
Keywords: Laplacian; Constrained maps; Parameterization; Basal ring
|
|
|
Antoni Rosell, Sonia Baeza, S. Garcia-Reina, JL. Mate, Ignasi Guasch, I. Nogueira, et al. (2022). EP01.05-001 Radiomics to Increase the Effectiveness of Lung Cancer Screening Programs. Radiolung Preliminary Results. JTO - Journal of Thoracic Oncology, 17(9), S182.
|
|
|
Antoni Rosell, Sonia Baeza, S. Garcia-Reina, JL. Mate, Ignasi Guasch, I. Nogueira, et al. (2022). Radiomics to increase the effectiveness of lung cancer screening programs. Radiolung preliminary results. ERJ - European Respiratory Journal, 60(66).
|
|
|
Antonio Carta, Andrea Cossu, Vincenzo Lomonaco, Davide Bacciu, & Joost Van de Weijer. (2023). Projected Latent Distillation for Data-Agnostic Consolidation in Distributed Continual Learning.
Abstract: Distributed learning on the edge often comprises self-centered devices (SCD) which learn local tasks independently and are unwilling to contribute to the performance of other SDCs. How do we achieve forward transfer at zero cost for the single SCDs? We formalize this problem as a Distributed Continual Learning scenario, where SCD adapt to local tasks and a CL model consolidates the knowledge from the resulting stream of models without looking at the SCD's private data. Unfortunately, current CL methods are not directly applicable to this scenario. We propose Data-Agnostic Consolidation (DAC), a novel double knowledge distillation method that consolidates the stream of SC models without using the original data. DAC performs distillation in the latent space via a novel Projected Latent Distillation loss. Experimental results show that DAC enables forward transfer between SCDs and reaches state-of-the-art accuracy on Split CIFAR100, CORe50 and Split TinyImageNet, both in reharsal-free and distributed CL scenarios. Somewhat surprisingly, even a single out-of-distribution image is sufficient as the only source of data during consolidation.
|
|
|
Antonio Clavelli. (2014). A computational model of eye guidance, searching for text in real scene images (Dimosthenis Karatzas, Giuseppe Boccignone, & Josep Llados, Eds.). Ph.D. thesis, Ediciones Graficas Rey, .
Abstract: Searching for text objects in real scene images is an open problem and a very active computer vision research area. A large number of methods have been proposed tackling the text search as extension of the ones from the document analysis field or inspired by general purpose object detection methods. However the general problem of object search in real scene images remains an extremely challenging problem due to the huge variability in object appearance. This thesis builds on top of the most recent findings in the visual attention literature presenting a novel computational model of eye guidance aiming to better describe text object search in real scene images.
First are presented the relevant state-of-the-art results from the visual attention literature regarding eye movements and visual search. Relevant models of attention are discussed and integrated with recent observations on the role of top-down constraints and the emerging need for a layered model of attention in which saliency is not the only factor guiding attention. Visual attention is then explained by the interaction of several modulating factors, such as objects, value, plans and saliency. Then we introduce our probabilistic formulation of attention deployment in real scene. The model is based on the rationale that oculomotor control depends on two interacting but distinct processes: an attentional process that assigns value to the sources of information and motor process that flexibly links information with action.
In such framework, the choice of where to look next is task-dependent and oriented to classes of objects embedded within pictures of complex scenes. The dependence on task is taken into account by exploiting the value and the reward of gazing at certain image patches or proto-objects that provide a sparse representation of the scene objects.
In the experimental section the model is tested in laboratory condition, comparing model simulations with data from eye tracking experiments. The comparison is qualitative in terms of observable scan paths and quantitative in terms of statistical similarity of gaze shift amplitude. Experiments are performed using eye tracking data from both a publicly available dataset of face and text and from newly performed eye-tracking experiments on a dataset of street view pictures containing text. The last part of this thesis is dedicated to study the extent to which the proposed model can account for human eye movements in a low constrained setting. We used a mobile eye tracking device and an ad-hoc developed methodology to compare model simulated eye data with the human eye data from mobile eye tracking recordings. Such setting allow to test the model in an incomplete visual information condition, reproducing a close to real-life search task.
|
|
|
Antonio Clavelli, & Dimosthenis Karatzas. (2009). Text Segmentation in Colour Posters from the Spanish Civil War Era. In 10th International Conference on Document Analysis and Recognition (pp. 181–185).
Abstract: The extraction of textual content from colour documents of a graphical nature is a complicated task. The text can be rendered in any colour, size and orientation while the existence of complex background graphics with repetitive patterns can make its localization and segmentation extremely difficult.
Here, we propose a new method for extracting textual content from such colour images that makes no assumption as to the size of the characters, their orientation or colour, while it is tolerant to characters that do not follow a straight baseline. We evaluate this method on a collection of documents with historical
connotations: the Posters from the Spanish Civil War.
|
|
|
Antonio Clavelli, Dimosthenis Karatzas, & Josep Llados. (2010). A framework for the assessment of text extraction algorithms on complex colour images. In 9th IAPR International Workshop on Document Analysis Systems (19–26).
Abstract: The availability of open, ground-truthed datasets and clear performance metrics is a crucial factor in the development of an application domain. The domain of colour text image analysis (real scenes, Web and spam images, scanned colour documents) has traditionally suffered from a lack of a comprehensive performance evaluation framework. Such a framework is extremely difficult to specify, and corresponding pixel-level accurate information tedious to define. In this paper we discuss the challenges and technical issues associated with developing such a framework. Then, we describe a complete framework for the evaluation of text extraction methods at multiple levels, provide a detailed ground-truth specification and present a case study on how this framework can be used in a real-life situation.
|
|
|
Antonio Clavelli, Dimosthenis Karatzas, Josep Llados, Mario Ferraro, & Giuseppe Boccignone. (2013). Towards Modelling an Attention-Based Text Localization Process. In 6th Iberian Conference on Pattern Recognition and Image Analysis (Vol. 7887, pp. 296–303). LNCS. Springer Berlin Heidelberg.
Abstract: This note introduces a visual attention model of text localization in real-world scenes. The core of the model built upon the proto-object concept is discussed. It is shown how such dynamic mid-level representation of the scene can be derived in the framework of an action-perception loop engaging salience, text information value computation, and eye guidance mechanisms.
Preliminary results that compare model generated scanpaths with those eye-tracked from human subjects are presented.
Keywords: text localization; visual attention; eye guidance
|
|
|
Antonio Clavelli, Dimosthenis Karatzas, Josep Llados, Mario Ferraro, & Giuseppe Boccignone. (2014). Modelling task-dependent eye guidance to objects in pictures. CoCom - Cognitive Computation, 6(3), 558–584.
Abstract: 5Y Impact Factor: 1.14 / 3rd (Computer Science, Artificial Intelligence)
We introduce a model of attentional eye guidance based on the rationale that the deployment of gaze is to be considered in the context of a general action-perception loop relying on two strictly intertwined processes: sensory processing, depending on current gaze position, identifies sources of information that are most valuable under the given task; motor processing links such information with the oculomotor act by sampling the next gaze position and thus performing the gaze shift. In such a framework, the choice of where to look next is task-dependent and oriented to classes of objects embedded within pictures of complex scenes. The dependence on task is taken into account by exploiting the value and the payoff of gazing at certain image patches or proto-objects that provide a sparse representation of the scene objects. The different levels of the action-perception loop are represented in probabilistic form and eventually give rise to a stochastic process that generates the gaze sequence. This way the model also accounts for statistical properties of gaze shifts such as individual scan path variability. Results of the simulations are compared either with experimental data derived from publicly available datasets and from our own experiments.
Keywords: Visual attention; Gaze guidance; Value; Payoff; Stochastic fixation prediction
|
|
|
Antonio Esteban Lansaque. (2014). 3D reconstruction and recognition using structured ligth (Vol. 179). Master's thesis, , .
Abstract: This work covers the problem of 3D reconstruction, recognition and 6DOF pose estimation. The goal of this project is to reconstruct a 3D scene and to align an object model of the industrial pieces onto the reconstructed scene. The reconstruction algorithm is based on stereo techniques and the recognition algorithm is based on SHOT descriptors computed on a set of uniform keypoints. Correspondences are used to estimate a first 6DOF transformation that maps the model onto the scene and then ICP algorithm is used to refine the transformation. In order to check the effectiveness of the proposed algorithm, several experiments were performed. These experiments were conducted on a lab environment in order to get results under the same conditions in all of them. Although obtained results are not real time results, the proposed algorithm ends up with high rates of object recognition.
|
|