Adrien Pavao, Isabelle Guyon, Anne-Catherine Letournel, Dinh-Tuan Tran, Xavier Baro, Hugo Jair Escalante, et al. (2023). CodaLab Competitions: An Open Source Platform to Organize Scientific Challenges. JMLR - Journal of Machine Learning Research.
Abstract: CodaLab Competitions is an open source web platform designed to help data scientists and research teams to crowd-source the resolution of machine learning problems through the organization of competitions, also called challenges or contests. CodaLab Competitions provides useful features such as multiple phases, results and code submissions, multi-score leaderboards, and jobs running inside Docker containers. The platform is very flexible and can handle large scale experiments, by allowing organizers to upload large datasets and provide their own CPU or GPU compute workers.
|
Ruben Ballester, Carles Casacuberta, & Sergio Escalera. (2023). Decorrelating neurons using persistence.
Abstract: We propose a novel way to improve the generalisation capacity of deep learning models by reducing high correlations between neurons. For this, we present two regularisation terms computed from the weights of a minimum spanning tree of the clique whose vertices are the neurons of a given network (or a sample of those), where weights on edges are correlation dissimilarities. We provide an extensive set of experiments to validate the effectiveness of our terms, showing that they outperform popular ones. Also, we demonstrate that naive minimisation of all correlations between neurons obtains lower accuracies than our regularisation terms, suggesting that redundancies play a significant role in artificial neural networks, as evidenced by some studies in neuroscience for real networks. We include a proof of differentiability of our regularisers, thus developing the first effective topological persistence-based regularisation terms that consider the whole set of neurons and that can be applied to a feedforward architecture in any deep learning task such as classification, data generation, or regression.
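As a rough illustration of the quantity this abstract describes, the sketch below builds the complete graph over a few neurons, weights each edge by a correlation dissimilarity, and collects the edge weights of a minimum spanning tree using Prim's algorithm. The dissimilarity 1 - |Pearson correlation|, the function names, and the toy activations are assumptions made for this example and are not taken from the authors' implementation; the regularisation terms themselves are not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mst_edge_weights(activations):
    """Return the edge weights of a minimum spanning tree of the clique
    whose vertices are neurons (one activation list per neuron).

    Assumption: the edge weight is 1 - |Pearson correlation|.
    """
    n = len(activations)
    dist = [
        [0.0 if i == j else 1.0 - abs(pearson(activations[i], activations[j]))
         for j in range(n)]
        for i in range(n)
    ]
    in_tree = {0}          # Prim's algorithm, started from neuron 0
    best = dist[0][:]      # cheapest known connection of each vertex to the tree
    weights = []
    while len(in_tree) < n:
        u = min((v for v in range(n) if v not in in_tree), key=best.__getitem__)
        weights.append(best[u])
        in_tree.add(u)
        for v in range(n):
            if v not in in_tree and dist[u][v] < best[v]:
                best[v] = dist[u][v]
    return weights


# Hypothetical activations of three neurons over four samples.
toy = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with the first neuron
    [1.0, -1.0, 1.0, -1.0],
]
weights = mst_edge_weights(toy)
```

A tree on n neurons has n - 1 edges, so a term built from these weights scales linearly with the number of neurons, whereas a penalty over all pairwise correlations has a quadratic number of terms.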
|
Anders Skaarup Johansen, Kamal Nasrollahi, Sergio Escalera, & Thomas B. Moeslund. (2023). Who Cares about the Weather? Inferring Weather Conditions for Weather-Aware Object Detection in Thermal Images. AS - Applied Sciences, 13(18).
Abstract: Deployments of real-world object detection systems often experience a degradation in performance over time due to concept drift. Systems that leverage thermal cameras are especially susceptible because the respective thermal signatures of objects and their surroundings are highly sensitive to environmental changes. In this study, two types of weather-aware latent conditioning methods are investigated. The proposed method aims to guide two object detectors (YOLOv5 and Deformable DETR) to become weather-aware. This is achieved by leveraging an auxiliary branch that predicts weather-related information while conditioning intermediate layers of the object detector. While the conditioning methods proposed do not directly improve the accuracy of baseline detectors, it can be observed that conditioned networks manage to extract a weather-related signal from the thermal images, thus resulting in a decreased miss rate at the cost of increased false positives. The extracted signal appears noisy and is thus challenging to regress accurately. This is most likely a result of the qualitative nature of the thermal sensor; thus, further work is needed to identify an ideal method for optimizing the conditioning branch, as well as to further improve the accuracy of the system.
Keywords: thermal; object detection; concept drift; conditioning; weather recognition
|
Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, et al. (2023). SoccerNet 2023 Challenges Results.
Abstract: The SoccerNet 2023 challenges were the third annual video understanding challenges organized by the SoccerNet team. For this third edition, the challenges were composed of seven vision-based tasks split into three main themes. The first theme, broadcast video understanding, is composed of three high-level tasks related to describing events occurring in the video broadcasts: (1) action spotting, focusing on retrieving all timestamps related to global actions in soccer, (2) ball action spotting, focusing on retrieving all timestamps related to the soccer ball change of state, and (3) dense video captioning, focusing on describing the broadcast with natural language and anchored timestamps. The second theme, field understanding, relates to the single task of (4) camera calibration, focusing on retrieving the intrinsic and extrinsic camera parameters from images. The third and last theme, player understanding, is composed of three low-level tasks related to extracting information about the players: (5) re-identification, focusing on retrieving the same players across multiple views, (6) multiple object tracking, focusing on tracking players and the ball through unedited video streams, and (7) jersey number recognition, focusing on recognizing the jersey number of players from tracklets. Compared to the previous editions of the SoccerNet challenges, tasks (2-3-7) are novel, including new annotations and data, task (4) was enhanced with more data and annotations, and task (6) now focuses on end-to-end approaches. More information on the tasks, challenges, and leaderboards is available on this https URL. Baselines and development kits can be found on this https URL.
|
Razieh Rastgoo, Kourosh Kiani, & Sergio Escalera. (2024). A transformer model for boundary detection in continuous sign language. MTAP - Multimedia Tools and Applications.
Abstract: Sign Language Recognition (SLR) has garnered significant attention from researchers in recent years, particularly the intricate domain of Continuous Sign Language Recognition (CSLR), which presents heightened complexity compared to Isolated Sign Language Recognition (ISLR). One of the prominent challenges in CSLR pertains to accurately detecting the boundaries of isolated signs within a continuous video stream. Additionally, the reliance on handcrafted features in existing models poses a challenge to achieving optimal accuracy. To surmount these challenges, we propose a novel approach utilizing a Transformer-based model. Unlike traditional models, our approach focuses on enhancing accuracy while eliminating the need for handcrafted features. The Transformer model is employed for both ISLR and CSLR. The training process involves using isolated sign videos, where hand keypoint features extracted from the input video are enriched using the Transformer model. Subsequently, these enriched features are forwarded to the final classification layer. The trained model, coupled with a post-processing method, is then applied to detect isolated sign boundaries within continuous sign videos. The evaluation of our model, conducted on two distinct datasets that include both continuous signs and their corresponding isolated signs, demonstrates promising results.
|
Mustafa Hajij, Mathilde Papillon, Florian Frantzen, Jens Agerberg, Ibrahem AlJabea, Ruben Ballester, et al. (2024). TopoX: A Suite of Python Packages for Machine Learning on Topological Domains.
Abstract: We introduce TopoX, a Python software suite that provides reliable and user-friendly building blocks for computing and machine learning on topological domains that extend graphs: hypergraphs, simplicial, cellular, path and combinatorial complexes. TopoX consists of three packages: TopoNetX facilitates constructing and computing on these domains, including working with nodes, edges and higher-order cells; TopoEmbedX provides methods to embed topological domains into vector spaces, akin to popular graph-based embedding algorithms such as node2vec; TopoModelX is built on top of PyTorch and offers a comprehensive toolbox of higher-order message passing functions for neural networks on topological domains. The extensively documented and unit-tested source code of TopoX is available under MIT license at this https URL.
|
German Barquero, Sergio Escalera, & Cristina Palmero. (2024). Seamless Human Motion Composition with Blended Positional Encodings.
Abstract: Conditional human motion generation is an important topic with many applications in virtual reality, gaming, and robotics. While prior works have focused on generating motion guided by text, music, or scenes, these typically result in isolated motions confined to short durations. Instead, we address the generation of long, continuous sequences guided by a series of varying textual descriptions. In this context, we introduce FlowMDM, the first diffusion-based model that generates seamless Human Motion Compositions (HMC) without any postprocessing or redundant denoising steps. For this, we introduce the Blended Positional Encodings, a technique that leverages both absolute and relative positional encodings in the denoising chain. More specifically, global motion coherence is recovered at the absolute stage, whereas smooth and realistic transitions are built at the relative stage. As a result, we achieve state-of-the-art results in terms of accuracy, realism, and smoothness on the Babel and HumanML3D datasets. FlowMDM excels when trained with only a single description per motion sequence thanks to its Pose-Centric Cross-ATtention, which makes it robust against varying text descriptions at inference time. Finally, to address the limitations of existing HMC metrics, we propose two new metrics: the Peak Jerk and the Area Under the Jerk, to detect abrupt transitions.
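Jerk is the third time derivative of position, so an abrupt transition in a motion sequence shows up as a spike in it. The sketch below estimates jerk for a single one-dimensional trajectory with a third-order finite difference and reports its peak magnitude. It only illustrates the underlying quantity; the function names, the uniform time step, and the toy trajectory are assumptions for this example, and the code does not reproduce the paper's Peak Jerk or Area Under the Jerk definitions.

```python
def finite_difference_jerk(positions, dt=1.0):
    """Third-order forward difference of a uniformly sampled 1-D trajectory."""
    return [
        (positions[i + 3] - 3 * positions[i + 2]
         + 3 * positions[i + 1] - positions[i]) / dt ** 3
        for i in range(len(positions) - 3)
    ]


def peak_abs_jerk(positions, dt=1.0):
    """Largest jerk magnitude along the trajectory (0.0 if fewer than 4 samples)."""
    jerk = finite_difference_jerk(positions, dt)
    return max((abs(j) for j in jerk), default=0.0)


# Hypothetical trajectories: p(t) = t**3 has constant jerk 6,
# while p(t) = t**2 has zero jerk everywhere.
cubic = [float(t ** 3) for t in range(5)]
quadratic = [float(t ** 2) for t in range(5)]
cubic_peak = peak_abs_jerk(cubic)
quadratic_peak = peak_abs_jerk(quadratic)
```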
|
A. Martinez, & Jordi Vitria. (1997). Using Low-Dimensional Spaces for Face Recognition.
|
J.R. Serra, A. Martinez, Jordi Vitria, & J.B. Subirana. (1997). Iconic Representation to Image Retrieval.
|
Ernest Valveny, Ricardo Toledo, Ramon Baldrich, & Enric Marti. (2002). Combining recognition-based and segmentation-based approaches for graphic symbol recognition using deformable template matching. In Proceedings of the Second IASTED International Conference Visualization, Imaging and Image Processing VIIP 2002 (pp. 502–507).
|
Josep Llados, Enric Marti, & Juan J.Villanueva. (2001). Symbol recognition by error-tolerant subgraph matching between region adjacency graphs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10), 1137–1143.
Abstract: The recognition of symbols in graphic documents is an intensive research activity in the community of pattern recognition and document analysis. A key issue in the interpretation of maps, engineering drawings, diagrams, etc. is the recognition of domain dependent symbols according to a symbol database. In this work we first review the most outstanding symbol recognition methods from two different points of view: application domains and pattern recognition methods. In the second part of the paper, open and unaddressed problems involved in symbol recognition are described, analyzing their current state of the art and discussing future research challenges. Thus, issues such as symbol representation, matching, segmentation, learning, scalability of recognition methods and performance evaluation are addressed in this work. Finally, we discuss the perspectives of symbol recognition concerning new paradigms such as user interfaces in handheld computers or document database and WWW indexing by graphical content.
|
Josep Llados, Horst Bunke, & Enric Marti. (1997). Finding rotational symmetries by cyclic string matching. PRL - Pattern Recognition Letters, 18(14), 1435–1442.
Abstract: Symmetry is an important shape feature. In this paper, a simple and fast method to detect perfect and distorted rotational symmetries of 2D objects is described. The boundary of a shape is polygonally approximated and represented as a string. Rotational symmetries are found by cyclic string matching between two identical copies of the shape string. The set of minimum cost edit sequences that transform the shape string to a cyclically shifted version of itself defines the rotational symmetry and its order. Finally, a modification of the algorithm is proposed to detect reflectional symmetries. Some experimental results are presented to show the reliability of the proposed algorithm.
Keywords: Rotational symmetry; Reflectional symmetry; String matching
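The idea of matching a shape string against cyclic shifts of itself can be sketched for the simplest case of perfect symmetries. The example below uses exact comparison in place of the minimum-cost edit sequences described in the abstract, so it does not handle distorted shapes; the function name and the toy boundary strings are hypothetical.

```python
def rotational_symmetry_order(shape):
    """Order of rotational symmetry of a cyclic boundary string.

    Finds the smallest cyclic shift that maps the string onto itself
    (exact matching only) and returns len(shape) // shift. A result of 1
    means the shape has no non-trivial rotational symmetry.
    """
    n = len(shape)
    if n == 0:
        return 0
    doubled = shape + shape  # every cyclic shift is a length-n window of this
    for shift in range(1, n + 1):
        if doubled[shift:shift + n] == shape:
            return n // shift
    return 1  # unreachable: shift == n always matches


# Toy boundary strings: each symbol stands for one polygon edge type.
square_like = rotational_symmetry_order("abababab")   # pattern "ab" repeated 4 times
rectangle_like = rotational_symmetry_order("abcabc")  # pattern "abc" repeated twice
asymmetric = rotational_symmetry_order("abcd")
```

The shifts that map the string onto itself form a subgroup of the cyclic group of order n, so the smallest such shift always divides n and the integer division is exact.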
|
Josep Llados, Horst Bunke, & Enric Marti. (1997). Using Cyclic String Matching to Find Rotational and Reflectional Symmetries in Shapes. In Intelligent Robots: Sensing, Modeling and Planning (pp. 164–179). World Scientific Press.
Abstract: Dagstuhl Workshop
|
Josep Llados, Horst Bunke, & Enric Marti. (1996). Structural Recognition of hand drawn floor plans. In VI National Symposium on Pattern Recognition and Image Analysis. Cordoba.
Abstract: A system to recognize hand drawn architectural drawings in a CAD environment has been developed. In this paper we focus on its high level interpretation module. To interpret a floor plan, the system must identify several building elements, whose description is stored in a library of patterns, as well as their spatial relationships. We propose a structural approach based on subgraph isomorphism techniques to obtain a high-level interpretation of the document. The vectorized input document and the patterns to be recognized are represented by attributed graphs. Discrete relaxation techniques (AC4 algorithm) have been applied to develop the matching algorithm. The process has been divided into three steps: node labeling, local consistency and global consistency verification. The hand drawn creation causes disturbed line drawings with several accuracy errors, which must be taken into account. Here we have identified them and the AC4 algorithm has been adapted to manage them.
Keywords: Rotational Symmetry; Reflectional Symmetry; String Matching.
|
Josep Llados, & Enric Marti. (1999). A graph-edit algorithm for hand-drawn graphical document recognition and their automatic introduction into CAD systems. Machine Graphics & Vision, 8, 195–211.
|