|
Records |
Links |
|
Author |
Olivier Lefebvre; Pau Riba; Charles Fournier; Alicia Fornes; Josep Llados; Rejean Plamondon; Jules Gagnon-Marchand |
|
|
Title |
Monitoring neuromotricity on-line: a cloud computing approach |
Type |
Conference Article |
|
Year |
2015 |
Publication |
17th Conference of the International Graphonomics Society IGS2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
The goal of our experiment is to develop a useful and accessible tool that can be used to evaluate a patient's health by analyzing handwritten strokes. We use a cloud computing approach to analyze stroke data sampled on a commercial tablet working on the Android platform and a distant server to perform complex calculations using the Delta and Sigma lognormal algorithms. A Google Drive account is used to store the data and to ease the development of the project. The communication between the tablet, the cloud and the server is encrypted to ensure biomedical information confidentiality. Highly parameterized biomedical tests are implemented on the tablet as well as a free drawing test to evaluate the validity of the data acquired by the first test compared to the second one. A blurred shape model descriptor pattern recognition algorithm is used to classify the data obtained by the free drawing test. The functions presented in this paper are still under development and other improvements are needed before launching the application in the public domain. |
|
|
Address |
Pointe-à-Pitre; Guadeloupe; June 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IGS |
|
|
Notes |
DAG; 600.077 |
Approved |
no |
|
|
Call Number |
Admin @ si @ LRF2015 |
Serial |
2617 |
|
Permanent link to this record |
|
|
|
|
Author |
Sounak Dey; Anjan Dutta; Suman Ghosh; Ernest Valveny; Josep Llados |
|
|
Title |
Aligning Salient Objects to Queries: A Multi-modal and Multi-object Image Retrieval Framework |
Type |
Conference Article |
|
Year |
2018 |
Publication |
14th Asian Conference on Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
In this paper we propose an approach for multi-modal image retrieval in multi-labelled images. A multi-modal deep network architecture is formulated to jointly model sketches and text as input query modalities into a common embedding space, which is then further aligned with the image feature space. Our architecture also relies on a salient object detection through a supervised LSTM-based visual attention model learned from convolutional features. Both the alignment between the queries and the image and the supervision of the attention on the images are obtained by generalizing the Hungarian Algorithm using different loss functions. This permits encoding the object-based features and their alignment with the query irrespective of the availability of the co-occurrence of different objects in the training set. We validate the performance of our approach on standard single/multi-object datasets, showing state-of-the-art performance in every dataset. |
|
|
Address |
Perth; Australia; December 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ACCV |
|
|
Notes |
DAG; 600.097; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ DDG2018a |
Serial |
3151 |
|
Permanent link to this record |
|
|
|
|
Author |
Miquel Ferrer; Ernest Valveny; F. Serratosa |
|
|
Title |
Median Graph Computation by means of a Genetic Approach Based on Minimum Common Supergraph and Maximum Common Subgraph |
Type |
Conference Article |
|
Year |
2009 |
Publication |
4th Iberian Conference on Pattern Recognition and Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
5524 |
Issue |
|
Pages |
346–353 |
|
|
Keywords |
|
|
|
Abstract |
Given a set of graphs, the median graph has been theoretically presented as a useful concept to infer a representative of the set. However, the computation of the median graph is a highly complex task and its practical application has been very limited up to now. In this work we present a new genetic algorithm for the median graph computation. A set of experiments on real data, where none of the existing algorithms for the median graph computation could be applied up to now due to their computational complexity, show that we obtain good approximations of the median graph. Finally, we use the median graph in a real nearest neighbour classification showing that it leaves the box of the only-theoretical concepts and demonstrating, from a practical point of view, that it can be a useful tool to represent a set of graphs. |
|
|
Address |
Póvoa de Varzim, Portugal |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-02171-8 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IbPRIA |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ FVS2009c |
Serial |
1174 |
|
Permanent link to this record |
|
|
|
|
Author |
Albert Gordo; Ernest Valveny |
|
|
Title |
The diagonal split: A pre-segmentation step for page layout analysis & classification |
Type |
Conference Article |
|
Year |
2009 |
Publication |
4th Iberian Conference on Pattern Recognition and Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
5524 |
Issue |
|
Pages |
290–297 |
|
|
Keywords |
|
|
|
Abstract |
Document classification is an important task in all the processes related to document storage and retrieval. In the case of complex documents, structural features are needed to achieve a correct classification. Unfortunately, physical layout analysis is error prone. In this paper we present a pre-segmentation step based on a divide & conquer strategy that can be used to improve the page segmentation results, independently of the segmentation algorithm used. This pre-segmentation step is evaluated in classification and retrieval using the selective CRLA algorithm for layout segmentation together with a clustering based on the Voronoi area diagram, and tested on two different databases, MARG and Girona Archives. |
|
|
Address |
Póvoa de Varzim, Portugal |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-02171-8 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IbPRIA |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ Gov2009b |
Serial |
1176 |
|
Permanent link to this record |
|
|
|
|
Author |
Sangeeth Reddy; Minesh Mathew; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas; C.V. Jawahar |
|
|
Title |
RoadText-1K: Text Detection and Recognition Dataset for Driving Videos |
Type |
Conference Article |
|
Year |
2020 |
Publication |
IEEE International Conference on Robotics and Automation |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Perceiving text is crucial to understand semantics of outdoor scenes and hence is a critical requirement to build intelligent systems for driver assistance and self-driving. Most of the existing datasets for text detection and recognition comprise still images and are mostly compiled keeping text in mind. This paper introduces a new "RoadText-1K" dataset for text in driving videos. The dataset is 20 times larger than the existing largest dataset for text in videos. Our dataset comprises 1000 video clips of driving without any bias towards text and with annotations for text bounding boxes and transcriptions in every frame. State of the art methods for text detection, recognition and tracking are evaluated on the new dataset and the results signify the challenges in unconstrained driving videos compared to existing datasets. This suggests that RoadText-1K is suited for research and development of reading systems, robust enough to be incorporated into more complex downstream tasks like driver assistance and self-driving. The dataset can be found at http://cvit.iiit.ac.in/research/projects/cvit-projects/roadtext-1k |
|
|
Address |
Paris; France; ??? |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICRA |
|
|
Notes |
DAG; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RMG2020 |
Serial |
3400 |
|
Permanent link to this record |
|
|
|
|
Author |
Arnau Baro; Pau Riba; Alicia Fornes |
|
|
Title |
A Starting Point for Handwritten Music Recognition |
Type |
Conference Article |
|
Year |
2018 |
Publication |
1st International Workshop on Reading Music Systems |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
5-6 |
|
|
Keywords |
Optical Music Recognition; Long Short-Term Memory; Convolutional Neural Networks; MUSCIMA++; CVCMUSCIMA |
|
|
Abstract |
In recent years, the interest in Optical Music Recognition (OMR) has reawakened, especially since the appearance of deep learning. However, there are very few works addressing handwritten scores. In this work we describe a full OMR pipeline for handwritten music scores by using Convolutional and Recurrent Neural Networks that could serve as a baseline for the research community. |
|
|
Address |
Paris; France; September 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WORMS |
|
|
Notes |
DAG; 600.097; 601.302; 601.330; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BRF2018 |
Serial |
3223 |
|
Permanent link to this record |
|
|
|
|
Author |
Soumya Jahagirdar; Minesh Mathew; Dimosthenis Karatzas; CV Jawahar |
|
|
Title |
Understanding Video Scenes Through Text: Insights from Text-Based Video Question Answering |
Type |
Conference Article |
|
Year |
2023 |
Publication |
Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Researchers have extensively studied the field of vision and language, discovering that both visual and textual content is crucial for understanding scenes effectively. Particularly, comprehending text in videos holds great significance, requiring both scene text understanding and temporal reasoning. This paper focuses on exploring two recently introduced datasets, NewsVideoQA and M4-ViteVQA, which aim to address video question answering based on textual content. The NewsVideoQA dataset contains question-answer pairs related to the text in news videos, while M4-ViteVQA comprises question-answer pairs from diverse categories like vlogging, traveling, and shopping. We provide an analysis of the formulation of these datasets on various levels, exploring the degree of visual understanding and multi-frame comprehension required for answering the questions. Additionally, the study includes experimentation with BERT-QA, a text-only model, which demonstrates comparable performance to the original methods on both datasets, indicating the shortcomings in the formulation of these datasets. Furthermore, we also look into the domain adaptation aspect by examining the effectiveness of training on M4-ViteVQA and evaluating on NewsVideoQA and vice-versa, thereby shedding light on the challenges and potential benefits of out-of-domain training. |
|
|
Address |
Paris; France; October 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICCVW |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ JMK2023 |
Serial |
3946 |
|
Permanent link to this record |
|
|
|
|
Author |
Jordy Van Landeghem; Ruben Tito; Lukasz Borchmann; Michal Pietruszka; Pawel Joziak; Rafal Powalski; Dawid Jurkiewicz; Mickael Coustaty; Bertrand Anckaert; Ernest Valveny; Matthew Blaschko; Sien Moens; Tomasz Stanislawek |
|
|
Title |
Document Understanding Dataset and Evaluation (DUDE) |
Type |
Conference Article |
|
Year |
2023 |
Publication |
20th IEEE International Conference on Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
19528-19540 |
|
|
Keywords |
|
|
|
Abstract |
We call on the Document AI (DocAI) community to re-evaluate current methodologies and embrace the challenge of creating more practically-oriented benchmarks. Document Understanding Dataset and Evaluation (DUDE) seeks to remediate the halted research progress in understanding visually-rich documents (VRDs). We present a new dataset with novelties related to types of questions, answers, and document layouts based on multi-industry, multi-domain, and multi-page VRDs of various origins and dates. Moreover, we are pushing the boundaries of current methods by creating multi-task and multi-domain evaluation setups that more accurately simulate real-world situations where powerful generalization and adaptation under low-resource settings are desired. DUDE aims to set a new standard as a more practical, long-standing benchmark for the community, and we hope that it will lead to future extensions and contributions that address real-world challenges. Finally, our work illustrates the importance of finding more efficient ways to model language, images, and layout in DocAI. |
|
|
Address |
Paris; France; October 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICCV |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ LTB2023 |
Serial |
3948 |
|
Permanent link to this record |
|
|
|
|
Author |
Gemma Sanchez; Josep Llados; Enric Marti |
|
|
Title |
Segmentation and analysis of linial texture in plans |
Type |
Conference Article |
|
Year |
1997 |
Publication |
Intelligence Artificielle et Complexité. |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Structural Texture, Voronoi, Hierarchical Clustering, String Matching. |
|
|
Abstract |
The problem of texture segmentation and interpretation is one of the main concerns in the field of document analysis. Graphical documents often contain areas characterized by a structural texture whose recognition allows both the document understanding, and its storage in a more compact way. In this work, we focus on structural linial textures of regular repetition contained in plan documents. Starting from an attributed graph which represents the vectorized input image, we develop a method to segment textured areas and recognize their placement rules. We wish to emphasize that the searched textures do not follow a predefined pattern. Minimal closed loops of the input graph are computed, and then hierarchically clustered. In this hierarchical clustering, a distance function between two closed loops is defined in terms of their area difference and boundary resemblance computed by a string matching procedure. Finally it is noted that, when the texture consists of isolated primitive elements, the same method can be used after computing a Voronoi Tessellation of the input graph. |
|
|
Address |
Paris, France |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
Paris |
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
AERFAI |
|
|
Notes |
DAG;IAM; |
Approved |
no |
|
|
Call Number |
IAM @ iam @ SLM1997 |
Serial |
1649 |
|
Permanent link to this record |
|
|
|
|
Author |
Agnes Borras; Francesc Tous; Josep Llados; Maria Vanrell |
|
|
Title |
High-Level Clothes Description Based on Colour-Texture and Structural Features |
Type |
Conference Article |
|
Year |
2003 |
Publication |
1st Iberian Conference on Pattern Recognition and Image Analysis IbPRIA 2003 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Palma de Mallorca |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG;CIC |
Approved |
no |
|
|
Call Number |
CAT @ cat @ BTL2003b |
Serial |
369 |
|
Permanent link to this record |