|
Ramon Baldrich, Ricardo Toledo, Ernest Valveny, & Maria Vanrell. (2002). Perceptual Colour Image Segmentation..
|
|
|
Robert Benavente, Gemma Sanchez, Ramon Baldrich, Maria Vanrell, & Josep Llados. (2000). Normalized colour segmentation for human appearance description..
|
|
|
Francesc Tous, Agnes Borras, Robert Benavente, Ramon Baldrich, Maria Vanrell, & Josep Llados. (2002). Textual Descriptors for browsing people by visual appearence. In 5è. Congrés Català d’Intel·ligència Artificial CCIA.
Abstract: This paper presents a first approach to build colour and structural descriptors for information retrieval on a people database. Queries are formulated in terms of their appearance that allows to seek people wearing specific clothes of a given colour name or texture. Descriptors are automatically computed by following three essential steps. A colour naming labelling from pixel properties. A region seg- mentation step based on colour properties of pixels combined with edge information. And a high level step that models the region arrangements in order to build clothes structure. Results are tested on large set of images from real scenes taken at the entrance desk of a building.
Keywords: Image retrieval, textual descriptors, colour naming, colour normalization, graph matching.
|
|
|
Francesc Tous, Agnes Borras, Robert Benavente, Ramon Baldrich, Maria Vanrell, & Josep Llados. (2002). Textual Descriptions for Browsing People by Visual Apperance. In Lecture Notes in Artificial Intelligence (Vol. 2504, pp. 419–429). Springer Verlag.
Abstract: This paper presents a first approach to build colour and structural descriptors for information retrieval on a people database. Queries are formulated in terms of their appearance that allows to seek people wearing specific clothes of a given colour name or texture. Descriptors are automatically computed by following three essential steps. A colour naming labelling from pixel properties. A region seg- mentation step based on colour properties of pixels combined with edge information. And a high level step that models the region arrangements in order to build clothes structure. Results are tested on large set of images from real scenes taken at the entrance desk of a building
|
|
|
Agnes Borras, Francesc Tous, Josep Llados, & Maria Vanrell. (2003). High-Level Clothes Description Based on Color-Texture and Structural Features. In Lecture Notes in Computer Science (Vol. 2652, 108–116).
Abstract: This work is a part of a surveillance system where content- based image retrieval is done in terms of people appearance. Given an image of a person, our work provides an automatic description of his clothing according to the colour, texture and structural composition of its garments. We present a two-stage process composed by image segmentation and a region-based interpretation. We segment an image by modelling it due to an attributed graph and applying a hybrid method that follows a split-and-merge strategy. We propose the interpretation of five cloth combinations that are modelled in a graph structure in terms of region features. The interpretation is viewed as a graph matching with an associated cost between the segmentation and the cloth models. Fi- nally, we have tested the process with a ground-truth of one hundred images.
|
|
|
Agnes Borras, Francesc Tous, Josep Llados, & Maria Vanrell. (2003). High-Level Clothes Description Based on Colour-Texture and Structural Features. In 1rst. Iberian Conference on Pattern Recognition and Image Analysis IbPRIA 2003.
|
|
|
Partha Pratim Roy, Eduard Vazquez, Josep Llados, Ramon Baldrich, & Umapada Pal. (2008). A System to Segment Text and Symbols from Color Maps. In Graphics Recognition. Recent Advances and New Opportunities (Vol. 5046, pp. 245–256). LNCS.
|
|
|
Ernest Valveny, & Antonio Lopez. (2003). Numeral Recognition for Quality Control of Surgical Sachets.
|
|
|
Marçal Rusiñol, David Aldavert, Ricardo Toledo, & Josep Llados. (2011). Browsing Heterogeneous Document Collections by a Segmentation-Free Word Spotting Method. In 11th International Conference on Document Analysis and Recognition (pp. 63–67).
Abstract: In this paper, we present a segmentation-free word spotting method that is able to deal with heterogeneous document image collections. We propose a patch-based framework where patches are represented by a bag-of-visual-words model powered by SIFT descriptors. A later refinement of the feature vectors is performed by applying the latent semantic indexing technique. The proposed method performs well on both handwritten and typewritten historical document images. We have also tested our method on documents written in non-Latin scripts.
|
|
|
Adria Rico, & Alicia Fornes. (2017). Camera-based Optical Music Recognition using a Convolutional Neural Network. In 12th IAPR International Workshop on Graphics Recognition (pp. 27–28).
Abstract: Optical Music Recognition (OMR) consists in recognizing images of music scores. Contrary to expectation, the current OMR systems usually fail when recognizing images of scores captured by digital cameras and smartphones. In this work, we propose a camera-based OMR system based on Convolutional Neural Networks, showing promising preliminary results
Keywords: optical music recognition; document analysis; convolutional neural network; deep learning
|
|
|
Marçal Rusiñol, David Aldavert, Dimosthenis Karatzas, Ricardo Toledo, & Josep Llados. (2011). Interactive Trademark Image Retrieval by Fusing Semantic and Visual Content. Advances in Information Retrieval. In P. Clough, C. Foley, C. Gurrin, G.J.F. Jones, W. Kraaij, H. Lee, et al. (Eds.), 33rd European Conference on Information Retrieval (Vol. 6611, pp. 314–325). LNCS. Berlin: Springer.
Abstract: In this paper we propose an efficient queried-by-example retrieval system which is able to retrieve trademark images by similarity from patent and trademark offices' digital libraries. Logo images are described by both their semantic content, by means of the Vienna codes, and their visual contents, by using shape and color as visual cues. The trademark descriptors are then indexed by a locality-sensitive hashing data structure aiming to perform approximate k-NN search in high dimensional spaces in sub-linear time. The resulting ranked lists are combined by using the Condorcet method and a relevance feedback step helps to iteratively revise the query and refine the obtained results. The experiments demonstrate the effectiveness and efficiency of this system on a realistic and large dataset.
|
|
|
Ali Furkan Biten, Ruben Tito, Lluis Gomez, Ernest Valveny, & Dimosthenis Karatzas. (2022). OCR-IDL: OCR Annotations for Industry Document Library Dataset. In ECCV Workshop on Text in Everything.
Abstract: Pretraining has proven successful in Document Intelligence tasks where deluge of documents are used to pretrain the models only later to be finetuned on downstream tasks. One of the problems of the pretraining approaches is the inconsistent usage of pretraining data with different OCR engines leading to incomparable results between models. In other words, it is not obvious whether the performance gain is coming from diverse usage of amount of data and distinct OCR engines or from the proposed models. To remedy the problem, we make public the OCR annotations for IDL documents using commercial OCR engine given their superior performance over open source OCR models. The contributed dataset (OCR-IDL) has an estimated monetary value over 20K US$. It is our hope that OCR-IDL can be a starting point for future works on Document Intelligence. All of our data and its collection process with the annotations can be found in this https URL.
|
|
|
Carlos David Martinez Hinarejos, Josep Llados, Alicia Fornes, Francisco Casacuberta, Lluis de Las Heras, Joan Mas, et al. (2016). Context, multimodality, and user collaboration in handwritten text processing: the CoMUN-HaT project. In 3rd IberSPEECH.
Abstract: Processing of handwritten documents is a task that is of wide interest for many
purposes, such as those related to preserve cultural heritage. Handwritten text recognition techniques have been successfully applied during the last decade to obtain transcriptions of handwritten documents, and keyword spotting techniques have been applied for searching specific terms in image collections of handwritten documents. However, results on transcription and indexing are far from perfect. In this framework, the use of new data sources arises as a new paradigm that will allow for a better transcription and indexing of handwritten documents. Three main different data sources could be considered: context of the document (style, writer, historical time, topics,. . . ), multimodal data (representations of the document in a different modality, such as the speech signal of the dictation of the text), and user feedback (corrections, amendments,. . . ). The CoMUN-HaT project aims at the integration of these different data sources into the transcription and indexing task for handwritten documents: the use of context derived from the analysis of the documents, how multimodality can aid the recognition process to obtain more accurate transcriptions (including transcription in a modern version of the language), and integration into a userin-the-loop assisted text transcription framework. This will be reflected in the construction of a transcription and indexing platform that can be used by both professional and nonprofessional users, contributing to crowd-sourcing activities to preserve cultural heritage and to obtain an accessible version of the involved corpus.
|
|
|
Fernando Vilariño, Dimosthenis Karatzas, & Alberto Valcarce. (2018). The Library Living Lab Barcelona: A participative approach to technology as an enabling factor for innovation in cultural spaces. Technology Innovation Management Review.
|
|
|
Fernando Vilariño, Dimosthenis Karatzas, & Alberto Valcarce. (2018). Libraries as New Innovation Hubs: The Library Living Lab. In 30th ISPIM Innovation Conference.
Abstract: Libraries are in deep transformation both in EU and around the world, and they are thriving within a great window of opportunity for innovation. In this paper, we show how the Library Living Lab in Barcelona participated of this changing scenario and contributed to create the Bibliolab program, where more than 200 public libraries give voice to their users in a global user-centric innovation initiative, using technology as enabling factor. The Library Living Lab is a real 4-helix implementation where Universities, Research Centers, Public Administration, Companies and the Neighbors are joint together to explore how technology transforms the cultural experience of people. This case is an example of scalability and provides reference tools for policy making, sustainability, user engage methodologies and governance. We provide specific examples of new prototypes and services that help to understand how to redefine the role of the Library as a real hub for social innovation.
|
|