Records
Author David Fernandez; Pau Riba; Alicia Fornes; Josep Llados
Title On the Influence of Key Point Encoding for Handwritten Word Spotting Type Conference Article
Year 2014 Publication 14th International Conference on Frontiers in Handwriting Recognition Abbreviated Journal
Volume Issue Pages 476 - 481
Keywords Local descriptors; Interest points; Handwritten documents; Word spotting; Historical document analysis
Abstract In this paper we evaluate the influence of the selection of key points and the associated features on the performance of word spotting processes. In general, features can be extracted from a number of characteristic points like corners, contours, skeletons, maxima, minima, crossings, etc. A number of descriptors exist in the literature using different interest point detectors. However, the intrinsic variability of handwriting strongly affects performance if the interest points are not stable enough. In this paper, we analyze the performance of different descriptors for local interest points. As a benchmarking dataset we have used the Barcelona Marriage Database, which contains handwritten records of marriages over five centuries.
Address Crete Island; Greece; September 2014
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 2167-6445 ISBN 978-1-4799-4335-7 Medium
Area Expedition Conference ICFHR
Notes DAG; 600.056; 600.061; 602.006; 600.077 Approved no
Call Number Admin @ si @ FRF2014 Serial 2460
 

 
Author Andreas Fischer; Ching Y. Suen; Volkmar Frinken; Kaspar Riesen; Horst Bunke
Title A Fast Matching Algorithm for Graph-Based Handwriting Recognition Type Conference Article
Year 2013 Publication 9th IAPR – TC15 Workshop on Graph-based Representation in Pattern Recognition Abbreviated Journal
Volume 7877 Issue Pages 194-203
Keywords
Abstract The recognition of unconstrained handwriting images is usually based on vectorial representation and statistical classification. Despite their high representational power, graphs are rarely used in this field due to a lack of efficient graph-based recognition methods. Recently, graph similarity features have been proposed to bridge the gap between structural representation and statistical classification by means of vector space embedding. This approach has shown a high performance in terms of accuracy but had shortcomings in terms of computational speed. The time complexity of the Hungarian algorithm that is used to approximate the edit distance between two handwriting graphs is demanding for a real-world scenario. In this paper, we propose a faster graph matching algorithm which is derived from the Hausdorff distance. On the historical Parzival database it is demonstrated that the proposed method achieves a speedup factor of 12.9 without significant loss in recognition accuracy.
Address Vienna; Austria; May 2013
Corporate Author Thesis
Publisher Springer Berlin Heidelberg Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title LNCS
Series Volume Series Issue Edition
ISSN 0302-9743 ISBN 978-3-642-38220-8 Medium
Area Expedition Conference GBR
Notes DAG; 600.045; 605.203 Approved no
Call Number Admin @ si @ FSF2013 Serial 2294
 

 
Author Adam Fodor; Rachid R. Saboundji; Julio C. S. Jacques Junior; Sergio Escalera; David Gallardo Pujol; Andras Lorincz
Title Multimodal Sentiment and Personality Perception Under Speech: A Comparison of Transformer-based Architectures Type Conference Article
Year 2022 Publication Understanding Social Behavior in Dyadic and Small Group Interactions Abbreviated Journal
Volume 173 Issue Pages 218-241
Keywords
Abstract Human-machine, human-robot interaction, and collaboration appear in diverse fields, from homecare to Cyber-Physical Systems. Technological development is fast, whereas real-time methods for social communication analysis that can measure small changes in sentiment and personality states, including visual, acoustic and language modalities are lagging, particularly when the goal is to build robust, appearance invariant, and fair methods. We study and compare methods capable of fusing modalities while satisfying real-time and invariant appearance conditions. We compare state-of-the-art transformer architectures in sentiment estimation and introduce them in the much less explored field of personality perception. We show that the architectures perform differently on automatic sentiment and personality perception, suggesting that each task may be better captured/modeled by a particular method. Our work calls attention to the attractive properties of the linear versions of the transformer architectures. In particular, we show that the best results are achieved by fusing the different architectures' preprocessing methods. However, they pose extreme conditions in computation power and energy consumption for real-time computations for quadratic transformers due to their memory requirements. In turn, linear transformers pave the way for quantifying small changes in sentiment estimation and personality perception for real-time social communications for machines and robots.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference PMLR
Notes HuPBA; not mentioned Approved no
Call Number Admin @ si @ FSJ2022 Serial 3769
 

 
Author Mireia Forns-Nadal; Federico Sem; Anna Mane; Laura Igual; Dani Guinart; Oscar Vilarroya
Title Increased Nucleus Accumbens Volume in First-Episode Psychosis Type Journal Article
Year 2017 Publication Psychiatry Research-Neuroimaging Abbreviated Journal PRN
Volume 263 Issue Pages 57-60
Keywords
Abstract Nucleus accumbens has been reported as a key structure in the neurobiology of schizophrenia. Studies analyzing structural abnormalities have shown conflicting results, possibly related to confounding factors. We investigated the nucleus accumbens volume using manual delimitation in first-episode psychosis (FEP) controlling for age, cannabis use and medication. Thirty-one FEP subjects who were naive or minimally exposed to antipsychotics and a control group were MRI scanned and clinically assessed from baseline to 6 months of follow-up. FEP showed increased relative and total accumbens volumes. Clinical correlations with negative symptoms, duration of untreated psychosis and cannabis use were not significant.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes MILAB; not mentioned Approved no
Call Number Admin @ si @ FSM2017 Serial 3028
 

 
Author Miquel Ferrer; F. Serratosa; A. Sanfeliu
Title Synthesis of median spectral graph Type Book Chapter
Year 2005 Publication Pattern Recognition and Image Analysis (IbPRIA'05), LNCS, 3523: 139-146 Abbreviated Journal
Volume Issue Pages
Keywords
Abstract
Address Estoril (Portugal)
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes Approved no
Call Number Admin @ si @ FSS2005 Serial 656
 

 
Author Alex Falcon; Swathikiran Sudhakaran; Giuseppe Serra; Sergio Escalera; Oswald Lanz
Title Relevance-based Margin for Contrastively-trained Video Retrieval Models Type Conference Article
Year 2022 Publication ICMR '22: Proceedings of the 2022 International Conference on Multimedia Retrieval Abbreviated Journal
Volume Issue Pages 146-157
Keywords
Abstract Video retrieval using natural language queries has attracted increasing interest due to its relevance in real-world applications, from intelligent access in private media galleries to web-scale video search. Learning the cross-similarity of video and text in a joint embedding space is the dominant approach. To do so, a contrastive loss is usually employed because it organizes the embedding space by putting similar items close and dissimilar items far. This framework leads to competitive recall rates, as they solely focus on the rank of the groundtruth items. Yet, assessing the quality of the ranking list is of utmost importance when considering intelligent retrieval systems, since multiple items may share similar semantics, hence a high relevance. Moreover, the aforementioned framework uses a fixed margin to separate similar and dissimilar items, treating all non-groundtruth items as equally irrelevant. In this paper we propose to use a variable margin: we argue that varying the margin used during training based on how relevant an item is to a given query, i.e. a relevance-based margin, easily improves the quality of the ranking lists measured through nDCG and mAP. We demonstrate the advantages of our technique using different models on EPIC-Kitchens-100 and YouCook2. We show that even if we carefully tuned the fixed margin, our technique (which does not have the margin as a hyper-parameter) would still achieve better performance. Finally, extensive ablation studies and qualitative analysis support the robustness of our approach. Code will be released at https://github.com/aranciokov/RelevanceMargin-ICMR22.
Address Newark, NJ, USA, 27 June 2022
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference ICMR
Notes HuPBA; not mentioned Approved no
Call Number Admin @ si @ FSS2022 Serial 3808
 

 
Author Arturo Fuentes; F. Javier Sanchez; Thomas Voncina; Jorge Bernal
Title LAMV: Learning to Predict Where Spectators Look in Live Music Performances Type Conference Article
Year 2021 Publication 16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications Abbreviated Journal
Volume 5 Issue Pages 500-507
Keywords
Abstract The advent of artificial intelligence has changed how different daily work tasks are performed. The analysis of cultural content has seen a huge boost from the development of computer-assisted methods that allow easy and transparent data access. In our case, we deal with the automation of the production of live shows, like music concerts, aiming to develop a system that can indicate to the producer which camera to show based on what each of them is showing. In this context, we consider it essential to understand where spectators look and what they are interested in so the computational method can learn from this information. The work that we present here shows the results of a first preliminary study in which we compare areas of interest defined by human beings and those indicated by an automatic system. Our system is based on the extraction of motion textures from dynamic Spatio-Temporal Volumes (STV) and then analyzing the patterns by means of texture analysis techniques. We validate our approach over several video sequences that have been labeled by 16 different experts. Our method is able to match those relevant areas identified by the experts, achieving recall scores higher than 80% when a distance of 80 pixels between method and ground truth is considered. Current performance shows promise when detecting abnormal peaks and movement trends.
Address Virtual; February 2021
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference VISIGRAPP
Notes MV; ISE; 600.119; Approved no
Call Number Admin @ si @ FSV2021 Serial 3570
 

 
Author Zhijie Fang; David Vazquez; Antonio Lopez
Title On-Board Detection of Pedestrian Intentions Type Journal Article
Year 2017 Publication Sensors Abbreviated Journal SENS
Volume 17 Issue 10 Pages 2193
Keywords pedestrian intention; ADAS; self-driving
Abstract Avoiding vehicle-to-pedestrian crashes is a critical requirement for today's advanced driver assistance systems (ADAS) and future self-driving vehicles. Accordingly, detecting pedestrians from raw sensor data has a history of more than 15 years of research, with vision playing a central role. In recent years, deep learning has boosted the accuracy of image-based pedestrian detectors. However, detection is just the first step towards answering the core question, namely: is the vehicle going to crash with a pedestrian provided preventive actions are not taken? Therefore, knowing as soon as possible if a detected pedestrian has the intention of crossing the road ahead of the vehicle is essential for performing safe and comfortable maneuvers that prevent a crash. However, compared to pedestrian detection, there is relatively little literature on detecting pedestrian intentions. This paper aims to contribute along this line by presenting a new vision-based approach which analyzes the pose of a pedestrian along several frames to determine if he or she is going to enter the road or not. We present experiments showing 750 ms of anticipation for pedestrians crossing the road, which at a typical urban driving speed of 50 km/h can provide 15 additional meters (compared to a pure pedestrian detector) for vehicle automatic reactions or to warn the driver. Moreover, in contrast with state-of-the-art methods, our approach is monocular, neither requiring stereo nor optical flow information.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes ADAS; 600.085; 600.076; 601.223; 600.116; 600.118 Approved no
Call Number Admin @ si @ FVL2017 Serial 2983
 

 
Author Graham D. Finlayson; Javier Vazquez; Sabine Süsstrunk; Maria Vanrell
Title Spectral sharpening by spherical sampling Type Journal Article
Year 2012 Publication Journal of the Optical Society of America A Abbreviated Journal JOSA A
Volume 29 Issue 7 Pages 1199-1210
Keywords
Abstract There are many works in color that assume illumination change can be modeled by multiplying sensor responses by individual scaling factors. The early research in this area is sometimes grouped under the heading “von Kries adaptation”: the scaling factors are applied to the cone responses. In more recent studies, both in psychophysics and in computational analysis, it has been proposed that scaling factors should be applied to linear combinations of the cones that have narrower support: they should be applied to the so-called “sharp sensors.” In this paper, we generalize the computational approach to spectral sharpening in three important ways. First, we introduce spherical sampling as a tool that allows us to enumerate in a principled way all linear combinations of the cones. This allows us to, second, find the optimal sharp sensors that minimize a variety of error measures including CIE Delta E (previous work on spectral sharpening minimized RMS) and color ratio stability. Lastly, we extend the spherical sampling paradigm to the multispectral case. Here the objective is to model the interaction of light and surface in terms of color signal spectra. Spherical sampling is shown to improve on the state of the art.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1084-7529 ISBN Medium
Area Expedition Conference
Notes CIC Approved no
Call Number Admin @ si @ FVS2012 Serial 2000
 

 
Author Onur Ferhat; Fernando Vilariño; F. Javier Sanchez
Title A cheap portable eye-tracker solution for common setups. Type Journal Article
Year 2014 Publication Journal of Eye Movement Research Abbreviated Journal JEMR
Volume 7 Issue 3 Pages 1-10
Keywords
Abstract We analyze the feasibility of a cheap eye-tracker where the hardware consists of a single webcam and a Raspberry Pi device. Our aim is to discover the limits of such a system and to see whether it provides an acceptable performance. We base our work on the open source Opengazer (Zielinski, 2013) and we propose several improvements to create a robust, real-time system which can work on a computer with 30Hz sampling rate. After assessing the accuracy of our eye-tracker in elaborated experiments involving 12 subjects under 4 different system setups, we install it on a Raspberry Pi to create a portable stand-alone eye-tracker which achieves 1.42° horizontal accuracy with 3Hz refresh rate for a building cost of 70 Euros.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes SIAI Approved no
Call Number Admin @ si @ FVS2014 Serial 2435
 

 
Author ChuanMing Fang; Kai Wang; Joost Van de Weijer
Title IterInv: Iterative Inversion for Pixel-Level T2I Models Type Conference Article
Year 2023 Publication 37th Annual Conference on Neural Information Processing Systems Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Large-scale text-to-image diffusion models have been a ground-breaking development in generating convincing images following an input text prompt. The goal of image editing research is to give users control over the generated images by modifying the text prompt. Current image editing techniques rely on DDIM inversion as a common practice based on the Latent Diffusion Models (LDM). However, large pretrained T2I models that work in the latent space, like LDM, lose details due to the first compression stage with an autoencoder mechanism. In contrast, another mainstream T2I pipeline working on the pixel level, such as Imagen and DeepFloyd-IF, avoids this problem. Such pipelines are commonly composed of several stages, normally with a text-to-image stage followed by several super-resolution stages. In this case, the DDIM inversion is unable to find the initial noise to generate the original image given that the super-resolution diffusion models are not compatible with the DDIM technique. According to our experimental findings, iteratively concatenating the noisy image as the condition is the root of this problem. Based on this observation, we develop an iterative inversion (IterInv) technique for this stream of T2I models and verify IterInv with the open-source DeepFloyd-IF model. By combining our method IterInv with a popular image editing method, we demonstrate the application prospects of IterInv. The code will be released at \url{this https URL}.
Address New Orleans; USA; December 2023
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference NEURIPS
Notes LAMP Approved no
Call Number Admin @ si @ FWW2023 Serial 3936
 

 
Author Volkmar Frinken; Francisco Zamora; Salvador España; Maria Jose Castro; Andreas Fischer; Horst Bunke
Title Long-Short Term Memory Neural Networks Language Modeling for Handwriting Recognition Type Conference Article
Year 2012 Publication 21st International Conference on Pattern Recognition Abbreviated Journal
Volume Issue Pages 701-704
Keywords
Abstract Unconstrained handwritten text recognition systems maximize the combination of two separate probability scores. The first one is the observation probability that indicates how well the returned word sequence matches the input image. The second score is the probability that reflects how likely a word sequence is according to a language model. Current state-of-the-art recognition systems use statistical language models in the form of bigram word probabilities. This paper proposes to model the target language by means of a recurrent neural network with long-short term memory cells. Because the network is recurrent, the considered context is not limited to a fixed size, especially as the memory cells are designed to deal with long-term dependencies. In a set of experiments conducted on the IAM off-line database we show the superiority of the proposed language model over statistical n-gram models.
Address Tsukuba Science City, Japan
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN 1051-4651 ISBN 978-1-4673-2216-4 Medium
Area Expedition Conference ICPR
Notes DAG Approved no
Call Number Admin @ si @ FZE2012 Serial 2052
 

 
Author Adrian Galdran; Aitor Alvarez-Gila; Alessandro Bria; Javier Vazquez; Marcelo Bertalmio
Title On the Duality Between Retinex and Image Dehazing Type Conference Article
Year 2018 Publication 31st IEEE Conference on Computer Vision and Pattern Recognition Abbreviated Journal
Volume Issue Pages 8212–8221
Keywords Image color analysis; Task analysis; Atmospheric modeling; Computer vision; Computational modeling; Lighting
Abstract Image dehazing deals with the removal of undesired loss of visibility in outdoor images due to the presence of fog. Retinex is a color vision model mimicking the ability of the Human Visual System to robustly discount varying illuminations when observing a scene under different spectral lighting conditions. Retinex has been widely explored in the computer vision literature for image enhancement and other related tasks. While these two problems are apparently unrelated, the goal of this work is to show that they can be connected by a simple linear relationship. Specifically, most Retinex-based algorithms have the characteristic feature of always increasing image brightness, which turns them into ideal candidates for effective image dehazing by directly applying Retinex to a hazy image whose intensities have been inverted. In this paper, we give theoretical proof that Retinex on inverted intensities is a solution to the image dehazing problem. Comprehensive qualitative and quantitative results indicate that several classical and modern implementations of Retinex can be transformed into competing image dehazing algorithms performing on par with more complex fog removal methods, and can overcome some of the main challenges associated with this problem.
Address Salt Lake City; USA; June 2018
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference CVPR
Notes LAMP; 600.120 Approved no
Call Number Admin @ si @ GAB2018 Serial 3146
 

 
Author Debora Gil; Ruth Aris; Agnes Borras; Esmitt Ramirez; Rafael Sebastian; Mariano Vazquez
Title Influence of fiber connectivity in simulations of cardiac biomechanics Type Journal Article
Year 2019 Publication International Journal of Computer Assisted Radiology and Surgery Abbreviated Journal IJCAR
Volume 14 Issue 1 Pages 63–72
Keywords Cardiac electromechanical simulations; Diffusion tensor imaging; Fiber connectivity
Abstract PURPOSE:
Personalized computational simulations of the heart could open up new improved approaches to diagnosis and surgery assistance systems. While it is fully recognized that myocardial fiber orientation is central for the construction of realistic computational models of cardiac electromechanics, the role of its overall architecture and connectivity remains unclear. Morphological studies show that the distribution of cardiac muscular fibers at the basal ring connects epicardium and endocardium. However, computational models simplify their distribution and disregard the basal loop. This work explores the influence in computational simulations of fiber distribution at different short-axis cuts.

METHODS:
We have used a highly parallelized computational solver to test different fiber models of ventricular muscular connectivity. We have considered two rule-based mathematical models and a custom-designed method preserving basal connectivity as observed in experimental data. Simulated cardiac functional scores (rotation, torsion and longitudinal shortening) were compared to experimental healthy ranges using generalized models (rotation) and Mahalanobis distances (shortening, torsion).

RESULTS:
The probability of rotation was significantly lower for rule-based models [95% CI (0.13, 0.20)] in comparison with experimental data [95% CI (0.23, 0.31)]. The Mahalanobis distance for experimental data was on the edge of the region enclosing 99% of the healthy population.

CONCLUSIONS:
Cardiac electromechanical simulations of the heart with fibers extracted from experimental data produce functional scores closer to healthy ranges than rule-based models disregarding architecture connectivity.
Address
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference
Notes IAM; 600.096; 601.323; 600.139; 600.145 Approved no
Call Number Admin @ si @ GAB2019a Serial 3133
 

 
Author Bojana Gajic; Ariel Amato; Ramon Baldrich; Carlo Gatta
Title Bag of Negatives for Siamese Architectures Type Conference Article
Year 2019 Publication 30th British Machine Vision Conference Abbreviated Journal
Volume Issue Pages
Keywords
Abstract Training a Siamese architecture for re-identification with a large number of identities is a challenging task due to the difficulty of finding relevant negative samples efficiently. In this work we present Bag of Negatives (BoN), a method for accelerated and improved training of Siamese networks that scales well on datasets with a very large number of identities. BoN is an efficient and loss-independent method, able to select a bag of high quality negatives, based on a novel online hashing strategy.
Address Cardiff; United Kingdom; September 2019
Corporate Author Thesis
Publisher Place of Publication Editor
Language Summary Language Original Title
Series Editor Series Title Abbreviated Series Title
Series Volume Series Issue Edition
ISSN ISBN Medium
Area Expedition Conference BMVC
Notes CIC; 600.140; 600.118 Approved no
Call Number Admin @ si @ GAB2019b Serial 3263