Publicacions CVC -- Query Results

[51–60] << 61 62 63 64 65 66 67 68 69 70 >> [71–80]

Details

Records
Author	David Fernandez; Pau Riba; Alicia Fornes; Josep Llados
Title	On the Influence of Key Point Encoding for Handwritten Word Spotting			Type	Conference Article
Year	2014	Publication	14th International Conference on Frontiers in Handwriting Recognition	Abbreviated Journal
Volume		Issue		Pages	476 - 481
Keywords	Local descriptors; Interest points; Handwritten documents; Word spotting; Historical document analysis
Abstract	In this paper we evaluate the influence of the selection of key points and the associated features in the performance of word spotting processes. In general, features can be extracted from a number of characteristic points like corners, contours, skeletons, maxima, minima, crossings, etc. A number of descriptors exist in the literature using different interest point detectors. But the intrinsic variability of handwriting vary strongly on the performance if the interest points are not stable enough. In this paper, we analyze the performance of different descriptors for local interest points. As benchmarking dataset we have used the Barcelona Marriage Database that contains handwritten records of marriages over five centuries.
Address	Creete Island; Grecia; September 2014
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	2167-6445	ISBN	978-1-4799-4335-7	Medium
Area		Expedition		Conference	ICFHR
Notes	DAG; 600.056; 600.061; 602.006; 600.077			Approved	no
Call Number	Admin @ si @ FRF2014			Serial	2460
Permanent link to this record



Author	Andreas Fischer; Ching Y. Suen; Volkmar Frinken; Kaspar Riesen; Horst Bunke
Title	A Fast Matching Algorithm for Graph-Based Handwriting Recognition			Type	Conference Article
Year	2013	Publication	9th IAPR – TC15 Workshop on Graph-based Representation in Pattern Recognition	Abbreviated Journal
Volume	7877	Issue		Pages	194-203
Keywords
Abstract	The recognition of unconstrained handwriting images is usually based on vectorial representation and statistical classification. Despite their high representational power, graphs are rarely used in this field due to a lack of efficient graph-based recognition methods. Recently, graph similarity features have been proposed to bridge the gap between structural representation and statistical classification by means of vector space embedding. This approach has shown a high performance in terms of accuracy but had shortcomings in terms of computational speed. The time complexity of the Hungarian algorithm that is used to approximate the edit distance between two handwriting graphs is demanding for a real-world scenario. In this paper, we propose a faster graph matching algorithm which is derived from the Hausdorff distance. On the historical Parzival database it is demonstrated that the proposed method achieves a speedup factor of 12.9 without significant loss in recognition accuracy.
Address	Vienna; Austria; May 2013
Corporate Author				Thesis
Publisher	Springer Berlin Heidelberg	Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title	LNCS
Series Volume		Series Issue		Edition
ISSN	0302-9743	ISBN	978-3-642-38220-8	Medium
Area		Expedition		Conference	GBR
Notes	DAG; 600.045; 605.203			Approved	no
Call Number	Admin @ si @ FSF2013			Serial	2294
Permanent link to this record



Author	Adam Fodor; Rachid R. Saboundji; Julio C. S. Jacques Junior; Sergio Escalera; David Gallardo Pujol; Andras Lorincz
Title	Multimodal Sentiment and Personality Perception Under Speech: A Comparison of Transformer-based Architectures			Type	Conference Article
Year	2022	Publication	Understanding Social Behavior in Dyadic and Small Group Interactions	Abbreviated Journal
Volume	173	Issue		Pages	218-241
Keywords
Abstract	Human-machine, human-robot interaction, and collaboration appear in diverse fields, from homecare to Cyber-Physical Systems. Technological development is fast, whereas real-time methods for social communication analysis that can measure small changes in sentiment and personality states, including visual, acoustic and language modalities are lagging, particularly when the goal is to build robust, appearance invariant, and fair methods. We study and compare methods capable of fusing modalities while satisfying real-time and invariant appearance conditions. We compare state-of-the-art transformer architectures in sentiment estimation and introduce them in the much less explored field of personality perception. We show that the architectures perform differently on automatic sentiment and personality perception, suggesting that each task may be better captured/modeled by a particular method. Our work calls attention to the attractive properties of the linear versions of the transformer architectures. In particular, we show that the best results are achieved by fusing the different architectures{’} preprocessing methods. However, they pose extreme conditions in computation power and energy consumption for real-time computations for quadratic transformers due to their memory requirements. In turn, linear transformers pave the way for quantifying small changes in sentiment estimation and personality perception for real-time social communications for machines and robots.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	PMLR
Notes	HuPBA; no menciona			Approved	no
Call Number	Admin @ si @ FSJ2022			Serial	3769
Permanent link to this record



Author	Mireia Forns-Nadal; Federico Sem; Anna Mane; Laura Igual; Dani Guinart; Oscar Vilarroya
Title	Increased Nucleus Accumbens Volume in First-Episode Psychosis			Type	Journal Article
Year	2017	Publication	Psychiatry Research-Neuroimaging	Abbreviated Journal	PRN
Volume	263	Issue		Pages	57-60
Keywords
Abstract	Nucleus accumbens has been reported as a key structure in the neurobiology of schizophrenia. Studies analyzing structural abnormalities have shown conflicting results, possibly related to confounding factors. We investigated the nucleus accumbens volume using manual delimitation in first-episode psychosis (FEP) controlling for age, cannabis use and medication. Thirty-one FEP subjects who were naive or minimally exposed to antipsychotics and a control group were MRI scanned and clinically assessed from baseline to 6 months of follow-up. FEP showed increased relative and total accumbens volumes. Clinical correlations with negative symptoms, duration of untreated psychosis and cannabis use were not significant.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	MILAB; no menciona			Approved	no
Call Number	Admin @ si @ FSM2017			Serial	3028
Permanent link to this record



Author	Miquel Ferrer; F. Serratosa; A. Sanfeliu
Title	Synthesis of median spectral graph			Type	Book Chapter
Year	2005	Publication	Pattern Recognition and Image Analysis (IbPRIA´05), LNCS, 3523: 139 146	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract
Address	Estoril (Portugal)
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes				Approved	no
Call Number	Admin @ si @ FSS2005			Serial	656
Permanent link to this record



Author	Alex Falcon; Swathikiran Sudhakaran; Giuseppe Serra; Sergio Escalera; Oswald Lanz
Title	Relevance-based Margin for Contrastively-trained Video Retrieval Models			Type	Conference Article
Year	2022	Publication	ICMR '22: Proceedings of the 2022 International Conference on Multimedia Retrieval	Abbreviated Journal
Volume		Issue		Pages	146-157
Keywords
Abstract	Video retrieval using natural language queries has attracted increasing interest due to its relevance in real-world applications, from intelligent access in private media galleries to web-scale video search. Learning the cross-similarity of video and text in a joint embedding space is the dominant approach. To do so, a contrastive loss is usually employed because it organizes the embedding space by putting similar items close and dissimilar items far. This framework leads to competitive recall rates, as they solely focus on the rank of the groundtruth items. Yet, assessing the quality of the ranking list is of utmost importance when considering intelligent retrieval systems, since multiple items may share similar semantics, hence a high relevance. Moreover, the aforementioned framework uses a fixed margin to separate similar and dissimilar items, treating all non-groundtruth items as equally irrelevant. In this paper we propose to use a variable margin: we argue that varying the margin used during training based on how much relevant an item is to a given query, i.e. a relevance-based margin, easily improves the quality of the ranking lists measured through nDCG and mAP. We demonstrate the advantages of our technique using different models on EPIC-Kitchens-100 and YouCook2. We show that even if we carefully tuned the fixed margin, our technique (which does not have the margin as a hyper-parameter) would still achieve better performance. Finally, extensive ablation studies and qualitative analysis support the robustness of our approach. Code will be released at \urlhttps://github.com/aranciokov/RelevanceMargin-ICMR22.
Address	Newwark, NJ, USA, 27 June 2022
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	ICMR
Notes	HuPBA; no menciona			Approved	no
Call Number	Admin @ si @ FSS2022			Serial	3808
Permanent link to this record



Author	Arturo Fuentes; F. Javier Sanchez; Thomas Voncina; Jorge Bernal
Title	LAMV: Learning to Predict Where Spectators Look in Live Music Performances			Type	Conference Article
Year	2021	Publication	16th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications	Abbreviated Journal
Volume	5	Issue		Pages	500-507
Keywords
Abstract	The advent of artificial intelligence has supposed an evolution on how different daily work tasks are performed. The analysis of cultural content has seen a huge boost by the development of computer-assisted methods that allows easy and transparent data access. In our case, we deal with the automation of the production of live shows, like music concerts, aiming to develop a system that can indicate the producer which camera to show based on what each of them is showing. In this context, we consider that is essential to understand where spectators look and what they are interested in so the computational method can learn from this information. The work that we present here shows the results of a first preliminary study in which we compare areas of interest defined by human beings and those indicated by an automatic system. Our system is based on the extraction of motion textures from dynamic Spatio-Temporal Volumes (STV) and then analyzing the patterns by means of texture analysis techniques. We validate our approach over several video sequences that have been labeled by 16 different experts. Our method is able to match those relevant areas identified by the experts, achieving recall scores higher than 80% when a distance of 80 pixels between method and ground truth is considered. Current performance shows promise when detecting abnormal peaks and movement trends.
Address	Virtual; February 2021
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	VISIGRAPP
Notes	MV; ISE; 600.119;			Approved	no
Call Number	Admin @ si @ FSV2021			Serial	3570
Permanent link to this record



Author	Zhijie Fang; David Vazquez; Antonio Lopez
Title	On-Board Detection of Pedestrian Intentions			Type	Journal Article
Year	2017	Publication	Sensors	Abbreviated Journal	SENS
Volume	17	Issue	10	Pages	2193
Keywords	pedestrian intention; ADAS; self-driving
Abstract	Avoiding vehicle-to-pedestrian crashes is a critical requirement for nowadays advanced driver assistant systems (ADAS) and future self-driving vehicles. Accordingly, detecting pedestrians from raw sensor data has a history of more than 15 years of research, with vision playing a central role. During the last years, deep learning has boosted the accuracy of image-based pedestrian detectors. However, detection is just the first step towards answering the core question, namely is the vehicle going to crash with a pedestrian provided preventive actions are not taken? Therefore, knowing as soon as possible if a detected pedestrian has the intention of crossing the road ahead of the vehicle is essential for performing safe and comfortable maneuvers that prevent a crash. However, compared to pedestrian detection, there is relatively little literature on detecting pedestrian intentions. This paper aims to contribute along this line by presenting a new vision-based approach which analyzes the pose of a pedestrian along several frames to determine if he or she is going to enter the road or not. We present experiments showing 750 ms of anticipation for pedestrians crossing the road, which at a typical urban driving speed of 50 km/h can provide 15 additional meters (compared to a pure pedestrian detector) for vehicle automatic reactions or to warn the driver. Moreover, in contrast with state-of-the-art methods, our approach is monocular, neither requiring stereo nor optical flow information.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	ADAS; 600.085; 600.076; 601.223; 600.116; 600.118			Approved	no
Call Number	Admin @ si @ FVL2017			Serial	2983
Permanent link to this record



Author	Graham D. Finlayson; Javier Vazquez; Sabine Süsstrunk; Maria Vanrell
Title	Spectral sharpening by spherical sampling			Type	Journal Article
Year	2012	Publication	Journal of the Optical Society of America A	Abbreviated Journal	JOSA A
Volume	29	Issue	7	Pages	1199-1210
Keywords
Abstract	There are many works in color that assume illumination change can be modeled by multiplying sensor responses by individual scaling factors. The early research in this area is sometimes grouped under the heading “von Kries adaptation”: the scaling factors are applied to the cone responses. In more recent studies, both in psychophysics and in computational analysis, it has been proposed that scaling factors should be applied to linear combinations of the cones that have narrower support: they should be applied to the so-called “sharp sensors.” In this paper, we generalize the computational approach to spectral sharpening in three important ways. First, we introduce spherical sampling as a tool that allows us to enumerate in a principled way all linear combinations of the cones. This allows us to, second, find the optimal sharp sensors that minimize a variety of error measures including CIE Delta E (previous work on spectral sharpening minimized RMS) and color ratio stability. Lastly, we extend the spherical sampling paradigm to the multispectral case. Here the objective is to model the interaction of light and surface in terms of color signal spectra. Spherical sampling is shown to improve on the state of the art.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1084-7529	ISBN		Medium
Area		Expedition		Conference
Notes	CIC			Approved	no
Call Number	Admin @ si @ FVS2012			Serial	2000
Permanent link to this record



Author	Onur Ferhat; Fernando Vilariño; F. Javier Sanchez
Title	A cheap portable eye-tracker solution for common setups.			Type	Journal Article
Year	2014	Publication	Journal of Eye Movement Research	Abbreviated Journal	JEMR
Volume	7	Issue	3	Pages	1-10
Keywords
Abstract	We analyze the feasibility of a cheap eye-tracker where the hardware consists of a single webcam and a Raspberry Pi device. Our aim is to discover the limits of such a system and to see whether it provides an acceptable performance. We base our work on the open source Opengazer (Zielinski, 2013) and we propose several improvements to create a robust, real-time system which can work on a computer with 30Hz sampling rate. After assessing the accuracy of our eye-tracker in elaborated experiments involving 12 subjects under 4 different system setups, we install it on a Raspberry Pi to create a portable stand-alone eye-tracker which achieves 1.42° horizontal accuracy with 3Hz refresh rate for a building cost of 70 Euros.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	;SIAI			Approved	no
Call Number	Admin @ si @ FVS2014			Serial	2435
Permanent link to this record



Author	ChuanMing Fang; Kai Wang; Joost Van de Weijer
Title	IterInv: Iterative Inversion for Pixel-Level T2I Models			Type	Conference Article
Year	2023	Publication	37th Annual Conference on Neural Information Processing Systems	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Large-scale text-to-image diffusion models have been a ground-breaking development in generating convincing images following an input text prompt. The goal of image editing research is to give users control over the generated images by modifying the text prompt. Current image editing techniques are relying on DDIM inversion as a common practice based on the Latent Diffusion Models (LDM). However, the large pretrained T2I models working on the latent space as LDM suffer from losing details due to the first compression stage with an autoencoder mechanism. Instead, another mainstream T2I pipeline working on the pixel level, such as Imagen and DeepFloyd-IF, avoids this problem. They are commonly composed of several stages, normally with a text-to-image stage followed by several super-resolution stages. In this case, the DDIM inversion is unable to find the initial noise to generate the original image given that the super-resolution diffusion models are not compatible with the DDIM technique. According to our experimental findings, iteratively concatenating the noisy image as the condition is the root of this problem. Based on this observation, we develop an iterative inversion (IterInv) technique for this stream of T2I models and verify IterInv with the open-source DeepFloyd-IF model. By combining our method IterInv with a popular image editing method, we prove the application prospects of IterInv. The code will be released at \url{this https URL}.
Address	New Orleans; USA; December 2023
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	NEURIPS
Notes	LAMP			Approved	no
Call Number	Admin @ si @ FWW2023			Serial	3936
Permanent link to this record



Author	Volkmar Frinken; Francisco Zamora; Salvador España; Maria Jose Castro; Andreas Fischer; Horst Bunke
Title	Long-Short Term Memory Neural Networks Language Modeling for Handwriting Recognition			Type	Conference Article
Year	2012	Publication	21st International Conference on Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages	701-704
Keywords
Abstract	Unconstrained handwritten text recognition systems maximize the combination of two separate probability scores. The first one is the observation probability that indicates how well the returned word sequence matches the input image. The second score is the probability that reflects how likely a word sequence is according to a language model. Current state-of-the-art recognition systems use statistical language models in form of bigram word probabilities. This paper proposes to model the target language by means of a recurrent neural network with long-short term memory cells. Because the network is recurrent, the considered context is not limited to a fixed size especially as the memory cells are designed to deal with long-term dependencies. In a set of experiments conducted on the IAM off-line database we show the superiority of the proposed language model over statistical n-gram models.
Address	Tsukuba Science City, Japan
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN	1051-4651	ISBN	978-1-4673-2216-4	Medium
Area		Expedition		Conference	ICPR
Notes	DAG			Approved	no
Call Number	Admin @ si @ FZE2012			Serial	2052
Permanent link to this record



Author	Adrian Galdran; Aitor Alvarez-Gila; Alessandro Bria; Javier Vazquez; Marcelo Bertalmio
Title	On the Duality Between Retinex and Image Dehazing			Type	Conference Article
Year	2018	Publication	31st IEEE Conference on Computer Vision and Pattern Recognition	Abbreviated Journal
Volume		Issue		Pages	8212–8221
Keywords	Image color analysis; Task analysis; Atmospheric modeling; Computer vision; Computational modeling; Lighting
Abstract	Image dehazing deals with the removal of undesired loss of visibility in outdoor images due to the presence of fog. Retinex is a color vision model mimicking the ability of the Human Visual System to robustly discount varying illuminations when observing a scene under different spectral lighting conditions. Retinex has been widely explored in the computer vision literature for image enhancement and other related tasks. While these two problems are apparently unrelated, the goal of this work is to show that they can be connected by a simple linear relationship. Specifically, most Retinex-based algorithms have the characteristic feature of always increasing image brightness, which turns them into ideal candidates for effective image dehazing by directly applying Retinex to a hazy image whose intensities have been inverted. In this paper, we give theoretical proof that Retinex on inverted intensities is a solution to the image dehazing problem. Comprehensive qualitative and quantitative results indicate that several classical and modern implementations of Retinex can be transformed into competing image dehazing algorithms performing on pair with more complex fog removal methods, and can overcome some of the main challenges associated with this problem.
Address	Salt Lake City; USA; June 2018
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	CVPR
Notes	LAMP; 600.120			Approved	no
Call Number	Admin @ si @ GAB2018			Serial	3146
Permanent link to this record



Author	Debora Gil; Ruth Aris; Agnes Borras; Esmitt Ramirez; Rafael Sebastian; Mariano Vazquez
Title	Influence of fiber connectivity in simulations of cardiac biomechanics			Type	Journal Article
Year	2019	Publication	International Journal of Computer Assisted Radiology and Surgery	Abbreviated Journal	IJCAR
Volume	14	Issue	1	Pages	63–72
Keywords	Cardiac electromechanical simulations; Diffusion tensor imaging; Fiber connectivity
Abstract	PURPOSE: Personalized computational simulations of the heart could open up new improved approaches to diagnosis and surgery assistance systems. While it is fully recognized that myocardial fiber orientation is central for the construction of realistic computational models of cardiac electromechanics, the role of its overall architecture and connectivity remains unclear. Morphological studies show that the distribution of cardiac muscular fibers at the basal ring connects epicardium and endocardium. However, computational models simplify their distribution and disregard the basal loop. This work explores the influence in computational simulations of fiber distribution at different short-axis cuts. METHODS: We have used a highly parallelized computational solver to test different fiber models of ventricular muscular connectivity. We have considered two rule-based mathematical models and an own-designed method preserving basal connectivity as observed in experimental data. Simulated cardiac functional scores (rotation, torsion and longitudinal shortening) were compared to experimental healthy ranges using generalized models (rotation) and Mahalanobis distances (shortening, torsion). RESULTS: The probability of rotation was significantly lower for ruled-based models [95% CI (0.13, 0.20)] in comparison with experimental data [95% CI (0.23, 0.31)]. The Mahalanobis distance for experimental data was in the edge of the region enclosing 99% of the healthy population. CONCLUSIONS: Cardiac electromechanical simulations of the heart with fibers extracted from experimental data produce functional scores closer to healthy ranges than rule-based models disregarding architecture connectivity.
Address
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference
Notes	IAM; 600.096; 601.323; 600.139; 600.145			Approved	no
Call Number	Admin @ si @ GAB2019a			Serial	3133
Permanent link to this record



Author	Bojana Gajic; Ariel Amato; Ramon Baldrich; Carlo Gatta
Title	Bag of Negatives for Siamese Architectures			Type	Conference Article
Year	2019	Publication	30th British Machine Vision Conference	Abbreviated Journal
Volume		Issue		Pages
Keywords
Abstract	Training a Siamese architecture for re-identification with a large number of identities is a challenging task due to the difficulty of finding relevant negative samples efficiently. In this work we present Bag of Negatives (BoN), a method for accelerated and improved training of Siamese networks that scales well on datasets with a very large number of identities. BoN is an efficient and loss-independent method, able to select a bag of high quality negatives, based on a novel online hashing strategy.
Address	Cardiff; United Kingdom; September 2019
Corporate Author				Thesis
Publisher		Place of Publication		Editor
Language		Summary Language		Original Title
Series Editor		Series Title		Abbreviated Series Title
Series Volume		Series Issue		Edition
ISSN		ISBN		Medium
Area		Expedition		Conference	BMVC
Notes	CIC; 600.140; 600.118			Approved	no
Call Number	Admin @ si @ GAB2019b			Serial	3263
Permanent link to this record