|
Records |
Links |
|
Author |
Danna Xue; Fei Yang; Pei Wang; Luis Herranz; Jinqiu Sun; Yu Zhu; Yanning Zhang |
|
|
Title |
SlimSeg: Slimmable Semantic Segmentation with Boundary Supervision |
Type |
Conference Article |
|
Year |
2022 |
Publication |
30th ACM International Conference on Multimedia |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
6539-6548 |
|
|
Keywords |
|
|
|
Abstract |
Accurate semantic segmentation models typically require significant computational resources, inhibiting their use in practical applications. Recent works rely on well-crafted lightweight models to achieve fast inference. However, these models cannot flexibly adapt to varying accuracy and efficiency requirements. In this paper, we propose a simple but effective slimmable semantic segmentation (SlimSeg) method, which can be executed at different capacities during inference depending on the desired accuracy-efficiency tradeoff. More specifically, we employ parametrized channel slimming by stepwise downward knowledge distillation during training. Motivated by the observation that the differences between segmentation results of each submodel are mainly near the semantic borders, we introduce an additional boundary guided semantic segmentation loss to further improve the performance of each submodel. We show that our proposed SlimSeg with various mainstream networks can produce flexible models that provide dynamic adjustment of computational cost and better performance than independent models. Extensive experiments on semantic segmentation benchmarks, Cityscapes and CamVid, demonstrate the generalization ability of our framework. |
|
|
Address |
Lisboa, Portugal, October 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Association for Computing Machinery |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-1-4503-9203-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
MM |
|
|
Notes |
MACO; 600.161; 601.400 |
Approved |
no |
|
|
Call Number |
Admin @ si @ XYW2022 |
Serial |
3758 |
|
Permanent link to this record |
|
|
|
|
Author |
Zhaocheng Liu; Luis Herranz; Fei Yang; Saiping Zhang; Shuai Wan; Marta Mrak; Marc Gorriz |
|
|
Title |
Slimmable Video Codec |
Type |
Conference Article |
|
Year |
2022 |
Publication |
CVPR 2022 Workshop and Challenge on Learned Image Compression (CLIC 2022, 5th Edition) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1742-1746 |
|
|
Keywords |
|
|
|
Abstract |
Neural video compression has emerged as a novel paradigm combining trainable multilayer neural net-works and machine learning, achieving competitive rate-distortion (RD) performances, but still remaining impractical due to heavy neural architectures, with large memory and computational demands. In addition, models are usually optimized for a single RD tradeoff. Recent slimmable image codecs can dynamically adjust their model capacity to gracefully reduce the memory and computation requirements, without harming RD performance. In this paper we propose a slimmable video codec (SlimVC), by integrating a slimmable temporal entropy model in a slimmable autoencoder. Despite a significantly more complex architecture, we show that slimming remains a powerful mechanism to control rate, memory footprint, computational cost and latency, all being important requirements for practical video compression. |
|
|
Address |
Virtual; 19 June 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPRW |
|
|
Notes |
MACO; 601.379; 601.161 |
Approved |
no |
|
|
Call Number |
Admin @ si @ LHY2022 |
Serial |
3687 |
|
Permanent link to this record |
|
|
|
|
Author |
Fei Yang; Luis Herranz; Yongmei Cheng; Mikhail Mozerov |
|
|
Title |
Slimmable compressive autoencoders for practical neural image compression |
Type |
Conference Article |
|
Year |
2021 |
Publication |
34th IEEE Conference on Computer Vision and Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
4996-5005 |
|
|
Keywords |
|
|
|
Abstract |
Neural image compression leverages deep neural networks to outperform traditional image codecs in rate-distortion performance. However, the resulting models are also heavy, computationally demanding and generally optimized for a single rate, limiting their practical use. Focusing on practical image compression, we propose slimmable compressive autoencoders (SlimCAEs), where rate (R) and distortion (D) are jointly optimized for different capacities. Once trained, encoders and decoders can be executed at different capacities, leading to different rates and complexities. We show that a successful implementation of SlimCAEs requires suitable capacity-specific RD tradeoffs. Our experiments show that SlimCAEs are highly flexible models that provide excellent rate-distortion performance, variable rate, and dynamic adjustment of memory, computational cost and latency, thus addressing the main requirements of practical image compression. |
|
|
Address |
Virtual; June 2021 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPR |
|
|
Notes |
LAMP; 600.120 |
Approved |
no |
|
|
Call Number |
Admin @ si @ YHC2021 |
Serial |
3569 |
|
Permanent link to this record |
|
|
|
|
Author |
G.D. Evangelidis; Ferran Diego; Joan Serrat; Antonio Lopez |
|
|
Title |
Slice Matching for Accurate Spatio-Temporal Alignment |
Type |
Conference Article |
|
Year |
2011 |
Publication |
In ICCV Workshop on Visual Surveillance |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
video alignment |
|
|
Abstract |
Video synchronization and alignment is a rather recent topic in computer vision. It usually deals with the problem of aligning sequences recorded simultaneously by static, jointly- or independently-moving cameras. In this paper, we investigate the more difficult problem of matching videos captured at different times from independently-moving cameras, whose trajectories are approximately coincident or parallel. To this end, we propose a novel method that pixel-wise aligns videos and allows thus to automatically highlight their differences. This primarily aims at visual surveillance but the method can be adopted as is by other related video applications, like object transfer (augmented reality) or high dynamic range video. We build upon a slice matching scheme to first synchronize the sequences, while we develop a spatio-temporal alignment scheme to spatially register corresponding frames and refine the temporal mapping. We investigate the performance of the proposed method on videos recorded from vehicles driven along different types of roads and compare with related previous works. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
VS |
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ EDS2011; ADAS @ adas @ eds2011a |
Serial |
1861 |
|
Permanent link to this record |
|
|
|
|
Author |
Daniel Hernandez; Lukas Schneider; Antonio Espinosa; David Vazquez; Antonio Lopez; Uwe Franke; Marc Pollefeys; Juan C. Moure |
|
|
Title |
Slanted Stixels: Representing San Francisco's Steepest Streets} |
Type |
Conference Article |
|
Year |
2017 |
Publication |
28th British Machine Vision Conference |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
In this work we present a novel compact scene representation based on Stixels that infers geometric and semantic information. Our approach overcomes the previous rather restrictive geometric assumptions for Stixels by introducing a novel depth model to account for non-flat roads and slanted objects. Both semantic and depth cues are used jointly to infer the scene representation in a sound global energy minimization formulation. Furthermore, a novel approximation scheme is introduced that uses an extremely efficient over-segmentation. In doing so, the computational complexity of the Stixel inference algorithm is reduced significantly, achieving real-time computation capabilities with only a slight drop in accuracy. We evaluate the proposed approach in terms of semantic and geometric accuracy as well as run-time on four publicly available benchmark datasets. Our approach maintains accuracy on flat road scene datasets while improving substantially on a novel non-flat road dataset. |
|
|
Address |
London; uk; September 2017 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
BMVC |
|
|
Notes |
ADAS; 600.118 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ HSE2017a |
Serial |
2945 |
|
Permanent link to this record |
|
|
|
|
Author |
Daniel Hernandez; Lukas Schneider; P. Cebrian; A. Espinosa; David Vazquez; Antonio Lopez; Uwe Franke; Marc Pollefeys; Juan Carlos Moure |
|
|
Title |
Slanted Stixels: A way to represent steep streets |
Type |
Journal Article |
|
Year |
2019 |
Publication |
International Journal of Computer Vision |
Abbreviated Journal |
IJCV |
|
|
Volume |
127 |
Issue |
|
Pages |
1643–1658 |
|
|
Keywords |
|
|
|
Abstract |
This work presents and evaluates a novel compact scene representation based on Stixels that infers geometric and semantic information. Our approach overcomes the previous rather restrictive geometric assumptions for Stixels by introducing a novel depth model to account for non-flat roads and slanted objects. Both semantic and depth cues are used jointly to infer the scene representation in a sound global energy minimization formulation. Furthermore, a novel approximation scheme is introduced in order to significantly reduce the computational complexity of the Stixel algorithm, and then achieve real-time computation capabilities. The idea is to first perform an over-segmentation of the image, discarding the unlikely Stixel cuts, and apply the algorithm only on the remaining Stixel cuts. This work presents a novel over-segmentation strategy based on a fully convolutional network, which outperforms an approach based on using local extrema of the disparity map. We evaluate the proposed methods in terms of semantic and geometric accuracy as well as run-time on four publicly available benchmark datasets. Our approach maintains accuracy on flat road scene datasets while improving substantially on a novel non-flat road dataset. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS; 600.118; 600.124 |
Approved |
no |
|
|
Call Number |
Admin @ si @ HSC2019 |
Serial |
3304 |
|
Permanent link to this record |
|
|
|
|
Author |
David Guillamet; Jordi Vitria |
|
|
Title |
Skin segmentation using non linear principal component analysis. |
Type |
Miscellaneous |
|
Year |
1999 |
Publication |
|
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Girona. |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
OR;MV |
Approved |
no |
|
|
Call Number |
BCNPCL @ bcnpcl @ GuV1999b |
Serial |
48 |
|
Permanent link to this record |
|
|
|
|
Author |
Ekaterina Zaytseva; Santiago Segui; Jordi Vitria |
|
|
Title |
Sketchable Histograms of Oriented Gradients for Object Detection |
Type |
Conference Article |
|
Year |
2012 |
Publication |
17th Iberomerican Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
7441 |
Issue |
|
Pages |
374-381 |
|
|
Keywords |
|
|
|
Abstract |
In this paper we investigate a new representation approach for visual object recognition. The new representation, called sketchable-HoG, extends the classical histogram of oriented gradients (HoG) feature by adding two different aspects: the stability of the majority orientation and the continuity of gradient orientations. In this way, the sketchable-HoG locally characterizes the complexity of an object model and introduces global structure information while still keeping simplicity, compactness and robustness. We evaluated the proposed image descriptor on publicly Catltech 101 dataset. The obtained results outperforms classical HoG descriptor as well as other reported descriptors in the literature. |
|
|
Address |
Buenos Aires, Argentina |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-33274-6 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CIARP |
|
|
Notes |
OR; MILAB;MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ ZSV2012 |
Serial |
2048 |
|
Permanent link to this record |
|
|
|
|
Author |
Jordi Gonzalez; J. Varona; Xavier Roca; Juan J. Villanueva |
|
|
Title |
Situation Graph Trees for Human Behavior Modeling |
Type |
Miscellaneous |
|
Year |
2004 |
Publication |
7th Catalan Conference for Artificial Intelligence (CCIA’2004) |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Barcelona (Spain) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
ISE @ ise @ GVR2004b |
Serial |
498 |
|
Permanent link to this record |
|
|
|
|
Author |
C. Mariño; V.M. Gulias; M.G. Penas; M. Penedo; Victor Leboran; A. Mosquera; M.J. Carreira; David Lloret |
|
|
Title |
Sistema de Interpretacion Automatica de Secuencias solo Basado en un Servidor vod. |
Type |
Miscellaneous |
|
Year |
2001 |
Publication |
Proceedings of the SIT2001. |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ MGP2001 |
Serial |
196 |
|
Permanent link to this record |
|
|
|
|
Author |
Raul Chaves |
|
|
Title |
Sistema de identificacion mediante huellas dactilares |
Type |
Report |
|
Year |
2004 |
Publication |
CVC Technical Report #81 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
CVC (UAB) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ Cha2004 |
Serial |
507 |
|
Permanent link to this record |
|
|
|
|
Author |
David Geronimo; Antonio Lopez |
|
|
Title |
Sistema de deteccion de peatones |
Type |
Miscellaneous |
|
Year |
2010 |
Publication |
UAB Divulga |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Durante la próxima década, los sistemas de protección de peatones jugarán un papel fundamental en el reto de mejorar la seguridad viaria. El objetivo principal de estos sistemas, detectar peatones en entornos urbanos, implica procesar imágenes de escenas exteriores desde una plataforma móvil para buscar objetos de aspecto variable como son las personas. Dadas estas dificultades, estos sistemas hacen uso de las últimas técnicas de visión por computador. Esta propuesta consiste en un sistema de tres módulos basado tanto en información 2D como en 3D. El primer módulo utiliza información 3D para hacer una estimación de los parámetros de la carretera y seleccionar regiones de interés que serán analizadas después. El segundo módulo utiliza un clasificador de ventanas 2D para etiquetar las mencionadas regiones como peatón o no peatón. El módulo final vuelve a utilizar de nuevo la información 3D para verificar las regiones clasificadas y, con información 2D, refinar los resultados finales. Los resultados experimentales son positivos tanto en rendimiento como en tiempo de cómputo. |
|
|
Address |
Bellaterra (Spain) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
spreading;ADAS |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ GeL2010b |
Serial |
1473 |
|
Permanent link to this record |
|
|
|
|
Author |
Gemma Rotger; Francesc Moreno-Noguer; Felipe Lumbreras; Antonio Agudo |
|
|
Title |
Single view facial hair 3D reconstruction |
Type |
Conference Article |
|
Year |
2019 |
Publication |
9th Iberian Conference on Pattern Recognition and Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
11867 |
Issue |
|
Pages |
423-436 |
|
|
Keywords |
3D Vision; Shape Reconstruction; Facial Hair Modeling |
|
|
Abstract |
n this work, we introduce a novel energy-based framework that addresses the challenging problem of 3D reconstruction of facial hair from a single RGB image. To this end, we identify hair pixels over the image via texture analysis and then determine individual hair fibers that are modeled by means of a parametric hair model based on 3D helixes. We propose to minimize an energy composed of several terms, in order to adapt the hair parameters that better fit the image detections. The final hairs respond to the resulting fibers after a post-processing step where we encourage further realism. The resulting approach generates realistic facial hair fibers from solely an RGB image without assuming any training data nor user interaction. We provide an experimental evaluation on real-world pictures where several facial hair styles and image conditions are observed, showing consistent results and establishing a comparison with respect to competing approaches. |
|
|
Address |
Madrid; July 2019 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IbPRIA |
|
|
Notes |
ADAS; 600.086; 600.130; 600.122 |
Approved |
no |
|
|
Call Number |
Admin @ si @ |
Serial |
3707 |
|
Permanent link to this record |
|
|
|
|
Author |
Fadi Dornaika; Bogdan Raducanu |
|
|
Title |
Single Snapshot 3D Head Pose Initialization for Tracking in Human Robot Interaction Scenario |
Type |
Conference Article |
|
Year |
2010 |
Publication |
1st International Workshop on Computer Vision for Human-Robot Interaction |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
32–39 |
|
|
Keywords |
1st International Workshop on Computer Vision for Human-Robot Interaction, in conjunction with IEEE CVPR 2010 |
|
|
Abstract |
This paper presents an automatic 3D head pose initialization scheme for a real-time face tracker with application to human-robot interaction. It has two main contributions. First, we propose an automatic 3D head pose and person specific face shape estimation, based on a 3D deformable model. The proposed approach serves to initialize our realtime 3D face tracker. What makes this contribution very attractive is that the initialization step can cope with faces
under arbitrary pose, so it is not limited only to near-frontal views. Second, the previous framework is used to develop an application in which the orientation of an AIBO’s camera can be controlled through the imitation of user’s head pose.
In our scenario, this application is used to build panoramic images from overlapping snapshots. Experiments on real videos confirm the robustness and usefulness of the proposed methods. |
|
|
Address |
San Francisco; CA; USA; June 2010 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
2160-7508 |
ISBN |
978-1-4244-7029-7 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPRW |
|
|
Notes |
OR;MV |
Approved |
no |
|
|
Call Number |
BCNPCL @ bcnpcl @ DoR2010a |
Serial |
1309 |
|
Permanent link to this record |
|
|
|
|
Author |
Lluis Gomez; Andres Mafla; Marçal Rusiñol; Dimosthenis Karatzas |
|
|
Title |
Single Shot Scene Text Retrieval |
Type |
Conference Article |
|
Year |
2018 |
Publication |
15th European Conference on Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
11218 |
Issue |
|
Pages |
728-744 |
|
|
Keywords |
Image retrieval; Scene text; Word spotting; Convolutional Neural Networks; Region Proposals Networks; PHOC |
|
|
Abstract |
Textual information found in scene images provides high level semantic information about the image and its context and it can be leveraged for better scene understanding. In this paper we address the problem of scene text retrieval: given a text query, the system must return all images containing the queried text. The novelty of the proposed model consists in the usage of a single shot CNN architecture that predicts at the same time bounding boxes and a compact text representation of the words in them. In this way, the text based image retrieval task can be casted as a simple nearest neighbor search of the query text representation over the outputs of the CNN over the entire image
database. Our experiments demonstrate that the proposed architecture
outperforms previous state-of-the-art while it offers a significant increase
in processing speed. |
|
|
Address |
Munich; September 2018 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECCV |
|
|
Notes |
DAG; 600.084; 601.338; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GMR2018 |
Serial |
3143 |
|
Permanent link to this record |