|
Records |
Links |
|
Author |
Sanket Biswas; Pau Riba; Josep Llados; Umapada Pal |
|
|
Title |
Beyond Document Object Detection: Instance-Level Segmentation of Complex Layouts |
Type |
Journal Article |
|
Year |
2021 |
Publication |
International Journal on Document Analysis and Recognition |
Abbreviated Journal |
IJDAR |
|
|
Volume |
24 |
Issue |
|
Pages |
269–281 |
|
|
Keywords |
|
|
|
Abstract |
Information extraction is a fundamental task of many business intelligence services that entail massive document processing. Understanding a document page structure in terms of its layout provides contextual support which is helpful in the semantic interpretation of the document terms. In this paper, inspired by the progress of deep learning methodologies applied to the task of object recognition, we transfer these models to the specific case of document object detection, reformulating the traditional problem of document layout analysis. Moreover, we importantly contribute to prior arts by defining the task of instance segmentation on the document image domain. An instance segmentation paradigm is especially important in complex layouts whose contents should interact for the proper rendering of the page, i.e., the proper text wrapping around an image. Finally, we provide an extensive evaluation, both qualitative and quantitative, that demonstrates the superior performance of the proposed methodology over the current state of the art. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.121; 600.140; 110.312 |
Approved |
no |
|
|
Call Number |
Admin @ si @ BRL2021b |
Serial |
3574 |
|
Permanent link to this record |
|
|
|
|
Author |
Ferran Poveda; Debora Gil;Enric Marti |
|
|
Title |
Multi-resolution DT-MRI cardiac tractography |
Type |
Conference Article |
|
Year |
2012 |
Publication |
Statistical Atlases And Computational Models Of The Heart: Imaging and Modelling Challenges |
Abbreviated Journal |
|
|
|
Volume |
7746 |
Issue |
|
Pages |
270-277 |
|
|
Keywords |
|
|
|
Abstract |
Even using objective measures from DT-MRI no consensus about myocardial architecture has been achieved so far. Streamlining provides good reconstructions at low level of detail, but falls short to give global abstract interpretations. In this paper, we present a multi-resolution methodology that is able to produce simplified representations of cardiac architecture. Our approach produces a reduced set of tracts that are representative of the main geometric features of myocardial anatomical structure. Experiments show that fiber geometry is preserved along reductions, which validates the simplified model for interpretation of cardiac architecture. |
|
|
Address |
Nice, France |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-36960-5 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
STACOM |
|
|
Notes |
IAM |
Approved |
no |
|
|
Call Number |
IAM @ iam @ PGM2012 |
Serial |
1986 |
|
Permanent link to this record |
|
|
|
|
Author |
David Berga; Xavier Otazu |
|
|
Title |
Modeling Bottom-Up and Top-Down Attention with a Neurodynamic Model of V1 |
Type |
Journal Article |
|
Year |
2020 |
Publication |
Neurocomputing |
Abbreviated Journal |
NEUCOM |
|
|
Volume |
417 |
Issue |
|
Pages |
270-289 |
|
|
Keywords |
|
|
|
Abstract |
Previous studies suggested that lateral interactions of V1 cells are responsible, among other visual effects, of bottom-up visual attention (alternatively named visual salience or saliency). Our objective is to mimic these connections with a neurodynamic network of firing-rate neurons in order to predict visual attention. Early visual subcortical processes (i.e. retinal and thalamic) are functionally simulated. An implementation of the cortical magnification function is included to define the retinotopical projections towards V1, processing neuronal activity for each distinct view during scene observation. Novel computational definitions of top-down inhibition (in terms of inhibition of return, oculomotor and selection mechanisms), are also proposed to predict attention in Free-Viewing and Visual Search tasks. Results show that our model outpeforms other biologically inspired models of saliency prediction while predicting visual saccade sequences with the same model. We also show how temporal and spatial characteristics of saccade amplitude and inhibition of return can improve prediction of saccades, as well as how distinct search strategies (in terms of feature-selective or category-specific inhibition) can predict attention at distinct image contexts. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
NEUROBIT |
Approved |
no |
|
|
Call Number |
Admin @ si @ BeO2020c |
Serial |
3444 |
|
Permanent link to this record |
|
|
|
|
Author |
Francesc Carreras; Jaume Garcia; Debora Gil; Sandra Pujadas; Chi ho Lion; R.Suarez-Arias; R.Leta; Xavier Alomar; Manuel Ballester; Guillem Pons-Llados |
|
|
Title |
Left ventricular torsion and longitudinal shortening: two fundamental components of myocardial mechanics assessed by tagged cine-MRI in normal subjects |
Type |
Journal Article |
|
Year |
2012 |
Publication |
International Journal of Cardiovascular Imaging |
Abbreviated Journal |
IJCI |
|
|
Volume |
28 |
Issue |
2 |
Pages |
273-284 |
|
|
Keywords |
Magnetic resonance imaging (MRI); Tagging MRI; Cardiac mechanics; Ventricular torsion |
|
|
Abstract |
Cardiac magnetic resonance imaging (Cardiac MRI) has become a gold standard diagnostic technique for the assessment of cardiac mechanics, allowing the non-invasive calculation of left ventric- ular long axis longitudinal shortening (LVLS) and absolute myocardial torsion (AMT) between basal and apical left ventricular slices, a movement directly related to the helicoidal anatomic disposition of the myocardial fibers. The aim of this study is to determine AMT and LVLS behaviour and normal values from a group of healthy subjects. A group of 21 healthy volunteers (15 males) (age: 23–55 y.o., mean:30.7 ± 7.5) were prospectively included in an obser- vational study by Cardiac MRI. Left ventricular rotation (degrees) was calculated by custom-made software (Harmonic Phase Flow) in consecutive LV short axis planes tagged cine-MRI sequences. AMT was determined from the difference between basal and apical planes LV rotations. LVLS (%) was determined from the LV longitudinal and horizontal axis cine-MRI images. All the 21 cases studied were interpretable, although in three cases the value of the LV apical rotation could not be determined. The mean rotation of the basal and apical planes at end-systole were -3.71° ± 0.84° and 6.73° ± 1.69° (n:18) respectively, resulting in a LV mean AMT of 10.48° ± 1.63° (n:18). End-systolic mean LVLS was 19.07 ± 2.71%. Cardiac MRI allows for the calculation of AMT and LVLS, fundamental functional components of the ventricular twist mechanics conditioned, in turn, by the anatomical helical layout of the myocardial fibers. These values provide complementary information about systolic ventricular function in relation to the traditional parameters used in daily practice. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Netherlands |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1569-5794 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
IAM; |
Approved |
no |
|
|
Call Number |
IAM @ iam @ CGG2012 |
Serial |
1496 |
|
Permanent link to this record |
|
|
|
|
Author |
Antonio Lopez; Joan Serrat; Cristina Cañero; Felipe Lumbreras |
|
|
Title |
Robust Lane Lines Detection and Quantitative Assessment |
Type |
Conference Article |
|
Year |
2007 |
Publication |
3rd Iberian Conference on Pattern Recognition and Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
4477 |
Issue |
|
Pages |
274–281 |
|
|
Keywords |
lane markings |
|
|
Abstract |
|
|
|
Address |
Girona (Spain) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
J. Marti et al |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IbPRIA |
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ LSC2007 |
Serial |
881 |
|
Permanent link to this record |
|
|
|
|
Author |
Jean-Pascal Jacob; Mariella Dimiccoli; Lionel Moisan |
|
|
Title |
Active skeleton for bacteria modeling |
Type |
Journal Article |
|
Year |
2016 |
Publication |
Computer Methods in Biomechanics and Biomedical Engineering: Imaging and Visualization |
Abbreviated Journal |
CMBBE |
|
|
Volume |
5 |
Issue |
4 |
Pages |
274-286 |
|
|
Keywords |
Bacteria modelling; medial axis; active contours; active skeleton; shape contraints |
|
|
Abstract |
The investigation of spatio-temporal dynamics of bacterial cells and their molecular components requires automated image analysis tools to track cell shape properties and molecular component locations inside the cells. In the study of bacteria aging, the molecular components of interest are protein aggregates accumulated near bacteria boundaries. This particular location makes very ambiguous the correspondence between aggregates and cells, since computing accurately bacteria boundaries in phase-contrast time-lapse imaging is a challenging task. This paper proposes an active skeleton formulation for bacteria modeling which provides several advantages: an easy computation of shape properties (perimeter, length, thickness, orientation), an improved boundary accuracy in noisy images, and a natural bacteria-centered coordinate system that permits the intrinsic location of molecular components inside the cell. Starting from an initial skeleton estimate, the medial axis of the bacterium is obtained by minimizing an energy function which incorporates bacteria shape constraints. Experimental results on biological images and comparative evaluation of the performances validate the proposed approach for modeling cigar-shaped bacteria like Escherichia coli. The Image-J plugin of the proposed method can be found online at this http URL |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ JDM2016 |
Serial |
2711 |
|
Permanent link to this record |
|
|
|
|
Author |
Jean-Pascal Jacob; Mariella Dimiccoli; L. Moisan |
|
|
Title |
Active skeleton for bacteria modelling |
Type |
Journal Article |
|
Year |
2017 |
Publication |
Computer Methods in Biomechanics and Biomedical Engineering: Imaging and Visualization |
Abbreviated Journal |
CMBBE |
|
|
Volume |
5 |
Issue |
4 |
Pages |
274-286 |
|
|
Keywords |
|
|
|
Abstract |
The investigation of spatio-temporal dynamics of bacterial cells and their molecular components requires automated image analysis tools to track cell shape properties and molecular component locations inside the cells. In the study of bacteria aging, the molecular components of interest are protein aggregates accumulated near bacteria boundaries. This particular location makes very ambiguous the correspondence between aggregates and cells, since computing accurately bacteria boundaries in phase-contrast time-lapse imaging is a challenging task. This paper proposes an active skeleton formulation for bacteria modelling which provides several advantages: an easy computation of shape properties (perimeter, length, thickness and orientation), an improved boundary accuracy in noisy images and a natural bacteria-centred coordinate system that permits the intrinsic location of molecular components inside the cell. Starting from an initial skeleton estimate, the medial axis of the bacterium is obtained by minimising an energy function which incorporates bacteria shape constraints. Experimental results on biological images and comparative evaluation of the performances validate the proposed approach for modelling cigar-shaped bacteria like Escherichia coli. The Image-J plugin of the proposed method can be found online at http://fluobactracker.inrialpes.fr. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Taylor & Francis Group |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB; |
Approved |
no |
|
|
Call Number |
Admin @ si @JDM2017 |
Serial |
2784 |
|
Permanent link to this record |
|
|
|
|
Author |
Stepan Simsa; Michal Uricar; Milan Sulc; Yash Patel; Ahmed Hamdi; Matej Kocian; Matyas Skalicky; Jiri Matas; Antoine Doucet; Mickael Coustaty; Dimosthenis Karatzas |
|
|
Title |
Overview of DocILE 2023: Document Information Localization and Extraction |
Type |
Conference Article |
|
Year |
2023 |
Publication |
International Conference of the Cross-Language Evaluation Forum for European Languages |
Abbreviated Journal |
|
|
|
Volume |
14163 |
Issue |
|
Pages |
276–293 |
|
|
Keywords |
Information Extraction; Computer Vision; Natural Language Processing; Optical Character Recognition; Document Understanding |
|
|
Abstract |
This paper provides an overview of the DocILE 2023 Competition, its tasks, participant submissions, the competition results and possible future research directions. This first edition of the competition focused on two Information Extraction tasks, Key Information Localization and Extraction (KILE) and Line Item Recognition (LIR). Both of these tasks require detection of pre-defined categories of information in business documents. The second task additionally requires correctly grouping the information into tuples, capturing the structure laid out in the document. The competition used the recently published DocILE dataset and benchmark that stays open to new submissions. The diversity of the participant solutions indicates the potential of the dataset as the submissions included pure Computer Vision, pure Natural Language Processing, as well as multi-modal solutions and utilized all of the parts of the dataset, including the annotated, synthetic and unlabeled subsets. |
|
|
Address |
Thessaloniki; Greece; September 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CLEF |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ SUS2023a |
Serial |
3924 |
|
Permanent link to this record |
|
|
|
|
Author |
Ernest Valveny; Salvatore Tabbone; Oriol Ramos Terrades |
|
|
Title |
Performance Characterization of Shape Descriptors for Symbol Representation |
Type |
Book Chapter |
|
Year |
2008 |
Publication |
Graphics Recognition: Recent Advances and New Opportunities |
Abbreviated Journal |
|
|
|
Volume |
5046 |
Issue |
|
Pages |
278–287 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
W. Liu, J. Llados, J.M. Ogier |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
DAG @ dag @ VTR2008 |
Serial |
985 |
|
Permanent link to this record |
|
|
|
|
Author |
Marc Serra; Olivier Penacchio; Robert Benavente; Maria Vanrell |
|
|
Title |
Names and Shades of Color for Intrinsic Image Estimation |
Type |
Conference Article |
|
Year |
2012 |
Publication |
25th IEEE Conference on Computer Vision and Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
278-285 |
|
|
Keywords |
|
|
|
Abstract |
In the last years, intrinsic image decomposition has gained attention. Most of the state-of-the-art methods are based on the assumption that reflectance changes come along with strong image edges. Recently, user intervention in the recovery problem has proved to be a remarkable source of improvement. In this paper, we propose a novel approach that aims to overcome the shortcomings of pure edge-based methods by introducing strong surface descriptors, such as the color-name descriptor which introduces high-level considerations resembling top-down intervention. We also use a second surface descriptor, termed color-shade, which allows us to include physical considerations derived from the image formation model capturing gradual color surface variations. Both color cues are combined by means of a Markov Random Field. The method is quantitatively tested on the MIT ground truth dataset using different error metrics, achieving state-of-the-art performance. |
|
|
Address |
Providence, Rhode Island |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
IEEE Xplore |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1063-6919 |
ISBN |
978-1-4673-1226-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPR |
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
Admin @ si @ SPB2012 |
Serial |
2026 |
|
Permanent link to this record |
|
|
|
|
Author |
Carles Fernandez; Pau Baiget; Xavier Roca; Jordi Gonzalez |
|
|
Title |
Natural Language Descriptions of Human Behavior from Video Sequences |
Type |
Conference Article |
|
Year |
2007 |
Publication |
Advances in Artificial Intelligence, 30th Annual Conference on Artificial Intelligence |
Abbreviated Journal |
|
|
|
Volume |
4667 |
Issue |
|
Pages |
279–292 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
KI |
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
ISE @ ise @ FBR2007b |
Serial |
921 |
|
Permanent link to this record |
|
|
|
|
Author |
Carme Julia; Angel Sappa; Felipe Lumbreras; Joan Serrat; Antonio Lopez |
|
|
Title |
Rank Estimation in 3D Multibody Motion Segmentation |
Type |
Journal Article |
|
Year |
2008 |
Publication |
Electronic Letters |
Abbreviated Journal |
|
|
|
Volume |
44 |
Issue |
4 |
Pages |
279-280 |
|
|
Keywords |
|
|
|
Abstract |
A novel technique for rank estimation in 3D multibody motion segmentation is proposed. It is based on the study of the frequency spectra of moving rigid objects and does not use or assume a prior knowledge of the objects contained in the scene (i.e. number of objects and motion). The significance of rank estimation on multibody motion segmentation results is shown by using two motion segmentation algorithms over both synthetic and real data. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ JSL2008a |
Serial |
939 |
|
Permanent link to this record |
|
|
|
|
Author |
Ivan Huerta; Ariel Amato; Jordi Gonzalez; Juan J. Villanueva |
|
|
Title |
Fusing Edge Cues to Handle Colour Problems in Image Segmentation |
Type |
Book Chapter |
|
Year |
2008 |
Publication |
Articulated Motion and Deformable Objects, 5th International Conference |
Abbreviated Journal |
|
|
|
Volume |
5098 |
Issue |
|
Pages |
279–288 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Port d'Andratx (Mallorca) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
AMDO |
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
ISE @ ise @ HAG2008 |
Serial |
973 |
|
Permanent link to this record |
|
|
|
|
Author |
Raul Gomez; Lluis Gomez; Jaume Gibert; Dimosthenis Karatzas |
|
|
Title |
Self-Supervised Learning from Web Data for Multimodal Retrieval |
Type |
Book Chapter |
|
Year |
2019 |
Publication |
Multi-Modal Scene Understanding Book |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
279-306 |
|
|
Keywords |
self-supervised learning; webly supervised learning; text embeddings; multimodal retrieval; multimodal embedding |
|
|
Abstract |
Self-Supervised learning from multimodal image and text data allows deep neural networks to learn powerful features with no need of human annotated data. Web and Social Media platforms provide a virtually unlimited amount of this multimodal data. In this work we propose to exploit this free available data to learn a multimodal image and text embedding, aiming to leverage the semantic knowledge learnt in the text domain and transfer it to a visual model for semantic image retrieval. We demonstrate that the proposed pipeline can learn from images with associated text without supervision and analyze the semantic structure of the learnt joint image and text embeddingspace. Weperformathoroughanalysisandperformancecomparisonoffivedifferentstateof the art text embeddings in three different benchmarks. We show that the embeddings learnt with Web and Social Media data have competitive performances over supervised methods in the text basedimageretrievaltask,andweclearlyoutperformstateoftheartintheMIRFlickrdatasetwhen training in the target data. Further, we demonstrate how semantic multimodal image retrieval can be performed using the learnt embeddings, going beyond classical instance-level retrieval problems. Finally, we present a new dataset, InstaCities1M, composed by Instagram images and their associated texts that can be used for fair comparison of image-text embeddings. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 600.129; 601.338; 601.310 |
Approved |
no |
|
|
Call Number |
Admin @ si @ GGG2019 |
Serial |
3266 |
|
Permanent link to this record |
|
|
|
|
Author |
Marco Pedersoli; Jordi Gonzalez; Andrew Bagdanov; Juan J. Villanueva |
|
|
Title |
Recursive Coarse-to-Fine Localization for fast Object Recognition |
Type |
Conference Article |
|
Year |
2010 |
Publication |
11th European Conference on Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
6313 |
Issue |
II |
Pages |
280–293 |
|
|
Keywords |
|
|
|
Abstract |
Cascading techniques are commonly used to speed-up the scan of an image for object detection. However, cascades of detectors are slow to train due to the high number of detectors and corresponding thresholds to learn. Furthermore, they do not use any prior knowledge about the scene structure to decide where to focus the search. To handle these problems, we propose a new way to scan an image, where we couple a recursive coarse-to-fine refinement together with spatial constraints of the object location. For doing that we split an image into a set of uniformly distributed neighborhood regions, and for each of these we apply a local greedy search over feature resolutions. The neighborhood is defined as a scanning region that only one object can occupy. Therefore the best hypothesis is obtained as the location with maximum score and no thresholds are needed. We present an implementation of our method using a pyramid of HOG features and we evaluate it on two standard databases, VOC2007 and INRIA dataset. Results show that the Recursive Coarse-to-Fine Localization (RCFL) achieves a 12x speed-up compared to standard sliding windows. Compared with a cascade of multiple resolutions approach our method has slightly better performance in speed and Average-Precision. Furthermore, in contrast to cascading approach, the speed-up is independent of image conditions, the number of detected objects and clutter. |
|
|
Address |
Crete (Greece) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-15566-6 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ECCV |
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
DAG @ dag @ PGB2010 |
Serial |
1438 |
|
Permanent link to this record |