|
Records |
Links |
|
Author |
Debora Gil; Aura Hernandez-Sabate; Mireia Brunat;Steven Jansen; Jordi Martinez-Vilalta |
|
|
Title |
Structure-preserving smoothing of biomedical images |
Type |
Journal Article |
|
Year |
2011 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
44 |
Issue |
9 |
Pages |
1842-1851 |
|
|
Keywords |
Non-linear smoothing; Differential geometry; Anatomical structures; segmentation; Cardiac magnetic resonance; Computerized tomography |
|
|
Abstract |
Smoothing of biomedical images should preserve gray-level transitions between adjacent tissues, while restoring contours consistent with anatomical structures. Anisotropic diffusion operators are based on image appearance discontinuities (either local or contextual) and might fail at weak inter-tissue transitions. Meanwhile, the output of block-wise and morphological operations is prone to present a block structure due to the shape and size of the considered pixel neighborhood. In this contribution, we use differential geometry concepts to define a diffusion operator that restricts to image consistent level-sets. In this manner, the final state is a non-uniform intensity image presenting homogeneous inter-tissue transitions along anatomical structures, while smoothing intra-structure texture. Experiments on different types of medical images (magnetic resonance, computerized tomography) illustrate its benefit on a further process (such as segmentation) of images. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0031-3203 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
IAM; ADAS |
Approved |
no |
|
|
Call Number |
IAM @ iam @ GHB2011 |
Serial |
1526 |
|
Permanent link to this record |
|
|
|
|
Author |
Debora Gil; Aura Hernandez-Sabate; Mireia Burnat; Steven Jansen; Jordi Martinez-Vilalta |
|
|
Title |
Structure-Preserving Smoothing of Biomedical Images |
Type |
Conference Article |
|
Year |
2009 |
Publication |
13th International Conference on Computer Analysis of Images and Patterns |
Abbreviated Journal |
|
|
|
Volume |
5702 |
Issue |
|
Pages |
427-434 |
|
|
Keywords |
non-linear smoothing; differential geometry; anatomical structures segmentation; cardiac magnetic resonance; computerized tomography. |
|
|
Abstract |
Smoothing of biomedical images should preserve gray-level transitions between adjacent tissues, while restoring contours consistent with anatomical structures. Anisotropic diffusion operators are based on image appearance discontinuities (either local or contextual) and might fail at weak inter-tissue transitions. Meanwhile, the output of block-wise and morphological operations is prone to present a block structure due to the shape and size of the considered pixel neighborhood. In this contribution, we use differential geometry concepts to define a diffusion operator that restricts to image consistent level-sets. In this manner, the final state is a non-uniform intensity image presenting homogeneous inter-tissue transitions along anatomical structures, while smoothing intra-structure texture. Experiments on different types of medical images (magnetic resonance, computerized tomography) illustrate its benefit on a further process (such as segmentation) of images. |
|
|
Address |
Münster, Germany |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-03766-5 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CAIP |
|
|
Notes |
IAM |
Approved |
no |
|
|
Call Number |
IAM @ iam @ GHB2009 |
Serial |
1527 |
|
Permanent link to this record |
|
|
|
|
Author |
Francisco Alvaro; Francisco Cruz; Joan Andreu Sanchez; Oriol Ramos Terrades; Jose Miguel Benedi |
|
|
Title |
Structure Detection and Segmentation of Documents Using 2D Stochastic Context-Free Grammars |
Type |
Journal Article |
|
Year |
2015 |
Publication |
Neurocomputing |
Abbreviated Journal |
NEUCOM |
|
|
Volume |
150 |
Issue |
A |
Pages |
147-154 |
|
|
Keywords |
document image analysis; stochastic context-free grammars; text classication features |
|
|
Abstract |
In this paper we dene a bidimensional extension of Stochastic Context-Free Grammars for structure detection and segmentation of images of documents.
Two sets of text classication features are used to perform an initial classication of each zone of the page. Then, the document segmentation is obtained as the most likely hypothesis according to a stochastic grammar. We used a dataset of historical marriage license books to validate this approach. We also tested several inference algorithms for Probabilistic Graphical Models
and the results showed that the proposed grammatical model outperformed
the other methods. Furthermore, grammars also provide the document structure
along with its segmentation. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 601.158; 600.077; 600.061 |
Approved |
no |
|
|
Call Number |
Admin @ si @ ACS2015 |
Serial |
2531 |
|
Permanent link to this record |
|
|
|
|
Author |
Josep Llados; Horst Bunke; Enric Marti |
|
|
Title |
Structural Recognition of hand drawn floor plans |
Type |
Conference Article |
|
Year |
1996 |
Publication |
VI National Symposium on Pattern Recognition and Image Analysis |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
Rotational Symmetry; Reflectional Symmetry; String Matching. |
|
|
Abstract |
A system to recognize hand drawn architectural drawings in a CAD environment has been deve- loped. In this paper we focus on its high level interpretation module. To interpret a floor plan, the system must identify several building elements, whose description is stored in a library of pat- terns, as well as their spatial relationships. We propose a structural approach based on subgraph isomorphism techniques to obtain a high-level interpretation of the document. The vectorized input document and the patterns to be recognized are represented by attributed graphs. Discrete relaxation techniques (AC4 algorithm) have been applied to develop the matching algorithm. The process has been divided in three steps: node labeling, local consistency and global consistency verification. The hand drawn creation causes disturbed line drawings with several accuracy errors, which must be taken into account. Here we have identified them and the AC4 algorithm has been adapted to manage them. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
Cordoba |
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG;IAM; |
Approved |
no |
|
|
Call Number |
IAM @ iam @ LIM1995 |
Serial |
1565 |
|
Permanent link to this record |
|
|
|
|
Author |
V. Valev; Petia Radeva |
|
|
Title |
Structural Pattern Recognition by Non-Reducible Descriptors |
Type |
Conference Article |
|
Year |
1994 |
Publication |
Proc. International Workshop on Syntactic and Structural Pattern Recognition. |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Nahariya, Israel |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
BCNPCL @ bcnpcl @ VaR1994 |
Serial |
107 |
|
Permanent link to this record |
|
|
|
|
Author |
Domicele Jonauskaite; Nele Dael; C. Alejandro Parraga; Laetitia Chevre; Alejandro Garcia Sanchez; Christine Mohr |
|
|
Title |
Stripping #The Dress: The importance of contextual information on inter-individual differences in colour perception |
Type |
Journal Article |
|
Year |
2018 |
Publication |
Psychological Research |
Abbreviated Journal |
PSYCHO R |
|
|
Volume |
|
Issue |
|
Pages |
1-15 |
|
|
Keywords |
|
|
|
Abstract |
In 2015, a picture of a Dress (henceforth the Dress) triggered popular and scientific interest; some reported seeing the Dress in white and gold (W&G) and others in blue and black (B&B). We aimed to describe the phenomenon and investigate the role of contextualization. Few days after the Dress had appeared on the Internet, we projected it to 240 students on two large screens in the classroom. Participants reported seeing the Dress in B&B (48%), W&G (38%), or blue and brown (B&Br; 7%). Amongst numerous socio-demographic variables, we only observed that W&G viewers were most likely to have always seen the Dress as W&G. In the laboratory, we tested how much contextual information is necessary for the phenomenon to occur. Fifty-seven participants selected colours most precisely matching predominant colours of parts or the full Dress. We presented, in this order, small squares (a), vertical strips (b), and the full Dress (c). We found that (1) B&B, B&Br, and W&G viewers had selected colours differing in lightness and chroma levels for contextualized images only (b, c conditions) and hue for fully contextualized condition only (c) and (2) B&B viewers selected colours most closely matching displayed colours of the Dress. Thus, the Dress phenomenon emerges due to inter-individual differences in subjectively perceived lightness, chroma, and hue, at least when all aspects of the picture need to be integrated. Our results support the previous conclusions that contextual information is key to colour perception; it should be important to understand how this actually happens. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
NEUROBIT; no proj |
Approved |
no |
|
|
Call Number |
Admin @ si @ JDP2018 |
Serial |
3149 |
|
Permanent link to this record |
|
|
|
|
Author |
Mariella Dimiccoli; Benoît Girard; Alain Berthoz; Daniel Bennequin |
|
|
Title |
Striola Magica: a functional explanation of otolith organs |
Type |
Journal Article |
|
Year |
2013 |
Publication |
Journal of Computational Neuroscience |
Abbreviated Journal |
JCN |
|
|
Volume |
35 |
Issue |
2 |
Pages |
125-154 |
|
|
Keywords |
Otolith organs ;Striola; Vestibular pathway |
|
|
Abstract |
Otolith end organs of vertebrates sense linear accelerations of the head and gravitation. The hair cells on their epithelia are responsible for transduction. In mammals, the striola, parallel to the line where hair cells reverse their polarization, is a narrow region centered on a curve with curvature and torsion. It has been shown that the striolar region is functionally different from the rest, being involved in a phasic vestibular pathway. We propose a mathematical and computational model that explains the necessity of this amazing geometry for the striola to be able to carry out its function. Our hypothesis, related to the biophysics of the hair cells and to the physiology of their afferent neurons, is that striolar afferents collect information from several type I hair cells to detect the jerk in a large domain of acceleration directions. This predicts a mean number of two calyces for afferent neurons, as measured in rodents. The domain of acceleration directions sensed by our striolar model is compatible with the experimental results obtained on monkeys considering all afferents. Therefore, the main result of our study is that phasic and tonic vestibular afferents cover the same geometrical fields, but at different dynamical and frequency domains. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer US |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1573-6873. 2013 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @DBG2013 |
Serial |
2787 |
|
Permanent link to this record |
|
|
|
|
Author |
Juan Andrade; T. Alejandra Vidal; A. Sanfeliu |
|
|
Title |
Stochastic state estimation for simultaneous localization and map building in mobile robotics |
Type |
Book Chapter |
|
Year |
2005 |
Publication |
Vedran Kordic, Aleksandar Lazinica, and Munir Merdan (Eds.), Cutting Edge Robotics, Advanced Robotic Systems Press, 3.3:223–242 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ AVS2005a |
Serial |
565 |
|
Permanent link to this record |
|
|
|
|
Author |
Anjan Dutta; Hichem Sahbi |
|
|
Title |
Stochastic Graphlet Embedding |
Type |
Journal Article |
|
Year |
2018 |
Publication |
IEEE Transactions on Neural Networks and Learning Systems |
Abbreviated Journal |
TNNLS |
|
|
Volume |
|
Issue |
|
Pages |
1-14 |
|
|
Keywords |
Stochastic graphlets; Graph embedding; Graph classification; Graph hashing; Betweenness centrality |
|
|
Abstract |
Graph-based methods are known to be successful in many machine learning and pattern classification tasks. These methods consider semi-structured data as graphs where nodes correspond to primitives (parts, interest points, segments,
etc.) and edges characterize the relationships between these primitives. However, these non-vectorial graph data cannot be straightforwardly plugged into off-the-shelf machine learning algorithms without a preliminary step of – explicit/implicit –graph vectorization and embedding. This embedding process
should be resilient to intra-class graph variations while being highly discriminant. In this paper, we propose a novel high-order stochastic graphlet embedding (SGE) that maps graphs into vector spaces. Our main contribution includes a new stochastic search procedure that efficiently parses a given graph and extracts/samples unlimitedly high-order graphlets. We consider
these graphlets, with increasing orders, to model local primitives as well as their increasingly complex interactions. In order to build our graph representation, we measure the distribution of these graphlets into a given graph, using particular hash functions that efficiently assign sampled graphlets into isomorphic sets with a very low probability of collision. When
combined with maximum margin classifiers, these graphlet-based representations have positive impact on the performance of pattern comparison and recognition as corroborated through extensive experiments using standard benchmark databases. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG; 602.167; 602.168; 600.097; 600.121 |
Approved |
no |
|
|
Call Number |
Admin @ si @ DuS2018 |
Serial |
3225 |
|
Permanent link to this record |
|
|
|
|
Author |
David Geronimo; Angel Sappa; Antonio Lopez |
|
|
Title |
Stereo-based Candidate Generation for Pedestrian Protection Systems |
Type |
Book Chapter |
|
Year |
2010 |
Publication |
Binocular Vision: Development, Depth Perception and Disorders |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
9 |
Pages |
189–208 |
|
|
Keywords |
Pedestrian Detection |
|
|
Abstract |
This chapter describes a stereo-based algorithm that provides candidate image windows to a latter 2D classification stage in an on-board pedestrian detection system. The proposed algorithm, which consists of three stages, is based on the use of both stereo imaging and scene prior knowledge (i.e., pedestrians are on the ground) to reduce the candidate searching space. First, a successful road surface fitting algorithm provides estimates on the relative ground-camera pose. This stage directs the search toward the road area thus avoiding irrelevant regions like the sky. Then, three different schemes are used to scan the estimated road surface with pedestrian-sized windows: (a) uniformly distributed through the road surface (3D); (b) uniformly distributed through the image (2D); (c) not uniformly distributed but according to a quadratic function (combined 2D-3D). Finally, the set of candidate windows is reduced by analyzing their 3D content. Experimental results of the proposed algorithm, together with statistics of searching space reduction are provided. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
NOVA Publishers |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ GSL2010 |
Serial |
1301 |
|
Permanent link to this record |
|
|
|
|
Author |
David Aldavert; Ricardo Toledo |
|
|
Title |
Stereo Vision Local Map Alignment for Robot Environment Mapping |
Type |
Book Chapter |
|
Year |
2008 |
Publication |
Robot Vision Second International Workshop, RobVis |
Abbreviated Journal |
|
|
|
Volume |
4931 |
Issue |
|
Pages |
111–124 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Auckland (New Zealand) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ AlT2008 |
Serial |
1100 |
|
Permanent link to this record |
|
|
|
|
Author |
Angel Sappa; David Geronimo; Fadi Dornaika; Antonio Lopez |
|
|
Title |
Stereo Vision Camera Pose Estimation for On-Board Applications |
Type |
Book Chapter |
|
Year |
2007 |
Publication |
Scene Reconstruction, Pose Estimation and Traking |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
39-50 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Rustam Stolking |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
978-3-902613-06-6 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ SGD2007 |
Serial |
797 |
|
Permanent link to this record |
|
|
|
|
Author |
C. Sbert; A.F. Sole |
|
|
Title |
Stereo reconstruction of 3D curves. |
Type |
Conference Article |
|
Year |
2000 |
Publication |
15 th International Conference on Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
1 |
Issue |
|
Pages |
912–915 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Barcelona. |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICPR |
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ SbS2000 |
Serial |
219 |
|
Permanent link to this record |
|
|
|
|
Author |
Daniel Hernandez; Alejandro Chacon; Antonio Espinosa; David Vazquez; Juan Carlos Moure; Antonio Lopez |
|
|
Title |
Stereo Matching using SGM on the GPU |
Type |
Report |
|
Year |
2016 |
Publication |
Programming and Tuning Massively Parallel Systems |
Abbreviated Journal |
PUMPS |
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
CUDA; Stereo; Autonomous Vehicle |
|
|
Abstract |
Dense, robust and real-time computation of depth information from stereo-camera systems is a computationally demanding requirement for robotics, advanced driver assistance systems (ADAS) and autonomous vehicles. Semi-Global Matching (SGM) is a widely used algorithm that propagates consistency constraints along several paths across the image. This work presents a real-time system producing reliable disparity estimation results on the new embedded energy efficient GPU devices. Our design runs on a Tegra X1 at 42 frames per second (fps) for an image size of 640x480, 128 disparity levels, and using 4 path directions for the SGM method. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
PUMPS |
|
|
Notes |
ADAS; 600.085; 600.087; 600.076 |
Approved |
no |
|
|
Call Number |
ADAS @ adas @ HCE2016b |
Serial |
2776 |
|
Permanent link to this record |
|
|
|
|
Author |
Sergi Garcia Bordils; Dimosthenis Karatzas; Marçal Rusiñol |
|
|
Title |
STEP – Towards Structured Scene-Text Spotting |
Type |
Conference Article |
|
Year |
2024 |
Publication |
Winter Conference on Applications of Computer Vision |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
883-892 |
|
|
Keywords |
|
|
|
Abstract |
We introduce the structured scene-text spotting task, which requires a scene-text OCR system to spot text in the wild according to a query regular expression. Contrary to generic scene text OCR, structured scene-text spotting seeks to dynamically condition both scene text detection and recognition on user-provided regular expressions. To tackle this task, we propose the Structured TExt sPotter (STEP), a model that exploits the provided text structure to guide the OCR process. STEP is able to deal with regular expressions that contain spaces and it is not bound to detection at the word-level granularity. Our approach enables accurate zero-shot structured text spotting in a wide variety of real-world reading scenarios and is solely trained on publicly available data. To demonstrate the effectiveness of our approach, we introduce a new challenging test dataset that contains several types of out-of-vocabulary structured text, reflecting important reading applications of fields such as prices, dates, serial numbers, license plates etc. We demonstrate that STEP can provide specialised OCR performance on demand in all tested scenarios. |
|
|
Address |
Waikoloa; Hawai; USA; January 2024 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
WACV |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ GKR2024 |
Serial |
3992 |
|
Permanent link to this record |