|
Records |
Links |
|
Author |
Sangeeth Reddy; Minesh Mathew; Lluis Gomez; Marçal Rusiñol; Dimosthenis Karatzas; C.V. Jawahar |
|
|
Title |
RoadText-1K: Text Detection and Recognition Dataset for Driving Videos |
Type |
Conference Article |
|
Year |
2020 |
Publication |
IEEE International Conference on Robotics and Automation |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Perceiving text is crucial to understand semantics of outdoor scenes and hence is a critical requirement to build intelligent systems for driver assistance and self-driving. Most of the existing datasets for text detection and recognition comprise still images and are mostly compiled keeping text in mind. This paper introduces a new ”RoadText-1K” dataset for text in driving videos. The dataset is 20 times larger than the existing largest dataset for text in videos. Our dataset comprises 1000 video clips of driving without any bias towards text and with annotations for text bounding boxes and transcriptions in every frame. State of the art methods for text detection,
recognition and tracking are evaluated on the new dataset and the results signify the challenges in unconstrained driving videos compared to existing datasets. This suggests that RoadText-1K is suited for research and development of reading systems, robust enough to be incorporated into more complex downstream tasks like driver assistance and self-driving. The dataset can be found at http://cvit.iiit.ac.in/research/
projects/cvit-projects/roadtext-1k |
|
|
Address |
Paris; Francia; ??? |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICRA |
|
|
Notes |
DAG; 600.121; 600.129 |
Approved |
no |
|
|
Call Number |
Admin @ si @ RMG2020 |
Serial |
3400 |
|
Permanent link to this record |
|
|
|
|
Author |
Alloy Das; Sanket Biswas; Umapada Pal; Josep Llados |
|
|
Title |
Diving into the Depths of Spotting Text in Multi-Domain Noisy Scenes |
Type |
Conference Article |
|
Year |
2024 |
Publication |
IEEE International Conference on Robotics and Automation in PACIFICO |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
When used in a real-world noisy environment, the capacity to generalize to multiple domains is essential for any autonomous scene text spotting system. However, existing state-of-the-art methods employ pretraining and fine-tuning strategies on natural scene datasets, which do not exploit the feature interaction across other complex domains. In this work, we explore and investigate the problem of domain-agnostic scene text spotting, i.e., training a model on multi-domain source data such that it can directly generalize to target domains rather than being specialized for a specific domain or scenario. In this regard, we present the community a text spotting validation benchmark called Under-Water Text (UWT) for noisy underwater scenes to establish an important case study. Moreover, we also design an efficient super-resolution based end-to-end transformer baseline called DA-TextSpotter which achieves comparable or superior performance over existing text spotting architectures for both regular and arbitrary-shaped scene text spotting benchmarks in terms of both accuracy and model efficiency. The dataset, code and pre-trained models will be released upon acceptance. |
|
|
Address |
Yokohama; Japan; May 2024 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICRA |
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ DBP2024 |
Serial |
3979 |
|
Permanent link to this record |
|
|
|
|
Author |
Fadi Dornaika; Bogdan Raducanu |
|
|
Title |
Detecting and Tracking of 3D Face Pose for Human-Robot Interaction |
Type |
Conference Article |
|
Year |
2008 |
Publication |
IEEE International Conference on Robotics and Automation, |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1716–1721 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Pasadena; CA; USA |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICRA |
|
|
Notes |
OR;MV |
Approved |
no |
|
|
Call Number |
BCNPCL @ bcnpcl @ DoR2008a |
Serial |
982 |
|
Permanent link to this record |
|
|
|
|
Author |
Arnau Ramisa; Adriana Tapus; Ramon Lopez de Mantaras; Ricardo Toledo |
|
|
Title |
Mobile Robot Localization using Panoramic Vision and Combination of Feature Region Detectors |
Type |
Conference Article |
|
Year |
2008 |
Publication |
IEEE International Conference on Robotics and Automation, |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
538–543 |
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Pasadena; CA; USA |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICRA |
|
|
Notes |
RV;ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ RTL2008 |
Serial |
1144 |
|
Permanent link to this record |
|
|
|
|
Author |
T. Alejandra Vidal; Andrew J. Davison; Juan Andrade; David W. Murray |
|
|
Title |
Active Control for Single Camera SLAM |
Type |
Miscellaneous |
|
Year |
2006 |
Publication |
IEEE International Conference on Robotics and Automation, 1930–1936 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Orlando (Florida) |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
DAG @ dag @ VDA2006 |
Serial |
666 |
|
Permanent link to this record |
|
|
|
|
Author |
Zhong Jin; Franck Davoine; Zhen Lou |
|
|
Title |
Facial expression analysis by using KPCA |
Type |
Miscellaneous |
|
Year |
2003 |
Publication |
IEEE International Conference on Robotics, Intelligent Systems and Signal Processing (IEEE RISSP 2003), pp736–741 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Changsha, Hunan, China |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
|
Approved |
no |
|
|
Call Number |
Admin @ si @ JDL2003 |
Serial |
431 |
|
Permanent link to this record |
|
|
|
|
Author |
Vishwesh Pillai; Pranav Mehar; Manisha Das; Deep Gupta; Petia Radeva |
|
|
Title |
Integrated Hierarchical and Flat Classifiers for Food Image Classification using Epistemic Uncertainty |
Type |
Conference Article |
|
Year |
2022 |
Publication |
IEEE International Conference on Signal Processing and Communications |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
The problem of food image recognition is an essential one in today’s context because health conditions such as diabetes, obesity, and heart disease require constant monitoring of a person’s diet. To automate this process, several models are available to recognize food images. Due to a considerable number of unique food dishes and various cuisines, a traditional flat classifier ceases to perform well. To address this issue, prediction schemes consisting of both flat and hierarchical classifiers, with the analysis of epistemic uncertainty are used to switch between the classifiers. However, the accuracy of the predictions made using epistemic uncertainty data remains considerably low. Therefore, this paper presents a prediction scheme using three different threshold criteria that helps to increase the accuracy of epistemic uncertainty predictions. The performance of the proposed method is demonstrated using several experiments performed on the MAFood-121 dataset. The experimental results validate the proposal performance and show that the proposed threshold criteria help to increase the overall accuracy of the predictions by correctly classifying the uncertainty distribution of the samples. |
|
|
Address |
Bangalore; India; July 2022 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
SPCOM |
|
|
Notes |
MILAB; no menciona |
Approved |
no |
|
|
Call Number |
Admin @ si @ PMD2022 |
Serial |
3796 |
|
Permanent link to this record |
|
|
|
|
Author |
M. Campos-Taberner; Adriana Romero; Carlo Gatta; Gustavo Camps-Valls |
|
|
Title |
Shared feature representations of LiDAR and optical images: Trading sparsity for semantic discrimination |
Type |
Conference Article |
|
Year |
2015 |
Publication |
IEEE International Geoscience and Remote Sensing Symposium IGARSS2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
4169 - 4172 |
|
|
Keywords |
|
|
|
Abstract |
This paper studies the level of complementary information conveyed by extremely high resolution LiDAR and optical images. We pursue this goal following an indirect approach via unsupervised spatial-spectral feature extraction. We used a recently presented unsupervised convolutional neural network trained to enforce both population and lifetime spar-sity in the feature representation. We derived independent and joint feature representations, and analyzed the sparsity scores and the discriminative power. Interestingly, the obtained results revealed that the RGB+LiDAR representation is no longer sparse, and the derived basis functions merge color and elevation yielding a set of more expressive colored edge filters. The joint feature representation is also more discriminative when used for clustering and topological data visualization. |
|
|
Address |
Milan; Italy; July 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IGARSS |
|
|
Notes |
LAMP; 600.079;MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ CRG2015 |
Serial |
2724 |
|
Permanent link to this record |
|
|
|
|
Author |
Sergio Escalera; Jordi Gonzalez; Xavier Baro; Pablo Pardo; Junior Fabian; Marc Oliu; Hugo Jair Escalante; Ivan Huerta; Isabelle Guyon |
|
|
Title |
ChaLearn Looking at People 2015 new competitions: Age Estimation and Cultural Event Recognition |
Type |
Conference Article |
|
Year |
2015 |
Publication |
IEEE International Joint Conference on Neural Networks IJCNN2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1-8 |
|
|
Keywords |
|
|
|
Abstract |
Following previous series on Looking at People (LAP) challenges [1], [2], [3], in 2015 ChaLearn runs two new competitions within the field of Looking at People: age and cultural event recognition in still images. We propose thefirst crowdsourcing application to collect and label data about apparent
age of people instead of the real age. In terms of cultural event recognition, tens of categories have to be recognized. This involves scene understanding and human analysis. This paper summarizes both challenges and data, providing some initial baselines. The results of the first round of the competition were presented at ChaLearn LAP 2015 IJCNN special session on computer vision and robotics http://www.dtic.ua.es/∼jgarcia/IJCNN2015.
Details of the ChaLearn LAP competitions can be found at http://gesture.chalearn.org/. |
|
|
Address |
Killarney; Ireland; July 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IJCNN |
|
|
Notes |
HuPBA; ISE; 600.063; 600.078;MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ EGB2015 |
Serial |
2591 |
|
Permanent link to this record |
|
|
|
|
Author |
Hugo Jair Escalante; Jose Martinez; Sergio Escalera; Victor Ponce; Xavier Baro |
|
|
Title |
Improving Bag of Visual Words Representations with Genetic Programming |
Type |
Conference Article |
|
Year |
2015 |
Publication |
IEEE International Joint Conference on Neural Networks IJCNN2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
The bag of visual words is a well established representation in diverse computer vision problems. Taking inspiration from the fields of text mining and retrieval, this representation has proved to be very effective in a large number of domains.
In most cases, a standard term-frequency weighting scheme is considered for representing images and videos in computer vision. This is somewhat surprising, as there are many alternative ways of generating bag of words representations within the text processing community. This paper explores the use of alternative weighting schemes for landmark tasks in computer vision: image
categorization and gesture recognition. We study the suitability of using well-known supervised and unsupervised weighting schemes for such tasks. More importantly, we devise a genetic program that learns new ways of representing images and videos under the bag of visual words representation. The proposed method learns to combine term-weighting primitives trying to maximize the classification performance. Experimental results are reported in standard image and video data sets showing the effectiveness of the proposed evolutionary algorithm. |
|
|
Address |
Killarney; Ireland; July 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IJCNN |
|
|
Notes |
HuPBA;MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ EME2015 |
Serial |
2603 |
|
Permanent link to this record |
|
|
|
|
Author |
Isabelle Guyon; Kristin Bennett; Gavin Cawley; Hugo Jair Escalante; Sergio Escalera; Tin Kam Ho; Nuria Macia; Bisakha Ray; Alexander Statnikov; Evelyne Viegas |
|
|
Title |
Design of the 2015 ChaLearn AutoML Challenge |
Type |
Conference Article |
|
Year |
2015 |
Publication |
IEEE International Joint Conference on Neural Networks IJCNN2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
ChaLearn is organizing for IJCNN 2015 an Automatic Machine Learning challenge (AutoML) to solve classification and regression problems from given feature representations, without any human intervention. This is a challenge with code
submission: the code submitted can be executed automatically on the challenge servers to train and test learning machines on new datasets. However, there is no obligation to submit code. Half of the prizes can be won by just submitting prediction results.
There are six rounds (Prep, Novice, Intermediate, Advanced, Expert, and Master) in which datasets of progressive difficulty are introduced (5 per round). There is no requirement to participate in previous rounds to enter a new round. The rounds alternate AutoML phases in which submitted code is “blind tested” on
datasets the participants have never seen before, and Tweakathon phases giving time (' 1 month) to the participants to improve their methods by tweaking their code on those datasets. This challenge will push the state-of-the-art in fully automatic machine learning on a wide range of problems taken from real world
applications. The platform will remain available beyond the termination of the challenge: http://codalab.org/AutoML |
|
|
Address |
Killarney; Ireland; July 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IJCNN |
|
|
Notes |
HuPBA;MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ GBC2015a |
Serial |
2604 |
|
Permanent link to this record |
|
|
|
|
Author |
Isabelle Guyon; Kristin Bennett; Gavin Cawley; Hugo Jair Escalante; Sergio Escalera |
|
|
Title |
The AutoML challenge on codalab |
Type |
Conference Article |
|
Year |
2015 |
Publication |
IEEE International Joint Conference on Neural Networks IJCNN2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Killarney; Ireland; July 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IJCNN |
|
|
Notes |
HuPBA;MILAB |
Approved |
no |
|
|
Call Number |
Admin @ si @ GBC2015b |
Serial |
2650 |
|
Permanent link to this record |
|
|
|
|
Author |
Gerard Canal; Cecilio Angulo; Sergio Escalera |
|
|
Title |
Gesture based Human Multi-Robot interaction |
Type |
Conference Article |
|
Year |
2015 |
Publication |
IEEE International Joint Conference on Neural Networks IJCNN2015 |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
The emergence of robot applications for nontechnical users implies designing new ways of interaction between robotic platforms and users. The main goal of this work is the development of a gestural interface to interact with robots
in a similar way as humans do, allowing the user to provide information of the task with non-verbal communication. The gesture recognition application has been implemented using the Microsoft’s KinectTM v2 sensor. Hence, a real-time algorithm based on skeletal features is described to deal with both, static
gestures and dynamic ones, being the latter recognized using a weighted Dynamic Time Warping method. The gesture recognition application has been implemented in a multi-robot case.
A NAO humanoid robot is in charge of interacting with the users and respond to the visual signals they produce. Moreover, a wheeled Wifibot robot carries both the sensor and the NAO robot, easing navigation when necessary. A broad set of user tests have been carried out demonstrating that the system is, indeed, a
natural approach to human robot interaction, with a fast response and easy to use, showing high gesture recognition rates. |
|
|
Address |
Killarney; Ireland; July 2015 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IJCNN |
|
|
Notes |
HuPBA;MILAB |
Approved |
no |
|
|
Call Number |
CAE2015a |
Serial |
2651 |
|
Permanent link to this record |
|
|
|
|
Author |
Debora Gil; Guillermo Torres; Carles Sanchez |
|
|
Title |
Transforming radiomic features into radiological words |
Type |
Conference Article |
|
Year |
2023 |
Publication |
IEEE International Symposium on Biomedical Imaging |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
Pòster |
|
|
Address |
Cartagena de Indias; Colombia; April 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ISBI |
|
|
Notes |
IAM |
Approved |
no |
|
|
Call Number |
Admin @ si @ GTS2023 |
Serial |
3952 |
|
Permanent link to this record |
|
|
|
|
Author |
Pau Cano; Debora Gil; Eva Musulen |
|
|
Title |
Towards automatic detection of helicobacter pylori in histological samples of gastric tissue |
Type |
Conference Article |
|
Year |
2023 |
Publication |
IEEE International Symposium on Biomedical Imaging |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
Cartagena de Indias; Colombia; April 2023 |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ISBI |
|
|
Notes |
IAM |
Approved |
no |
|
|
Call Number |
Admin @ si @ CGM2023 |
Serial |
3953 |
|
Permanent link to this record |