Publicacions CVC -- Query Results

	16–26 of 26 records found matching your query:

	Records
	Author	Swathikiran Sudhakaran; Sergio Escalera; Oswald Lanz
	Title	Gate-Shift-Fuse for Video Action Recognition			Type	Journal Article
	Year	2023	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	45	Issue	9	Pages	10913-10928
	Keywords	Action Recognition; Video Classification; Spatial Gating; Channel Fusion
	Abstract	Convolutional Neural Networks are the de facto models for image recognition. However, 3D CNNs, the straightforward extension of 2D CNNs for video recognition, have not achieved the same success on standard action recognition benchmarks. One of the main reasons for this reduced performance of 3D CNNs is the increased computational complexity, which requires large-scale annotated datasets to train them at scale. 3D kernel factorization approaches have been proposed to reduce the complexity of 3D CNNs. Existing kernel factorization approaches follow hand-designed and hard-wired techniques. In this paper we propose Gate-Shift-Fuse (GSF), a novel spatio-temporal feature extraction module which controls interactions in spatio-temporal decomposition and learns to adaptively route features through time and combine them in a data-dependent manner. GSF leverages grouped spatial gating to decompose the input tensor and channel weighting to fuse the decomposed tensors. GSF can be inserted into existing 2D CNNs to convert them into efficient and high-performing spatio-temporal feature extractors, with negligible parameter and compute overhead. We perform an extensive analysis of GSF using two popular 2D CNN families and achieve state-of-the-art or competitive performance on five standard action recognition benchmarks.
	Address	1 Sept. 2023
	Notes	HuPBA; no mention			Approved	no
	Call Number	Admin @ si @ SEL2023			Serial	3814

	Author	Xialei Liu; Joost Van de Weijer; Andrew Bagdanov
	Title	Exploiting Unlabeled Data in CNNs by Self-Supervised Learning to Rank			Type	Journal Article
	Year	2019	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	41	Issue	8	Pages	1862-1878
	Keywords	Task analysis; Training; Image quality; Visualization; Uncertainty; Labeling; Neural networks; Learning from rankings; Image quality assessment; Crowd counting; Active learning
	Abstract	For many applications the collection of labeled data is expensive and laborious. Exploitation of unlabeled data during training is thus a long-pursued objective of machine learning. Self-supervised learning addresses this by positing an auxiliary task (different, but related to the supervised task) for which data is abundantly available. In this paper, we show how ranking can be used as a proxy task for some regression problems. As another contribution, we propose an efficient backpropagation technique for Siamese networks which prevents the redundant computation introduced by the multi-branch network architecture. We apply our framework to two regression problems: Image Quality Assessment (IQA) and Crowd Counting. For both we show how to automatically generate ranked image sets from unlabeled data. Our results show that networks trained to regress to the ground truth targets for labeled data and to simultaneously learn to rank unlabeled data obtain significantly better, state-of-the-art results for both IQA and crowd counting. In addition, we show that measuring network uncertainty on the self-supervised proxy task is a good measure of informativeness of unlabeled data. This can be used to drive an algorithm for active learning and we show that this reduces labeling effort by up to 50 percent.
	Notes	LAMP; 600.109; 600.106; 600.120			Approved	no
	Call Number	LWB2019			Serial	3267

	Author	Miguel Angel Bautista; Oriol Pujol; Fernando De la Torre; Sergio Escalera
	Title	Error-Correcting Factorization			Type	Journal Article
	Year	2018	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	40	Issue		Pages	2388-2401
	Keywords
	Abstract	Error Correcting Output Codes (ECOC) is a successful technique in multi-class classification, which is a core problem in Pattern Recognition and Machine Learning. A major advantage of ECOC over other methods is that the multi-class problem is decoupled into a set of binary problems that are solved independently. However, the literature defines a general error-correcting capability for ECOCs without analyzing how it distributes among classes, hindering a deeper analysis of pair-wise error-correction. To address these limitations, this paper proposes an Error-Correcting Factorization (ECF) method. Our contribution is fourfold: (I) We propose a novel representation of the error-correction capability, called the design matrix, that enables us to build an ECOC on the basis of allocating correction to pairs of classes. (II) We derive the optimal code length of an ECOC using rank properties of the design matrix. (III) ECF is formulated as a discrete optimization problem, and a relaxed solution is found using an efficient constrained block coordinate descent approach. (IV) Enabled by the flexibility introduced with the design matrix, we propose to allocate the error-correction on classes that are prone to confusion. Experimental results in several databases show that when allocating the error-correction to confusable classes ECF outperforms state-of-the-art approaches.
	ISSN	0162-8828
	Notes	HuPBA; no mention			Approved	no
	Call Number	Admin @ si @ BPT2018			Serial	3015

	Author	Jiaolong Xu; Sebastian Ramos; David Vazquez; Antonio Lopez
	Title	Domain Adaptation of Deformable Part-Based Models			Type	Journal Article
	Year	2014	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	36	Issue	12	Pages	2367-2380
	Keywords	Domain Adaptation; Pedestrian Detection
	Abstract	The accuracy of object classifiers can significantly drop when the training data (source domain) and the application scenario (target domain) have inherent differences. Therefore, adapting the classifiers to the scenario in which they must operate is of paramount importance. We present novel domain adaptation (DA) methods for object detection. As proof of concept, we focus on adapting the state-of-the-art deformable part-based model (DPM) for pedestrian detection. We introduce an adaptive structural SVM (A-SSVM) that adapts a pre-learned classifier between different domains. By taking into account the inherent structure in feature space (e.g., the parts in a DPM), we propose a structure-aware A-SSVM (SA-SSVM). Neither A-SSVM nor SA-SSVM needs to revisit the source-domain training data to perform the adaptation. Rather, a low number of target-domain training examples (e.g., pedestrians) are used. To address the scenario where there are no target-domain annotated samples, we propose a self-adaptive DPM based on a self-paced learning (SPL) strategy and a Gaussian Process Regression (GPR). Two types of adaptation tasks are assessed: from both synthetic pedestrians and general persons (PASCAL VOC) to pedestrians imaged from an on-board camera. Results show that our proposals avoid accuracy drops as high as 15 points when comparing adapted and non-adapted detectors.
	ISSN	0162-8828
	Notes	ADAS; 600.057; 600.054; 601.217; 600.076			Approved	no
	Call Number	ADAS @ adas @ XRV2014b			Serial	2436

	Author	Mohamed Ali Souibgui; Y. Kessentini
	Title	DE-GAN: A Conditional Generative Adversarial Network for Document Enhancement			Type	Journal Article
	Year	2022	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	44	Issue	3	Pages	1180-1191
	Keywords
	Abstract	Documents often exhibit various forms of degradation, which make them hard to read and substantially deteriorate the performance of an OCR system. In this paper, we propose an effective end-to-end framework named Document Enhancement Generative Adversarial Networks (DE-GAN) that uses conditional GANs (cGANs) to restore severely degraded document images. To the best of our knowledge, this practice has not been studied within the context of generative adversarial deep networks. We demonstrate that, in different tasks (document clean up, binarization, deblurring and watermark removal), DE-GAN can produce a high-quality enhanced version of the degraded document. In addition, our approach provides consistent improvements compared to state-of-the-art methods over the widely used DIBCO 2013, DIBCO 2017 and H-DIBCO 2018 datasets, proving its ability to restore a degraded document image to its ideal condition. The results obtained on a wide variety of degradations reveal the flexibility of the proposed model to be exploited in other document enhancement problems.
	Address	1 March 2022
	Notes	DAG; 602.230; 600.121; 600.140			Approved	no
	Call Number	Admin @ si @ SoK2022			Serial	3454

	Author	Lei Kang; Pau Riba; Marcal Rusinol; Alicia Fornes; Mauricio Villegas
	Title	Content and Style Aware Generation of Text-line Images for Handwriting Recognition			Type	Journal Article
	Year	2021	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume		Issue		Pages
	Keywords
	Abstract	Handwritten Text Recognition has achieved an impressive performance in public benchmarks. However, due to the high inter- and intra-class variability between handwriting styles, such recognizers need to be trained using huge volumes of manually labeled training data. To alleviate this labor-consuming problem, synthetic data produced with TrueType fonts has often been used in the training loop to gain volume and augment the handwriting style variability. However, there is a significant style bias between synthetic and real data which hinders the improvement of recognition performance. To deal with such limitations, we propose a generative method for handwritten text-line images, which is conditioned on both visual appearance and textual content. Our method is able to produce long text-line samples with diverse handwriting styles. Once properly trained, our method can also be adapted to new target data by only accessing unlabeled text-line images to mimic handwritten styles and produce images with any textual content. Extensive experiments use the generated samples to boost Handwritten Text Recognition performance. Both qualitative and quantitative results demonstrate that the proposed approach outperforms the current state of the art.
	Notes	DAG; 600.140; 600.121			Approved	no
	Call Number	Admin @ si @ KRR2021			Serial	3612

	Author	Arash Akbarinia; C. Alejandro Parraga
	Title	Colour Constancy Beyond the Classical Receptive Field			Type	Journal Article
	Year	2018	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	40	Issue	9	Pages	2081-2094
	Keywords
	Abstract	The problem of removing illuminant variations to preserve the colours of objects (colour constancy) has already been solved by the human brain using mechanisms that rely largely on centre-surround computations of local contrast. In this paper we adopt some of these biological solutions described by long-known physiological findings into a simple, fully automatic, functional model (termed Adaptive Surround Modulation or ASM). In ASM, the size of a visual neuron's receptive field (RF) as well as the relationship with its surround varies according to the local contrast within the stimulus, which in turn determines the nature of the centre-surround normalisation of cortical neurons higher up in the processing chain. We modelled colour constancy by means of two overlapping asymmetric Gaussian kernels whose sizes are adapted based on the contrast of the surround pixels, resembling the change of RF size. We simulated the contrast-dependent surround modulation by weighting the contribution of each Gaussian according to the centre-surround contrast. In the end, we obtained an estimation of the illuminant from the set of the most activated RFs' outputs. Our results on three single-illuminant and one multi-illuminant benchmark datasets show that ASM is highly competitive against the state-of-the-art and it even outperforms learning-based algorithms in one case. Moreover, the robustness of our model is more tangible if we consider that our results were obtained using the same parameters for all datasets, that is, mimicking how the human visual system operates. These results might provide an insight into how dynamical adaptation mechanisms contribute to making objects' colours appear constant to us.
	Notes	NEUROBIT; 600.068; 600.072			Approved	no
	Call Number	Admin @ si @ AkP2018a			Serial	2990

	Author	Arjan Gijsenij; Theo Gevers
	Title	Color Constancy Using Natural Image Statistics and Scene Semantics			Type	Journal Article
	Year	2011	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	33	Issue	4	Pages	687-698
	Keywords
	Abstract	Existing color constancy methods are all based on specific assumptions such as the spatial and spectral characteristics of images. As a consequence, no algorithm can be considered as universal. However, with the large variety of available methods, the question is how to select the method that performs best for a specific image. To achieve selection and combining of color constancy algorithms, in this paper natural image statistics are used to identify the most important characteristics of color images. Then, based on these image characteristics, the proper color constancy algorithm (or best combination of algorithms) is selected for a specific image. To capture the image characteristics, the Weibull parameterization (e.g., grain size and contrast) is used. It is shown that the Weibull parameterization is related to the image attributes to which the used color constancy methods are sensitive. An MoG-classifier is used to learn the correlation and weighting between the Weibull-parameters and the image attributes (number of edges, amount of texture, and SNR). The output of the classifier is the selection of the best performing color constancy method for a certain image. Experimental results show a large improvement over state-of-the-art single algorithms. On a data set consisting of more than 11,000 images, an increase in color constancy performance up to 20 percent (median angular error) can be obtained compared to the best-performing single algorithm. Further, it is shown that for certain scene categories, one specific color constancy algorithm can be used instead of the classifier considering several algorithms.
	ISSN	0162-8828
	Notes	ISE			Approved	no
	Call Number	Admin @ si @ GiG2011			Serial	1724

	Author	Marc Masana; Xialei Liu; Bartlomiej Twardowski; Mikel Menta; Andrew Bagdanov; Joost Van de Weijer
	Title	Class-incremental learning: survey and performance evaluation			Type	Journal Article
	Year	2022	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume		Issue		Pages
	Keywords
	Abstract	For future learning systems incremental learning is desirable, because it allows for: efficient resource usage by eliminating the need to retrain from scratch at the arrival of new data; reduced memory usage by preventing or limiting the amount of data required to be stored -- also important when privacy limitations are imposed; and learning that more closely resembles human learning. The main challenge for incremental learning is catastrophic forgetting, which refers to the precipitous drop in performance on previously learned tasks after learning a new one. Incremental learning of deep neural networks has seen explosive growth in recent years. Initial work focused on task incremental learning, where a task-ID is provided at inference time. Recently we have seen a shift towards class-incremental learning where the learner must classify at inference time between all classes seen in previous tasks without recourse to a task-ID. In this paper, we provide a complete survey of existing methods for incremental learning, and in particular we perform an extensive experimental evaluation on twelve class-incremental methods. We consider several new experimental scenarios, including a comparison of class-incremental methods on multiple large-scale datasets, investigation into small and large domain shifts, and comparison on various network architectures.
	Notes	LAMP; 600.120			Approved	no
	Call Number	Admin @ si @ MLT2022			Serial	3538

	Author	Albert Gordo; Florent Perronnin; Yunchao Gong; Svetlana Lazebnik
	Title	Asymmetric Distances for Binary Embeddings			Type	Journal Article
	Year	2014	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	36	Issue	1	Pages	33-47
	Keywords
	Abstract	In large-scale query-by-example retrieval, embedding image signatures in a binary space offers two benefits: data compression and search efficiency. While most embedding algorithms binarize both query and database signatures, it has been noted that this is not strictly a requirement. Indeed, asymmetric schemes which binarize the database signatures but not the query still enjoy the same two benefits but may provide superior accuracy. In this work, we propose two general asymmetric distances which are applicable to a wide variety of embedding techniques including Locality Sensitive Hashing (LSH), Locality Sensitive Binary Codes (LSBC), Spectral Hashing (SH), PCA Embedding (PCAE), PCA Embedding with random rotations (PCAE-RR), and PCA Embedding with iterative quantization (PCAE-ITQ). We experiment on four public benchmarks containing up to 1M images and show that the proposed asymmetric distances consistently lead to large improvements over the symmetric Hamming distance for all binary embedding techniques.
	ISSN	0162-8828
	Notes	DAG; 600.045; 605.203; 600.077			Approved	no
	Call Number	Admin @ si @ GPG2014			Serial	2272

	Author	E. Provenzi; Carlo Gatta; M. Fierro; A. Rizzi
	Title	A Spatially Variant White-Patch and Gray-World Method for Color Image Enhancement Driven by Local Contrast			Type	Journal Article
	Year	2008	Publication	IEEE Transactions on Pattern Analysis and Machine Intelligence	Abbreviated Journal	TPAMI
	Volume	30	Issue	10	Pages	1757-1770
	Keywords
	Abstract
	Notes	MILAB			Approved	no
	Call Number	BCNPCL @ bcnpcl @ PGF2008			Serial	1001