|
Records |
Links |
|
Author |
Patricia Marquez; Debora Gil ; Aura Hernandez-Sabate |
|
|
Title |
Error Analysis for Lucas-Kanade Based Schemes |
Type |
Conference Article |
|
Year |
2012 |
Publication |
9th International Conference on Image Analysis and Recognition |
Abbreviated Journal |
|
|
|
Volume |
7324 |
Issue |
I |
Pages |
184-191 |
|
|
Keywords |
Optical flow, Confidence measure, Lucas-Kanade, Cardiac Magnetic Resonance |
|
|
Abstract |
Optical flow is a valuable tool for motion analysis in medical imaging sequences. A reliable application requires determining the accuracy of the computed optical flow. This is a main challenge given the absence of ground truth in medical sequences. This paper presents an error analysis of Lucas-Kanade schemes in terms of intrinsic design errors and numerical stability of the algorithm. Our analysis provides a confidence measure that is naturally correlated to the accuracy of the flow field. Our experiments show the higher predictive value of our confidence measure compared to existing measures. |
|
|
Address |
Aveiro, Portugal |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
Springer-Verlag Berlin Heidelberg |
Place of Publication |
|
Editor |
|
|
|
Language |
english |
Summary Language |
|
Original Title |
|
|
|
Series Editor |
Campilho, Aurélio and Kamel, Mohamed |
Series Title |
Lecture Notes in Computer Science |
Abbreviated Series Title |
LNCS |
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0302-9743 |
ISBN |
978-3-642-31294-6 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICIAR |
|
|
Notes |
IAM |
Approved |
no |
|
|
Call Number |
IAM @ iam @ MGH2012a |
Serial |
1899 |
|
Permanent link to this record |
|
|
|
|
Author |
Partha Pratim Roy; Umapada Pal; Josep Llados; Mathieu Nicolas Delalandre |
|
|
Title |
Multi-oriented touching text character segmentation in graphical documents using dynamic programming |
Type |
Journal Article |
|
Year |
2012 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
45 |
Issue |
5 |
Pages |
1972-1983 |
|
|
Keywords |
|
|
|
Abstract |
2,292 JCR
The touching character segmentation problem becomes complex when touching strings are multi-oriented. Moreover in graphical documents sometimes characters in a single-touching string have different orientations. Segmentation of such complex touching is more challenging. In this paper, we present a scheme towards the segmentation of English multi-oriented touching strings into individual characters. When two or more characters touch, they generate a big cavity region in the background portion. Based on the convex hull information, at first, we use this background information to find some initial points for segmentation of a touching string into possible primitives (a primitive consists of a single character or part of a character). Next, the primitives are merged to get optimum segmentation. A dynamic programming algorithm is applied for this purpose using the total likelihood of characters as the objective function. A SVM classifier is used to find the likelihood of a character. To consider multi-oriented touching strings the features used in the SVM are invariant to character orientation. Experiments were performed in different databases of real and synthetic touching characters and the results show that the method is efficient in segmenting touching characters of arbitrary orientations and sizes. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0031-3203 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ RPL2012a |
Serial |
2133 |
|
Permanent link to this record |
|
|
|
|
Author |
Partha Pratim Roy; Umapada Pal; Josep Llados |
|
|
Title |
Text line extraction in graphical documents using background and foreground |
Type |
Journal Article |
|
Year |
2012 |
Publication |
International Journal on Document Analysis and Recognition |
Abbreviated Journal |
IJDAR |
|
|
Volume |
15 |
Issue |
3 |
Pages |
227-241 |
|
|
Keywords |
|
|
|
Abstract |
0,405 JCR
In graphical documents (e.g., maps, engineering drawings), artistic documents etc., the text lines are annotated in multiple orientations or curvilinear way to illustrate different locations or symbols. For the optical character recognition of such documents, individual text lines from the documents need to be extracted. In this paper, we propose a novel method to segment such text lines and the method is based on the foreground and background information of the text components. To effectively utilize the background information, a water reservoir concept is used here. In the proposed scheme, at first, individual components are detected and grouped into character clusters in a hierarchical way using size and positional information. Next, the clusters are extended in two extreme sides to determine potential candidate regions. Finally, with the help of these candidate regions,
individual lines are extracted. The experimental results are presented on different datasets of graphical documents, camera-based warped documents, noisy images containing seals, etc. The results demonstrate that our approach is robust and invariant to size and orientation of the text lines present in
the document. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1433-2833 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ RPL2012b |
Serial |
2134 |
|
Permanent link to this record |
|
|
|
|
Author |
Onur Ferhat |
|
|
Title |
Eye-Tracking with Webcam-Based Setups: Implementation of a Real-Time System and an Analysis of Factors Affecting Performance |
Type |
Report |
|
Year |
2012 |
Publication |
CVC Technical Report |
Abbreviated Journal |
|
|
|
Volume |
172 |
Issue |
|
Pages |
|
|
|
Keywords |
Computer vision, eye-tracking, gaussian process, feature selection, optical flow |
|
|
Abstract |
In the recent years commercial eye-tracking hardware has become more common, with the introduction of new models from several brands that have better performance and easier setup procedures. A cause and at the same time a result of this phenomenon is the popularity of eye-tracking research directed at marketing, accessibility and usability, among others.
One problem with these hardware components is scalability, because both the price and the necessary expertise to operate them makes it practically impossible in the large scale. In this work, we analyze the feasibility of a software eye-tracking system based on a single, ordinary webcam. Our aim is to discover the limits of such a system and to see whether it provides acceptable performances.
The significance of this setup is that it is the most common setup found in consumer environments, off-the-shelf electronic devices such as laptops, mobile phones and tablet computers. As no special equipment such as infrared lights, mirrors or zoom lenses are used; setting up and calibrating the system is easier compared to other approaches using these components.
Our work is based on the open source application Opengazer, which provides a good starting point for our contributions. We propose several improvements in order to push the system's performance further and make it feasible as a robust, real-time device. Then we carry out an elaborate experiment involving 18 human subjects and 4 different system setups. Finally, we give an analysis of the results and discuss the effects of setup changes, subject differences and modifications in the software. |
|
|
Address |
Bellaterra |
|
|
Corporate Author |
Computer Vision Center |
Thesis |
Master's thesis |
|
|
Publisher |
|
Place of Publication |
|
Editor |
Fernando Vilariño |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
MV |
Approved |
no |
|
|
Call Number |
Admin @ si @ Fer2012; IAM @ iam @ Fer2012 |
Serial |
2165 |
|
Permanent link to this record |
|
|
|
|
Author |
Olivier Penacchio; Laura Dempere-Marco; Xavier Otazu |
|
|
Title |
Switching off brightness induction through induction-reversed images |
Type |
Abstract |
|
Year |
2012 |
Publication |
Perception |
Abbreviated Journal |
PER |
|
|
Volume |
41 |
Issue |
|
Pages |
208 |
|
|
Keywords |
|
|
|
Abstract |
Brightness induction is the modulation of the perceived intensity of an
area by the luminance of surrounding areas. Although V1 is traditionally regarded as
an area mostly responsive to retinal information, neurophysiological evidence
suggests that it may explicitly represent brightness information. In this work, we
investigate possible neural mechanisms underlying brightness induction. To this end,
we consider the model by Z Li (1999 Computation and Neural Systems10187-212)
which is constrained by neurophysiological data and focuses on the part of V1
responsible for contextual influences. This model, which has proven to account for
phenomena such as contour detection and preattentive segmentation, shares with
brightness induction the relevant effect of contextual influences. Importantly, the
input to our network model derives from a complete multiscale and multiorientation
wavelet decomposition, which makes it possible to recover an image reflecting the
perceived luminance and successfully accounts for well known psychophysical
effects for both static and dynamic contexts. By further considering inverse problem
techniques we define induction-reversed images: given a target image, we build an
image whose perceived luminance matches the actual luminance of the original
stimulus, thus effectively canceling out brightness induction effects. We suggest that
induction-reversed images may help remove undesired perceptual effects and can
find potential applications in fields such as radiological image interpretation |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
Admin @ si @ PDO2012a |
Serial |
2180 |
|
Permanent link to this record |
|
|
|
|
Author |
Olivier Penacchio; Laura Dempere-Marco; Xavier Otazu |
|
|
Title |
A Neurodynamical Model Of Brightness Induction In V1 Following Static And Dynamic Contextual Influences |
Type |
Abstract |
|
Year |
2012 |
Publication |
8th Federation of European Neurosciences |
Abbreviated Journal |
|
|
|
Volume |
6 |
Issue |
|
Pages |
63-64 |
|
|
Keywords |
|
|
|
Abstract |
Brightness induction is the modulation of the perceived intensity of an area by the luminance of surrounding areas. Although striate cortex is traditionally regarded as an area mostly responsive to ensory (i.e. retinal) information,
neurophysiological evidence suggests that perceived brightness information mightbe explicitly represented in V1.
Such evidence has been observed both in anesthetised cats where neuronal response modulations have been found to follow luminance changes outside the receptive felds and in human fMRI measurements. In this work, possible neural mechanisms that ofer a plausible explanation for such phenomenon are investigated. To this end, we consider the model proposed by Z.Li (Li, Network:Comput. Neural Syst., 10 (1999)) which is based on neurophysiological evidence and focuses on the part of V1 responsible for contextual infuences, i.e. layer 2-3 pyramidal cells, interneurons, and horizontal intracortical connections. This model has reproduced other phenomena such as contour detection and preattentive segmentation, which share with brightness induction the relevant efect of contextual infuences. We have extended the original model such that the input to the network is obtained from a complete multiscale and multiorientation wavelet decomposition, thereby allowing the recovery of an image refecting the perceived intensity. The proposed model successfully accounts for well known psychophysical efects for static contexts (among them: the White's and modifed White's efects, the Todorovic, Chevreul, achromatic ring patterns, and grating induction efects) and also for brigthness induction in dynamic contexts defned by modulating the luminance of surrounding areas (e.g. the brightness of a static central area is perceived to vary in antiphase to the sinusoidal luminance changes of its surroundings). This work thus suggests that intra-cortical interactions in V1 could partially explain perceptual brightness induction efects and reveals how a common general architecture may account for several different fundamental processes emerging early in the visual processing pathway. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
FENS |
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
Admin @ si @ PDO2012b |
Serial |
2181 |
|
Permanent link to this record |
|
|
|
|
Author |
Nuria Cirera |
|
|
Title |
Recognition of Handwritten Historical Documents |
Type |
Report |
|
Year |
2012 |
Publication |
CVC Technical Report |
Abbreviated Journal |
|
|
|
Volume |
174 |
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
|
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Master's thesis |
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
DAG |
Approved |
no |
|
|
Call Number |
Admin @ si @ Cir2012 |
Serial |
2416 |
|
Permanent link to this record |
|
|
|
|
Author |
Noha Elfiky; Jordi Gonzalez; Xavier Roca |
|
|
Title |
Compact and Adaptive Spatial Pyramids for Scene Recognition |
Type |
Journal Article |
|
Year |
2012 |
Publication |
Image and Vision Computing |
Abbreviated Journal |
IMAVIS |
|
|
Volume |
30 |
Issue |
8 |
Pages |
492–500 |
|
|
Keywords |
|
|
|
Abstract |
Most successful approaches on scenerecognition tend to efficiently combine global image features with spatial local appearance and shape cues. On the other hand, less attention has been devoted for studying spatial texture features within scenes. Our method is based on the insight that scenes can be seen as a composition of micro-texture patterns. This paper analyzes the role of texture along with its spatial layout for scenerecognition. However, one main drawback of the resulting spatial representation is its huge dimensionality. Hence, we propose a technique that addresses this problem by presenting a compactSpatialPyramid (SP) representation. The basis of our compact representation, namely, CompactAdaptiveSpatialPyramid (CASP) consists of a two-stages compression strategy. This strategy is based on the Agglomerative Information Bottleneck (AIB) theory for (i) compressing the least informative SP features, and, (ii) automatically learning the most appropriate shape for each category. Our method exceeds the state-of-the-art results on several challenging scenerecognition data sets. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ EGR2012 |
Serial |
2004 |
|
Permanent link to this record |
|
|
|
|
Author |
Noha Elfiky; Fahad Shahbaz Khan; Joost Van de Weijer; Jordi Gonzalez |
|
|
Title |
Discriminative Compact Pyramids for Object and Scene Recognition |
Type |
Journal Article |
|
Year |
2012 |
Publication |
Pattern Recognition |
Abbreviated Journal |
PR |
|
|
Volume |
45 |
Issue |
4 |
Pages |
1627-1636 |
|
|
Keywords |
|
|
|
Abstract |
Spatial pyramids have been successfully applied to incorporating spatial information into bag-of-words based image representation. However, a major drawback is that it leads to high dimensional image representations. In this paper, we present a novel framework for obtaining compact pyramid representation. First, we investigate the usage of the divisive information theoretic feature clustering (DITC) algorithm in creating a compact pyramid representation. In many cases this method allows us to reduce the size of a high dimensional pyramid representation up to an order of magnitude with little or no loss in accuracy. Furthermore, comparison to clustering based on agglomerative information bottleneck (AIB) shows that our method obtains superior results at significantly lower computational costs. Moreover, we investigate the optimal combination of multiple features in the context of our compact pyramid representation. Finally, experiments show that the method can obtain state-of-the-art results on several challenging data sets. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0031-3203 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE; CAT;CIC |
Approved |
no |
|
|
Call Number |
Admin @ si @ EKW2012 |
Serial |
1807 |
|
Permanent link to this record |
|
|
|
|
Author |
Noha Elfiky |
|
|
Title |
Compact, Adaptive and Discriminative Spatial Pyramids for Improved Object and Scene Classification |
Type |
Book Whole |
|
Year |
2012 |
Publication |
PhD Thesis, Universitat Autonoma de Barcelona-CVC |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
|
|
|
Keywords |
|
|
|
Abstract |
The release of challenging datasets with a vast number of images, requires the development of efficient image representations and algorithms which are able to manipulate these large-scale datasets efficiently. Nowadays the Bag-of-Words (BoW) is the most successful approach in the context of object and scene classification tasks. However, its main drawback is the absence of the important spatial information. Spatial pyramids (SP) have been successfully applied to incorporate spatial information into BoW-based image representation. Observing the remarkable performance of spatial pyramids, their growing number of applications to a broad range of vision problems, and finally its geometry inclusion, a question can be asked what are the limits of spatial pyramids. Within the SP framework, the optimal way for obtaining an image spatial representation, which is able to cope with it’s most foremost shortcomings, concretely, it’s high dimensionality and the rigidity of the resulting image representation, still remains an active research domain. In summary, the main concern of this thesis is to search for the limits of spatial pyramids and try to figure out solutions for them. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
Ph.D. thesis |
|
|
Publisher |
Ediciones Graficas Rey |
Place of Publication |
|
Editor |
Jordi Gonzalez;Xavier Roca |
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
ISE |
Approved |
no |
|
|
Call Number |
Admin @ si @ Elf2012 |
Serial |
2202 |
|
Permanent link to this record |
|
|
|
|
Author |
Naveen Onkarappa; Sujay M. Veerabhadrappa; Angel Sappa |
|
|
Title |
Optical Flow in Onboard Applications: A Study on the Relationship Between Accuracy and Scene Texture |
Type |
Conference Article |
|
Year |
2012 |
Publication |
4th International Conference on Signal and Image Processing |
Abbreviated Journal |
|
|
|
Volume |
221 |
Issue |
|
Pages |
257-267 |
|
|
Keywords |
|
|
|
Abstract |
Optical flow has got a major role in making advanced driver assistance systems (ADAS) a reality. ADAS applications are expected to perform efficiently in all kinds of environments, those are highly probable, that one can drive the vehicle in different kinds of roads, times and seasons. In this work, we study the relationship of optical flow with different roads, that is by analyzing optical flow accuracy on different road textures. Texture measures such as TeX , TeX and TeX are evaluated for this purpose. Further, the relation of regularization weight to the flow accuracy in the presence of different textures is also analyzed. Additionally, we present a framework to generate synthetic sequences of different textures in ADAS scenarios with ground-truth optical flow. |
|
|
Address |
Coimbatore, India |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1876-1100 |
ISBN |
978-81-322-0996-6 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
ICSIP |
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ OVS2012 |
Serial |
2356 |
|
Permanent link to this record |
|
|
|
|
Author |
Naveen Onkarappa; Angel Sappa |
|
|
Title |
An Empirical Study on Optical Flow Accuracy Depending on Vehicle Speed |
Type |
Conference Article |
|
Year |
2012 |
Publication |
IEEE Intelligent Vehicles Symposium |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
1138-1143 |
|
|
Keywords |
|
|
|
Abstract |
Driver assistance and safety systems are getting attention nowadays towards automatic navigation and safety. Optical flow as a motion estimation technique has got major roll in making these systems a reality. Towards this, in the current paper, the suitability of polar representation for optical flow estimation in such systems is demonstrated. Furthermore, the influence of individual regularization terms on the accuracy of optical flow on image sequences of different speeds is empirically evaluated. Also a new synthetic dataset of image sequences with different speeds is generated along with the ground-truth optical flow. |
|
|
Address |
Alcalá de Henares |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
IEEE Xplore |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1931-0587 |
ISBN |
978-1-4673-2119-8 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
IV |
|
|
Notes |
ADAS |
Approved |
no |
|
|
Call Number |
Admin @ si @ NaS2012 |
Serial |
2020 |
|
Permanent link to this record |
|
|
|
|
Author |
Naila Murray; Sandra Skaff; Luca Marchesotti; Florent Perronnin |
|
|
Title |
Towards automatic and flexible concept transfer |
Type |
Journal Article |
|
Year |
2012 |
Publication |
Computers and Graphics |
Abbreviated Journal |
CG |
|
|
Volume |
36 |
Issue |
6 |
Pages |
622–634 |
|
|
Keywords |
|
|
|
Abstract |
This paper introduces a novel approach to automatic, yet flexible, image concepttransfer; examples of concepts are “romantic”, “earthy”, and “luscious”. The presented method modifies the color content of an input image given only a concept specified by a user in natural language, thereby requiring minimal user input. This method is particularly useful for users who are aware of the message they wish to convey in the transferred image while being unsure of the color combination needed to achieve the corresponding transfer. Our framework is flexible for two reasons. First, the user may select one of two modalities to map input image chromaticities to target concept chromaticities depending on the level of photo-realism required. Second, the user may adjust the intensity level of the concepttransfer to his/her liking with a single parameter. The proposed method uses a convex clustering algorithm, with a novel pruning mechanism, to automatically set the complexity of models of chromatic content. Results show that our approach yields transferred images which effectively represent concepts as confirmed by a user study. |
|
|
Address |
|
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
0097-8493 |
ISBN |
|
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
|
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
Admin @ si @ MSM2012 |
Serial |
2002 |
|
Permanent link to this record |
|
|
|
|
Author |
Naila Murray; Luca Marchesotti; Florent Perronnin |
|
|
Title |
AVA: A Large-Scale Database for Aesthetic Visual Analysis |
Type |
Conference Article |
|
Year |
2012 |
Publication |
25th IEEE Conference on Computer Vision and Pattern Recognition |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
2408-2415 |
|
|
Keywords |
|
|
|
Abstract |
With the ever-expanding volume of visual content available, the ability to organize and navigate such content by aesthetic preference is becoming increasingly important. While still in its nascent stage, research into computational models of aesthetic preference already shows great potential. However, to advance research, realistic, diverse and challenging databases are needed. To this end, we introduce a new large-scale database for conducting Aesthetic Visual Analysis: AVA. It contains over 250,000 images along with a rich variety of meta-data including a large number of aesthetic scores for each image, semantic labels for over 60 categories as well as labels related to photographic style. We show the advantages of AVA with respect to existing databases in terms of scale, diversity, and heterogeneity of annotations. We then describe several key insights into aesthetic preference afforded by AVA. Finally, we demonstrate, through three applications, how the large scale of AVA can be leveraged to improve performance on existing preference tasks |
|
|
Address |
Providence, Rhode Islan |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
IEEE Xplore |
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
1063-6919 |
ISBN |
978-1-4673-1226-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
CVPR |
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
Admin @ si @ MMP2012a |
Serial |
2025 |
|
Permanent link to this record |
|
|
|
|
Author |
Naila Murray; Luca Marchesotti; Florent Perronnin |
|
|
Title |
Learning to Rank Images using Semantic and Aesthetic Labels |
Type |
Conference Article |
|
Year |
2012 |
Publication |
23rd British Machine Vision Conference |
Abbreviated Journal |
|
|
|
Volume |
|
Issue |
|
Pages |
110.1-110.10 |
|
|
Keywords |
|
|
|
Abstract |
Most works on image retrieval from text queries have addressed the problem of retrieving semantically relevant images. However, the ability to assess the aesthetic quality of an image is an increasingly important differentiating factor for search engines. In this work, given a semantic query, we are interested in retrieving images which are semantically relevant and score highly in terms of aesthetics/visual quality. We use large-margin classifiers and rankers to learn statistical models capable of ordering images based on the aesthetic and semantic information. In particular, we compare two families of approaches: while the first one attempts to learn a single ranker which takes into account both semantic and aesthetic information, the second one learns separate semantic and aesthetic models. We carry out a quantitative and qualitative evaluation on a recently-published large-scale dataset and we show that the second family of techniques significantly outperforms the first one. |
|
|
Address |
Guildford, London |
|
|
Corporate Author |
|
Thesis |
|
|
|
Publisher |
|
Place of Publication |
|
Editor |
|
|
|
Language |
|
Summary Language |
|
Original Title |
|
|
|
Series Editor |
|
Series Title |
|
Abbreviated Series Title |
|
|
|
Series Volume |
|
Series Issue |
|
Edition |
|
|
|
ISSN |
|
ISBN |
1-901725-46-4 |
Medium |
|
|
|
Area |
|
Expedition |
|
Conference |
BMVC |
|
|
Notes |
CIC |
Approved |
no |
|
|
Call Number |
Admin @ si @ MMP2012b |
Serial |
2027 |
|
Permanent link to this record |