|
Mikhail Mozerov, Fei Yang, & Joost Van de Weijer. (2019). Sparse Data Interpolation Using the Geodesic Distance Affinity Space. SPL - IEEE Signal Processing Letters, 26(6), 943–947.
Abstract: In this letter, we adapt the geodesic distance-based recursive filter to the sparse data interpolation problem. The proposed technique is general and can be easily applied to any kind of sparse data. We demonstrate its superiority over other interpolation techniques in three experiments for qualitative and quantitative evaluation. In addition, we compare our method with the popular interpolation algorithm presented in the paper on EpicFlow optical flow, which is intuitively motivated by a similar geodesic distance principle. The comparison shows that our algorithm is more accurate and considerably faster than the EpicFlow interpolation technique.
|
|
|
AN Ruchai, VI Kober, KA Dorofeev, VN Karnaukhov, & Mikhail Mozerov. (2021). Classification of breast abnormalities using a deep convolutional neural network and transfer learning. Journal of Communications Technology and Electronics, 66(6), 778–783.
Abstract: A new algorithm for classification of breast pathologies in digital mammography using a convolutional neural network and transfer learning is proposed. The following pretrained neural networks were chosen: MobileNetV2, InceptionResNetV2, Xception, and ResNetV2. All mammographic images were pre-processed to improve classification reliability. Transfer training was carried out using additional data augmentation and fine-tuning. The performance of the proposed algorithm for classification of breast pathologies in terms of accuracy on real data is discussed and compared with that of state-of-the-art algorithms on the available MIAS database.
|
|
|
Lorenzo Seidenari, Giuseppe Serra, Andrew Bagdanov, & Alberto del Bimbo. (2014). Local pyramidal descriptors for image recognition. TPAMI - IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(5), 1033–1040.
Abstract: In this paper we present a novel method to improve the flexibility of descriptor matching for image recognition by using local multiresolution
pyramids in feature space. We propose that image patches be represented at multiple levels of descriptor detail and that these levels be defined in terms of local spatial pooling resolution. Preserving multiple levels of detail in local descriptors is a way of hedging one’s bets on which levels will most relevant for matching during learning and recognition. We introduce the Pyramid SIFT (P-SIFT) descriptor and show that its use in four state-of-the-art image recognition pipelines improves accuracy and yields state-of-the-art results. Our technique is applicable independently of spatial pyramid matching and we show that spatial pyramids can be combined with local pyramids to obtain
further improvement.We achieve state-of-the-art results on Caltech-101
(80.1%) and Caltech-256 (52.6%) when compared to other approaches based on SIFT features over intensity images. Our technique is efficient and is extremely easy to integrate into image recognition pipelines.
Keywords: Object categorization; local features; kernel methods
|
|
|
Weiqing Min, Shuqiang Jiang, Jitao Sang, Huayang Wang, Xinda Liu, & Luis Herranz. (2017). Being a Supercook: Joint Food Attributes and Multimodal Content Modeling for Recipe Retrieval and Exploration. TMM - IEEE Transactions on Multimedia, 19(5), 1100–1113.
Abstract: This paper considers the problem of recipe-oriented image-ingredient correlation learning with multi-attributes for recipe retrieval and exploration. Existing methods mainly focus on food visual information for recognition while we model visual information, textual content (e.g., ingredients), and attributes (e.g., cuisine and course) together to solve extended recipe-oriented problems, such as multimodal cuisine classification and attribute-enhanced food image retrieval. As a solution, we propose a multimodal multitask deep belief network (M3TDBN) to learn joint image-ingredient representation regularized by different attributes. By grouping ingredients into visible ingredients (which are visible in the food image, e.g., “chicken” and “mushroom”) and nonvisible ingredients (e.g., “salt” and “oil”), M3TDBN is capable of learning both midlevel visual representation between images and visible ingredients and nonvisual representation. Furthermore, in order to utilize different attributes to improve the intermodality correlation, M3TDBN incorporates multitask learning to make different attributes collaborate each other. Based on the proposed M3TDBN, we exploit the derived deep features and the discovered correlations for three extended novel applications: 1) multimodal cuisine classification; 2) attribute-augmented cross-modal recipe image retrieval; and 3) ingredient and attribute inference from food images. The proposed approach is evaluated on the constructed Yummly dataset and the evaluation results have validated the effectiveness of the proposed approach.
|
|
|
Simeon Petkov, Xavier Carrillo, Petia Radeva, & Carlo Gatta. (2014). Diaphragm border detection in coronary X-ray angiographies: New method and applications. CMIG - Computerized Medical Imaging and Graphics, 38(4), 296–305.
Abstract: X-ray angiography is widely used in cardiac disease diagnosis during or prior to intravascular interventions. The diaphragm motion and the heart beating induce gray-level changes, which are one of the main obstacles in quantitative analysis of myocardial perfusion. In this paper we focus on detecting the diaphragm border in both single images or whole X-ray angiography sequences. We show that the proposed method outperforms state of the art approaches. We extend a previous publicly available data set, adding new ground truth data. We also compose another set of more challenging images, thus having two separate data sets of increasing difficulty. Finally, we show three applications of our method: (1) a strategy to reduce false positives in vessel enhanced images; (2) a digital diaphragm removal algorithm; (3) an improvement in Myocardial Blush Grade semi-automatic estimation.
|
|