TY  - CONF
AU  - Francesc Net
AU  - Marc Folia
AU  - Pep Casals
AU  - Lluis Gomez
A2  - ICDAR
PY  - 2023//
TI  - Transductive Learning for Near-Duplicate Image Detection in Scanned Photo Collections
T2  - LNCS
BT  - 17th International Conference on Document Analysis and Recognition
SP  - 3
EP  - 17
VL  - 14191
KW  - Image deduplication
KW  - Near-duplicate images detection
KW  - Transductive Learning
KW  - Photographic Archives
KW  - Deep Learning
N2  - This paper presents a comparative study of near-duplicate image detection techniques in a real-world use case scenario, where a document management company is commissioned to manually annotate a collection of scanned photographs. Detecting duplicate and near-duplicate photographs can reduce the time spent on manual annotation by archivists. This real use case differs from laboratory settings as the deployment dataset is available in advance, allowing the use of transductive learning. We propose a transductive learning approach that leverages state-of-the-art deep learning architectures such as convolutional neural networks (CNNs) and Vision Transformers (ViTs). Our approach involves pre-training a deep neural network on a large dataset and then fine-tuning the network on the unlabeled target collection with self-supervised learning. The results show that the proposed approach outperforms the baseline methods in the task of near-duplicate image detection in the UKBench and an in-house private dataset.
UR  - https://link.springer.com/chapter/10.1007/978-3-031-41734-4_1
N1  - DAG
ID  - Francesc Net2023
ER  - 