Conditional image retrieval (CIR), which retrieves images using a query image together with user-specified conditions, is essential in computer vision research for efficient image search and automated image analysis. Existing approaches, such as composed image retrieval (CoIR) methods, have been actively studied; however, they require either a triplet dataset or richly annotated image-text pairs, which are expensive to obtain. In this work, we demonstrate that CIR with image-level concepts can be achieved using an inverse mapping approach that exploits the model's inductive knowledge. Our proposed CIR method, called Backward Search, updates the query embedding to conform to the condition. Specifically, the embedding of the query image is updated by predicting label probabilities and minimizing their difference from the condition label. This enables CIR with image-level concepts while preserving the context of the query. We introduce the Backward Search method for both single- and multi-conditional image retrieval, and we further reduce computation time through knowledge distillation. We conduct experiments on the WikiArt, aPY, and CUB benchmark datasets. The proposed method achieves an average mAP@10 of 0.541 across the datasets, a marked improvement over the CoIR methods in our comparative experiments. Furthermore, by employing knowledge distillation with the Backward Search model as the teacher, the student model achieves up to a 160-fold reduction in computation time with only a slight decrease in performance. Our implementation is available at https://github.com/dhlee-work/BackwardSearch.
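The embedding update described above can be sketched as gradient descent on the query embedding against a label classifier. This is a minimal illustration, not the authors' implementation: the linear label head `W`, the learning rate, and the step count are all stand-in assumptions.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

def backward_search(z, W, target, steps=300, lr=0.3):
    """Update the query embedding z so that a (stand-in) linear label
    classifier W assigns high probability to the target condition label.
    Starting the descent from the original embedding keeps the result
    close to the query, loosely preserving its context."""
    z = z.copy()
    onehot = np.eye(W.shape[0])[target]
    for _ in range(steps):
        p = softmax(W @ z)              # predicted label probabilities
        z -= lr * (W.T @ (p - onehot))  # gradient of cross-entropy w.r.t. z
    return z
```

The updated embedding would then be used as the retrieval query, e.g. by nearest-neighbor search over gallery embeddings.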
- The Backward Search method enables single- and multi-conditional image retrieval by updating the query image's embedding to align with specified conditions, preserving the query's context.
- Knowledge distillation reduces computation time, enhancing retrieval efficiency.
- Experiments were conducted on the WikiArt, aPY, and CUB benchmark datasets.
- The method achieved an average mean Average Precision at 10 (mAP@10) of 0.541 across the datasets.
Lee and Kim's study presents the Backward Search method for conditional image retrieval, which updates the query image's embedding to align with specified conditions while maintaining the query's context. This approach enhances computational efficiency through knowledge distillation. Evaluations on the WikiArt, aPY, and CUB datasets demonstrate the method's effectiveness, achieving an average mAP@10 of 0.541. These results indicate a significant improvement over existing Composed Image Retrieval (CoIR) methods.
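The distillation step can be sketched as amortizing the slow iterative update into a single forward pass. This is a toy sketch under stated assumptions: the teacher here is a stand-in affine map (the real teacher would be the iterative Backward Search), and the student is a single linear layer over the embedding concatenated with a one-hot condition; all names and hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_cond = 8, 3  # toy embedding size and number of condition labels

# Stand-in "teacher": simulates the expensive iterative embedding update
# with a fixed affine map so the sketch stays self-contained.
teacher_A = np.eye(d) + 0.3 * rng.normal(size=(d, d))
teacher_b = 0.3 * rng.normal(size=(d, n_cond))

def teacher(z, c):
    """Condition-adapted embedding produced by the (stand-in) teacher."""
    return teacher_A @ z + teacher_b[:, c]

# Student: one linear layer over [z; onehot(c)], trained with an MSE
# distillation loss to mimic the teacher in one cheap forward pass.
S = 0.01 * rng.normal(size=(d, d + n_cond))
lr = 0.05
for _ in range(3000):
    z, c = rng.normal(size=d), int(rng.integers(n_cond))
    x = np.concatenate([z, np.eye(n_cond)[c]])
    err = S @ x - teacher(z, c)   # distillation residual
    S -= lr * np.outer(err, x)    # SGD step on 0.5 * ||err||^2
```

At inference the student replaces the teacher's iterative optimization with one matrix multiply, which is the source of the speedup the paper reports (the 160x figure is the authors' measurement, not reproduced by this toy).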
| Show by month | Manuscript | Video Summary |
|---|---|---|
| 2025 April | 4 | 4 |
| 2025 March | 65 | 65 |
| 2025 February | 45 | 45 |
| 2025 January | 50 | 50 |
| 2024 December | 57 | 57 |
| 2024 November | 52 | 52 |
| 2024 October | 22 | 22 |
| Total | 295 | 295 |