Published October 2026 | Version v1
Conference paper
Open
Cross-modal Image Recommendation for News Articles by Multimodal Foundation Models-based Retrieval-Reranking
Authors/Creators
Description
Retrieving relevant images for a given news article is challenging and can be viewed as a special case of the cross-modal retrieval problem. This notebook paper presents our solution for the MediaEval NewsImages 2025 benchmarking task. We propose a retrieval-reranking approach that builds on multimodal foundation models, such as vision-language models (VLMs) and multimodal LLMs, and that uses multiple levels of textual granularity. We report the official results of our submitted runs, together with additional experiments we conducted internally to evaluate those runs.
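As a rough illustration of the general retrieve-then-rerank pattern named in the description, the following minimal sketch may help. It is not the authors' implementation. The embedding vectors, the granularity names (`title`, `body`), the mean-cosine fusion rule, and the `rerank_score` callable are all hypothetical placeholders; in practice the vectors would presumably come from a VLM encoder and the reranking score from a multimodal LLM.

```python
import math
from typing import Callable, List, Mapping, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vecs: Mapping[str, Vector],
    image_vecs: Mapping[str, Vector],
    k: int,
) -> List[str]:
    """Stage 1: return the k image ids with the highest fused similarity.

    query_vecs holds one text embedding per granularity level (assumed here
    to be things like title and body); the fused score is simply the mean
    cosine similarity across levels, which is an illustrative choice.
    """
    if not query_vecs:
        raise ValueError("at least one query embedding is required")

    def fused(image_id: str) -> float:
        sims = [cosine(q, image_vecs[image_id]) for q in query_vecs.values()]
        return sum(sims) / len(sims)

    return sorted(image_vecs, key=fused, reverse=True)[:k]


def rerank(candidates: List[str], rerank_score: Callable[[str], float]) -> List[str]:
    """Stage 2: reorder the shortlist with a (typically more expensive) scorer."""
    return sorted(candidates, key=rerank_score, reverse=True)


# Toy data standing in for real embeddings and reranker scores.
query_vecs = {"title": [1.0, 0.0], "body": [0.8, 0.2]}
image_vecs = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
toy_scores = {"a": 0.9, "b": 0.4}

shortlist = retrieve(query_vecs, image_vecs, k=2)
final_ranking = rerank(shortlist, lambda image_id: toy_scores[image_id])
```

The two-stage split lets a cheap similarity search narrow a large image pool before a costlier model scores only the shortlisted candidates.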
Files
| Name | Size | Checksum |
|---|---|---|
| mediaeval2025.pdf | 733.5 kB | md5:3f2b84f65db95764ebaffd790392d7ba |