KGMEL is a framework for multimodal entity linking that improves the alignment of textual mentions with knowledge base entities by incorporating knowledge graph (KG) triples. It operates in three stages: (1) generating high-quality triples, (2) learning joint representations through contrastive learning, and (3) refining candidate entities using large language models. In the reported experiments, KGMEL outperforms existing methods in both accuracy and efficiency.
entity-linking
knowledge-graphs
multimodal
contrastive-learning
information-retrieval