| Hu, Zhiwei, Gutierrez Basulto, Victor ORCID: https://orcid.org/0000-0002-6117-5459, Xiang, Zhiliang ORCID: https://orcid.org/0000-0002-0263-7289, Li, Ru and Pan, Jeff Z. 2025. Multi-level mixture of experts for multimodal entity linking. Presented at: 31st SIGKDD Conference on Knowledge Discovery and Data Mining, Toronto, Canada, 3-7 August 2025. Published in: Antonie, A., Pei, J., Yu, X., Chierichetti, F., Lauw, H. W., Sun, Y. and Parthasarathy, S. eds. Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2. New York, NY, USA: Association for Computing Machinery, pp. 979-990. 10.1145/3711896.3737060 |
PDF - Published Version. Available under License Creative Commons Attribution.
Abstract
Multimodal Entity Linking (MEL) aims to link ambiguous mentions within multimodal contexts to associated entities in a multimodal knowledge base. Existing approaches to MEL introduce multimodal interaction and fusion mechanisms to bridge the modality gap and enable multi-grained semantic matching. However, they do not address two important problems: (i) mention ambiguity, i.e., the lack of semantic content caused by the brevity and omission of key information in the mention's textual context; (ii) dynamic selection of modal content, i.e., to dynamically distinguish the importance of different parts of modal information. To mitigate these issues, we propose a Multi-level Mixture of Experts (MMoE) model for MEL. MMoE has four components: (i) the description-aware mention enhancement module leverages large language models to identify the WikiData descriptions that best match a mention, considering the mention's textual context; (ii) the multimodal feature extraction module adopts multimodal feature encoders to obtain textual and visual embeddings for both mentions and entities; (iii)-(iv) the intra-level mixture of experts and inter-level mixture of experts modules apply a switch mixture of experts mechanism to dynamically and adaptively select features from relevant regions of information. Extensive experiments on WikiMEL, RichpediaMEL and WikiDiverse datasets demonstrate the outstanding performance of MMoE compared to the state-of-the-art. MMoE's code is available at: https://github.com/zhiweihu1103/MEL-MMoE.
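The switch mixture of experts mechanism mentioned in the abstract can be illustrated with a generic top-1 routing sketch: a router scores each feature, the single highest-scoring expert processes it, and the output is scaled by the gate probability. This is a minimal sketch under stated assumptions, not the authors' implementation (see the linked repository for that). All function names, dimensions, and random weights below are hypothetical, and the experts are assumed to be plain linear maps.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def switch_route(feature, router_weights, expert_weights):
    """Top-1 (switch-style) routing for a single feature vector.

    Returns the chosen expert index, its gate probability, and the
    chosen expert's output scaled by that probability.
    """
    gate_probs = softmax(matvec(router_weights, feature))
    chosen = max(range(len(gate_probs)), key=gate_probs.__getitem__)
    expert_out = matvec(expert_weights[chosen], feature)
    scaled = [gate_probs[chosen] * v for v in expert_out]
    return chosen, gate_probs[chosen], scaled


def random_matrix(rng, rows, cols):
    """Hypothetical stand-in for learned weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim, num_experts = 4, 3  # hypothetical sizes, not taken from the paper
router_weights = random_matrix(rng, num_experts, dim)
expert_weights = [random_matrix(rng, dim, dim) for _ in range(num_experts)]

# Three hypothetical "region" features, e.g. token or image-patch embeddings.
features = random_matrix(rng, 3, dim)
routed = [switch_route(f, router_weights, expert_weights) for f in features]
```

Because only one expert runs per feature, different regions of the input can be handled by different experts at roughly the cost of a single expert. How MMoE arranges this routing within and across modalities is described in the paper itself.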
| Item Type: | Conference or Workshop Item (Paper) | 
|---|---|
| Date Type: | Published Online | 
| Status: | Published | 
| Schools: | Schools > Computer Science & Informatics | 
| Publisher: | Association for Computing Machinery | 
| ISBN: | 9798400714542 | 
| Date of First Compliant Deposit: | 3 June 2025 | 
| Date of Acceptance: | 16 May 2025 | 
| Last Modified: | 28 Aug 2025 12:31 | 
| URI: | https://orca.cardiff.ac.uk/id/eprint/178724 | 