Knowledge graph embedding by fusing multimodal content via cross-modal learning

Shi Liu; Kaiyang Li; Yaoying Wang; Tianyou Zhu; Jiwei Li; Zhenyu Chen

doi:10.3934/mbe.2023634

Mathematical Biosciences and Engineering

2023, Volume 20, Issue 8: 14180-14200. doi: 10.3934/mbe.2023634

Research article

Big Data Center of State Grid Corporation, Beijing 100052, China

Academic Editor: Yang Kuang

Received: 01 February 2023 Revised: 27 April 2023 Accepted: 21 May 2023 Published: 26 June 2023

Keywords: knowledge graph, embedding learning, graph embedding, multimodal learning, cross-modal correlation
Citation: Shi Liu, Kaiyang Li, Yaoying Wang, Tianyou Zhu, Jiwei Li, Zhenyu Chen. Knowledge graph embedding by fusing multimodal content via cross-modal learning[J]. Mathematical Biosciences and Engineering, 2023, 20(8): 14180-14200. doi: 10.3934/mbe.2023634

Abstract

Knowledge graph embedding aims to learn representation vectors for entities and relations. Most existing approaches learn these representations only from the structural information in the triples and neglect the content associated with entities and relations. Some approaches exploit related multimodal content, such as the text descriptions and images associated with entities, to improve knowledge graph embedding, but they do not effectively address the heterogeneity of, and the cross-modal correlation constraints among, the different types of content and the network structure. In this paper, we propose a multi-modal content fusion model (MMCF) for knowledge graph embedding. To fuse heterogeneous data such as text descriptions, related images and structural information, we propose a cross-modal correlation learning component. It first learns intra-modal and inter-modal correlations to fuse the multimodal content of each entity, and then fuses the result with the structure features through a gating network. Meanwhile, to enhance the relation features, the features of the associated head and tail entities are fused to learn the relation embedding. We compare the proposed model with baselines on three datasets: FB-IMG, WN18RR and FB15k-237. Link prediction results show that our model significantly outperforms state-of-the-art methods on most metrics, which suggests the superiority of the proposed method.
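The gating step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the gate's equation, so the form used here (a sigmoid gate computed from the concatenated structure and content features, followed by an element-wise convex combination) is an assumption, and the names `gated_fusion`, `make_gate_params`, `w_gate` and `b_gate` are hypothetical. It is not the authors' implementation.

```python
import math
import random


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vec):
    """Compute W @ vec + bias for a row-major weight matrix."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def gated_fusion(structure, content, w_gate, b_gate):
    """Blend structure and content features with an element-wise gate.

    Assumed form (not taken from the paper):
        gate  = sigmoid(W [structure; content] + b)
        fused = gate * structure + (1 - gate) * content
    """
    gate = [sigmoid(z) for z in linear(w_gate, b_gate, structure + content)]
    return [g * s + (1.0 - g) * c
            for g, s, c in zip(gate, structure, content)]


def make_gate_params(dim, seed=0):
    """Hypothetical helper: small random gate weights and a zero bias."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(2 * dim)]
               for _ in range(dim)]
    return weights, [0.0] * dim


# Toy 3-dimensional features for one entity (illustrative values only).
structure_vec = [0.2, -0.5, 0.9]
content_vec = [0.7, 0.1, -0.3]
w_gate, b_gate = make_gate_params(dim=3)
fused = gated_fusion(structure_vec, content_vec, w_gate, b_gate)
```

Because each gate value lies strictly between 0 and 1, every fused component falls between the corresponding structure and content components. In a trained model the gate parameters would be learned jointly with the embeddings rather than sampled at random.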


© 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)