Most current deep learning-based news headline generation models target only domain-specific news data. When a new news domain emerges, it is usually costly to obtain a large amount of data with ground-truth reference headlines in that domain for model training, so text generation models trained with traditional supervised approaches often generalize poorly to the new domain. Inspired by transfer learning, this paper designs a cross-domain transfer text generation method based on domain data distribution alignment, intermediate domain redistribution, and zero-shot-learning semantic prototype transduction, focusing on the setting in which the target domain has no ground-truth references. During training, the semantic correlation between source-domain and target-domain data allows the most relevant source-domain data to guide the model in generating headlines for target-domain news text, even without any ground-truth headlines in the target domain, which improves the usability of the text generation model in real scenarios. Experimental results show that the proposed transfer text generation method achieves good domain transfer and outperforms existing transfer text generation methods on various text generation evaluation metrics, demonstrating its effectiveness.
Citation: Ting-Huai Ma, Xin Yu, Huan Rong. A comprehensive transfer news headline generation method based on semantic prototype transduction[J]. Mathematical Biosciences and Engineering, 2023, 20(1): 1195-1228. doi: 10.3934/mbe.2023055
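To make the core idea concrete, the following is a minimal, illustrative sketch of the semantic prototype transduction described above: for each target-domain article without a reference headline, the most semantically correlated source-domain item is retrieved and its headline serves as a guiding prototype. This is an assumption-based sketch, not the paper's actual implementation; TF-IDF with cosine similarity stands in for the learned semantic encoder, and all data and variable names are hypothetical.

```python
# Illustrative sketch (assumed, simplified): retrieve a source-domain
# "semantic prototype" to guide headline generation for a target-domain
# article that has no ground-truth headline.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Source domain: articles paired with ground-truth reference headlines.
source_articles = [
    "The central bank raised interest rates to curb inflation this quarter.",
    "The national football team won the championship after a penalty shootout.",
]
source_headlines = [
    "Central bank hikes rates to fight inflation",
    "National team clinches title on penalties",
]

# Target domain: articles only, no reference headlines available.
target_articles = [
    "Regulators announced new rules for digital-payment interest charges.",
]

# Embed source and target articles in a shared space
# (TF-IDF is a stand-in for a learned semantic encoder).
vectorizer = TfidfVectorizer().fit(source_articles + target_articles)
src_vecs = vectorizer.transform(source_articles)
tgt_vecs = vectorizer.transform(target_articles)

# For each target article, pick the most semantically correlated source item;
# its headline acts as the prototype guiding target-domain headline generation.
sims = cosine_similarity(tgt_vecs, src_vecs)
for i, article in enumerate(target_articles):
    best = sims[i].argmax()
    print("Target article:", article)
    print("Guiding prototype (source-domain headline):", source_headlines[best])
```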