Research article

Sentence coherence evaluation based on neural network and textual features for official documents

  • Received: 14 January 2023 Revised: 13 April 2023 Accepted: 17 April 2023 Published: 24 April 2023
  • Sentence coherence is an essential foundation of discourse coherence in natural language processing: it plays a vital role in enhancing language expression, improving text readability, and raising the quality of written documents. With the development of e-government, the automatic generation of official documents can significantly reduce the writing burden on government agencies. To ensure that automatically generated official documents are coherent, we propose a sentence coherence evaluation model that integrates repetitive-word features, combining such features with a neural network-based approach for the first time. Experiments were conducted on an official-document dataset and the public THUCNews dataset; our method achieves an average improvement of 3.8% in accuracy over past research, reaching an accuracy of 96.2%. This result is significantly better than the previous best method, demonstrating the superiority of our approach for this problem.
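
    To make the idea concrete, the following is a minimal sketch, not the authors' implementation: it assumes Python with jieba and PyTorch, an invented three-feature overlap set, and hypothetical names (`repetition_features`, `CoherenceScorer`). It only illustrates how repetitive-word features for an adjacent sentence pair might be fed to a small neural scorer; the paper's actual feature definitions and network architecture are not reproduced here.

    ```python
    # Sketch only: repetitive-word features + tiny neural scorer.
    # Feature set and network are illustrative assumptions, not the paper's model.
    import jieba          # Chinese word segmentation
    import torch
    import torch.nn as nn

    def repetition_features(sent_a: str, sent_b: str) -> torch.Tensor:
        """Hypothetical repetitive-word features for one adjacent sentence pair."""
        a, b = set(jieba.lcut(sent_a)), set(jieba.lcut(sent_b))
        shared = a & b
        return torch.tensor([
            float(len(shared)),                                # raw count of repeated words
            len(shared) / max(len(a | b), 1),                  # Jaccard overlap
            len(shared) / max(min(len(a), len(b)), 1),         # overlap vs. shorter sentence
        ], dtype=torch.float32)

    class CoherenceScorer(nn.Module):
        """Toy scorer mapping pair features to a coherence probability."""
        def __init__(self, n_features: int = 3):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(n_features, 16), nn.ReLU(),
                nn.Linear(16, 1), nn.Sigmoid(),
            )

        def forward(self, feats: torch.Tensor) -> torch.Tensor:
            return self.net(feats)

    # Usage: score one sentence pair. Weights are untrained here, so the
    # probability is meaningless until the model is fit on labeled pairs.
    scorer = CoherenceScorer()
    feats = repetition_features("各单位要高度重视安全生产工作。", "安全生产工作必须常抓不懈。")
    print(scorer(feats).item())
    ```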

    Citation: Yunmei Shi, Yuanhua Li, Ning Li. Sentence coherence evaluation based on neural network and textual features for official documents[J]. Electronic Research Archive, 2023, 31(6): 3609-3624. doi: 10.3934/era.2023183




  • © 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
