Biomedical named entity recognition (Bio-NER) is a prerequisite for mining knowledge from biomedical texts. State-of-the-art Bio-NER models are mostly based on bidirectional long short-term memory (BiLSTM) networks and bidirectional encoder representations from transformers (BERT). However, both BiLSTM and BERT models are extremely computationally intensive. To address this, this paper proposes a temporal convolutional network with a conditional random field layer (TCN-CRF) for Bio-NER. The model uses the TCN to extract features, which are then decoded by the CRF to obtain the final result. We improve the original TCN model by fusing the features extracted by convolution kernels of different sizes to enhance Bio-NER performance. We compared our model with five deep learning models on the GENIA and CoNLL-2003 datasets. The experimental results show that our model achieves comparable performance with much less training time. The implemented code has been made available to the research community.
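The two stages named in the abstract, multi-kernel convolutional feature extraction followed by CRF decoding, can be illustrated with a minimal pure-Python sketch. This is not the authors' released implementation. The function names (`causal_dilated_conv`, `fuse_multi_kernel`, `viterbi`), the scalar per-token input, the fixed kernel weights, and fusion by concatenation are all assumptions made for illustration. A trained model would learn the kernel weights and transition scores and operate on embedding vectors.

```python
from typing import List, Sequence


def causal_dilated_conv(xs: Sequence[float], kernel: Sequence[float],
                        dilation: int = 1) -> List[float]:
    """1-D causal convolution: output t sees only xs[t], xs[t-d], xs[t-2d], ...

    Positions before the start of the sequence count as zero (left padding).
    """
    out = []
    for t in range(len(xs)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = t - j * dilation
            if idx >= 0:
                acc += w * xs[idx]
        out.append(acc)
    return out


def fuse_multi_kernel(xs: Sequence[float], kernels: Sequence[Sequence[float]],
                      dilation: int = 1) -> List[List[float]]:
    """Assumed fusion: run one convolution per kernel size and concatenate
    the outputs position by position."""
    per_kernel = [causal_dilated_conv(xs, k, dilation) for k in kernels]
    return [list(column) for column in zip(*per_kernel)]


def viterbi(emissions: Sequence[Sequence[float]],
            transitions: Sequence[Sequence[float]]) -> List[int]:
    """Highest-scoring tag path for a linear-chain CRF.

    emissions[t][y]   : score of tag y at position t
    transitions[a][b] : score of moving from tag a to tag b
    """
    n_tags = len(transitions)
    score = list(emissions[0])
    backptrs = []
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for b in range(n_tags):
            best_a = max(range(n_tags),
                         key=lambda a: score[a] + transitions[a][b])
            new_score.append(score[best_a] + transitions[best_a][b]
                             + emissions[t][b])
            ptr.append(best_a)
        score = new_score
        backptrs.append(ptr)
    # Follow back-pointers from the best final tag to recover the path.
    path = [max(range(n_tags), key=lambda y: score[y])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Hypothetical toy input: one scalar feature per token.
tokens = [1.0, 2.0, 3.0, 4.0]
fused = fuse_multi_kernel(tokens, kernels=[[1.0, 1.0], [1.0, 1.0, 1.0]])
# fused == [[1.0, 1.0], [3.0, 3.0], [5.0, 6.0], [7.0, 9.0]]

# Hypothetical emission scores for two tags over three tokens.
emissions = [[2.0, 0.0], [0.0, 1.0], [0.0, 3.0]]
free = viterbi(emissions, [[0.0, 0.0], [0.0, 0.0]])       # [0, 1, 1]
sticky = viterbi(emissions, [[0.0, -5.0], [-5.0, 0.0]])   # [1, 1, 1]
```

In the toy decoding, penalising tag switches changes the result from `[0, 1, 1]` to `[1, 1, 1]`. This shows why a CRF layer that scores whole tag sequences can differ from picking the best tag at each position independently.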
Citation: Chao Che, Chengjie Zhou, Hanyu Zhao, Bo Jin, Zhan Gao. Fast and effective biomedical named entity recognition using temporal convolutional network with conditional random field[J]. Mathematical Biosciences and Engineering, 2020, 17(4): 3553-3566. doi: 10.3934/mbe.2020200