
With the integration of big data into the medical field, increasing amounts of consultation data and disease information are recorded as electronic medical records (EMRs), which have gradually become an important basis for assisting doctors in diagnosis and treatment. EMRs record a large amount of diagnostic information about patients: hospital records, course records, doctor's orders, case data and so on, including key entity information such as diseases, surgeries and drugs. This information is a decisive factor when doctors devise treatment plans for patients [1]. Studying how to extract key entity information from massive EMRs efficiently and accurately through intelligent methods is therefore of great significance.
Named entity recognition (NER) is a vital part of natural language processing (NLP) that meets the aforementioned requirement [2]. Its purpose is to recognize named entities, e.g., names, places, organizations, etc., from raw text. Extracted entities can serve as information in their own right and can also pave the way for other NLP tasks, such as relation extraction and knowledge graph construction. Recently, with the rise of deep learning, deep neural networks have been applied to medical NER and have attracted much research attention.
So far, NER still faces considerable difficulties in the field of Chinese electronic medical records (CEMRs). The main reasons are as follows: first, an entity may have multiple names due to undefined text labeling standards [3]; second, the meaning of the same word or character may differ completely in different contexts, which causes confusion in Chinese semantics; finally, Chinese has no natural vocabulary boundaries (spaces) as English does, so word boundaries are not strictly defined. In previous NER research, the BiLSTM-CRF, i.e., a bi-directional long short-term memory (LSTM) network joined with a conditional random field (CRF) layer, has shown advanced performance and become a prevalent architecture for various NER tasks [4,5]. This architecture outperforms traditional methods because it eliminates the inefficient and complex process of manually designing feature templates and uses a recurrent neural network (RNN) to capture text features automatically. However, CEMR texts are generally longer than conventional text, often containing several hundred Chinese tokens. Although LSTM can capture long-term contextual dependencies [6], it performs poorly once the text length exceeds a certain number of steps [7]. In addition, at the beginning of an NLP task, each token in the text is represented by a low-dimensional dense vector [8]. Because the medical field is highly specialized, universal pre-trained models can hardly be adopted directly in practical tasks, and the scarcity of Chinese medical corpora makes it even more difficult to pre-train medical domain-specific language models. In most cases, vector representations are randomly initialized, leading to a context-independent representation for each token [9]; as a result, they cannot handle polysemy, and the limitations are obvious.
Our work focuses on Chinese medical NER in CEMRs, which has been a subtask of several influential academic conferences in the NLP domain, e.g., the China Conference on Knowledge Graph and Semantic Computing (CCKS) and the China Health Information Processing Conference (CHIP). These tasks not only accelerate Chinese medical NER research but also provide several valuable corpora. In this paper, we first collect 4679 CEMRs, which are used to fine-tune the Chinese Embeddings from Language Models (ELMo) [9] trained on common-field corpora, obtaining a pre-trained model that can dynamically generate context-dependent character embeddings for Chinese characters. Second, the encoder from the Transformer (ET) [10] is used as the model encoder instead of the traditional bi-directional LSTM (BiLSTM), giving the proposed model the ability to capture long-term dependencies in ultra-long CEMR texts efficiently. In ET, the path length between any two tokens is one, so dependencies are not lost because of the long distance between tokens. Our contributions are summarized as follows.
1) We fine-tuned a Chinese medical domain-specific ELMo model, which provides an authentic pre-trained language model for further research. A Chinese medical corpus with 4679 real-world CEMRs, containing about 1.8 million Chinese characters, is constructed; a medical domain-specific ELMo model is then fine-tuned by applying this corpus to the publicly available Chinese ELMo model.
2) We realize Chinese medical NER in CEMRs with an ET-CRF model, whose self-attention mechanism handles long context dependencies better than the BiLSTM model; to the best of our knowledge, this is the first application of the ET-CRF model to Chinese medical NER.
3) Owing to the contributions above, the proposed ELMo-ET-CRF model achieves the best performance among all model architectures mentioned in this paper on the CCKS 2019 dataset, and its final F1-score is competitive with the current state-of-the-art performance.
NER has become one of the important tasks in information retrieval, data mining and NLP owing to its extraordinary significance [11], and various solutions have been proposed in the literature.
Matching entities through handwritten rules was the main method for NER tasks in the early stage [12]. However, constructing rules requires a certain level of expertise, and even a domain expert cannot enumerate rules that model all entities. In addition, rules cannot be migrated because they depend on the dataset: the same set of rules may not work on a different dataset. This kind of handcrafted approach always incurs a relatively high engineering cost.
Statistical machine learning methods treat NER as a sequence labeling problem: given an input sequence, they output the predicted optimal tag sequence. Traditional methods include hidden Markov models [13,14], maximum entropy Markov models [15], conditional random fields [16,17] and support vector machines [18]. The most common implementation combines feature templates with a CRF, and different feature templates can be combined to form new ones. However, such statistical methods rely heavily on hand-crafted features, and finding the most appropriate features incurs considerable overhead.
In recent years, with the increase in computing power, training deep neural networks has become simple and feasible. The advent of word embeddings (e.g., word2vec, GloVe) has made the use of deep neural networks for NLP tasks a research focus [8,19]. Unlike traditional statistical machine learning, the training of a neural network is an end-to-end process that learns data features automatically and avoids extra overhead such as feature engineering.
More recently, the model structure that uses a bidirectional LSTM to encode data with a CRF as the decoder has achieved the most advanced results in medical NER tasks [4,5], significantly outperforming traditional statistical models. This model architecture was first proposed by Collobert et al. [20]; Huang et al. [21] and Lample et al. [22] then used LSTM-CRF to deal with sentence-level annotation problems. Ma et al. [23] first used the LSTM-CRF structure for the English NER task and achieved promising results, and Dong et al. [24] first used LSTM-CRF to handle Chinese NER. LSTM has a cell structure and gate mechanism that allow the model to capture long-term dependencies effectively and provide a degree of forgetting [6], and the advantage of appending a CRF after the encoder layer is that information already generated during sequence decoding can be used to ensure that the output is an optimal tag sequence. In addition, Zhang et al. [28] investigated a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Zhang et al. [29] proposed a convolutional attention layer to extract implicit local context features from character sequences. Liu et al. [30] proposed a global-context-enhanced deep transition architecture for sequence labeling. Qiu et al. [3] proposed the RD-CNN-CRF model, which effectively reduces training time without losing accuracy. Regarding label inconsistency, Ji et al. [1] proposed a hybrid model based on the attention mechanism, which effectively alleviates the drop in model accuracy caused by inconsistent labels.
In our method, raw CEMRs are used as the input of ELMo to obtain the vector representation of sentences; sentence features are then extracted by ET before being decoded by the CRF to generate the annotation sequence. Figure 1 shows the structure of the model, which is detailed in this section.
In early NLP tasks, the input of an RNN is a set of word embeddings generated by word2vec [8], GloVe [19], etc. However, these embeddings are context-independent, i.e., each word corresponds to a unique static vector that does not change with context, so these methods are limited in the case of polysemy. The emergence of ELMo effectively solves this problem by applying a stacked BiLSTM to model the entire sentence in both directions and mapping the sentence into a sequence of vectors. Since LSTM can capture context dependencies, the output embedding sequence carries front-to-back correlations.
Given a sequence S of N tokens, $\{w_1, w_2, \ldots, w_N\}$, the tokens first pass through a token layer, which maps the original character embeddings to the input dimension of the BiLSTM through a weight matrix; the output of the token layer is then fed to a stacked BiLSTM to build a language model, as shown in Figure 2. Suppose the sentence length is N and L is the number of BiLSTM layers. At each position t, each layer l of the LSTM outputs a context-dependent hidden vector $h_{t,l}^{LM} = [\overrightarrow{h}_{t,l}^{LM}, \overleftarrow{h}_{t,l}^{LM}]$, where $t = 1, 2, \ldots, N$ and $l = 1, 2, \ldots, L$. The hidden vector of the last LSTM layer is used by the softmax layer to predict the token at the next position.
When the training process finishes, each token position t has $2L+1$ representations $R_t$, where L is the number of layers in the model:

$$R_t = \{x_t^{LM}, \overrightarrow{h}_{t,l}^{LM}, \overleftarrow{h}_{t,l}^{LM} \mid l = 1, \ldots, L\} = \{h_{t,l}^{LM} \mid l = 0, \ldots, L\} \tag{1}$$

where $h_{t,0}^{LM}$ is the output of the token layer and $h_{t,l}^{LM} = [\overrightarrow{h}_{t,l}^{LM}, \overleftarrow{h}_{t,l}^{LM}]$ for each BiLSTM layer.
Our model uses ET as the encoder layer. Compared with RNN and CNN, it takes a different approach, relying entirely on the self-attention mechanism to extract context features, which makes the encoding process highly parallel.
Suppose the input $S \in \mathbb{R}^{N \times d_{model}}$ is a sequence $\{w_1, w_2, \ldots, w_N\}$, where N is the length of the sequence and $d_{model}$ is the dimension of the input vectors. We use multiple scaled dot-product attention components inside the multi-head attention layer to enhance the model's ability to encode sequences internally. We first add position information to the input sequence S, following the approach of Vaswani et al. [10], then perform matrix transformations on S to obtain three matrices, namely the query matrix Q, key matrix K and value matrix V. Finally, the output representation of a single self-attention head is obtained by scaled dot-product attention:
$$Q, K, V = SW^Q, SW^K, SW^V \tag{2}$$

$$\mathrm{Attn}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V \tag{3}$$

where $W^Q, W^K, W^V \in \mathbb{R}^{d_{model} \times d_k}$ are learnable parameters and $\mathrm{softmax}()$ is performed row-wise.
A single attention head may miss information from different representation subspaces at different positions, so our model uses the multi-head attention mechanism [10]:
$$\mathrm{MultiHead}(S) = [\mathrm{head}_1, \mathrm{head}_2, \ldots, \mathrm{head}_n]W^O \tag{4}$$

where $\mathrm{head}_i = \mathrm{Attn}(Q_i, K_i, V_i)$ and $W^O$ is a learnable parameter.
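Equations (2)-(4) can be sketched in NumPy as follows. This is a minimal, unbatched illustration for a single sentence; the positional encoding and any masking are omitted, and all weight matrices are passed in explicitly:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable row-wise softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Eq. (3): Attn(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.
    d_k = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k)) @ V

def multi_head_attention(S, proj, W_O):
    # proj holds one (W_Q, W_K, W_V) triple per head, as in Eq. (2);
    # head outputs are concatenated and mixed by W_O, as in Eq. (4).
    heads = [scaled_dot_product_attention(S @ W_Q, S @ W_K, S @ W_V)
             for (W_Q, W_K, W_V) in proj]
    return np.concatenate(heads, axis=-1) @ W_O
```

With n heads of dimension $d_k$, the concatenated output has width $n \cdot d_k$, which $W^O$ maps back to $d_{model}$.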
The multi-head attention is followed by a feed-forward network, and the output of each sublayer in the encoder has a residual connection and layer normalization, as shown in Figure 3.
Denoting the normalized output of the multi-head attention sublayer as $\bar{S}$, the final output of the encoder is calculated as follows:

$$\bar{S} = \mathrm{layernorm}(S + \mathrm{MultiHead}(S)) \tag{5}$$

$$S = \mathrm{layernorm}(\bar{S} + \mathrm{FFN}(\bar{S})) \tag{6}$$

where $\mathrm{FFN}(x) = \max(0, xW_1 + b_1)W_2 + b_2$ and $\mathrm{layernorm}()$ represents layer normalization [25].
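Equations (5) and (6) can be sketched as follows; this is a simplified NumPy version that omits the learnable gain and bias of layer normalization as well as dropout, and takes the attention output as a precomputed argument:

```python
import numpy as np

def layernorm(x, eps=1e-5):
    # Simplified layer normalization [25]: zero-mean, unit-variance per
    # position (learnable gain/bias of the full version omitted).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def ffn(x, W1, b1, W2, b2):
    # FFN(x) = max(0, x W1 + b1) W2 + b2
    return np.maximum(0.0, x @ W1 + b1) @ W2 + b2

def encoder_layer(S, attn_out, W1, b1, W2, b2):
    S_bar = layernorm(S + attn_out)                       # Eq. (5)
    return layernorm(S_bar + ffn(S_bar, W1, b1, W2, b2))  # Eq. (6)
```

Each sublayer adds its input back in (the residual connection) before normalizing, which is what keeps gradients stable when layers are stacked.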
In the NER task on CEMRs, the output tag sequence is strictly ordered. The CRF layer maintains a state transition matrix that stores the transition probability from the previous state to the current state, ensuring that the tag prediction process has internal dependency.
Given the input sequence $X = \{x_1, x_2, \ldots, x_N\}$, the score of the output sequence $Y = \{y_1, y_2, \ldots, y_N\}$ is defined as follows:
$$s(X, Y) = \sum_{i=0}^{N} A_{y_i, y_{i+1}} + \sum_{i=1}^{N} P_{i, y_i} \tag{7}$$

where A is the transition score matrix, $A_{ij}$ being the score of a transition from tag i to tag j. $P \in \mathbb{R}^{N \times k}$ is the tag score matrix obtained from the output of ET through a fully connected network, where N is the length of the sequence, k is the number of labels, and $P_{ij}$ is the score of the jth label at the ith position in the sequence. $y_0$ and $y_{N+1}$ are represented by <bos> and <eos>.
The model uses softmax to calculate the probability of each possible tag sequence for the input sequence X, and takes maximizing the log-probability of the correct annotated sequence as the optimization objective [22]:
$$p(Y|X) = \frac{e^{s(X, Y)}}{\sum_{Y' \in Y_{ALL}} e^{s(X, Y')}} \tag{8}$$

$$\log(p(Y|X)) = s(X, Y) - \log\left(\sum_{Y' \in Y_{ALL}} e^{s(X, Y')}\right) \tag{9}$$

where $Y_{ALL}$ represents all possible tag sequences for an input sentence X.
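Eqs. (7)-(9) can be illustrated with a brute-force NumPy sketch. The enumeration over all tag sequences is for exposition only; practical CRF layers compute the partition term with the forward algorithm and decode with Viterbi. Here the transition matrix A is assumed to include rows/columns for the <bos> and <eos> tags:

```python
import numpy as np
from itertools import product

def crf_score(P, y, A, bos, eos):
    # Eq. (7): transition scores A[y_i, y_{i+1}] over the path padded with
    # <bos>/<eos>, plus emission scores P[i, y_i] from the ET output.
    path = [bos] + list(y) + [eos]
    trans = sum(A[path[i], path[i + 1]] for i in range(len(path) - 1))
    emit = sum(P[i, t] for i, t in enumerate(y))
    return trans + emit

def log_prob(P, y, A, bos, eos, k):
    # Eq. (9): score of y minus the log partition over all k^N sequences.
    N = P.shape[0]
    scores = [crf_score(P, yp, A, bos, eos)
              for yp in product(range(k), repeat=N)]
    return crf_score(P, y, A, bos, eos) - np.log(np.sum(np.exp(scores)))
```

Because Eq. (8) is a softmax over sequences, exponentiating `log_prob` for every possible tag sequence must sum to 1, which is a convenient sanity check.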
The Chinese ELMo model used in this paper comes from the Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology. It was trained on the Xinhua portion of Chinese Gigaword v5 and took roughly 3 days on an NVIDIA P100 GPU [26,27]. Since the dataset used in this paper belongs to the medical field, the corpus distribution has a strongly specialized background, and the texts contain some uncommon characters. Therefore, the pre-trained general-field ELMo cannot be directly applied to this task, and we fine-tuned ELMo with the medical corpus.
We collected 4679 CEMRs, containing about 1.8 million characters, from CCKS2018, CCKS2019 and CHIP2018, and routinely preprocessed the text. It should be noted that the proposed neural network model is based on character embeddings. The hyperparameter settings used in the fine-tuning process are shown in Table 1.
Hyperparameter name | Hyperparameter setting |
Character Embedding | 300 |
LSTM Layer | 2 |
LSTM Cell Size | 4096 |
LSTM Hidden Size | 1024 |
Batch Size | 1 |
Optimizer | Adam |
Learning rate | 0.001 |
Annealing rate | 0.9t/4679 |
Gradients clip | 5 |
Dropout | 0.1 |
Max epoch | 100 |
After model training is completed, we freeze the parameters of ELMo and use the weighted sum of the ELMo layers as its output vector; finally, the concatenation of the ELMo output and the original character embedding is taken as the downstream model input:
$$V_t^{ELMo} = E(R_t; \Theta) = \lambda \sum_{l=0}^{L} \theta_l h_{t,l}^{LM} \tag{10}$$

$$c_t^{embed} = [c_t, V_t^{ELMo}] \tag{11}$$

where $\theta_l$ are softmax-normalized weights and $\lambda$ is a scalar that scales the vector by a certain ratio. These two parameters are learnable and are updated during downstream model training. $c_t$ represents the original character embedding generated by word2vec, and $c_t^{embed}$ represents the final vector representation of the character at position t in the sequence; it is the input of ET.
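Eqs. (10) and (11) amount to the following NumPy sketch; in the actual model, the weights $\theta$ and the scale $\lambda$ are trained jointly with the downstream network rather than fixed as they are here:

```python
import numpy as np

def elmo_vector(layers, theta_logits, lam):
    # Eq. (10): softmax-normalize the per-layer weights theta, then take a
    # lambda-scaled weighted sum of the L+1 frozen ELMo layer outputs.
    w = np.exp(theta_logits - np.max(theta_logits))
    w = w / w.sum()
    return lam * sum(w_l * h_l for w_l, h_l in zip(w, layers))

def char_input(c_t, v_elmo_t):
    # Eq. (11): concatenate the static word2vec embedding c_t with the
    # dynamic ELMo vector to form the ET input.
    return np.concatenate([c_t, v_elmo_t])
```

With equal logits the result is simply $\lambda$ times the mean of the layer outputs; training shifts the weights toward the layers most useful for the NER task.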
This paper focuses on Chinese medical NER, which aims to detect entity boundaries and categorize entities into pre-defined categories. The dataset used in this paper comes from CCKS2019 and is provided by Yiducloud (Beijing) Technology Co., Ltd.
The training data contain 1000 manually annotated CEMRs, and the test data contain 379. Each record comprises two parts: the raw CEMR and the annotation information. The annotation information consists of several triples formed from the entity start index, entity end index and entity category; through the start and end indices, we can extract the entity from the CEMR. All entities fall into six categories: disease and diagnosis, imaging examination, laboratory test, surgery, medicine and anatomy. The entity counts of this dataset are shown in Table 2.
Category | Disease and diagnosis | Imaging examination | Laboratory test | Surgery | Medicine | Anatomy | Total
Training data | 4193 | 966 | 1194 | 1027 | 1814 | 8231 | 17425
Test data | 1310 | 344 | 586 | 162 | 483 | 2938 | 5823
We use the standard evaluation criteria to validate the effectiveness of the model, namely precision, recall and micro F1-score, which are calculated as follows:
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{12}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{13}$$

$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{14}$$

where TP, TN, FP and FN are true positives, true negatives, false positives and false negatives, respectively.
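Under the strict index, the micro metrics of Eqs. (12)-(14) can be computed from sets of (start, end, category) triples, e.g.:

```python
def strict_micro_prf1(gold, pred):
    # Eqs. (12)-(14) under the strict index: a predicted entity counts as a
    # true positive only if its boundaries and category both match exactly.
    tp = len(gold & pred)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

A boundary that is off by one character (or a wrong category) therefore counts as both a false positive and a false negative, which is why strict scores run lower than relaxed ones.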
In this paper, CEMRs are encoded in the BIO format (Begin, Inside, Outside). A token is labeled B-label if it is the beginning of a named entity, I-label if it is inside a named entity, and O otherwise. As can be seen from Figure 4, the 'O' tags are negative samples and the other tags are positive samples.
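The BIO encoding can be sketched as follows, assuming for illustration that the end index of an annotation triple is exclusive (adjust if the corpus convention is inclusive):

```python
def to_bio(text_len, entities):
    # Convert (start, end, category) annotation triples into BIO tags.
    # Assumption: end index is exclusive; entities do not overlap.
    tags = ["O"] * text_len
    for start, end, cat in entities:
        tags[start] = "B-" + cat          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = "I-" + cat          # remaining characters
    return tags
```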
We first use the training data to train the BiLSTM-CRF. In the experiment, the batch size was 10, Adam was selected as the optimizer, and the learning rate was set to 0.001. The initial character embeddings were generated by word2vec with a dimension of 300. The number of BiLSTM layers was 1, and the hidden layer dimension was 300.
For the ET-CRF, since the number of layers and self-attention heads has a critical impact on performance, we trained six models with different numbers of layers or heads and compared their effects on the test set. The results are shown in Table 3. The hyperparameters of the best model are 2 layers and 8 heads. In addition, the model's input is a 512-dimensional character embedding, also generated by word2vec with the same experimental configuration as BiLSTM-CRF.
Layers | 2 | 2 | 4 | 4 | 6 | 6 |
Heads | 4 | 8 | 4 | 8 | 4 | 8 |
F1 | 83.95 | 84.05 | 83.68 | 84.01 | 83.55 | 83.98 |
The results of WV-BiLSTM-CRF and WV-ET-CRF are shown in Table 4 (WV denotes word2vec). The F1-scores of BiLSTM-CRF on the medicine and imaging examination categories are good, but recognition of disease and diagnosis is the model's weak point, with an F1-score of only 76.23%. We speculate that entities in the medicine and anatomy categories are generally short and have no multiple naming standards, whereas tags for entities such as disease and diagnosis are subjective, context-dependent and generally long, making boundary judgment quite difficult for BiLSTM. The total F1-score of ET-CRF is 84.05%, 2.19% higher than that of BiLSTM-CRF, indicating that about 382 medical entities were rectified or newly extracted; for entities such as disease and diagnosis, ET-CRF performs remarkably well. This is an intuitive result: because the ET encoding process effectively shortens the direct distance between characters, context-dependent relationships are well captured and preserved, ensuring a good recognition rate for longer entities.
Entity name | WV-BiLSTM-CRF | WV-ET-CRF | ||||
Strict index (%) | Strict index (%) | |||||
P | R | F1 | P | R | F1 | |
Disease and diagnosis | 74.47 | 78.07 | 76.23 | 77.86 | 81.04 | 79.42 |
Imaging examination | 82.22 | 89.31 | 85.62 | 77.78 | 96.55 | 86.15 |
Laboratory test | 82.35 | 84.13 | 83.23 | 82.56 | 85.34 | 83.92 |
Surgery | 79.04 | 81.27 | 80.14 | 80.00 | 87.63 | 83.64 |
Anatomy | 81.34 | 84.02 | 82.55 | 82.83 | 85.78 | 84.28 |
Medicine | 89.24 | 90.28 | 89.76 | 91.80 | 93.80 | 92.79 |
Total | 80.39 | 83.39 | 81.86 | 82.08 | 86.12 | 84.05 |
To verify the actual effect of ELMo on the NER task in CEMRs, we added an ELMo component to the two models above, BiLSTM-CRF and ET-CRF; the model input thus becomes dynamic context-dependent character embeddings with contextual information. The test results are shown in Table 5. For ELMo-BiLSTM-CRF, the total F1-score is 2.64% higher than that of BiLSTM-CRF, meaning that about 460 medical entities were rectified or newly extracted. Adding ELMo also improves recognition of disease and diagnosis, which the table shows increased by 2.24%. For ELMo-ET-CRF, the total F1-score is 1.54% higher than that of ET-CRF, i.e., about 268 medical entities were rectified or newly extracted. In summary, the pre-trained ELMo enriches the information contained in the character embeddings, which effectively improves the accuracy of the model.
Entity name | ELMo-LSTM-CRF | ELMo-ET-CRF | ||||
Strict index (%) | Strict index (%) | |||||
P | R | F1 | P | R | F1 | |
Disease and diagnosis | 76.93 | 80.07 | 78.47 | 79.57 | 82.53 | 81.02 |
Imaging examination | 84.38 | 93.10 | 88.52 | 82.35 | 96.55 | 88.89 |
Laboratory test | 84.94 | 86.78 | 85.85 | 83.72 | 86.54 | 85.11 |
Surgery | 81.79 | 84.10 | 82.93 | 80.65 | 88.34 | 84.32 |
Medicine | 84.74 | 86.36 | 85.54 | 84.75 | 87.92 | 86.31 |
Anatomy | 91.06 | 92.13 | 91.59 | 90.32 | 93.80 | 92.03 |
Total | 83.32 | 85.72 | 84.50 | 83.65 | 87.61 | 85.59 |
Table 6 shows that ELMo outperforms word2vec thanks to the dynamic, context-dependent model input, and that the performance of ET-CRF in Chinese medical named entity recognition is significantly better than that of BiLSTM-CRF. Owing to the excellent long-context dependency capture of the self-attention mechanism, the ET-CRF's ability to recognize long entities is significantly better than the BiLSTM-CRF's. In addition, we found that ET-CRF converges significantly faster than BiLSTM-CRF. The table also shows that the final F1-score of the best model in this paper, ELMo-ET-CRF, is 85.59%, which is competitive with the top three in the CCKS2019 competition (https://www.biendata.com/competition/ccks_2019_1/final-leaderboard/).
Model | Strict index (%) | ||
P | R | F1 | |
WV-LSTM-CRF | 80.39 | 83.39 | 81.86 |
WV-ET-CRF | 82.08 | 86.12 | 84.05 |
ELMo-LSTM-CRF | 83.32 | 85.72 | 84.50 |
ELMo-ET-CRF | 83.65 | 87.61 | 85.59 |
CCKS2019-No.1 | - | - | 85.62 |
CCKS2019-No.2 | - | - | 85.59 |
CCKS2019-No.3 | - | - | 85.16 |
In this paper, we first fine-tune a medical domain-specific ELMo model on a small medical corpus containing 4679 CEMRs. We then apply the ET-CRF model to Chinese medical NER on CEMRs. The proposed ELMo-ET-CRF model uses dynamic context-dependent ELMo character embeddings to incorporate more lexical, syntactic and semantic information, and it alleviates the long context dependency problem. Under the strict evaluation index, the F1-score of ELMo-ET-CRF on the test set is 85.59%, which is competitive with the state of the art on this dataset and indicates the effectiveness of the proposed model architecture.
We thank the anonymous reviewers for their constructive comments. This work was supported by grants from the National Key Research and Development Program of China (2017YFB0202104). Publication costs are funded by a grant of the National Key Research and Development Program of China (2017YFB0202104).
All authors declare no conflicts of interest in this paper.