Research article Special Issues

Computational methods for recognition of cancer protein markers in saliva

  • In recent years, many studies have shown that cancer tissues can induce disease-specific changes in some salivary proteins through mediators involved in the pathogenesis of systemic diseases. These salivary proteins have the potential to serve as cancer-specific biomarkers for early diagnosis, but identifying them effectively remains challenging. In this paper, we propose novel machine learning methods for recognizing cancer biomarkers in saliva in two stages. In the first stage, we recognize salivary secretory proteins, which are considered candidate cancer biomarkers. We collected 557 salivary secretory proteins from 20379 human proteins using public databases and the published literature. We then present a training set construction strategy that addresses the class imbalance problem so that the classification methods achieve better accuracy. From the set of all human proteins, we remove the proteins that belong to the same families as the salivary secretory proteins. We then cluster the remaining proteins with the SVC-KM method and select negative samples from each cluster in proportion. Next, protein features are computed with existing tools: we collect 24 protein properties covering sequence, structure and physicochemical characteristics, for a total of 1087 features. A new procedure based on local samples is proposed for selecting appropriate features, in order to further improve the performance of the SVM classifier. Experimental results show that, with the 32 selected features, the average sensitivity, specificity and accuracy of salivary secretory protein recognition on the training set are 97.09%, 98.10% and 97.61%, respectively. These methods improve recognition accuracy by addressing the unbalanced sample sizes and uneven distribution in the training set.
In the second stage, we apply the best model to identify salivary secretory proteins among 58 reported cancer markers, and obtain 42 proteins that are considered usable for salivary diagnosis. We also analyze gene expression data for three types of cancer and predict that the protein products of 33 genes will appear in saliva. This study provides a computational tool that helps biologists and researchers reduce the number of candidate proteins and the cost of research, and may thereby accelerate the discovery of cancer biomarkers in saliva and promote the development of salivary diagnosis.

    Citation: Ying Sun, Wei Du, Lili Yang, Min Dai, Ziying Dou, Yuxiang Wang, Jining Liu, Gang Zheng. Computational methods for recognition of cancer protein markers in saliva[J]. Mathematical Biosciences and Engineering, 2020, 17(3): 2453-2469. doi: 10.3934/mbe.2020134



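    Let $\mathbb{F}_q$ be the finite field of $q$ elements with characteristic $p$, where $q=p^r$ and $p$ is a prime number. Let $\mathbb{F}_q^*=\mathbb{F}_q\setminus\{0\}$ and let $\mathbb{Z}^+$ denote the set of positive integers. Let $s\in\mathbb{Z}^+$ and $b\in\mathbb{F}_q$. Let $f(x_1,\dots,x_s)$ be a diagonal polynomial over $\mathbb{F}_q$ of the form

    $$f(x_1,\dots,x_s)=a_1x_1^{m_1}+a_2x_2^{m_2}+\cdots+a_sx_s^{m_s},$$

    where $a_i\in\mathbb{F}_q^*$, $m_i\in\mathbb{Z}^+$, $i=1,\dots,s$. Denote by $N_q(f=b)$ the number of $\mathbb{F}_q$-rational points on the affine hypersurface $f=b$, namely,

    $$N_q(f=b)=\#\{(x_1,\dots,x_s)\in\mathbb{A}^s(\mathbb{F}_q)\mid f(x_1,\dots,x_s)=b\}.$$

    In 1949, Hua and Vandiver [1] and Weil [2] independently obtained a formula for $N_q(f=b)$ in terms of character sums; for $b=0$ it reads

    $$N_q(f=0)=q^{s-1}+\sum\psi_1(a_1^{-1})\cdots\psi_s(a_s^{-1})J_q^0(\psi_1,\dots,\psi_s), \tag{1.1}$$

    where the sum is taken over all $s$-tuples $(\psi_1,\dots,\psi_s)$ of multiplicative characters of $\mathbb{F}_q$ that satisfy $\psi_i^{m_i}=\varepsilon$, $\psi_i\ne\varepsilon$, $i=1,\dots,s$, and $\psi_1\cdots\psi_s=\varepsilon$. Here $\varepsilon$ is the trivial multiplicative character of $\mathbb{F}_q$, and $J_q^0(\psi_1,\dots,\psi_s)$ is the Jacobi sum over $\mathbb{F}_q$ defined by

    $$J_q^0(\psi_1,\dots,\psi_s)=\sum_{\substack{c_1+\cdots+c_s=0\\ c_i\in\mathbb{F}_q}}\psi_1(c_1)\cdots\psi_s(c_s).$$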

    Although explicit formulas for $N_q(f=b)$ are difficult to obtain in general, the problem has been studied extensively because of its theoretical importance and its applications in cryptology and coding theory; see [3,4,5,6,7,8,9]. In this paper, we use Jacobi sums, Gauss sums and results on quadratic forms to deduce a formula for the number of $\mathbb{F}_{q^2}$-rational points on a class of hypersurfaces over $\mathbb{F}_{q^2}$ under certain conditions. The main result of this paper can be stated as follows.

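    Theorem 1.1. Let $q=2^r$ with $r\in\mathbb{Z}^+$ and let $\mathbb{F}_{q^2}$ be the finite field of $q^2$ elements. Let $f(X)=a_1x_1^{m_1}+a_2x_2^{m_2}+\cdots+a_sx_s^{m_s}$, $g(Y)=y_1y_2+y_3y_4+\cdots+y_{n-1}y_n+y_{n-2t-1}^2+\cdots+y_{n-3}^2+y_{n-1}^2+b_ty_{n-2t}^2+\cdots+b_1y_{n-2}^2+b_0y_n^2$, and $l(X,Y)=f(X)+g(Y)$, where $a_i,b_j\in\mathbb{F}_{q^2}^*$, $m_i\ne1$, $(m_i,m_k)=1$ for $i\ne k$, $m_i\mid(q+1)$, $m_i\in\mathbb{Z}^+$, $2\mid n$, $n>2$, $0\le t\le\frac{n}{2}-2$, $\mathrm{Tr}_{\mathbb{F}_{q^2}/\mathbb{F}_2}(b_j)=1$ for $i,k=1,\dots,s$ and $j=0,1,\dots,t$. For $h\in\mathbb{F}_{q^2}$, we have

    (1) If $h=0$, then

    $$N_{q^2}(l(X,Y)=0)=q^{2(s+n-1)}+\sum_{\gamma\in\mathbb{F}_{q^2}^*}\left(\prod_{i=1}^s\left(\left(\frac{\gamma}{a_i}\right)_{m_i}m_i-1\right)\left(q^{s+2n-3}+(-1)^tq^{s+n-3}\right)\right).$$

    (2) If $h\in\mathbb{F}_{q^2}^*$, then

    $$N_{q^2}(l(X,Y)=h)=q^{2(s+n-1)}+\left(q^{s+2n-3}+(-1)^{t+1}(q^2-1)q^{s+n-3}\right)\prod_{i=1}^s\left(\left(\frac{h}{a_i}\right)_{m_i}m_i-1\right)+\sum_{\gamma\in\mathbb{F}_{q^2}^*\setminus\{h\}}\left[\prod_{i=1}^s\left(\left(\frac{\gamma}{a_i}\right)_{m_i}m_i-1\right)\left(q^{2n+s-3}+(-1)^tq^{n+s-3}\right)\right].$$

    Here,

    $$\left(\frac{\gamma}{a_i}\right)_{m_i}=\begin{cases}1,&\text{if }\gamma/a_i\text{ is a residue of order }m_i,\\0,&\text{otherwise.}\end{cases}$$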

    To prove Theorem 1.1, we need the lemmas and theorems below which are related to the Jacobi sums and Gauss sums.

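    Definition 2.1. Let $\chi$ be an additive character and $\psi$ a multiplicative character of $\mathbb{F}_q$. The Gauss sum $G_q(\psi,\chi)$ in $\mathbb{F}_q$ is defined by

    $$G_q(\psi,\chi)=\sum_{x\in\mathbb{F}_q^*}\psi(x)\chi(x).$$

    In particular, if $\chi$ is the canonical additive character, i.e., $\chi(x)=e^{2\pi i\,\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(x)/p}$, where $\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(y)=y+y^p+\cdots+y^{p^{r-1}}$ is the absolute trace of $y$ from $\mathbb{F}_q$ to $\mathbb{F}_p$, we simply write $G_q(\psi):=G_q(\psi,\chi)$.

    Let $\psi$ be a multiplicative character of $\mathbb{F}_q$, which is defined for all nonzero elements of $\mathbb{F}_q$. We extend the definition of $\psi$ by setting $\psi(0)=0$ if $\psi\ne\varepsilon$ and $\varepsilon(0)=1$.

    Definition 2.2. Let $\psi_1,\dots,\psi_s$ be $s$ multiplicative characters of $\mathbb{F}_q$. Then $J_q(\psi_1,\dots,\psi_s)$ is the Jacobi sum over $\mathbb{F}_q$ defined by

    $$J_q(\psi_1,\dots,\psi_s)=\sum_{\substack{c_1+\cdots+c_s=1\\ c_i\in\mathbb{F}_q}}\psi_1(c_1)\cdots\psi_s(c_s).$$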

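    The Jacobi sums $J_q(\psi_1,\dots,\psi_s)$ as well as the sums $J_q^0(\psi_1,\dots,\psi_s)$ can be evaluated easily when some of the multiplicative characters $\psi_i$ are trivial.

    Lemma 2.3. ([10, Theorem 5.19, p. 206]) If the multiplicative characters $\psi_1,\dots,\psi_s$ of $\mathbb{F}_q$ are all trivial, then

    $$J_q(\psi_1,\dots,\psi_s)=J_q^0(\psi_1,\dots,\psi_s)=q^{s-1}.$$

    If some, but not all, of the $\psi_i$ are trivial, then

    $$J_q(\psi_1,\dots,\psi_s)=J_q^0(\psi_1,\dots,\psi_s)=0.$$

    Lemma 2.4. ([10, Theorem 5.20, p. 206]) If $\psi_1,\dots,\psi_s$ are multiplicative characters of $\mathbb{F}_q$ with $\psi_s$ nontrivial, then

    $$J_q^0(\psi_1,\dots,\psi_s)=0$$

    if $\psi_1\cdots\psi_s$ is nontrivial and

    $$J_q^0(\psi_1,\dots,\psi_s)=\psi_s(-1)(q-1)J_q(\psi_1,\dots,\psi_{s-1})$$

    if $\psi_1\cdots\psi_s$ is trivial.

    If all $\psi_i$ are nontrivial, there exists an important connection between Jacobi sums and Gauss sums.

    Lemma 2.5. ([10, Theorem 5.21, p. 207]) If $\psi_1,\dots,\psi_s$ are nontrivial multiplicative characters of $\mathbb{F}_q$ and $\chi$ is a nontrivial additive character of $\mathbb{F}_q$, then

    $$J_q(\psi_1,\dots,\psi_s)=\frac{G_q(\psi_1,\chi)\cdots G_q(\psi_s,\chi)}{G_q(\psi_1\cdots\psi_s,\chi)}$$

    if $\psi_1\cdots\psi_s$ is nontrivial and

    $$J_q(\psi_1,\dots,\psi_s)=-\psi_s(-1)J_q(\psi_1,\dots,\psi_{s-1})=-\frac{1}{q}G_q(\psi_1,\chi)\cdots G_q(\psi_s,\chi)$$

    if $\psi_1\cdots\psi_s$ is trivial.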

    We turn to another special formula for Gauss sums, which applies to a wider range of multiplicative characters but needs a restriction on the underlying field.

    Lemma 2.6. ([10, Theorem 5.16, p. 202]) Let $q$ be a prime power and let $\psi$ be a nontrivial multiplicative character of $\mathbb{F}_{q^2}$ of order $m$ dividing $q+1$. Then

    $$G_{q^2}(\psi)=\begin{cases}q,&\text{if }m\text{ odd or }\frac{q+1}{m}\text{ even},\\-q,&\text{if }m\text{ even and }\frac{q+1}{m}\text{ odd}.\end{cases}$$
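    As an illustration of Lemma 2.6, the following minimal sketch evaluates the Gauss sum numerically in the smallest case $q=2$, where $\mathbb{F}_{q^2}=\mathbb{F}_4$ and $m=3$ is odd, so the lemma predicts $G_4(\psi)=q=2$. The encoding of $\mathbb{F}_4$ as the integers 0 to 3 (bit patterns of polynomials modulo $x^2+x+1$) and the helper names `gf4_mul`, `gf4_trace`, `psi`, `chi` and `gauss_sum` are assumptions made for this example only.

```python
import cmath

def gf4_mul(a, b):
    """Multiply two elements of F_4, encoded as 2-bit polynomials mod x^2 + x + 1."""
    p = 0
    for i in range(2):
        if (b >> i) & 1:
            p ^= a << i          # carry-less (XOR) polynomial multiplication
    if p & 0b100:                # reduce the x^2 term using x^2 = x + 1
        p ^= 0b111
    return p

def gf4_trace(x):
    """Absolute trace Tr(x) = x + x^2 from F_4 to F_2 (always 0 or 1)."""
    return x ^ gf4_mul(x, x)

OMEGA = 2                                        # a generator of F_4^*
LOG = {1: 0, OMEGA: 1, gf4_mul(OMEGA, OMEGA): 2}  # discrete logarithms base omega

def psi(x, power=1):
    """Character psi^power, where psi(omega^k) = exp(2*pi*i*k/3) has order 3."""
    return cmath.exp(2j * cmath.pi * power * LOG[x] / 3)

def chi(x):
    """Canonical additive character of F_4: chi(x) = (-1)^Tr(x)."""
    return (-1) ** gf4_trace(x)

def gauss_sum(power=1):
    """G(psi^power) = sum over nonzero x of psi^power(x) * chi(x)."""
    return sum(psi(x, power) * chi(x) for x in (1, 2, 3))

for power in (1, 2):
    print(power, gauss_sum(power))
```

    Both nontrivial characters of order 3 give a value equal to 2 up to floating-point rounding, in agreement with the first case of the lemma.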

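    For $h\in\mathbb{F}_{q^2}$, define $v(h)=-1$ if $h\in\mathbb{F}_{q^2}^*$ and $v(0)=q^2-1$. The properties of the function $v(h)$ will be used in the later proofs.

    Lemma 2.7. ([10, Lemma 6.23, p. 281]) For any finite field $\mathbb{F}_q$, we have

    $$\sum_{c\in\mathbb{F}_q}v(c)=0,$$

    and, for any $b\in\mathbb{F}_q$,

    $$\sum_{c_1+\cdots+c_m=b}v(c_1)\cdots v(c_k)=\begin{cases}0,&1\le k<m,\\v(b)q^{m-1},&k=m,\end{cases}$$

    where the sum is over all $c_1,\dots,c_m\in\mathbb{F}_q$ with $c_1+\cdots+c_m=b$.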
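    The identity in Lemma 2.7 can be checked by direct enumeration. The sketch below does this over the prime field $\mathbb{F}_5$ with $m=3$, taking $v(0)=q-1$ and $v(c)=-1$ for $c\ne0$ as in the cited lemma; the field size, the value of $m$ and the function names `v` and `v_sum` are choices made for this illustration, and the code assumes $q$ is prime so that field addition is addition modulo $q$.

```python
from itertools import product

def v(c, q):
    """v(c) = q - 1 if c == 0 and -1 otherwise, for c in F_q."""
    return q - 1 if c == 0 else -1

def v_sum(q, m, k, b):
    """Sum of v(c_1)...v(c_k) over all (c_1, ..., c_m) in F_q^m with c_1 + ... + c_m = b.

    Valid only for prime q, where addition in F_q is addition modulo q.
    """
    total = 0
    for cs in product(range(q), repeat=m):
        if sum(cs) % q == b:
            term = 1
            for c in cs[:k]:
                term *= v(c, q)
            total += term
    return total

q, m = 5, 3
for b in range(q):
    for k in range(1, m + 1):
        expected = v(b, q) * q ** (m - 1) if k == m else 0
        assert v_sum(q, m, k, b) == expected
print("Lemma 2.7 verified for q = 5, m = 3")
```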

    Quadratic forms have been studied intensively. For any finite field $\mathbb{F}_q$, two quadratic forms $f$ and $g$ over $\mathbb{F}_q$ are called equivalent if $f$ can be transformed into $g$ by means of a nonsingular linear substitution of indeterminates. A quadratic form $f$ in $n$ indeterminates is called nondegenerate if $f$ is not equivalent to a quadratic form in fewer than $n$ indeterminates.

    Lemma 2.8. ([10, Theorem 6.30, p. 287]) Let $f\in\mathbb{F}_q[x_1,\dots,x_n]$, $q$ even, be a nondegenerate quadratic form. If $n$ is even, then $f$ is either equivalent to

    $$x_1x_2+x_3x_4+\cdots+x_{n-1}x_n$$

    or to a quadratic form of the type

    $$x_1x_2+x_3x_4+\cdots+x_{n-1}x_n+x_{n-1}^2+ax_n^2,$$

    where $a\in\mathbb{F}_q$ satisfies $\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(a)=1$.

    Lemma 2.9. ([10, Corollary 3.79, p. 127]) Let $a\in\mathbb{F}_q$ and let $p$ be the characteristic of $\mathbb{F}_q$. The trinomial $x^p-x-a$ is irreducible in $\mathbb{F}_q[x]$ if and only if $\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(a)\ne0$.

    Lemma 2.10. ([10, Lemma 6.31, p. 288]) For even $q$, let $a\in\mathbb{F}_q$ with $\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(a)=1$ and $b\in\mathbb{F}_q$. Then

    $$N_q(x_1^2+x_1x_2+ax_2^2=b)=q-v(b).$$

    Lemma 2.11. ([10, Theorem 6.32, p. 288]) Let $\mathbb{F}_q$ be a finite field with $q$ even and let $b\in\mathbb{F}_q$. Then for even $n$, the number of solutions of the equation

    $$x_1x_2+x_3x_4+\cdots+x_{n-1}x_n=b$$

    in $\mathbb{F}_q^n$ is $q^{n-1}+v(b)q^{(n-2)/2}$. For even $n$ and $a\in\mathbb{F}_q$ with $\mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}(a)=1$, the number of solutions of the equation

    $$x_1x_2+x_3x_4+\cdots+x_{n-1}x_n+x_{n-1}^2+ax_n^2=b$$

    in $\mathbb{F}_q^n$ is $q^{n-1}-v(b)q^{(n-2)/2}$.

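    Lemma 2.12. Let $q=2^r$ and $h\in\mathbb{F}_{q^2}$. Let $g(Y)\in\mathbb{F}_{q^2}[y_1,y_2,\dots,y_n]$ be a polynomial of the form

    $$g(Y)=y_1y_2+y_3y_4+\cdots+y_{n-1}y_n+y_{n-2t-1}^2+\cdots+y_{n-3}^2+y_{n-1}^2+b_ty_{n-2t}^2+\cdots+b_1y_{n-2}^2+b_0y_n^2,$$

    where $b_j\in\mathbb{F}_{q^2}^*$, $2\mid n$, $n>2$, $0\le t\le\frac{n}{2}-2$, $\mathrm{Tr}_{\mathbb{F}_{q^2}/\mathbb{F}_2}(b_j)=1$, $j=0,1,\dots,t$. Then

    $$N_{q^2}(g(Y)=h)=q^{2(n-1)}+(-1)^{t+1}q^{n-2}v(h). \tag{2.1}$$

    Proof. We provide two proofs here. The first proof is as follows. Let $q_1=q^2$. Then by Lemmas 2.10 and 2.11, the number of solutions of $g(Y)=h$ in $\mathbb{F}_{q^2}^n$ can be deduced as

    $$\begin{aligned}
    N_{q^2}(g(Y)=h)&=\sum_{c_1+c_2+\cdots+c_{t+2}=h}N_{q^2}(y_1y_2+y_3y_4+\cdots+y_{n-2t-3}y_{n-2t-2}=c_1)\,N_{q^2}(y_{n-2t-1}y_{n-2t}+y_{n-2t-1}^2+b_ty_{n-2t}^2=c_2)\cdots N_{q^2}(y_{n-1}y_n+y_{n-1}^2+b_0y_n^2=c_{t+2})\\
    &=\sum_{c_1+c_2+\cdots+c_{t+2}=h}\left(q_1^{n-2t-3}+v(c_1)q_1^{(n-2t-4)/2}\right)(q_1-v(c_2))\cdots(q_1-v(c_{t+2}))\\
    &=\sum_{c_1+c_2+\cdots+c_{t+2}=h}\left(q_1^{n-2t-2}+v(c_1)q_1^{(n-2t-2)/2}-v(c_2)q_1^{n-2t-3}-v(c_1)v(c_2)q_1^{(n-2t-4)/2}\right)(q_1-v(c_3))\cdots(q_1-v(c_{t+2}))\\
    &=\sum_{c_1+c_2+\cdots+c_{t+2}=h}\left(q_1^{n-t-2}+v(c_1)q_1^{(n-2)/2}-v(c_2)q_1^{n-t-3}+\cdots+(-1)^{t+1}v(c_1)v(c_2)\cdots v(c_{t+2})q_1^{(n-2t-4)/2}\right)\\
    &=q_1^{n-1}+q_1^{(n-2)/2}\sum_{c_1\in\mathbb{F}_{q^2}}v(c_1)+\cdots+(-1)^{t+1}\sum_{c_1+c_2+\cdots+c_{t+2}=h}v(c_1)v(c_2)\cdots v(c_{t+2})q_1^{(n-2t-4)/2}.
    \end{aligned} \tag{2.2}$$

    By Lemma 2.7 and (2.2), every sum in which some $v(c_i)$ is missing vanishes, and we have

    $$N_{q^2}(g(Y)=h)=q_1^{n-1}+(-1)^{t+1}v(h)q_1^{(n-2)/2}=q^{2(n-1)}+(-1)^{t+1}v(h)q^{n-2}.$$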

    Next we give the second proof. Note that if $f$ and $g$ are equivalent, then for any $b\in\mathbb{F}_{q^2}$ the equations $f(x_1,\dots,x_n)=b$ and $g(x_1,\dots,x_n)=b$ have the same number of solutions in $\mathbb{F}_{q^2}^n$. So we can get the number of solutions of $g(Y)=h$ for $h\in\mathbb{F}_{q^2}$ by means of a nonsingular linear substitution of indeterminates.

    Let $k(X)\in\mathbb{F}_{q^2}[x_1,x_2,x_3,x_4]$ and $k(X)=x_1x_2+x_1^2+Ax_2^2+x_3x_4+x_3^2+Bx_4^2$, where $\mathrm{Tr}_{\mathbb{F}_{q^2}/\mathbb{F}_2}(A)=\mathrm{Tr}_{\mathbb{F}_{q^2}/\mathbb{F}_2}(B)=1$. We first show that $k(X)$ is equivalent to $x_1x_2+x_3x_4$.

    Let $x_3=y_1+y_3$ and $x_i=y_i$ for $i\ne3$; then $k(X)$ is equivalent to $y_1y_2+y_1y_4+y_3y_4+Ay_2^2+y_3^2+By_4^2$.

    Let $y_2=z_2+z_4$ and $y_i=z_i$ for $i\ne2$; then $k(X)$ is equivalent to $z_1z_2+z_3z_4+Az_2^2+z_3^2+Az_4^2+Bz_4^2$.

    Let $z_1=\alpha_1+A\alpha_2$ and $z_i=\alpha_i$ for $i\ne1$; then $k(X)$ is equivalent to $\alpha_1\alpha_2+\alpha_3^2+\alpha_3\alpha_4+(A+B)\alpha_4^2$.

    Since $\mathrm{Tr}_{\mathbb{F}_{q^2}/\mathbb{F}_2}(A+B)=0$, the form $\alpha_3^2+\alpha_3\alpha_4+(A+B)\alpha_4^2$ is reducible by Lemma 2.9. Then $k(X)$ is equivalent to $x_1x_2+x_3x_4$. It follows that if $t$ is odd, then $g(Y)$ is equivalent to $x_1x_2+x_3x_4+\cdots+x_{n-1}x_n$, and if $t$ is even, then $g(Y)$ is equivalent to $x_1x_2+x_3x_4+\cdots+x_{n-1}x_n+x_{n-1}^2+ax_n^2$ with $\mathrm{Tr}_{\mathbb{F}_{q^2}/\mathbb{F}_2}(a)=1$. By Lemma 2.11, we get the desired result.
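    Formula (2.1) can be sanity-checked by brute force in the smallest case. The sketch below takes $q=2$ (so $\mathbb{F}_{q^2}=\mathbb{F}_4$), $n=4$, $t=0$ and $b_0=\omega$, a root of $x^2+x+1$ with trace 1, and counts the solutions of $g(Y)=y_1y_2+y_3y_4+y_3^2+\omega y_4^2=h$ for every $h$. The integer encoding of $\mathbb{F}_4$ and the names `gf4_mul`, `g`, `count_g` and `lemma_2_12` are assumptions of this example, not notation from the text.

```python
from itertools import product

def gf4_mul(a, b):
    """Multiply two elements of F_4, encoded as 2-bit polynomials mod x^2 + x + 1."""
    p = 0
    for i in range(2):
        if (b >> i) & 1:
            p ^= a << i          # carry-less (XOR) polynomial multiplication
    if p & 0b100:                # reduce the x^2 term using x^2 = x + 1
        p ^= 0b111
    return p

OMEGA = 2  # omega + omega^2 = 1, so Tr(omega) = 1 as the lemma requires

def g(y1, y2, y3, y4):
    """g(Y) = y1*y2 + y3*y4 + y3^2 + omega*y4^2 over F_4 (addition is XOR)."""
    return (gf4_mul(y1, y2) ^ gf4_mul(y3, y4)
            ^ gf4_mul(y3, y3) ^ gf4_mul(OMEGA, gf4_mul(y4, y4)))

def count_g():
    """counts[h] = number of Y in F_4^4 with g(Y) = h."""
    counts = [0] * 4
    for y in product(range(4), repeat=4):
        counts[g(*y)] += 1
    return counts

def lemma_2_12(q, n, t, h):
    """Right-hand side of (2.1), with v(0) = q^2 - 1 and v(h) = -1 for h != 0."""
    v = q * q - 1 if h == 0 else -1
    return q ** (2 * (n - 1)) + (-1) ** (t + 1) * q ** (n - 2) * v

counts = count_g()
predicted = [lemma_2_12(2, 4, 0, h) for h in range(4)]
assert counts == predicted
print(counts)
```

    The enumeration visits only $4^4=256$ points, so it runs instantly; larger fields would need a more careful field implementation than this two-bit encoding.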

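    From (1.1), the number of solutions of $f(X)=0$ over $\mathbb{F}_{q^2}$ is

    $$N_{q^2}(f(X)=0)=q^{2(s-1)}+\sum_{j_1=1}^{d_1-1}\cdots\sum_{j_s=1}^{d_s-1}\bar\psi_1^{j_1}(a_1)\cdots\bar\psi_s^{j_s}(a_s)J_{q^2}^0(\psi_1^{j_1},\dots,\psi_s^{j_s}),$$

    where $d_i=(m_i,q^2-1)$ and $\psi_i$ is a multiplicative character of $\mathbb{F}_{q^2}$ of order $d_i$. Since $m_i\mid q+1$, we have $d_i=m_i$. Let $H=\{(j_1,\dots,j_s)\mid 1\le j_i<m_i,\ 1\le i\le s\}$. Since $(m_i,m_j)=1$ for $i\ne j$, the product $\psi_1^{j_1}\cdots\psi_s^{j_s}$ is nontrivial for any $(j_1,\dots,j_s)\in H$. By Lemma 2.4, we have $J_{q^2}^0(\psi_1^{j_1},\dots,\psi_s^{j_s})=0$ and hence $N_{q^2}(f(X)=0)=q^{2(s-1)}$.

    Let $N_{q^2}(f(X)=c)$ denote the number of solutions of the equation $f(X)=c$ over $\mathbb{F}_{q^2}$ with $c\in\mathbb{F}_{q^2}^*$. Let $V=\{(j_1,\dots,j_s)\mid 0\le j_i<m_i,\ 1\le i\le s\}$. Then

    $$\begin{aligned}
    N_{q^2}(f(X)=c)&=\sum_{\gamma_1+\cdots+\gamma_s=c}N_{q^2}(a_1x_1^{m_1}=\gamma_1)\cdots N_{q^2}(a_sx_s^{m_s}=\gamma_s)\\
    &=\sum_{\gamma_1+\cdots+\gamma_s=c}\ \sum_{j_1=0}^{m_1-1}\psi_1^{j_1}\!\left(\frac{\gamma_1}{a_1}\right)\cdots\sum_{j_s=0}^{m_s-1}\psi_s^{j_s}\!\left(\frac{\gamma_s}{a_s}\right).
    \end{aligned}$$

    Since $\psi_i$ is a multiplicative character of $\mathbb{F}_{q^2}$ of order $m_i$, we have

    $$\begin{aligned}
    N_{q^2}(f(X)=c)&=\sum_{\frac{\gamma_1}{c}+\cdots+\frac{\gamma_s}{c}=1}\ \sum_{(j_1,\dots,j_s)\in V}\psi_1^{j_1}\!\left(\frac{\gamma_1}{c}\right)\psi_1^{j_1}\!\left(\frac{c}{a_1}\right)\cdots\psi_s^{j_s}\!\left(\frac{\gamma_s}{c}\right)\psi_s^{j_s}\!\left(\frac{c}{a_s}\right)\\
    &=\sum_{(j_1,\dots,j_s)\in V}\psi_1^{j_1}\!\left(\frac{c}{a_1}\right)\cdots\psi_s^{j_s}\!\left(\frac{c}{a_s}\right)\sum_{\frac{\gamma_1}{c}+\cdots+\frac{\gamma_s}{c}=1}\psi_1^{j_1}\!\left(\frac{\gamma_1}{c}\right)\cdots\psi_s^{j_s}\!\left(\frac{\gamma_s}{c}\right)\\
    &=\sum_{(j_1,\dots,j_s)\in V}\psi_1^{j_1}\!\left(\frac{c}{a_1}\right)\cdots\psi_s^{j_s}\!\left(\frac{c}{a_s}\right)J_{q^2}(\psi_1^{j_1},\dots,\psi_s^{j_s}).
    \end{aligned}$$

    By Lemma 2.3,

    $$N_{q^2}(f(X)=c)=q^{2(s-1)}+\sum_{(j_1,\dots,j_s)\in H}\psi_1^{j_1}\!\left(\frac{c}{a_1}\right)\cdots\psi_s^{j_s}\!\left(\frac{c}{a_s}\right)J_{q^2}(\psi_1^{j_1},\dots,\psi_s^{j_s}).$$

    By Lemma 2.5,

    $$J_{q^2}(\psi_1^{j_1},\dots,\psi_s^{j_s})=\frac{G_{q^2}(\psi_1^{j_1})\cdots G_{q^2}(\psi_s^{j_s})}{G_{q^2}(\psi_1^{j_1}\cdots\psi_s^{j_s})}.$$

    Since $m_i\mid q+1$ and $2\nmid m_i$, by Lemma 2.6, we have

    $$G_{q^2}(\psi_1^{j_1})=\cdots=G_{q^2}(\psi_s^{j_s})=G_{q^2}(\psi_1^{j_1}\cdots\psi_s^{j_s})=q.$$

    Then

    $$\begin{aligned}
    N_{q^2}(f(X)=c)&=q^{2(s-1)}+q^{s-1}\sum_{j_1=1}^{m_1-1}\psi_1^{j_1}\!\left(\frac{c}{a_1}\right)\cdots\sum_{j_s=1}^{m_s-1}\psi_s^{j_s}\!\left(\frac{c}{a_s}\right)\\
    &=q^{2(s-1)}+q^{s-1}\left(\sum_{j_1=0}^{m_1-1}\psi_1^{j_1}\!\left(\frac{c}{a_1}\right)-1\right)\cdots\left(\sum_{j_s=0}^{m_s-1}\psi_s^{j_s}\!\left(\frac{c}{a_s}\right)-1\right).
    \end{aligned}$$

    It follows that

    $$N_{q^2}(f(X)=c)=q^{2(s-1)}+q^{s-1}\prod_{i=1}^s\left(\left(\frac{c}{a_i}\right)_{m_i}m_i-1\right), \tag{3.1}$$

    where

    $$\left(\frac{c}{a_i}\right)_{m_i}=\begin{cases}1,&\text{if }c/a_i\text{ is a residue of order }m_i,\\0,&\text{otherwise.}\end{cases}$$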

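    For a given $h\in\mathbb{F}_{q^2}$, we discuss two cases according to whether $h$ is zero or not.

    Case 1: $h=0$. If $f(X)=0$, then $g(Y)=0$; if $f(X)\ne0$, then $g(Y)\ne0$. Then

    $$\begin{aligned}
    N_{q^2}(l(X,Y)=0)&=\sum_{c_1+c_2=0}N_{q^2}(f(X)=c_1)N_{q^2}(g(Y)=c_2)\\
    &=q^{2(s-1)}\left(q^{2(n-1)}+(-1)^{t+1}(q^2-1)q^{n-2}\right)+\sum_{\substack{c_1+c_2=0\\ c_1,c_2\in\mathbb{F}_{q^2}^*}}N_{q^2}(f(X)=c_1)N_{q^2}(g(Y)=c_2).
    \end{aligned} \tag{3.2}$$

    By Lemma 2.12, (3.1) and (3.2), we have

    $$\begin{aligned}
    N_{q^2}(l(X,Y)=0)&=q^{2(s+n-2)}+(-1)^{t+1}q^{2(s-1)+n}-(-1)^{t+1}q^{2(s-2)+n}\\
    &\quad+\sum_{c_1\in\mathbb{F}_{q^2}^*}\left[q^{2(s+n-2)}-(-1)^{t+1}q^{2(s-2)+n}+\prod_{i=1}^s\left(\left(\frac{c_1}{a_i}\right)_{m_i}m_i-1\right)\left(q^{2n+s-3}-(-1)^{t+1}q^{n+s-3}\right)\right]\\
    &=q^{2(s+n-2)}+(-1)^{t+1}q^{2(s-1)+n}-(-1)^{t+1}q^{2(s-2)+n}+q^{2(s+n-1)}-(-1)^{t+1}q^{2(s-1)+n}-q^{2(s+n-2)}+(-1)^{t+1}q^{2(s-2)+n}\\
    &\quad+\sum_{c_1\in\mathbb{F}_{q^2}^*}\left[\prod_{i=1}^s\left(\left(\frac{c_1}{a_i}\right)_{m_i}m_i-1\right)\left(q^{2n+s-3}-(-1)^{t+1}q^{n+s-3}\right)\right]\\
    &=q^{2(s+n-1)}+\sum_{c_1\in\mathbb{F}_{q^2}^*}\left[\prod_{i=1}^s\left(\left(\frac{c_1}{a_i}\right)_{m_i}m_i-1\right)\left(q^{2n+s-3}-(-1)^{t+1}q^{n+s-3}\right)\right].
    \end{aligned} \tag{3.3}$$

    Case 2: $h\in\mathbb{F}_{q^2}^*$. If $f(X)=h$, then $g(Y)=0$; if $f(X)=0$, then $g(Y)=h$; if $f(X)\notin\{0,h\}$, then $g(Y)\notin\{0,h\}$. So we have

    $$\begin{aligned}
    N_{q^2}(l(X,Y)=h)&=\sum_{c_1+c_2=h}N_{q^2}(f(X)=c_1)N_{q^2}(g(Y)=c_2)\\
    &=N_{q^2}(f(X)=0)N_{q^2}(g(Y)=h)+N_{q^2}(f(X)=h)N_{q^2}(g(Y)=0)+\sum_{\substack{c_1+c_2=h\\ c_1,c_2\in\mathbb{F}_{q^2}^*\setminus\{h\}}}N_{q^2}(f(X)=c_1)N_{q^2}(g(Y)=c_2).
    \end{aligned} \tag{3.4}$$

    By Lemma 2.12, (3.1) and (3.4),

    $$\begin{aligned}
    N_{q^2}(l(X,Y)=h)&=2q^{2(s+n-2)}+(-1)^{t+1}q^{2s+n-2}-(-1)^{t+1}2q^{2s+n-4}+\left(q^{s+2n-3}+(-1)^{t+1}(q^2-1)q^{s+n-3}\right)\prod_{i=1}^s\left(\left(\frac{h}{a_i}\right)_{m_i}m_i-1\right)\\
    &\quad+\sum_{c_1\in\mathbb{F}_{q^2}^*\setminus\{h\}}\left[q^{2(s+n-2)}-(-1)^{t+1}q^{2s+n-4}+\prod_{i=1}^s\left(\left(\frac{c_1}{a_i}\right)_{m_i}m_i-1\right)\left(q^{2n+s-3}-(-1)^{t+1}q^{n+s-3}\right)\right].
    \end{aligned}$$

    Since the sum runs over $q^2-2$ values of $c_1$, the terms that do not involve a residue symbol add up to $q^{2(s+n-1)}$, and it follows that

    $$\begin{aligned}
    N_{q^2}(l(X,Y)=h)&=q^{2(s+n-1)}+\left(q^{s+2n-3}+(-1)^{t+1}(q^2-1)q^{s+n-3}\right)\prod_{i=1}^s\left(\left(\frac{h}{a_i}\right)_{m_i}m_i-1\right)\\
    &\quad+\sum_{c_1\in\mathbb{F}_{q^2}^*\setminus\{h\}}\left[\prod_{i=1}^s\left(\left(\frac{c_1}{a_i}\right)_{m_i}m_i-1\right)\left(q^{2n+s-3}+(-1)^tq^{n+s-3}\right)\right].
    \end{aligned} \tag{3.5}$$

    By (3.3) and (3.5), we get the desired result. The proof of Theorem 1.1 is complete.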
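    The two formulas of Theorem 1.1 can be compared with an exhaustive count in the smallest admissible case. The sketch below assumes $q=2$, so that $\mathbb{F}_{q^2}=\mathbb{F}_4$ and $q+1=3$ forces $s=1$, $m_1=3$; it takes $n=4$, $t=0$, $b_0=\omega$ and $l(X,Y)=a x^3+y_1y_2+y_3y_4+y_3^2+\omega y_4^2$ for each $a\in\mathbb{F}_4^*$. The encoding of $\mathbb{F}_4$ as the integers 0 to 3 and all function names are choices made for this illustration.

```python
from itertools import product

def gf4_mul(a, b):
    """Multiply two elements of F_4, encoded as 2-bit polynomials mod x^2 + x + 1."""
    p = 0
    for i in range(2):
        if (b >> i) & 1:
            p ^= a << i          # carry-less (XOR) polynomial multiplication
    if p & 0b100:                # reduce the x^2 term using x^2 = x + 1
        p ^= 0b111
    return p

def gf4_inv(a):
    """Inverse of a nonzero element of F_4, found by search."""
    return next(b for b in (1, 2, 3) if gf4_mul(a, b) == 1)

OMEGA = 2                                                  # Tr(omega) = 1
CUBES = {gf4_mul(x, gf4_mul(x, x)) for x in (1, 2, 3)}     # nonzero cubes in F_4

def l(a, x, y1, y2, y3, y4):
    """l(X, Y) = a*x^3 + y1*y2 + y3*y4 + y3^2 + omega*y4^2 (addition is XOR)."""
    f = gf4_mul(a, gf4_mul(x, gf4_mul(x, x)))
    g = (gf4_mul(y1, y2) ^ gf4_mul(y3, y4)
         ^ gf4_mul(y3, y3) ^ gf4_mul(OMEGA, gf4_mul(y4, y4)))
    return f ^ g

def brute_force(a):
    """counts[h] = number of points of F_4^5 with l = h."""
    counts = [0] * 4
    for pt in product(range(4), repeat=5):
        counts[l(a, *pt)] += 1
    return counts

def theorem_1_1(a, h, q=2, s=1, n=4, t=0, m=3):
    """Right-hand side of Theorem 1.1 for the single-term f(X) = a*x^3."""
    def factor(c):
        # (c/a)_m * m - 1, where (c/a)_m = 1 exactly when c/a is a cube
        residue = 1 if gf4_mul(c, gf4_inv(a)) in CUBES else 0
        return residue * m - 1
    sign = (-1) ** t
    main = q ** (2 * (s + n - 1))
    tail = q ** (2 * n + s - 3) + sign * q ** (n + s - 3)
    if h == 0:
        return main + sum(factor(c) * tail for c in (1, 2, 3))
    head = q ** (s + 2 * n - 3) - sign * (q * q - 1) * q ** (s + n - 3)
    return main + head * factor(h) + sum(factor(c) * tail for c in (1, 2, 3) if c != h)

for a in (1, 2, 3):
    assert brute_force(a) == [theorem_1_1(a, h) for h in range(4)]
print(brute_force(1))
```

    The check covers only this one small parameter set; it is meant as a consistency test of the reconstructed formulas, not as evidence for the general statement.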

    There is a direct corollary of Theorem 1.1 and we omit its proof.

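    Corollary 4.1. Under the conditions of Theorem 1.1, if $a_1=\cdots=a_s=h\in\mathbb{F}_{q^2}^*$, then we have

    $$N_{q^2}(l(X,Y)=h)=q^{2(s+n-1)}+\left(q^{s+2n-3}+(-1)^{t+1}(q^2-1)q^{s+n-3}\right)\prod_{i=1}^s(m_i-1)+\sum_{\gamma\in\mathbb{F}_{q^2}^*\setminus\{h\}}\left[\prod_{i=1}^s\left(\left(\frac{\gamma}{h}\right)_{m_i}m_i-1\right)\left(q^{2n+s-3}+(-1)^tq^{n+s-3}\right)\right],$$

    where

    $$\left(\frac{\gamma}{h}\right)_{m_i}=\begin{cases}1,&\text{if }\gamma/h\text{ is a residue of order }m_i,\\0,&\text{otherwise.}\end{cases}$$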

    Finally, we give two examples to conclude the paper.

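    Example 4.2. Let $\mathbb{F}_{2^{10}}=\langle\alpha\rangle=\mathbb{F}_2[x]/(x^{10}+x^3+1)$, where $\alpha$ is a root of $x^{10}+x^3+1$. Suppose $l(X,Y)=\alpha^{33}x_1^3+x_2^{11}+y_3^2+\alpha^{10}y_4^2+y_1y_2+y_3y_4$. Clearly, $\mathrm{Tr}_{\mathbb{F}_{2^{10}}/\mathbb{F}_2}(\alpha^{10})=1$, $m_1=3$, $m_2=11$, $s=2$, $n=4$, $t=0$, $a_2=1$. By Theorem 1.1, we have

    $$N_{2^{10}}(l(X,Y)=0)=1024^5+(32^7+32^3)\times20=1126587102265344.$$

    Example 4.3. Let $\mathbb{F}_{2^{12}}=\langle\beta\rangle=\mathbb{F}_2[x]/(x^{12}+x^6+x^4+x+1)$, where $\beta$ is a root of $x^{12}+x^6+x^4+x+1$. Suppose $l(X,Y)=x_1^5+x_2^{13}+y_3^2+\beta^{10}y_4^2+y_1y_2+y_3y_4$. Clearly, $\mathrm{Tr}_{\mathbb{F}_{2^{12}}/\mathbb{F}_2}(\beta^{10})=1$, $m_1=5$, $m_2=13$, $s=2$, $n=4$, $t=0$, $a_1=a_2=1$. By Corollary 4.1, we have

    $$N_{2^{12}}(l(X,Y)=1)=2^{5\times12}+(64^7-64^3\times4095)\times48=1153132559312355328.$$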

    This work was jointly supported by the Natural Science Foundation of Fujian Province, China under Grant No. 2022J02046, Fujian Key Laboratory of Granular Computing and Applications (Minnan Normal University), Institute of Meteorological Big Data-Digital Fujian and Fujian Key Laboratory of Data Science and Statistics.

    The authors declare that there are no conflicts of interest.



    [1] R. Ruddon, Cancer Biology, Oxford University Press, 2007.
    [2] Y. Wang, S. Liang, Y. Tian, J. Zhao, W. Du, Y. Liang, et al., Using machine learning to measure relatedness between genes: a multi-features model, Sci. Rep., 9 (2019), 1-15.
    [3] S. Liang, A. Ma, S. Yang, Y. Wang, Q. Ma, A review of Matched-pairs feature selection methods for gene expression data analysis, Comput. Structur. Biotechnol. J., 16 (2018), 88-97.
    [4] A.W. Partin, J. Yoo, H. B. Carter, J. D. Pearson, D. W. Chan, J. I. Epstein, et al., The use of prostate specific antigen, clinical stage and Gleason score to predict pathological stage in men with localized prostate cancer, J. Urol., 150 (1993), 110-114.
    [5] M. Hollstein, D. Sidransky, B. Vogelstein, C. C. Harris, P53 mutations in human cancers, J. Sci., 253 (1991), 49-53.
    [6] K. E. Stuart, A. J. Anand, R. L. Jenkins, Hepatocellular carcinoma in the United States: prognostic features, treatment outcome, and survival, Cancer Interdiscipl. Int. J. Am. Cancer Soc., 77 (1996), 2217-2222.
    [7] P. Kuusela, C. Haglund, P. J. Roberts, Comparison of a new tumour marker CA 242 with CA 199, CA 50 and carcinoembryonic antigen (CEA) in digestive tract diseases, British J. Cancer, 63 (1991), 636-640.
    [8] J. Schneider, H. G. Velcovsky, H. Morr, N. Katz, K. Neu, E. Eigenbrodt, Comparison of the tumor markers tumor M2-PK, CEA, CYFRA 21-1, NSE and SCC in the diagnosis of lung cancer, Anticancer Res., 20 (2000), 5053-5058.
    [9] L. A. Cole, J. M. Sutton, Selecting an appropriate hCG test for managing gestational trophoblastic disease and cancer, J. Reproduct. Med., 49 (2004), 545-553.
    [10] J. A. Ludwig, J. N. Weinstein, Biomarkers in cancer staging, prognosis and treatment selection, Nat. Rev. Cancer, 5(2005), 845-856.
    [11] G. J. Rustin, M. Marples, A. E. Nelstrop, M. Mahmoudi, T. Meyer, Use of CA-125 to define progression of ovarian cancer in patients with persistently elevated levels, J. Clin. Oncol., 19 (2001), 4054-4057.
    [12] H. Zheng, R. C. Luo, Diagnostic value of combined detection of TPS, CA153 and CEA in breast cancer, J. First Milit. Med. Univers., 25 (2003), 1293.
    [13] H. Q. Zhang, R. B.Wang, H. J. Yan, W. Zhao, K. L. Zhu, S. M. Jiang, et al., Prognostic significance of CYFRA21-1, CEA and hemoglobin in patients with esophageal squamous cancer undergoing concurrent chemoradiotherapy, Asian Pacific J. Cancer Prevent., 13 (2012), 199-203.
    [14] A. Hsu, S. L. Tang, S. Halgamuge, An unsupervised hierarchical dynamic self-organising approach to cancer class discovery and marker gene identification in microarray data, Bioinformatics, 19 (2003), 2131-2140.
    [15] J. J. Liu, G. Cutler, W. Li, Z. Pan, S. Peng, T. Hoey, et al., Multiclass cancer classification and biomarker discovery using GA-based algorithms, Bioinformatics, 21 (2005), 2691-2697.
    [16] B. J. Beattie, P. N. Robinson, Binary state pattern clustering: A digital paradigm for class and biomarker discovery in gene microarray studies of cancer, J. Comput. Biol., 13 (2006), 1114-1130.
    [17] C. Harris, N. Ghaffari, Biomarker discovery across annotated and unannotated microarray datasets using semi-supervised learning, BMC Genomics, 9(2008), S7.
    [18] T. Abeel, T. Helleputte, Y. Van de Peer, P. Dupont, Y. Saeys, Robust biomarker identification for cancer diagnosis with ensemble feature selection methods, Bioinformatics, 26 (2010), 392-398.
    [19] L. Chen, J. Xuan, C. Wang, I. M. Shih, Y. Wang, Z. Zhang, et al., Knowledge-guided multi-scale independent component analysis for biomarker identification, BMC Bioinformatics, 9 (2008), 416.
    [20] J. Cui, Q. Liu, D. Puett, Y. Xu, Computational prediction of human proteins that can be secreted into the bloodstream, Bioinformatics, 24 (2008), 2370-2375.
    [21] J. Cui, Y. Chen, W. C. Chou, L. Sun, L. Chen, J. Suo, et al., An integrated transcriptomic and computational analysis for biomarker identification in gastric cancer, Nucleic Acids Res., 39 (2011),1197-1207.
    [22] C. S. Hong, J. Cui, Z. Ni, Y. Su, D. Puett, F. Li, et al., A computational method for prediction of excretory proteins and application to identification of gastric cancer markers in urine, PloS One, 6 (2011), e16875.
    [23] J. Wang, Y. Liang, Y. Wang, J. Cui, M. Liu, W. Du, et al., Computational prediction of human salivary proteins from blood circulation and application to diagnostic biomarker identification, PloS One, 8 (2013), e80211.
    [24] Y. Sun, W. Du, C. Zhou, Y. Zhou, Z. Cao, Y. Tian, et al., A Computational Method for Prediction of Saliva-Secretory Proteins and its Application to Identification of Head and Neck Cancer Biomarkers for Salivary Diagnosis, IEEE Transact. Nanobiosci., 14 (2015),167-174.
    [25] A. Ben-Hur, D. Horn, H. T. Siegelmann, V. Vapnik, A support vector method for clustering, Adv. Neural Inform. Process. Syst., 13 (2001), 367-373.
    [26] Y. Chen, Y. Zhang, Y. Yin, G. Gao, S. Li, Y. Jiang, et al., SPD-a web-based secreted protein database, Nucleic Acids Res., 33 (2005), D169-D173.
    [27] J. Sprenger, J. Lynn Fink, S. Karunaratne, K. Hanson, N. A. Hamilton, R. D. Teasdale, LOCATE: A mammalian protein subcellular localization database, Nucleic Acids Res., 36 (2007), D230-D233.
    [28] M. Magrane, Uniprot knowledgebase: A hub of integrated protein data, Database, 2011 (2011).
    [29] S. J. Li, M. Peng, H. Li, B. S. Liu, C. Wang, J. R. Wu, et al., Sys-bodyfluid: A systematical database for human body fluid proteome research, Nucleic Acids Res., 37 (2009), 907-912.
    [30] S. Hu, J. A. Loo, D. T. Wong, Human saliva proteome analysis and disease biomarker discovery, Expert Rev. Proteom., 4 (2007), 531-538.
    [31] P. Denny, F. K. Hagen, M. Hardt, L. Liao, W. Yan, M. Arellanno, et al., The proteomes of human parotid and submandibular/sublingual gland salivas collected as the ductal secretions, J. Proteom. Res., 7 (2008), 1994-2006.
    [32] S. El-Gebali, J. Mistry, A. Bateman, S. R. Eddy, A. Luciani, S. C. Potter, et al., The Pfam protein families database in 2019, Nucleic Acids Res., 47 (2019), D427-D432.
  • © 2020 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
