Research article

Unsupervised segmentation of images using bi-dimensional pairwise Markov chains model

  • Received: 30 July 2024 Revised: 25 September 2024 Accepted: 26 September 2024 Published: 31 October 2024
  • MSC : 62C10, 62H30

  • The pairwise Markov chain (PMC) model serves as an extension of the hidden Markov chain (HMC) model and has been widely used in unsupervised restoration tasks associated with reconstructing hidden data. In fact, the PMC model can handle fairly complicated situations in which the application of Bayesian restoration estimators, such as the maximum a posteriori (MAP) or the maximum posterior mode (MPM), remains possible. The major novelty of this work is to construct a PMC model with two-dimensional observational data and subsequently adapt the estimation algorithms, as well as the image restoration methods, to that context. The transformation of an image from a two-dimensional format to a one-dimensional sequence is usually done via a Hilbert-Peano scan (HPS), whereas in the proposed model the second component of the observed process takes over this role, in order to overcome the loss of pixel locality information caused by the transformation of the image to be segmented. To reconstruct the hidden process, we used the MPM decision criterion after estimating the model's parameters with two algorithms: stochastic expectation maximization (SEM) and iterative conditional estimation (ICE). In this study, experimental, numerical, and visual results are presented to demonstrate the superiority of the proposed model over the classical PMC for unsupervised restorations.

    Citation: A. Joumad, A. El Moutaouakkil, A. Nasroallah, O. Boutkhoum, Mejdl Safran, Sultan Alfarhood, Imran Ashraf. Unsupervised segmentation of images using bi-dimensional pairwise Markov chains model[J]. AIMS Mathematics, 2024, 9(11): 31057-31086. doi: 10.3934/math.20241498




    The hidden Markov chain (HMC) model, as characterized in [1,2,3,4,5], is a doubly stochastic process, usually denoted by (X,Y). The process X, which models the image to be segmented, is assumed to be hidden and Markovian, and the process Y, which models the noisy version of the image to be recovered, is assumed to be observed and real-valued. The success of the HMC model stems from the fact that, when the noise is not too complex, several Bayesian classification approaches, such as the maximum a posteriori (MAP) or maximum posterior mode (MPM) estimators, can be used to determine the image X=x from the observed version Y=y. However, the spatial regularity of pixels may hinder the effectiveness of the HMC model in image segmentation.

    We consider the problem of segmenting a satellite image into two classes, "water" and "trees". The hidden image represents the classes, whereas the observed image represents the gray levels of the pixels. In general, neighboring pixels are considered to have a higher probability of belonging to the same class than pixels situated further from each other. Nevertheless, in this situation, we can find pixels that belong to the same group or lie near edges but have distinct appearances. Under the classical hypothesis, this attribute cannot be accommodated by the HMC model. For this reason, the HMC model has been extended to the pair-wise Markov chain (PMC) model. The PMC model is more general than the HMC model, since the process X is not necessarily a Markov chain while the pair (X,Y) is directly Markovian. Another advantage of the PMC model over the HMC model is its capacity to easily take the noise correlation into account. Furthermore, the PMC model, like the HMC model, allows Bayesian MAP and MPM restorations.

    Another major problem is that when a one-dimensional sequence is generated from a two-dimensional image using any reading process, such as the Hilbert-Peano scan (HPS), some locality information of the pixels is lost. What prompted us in this study was the success of adding additional components to the observations in the studies [6,7,8], in the framework of the HMC model, by taking advantage of a neighborhood of each pixel (for example, the four nearest neighbors), which culminated in an impressive model on a range of synthetic images. In this work, we apply the same concept to the PMC model in order to tackle the main issue of spatial regularity of pixels, as well as the problem of pixel information loss in an image, which is the objective of the segmentation process. As a result, we formulate a PMC model with bidimensional observed data (BPMC), whose most important task is to segment images while taking into account the two difficulties mentioned above. The focus of this study is to investigate the potential benefits of BPMC over PMC in terms of unsupervised segmentation. Initially, the unknown parameters of the model must be estimated. In the case of the PMC model, the likelihood cannot be maximized analytically at each iteration, as required by the expectation-maximization (EM) algorithm [9,10], so we apply the iterative conditional estimation (ICE) and stochastic expectation maximization (SEM) algorithms [11,12,13]. In general, the ICE and SEM algorithms are very powerful when used with Gaussian estimation and generalized mixtures. In this study, we adapt the ICE and SEM algorithms to the new model. Furthermore, the SEM and ICE algorithms are examined in the presence of Gaussian noise for both PMC and BPMC, and they turn out to be more adaptable because the likelihood is not always required. In addition, estimators other than the maximum likelihood can also be utilized. Therefore, we need to answer the following questions.

    ● Does the use of BPMC significantly improve restoration results, compared to the PMC model, and how do ICE and SEM algorithms work in the BPMC context?

    ● Does the use of BPMC compared to PMC permit the image to be converted into a one-dimensional sequence without losing pixel locality information?

    The last point ought to be considered since all kinds of hidden Markov chain models based on HPS have shown potential for image segmentation, and in certain situations, they may even compete with the hidden Markov field (HMF) model in classification [14,15,16]. On top of that, the HMF model allows for a more precise and more intuitive modeling of the spatial relationships between pixels [17], while the PMC model requires HPS to account for the spatial details of the pixels. Conversely, the BPMC model takes into account temporal and spatial pixel information by incorporating an additional element directly into the observed process, allowing it to compete with the PMC model in this regard.

    The rest of this paper is structured into seven sections. Section 2 presents studies relevant to the current study. In Section 3, we provide the reader with a mathematical background of the BPMC model. Section 4 presents the two models discussed in this study and their properties. Section 5 is devoted to the simulation of the PMC and BPMC models in a supervised way using MPM classifier-based classifications. Section 6 deals with parameter estimation algorithms; here, we describe the main steps of the ICE and SEM algorithms for the two models, PMC and BPMC. In Section 7, with numerical experiments for various noise factors, we present a comparative study of the two estimation procedures based on the accuracy of parameter estimates for both models. Furthermore, using error rates and the peak signal-to-noise ratio (PSNR) index, we demonstrate the performance of these models on a sample of synthetic images. Finally, conclusions and some perspectives are discussed in Section 8.

    The pair-wise terminology was first attached to the hidden Markov model in the works [18,19,20], where the main motivation was to solve the problem of correlation between noise and pixels located on the image boundary. In fact, this difficulty was emphasized for the first time in the pair-wise Markov field (PMF) model and the pair-wise Markov tree (PMT) model. The PMF model has been successfully used for textured image segmentation [20], real synthetic-aperture radar (SAR) images [21], and hyper-spectral images [22]. The PMT model has also been used for image and signal restoration tasks [19]. Recently, the PMC model [18] has been used for applications including signal and image processing. The PMC model has specific applications in image processing, such as the segmentation of textured and real images [14,23,24,25,26,27] and text segmentation [28,29], where the authors used an HPS to convert the image and copulas to model conditional class densities. Multi-scale images can also be segmented using the PMC model [30]. Other uses include the fuzzy PMC model to treat spatially correlated fuzzy classes between observations [31,32].

    Several other methods based on hidden Markov models have been attempted in order to solve the question of the relationship between pixels in the chain after the transformation of a bidimensional image. Among these, an approach suggested by [15] is based on a combination of the HMC and HMF models. The first model is employed in the estimation phase and the second is exploited in the final classification to determine the X configuration. Besides, another strategy in [33,34] is centered on the integration of spectral and contextual information into the HMC model, since the structure of the gathered data is no longer Markovian when modeling an image using HPS. Additionally, the study [35] suggests an algorithm that combines fuzzy C-means with HPS. In [36], a new HMC is proposed for representing the semantic and spatial interactions between pixels.

    A new segmentation method in [7] has recently been introduced that tackles the issues of spatial and temporal pixels via contextual HPS. This method adds an additional element to the observation process in the HMC model so that each pixel recovered from the HPS is linked to another pair of pixels next to it in the image but not in the chain. Similarly, in [8], a second component is introduced in the process Y and a two-dimensional model is generated with estimation and segmentation techniques adapted to this scenario. Table 1 summarizes all of the methods covered here.

    Table 1.  A brief overview of research into noise correlation and pixel misplacement in image segmentation.
    Models | Image used | Technique suggested | Year | Reference
    PMF | Textured image | Pixel neighborhood | 2001 | [20]
    PMT | Synthetic image | Partition of pixels set | 2003 | [19]
    PMC | Synthetic and SAR images | HPS | 2004 | [14]
    PMC | Textured and SAR images | HPS and copula | 2013 | [26]
    PMC | Scanned document | HPS | 2011 | [28]
    FPMC | Fuzzy and astronomical images | HPS | 2008 | [31]
    HMC | Synthetic image | Spectral and spatial pixel information | 2005 | [33]
    Fuzzy segmentation | Multi-spectral image | FCM algorithm and HPS | 2005 | [35]
    HMM | Medical image | Adapted Viterbi algorithm | 2020 | [36]
    HMC | Textured image | Contextual HPS | 2021 | [7]
    Bi-dimensional HMC | Synthetic and mammography images | Bi-dimensional observation | 2023 | [8]


    In this part, we describe a brief mathematical basis for the BPMC model given in this study. We consider a pair of stochastic processes $(X,Y)=\big((X_n)_{1\le n\le N},(Y_n)_{1\le n\le N}\big)$, where $N$ is the number of pixels. As previously explained, in the segmentation of images with the hidden Markov model (HMM), we need to estimate the latent process $X=x$ from the observed process $Y=y$.

    In the context of the PMC model with two-dimensional observed data, we consider another realization of a random variable $\bar{Y}=\bar{y}$ modeling information about each pixel based on its neighborhood in the image, but not in the chain, in such a way that the observed process is two-dimensional and is denoted by $(Y,\bar{Y})$. We then intend to estimate the parameters of the joint probability $p(x,y,\bar{y})$ from the data in an unsupervised way. Based on the observations, Bayesian estimators are used to reconstruct the hidden process. The segmentation problem for the present model involves two laws: an a priori law $p(x)$ and a two-dimensional conditional density $p(y,\bar{y}\mid x)$.

    To handle the above-mentioned segmentation problem, the a posteriori distribution must be identified, which contains all of the information on $X$ that is available in $(Y,\bar{Y})$. This distribution is given by
    $$p(x\mid y,\bar{y})=\frac{p(x,y,\bar{y})}{\sum_{x}p(x,y,\bar{y})},\qquad\text{where } p(x,y,\bar{y})=p(x)\,p(y,\bar{y}\mid x).$$

    Let $L$ be a bivariate loss function and $\hat{x}$ be an approximation of the hidden realization, where the latter is obtained as follows:

    $$\hat{x}=\arg\min_{x'}\sum_{x}L(x,x')\,p(x\mid y,\bar{y}). \qquad (3.1)$$

    In Bayesian statistics, the two best-known estimators are MAP and MPM, associated respectively with the two loss functions $L_{MAP}(x,x')=1-\delta(x,x')$ and $L_{MPM}(x,x')=\sum_{n=1}^{N}\big(1-\delta(x_n,x'_n)\big)$, where $\delta$ denotes the Kronecker symbol.

    The two estimators are consequently and respectively defined by:

    $$\begin{aligned}
    \hat{x}_{MAP}&=\arg\min_{x'}\sum_{x}L_{MAP}(x,x')\,p(x\mid y,\bar{y})
    =\arg\min_{x'}\sum_{x\neq x'}p(x\mid y,\bar{y})\\
    &=\arg\min_{x'}\big(1-p(x'\mid y,\bar{y})\big)
    =\arg\max_{x'}p(x'\mid y,\bar{y}),
    \end{aligned}$$

    and

    $$\begin{aligned}
    \hat{x}_{MPM}&=\arg\min_{x'}\sum_{x}L_{MPM}(x,x')\,p(x\mid y,\bar{y})
    =\arg\min_{x'}\sum_{x}\sum_{n=1}^{N}\big(1-\delta(x_n,x'_n)\big)\,p(x\mid y,\bar{y})\\
    &=\arg\min_{x'}\sum_{n=1}^{N}\sum_{x_n\neq x'_n}p(x_n\mid y,\bar{y})
    =\arg\min_{x'}\sum_{n=1}^{N}\big(1-p(x'_n\mid y,\bar{y})\big)
    =\arg\max_{x'}\sum_{n=1}^{N}p(x'_n\mid y,\bar{y}).
    \end{aligned}$$

    When comparing these two estimators, MPM maximizes the a posteriori marginal probability for each xn, while MAP maximizes the a posteriori probability p(x|y,ˉy) in a direct way, and allows us to estimate the whole x sequence. Next, we only utilize the MPM estimator since it is straightforward to use. We use iterative strategies to estimate the joint distribution of the unobserved process using the data available (y,ˉy). For calculating the joint distribution, we often use a conditional expectation approximation in the case of the ICE method or stochastic generators to estimate a sequence of model parameters from the dataset in the case of the SEM algorithm.

    Let us consider two stochastic processes $X=(X_n)_{1\le n\le N}$ and $Y=(Y_n)_{1\le n\le N}$, where $N$ is the number of pixels of an image. The process $X$ is the unknown image whereas $Y$ is the observed one. Each random variable $X_n$ takes its values in a finite set of $K$ classes $\Omega=\{\omega_1,\dots,\omega_K\}$, and each $Y_n$ takes its values in the set of real numbers $\mathbb{R}$. Realizations of $X$ and $Y$ are denoted by $x=(x_n)_{1\le n\le N}$ and $y=(y_n)_{1\le n\le N}$, respectively.

    We suppose that the process $X$ is stationary and that the random variables $Y=(Y_n)_{1\le n\le N}$ are correlated and conditionally depend on $X$. As in all Markov models, the classification problem consists of estimating the latent process $X=x$ based on the observed process $Y=y$.

    This study uses the PMC model and proposes updating the classical pair-wise model. The classical PMC model is a doubly stochastic process, usually denoted $(X,Y)$, where $Y$ (one-dimensional) represents the observed pixels and $X$ the hidden image. The same concept holds for the bidimensional PMC model, except for the process $Y$. In the classical model, a direct Hilbert-Peano scan is used to transform the image from its two-dimensional version to a one-dimensional chain, followed by an inverse scan to reconstruct the final image, whereas in the suggested model, the second component of $Y$ serves this purpose. The visual results reveal a significant advantage of the proposed model over the classical model, especially for isolated pixels in the image to be segmented.

    An image in bidimensional form can be transformed into a one-dimensional form simply by reading the image column by column or line by line. Moreover, we can use the classical HPS for a $2^p\times 2^p$ image [2], or the generalized HPS for any given image [37]. In a moderately different sense than above, let us now consider the pair-wise process $Z=(Z_1,\dots,Z_n,\dots,Z_N)$ corresponding to the hidden process $X$ and the observed process $Y$, where $Z_n=(X_n,Y_n)$ for $n=1,\dots,N$. Lowercase letters are used to denote the realizations of such variables and processes. To keep things straightforward and brief, we use the letters $p$ and $f$, respectively, to describe various distributions on $\Omega$ and densities on $\mathbb{R}$ in the following text; $p(z_n)$ is also employed for $p(Z_n=z_n)$. With these conventions, the pair $Z$ is a PMC model when its joint distribution can be written as $p(z)=p(z_1)\prod_{n=2}^{N}p(z_n\mid z_{n-1})$, with

    $$\begin{aligned}
    p(z_1)=p(x_1,y_1)&=\sum_{x_2}\int_{\mathbb{R}}p(x_1,x_2,y_1,y_2)\,dy_2
    =\sum_{x_2}\int_{\mathbb{R}}p(x_1,x_2)\,p(y_1,y_2\mid x_1,x_2)\,dy_2\\
    &=\sum_{x_2}p(x_1,x_2)\int_{\mathbb{R}}p(y_1,y_2\mid x_1,x_2)\,dy_2
    =\sum_{x_2}p(x_1,x_2)\,p(y_1\mid x_1,x_2)
    =\sum_{x_2}p(x_1,x_2)\,f_{x_1,x_2}(y_1),
    \end{aligned}$$

    and

    $$p(z_n\mid z_{n-1})=p(x_n,y_n\mid x_{n-1},y_{n-1})=p(y_n\mid x_{n-1},y_{n-1},x_n)\,p(x_n\mid x_{n-1},y_{n-1})\quad\text{for } n=2,\dots,N. \qquad (4.1)$$

    Assuming the process $Z$ is stationary, its distribution can also be defined only through $p(z_1,z_2)=p(z_1)\,p(z_2\mid z_1)$. The following proposition is inspired by the works [4,6,38].

    Proposition 4.1. For $Z$ a stationary PMC model related to processes $X$ and $Y$, and for every $n\in\{2,\dots,N\}$, the following results hold.

    (1) Both the distribution of $X$ conditional on $Y=y$ and that of $Y$ conditional on $X=x$ are Markovian.

    (2) If $p(x_n\mid x_{n-1},y_{n-1})=p(x_n\mid x_{n-1})$ and $p(y_n\mid x_n,x_{n-1},y_{n-1})=p(y_n\mid x_n,y_{n-1})$, then $Z$ is an HMC with dependent noise.

    (3) If $p(x_n\mid x_{n-1},y_{n-1})=p(x_n\mid x_{n-1})$ and $p(y_n\mid x_n,x_{n-1},y_{n-1})=p(y_n\mid x_n)$, then $Z$ is an HMC with independent noise.

    (4) If $p(y_{n-1}\mid x_{n-1},x_n)=p(y_{n-1}\mid x_{n-1})$, then $Z$ is an HMC with dependent noise.

    Proof. (1) Let $Z=(X,Y)$ be a PMC model. Then, for $n\in\{2,\dots,N\}$, we have

    $$\begin{aligned}
    p(x_n\mid x_1,\dots,x_{n-1},y)&=\frac{p(x_1,\dots,x_{n-1},x_n,y)}{p(x_1,\dots,x_{n-1},y)}
    =\frac{\sum_{x_{n+1},\dots,x_N}p(x,y)}{\sum_{x_n,\dots,x_N}p(x,y)}\\
    &=\frac{\sum_{x_{n+1},\dots,x_N}p(x_1,y_1)\,p(x_2,y_2\mid x_1,y_1)\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})}{\sum_{x_n,\dots,x_N}p(x_1,y_1)\,p(x_2,y_2\mid x_1,y_1)\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})}\\
    &=\frac{p(x_1,y_1)\cdots p(x_n,y_n\mid x_{n-1},y_{n-1})\displaystyle\sum_{x_{n+1},\dots,x_N}p(x_{n+1},y_{n+1}\mid x_n,y_n)\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})}{p(x_1,y_1)\cdots p(x_{n-1},y_{n-1}\mid x_{n-2},y_{n-2})\displaystyle\sum_{x_n,x_{n+1},\dots,x_N}p(x_n,y_n\mid x_{n-1},y_{n-1})\,p(x_{n+1},y_{n+1}\mid x_n,y_n)\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})}\\
    &=\frac{p(x_n,y_n\mid x_{n-1},y_{n-1})\displaystyle\sum_{x_{n+1},\dots,x_N}p(x_{n+1},y_{n+1}\mid x_n,y_n)\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})}{\displaystyle\sum_{x_n,x_{n+1},\dots,x_N}p(x_n,y_n\mid x_{n-1},y_{n-1})\,p(x_{n+1},y_{n+1}\mid x_n,y_n)\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})}.
    \end{aligned}$$

    We also have

    $$\begin{aligned}
    p(x_n\mid x_{n-1},y)&=\frac{p(x_{n-1},x_n,y)}{p(x_{n-1},y)}
    =\frac{\sum_{\{x_1,\dots,x_N\}\setminus\{x_{n-1},x_n\}}p(x,y)}{\sum_{\{x_1,\dots,x_N\}\setminus\{x_{n-1}\}}p(x,y)}\\
    &=\frac{\sum_{\{x_1,\dots,x_N\}\setminus\{x_{n-1},x_n\}}p(x_1,y_1)\,p(x_2,y_2\mid x_1,y_1)\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})}{\sum_{\{x_1,\dots,x_N\}\setminus\{x_{n-1}\}}p(x_1,y_1)\,p(x_2,y_2\mid x_1,y_1)\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})}\\
    &=\frac{p(x_n,y_n\mid x_{n-1},y_{n-1})\displaystyle\sum_{x_1,\dots,x_{n-2}}p(x_1,y_1)\cdots p(x_{n-1},y_{n-1}\mid x_{n-2},y_{n-2})\displaystyle\sum_{x_{n+1},\dots,x_N}p(x_{n+1},y_{n+1}\mid x_n,y_n)\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})}{\displaystyle\sum_{x_1,\dots,x_{n-2}}p(x_1,y_1)\cdots p(x_{n-1},y_{n-1}\mid x_{n-2},y_{n-2})\displaystyle\sum_{x_n,x_{n+1},\dots,x_N}p(x_n,y_n\mid x_{n-1},y_{n-1})\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})}\\
    &=\frac{p(x_n,y_n\mid x_{n-1},y_{n-1})\displaystyle\sum_{x_{n+1},\dots,x_N}p(x_{n+1},y_{n+1}\mid x_n,y_n)\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})}{\displaystyle\sum_{x_n,x_{n+1},\dots,x_N}p(x_n,y_n\mid x_{n-1},y_{n-1})\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})}.
    \end{aligned}$$

    This gives $p(x_n\mid x_1,\dots,x_{n-1},y)=p(x_n\mid x_{n-1},y)$, therefore $X$ conditional on $Y=y$ is a Markov chain. In the same way, we can show that $Y$ conditional on $X=x$ is a Markov chain, since $x$ and $y$ play symmetric roles here.

    (2) Z is a PMC model, then

    $$p(z)=p(x_1,y_1)\prod_{n=2}^{N}p(y_n\mid x_{n-1},y_{n-1},x_n)\,p(x_n\mid x_{n-1},y_{n-1})=p(x_1)\,p(y_1\mid x_1)\prod_{n=2}^{N}p(y_n\mid x_n,y_{n-1})\,p(x_n\mid x_{n-1}),$$

    from which $Z$ is an HMC with dependent noise, characterized by the initial distribution

    $$p(z_1)=p(x_1)\,p(y_1\mid x_1),$$

    and the transition matrix

    $$p(z_n\mid z_{n-1})=p(x_n\mid x_{n-1})\,p(y_n\mid x_n,y_{n-1}).$$

    (3) $Z$ is a PMC model, then

    $$p(z)=p(x_1,y_1)\prod_{n=2}^{N}p(y_n\mid x_{n-1},y_{n-1},x_n)\,p(x_n\mid x_{n-1},y_{n-1})=p(x_1)\,p(y_1\mid x_1)\prod_{n=2}^{N}p(y_n\mid x_n)\,p(x_n\mid x_{n-1})=p(x_1)\prod_{n=2}^{N}p(x_n\mid x_{n-1})\,\prod_{n=1}^{N}p(y_n\mid x_n),$$

    from which $Z$ is an HMC with independent noise.

    (4) Let us show here that the process $X$ is a Markov chain. For $n\in\{2,\dots,N\}$, we have

    $$\begin{aligned}
    p(x_n\mid x_1,\dots,x_{n-1})&=\frac{p(x_1,\dots,x_{n-1},x_n)}{p(x_1,\dots,x_{n-1})}
    =\frac{\sum_{x_{n+1},\dots,x_N}\int_{\mathbb{R}^N}p(x,y)\,dy_1\cdots dy_N}{\sum_{x_n,\dots,x_N}\int_{\mathbb{R}^N}p(x,y)\,dy_1\cdots dy_N}\\
    &=\frac{\sum_{x_{n+1},\dots,x_N}\int_{\mathbb{R}^N}p(x_1,y_1)\,p(x_2,y_2\mid x_1,y_1)\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})\,dy_1\cdots dy_N}{\sum_{x_n,\dots,x_N}\int_{\mathbb{R}^N}p(x_1,y_1)\,p(x_2,y_2\mid x_1,y_1)\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})\,dy_1\cdots dy_N}\\
    &=\frac{\displaystyle\int_{\mathbb{R}}p(x_n,y_n\mid x_{n-1},y_{n-1})\,dy_n\sum_{x_{n+1},\dots,x_N}\int_{\mathbb{R}^{N-n}}p(x_{n+1},y_{n+1}\mid x_n,y_n)\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})\,dy_{n+1}\cdots dy_N}{\displaystyle\sum_{x_n}\int_{\mathbb{R}}p(x_n,y_n\mid x_{n-1},y_{n-1})\,dy_n\sum_{x_{n+1},\dots,x_N}\int_{\mathbb{R}^{N-n}}p(x_{n+1},y_{n+1}\mid x_n,y_n)\cdots p(x_N,y_N\mid x_{N-1},y_{N-1})\,dy_{n+1}\cdots dy_N}\\
    &=\frac{\displaystyle\int_{\mathbb{R}}\frac{p(x_{n-1},x_n)\,p(y_{n-1},y_n\mid x_{n-1},x_n)}{p(x_{n-1})\,p(y_{n-1}\mid x_{n-1})}\,dy_n}{\displaystyle\sum_{x_n}\int_{\mathbb{R}}\frac{p(x_{n-1},x_n)\,p(y_{n-1},y_n\mid x_{n-1},x_n)}{p(x_{n-1})\,p(y_{n-1}\mid x_{n-1})}\,dy_n}
    =\frac{\dfrac{p(x_{n-1},x_n)\,p(y_{n-1}\mid x_{n-1},x_n)}{p(x_{n-1})\,p(y_{n-1}\mid x_{n-1})}}{\displaystyle\sum_{x_n}\frac{p(x_{n-1},x_n)\,p(y_{n-1}\mid x_{n-1},x_n)}{p(x_{n-1})\,p(y_{n-1}\mid x_{n-1})}}.
    \end{aligned}$$

    Using the assumption that $p(y_{n-1}\mid x_{n-1},x_n)=p(y_{n-1}\mid x_{n-1})$, we obtain $p(x_n\mid x_1,\dots,x_{n-1})=p(x_n\mid x_{n-1})$, which completes the proof.

    Remark 4.1. Properties 2 and 3 in Proposition 4.1 demonstrate the strict generality of the PMC model compared with the HMC model.

    In this model, we consider another observed process, denoted $\bar{Y}=(\bar{Y}_n)_{1\le n\le N}$, so that the observed data become $(Y_1,\bar{Y}_1,\dots,Y_n,\bar{Y}_n,\dots,Y_N,\bar{Y}_N)$. Consequently, for each $n=1,\dots,N$, the random variable $X_n$ is associated with the pair $(Y_n,\bar{Y}_n)$, where $\bar{Y}_n$ is related only to the observation $Y_n$ as defined below. The problem remains unchanged: determining $X$ based on $Y$. To establish this new model, we state the following conditions (Figure 1).

    Figure 1.  Probability graph of BPMC model for three successive variables.

    (1) The process Z=(X,Y) is a PMC model.

    (2) The random variables $(\bar{Y}_n)_{1\le n\le N}$ are conditionally independent given $Z$.

    (3) The distribution of each $\bar{Y}_n$ conditional on $Z$ equals its distribution conditional on $Z_n$.

    The model to be studied (Z,ˉY), called the BPMC model, could be interpreted as a semi-hidden Markov chain model [4]. Its distribution can be stated as follows.

    $$p(z,\bar{y})=p(z,\bar{y}_1,\dots,\bar{y}_N)=p(z)\,p(\bar{y}_1,\dots,\bar{y}_N\mid z)=p(z_1)\prod_{n=1}^{N-1}p(z_{n+1}\mid z_n)\prod_{n=1}^{N}p(\bar{y}_n\mid z_n)=p(x_1,y_1,\bar{y}_1)\prod_{n=1}^{N-1}p(x_{n+1},y_{n+1},\bar{y}_{n+1}\mid x_n,y_n).$$

    Let us emphasize that the BPMC model is particularly interesting because, given the pair $(Y,\bar{Y})=(y,\bar{y})$, the distribution of $X$ is Markovian. Using the same logic as in the proof of the first point of Proposition 4.1, we get the following proposition.

    Proposition 4.2. For $(Z,\bar{Y})$ a stationary BPMC model related to processes $X$ and $Y$, the following result holds.

    The distribution of $X$ conditional on $(Y,\bar{Y})=(y,\bar{y})$ is a Markov chain.

    The reason behind the success of the HMC model can be attributed to the fact that the process $X$ conditional on $Y=y$ is a Markov chain whose transitions can be computed. This observation holds true for the PMC model. However, under the BPMC model, this conditional distribution exhibits a complex structure, which is precisely why we employ the distribution $p(x\mid y,\bar{y})$ instead. Both models, PMC and BPMC, are assumed to be stationary, so that their distributions are determined by $p(x_1,y_1,x_2,y_2)$ for PMC and $p(x_1,y_1,\bar{y}_1,x_2,y_2,\bar{y}_2)$ for BPMC.

    In this section, we contrast the two models using the MPM restoration algorithm in the Gaussian context. Both models use Gaussian densities, which means that $p(y_{n-1},y_n\mid x_{n-1},x_n)$ and $p(\bar{y}_{n-1}\mid x_{n-1})$ are Gaussian for each $n$ (in the BPMC case, we choose $p(\bar{y}_{n-1}\mid x_{n-1})$ to be equal to $p(\bar{y}_{n-1}\mid x_{n-1},x_n)$). In this scenario, the PMC model is represented by the $K^2$ probabilities $p(x_{n-1}=\omega_i,x_n=\omega_j)$, the $K^2$ mean vectors $\mu_{ij}=\begin{pmatrix}\mu^1_{ij}\\ \mu^2_{ij}\end{pmatrix}$, and the $K^2$ variance-covariance matrices $\Sigma_{ij}=\begin{pmatrix}\sigma^1_{ij}&\sigma^{12}_{ij}\\ \sigma^{12}_{ij}&\sigma^2_{ij}\end{pmatrix}$ of the bidimensional densities $f_{ij}(y_{n-1},y_n)$. Additionally, the BPMC model is represented by the parameters $\mu^1_{ij}$, $\mu^2_{ij}$, $\sigma^1_{ij}$, and $\sigma^2_{ij}$ of the $K^2$ mono-dimensional densities $f_{ij}(\bar{y}_{n-1})$. In the following subsection, we simulate the two processes $X$ and $Y$ assuming that all the aforementioned parameters are initially known. Furthermore, we reconstruct the process $X$ based only on the process $Y$ for the PMC model, while considering both $Y$ and $\bar{Y}$ simultaneously for the BPMC model.

    Here, we recall the previously defined distributions of the PMC and BPMC models. The distribution of the PMC model is given by $p(x_1,y_1)=p(x_1)\,p(y_1\mid x_1)$ and $p(x_n,y_n\mid x_{n-1},y_{n-1})=p(y_n\mid x_{n-1},y_{n-1},x_n)\,p(x_n\mid x_{n-1},y_{n-1})$, while the distribution of the BPMC model is represented by $p(x_1,y_1,\bar{y}_1)=p(x_1)\,p(y_1\mid x_1)\,p(\bar{y}_1\mid x_1,y_1)$, $p(x_n,y_n\mid x_{n-1},y_{n-1})$, and $p(\bar{y}_n\mid x_n,y_n)$.

    We can simulate the processes X, Y and ˉY of models concurrently and interchangeably using the following approach.

    ● For the PMC model, $x_1$, $y_1$, $x_n$, and $y_n$, for each $n=2,\dots,N$, are simulated with the following probabilities:

    $$p(x_1)=\sum_{x_2}p(x_1,x_2). \qquad (5.1)$$
    $$p(y_1\mid x_1)=\frac{p(x_1,y_1)}{p(x_1)}=\frac{\sum_{x_2}p(x_1,x_2)\,f_{x_1,x_2}(y_1)}{\sum_{x_2}p(x_1,x_2)}. \qquad (5.2)$$
    $$p(x_n\mid x_{n-1},y_{n-1})=\frac{p(x_n,x_{n-1},y_{n-1})}{p(x_{n-1},y_{n-1})}=\frac{p(x_{n-1},x_n)\,f_{x_{n-1},x_n}(y_{n-1})}{\sum_{x_n}p(x_{n-1},x_n)\,f_{x_{n-1},x_n}(y_{n-1})}. \qquad (5.3)$$
    $$p(y_n\mid x_n,x_{n-1},y_{n-1})=\frac{f_{x_{n-1},x_n}(y_{n-1},y_n)}{f_{x_{n-1},x_n}(y_{n-1})}. \qquad (5.4)$$

    ● For the BPMC model, $x_1$, $y_1$, $x_n$, and $y_n$ are simulated in the same way as above. Additionally, we simulate $\bar{y}_1$ and $\bar{y}_n$, for each $n=2,\dots,N$, with the following probabilities:

    $$p(\bar{y}_1\mid x_1,y_1)=f_{x_1,x_2}(\bar{y}_1). \qquad (5.5)$$
    $$p(\bar{y}_n\mid x_n,y_n)=f_{x_{n-1},x_n}(\bar{y}_n). \qquad (5.6)$$

    Remark 5.1. It can be seen here that (for further details, see [14,39]):

    ● $f_{x_1,x_2}(y_1)$ is a mono-dimensional Gaussian density $\mathcal{N}(\mu^1_{ij},\sigma^1_{ij})$.

    ● $f_{x_{n-1},x_n}(y_{n-1})$ is a mono-dimensional Gaussian density $\mathcal{N}(\mu^1_{ij},\sigma^1_{ij})$.

    ● $y_n$ is generated according to a Gaussian drawing, where $p(y_n\mid x_n,x_{n-1},y_{n-1})$ is a mono-dimensional Gaussian density of mean $\mu^2_{ij}+\frac{\sigma^{12}_{ij}}{\sigma^1_{ij}}\,(y_{n-1}-\mu^1_{ij})$ and standard deviation $\sqrt{\dfrac{\sigma^1_{ij}\sigma^2_{ij}-(\sigma^{12}_{ij})^2}{\sigma^1_{ij}}}$.

    ● $f_{x_1,x_2}(\bar{y}_1)$ is a mono-dimensional Gaussian density $\mathcal{N}(\mu^1_{ij},\sigma^1_{ij})$.

    ● $f_{x_{n-1},x_n}(\bar{y}_n)$ is a mono-dimensional Gaussian density $\mathcal{N}(\mu^2_{ij},\sigma^2_{ij})$.
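    To make the simulation procedure concrete, the following minimal Python sketch draws a sample path of a stationary Gaussian PMC according to Eqs (5.1)-(5.4) and Remark 5.1. It is an illustrative implementation rather than the authors' code: the argument names (P for the stationary joint prior, mu1, mu2, s1, s2, s12 for the Gaussian parameters) are our own, and the commented line indicates where the extra component $\bar{y}$ of the BPMC model would be drawn according to Eq (5.6).

```python
import numpy as np

rng = np.random.default_rng(0)

def gauss(u, m, v):
    """One-dimensional Gaussian density with mean m and variance v."""
    return np.exp(-0.5 * (u - m) ** 2 / v) / np.sqrt(2.0 * np.pi * v)

def simulate_pmc(P, mu1, mu2, s1, s2, s12, N):
    """Draw (x, y) from a stationary Gaussian PMC following Eqs (5.1)-(5.4).

    P[i, j]          : p(x_n = w_i, x_{n+1} = w_j)  (K x K joint prior, sums to 1)
    mu1, mu2, s1, s2 : means and variances of (y_n, y_{n+1}) given (w_i, w_j)
    s12              : covariance between y_n and y_{n+1} given (w_i, w_j)
    """
    K = P.shape[0]
    x = np.zeros(N, dtype=int)
    y = np.zeros(N)

    # Eq (5.1): x_1 ~ p(x_1) = sum_{x_2} p(x_1, x_2)
    x[0] = rng.choice(K, p=P.sum(axis=1))
    # Eq (5.2): y_1 follows the mixture sum_{x_2} p(x_2 | x_1) f_{x_1,x_2}(y_1);
    # we draw an auxiliary class j, then y_1 ~ N(mu1[x_1, j], s1[x_1, j]).
    j = rng.choice(K, p=P[x[0]] / P[x[0]].sum())
    y[0] = rng.normal(mu1[x[0], j], np.sqrt(s1[x[0], j]))

    for n in range(1, N):
        i = x[n - 1]
        # Eq (5.3): p(x_n | x_{n-1}, y_{n-1}) prop. to p(x_{n-1}, x_n) f_{x_{n-1},x_n}(y_{n-1})
        w = np.array([P[i, k] * gauss(y[n - 1], mu1[i, k], s1[i, k]) for k in range(K)])
        x[n] = rng.choice(K, p=w / w.sum())
        k = x[n]
        # Eq (5.4): conditional Gaussian draw of y_n given (x_{n-1}, x_n, y_{n-1}), see Remark 5.1
        m = mu2[i, k] + s12[i, k] / s1[i, k] * (y[n - 1] - mu1[i, k])
        v = s2[i, k] - s12[i, k] ** 2 / s1[i, k]
        y[n] = rng.normal(m, np.sqrt(v))
        # For the BPMC model, one would additionally draw ybar_n ~ N(mu2[i, k], s2[i, k]) per Eq (5.6).
    return x, y
```

    Such simulated pairs $(x,y)$ (and, for the BPMC model, the additional $\bar{y}$) serve as complete data in the restoration experiments below.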

    In this work, we look at the MPM approach for reconstructing the hidden sequence X=x.

    The MPM estimator for the PMC model consists of maximizing a posteriori marginal probability for each xn as follows:

    $$\hat{x}_n=\arg\max_{\omega_i}\chi_n(\omega_i), \qquad (5.7)$$

    where $\chi_n(\omega_i)=p(x_n=\omega_i\mid y)$. Formally, this probability is calculated using the "conditional forward probabilities" $\alpha_n(\omega_i)=p(x_n=\omega_i\mid y_1,\dots,y_n)$ and the "conditional backward probabilities" $\beta_n(\omega_i)=\dfrac{p(y_{n+1},\dots,y_N\mid x_n=\omega_i,y_n)}{p(y_{n+1},\dots,y_N\mid y_1,\dots,y_n)}$, in order to avoid the numerical problems encountered with the unnormalized probabilities in the classical case. These conditional probabilities are obtained with the following recursion:

    ● Initiation phase: For $i=1,\dots,K$.

    (1) Forward

    $$\alpha_1(\omega_i)=\frac{p(x_1=\omega_i,y_1)}{\sum_{j=1}^{K}p(x_1=\omega_j,y_1)}. \qquad (5.8)$$

    (2) Backward

    $$\beta_N(\omega_i)=1. \qquad (5.9)$$

    ● Induction phase: For $i=1,\dots,K$ and $n=1,\dots,N-1$.

    (1) Forward

    $$\alpha_{n+1}(\omega_i)=\frac{\sum_{j=1}^{K}\alpha_n(\omega_j)\,p(x_{n+1}=\omega_i,y_{n+1}\mid x_n=\omega_j,y_n)}{\sum_{j=1}^{K}\sum_{k=1}^{K}\alpha_n(\omega_j)\,p(x_{n+1}=\omega_k,y_{n+1}\mid x_n=\omega_j,y_n)}. \qquad (5.10)$$

    (2) Backward

    $$\beta_n(\omega_i)=\frac{\sum_{j=1}^{K}\beta_{n+1}(\omega_j)\,p(x_{n+1}=\omega_j,y_{n+1}\mid x_n=\omega_i,y_n)}{\sum_{j=1}^{K}\sum_{k=1}^{K}\alpha_n(\omega_k)\,p(x_{n+1}=\omega_j,y_{n+1}\mid x_n=\omega_k,y_n)}, \qquad (5.11)$$

    with $p(x_{n+1},y_{n+1}\mid x_n,y_n)=\dfrac{p(x_n,y_n,x_{n+1},y_{n+1})}{p(x_n,y_n)}=\dfrac{p(x_n,x_{n+1})\,f_{x_n,x_{n+1}}(y_n,y_{n+1})}{\sum_{x_{n+1}}p(x_n,x_{n+1})\,f_{x_n,x_{n+1}}(y_n)}.$

    It is easy to demonstrate that

    $$\chi_n(\omega_i)=\alpha_n(\omega_i)\,\beta_n(\omega_i). \qquad (5.12)$$
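    As an illustration, the recursions (5.8)-(5.12) can be implemented directly with normalized vectors. The sketch below is a hypothetical Python implementation under the assumptions of this section; the arguments f1 and f2 stand for the mono-dimensional and bidimensional densities $f_{\omega_i,\omega_j}$ (Gaussian in our experiments), and the function returns the a posteriori marginals together with the MPM labeling.

```python
import numpy as np

def mpm_pmc(P, f1, f2, y):
    """Conditional forward-backward recursions (5.8)-(5.12) for the PMC model.

    P[i, j]        : p(x_n = w_i, x_{n+1} = w_j)   (stationary joint prior)
    f1(i, j, u)    : f_{w_i,w_j}(u), marginal density of the first component
    f2(i, j, u, v) : f_{w_i,w_j}(u, v), bivariate density
    Returns chi[n, i] = p(x_n = w_i | y) and the MPM labeling argmax_i chi[n, i].
    """
    N, K = len(y), P.shape[0]

    def kernel(n):
        # p(x_{n+1} = w_j, y_{n+1} | x_n = w_i, y_n), the transition written after Eq (5.11)
        num = np.array([[P[i, j] * f2(i, j, y[n], y[n + 1]) for j in range(K)] for i in range(K)])
        den = np.array([sum(P[i, j] * f1(i, j, y[n]) for j in range(K)) for i in range(K)])
        return num / den[:, None]

    alpha, beta = np.zeros((N, K)), np.ones((N, K))

    # Eq (5.8): alpha_1(w_i) proportional to sum_j p(w_i, w_j) f_{w_i,w_j}(y_1)
    a0 = np.array([sum(P[i, j] * f1(i, j, y[0]) for j in range(K)) for i in range(K)])
    alpha[0] = a0 / a0.sum()

    for n in range(N - 1):            # Eq (5.10), normalized forward induction
        a = alpha[n] @ kernel(n)
        alpha[n + 1] = a / a.sum()

    for n in range(N - 2, -1, -1):    # Eqs (5.9) and (5.11), normalized backward induction
        T = kernel(n)
        beta[n] = (T @ beta[n + 1]) / (alpha[n] @ T.sum(axis=1))

    chi = alpha * beta                # Eq (5.12)
    return chi, chi.argmax(axis=1)
```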

    Besides this, the MPM estimator under the BPMC model can be calculated as follows.

    $$\hat{x}_n=\arg\max_{\omega_i}\chi_n(\omega_i), \qquad (5.13)$$

    where $\chi_n(\omega_i)=p(x_n=\omega_i\mid y,\bar{y})$. This probability can be obtained using the conditional forward and backward probabilities $\alpha_n(\omega_i)=p(x_n=\omega_i\mid y_1,\bar{y}_1,\dots,y_n,\bar{y}_n)$ and $\beta_n(\omega_i)=\dfrac{p(y_{n+1},\bar{y}_{n+1},\dots,y_N,\bar{y}_N\mid x_n=\omega_i,y_n,\bar{y}_n)}{p(y_{n+1},\bar{y}_{n+1},\dots,y_N,\bar{y}_N\mid y_1,\bar{y}_1,\dots,y_n,\bar{y}_n)}$. These conditional probabilities are obtained with the following recursion:

    ● Initiation phase: For $i=1,\dots,K$.

    (1) Forward

    $$\alpha_1(\omega_i)=\frac{p(x_1=\omega_i,y_1,\bar{y}_1)}{\sum_{j=1}^{K}p(x_1=\omega_j,y_1,\bar{y}_1)}. \qquad (5.14)$$

    (2) Backward

    $$\beta_N(\omega_i)=1. \qquad (5.15)$$

    ● Induction phase: For $i=1,\dots,K$ and $n=1,\dots,N-1$.

    (1) Forward

    $$\alpha_{n+1}(\omega_i)=\frac{\sum_{j=1}^{K}\alpha_n(\omega_j)\,p(x_{n+1}=\omega_i,y_{n+1},\bar{y}_{n+1}\mid x_n=\omega_j,y_n,\bar{y}_n)}{\sum_{j=1}^{K}\sum_{k=1}^{K}\alpha_n(\omega_j)\,p(x_{n+1}=\omega_k,y_{n+1},\bar{y}_{n+1}\mid x_n=\omega_j,y_n,\bar{y}_n)}. \qquad (5.16)$$

    (2) Backward

    $$\beta_n(\omega_i)=\frac{\sum_{j=1}^{K}\beta_{n+1}(\omega_j)\,p(x_{n+1}=\omega_j,y_{n+1},\bar{y}_{n+1}\mid x_n=\omega_i,y_n,\bar{y}_n)}{\sum_{j=1}^{K}\sum_{k=1}^{K}\alpha_n(\omega_k)\,p(x_{n+1}=\omega_j,y_{n+1},\bar{y}_{n+1}\mid x_n=\omega_k,y_n,\bar{y}_n)}, \qquad (5.17)$$

    with $p(x_{n+1},y_{n+1},\bar{y}_{n+1}\mid x_n,y_n,\bar{y}_n)=\dfrac{p(x_n,x_{n+1})\,f_{x_n,x_{n+1}}(y_n,y_{n+1})\,f_{x_n,x_{n+1}}(\bar{y}_{n+1})}{\sum_{x_{n+1}}p(x_n,x_{n+1})\,f_{x_n,x_{n+1}}(y_n)}.$

    Considering the foregoing, the distribution of $X_n$ conditional on $(Y=y,\bar{Y}=\bar{y})$ is given by

    $$\begin{aligned}
    p(x_n\mid y,\bar{y})&=\frac{p(x_n,y,\bar{y})}{p(y,\bar{y})}
    =\frac{p(x_n,y_1,\bar{y}_1,\dots,y_n,\bar{y}_n)\,p(y_{n+1},\bar{y}_{n+1},\dots,y_N,\bar{y}_N\mid x_n,y_1,\bar{y}_1,\dots,y_n,\bar{y}_n)}{p(y_{n+1},\bar{y}_{n+1},\dots,y_N,\bar{y}_N\mid y_1,\bar{y}_1,\dots,y_n,\bar{y}_n)\,p(y_1,\bar{y}_1,\dots,y_n,\bar{y}_n)}\\
    &=p(x_n\mid y_1,\bar{y}_1,\dots,y_n,\bar{y}_n)\,\frac{p(y_{n+1},\bar{y}_{n+1},\dots,y_N,\bar{y}_N\mid x_n,y_1,\bar{y}_1,\dots,y_n,\bar{y}_n)}{p(y_{n+1},\bar{y}_{n+1},\dots,y_N,\bar{y}_N\mid y_1,\bar{y}_1,\dots,y_n,\bar{y}_n)}.
    \end{aligned}$$

    This gives

    $$\chi_n(\omega_i)=\alpha_n(\omega_i)\,\beta_n(\omega_i). \qquad (5.18)$$
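    In practice, the only change with respect to the PMC recursions is the transition kernel, which gains the factor $f_{x_n,x_{n+1}}(\bar{y}_{n+1})$. A hypothetical Python sketch of this kernel, with the same assumed argument conventions as the mpm_pmc sketch above plus fbar for the mono-dimensional density of $\bar{y}$, is:

```python
import numpy as np

def kernel_bpmc(n, P, f1, f2, fbar, y, ybar):
    """p(x_{n+1}=w_j, y_{n+1}, ybar_{n+1} | x_n=w_i, y_n, ybar_n) for all (i, j),
    i.e. the transition used in Eqs (5.16)-(5.17): the PMC kernel multiplied by
    fbar(i, j, ybar_{n+1}) = f_{w_i,w_j}(ybar_{n+1})."""
    K = P.shape[0]
    num = np.array([[P[i, j] * f2(i, j, y[n], y[n + 1]) * fbar(i, j, ybar[n + 1])
                     for j in range(K)] for i in range(K)])
    den = np.array([sum(P[i, j] * f1(i, j, y[n]) for j in range(K)) for i in range(K)])
    return num / den[:, None]
```

    Substituting this kernel into the forward-backward recursions, and building $\alpha_1$ from $p(x_1=\omega_i,y_1,\bar{y}_1)$ as in Eq (5.14), yields the marginals $\chi_n(\omega_i)=p(x_n=\omega_i\mid y,\bar{y})$ of Eq (5.18).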

    In this section, we attempt to show the importance of the BPMC model compared to the PMC model. For that, we treat an example by considering N=100000 and K=2 (i.e., Ω={ω1,ω2}). We shall investigate the models' performance by taking two factors into account, two different chain homogeneities, noted H1 and H2, and different noise parameters:

    ● Factor chain:

    - Case H1: p(ω1,ω1)=p(ω2,ω2)=0.48 and p(ω1,ω2)=p(ω2,ω1)=0.02.

    - Case H2: p(ω1,ω1)=p(ω2,ω2)=0.25 and p(ω1,ω2)=p(ω2,ω1)=0.25.

    ● Factor noise:

    - Parameters of density fω1,ω1: μ1ω1,ω1=3, μ2ω1,ω1=3, σ1ω1,ω1=14, σ2ω1,ω1=14, and σ12ω1,ω1=0.1.

    - Parameters of density fω1,ω2: μ1ω1,ω2=5, μ2ω1,ω2=5, σ1ω1,ω2=11, σ2ω1,ω2=9, and σ12ω1,ω2=0.9.

    - Parameters of density fω2,ω1: μ1ω2,ω1=3, μ2ω2,ω1=3, σ1ω2,ω1=9, σ2ω2,ω1=11, and σ12ω2,ω1=0.1.

    - Parameters of density fω2,ω2: μ1ω2,ω2=5, μ2ω2,ω2=5, σ1ω2,ω2=18, σ2ω2,ω2=18, and σ12ω2,ω2=0.1.

    The following parameter is used to quantify the influence of the noise correlation on the error rate:

    $$\rho=\sum_{i,j=1}^{K}\big(\rho_{\omega_i,\omega_j}-\rho'_{\omega_i,\omega_j}\big)^{2}\ \text{ with }\ \rho_{\omega_i,\omega_j}>\rho'_{\omega_i,\omega_j},\qquad\text{where}\quad \rho_{\omega_i,\omega_j}=\frac{\sigma^{12}_{\omega_i,\omega_j}}{\sigma^{1}_{\omega_i,\omega_j}\,\sigma^{2}_{\omega_i,\omega_j}},\quad\text{for } i,j=1,\dots,K.$$

    We consider four sets of ρ and the covariance values associated with these four simulations are shown in Table 2.

    Table 2.  Covariance values associated to parameter ρ used in the four experiments.
    Experiment | $\rho$ | $\sigma^{12}_{\omega_1,\omega_1}$ | $\sigma^{12}_{\omega_1,\omega_2}$ | $\sigma^{12}_{\omega_2,\omega_1}$ | $\sigma^{12}_{\omega_2,\omega_2}$
    Experiment 1 | 190 | 0.1 | 0.9 | 0.1 | 0.1
    Experiment 2 | 45 | 0.2 | 0.9 | 0.3 | 0.2
    Experiment 3 | 17 | 0.3 | 0.1 | 0.3 | 0.2
    Experiment 4 | 5 | 0.3 | 0.4 | 0.4 | 0.9


    We specify the misclassification rates computed by the MPM method in Table 3 after calculating the means from 300 independent experiments.

    Table 3.  Incorrectly classified (%) pixels given by PMC and BPMC models-based Bayesian MPM for two cases of chain.
    Experiments | PMC (Case H1) | BPMC (Case H1) | PMC (Case H2) | BPMC (Case H2)
    Experiment 1 | 13.66 % | 9.712 % | 32.08 % | 35.86 %
    Experiment 2 | 15.74 % | 10.42 % | 33.31 % | 34.57 %
    Experiment 3 | 16.87 % | 12.65 % | 30.18 % | 32.73 %
    Experiment 4 | 44.29 % | 37.12 % | 43.62 % | 48.03 %


    According to these results, for the chain in Case H1, the misclassification rates are higher when the coefficient ρ is small (Experiment 4), and become progressively smaller as the coefficient ρ increases. It can be seen from all experiments that the BPMC model-based MPM restorations often work better than the PMC model-based ones. For the chain in Case H2, the misclassification rates are consistently higher for both models. In this situation, the PMC model is preferable to the BPMC model. Other simulations have been done, showing that chain homogeneity is an important element that can be exploited during the restoration phase when the data is complete.

    The parameter estimation problem from incomplete data is considerably more complex for the PMC model than for the HMM. Since we are dealing with unsupervised segmentation, all of the parameters, denoted $\theta=(\theta_1,\dots,\theta_l)$, must be estimated. In this section, we are interested in estimating the $K^2$ probabilities $p(\omega_i,\omega_j)$, the $2K^2$ parameters of the $\mu_{ij}$, and the $3K^2$ parameters of the $\Sigma_{ij}$. Considering that the log-likelihood of the models under examination cannot be maximized analytically, the estimation methods rely on simulations according to posterior distributions. We use here the ICE and SEM algorithms.

    The ICE algorithm under the PMC model is an iterative procedure, working as follows:

    ● Initialize $\theta^0_i$;

    $\theta^{q+1}_i$ is calculated as $\theta^{q+1}_i=E_{\theta^q}\big[\hat{\theta}_i(X,Y)\mid Y=y\big]$ if this expectation is computable, or as $\theta^{q+1}_i=\frac{1}{T}\sum_{t=1}^{T}\hat{\theta}_i(x^t,y)$ if it is not, where $x^1,\dots,x^T$ are simulated according to $p(x\mid y,\theta^q)$.

    Returning to our problem, we can take the following estimator for p(ωi,ωj):

    $$\hat{p}(\omega_i,\omega_j)(z)=\frac{1}{N-1}\sum_{n=1}^{N-1}\mathbb{1}_{\{x_n=\omega_i,\,x_{n+1}=\omega_j\}}. \qquad (6.1)$$

    The conditional expectation of this estimator, at iteration (q+1), can be calculated and gives

    $$p^{(q+1)}(\omega_i,\omega_j)=E\big[\hat{p}(\omega_i,\omega_j)(z)\mid Y=y\big]=\frac{1}{N-1}\sum_{n=1}^{N-1}\psi^{(q)}_n(\omega_i,\omega_j), \qquad (6.2)$$

    where $\psi^{(q)}_n(\omega_i,\omega_j)=p(x_n=\omega_i,x_{n+1}=\omega_j\mid y)$ is the joint a posteriori probability of two consecutive classes, and can be calculated using

    $$\psi^{(q)}_n(\omega_i,\omega_j)=\frac{\alpha_n(\omega_i)\,p(x_{n+1}=\omega_j,y_{n+1}\mid x_n=\omega_i,y_n)\,\beta_{n+1}(\omega_j)}{\sum_{l=1}^{K}\sum_{k=1}^{K}\alpha_n(\omega_l)\,p(x_{n+1}=\omega_k,y_{n+1}\mid x_n=\omega_l,y_n)\,\beta_{n+1}(\omega_k)}. \qquad (6.3)$$

    For the parameters $\mu_{ij}$ and $\Sigma_{ij}$, we can choose the following estimators:

    $$\hat{\mu}_{ij}(z)=\frac{\sum_{n=1}^{N-1}\begin{pmatrix}y_n\\ y_{n+1}\end{pmatrix}\mathbb{1}_{\{x_n=\omega_i,\,x_{n+1}=\omega_j\}}}{\sum_{n=1}^{N-1}\mathbb{1}_{\{x_n=\omega_i,\,x_{n+1}=\omega_j\}}}. \qquad (6.4)$$
    $$\hat{\Sigma}_{ij}(z)=\frac{\sum_{n=1}^{N-1}\left(\begin{pmatrix}y_n\\ y_{n+1}\end{pmatrix}-\hat{\mu}_{ij}(z)\right)\left(\begin{pmatrix}y_n\\ y_{n+1}\end{pmatrix}-\hat{\mu}_{ij}(z)\right)^{t}\mathbb{1}_{\{x_n=\omega_i,\,x_{n+1}=\omega_j\}}}{\sum_{n=1}^{N-1}\mathbb{1}_{\{x_n=\omega_i,\,x_{n+1}=\omega_j\}}}. \qquad (6.5)$$

    However, we cannot calculate the conditional expectation of these estimators. In this situation, we employ a stochastic strategy based on the simulation of $X$ according to its a posteriori distribution, which is a nonstationary Markov chain, as we demonstrated previously:

    $$p(x\mid y)=p(x_1\mid y)\prod_{n=1}^{N-1}p(x_{n+1}\mid x_n,y), \qquad (6.6)$$

    with an initial distribution $\chi_1(\omega_i)$ and a transition matrix

    $$p(x_{n+1}=\omega_j\mid x_n=\omega_i,y)=\frac{\psi_n(\omega_i,\omega_j)}{\chi_n(\omega_i)}. \qquad (6.7)$$
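    A minimal Python sketch of this simulation step, assuming the marginals chi of Eq (5.12) and the pairwise probabilities psi of Eq (6.3) have already been computed, is the following; the same routine applies to the BPMC model once chi and psi are obtained from $(y,\bar{y})$ via Eqs (5.18) and (6.10).

```python
import numpy as np

def sample_posterior_x(chi, psi, rng=np.random.default_rng()):
    """Draw one realization x ~ p(x | y) using Eqs (6.6)-(6.7).

    chi[n, i]    : p(x_n = w_i | y)                 (Eq (5.12))
    psi[n, i, j] : p(x_n = w_i, x_{n+1} = w_j | y)  (Eq (6.3)), for n = 1..N-1
    """
    N, K = chi.shape
    x = np.zeros(N, dtype=int)
    x[0] = rng.choice(K, p=chi[0])                  # initial law chi_1
    for n in range(N - 1):
        p = psi[n, x[n]] / chi[n, x[n]]             # transition of Eq (6.7)
        x[n + 1] = rng.choice(K, p=p / p.sum())     # renormalize against rounding error
    return x
```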

    The process for applying the ICE algorithm to the PMC model is then presented in the following manner.

    ● Iteration $(q=0)$, for $i,j=1,\dots,K$.

    (1) Initialization of the algorithm with $p^{(q)}(\omega_i,\omega_j)$, $\mu^{(q)}_{ij}$, $\Sigma^{(q)}_{ij}$.

    (2) Calculation of the probabilities $\alpha^{(q)}_n(\omega_i)$ according to (5.8) and (5.10).

    (3) Calculation of the probabilities $\beta^{(q)}_n(\omega_i)$ according to (5.9) and (5.11).

    (4) Deduction of the probabilities $\chi^{(q)}_n(\omega_i)$ according to (5.12).

    (5) Deduction of the probabilities $\psi^{(q)}_n(\omega_i,\omega_j)$ according to (6.3).

    (6) Simulation of $x^{(q),1},\dots,x^{(q),T}$ according to (6.6).

    ● Iteration $(q+1)$, for $i,j=1,\dots,K$.

    (1) Calculation of $p^{(q+1)}(\omega_i,\omega_j)$ according to (6.2).

    (2) Calculation of $\mu^{(q+1)}_{ij}$ and $\Sigma^{(q+1)}_{ij}$ by

    $$\mu^{(q+1)}_{ij}=\frac{1}{T}\sum_{t=1}^{T}\mu^{t}_{ij},\qquad \Sigma^{(q+1)}_{ij}=\frac{1}{T}\sum_{t=1}^{T}\Sigma^{t}_{ij}, \qquad (6.8)$$

    where $\mu^{t}_{ij}$ and $\Sigma^{t}_{ij}$ are calculated according to (6.4) and (6.5) from the simulated realization $x^{(q),t}$.

    ● Iterations are stopped if the algorithm converges; otherwise, the preceding step is repeated.
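    The re-estimation step of this procedure (Eqs (6.2), (6.4), (6.5), and (6.8)) can be gathered into a single function. The following hypothetical Python sketch assumes psi is the array of probabilities from Eq (6.3) and samples is the list of the T realizations simulated with Eq (6.6); it only illustrates the update formulas under our assumed data layout.

```python
import numpy as np

def ice_update_pmc(y, psi, samples, K):
    """One ICE re-estimation step for the PMC model.

    psi[n, i, j] : joint a posteriori probabilities of Eq (6.3) under the current parameters
    samples      : list of T realizations x^(q),1, ..., x^(q),T drawn via Eq (6.6)
    Returns p^(q+1) from Eq (6.2) and mu^(q+1)_ij, Sigma^(q+1)_ij from Eqs (6.4), (6.5), (6.8).
    """
    N = len(y)
    # Eq (6.2): analytic conditional expectation of the prior estimator (6.1)
    P_new = psi.sum(axis=0) / (N - 1)

    mu_new = np.zeros((K, K, 2))
    Sig_new = np.zeros((K, K, 2, 2))
    pairs = np.stack([y[:-1], y[1:]], axis=1)            # rows are (y_n, y_{n+1})
    for x in samples:                                    # stochastic part of ICE
        for i in range(K):
            for j in range(K):
                sel = (x[:-1] == i) & (x[1:] == j)
                if sel.any():
                    m = pairs[sel].mean(axis=0)          # Eq (6.4)
                    d = pairs[sel] - m
                    mu_new[i, j] += m
                    Sig_new[i, j] += d.T @ d / sel.sum() # Eq (6.5)
    mu_new /= len(samples)                               # Eq (6.8), average over the T draws
    Sig_new /= len(samples)
    return P_new, mu_new, Sig_new
```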

    Under the BPMC model, the ICE algorithm follows the same scheme:

    ● Initialize $\theta^0_i$;

    $\theta^{q+1}_i$ is calculated as $\theta^{q+1}_i=E_{\theta^q}\big[\hat{\theta}_i(X,Y,\bar{Y})\mid Y=y,\bar{Y}=\bar{y}\big]$ if this expectation is computable, or as $\theta^{q+1}_i=\frac{1}{T}\sum_{t=1}^{T}\hat{\theta}_i(x^t,y,\bar{y})$ if it is not, where $x^1,\dots,x^T$ are simulated according to $p(x\mid y,\bar{y},\theta^q)$.

    Here, we take the same estimator for $p(\omega_i,\omega_j)$ as given in (6.1). The conditional expectation of this estimator, at iteration $(q+1)$, can be calculated and gives

    $$p^{(q+1)}(\omega_i,\omega_j)=E\big[\hat{p}(\omega_i,\omega_j)(z)\mid Y=y,\bar{Y}=\bar{y}\big]=\frac{1}{N-1}\sum_{n=1}^{N-1}\psi^{(q)}_n(\omega_i,\omega_j), \qquad (6.9)$$

    where $\psi^{(q)}_n(\omega_i,\omega_j)=p(x_n=\omega_i,x_{n+1}=\omega_j\mid y,\bar{y})$ is the joint a posteriori probability of two consecutive classes, and it can be shown that

    $$\psi^{(q)}_n(\omega_i,\omega_j)=\frac{\alpha_n(\omega_i)\,p(x_{n+1}=\omega_j,y_{n+1},\bar{y}_{n+1}\mid x_n=\omega_i,y_n,\bar{y}_n)\,\beta_{n+1}(\omega_j)}{\sum_{l=1}^{K}\sum_{k=1}^{K}\alpha_n(\omega_l)\,p(x_{n+1}=\omega_k,y_{n+1},\bar{y}_{n+1}\mid x_n=\omega_l,y_n,\bar{y}_n)\,\beta_{n+1}(\omega_k)}. \qquad (6.10)$$

    For the parameters $\mu_{ij}$ and $\Sigma_{ij}$, we choose the same estimators as in (6.4) and (6.5).

    Here, we simulate the process $X$ according to its a posteriori distribution under the BPMC model, $p(x\mid y,\bar{y})$, where

    $$p(x\mid y,\bar{y})=p(x_1\mid y,\bar{y})\prod_{n=1}^{N-1}p(x_{n+1}\mid x_n,y,\bar{y}), \qquad (6.11)$$

    with an initial distribution $\chi_1(\omega_i)$ and a transition matrix

    $$p(x_{n+1}=\omega_j\mid x_n=\omega_i,y,\bar{y})=\frac{\psi_n(\omega_i,\omega_j)}{\chi_n(\omega_i)}. \qquad (6.12)$$

    The ICE algorithm procedure for the BPMC model is then explained as follows.

    ● Iteration $(q=0)$, for $i,j=1,\dots,K$.

    (1) Initialization of the algorithm with $p^{(q)}(\omega_i,\omega_j)$, $\mu^{(q)}_{ij}$, $\Sigma^{(q)}_{ij}$.

    (2) Calculation of the probabilities $\alpha^{(q)}_n(\omega_i)$ according to (5.14) and (5.16).

    (3) Calculation of the probabilities $\beta^{(q)}_n(\omega_i)$ according to (5.15) and (5.17).

    (4) Deduction of the probabilities $\chi^{(q)}_n(\omega_i)$ according to (5.18).

    (5) Deduction of the probabilities $\psi^{(q)}_n(\omega_i,\omega_j)$ according to (6.10).

    (6) Simulation of $x^{(q),1},\dots,x^{(q),T}$ according to (6.11).

    ● Iteration $(q+1)$, for $i,j=1,\dots,K$.

    (1) Calculation of $p^{(q+1)}(\omega_i,\omega_j)$ according to (6.9).

    (2) Calculation of $\mu^{(q+1)}_{ij}$ and $\Sigma^{(q+1)}_{ij}$ according to (6.8).

    ● Iterations are stopped if the algorithm converges; otherwise, the preceding step is repeated.

    The SEM algorithm is an iterative procedure that uses stochastic drawings to estimate a sequence of model parameters from observations and realizations of X. In each iteration, we simulate the process X according to its a posteriori distribution based on the parameters obtained in the current iteration. Then, the SEM algorithm for the models in this work proceeds as follows:

    ● We initialize the algorithm with $\theta^0_i$.

    ● At each iteration $(q)$, we simulate just one realization $x$ of $X$ according to its a posteriori distribution considering the current parameters. The parameters $\theta^{(q+1)}_i$ are then calculated.

    The SEM algorithm under the PMC model runs as follows:

    ● Iteration $(q=0)$, for $i,j=1,\dots,K$.

    (1) Initialization of the algorithm with $p^{(q)}(\omega_i,\omega_j)$, $\mu^{(q)}_{ij}$, $\Sigma^{(q)}_{ij}$.

    (2) Calculation of the probabilities $\alpha^{(q)}_n(\omega_i)$ according to (5.8) and (5.10).

    (3) Calculation of the probabilities $\beta^{(q)}_n(\omega_i)$ according to (5.9) and (5.11).

    (4) Deduction of the probabilities $\chi^{(q)}_n(\omega_i)$ according to (5.12).

    (5) Deduction of the probabilities $\psi^{(q)}_n(\omega_i,\omega_j)$ according to (6.3).

    (6) Simulation of $x^{(q)}$ according to (6.6).

    ● Iteration $(q+1)$, for $i,j=1,\dots,K$.

    (1) Calculation of $p^{(q+1)}(\omega_i,\omega_j)$ by

    $$p^{(q+1)}(\omega_i,\omega_j)=\frac{1}{N-1}\sum_{n=1}^{N-1}\mathbb{1}_{\{x^{(q)}_n=\omega_i,\,x^{(q)}_{n+1}=\omega_j\}}. \qquad (6.13)$$

    (2) Calculation of $\mu^{(q+1)}_{ij}$ and $\Sigma^{(q+1)}_{ij}$ by

    $$\mu^{(q+1)}_{ij}=\frac{\sum_{n=1}^{N-1}\begin{pmatrix}y_n\\ y_{n+1}\end{pmatrix}\mathbb{1}_{\{x^{(q)}_n=\omega_i,\,x^{(q)}_{n+1}=\omega_j\}}}{\sum_{n=1}^{N-1}\mathbb{1}_{\{x^{(q)}_n=\omega_i,\,x^{(q)}_{n+1}=\omega_j\}}}. \qquad (6.14)$$
    $$\Sigma^{(q+1)}_{ij}=\frac{\sum_{n=1}^{N-1}\left(\begin{pmatrix}y_n\\ y_{n+1}\end{pmatrix}-\mu^{(q)}_{ij}\right)\left(\begin{pmatrix}y_n\\ y_{n+1}\end{pmatrix}-\mu^{(q)}_{ij}\right)^{t}\mathbb{1}_{\{x^{(q)}_n=\omega_i,\,x^{(q)}_{n+1}=\omega_j\}}}{\sum_{n=1}^{N-1}\mathbb{1}_{\{x^{(q)}_n=\omega_i,\,x^{(q)}_{n+1}=\omega_j\}}}. \qquad (6.15)$$

    ● Iterations are stopped if the algorithm converges; otherwise, the preceding step is repeated.
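    A minimal Python sketch of this re-estimation step (Eqs (6.13)-(6.15)), assuming x is the single realization $x^{(q)}$ just simulated from $p(x\mid y)$, is:

```python
import numpy as np

def sem_update_pmc(y, x, K):
    """SEM re-estimation step (6.13)-(6.15) for the PMC model,
    given the realization x = x^(q) simulated from p(x | y) at iteration q."""
    N = len(y)
    pairs = np.stack([y[:-1], y[1:]], axis=1)
    P_new = np.zeros((K, K))
    mu_new = np.zeros((K, K, 2))
    Sig_new = np.zeros((K, K, 2, 2))
    for i in range(K):
        for j in range(K):
            sel = (x[:-1] == i) & (x[1:] == j)
            P_new[i, j] = sel.sum() / (N - 1)            # Eq (6.13)
            if sel.any():
                mu_new[i, j] = pairs[sel].mean(axis=0)   # Eq (6.14)
                d = pairs[sel] - mu_new[i, j]
                # Eq (6.15) centers on the previous iterate mu^(q)_ij;
                # here we simply center on the empirical mean just computed.
                Sig_new[i, j] = d.T @ d / sel.sum()
    return P_new, mu_new, Sig_new
```

    The SEM update under the BPMC model (Eqs (6.16)-(6.18)) is identical, with $x^{(q)}$ simulated from $p(x\mid y,\bar{y})$ according to Eq (6.11).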

    The SEM algorithm under the BPMC model runs as follows:

    ● Iteration $(q=0)$, for $i,j=1,\dots,K$.

    (1) Initialization of the algorithm with $p^{(q)}(\omega_i,\omega_j)$, $\mu^{(q)}_{ij}$, $\Sigma^{(q)}_{ij}$.

    (2) Calculation of the probabilities $\alpha^{(q)}_n(\omega_i)$ according to (5.14) and (5.16).

    (3) Calculation of the probabilities $\beta^{(q)}_n(\omega_i)$ according to (5.15) and (5.17).

    (4) Deduction of the probabilities $\chi^{(q)}_n(\omega_i)$ according to (5.18).

    (5) Deduction of the probabilities $\psi^{(q)}_n(\omega_i,\omega_j)$ according to (6.10).

    (6) Simulation of $x^{(q)}$ according to (6.11).

    ● Iteration $(q+1)$, for $i,j=1,\dots,K$.

    (1) Calculation of $p^{(q+1)}(\omega_i,\omega_j)$ by

    $$p^{(q+1)}(\omega_i,\omega_j)=\frac{1}{N-1}\sum_{n=1}^{N-1}\mathbb{1}_{\{x^{(q)}_n=\omega_i,\,x^{(q)}_{n+1}=\omega_j\}}. \qquad (6.16)$$

    (2) Calculation of $\mu^{(q+1)}_{ij}$ and $\Sigma^{(q+1)}_{ij}$ by

    $$\mu^{(q+1)}_{ij}=\frac{\sum_{n=1}^{N-1}\begin{pmatrix}y_n\\ y_{n+1}\end{pmatrix}\mathbb{1}_{\{x^{(q)}_n=\omega_i,\,x^{(q)}_{n+1}=\omega_j\}}}{\sum_{n=1}^{N-1}\mathbb{1}_{\{x^{(q)}_n=\omega_i,\,x^{(q)}_{n+1}=\omega_j\}}}. \qquad (6.17)$$
    $$\Sigma^{(q+1)}_{ij}=\frac{\sum_{n=1}^{N-1}\left(\begin{pmatrix}y_n\\ y_{n+1}\end{pmatrix}-\mu^{(q)}_{ij}\right)\left(\begin{pmatrix}y_n\\ y_{n+1}\end{pmatrix}-\mu^{(q)}_{ij}\right)^{t}\mathbb{1}_{\{x^{(q)}_n=\omega_i,\,x^{(q)}_{n+1}=\omega_j\}}}{\sum_{n=1}^{N-1}\mathbb{1}_{\{x^{(q)}_n=\omega_i,\,x^{(q)}_{n+1}=\omega_j\}}}. \qquad (6.18)$$

    ● Iterations are stopped if the algorithm converges; otherwise, the preceding step is repeated.

    In the present section, we show several results on the use of the PMC and BPMC models for gray-level image segmentation in the Gaussian context. First, we present a comparative study of the results obtained from both the PMC and BPMC models, concerning the misclassification rates of the Bayesian MPM algorithm, the PSNR index, and the parameter estimates obtained by the ICE and SEM algorithms. Second, we illustrate the segmentation results obtained on simulated images corrupted with correlated noise. These tests give an idea of the behavior of the ICE and SEM methods for numerical and visual results.

    We segment some noisy simulated images with correlated noise using an unsupervised method in order to evaluate the robustness of the BPMC model in comparison to the PMC model, as well as the consequences when the data fits neither the PMC nor the HMC model. For the PMC model segmentation methods, the modeling of the image is done via an HPS to convert the bidimensional collection of pixels into a mono-dimensional chain, and an inverse HPS is then used to reorganize the segmented image. This transformation gives us a stochastic process with a very complex structure. This difficulty can be seen in a few papers [2,7,14].

    In this context, the observation process representing the correlated image was obtained as follows: for each $x_s$, we simulate a variable $N_s$ according to a Gaussian distribution $\mathcal{N}(\mu_{x_s},\sigma^2_{x_s})$, and then we take

    $$Y_s=\frac{1}{4a+1}\Big(N_s+a\sum_{i=1}^{4}N_{s_i}\Big),\quad\text{for each } s\in S, \qquad (7.1)$$

    where $S$ is the set of pixels of the image in its bidimensional form, and $s_i$, $i=1,\dots,4$, are the neighbors of pixel $s$ in a neighborhood of the four nearest neighbors.
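    A minimal Python sketch of this noise generation, assuming x_img is the two-dimensional array of class labels and mu and sigma hold the per-class means and standard deviations of the auxiliary field $N_s$, is given below; the edge handling by replication is our own implementation choice, not specified in the text.

```python
import numpy as np

def correlated_noise(x_img, mu, sigma, a, rng=np.random.default_rng()):
    """Correlated observation process of Eq (7.1).

    x_img     : 2-D array of class labels x_s
    mu, sigma : per-class means and standard deviations of the auxiliary field N_s
    a         : correlation strength
    """
    Ns = rng.normal(mu[x_img], sigma[x_img])          # N_s ~ N(mu_{x_s}, sigma_{x_s}^2)
    padded = np.pad(Ns, 1, mode="edge")               # replicate borders (implementation choice)
    # sum of the four nearest neighbors of each pixel
    neigh = (padded[:-2, 1:-1] + padded[2:, 1:-1] +
             padded[1:-1, :-2] + padded[1:-1, 2:])
    return (Ns + a * neigh) / (4 * a + 1)             # Eq (7.1)
```

    The parameter a controls the strength of the spatial correlation of the noise (Table 4 uses a = 0.5 for series 1 and a = 0.05 for series 2).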

    We study the performance of the models presented in this work on two-class images of different sizes, which have been corrupted by the previous method. However, for the BPMC model segmentation methods, the sequence of pixels has been obtained via a "line by line" scan. We use the same form of $Y_s$ as above, and for the second element of the observable process, we set $\bar{Y}_s$ to the average of the observations at two pixels neighboring $s$ in the image but not in the chain.

    The sequence of pixels collected by converting a bidimensional image into a one-dimensional chain is denoted by $(s_i)_{1\le i\le N}$. The two realizations of the stochastic processes $X$ and $Y$ are, respectively, $x=(x_1,\dots,x_n,\dots,x_N)$ and $y=(y_1,\dots,y_n,\dots,y_N)$, where $x_i=x_{s_i}$ and $y_i=y_{s_i}$. Under these assumptions, both processes $Z=(X,Y)$ and $(Z,\bar{Y})$ have fairly complex structures, and the distribution of $Y$ conditional on $X=x$ is not necessarily Markovian; nevertheless, we have segmented the noisy images using an unsupervised method based on MPM restoration under the PMC and BPMC models, where all parameters are estimated with the ICE and SEM algorithms. The question is therefore whether using the BPMC model instead of the PMC model in such a context can improve segmentation results. To provide some experiments, we consider two series of parameters $a$, $\mu_{x_s}$, and $\sigma_{x_s}$. Based on the parameters shown in Table 4, the original images and their noisy versions are reported in Figure 2.

    Table 4.  Parameters used for the segmentation experiments.
    Series | a | $\mu_1$ | $\mu_2$ | $\sigma_1$ | $\sigma_2$
    Series 1 | 0.5 | 16 | 10 | 1 | 4
    Series 2 | 0.05 | 1 | 7 | 1 | 2

    Figure 2.  The synthetic images and their noisy versions used in the experiments.

    We performed six segmentations, and we note that the complete data (C-D) means the original image and its noised version.

    ● Segmentation by MPM based on parameters of PMC model obtained from the C-D.

    ● Segmentation by MPM based on parameters of BPMC model obtained from the C-D.

    ● Segmentation by MPM based on parameters of PMC model obtained with ICE algorithm.

    ● Segmentation by MPM based on parameters of PMC model obtained with SEM algorithm.

    ● Segmentation by MPM based on parameters of BPMC model obtained with ICE algorithm.

    ● Segmentation by MPM based on parameters of BPMC model obtained with SEM algorithm.

    We calculated several evaluation criteria to confirm the visual results. The comparison of the algorithms is performed using two quality measures: the error rate and the PSNR index (Table 5).

    Table 5.  Incorrectly classified and PSNR index given by different MPM segmentations for two series of parameters.
    Data | MPM, BPMC, C-D | MPM, PMC, C-D | MPM, BPMC, ICE | MPM, PMC, ICE | MPM, BPMC, SEM | MPM, PMC, SEM
    Image1, series 1 | 3.483 / 40.91 | 8.996 / 38.12 | 4.473 / 40.60 | 12.06 / 40.48 | 5.471 / 42.25 | 9.730 / 37.03
    Image1, series 2 | 2.366 / 40.37 | 9.654 / 37.12 | 3.277 / 39.50 | 2.571 / 39.79 | 2.030 / 39.95 | 3.981 / 36.69
    Image2, series 1 | 3.013 / 43.25 | 6.553 / 41.99 | 4.400 / 43.45 | 11.89 / 43.32 | 2.662 / 42.59 | 9.730 / 40.79
    Image2, series 2 | 2.655 / 42.84 | 4.801 / 41.94 | 2.500 / 42.87 | 2.900 / 43.52 | 2.465 / 42.51 | 2.767 / 39.84
    Image3, series 1 | 1.437 / 35.62 | 2.740 / 35.52 | 6.786 / 35.46 | 13.36 / 35.05 | 0.618 / 35.69 | 2.752 / 35.52
    Image3, series 2 | 1.212 / 35.68 | 1.820 / 35.59 | 4.059 / 35.58 | 4.669 / 35.51 | 0.688 / 35.68 | 1.962 / 35.56
    Image4, series 1 | 3.561 / 31.04 | 11.015 / 30.74 | 12.554 / 30.69 | 15.53 / 30.78 | 4.981 / 30.73 | 6.212 / 30.64
    Image4, series 2 | 2.421 / 31.04 | 2.713 / 31.02 | 3.988 / 31.03 | 5.633 / 30.50 | 3.117 / 30.74 | 4.389 / 30.67

    (Each cell reports the misclassification rate $\tau$ in % followed by the PSNR.)


    To start, we can make a number of important observations. Columns 1 and 2 can be seen as a kind of reference for the two models presented in this work, since the complete data is used in those estimations. Comparing the results obtained by the estimation algorithms with those calculated from the complete data, one can notice that both the ICE and SEM algorithms implemented under the BPMC model show favorable results, with misclassification rates and PSNR indexes that are similar for both noise parameter series, except in the case of Image4 corrupted by noise with the series 1 parameters.

    The first conclusion of interest for image segmentation is that the SEM algorithm behaves correctly in the situation under consideration. The comparison of the BPMC and PMC models is the second conclusion. One might expect that, if the data fit neither an HMC nor a PMC, the PMC model-based MPM with HPS would give better results than any other model. This is not the case: according to these findings and other experiments we have executed, even in the presence of fairly strong noise, BPMC model-based MPM restorations surpass PMC model-based ones. Accordingly, using MPM under the BPMC model, the ICE and SEM algorithms provide strong unsupervised segmentation methods that may substantially beat PMC model-based ones with HPS.

    Table 6 also reports the parameters found using the complete data together with the parameters estimated by the ICE and SEM algorithms based on the BPMC and PMC models. Regarding these estimations by both algorithms, one can see that, in the majority of segmented-image situations, the estimates are similar to those obtained from the complete data for both the PMC and BPMC models. With respect to this, the takeaway is that the BPMC model, in its current form, provides another choice that can compete with and likely outperform nearly any other model in the circumstances at hand.

    Table 6.  Estimations of the parameters for Image1 with correlated noise by series 1 .
    Parameter | C-D, BPMC | C-D, PMC | ICE, BPMC | ICE, PMC | SEM, BPMC | SEM, PMC
    $p(\omega_1,\omega_1)$ | 0.095 | 0.098 | 0.125 | 0.117 | 0.141 | 0.028
    $p(\omega_1,\omega_2)$ | 0.057 | 0.053 | 0.374 | 0.382 | 0.054 | 0.032
    $p(\omega_2,\omega_1)$ | 0.057 | 0.053 | 0.064 | 0.109 | 0.054 | 0.032
    $p(\omega_2,\omega_2)$ | 0.790 | 0.794 | 0.435 | 0.390 | 0.749 | 0.906
    $\mu_{\omega_1,\omega_1}$ | (14.97, 14.96) | (14.30, 14.30) | (14.68, 14.69) | (13.90, 13.90) | (14.44, 14.46) | (14.73, 14.71)
    $\mu_{\omega_1,\omega_2}$ | (13.95, 11.89) | (13.22, 11.95) | (13.18, 11.36) | (12.17, 10.69) | (12.86, 11.47) | (14.46, 13.03)
    $\mu_{\omega_2,\omega_1}$ | (11.90, 13.97) | (11.94, 13.23) | (11.37, 13.18) | (10.69, 12.18) | (11.47, 12.82) | (13.07, 14.48)
    $\mu_{\omega_2,\omega_2}$ | (10.13, 10.13) | (10.27, 10.27) | (10.20, 10.21) | (10.17, 10.17) | (10.06, 10.06) | (10.59, 10.59)
    $\sigma^1_{\omega_1,\omega_1}$ | 1.059 | 1.045 | 1.306 | 1.322 | 1.836 | 0.331
    $\sigma^2_{\omega_1,\omega_1}$ | 1.065 | 1.060 | 1.313 | 1.316 | 1.782 | 0.361
    $\sigma^{12}_{\omega_1,\omega_1}$ | 0.092 | 0.258 | 0.510 | 0.602 | 0.294 | 0.078
    $\sigma^1_{\omega_1,\omega_2}$ | 1.022 | 1.161 | 3.272 | 1.713 | 2.120 | 0.806
    $\sigma^2_{\omega_1,\omega_2}$ | 1.405 | 1.428 | 2.427 | 2.113 | 1.237 | 1.099
    $\sigma^{12}_{\omega_1,\omega_2}$ | 0.018 | 0.255 | 1.094 | 0.943 | 0.315 | 0.229
    $\sigma^1_{\omega_2,\omega_1}$ | 1.393 | 1.447 | 2.409 | 2.122 | 1.189 | 1.090
    $\sigma^2_{\omega_2,\omega_1}$ | 1.045 | 1.163 | 3.251 | 1.725 | 2.141 | 0.789
    $\sigma^{12}_{\omega_2,\omega_1}$ | 0.047 | 0.266 | 1.077 | 0.966 | 0.351 | 0.227
    $\sigma^1_{\omega_2,\omega_2}$ | 1.084 | 1.180 | 1.441 | 1.242 | 1.001 | 2.031
    $\sigma^2_{\omega_2,\omega_2}$ | 1.085 | 1.179 | 1.442 | 1.242 | 0.997 | 2.039
    $\sigma^{12}_{\omega_2,\omega_2}$ | 0.073 | 0.191 | 0.521 | 0.530 | 0.015 | 0.994


    The visual results of the unsupervised segmentation using both the BPMC and PMC models are discussed here. Both algorithms have been initialized using a two-class segmentation performed by the $K$-means algorithm to generate the initial configuration of the process $X$.

    The images (Figure 2) to be segmented are chosen based on the homogeneity of the objects. Images 1 and 2 show extremely fine details, whereas Images 3 and 4 are not particularly homogeneous and lack fine details. Thus, it is easy to demonstrate a significant advantage of the BPMC model over the PMC model with HPS in the context of image segmentation. In fact, regarding Table 5, the rating is based on the average error rate of each algorithm across the eight experiments. Under the BPMC model, ICE averaged 5.25 % and SEM 2.75 %, while under the PMC model, ICE averaged 8.57 % and SEM 5.19 %. The average PSNR index gives the following ranking: ICE (37.39) and SEM (37.51) under the BPMC model, and ICE (37.36) and SEM (35.84) under the PMC model.

    We can also visually observe that, depending on the segmentation phase required in these experiments, the error rate is not always the most important criterion. In fact, for Image 4, the details of the zebra (Figure 3) are better restored and there are fewer spurious white and black pixels on the background in the case of the BPMC model, whereas this is not the case for the PMC model (Figure 4). However, the error rates of the two models are very close.

    Figure 3.  Segmentation results corresponding to the four synthetic images for two noises under the BPMC model.
    Figure 4.  Segmentation results corresponding to the four synthetic images for two noises under PMC model.

    Nonetheless, we obtained the following conclusions:

    (1) The introduction of a second component in the observation process makes the BPMC model a good rival to the PMC model with HPS in all numerical (Table 5) and visual scenarios (Figure 3) in the context of image segmentation.

    (2) The SEM algorithm performs better than the ICE algorithm, especially when working on the BPMC model (Table 5).

    Remark 7.1. The HPS is a space-filling curve used to read pixels during image processing. This sweep takes advantage of the two-dimensional locality of pixels. It should be noted that such a scan is only conceivable in the case of $2^n\times 2^n$ images, which constrains the PMC model in our study. However, the advantage of the proposed BPMC model is that it is capable of handling images of any size.

    An unsupervised approach is particularly useful when little or no labeled data is available for training. The proposed approach has shown superior results compared to the classical PMC model and can potentially be utilized for various applications. In particular, the suggested model can be used for the segmentation of radar images, textured images, and medical images, because these types of images exhibit significant noise correlation.

    We proposed an unsupervised approach for restoring hidden data by applying the PMC model. This study contributes a two-dimensional observed process for that model. The novel BPMC model is proposed first to compete with the classical PMC model for image segmentation, and then to solve the problem of noise correlation and the difficulty of preserving two-dimensional pixel locality during the one-dimensional transformation of the image. The parameter estimation methods described for this model are applicable to Gaussian and possibly correlated noise. We first found that when the data follows a PMC structure, Bayesian restorations according to the BPMC model beat those based on the PMC model. Likewise, the parameter estimation methods described for the BPMC model are extremely efficient, as restorations based on the complete-data parameters are comparable to restorations using the estimated values. Next, we executed a series of tests combining unsupervised segmentation of images via an HPS in the case of the PMC model and the insertion of a second element into the observed process in the case of the BPMC model. The assessment of interest in this study is based on two points. In the first step, we showed that the BPMC model works best when the stochastic process obtained from a noisy class image with correlated noise is very complex (neither PMC nor HMC). Several experiments in this study show that the BPMC model always gives better results compared to the PMC model, and the difference can even be quite significant in some situations. So, one way to generate complex data is to use a noisy image in a correlated version and an HPS or line-by-line technique. In the second step, we proved that the estimation methods corresponding to the proposed model can compete with classical methods based on the PMC model with high efficiency and are therefore of interest for the image segmentation problem.

    As a general conclusion, we can confirm that the proposed model, in which we introduced a second component in the observed process, offers an interesting alternative to the PMC model with an HPS. Basically, the composition of the second component of Y based on the four nearest neighbors of each pixel plays the same role as the HPS, while the BPMC-based unsupervised restorations with parameter estimation methods surpass those based on the PMC model.

    As perspectives for further work, we may apply the concept used here to the triplet Markov model described in [40,41] in the case of Gaussian noise [4], so that the observed process becomes two-dimensional and the resulting model is therefore a bidimensional triplet Markov chain. Of course, the same concept could be used for the segmentation of fuzzy data: it consists of considering a fuzzy HMC model with bidimensional observations, as presented in [42], and applying it to medical images. In the proposed model, both spatial correlation and locality details about pixels should be considered at the same time. Another potential direction concerns other types of noise, such as applying non-Gaussian noise to image segmentation using bidimensional Markov models.

    A. Joumad: Conceptualization, Data curation, Writing-original draft; A. El Moutaouakkil: Conceptualization, Formal analysis, Writing-original draft; A. Nasroallah: Data curation, Formal analysis, Methodology; O. Boutkhoum: Methodology, Project administration, Software; Mejdl Safran: Funding acquisition, Investigation, Visualization; Sultan Alfarhood: Investigation, Project administration, Visualization; Imran Ashraf: Supervision, Validation, Writing-review & editing. All authors have read and approved the final version of the manuscript for publication.

    The authors extend their appreciation to King Saud University for funding this research through Researchers Supporting Project Number (RSPD2024R890), King Saud University, Riyadh, Saudi Arabia.

    The authors declare there is no conflict of interest.



    [1] O. Cappé, E. Moulines, T. Rydén, Inference in hidden Markov models, Proceedings of EUSFLAT Conference, 2009, 14–16.
    [2] B. Benmiloud, W. Pieczynski, Estimation des paramètres dans les chaînes de Markov cachées et segmentation d'images, Traitement du Signal, 12 (1995), 433–454.
    [3] R. J. Elliott, L. Aggoun, J. B. Moore, Hidden Markov models: Estimation and control, Springer Science & Business Media, 1995.
    [4] C. Fernandes, Chaînes de Markov triplets et segmentation non supervisée d'images, Institut Polytechnique de Paris, 2022.
    [5] L. Rabiner, A tutorial on hidden Markov models and selected applications in speech recognition, Proceedings of the IEEE, 77 (1989), 257–286. https://doi.org/10.1109/5.18626 doi: 10.1109/5.18626
    [6] W. Pieczynski, Pairwise Markov chains, IEEE T. Pattern Anal., 25 (2003), 634–639. https://doi.org/10.1109/TPAMI.2003.1195998
    [7] C. Fernandes, T. Monti, E. Monfrini, W. Pieczynski, Fast image segmentation with contextual scan and Markov chains, In: 29th European Signal Processing Conference, Dublin, Ireland, 2021,626–630. https://doi.org/10.23919/EUSIPCO54536.2021.9616332
    [8] A. Joumad, A. E. Moutaouakkil, A. Nasroallah, O. Boutkhoum, F. Rustam, I. Ashraf, Unsupervised statistical image segmentation using bi-dimensional hidden markov chains model with application to mammography images, J. King Saud Univ.-Com., 35 (2023), 101715. https://doi.org/10.1016/j.jksuci.2023.101715 doi: 10.1016/j.jksuci.2023.101715
    [9] A. P. Dempster, N. M. Laird, D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, J. R. Stat. Soc. B, 39 (1977), 1–22. https://doi.org/10.1111/j.2517-6161.1977.tb01600.x doi: 10.1111/j.2517-6161.1977.tb01600.x
    [10] G. Celeux, F. Forbes, N. Peyrard, EM procedures using mean field-like approximations for Markov model-based image segmentation, Pattern Recogn., 36 (2003), 131–144. https://doi.org/10.1016/S0031-3203(02)00027-4 doi: 10.1016/S0031-3203(02)00027-4
    [11] W. Pieczynski, Hidden Markov fields and iterative conditional estimation, Traitement du Signal, 11 (1994), 141–154.
    [12] S. Allassonnière, E. Kuhn, Convergent stochastic expectation maximization algorithm with efficient sampling in high dimension. application to deformable template model estimation, Comput. Stat. Data An., 91 (2015), 4–19. https://doi.org/10.1016/j.csda.2015.04.011 doi: 10.1016/j.csda.2015.04.011
    [13] S. Huda, J. Yearwood, R. Togneri, A stochastic version of expectation maximization algorithm for better estimation of hidden Markov model, Pattern Recogn. Lett., 30 (2009), 1301–1309. https://doi.org/10.1016/j.patrec.2009.06.006 doi: 10.1016/j.patrec.2009.06.006
    [14] S. Derrode, W. Pieczynski, Signal and image segmentation using pairwise Markov chains, IEEE T. Signal Proces., 52 (2004), 2477–2489. https://doi.org/10.1109/TSP.2004.832015 doi: 10.1109/TSP.2004.832015
    [15] R. Fjortoft, Y. Delignon, W. Pieczynski, M. Sigelle, F. Tupin, Unsupervised classification of radar images using hidden Markov chains and hidden Markov random fields, IEEE T. Geosci. Remote, 41 (2003), 675–686. https://doi.org/10.1109/TGRS.2003.809940
    [16] S. Z. Li, Markov random field modeling in image analysis, Springer Science & Business Media, 2009.
    [17] Q. Jackson, D. A. Landgrebe, Adaptive Bayesian contextual classification based on Markov random fields, IEEE T. Geosci. Remote, 40 (2002), 2454–2463. https://doi.org/10.1109/TGRS.2002.805087
    [18] K. Kuljus, J. Lember, Pairwise Markov models and hybrid segmentation approach, Methodol. Comput. Appl. Probab., 25 (2023). https://doi.org/10.1007/s11009-023-10044-z
    [19] E. Monfrini, J. Lecomte, F. Desbouvries, W. Pieczynski, Image and signal restoration using pairwise Markov trees, In: IEEE Workshop on Statistical Signal Processing, Saint Louis, MO, USA, 2003,174–177. https://doi.org/10.1109/SSP.2003.1289372
    [20] W. Pieczynski, A. N. Tebbache, Pairwise Markov random fields and segmentation of textured images, Mach. Graph. Vision, 9 (2000), 705–718.
    [21] M. E. Y. Boudaren, L. An, W. Pieczynski, Dempster-Shafer fusion of evidential pairwise Markov fields, Int. J. Approx. Reason., 74 (2016), 13–29. https://doi.org/10.1016/j.ijar.2016.03.006 doi: 10.1016/j.ijar.2016.03.006
    [22] J. B. Courbot, V. Mazet, E. Monfrini, C. Collet, Pairwise Markov fields for segmentation in astronomical hyperspectral images, Signal Process., 163 (2019), 41–48. https://doi.org/10.1016/j.sigpro.2019.05.005 doi: 10.1016/j.sigpro.2019.05.005
    [23] S. Derrode, SAR image segmentation using generalized pairwise Markov chains, In: Proceedings of SPIE-The International Society for Optical Engineering, 4885 (2002). https://doi.org/10.1117/12.463177
    [24] S. Derrode, W. Pieczynski, Segmentation non supervisée d'images par chaîne de Markov couple, In: Ateliers Traitement et Analyse de l'Information: Méthodes et Applications, Hammamet, Tunisia, 2003.
    [25] N. Brunel, F. Barbaresco, Doppler and polarimetric statistical segmentation for radar clutter map based on pairwise Markov chains, Proc. of IEEE RADAR, (2007), 8–10.
    [26] S. Derrode, W. Pieczynski, Unsupervised data classification using pairwise Markov chains with automatic copulas selection, Comput. Stat. Data An., 63 (2013), 81–98. https://doi.org/10.1016/j.csda.2013.01.027 doi: 10.1016/j.csda.2013.01.027
    [27] A. K. Atiampo, G. L. Loum, Unsupervised image segmentation with pairwise Markov chains based on nonparametric estimation of copula using orthogonal polynomials, Int. J. Image Graph., 16 (2016), 2526–2541. https://doi.org/10.1142/S0219467816500200 doi: 10.1142/S0219467816500200
    [28] S. Rafi, M. Castella, W. Pieczynski, Pairwise Markov model applied to unsupervised image separation, In: IASTED International Conference on Signal Processing, Pattern Recognition, and Applications, Innsbruck, Austria, 2011. https://doi.org/10.2316/P.2011.721-044
    [29] E. Azeraf, E. Monfrini, E. Vignon, W. Pieczynski, Highly fast text segmentation with pairwise Markov chains, In: 6th IEEE International Congress on Information Science and Technology, Machine Learning for Natural Language Processing, Agadir-Essaouira, Morocco, (2020), 361–366. https://doi.org/10.1109/CiSt49399.2021.9357304
    [30] I. Papila, O. Ersoy, Multiscale segmentation of remotely sensed images using pairwise Markov chains, In: IEEE Antennas and Propagation Society Symposium, 2 (2004), 2123–2126. https://doi.org/10.1109/APS.2004.1330629
    [31] S. L. Cam, F. Salzenstein, C. Collet, Fuzzy pairwise Markov chain to segment correlated noisy data, Signal Proces., 88 (2008), 2526–2541. https://doi.org/10.1016/j.sigpro.2008.05.003 doi: 10.1016/j.sigpro.2008.05.003
    [32] C. Carincotte, S. Derrode, S. Bourennane, Multivariate fuzzy hidden Markov chains model applied to unsupervised multiscale SAR image segmentation, In: The 14th IEEE International Conference on Fuzzy Systems, Reno, NV, USA, 2005,288–293. https://doi.org/10.1109/FUZZY.2005.1452408
    [33] B. Tso, R. C. Olsen, Combining spectral and spatial information into hidden Markov models for unsupervised image classification, Int. J. Remote Sens., 26 (2005), 2113–2133. https://doi.org/10.1080/01431160512331337844 doi: 10.1080/01431160512331337844
    [34] J. F. Mari, F. L. Ber, Temporal and spatial data mining with second-order hidden Markov models, Soft Comput., 10 (2006), 406–414.
    [35] A. Hafiane, B. Zavidovique, S. Chaudhuri, A modified FCM with optimal Peano scans for image segmentation, In: IEEE International Conference on Image Processing, Genova, Italy, 2005, III–840. https://doi.org/10.1109/ICIP.2005.1530523
    [36] Y. L. Song, B. Adobah, J. F. Qu, C. M. Liu, Segmentation of ordinary images and medical images with an adaptive hidden Markov model and Viterbi algorithm, Curr. Signal Transd. T., 15 (2020), 109–123. https://doi.org/10.2174/1574362413666181109113834 doi: 10.2174/1574362413666181109113834
    [37] W. Skarbek, Generalized Hilbert scan in image printing, In: Theoretical Foundations of Computer Vision, R. Klette, W. G. Kropatsch (Eds.), Akademie Verlag, Berlin, 1992, 45–57.
    [38] N. Brunel, Sur quelques extensions des chaînes de Markov cachées et couples. Application à la segmentation non supervisée des signaux radar, Université Pierre et Marie Curie-Paris VI, 2005.
    [39] P. Lanchantin, Chaînes de Markov triplets et segmentation non supervisée de signaux, Evry: Institut national des télécommunications, 2006.
    [40] W. Pieczynski, Chaînes de Markov triplet, Comptes Rendus Mathématique, 335 (2002), 275–278. https://doi.org/10.1016/S1631-073X(02)02462-7
    [41] W. Pieczynski, F. Desbouvries, On triplet Markov chains, In: Proceedings of the International Symposium on Applied Stochastic Models and Data Analysis, Brest, France, 335 (2005).
    [42] A. Joumad, A. E. Moutaouakkil, A. Nasroallah, O. Boutkhoum, Unsupervised image segmentation using fuzzy hidden Markov chain with bi-dimensional data, In: 11th International Symposium on Signal, Image, Video and Communications, El Jadida, Morocco, 2022, 1–6. https://doi.org/10.1109/ISIVC54825.2022.9800731
    © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)