Research article

A change point analysis protocol for comparing intracellular transport by different molecular motor combinations

  • Received: 09 May 2021 Accepted: 06 September 2021 Published: 18 October 2021
  • Intracellular transport by microtubule-based molecular motors is marked by qualitatively different behaviors. It is a long-standing and still-open challenge to accurately quantify the various individual-cargo behaviors and how they are affected by the presence or absence of particular motor families. In this work we introduce a protocol for analyzing change points in cargo trajectories that can be faithfully projected along the length of a (mostly) straight microtubule. Our protocol consists of automated identification of velocity change points, estimation of velocities during the behavior segments, and extrapolation to motor-specific velocity distributions. Using simulated data we show that our method compares favorably with existing methods. We then apply the technique to data sets in which quantum dots are transported by Kinesin-1, by Dynein-Dynactin-BicD2 (DDB), and by Kinesin-1/DDB pairs. In the end, we identify pausing behavior that is consistent with some tug-of-war model predictions, but also demonstrate that the simultaneous presence of antagonistic motors can lead to long processive runs that could contribute favorably to population-wide transport.

    Citation: Melanie A. Jensen, Qingzhou Feng, William O. Hancock, Scott A. McKinley. A change point analysis protocol for comparing intracellular transport by different molecular motor combinations[J]. Mathematical Biosciences and Engineering, 2021, 18(6): 8962-8996. doi: 10.3934/mbe.2021442

    Related Papers:

    [1] Tangsheng Zhang, Hongying Zhi . A fuzzy set theory-based fast fault diagnosis approach for rotators of induction motors. Mathematical Biosciences and Engineering, 2023, 20(5): 9268-9287. doi: 10.3934/mbe.2023406
    [2] Sandesh Athni Hiremath, Christina Surulescu, Somayeh Jamali, Samantha Ames, Joachim W. Deitmer, Holger M. Becker . Modeling of pH regulation in tumor cells: Direct interaction between proton-coupled lactate transporters and cancer-associated carbonicanhydrase. Mathematical Biosciences and Engineering, 2019, 16(1): 320-337. doi: 10.3934/mbe.2019016
    [3] Xian Hua, Jing Li, Ting Wang, Junhong Wang, Shaojun Pi, Hangcheng Li, Xugang Xi . Evaluation of movement functional rehabilitation after stroke: A study via graph theory and corticomuscular coupling as potential biomarker. Mathematical Biosciences and Engineering, 2023, 20(6): 10530-10551. doi: 10.3934/mbe.2023465
    [4] Gerasimos G. Rigatos, Efthymia G. Rigatou, Jean Daniel Djida . Change detection in the dynamics of an intracellular protein synthesis model using nonlinear Kalman filtering. Mathematical Biosciences and Engineering, 2015, 12(5): 1017-1035. doi: 10.3934/mbe.2015.12.1017
    [5] Liqiang Zhu, Ying-Cheng Lai, Frank C. Hoppensteadt, Jiping He . Characterization of Neural Interaction During Learning and Adaptation from Spike-Train Data. Mathematical Biosciences and Engineering, 2005, 2(1): 1-23. doi: 10.3934/mbe.2005.2.1
    [6] Giorgos Minas, David A Rand . Parameter sensitivity analysis for biochemical reaction networks. Mathematical Biosciences and Engineering, 2019, 16(5): 3965-3987. doi: 10.3934/mbe.2019196
    [7] Liping Yang, Na Xie, Yanru Yao, Chunxia Wang, Maozai Tian, Kai Wang . Hepatitis B time series in Xinjiang, China (2006–2021): change point detection based on the Mann-Kendall-Sneyers test. Mathematical Biosciences and Engineering, 2024, 21(2): 2458-2469. doi: 10.3934/mbe.2024108
    [8] Xinrui Ren, Jianbo Yu, Zhaomin Lv . Support vector machine optimization via an improved elephant herding algorithm for motor energy efficiency rating. Mathematical Biosciences and Engineering, 2022, 19(12): 11957-11982. doi: 10.3934/mbe.2022557
    [9] Zahid Raza, Juan LG Guirao, Ghada Bassioni . The comparative analysis of two molecular indices in random polyphenyl and spiro chains. Mathematical Biosciences and Engineering, 2022, 19(12): 12500-12517. doi: 10.3934/mbe.2022583
    [10] Yue Hu, Xiaoqin Mei, Dong Tang . Long non-coding RNA XIST is down-regulated and correlated to better prognosis in ovarian cancer. Mathematical Biosciences and Engineering, 2020, 17(3): 2070-2081. doi: 10.3934/mbe.2020110
  • Intracellular transport by microtubule-based molecular motors is marked by qualitatively different behaviors. It is a long-standing and still-open challenge to accurately quantify the various individual-cargo behaviors and how they are affected by the presence or absence of particular motor families. In this work we introduce a protocol for analyzing change points in cargo trajectories that can be faithfully projected along the length of a (mostly) straight microtubule. Our protocol consists of automated identification of velocity change points, estimation of velocities during the behavior segments, and extrapolation to motor-specific velocity distributions. Using simulated data we show that our method compares favorably with existing methods. We then apply the technique to data sets in which quantum dots are transported by Kinesin-1, by Dynein-Dynactin-BicD2 (DDB), and by Kinesin-1/DDB pairs. In the end, we identify pausing behavior that is consistent with some tug-of-war model predictions, but also demonstrate that the simultaneous presence of antagonistic motors can lead to long processive runs that could contribute favorably to population-wide transport.



    Intracellular transport of vesicles, organelles and other biomolecular cargo is commonly carried about by two families of motor proteins (kinesin and dynein) that tether cargoes to microtubules in the cell and move along these microtubules in a directed fashion. Microtubules have a structural polarization, termed the plus-end and a minus-end. Motors in the kinesin family generally move in the plus-end (anterograde) direction, while cytoplasmic dynein moves in the minus-end (retrograde) direction. Typically, more than one motor will be attached to a cargo at a time, and it has been a long-term theoretical challenge to understand how multiple motors cooperate [1,2,3,4] and/or compete [5,6,7,8] in the course of cargo transport. See Figure 1 for a schematic depiction. Indeed, particle tracking experiments for live cells have consistently revealed cargo trajectories that are bidirectional, marked by periods of both anterograde and retrograde motion, while featuring periods when transport appears to be "paused" [9,10]. The causes of the pauses [11,12] and the manner in which the motors exchange dominance to give rise to bidirectionality remain the focus of extensive investigation [10,13,14].

    Figure 1.  Anterograde moving motors, such as kinesin, transport cargo towards the cell periphery (plus end) by stepping along the microtubule, while retrograde moving motors, such as dynein, carry the cargo towards the cell nucleus (negative end). Figure copied with permission from [10].

    Initially, in vitro experiments of motor-driven cargo transport by motors were in one of two categories: (1) transport by a single motor, achieved by holding the motor concentration very low compared to cargo concentration thereby assuring at most one motor is attached; or (2) multi-motor transport with an uncertain number of motors being involved in the transport [15,16,17]. More recently, DNA origami techniques have been developed in order to observe transport by exactly two motors, which may or may not be of the same type [18,19,20,21]. The effort to make comparisons between single- and multi-motor transport has raised a number of methodological challenges. Two such challenges that we seek to address here are: (1) how to identify when a motor-cargo complex switches from one biophysical state to another; and (2) how to characterize and quantify the distribution of velocities exhibited by transport mediated by various motors and motor-motor combinations. A typical path of interest is displayed in Figure 2. The cargo is initially being transported by the motor and then comes to a stop. It pauses temporarily (with some continued progress) and then it begins stepping again until the end of the observed time.

    Figure 2.  Left: The tracked cargo tethered to a two-motor complex trajectory (black curve) in the (x, y) plane, where t0 denotes the starting position and tN denotes the ending position with respect to time. The purple line denotes the estimated position of the microtubule. Center: The trajectory (black curve) in the time, longitudinal distances coordinate system, where the red circles correspond to missing observations that have been estimated. Right: The increment process of the trajectory, where the red circles indicate the increments calculated from the missing observations that were estimated.

    A variety of methods to address questions like this exist in the literature. Some have been developed specifically for particle tracking data, while others (change point detection algorithms in particular) have been developed with other applications in mind. In Section 1.1 we survey some relevant methods and in this work we develop a particle-tracking specific algorithm that is informed by some of these "best practices" that have been explored in the statistics literature. One collection of work that was particularly influential for us is that generated by the KymoAnalyzer software developed by the Encalada lab [22]. This tool for quantifying behavior observed in kymograph data allows a user to partition individual paths into segments, and then estimate the velocity of each segment. The collection of observed velocities can then be binned in a histogram or displayed as a cumulative distribution function, either of which can be sufficiently detailed to capture qualitative differences in behavior observed in various experiments. The two major drawbacks to using KymoAnalyzer (or any similar tool) stem from the "use of the human hand" in the analysis. Because the segments must be marked individually, user-time is a limiting factor, preventing analysis of massive data sets. Moreover, different users can have different habits in determining when segments occur, leading to biases that are difficult to quantify.

    In the work that follows, we take a step toward developing a protocol for finding change points and reporting velocity distributions that is fully automated, once some initial settings are put in place. In Section 1.2 we summarize the method that we have implemented, which is Bayesian in the overall framework, but relies on a number of approximations to be computationally feasible. Subsequently, in Sections 3 and 4 we provide a detailed development of our approach to modeling both the data and the underlying biophysical process for the purposes of quantifying the data and simulating artificial data, respectively. We then describe our procedure for validating the statistical methods we employ. In Section 5 we report how various change point methods perform on the simulated data and then ultimately we apply our method to data sets featuring quantum dots being transported by a single kinesin-1 (kin1) motor, a single dynein-dynactin-BicD2 (DDB) motor, and a kin1-DDB pair. We are able to quantify differences in velocity distributions and run lengths and reveal a pattern of behavior that, at least from this preliminary perspective, is neither completely consistent with tug-of-war hypothesis predictions, nor with the in vivo observations of co-dependence among antogonistic motors.

    The change point detection problem is well-studied, dating back at least to the work of Page in the 1950s. After laying down a framework for detecting a change in parameter using cumulative sum schemes for general distributions [23,24], Page specifically addressed a problem much like what we study here in 1957, looking at the change in mean of independent normally distributed observations [25]. In 1964, Chernoff and Zacks [26] derived a Bayesian estimator for segment means under the assumption that there is at most one change. The problem becomes considerably more difficult when the number and location of the change points is not known. The literature on this topic is extensive, but we would like to highlight a few prominent methods that we include in our numerical analysis.

    Essentially all change point detection schemes rely on proposing a number of changes, and the locations of those changes, and then using a probability model to evaluate a likelihood score. Since a model with more change points can always be made to fit the data better, there is a risk of overfitting a trajectory. There must be some penalty in the score for how many change points are proposed. These two components are very explicit in algorithms that construct a piecewise-linear approximation of the data and compute the score based on the sum of squares of the residuals (RSS). One such algorithm, proposed by Bai and Perron [27,28], employs a dynamic programming algorithm to minimize RSS. This has been implemented in the R package strucchange, in which the function breakpoints uses the Bayesian Information Criterion (BIC) to penalize larger numbers of change points.

    Bayesian methods take a random sampling approach to finding good parameter sets. In these algorithms, a sequence of location vectors and associated means are proposed. At each step a likelihood score is evaluated and the proposed parameter set is accepted or rejected in accordance with the Metropolis-Hastings algorithm. Here the penalty is encoded in prior information about the anticipated probability that any given observation might be the location of a change point. Once a mathematical model is specified, innovations come about through proposing more efficient methods of exploring the parameter space. To our knowledge, the first Bayesian method for detecting multiple change points with unknown number was proposed by Barry and Hartigan [29]. They called their mathematical model the Product Partition Model (PPM), which encompasses all models that satisfy a product condition on partitions (the change points) and an independence condition for the observations given the partition. Barry and Hartigan construct a PPM in [30] that performs well in detecting sharp short-lived changes in the means of independently normally distributed observations. The PPM specified in [30] has been implemented as an R package bcp by Erdman et al. [31].

    Within the particle tracking community, in 2018 Yin et al. [32] proposed using a likelihood ratio test (LRT) to classify a time as a change point. To identify all the change points in a single trajectory, the authors utilized a recursive binary segmentation algorithm. Such a method is feasible for the case of a sequence of Normal random variables with change in mean parameter and common variance because the log likelihood ratio is a well studied object and estimates of the critical value exist, see [33]. We refer to this method as LRT in our analysis.

    In this work, we construct a PPM that differs from [30] in the modeling of the segment means and segment durations. Barry and Hartigan employ a hierarchical model for the segment means. Each segment mean is assumed to be drawn from a Normal distribution that has common mean, μ0, and variance σ20/|I| that is inversely proportional to the segment length, |I|. One of the objectives of this work is to infer the parameters μ0 and σ20. Such a model does well for sequences with one typical behavior and short departures from this behavior. This model structure may not be well-suited for many motor-cargo trajectories, which commonly involve a switch between two prolonged states. Moreover, the model prescribes the shape for the segment mean distribution that does not allow for important qualitative features that might exists in motor data, in particular multimodality. The method we propose takes advantage of Bayesian sampling techniques but does impose a model on the velocities. For this reason, we remove the hierarchical structure on the segment means and treat each segment velocity as a distinct parameter with a uniform prior with bounds determined by existing knowledge about motor speeds. Additionally, we modify the geometric distribution prior on the segment lengths used in [30] to guard against short unrealistic segments that commonly arise from tracker errors.

    In theory, one could implement a fully Bayesian non-parametric method to infer the change points, velocities, and velocity distribution structure all at once. However, this approach is computationally intractable for most desktop settings. In fact, despite efforts to minimize computational cost, the protocol we propose is itself computationally intensive. With these computational barriers in mind, we split our inference method into multiple stages. In the first stage, we introduce a likelihood approximation, utilizing the intuition underlying Bayes Factors, that reduces the dimension of the parameter space and allows us to estimate the most likely number of change points. Then given a number of change points, in the second stage, we use MCMC methods to sample from the posterior distribution of the position of the change points and the segment velocities. We then construct an estimated velocity distribution using samples from the stage 2 posterior distributions that are weighted by segment length. We refer to the method as Number of Change points - Conditional Inference (NC-CI).

    Kinesin-1 motors were bacterially expressed and purified by Ni column chromatography, as described previously [20]. Dynein-dynactin-BicD2 (DDB) complexes were purified by adding bacterially expressed BicD2 to bovine brain lysate and purifying DDB complexes by column chromatography, as described previously [21]. Both motors contained a C-terminal Green Fluorescent Protein (GFP). Isolated kinesin-1 and DDB were visualized by attaching streptavidin-coated Quantum dots (Qdots) to the motors through a biotinylated anti-GFP nanobody (GFP Binding Protein, or GBP). Kinesin-DDB complexes were formed by attaching single-stranded DNA-functionalized GBP to each motor and connecting them through a complementary single-stranded oligonucleotide labeled with a Qdot. All of the methods for forming complexes are described in Feng et al. [21]. The Qdot label has a diameter of approximately 0.03μm and estimated diffusivity of 10μm2/s (based on Stokes-Einstein equation). Motors were tracked by visualizing the Qdots with Total Internal Reflection Fluorescence Microscopy on a Nikon TE-2000 microscope, as previously described [21].

    Transport by Kinesin-1 has been well studied since the 1990s. Kinesin-1 has been shown to be an extremely processive motor taking approximately 100 steps, each of size 0.008μm, before detaching from the microtubule. The unloaded velocity is roughly 100 steps/s, or 0.8 μm/s [34,35,36], and the motor is able to sustain loads of roughly 6 pN [15,37,38].

    Due to this consistent processive behavior, we treated kinesin data as a control group, in which we did not expect to detect many change points. There were 34 paths within the Kinesin-1 data set, in which observations were taken at 20 frames per second (fps) (Δ=0.05s). By contrast, activated dynein in a DDB complex has been shown to display diverse motitily behavior in vitro, including processive runs, pauses, and diffusive behavior [19,21,39]. There were 42 tracked paths in this group with observations taken 22 fps (Δ=0.043s). In our analysis, we considered a subset of 41 DDB motor-cargo paths with approximately linear microtubules.

    Less is known about how a single cargo is transported by a multimotor complex comprised of antagonistic motors such as Kinesin-1 and DDB [10]. The last data set we studied consists of individual Qdots being transported by both a Kinesin-1 and a DDB motor, referred to as Kin1-DDB. These complexes have been assembled previously using DNA origami and shown to display tug-of-war behavior. There were 101 tracked particles and observations are taken 9fps (Δ0.11s). Because our technique is restricted to paths that are approximately linear, the excluded set of paths was larger for this data set than the others. We included 92 out of 101 paths. It is not immediately clear why there are more "bent" paths, but one likely explanation is that Kin1-DDB complexes may be more likely to switch between different microtubules before detaching.

    In this section, we present the mathematical model and algorithm used in this work. In Section 3.1 we present two models for the motor-cargo system, the first is a biophysical model based on Langevin dynamics, and the second is an approximation to the biophysical model that yields explicit solutions that can be used for statistical inference. In Section 3.2, we construct a Bayesian approach to estimating the number of change points that exist within a path. This is the first step in our two-step method. Then in Section 3.3, we provide the sampling algorithm to infer change point locations and segment velocities for a given number of change points, which constitutes the second step of our method. We give definitions of quantities of interest estimated from the outputs of the two proposed sampling algorithms in Section 3.4. Lastly, in Section 3.5 we describe the statistical methods used to clean and pre-process the molecular motor data described in Section 2.

    The motor-cargo process is intrinsically stochastic due to motor-stepping and the fluctuations of the associated cargo. We assume that the various motor configuration states (for example, a single kinesin motor transporting the cargo, or a single DDB motor, or a kinesin-DDB pair) can be modeled effectively through assigning a mean stepping rate for each state. To this end, suppose there are k changes in motor interactions at times τ=(τ1,,τk) with the additional notational convention that τ0=0. Within the jth time segment, τj1t<τj, the motor configuration is characterized by a distinct stepping rate, ρj, which can be either a deterministic value, or drawn from some probability distribution. We then model the motor step times within a segment as a Poisson arrival process with rate parameter ρj. Assuming that each motor step is of size δ, it follows that the position of the motor configuration center, Z(t), is a scaled Poisson process,

    (Z(t)Z(s))/δPoisson((ts)ρj)for τj1s<tτj (3.1)

    for j=1,k.

    As for the position of the cargo at time t, denoted X(t), we consider a Langevin Stochastic Differential Equation (SDE) framework in the overdamped (zero mass) limit. To be specific, the position of the cargo is governed by an Ornstein-Uhlenbeck process centered at Z(t):

    dX(t)=κγ(X(t)Z(t))dt+2DdW(t),X(0)=x0, (3.2)

    where κ is the spring constant of the tether, γ is the viscous drag coefficient of the cargo, D=kBT/γ is the diffusivity of the cargo in the absence of the tether (with kBT being Boltzmann's constant times the fluid temperature), and W(t) is a standard Brownian motion.

    In experimental settings, the position of the motor Z(t) is usually unknown. It is possible to estimate motor positions from cargo observation, but since the motor model is phenomenological to begin with, and since the quantities of interest are the various velocity states of the motor-cargo complex, we introduce a simplification that is used primarily for statistical inference purposes. The statistical model is expressed in terms of two non-physical parameters: the state velocity νj, and the state diffusivity 1/η. In this statistical model, the (observed) position of the cargo ˆX(t) with k changes in velocities is described by Brownian motion with changes in drift parameter:

    dˆX(t)=νjdt+1/ηdW(t)for τj1<tτj,ˆX(0)=x0. (3.3)

    The drift parameter νj (j{1,2,,k+1}) is the velocity of the cargo during the jth state (roughly ρjδ). The common diffusion parameter 1/η is treated as a nuisance parameter (its exact value is not of primary concern in this work) since it is an amalgamation of multiple sources of noise in the system: fluctuations in motor-stepping, fluctuations of the cargo about the motor positions, and observation error. This statistical model has the same form as a standard diffusion approximation for a Poisson process [40] (which would have the form 1/η=νj) but assuming this strict form would not take into account other contributions to the noise.

    Because single particle tracking data is collected at discrete times, we further assume that observations are made on a uniform grid of size Δ=tntn1 for n=1,N and that Δ is small enough so the jth change point can be approximated by an observation time,

    τj=tMjwhere Mj:=τj/Δ>0 (3.4)

    for all j=1,,k, with the convention that τ0=0 and τk+1=NΔ. Here Mj denotes the index of the jth change whereas τj denotes the time of the jth change. Under these assumptions for times t[tMj,tMj+1), the position of the observed cargo conditioned on the position of the jth change, ˆXMj=ˆX(ΔMj), is given by

    ˆX(t)|ˆXMj=ˆXMj+νj+1(ttMj)+1/η. (3.5)

    From Equation 3.5, the increment process of ˆX, that is {Ξn:=ˆXn+1ˆXn}Nn=1, has the following distribution

    ΞnN(νjΔ,η1Δ) if τj1<tnτj, (3.6)

    for 1jk+1, n=1,N. Denoting a realization of the increment process by ξ=(ξ1,,ξN), our statistical model assuming k change points has an approximate likelihood function

    LN,k(ξ;θk)=(η2πΔ)N/2exp(η2Δk+1j=1Mji=Mj1+1(ξiνjΔ)2) (3.7)

    where θk=(τ1,τk,ν1,νk+1,η) denotes the model parameters with k change points. We treat the vector of change point times, τk, as model parameters rather than the vector of change point indices, Mk=(M1,Mk), since the latter is computable from τk and Δ. We refer to the jth segment as the increments between Mj1+1 and Mj. In the special case when all the change points occur at observation times, Equation 3.7 is exact.

    Under this approximation our change point problem for molecular motors can be written as a sequence of independent normal random variables with changes in mean parameter. From this point forward, our inference technique assumes the statistical model with k changes,

    Mk:ξnNormal(νjΔ,η1Δ), for Mj1<nMj,j{1,k+1}, (3.8)

    where M0=0 and Mk+1=N. The unknown parameters of interest are k, τ, ν, and η.

    In this section, we provide our approach to approximating the marginal posterior probability distribution of the number of change points. We constructed a Metropolis-Hastings (MH) sampling algorithm that utilizes a switch point process to infer the number of change points, k. In Lavielle et al. [41], the switch point process was defined as an (N1)-dimensional vector of independent Bernoulli random variables, denoted r=(r1,,rN1), where rn=1 indicates a change point occurred at time observation n with some probability that is known a priori. By contrast, we take λ, the rate of changes among states, to be unknown. So, the parameters to be inferred in our model are the segment mean velocities, ν, the common precision, η, the switch point process, r|λ, and the rate of switches, λ. Assuming independence among the parameters η,ν, and r|λ, our model results in a joint posterior distribution of the form

    p(r,λ,ν,η|ξ)C=LN,Kr(ξ;r,ν,η)p(r,λ,ν,η)=LN,Kr(ξ;r,ν,η)(Kr+1j=1p(νj))p(η)p(r|λ)p(λ) (3.9)

    where C= denotes equality up to a constant, Kr is the number of of change points in switch point process r, LN,Kr(ξ;r,ν,η) is given in Equation 3.7, and p() denotes the prior distribution of unknown model parameters. Estimating all of these parameters simultaneously is computationally prohibitive. As such we have taken the approach to integrate over all possible segment velocities, weighted by their prior distribution, and then we infer the remaining parameters r, λ, and η. This marginalization of the velocities is similar to the prior-weighted averaging that takes place in computing Bayes factors for model selection [48]. Under this model reduction strategy, the target distribution becomes

    p(r,λ,η|ξ)C=(RRLN,Kr(ξ;r,ν,η)p(ν1)p(νKr+1)dν1dνKr+1)p(η)p(r|λ)p(λ). (3.10)

    We provide explicit forms of Equation 3.10 with our choices of prior distributions at the end of this subsection.

    Our target distribution includes the locations of the changepoints, the switch rate, and the common precision. This is different from existing Bayesian methods. For the bcp algorithm of Barry & Hartigan [30] and Erdman & Emerson [31], the posterior distribution is the joint distribution of the change points (partitions) and the segment means, while the switch rate, λ and the common variance (1/η) are assumed to be known. In the work by Lavielle & Lebarbier [43], two posterior distributions are considered; the switch point process and the joint distribution of the switch point process and segment means. The former is obtained by integrating out the segment means. Both posterior distributions are feasible because the switch rate and the common variance 1/η are estimated from the data and then fixed.

    The following discussion concerns both model parameters to be inferred and hyperparameters that define the associated prior distributions. To distinguish between them, the hyperparameters are underlined.

    Our prior for the switch point process is motivated by interpreting the switch point process, r(t), as a continuous-time counting process where change points occur with rate λ, r(t)Poisson(λt). However, the times between events in a Poisson process can be arbitrarily small and this is inconsistent with the idea that biophysical states must persist for some period of time to be meaningful. We therefore introduced a minimum length for allowable segment durations, which is at least d_r observations. This means that the posterior distribution of switch point processes is restricted to the set of all switch point processes that have all change points separated by at least d_r steps. We denote this set TN,d_r. We refer to d_r as the minimum segment duration and we select it according to the following reasoning. Suppose a molecular motor needs to step at least ten times in order for the segment to have significance and the motor is assumed to step on average once every 0.01 seconds. Then d_r would be chosen to be d_r=(100.01)/Δ, where Δ is the time between observations.

    Under this modeling assumption, the prior distribution on the switch point process is

    p(r;λ,d_r)={(1eλΔ)Kr(eλΔ)Arif rTN,d_r0otherwise (3.11)

    where

    Kr=N1n=1rnAr=Kr+1j=1max{0,Mj(Mj1+1)2(d_r1)},Mr={n:rn=1}{0,N}, (3.12)

    and we adopt the convention that Mr can be written as the ordered set Mr=(M0,M1,,MKr+1). Note that when d_r=1, then Ar=(N1)Kr which agrees with the prior when the switch point process is a discrete approximation of a Poisson process.

    We capture uncertainty in the switch point rate, λ, by placing a prior distribution on λ. Specifically, we assume that

    λGamma(a_λ,b_λ) (3.13)

    where a_λ and b_λ are hyper-parameters chosen to express prior belief about the rate at which change points occur. In the inference below, we set a_λ=(3/10)50 and b_λ=50, so on average three changes occur within ten seconds.

    We place an informative, compact prior on the segment speeds, νj for j=1,k+1. Specifically, we set

    νjUniform([v_max,v_max])for j=1,k+1 (3.14)

    where the hyper-parameter v_max is the maximum speed a motor is believed to be able to travel within one time unit. Below we set this hyperparameter to be v_max=2μm/s. One advantage of such a prior is that we can obtain an explicit analytical expression for the marginal posterior distribution of (r,λ,η). There is also a conjugate prior for νj, the Normal distribution, and it also yields an explicit analytical expression for p(r,λ,η|ξ). However, when we used the non-compact prior for inference on experimental data, the result was an unrealistic number of inferred change points. This result is discussed in Section 5.1.

    Because the common precision parameter η is phenomenological and there is little biophysical guidance to set a reasonable prior, we used an empirical prior. To this end, we constructed a Gamma prior from the uniformly minimum variance unbiased (UMVU) estimator of η assuming no change points,

    ηGamma(ˆηumvu0.15,0.1), (3.15)

    where

    ˆηumvu=ΔVar(ξ), (3.16)

    and the values 0.15 and 0.1 were chosen so that expected common precision was slightly larger (1.5x) than the estimated value assuming no change points.

    We provide a list of all the hyper-parameters for the sampling algorithm in Table 1.

    Table 1.  Step 1 algorithm hyperparameters. In the following inference the hyperparameters are d_r=5, u_1=0.25, u_2=0.25, α_λ=(3/10)50, β_λ=50, α_prop=2.5, β_prop=10 and max velocity, v_=2μm/s.
    Model Parameter Prior Prior Hyperparameters Proposal kernel Proposal Kernel Parameters
    νj Uniform v_: maximum allowed velocity - -
    η Gamma α_η=0.15ˆηumvu, β_η=0.1 Normal random walk var = (ˆηumvu/4)2
    r Eq. 3.11 d_r: minimum segment duration See [43] u_1, u_2
    λ Gamma α_λ, β_λ Independent Gamma a_prop,b_prop

     | Show Table
    DownLoad: CSV

    Before giving the explicit form of the posterior distribution for the given choices of priors, we introduce the following notation. The marginal likelihood of observing a vector ξ assuming uniform priors on the segment velocities is

    L(ξ;r,η)=(12v_max)Kr(2πΔη)N/2(2πη)Kr2exp(η2ΔSN(Mr))×Krj=1(1Δ(MjMj1))1/2(Φ(v_maxˉξ(Mj1,Mj](ηΔ(MjMj1))1/2)Φ(v_maxˉξ(Mj1,Mj](ηΔ(MjMj1))1/2)), (3.17)

    where Kr and Mr are defined in Equation 3.12,

    ˉξ(Mj1,Mj]=1MjMj1Mjn=Mj1+1ξn,SN(Mr)=Krj=1Mjn=Mj1+1(ξnˉξ(Mj1,Mj])2, (3.18)

    and Φ(z) is the cumulative distribution function of a standard normal random variable. The derivation of Equation 3.17 is provided in Appendix 7.1.

    When rTN,d_r, the model decisions above lead to the following posterior distribution

    p(r,λ,η|ξ)C=(RRLN,Kr(ξ;r,ν,η)p(ν1)p(νKr+1)dν1dνKr+1)p(η)p(r|λ)p(λ)C=LN,Kr(ξ;r,η)×η0.15ˆηumvu1e0.1η×(1eλΔ)Kr(eλΔ)Ar×λa_λ1eb_λλ (3.19)

    where Kr and Ar are given in Equation 3.12. When rTN,d_r then p(r,λ,η|ξ)=0.

    In our analysis, both λ and η were randomly generated from their respective prior distributions. Due to the complex parameter landscape, we took more care in initializing the switch point process. To do this, we first sampled the number of change points, k0, from a Poisson distribution with parameter a_λb_λTfinal, where Tfinal is the duration of the tracked particle. Then assuming there are k0 change points, we initialized the change points to those estimates obtained by the existing change point detection method of constructing a piecewise-linear approximation to the data and minimizing the RSS.

    For the Metropolis-Hastings (MH) sampling algorithm, we used different strategies for the different parameters. In this section we denote the proposed parameter value and the current parameter value by a superscript ‘prop’ and ‘cur’, respectively.

    For the change point rate, λ, we employed an independence sampler, i.e., a proposal function that does not depend on the current value:

    qλ(λprop|λcur)=qλ(λprop)Gamma(a_prop,b_prop), (3.20)

    where a_prop,b_prop are hyper-parameters.

    For the precision parameter, we used a Normal random walk proposal function:

    qη(ηprop|ηcur)Normal(ηcur,σ_2η), (3.21)

    where σ_2η is a tuning parameter. Note that a negative proposal will be rejected with probability one because it is outside the support of the prior on η.

    For the switch point process, we adopted the proposal function created by Green [44] and used in [41,43]. The proposal function takes its values on the set {0,1}Ndr1, where N is number of observations. At each step, one of three types of proposals might be made: (1) an independent switch point process; (2) a process generated by the creation or extinction of one of the change points; or (3) a location shift of a single change point. These proposal types occur with probability, u_1,u_2, and 1u_1u_2, respectively. Details can be found in the SI Section 1. We provide a pseudo-code for our algorithm in SI Algorithm 1.

    Our proposed MH sampler results in posterior samples of the switch point process, the switch point rate, and the common precision. We use the posterior samples to approximate the posterior distribution of the total number of change points. Suppose there are M posterior samples and let the number of change points in the mth switch point configuration be given by K(m)r=N1n=1r(m)n, then by ergodic theorem

    1MMm=11(K(m)r=k)a.sP(Kr=k|ξ) (3.22)

    as M goes to infinity. Our posterior estimate for the number of change points is taken as the number with the highest posterior probability, denoted ˆkMAP.

    In this section, we describe the MCMC sampling algorithm used to infer the change point times, the segment velocities, and the common precision when the number of change points, k, is known. Our model, Equation 3.6, naturally lends itself to the methodology proposed in [45,46] to infer θk=(τk,νk+1,η). Namely, our assumption that the change points occurs at an observation time, or can be approximated by an observation time, yields conditionally independent segments and Normal data allows us to exploit the form of the full conditional posterior distributions through a Gibbs sampler for the following target distribution:

    p(τk,νk+1,η|ξ)C=LN,k(ξ;τk,νk+1,η)p(τk,νk+1,η) (3.23)

    where p(τk,νk+1,η) is the joint prior distribution of the model parameters and C= indicates equality up to a constant. Because we assume a model with k changes, we drop the subscript k and k+1 on τk and νk+1, respectively in the following section.

    For νj for j=1,k+1, we employ the same priors as described in Section 3.2.1, Equation 3.14. We still model our prior belief in the common precision as a Gamma random variable as in Step 1, but update the hyper-parameters to reflect the information learned in Step 1. Let ˆηMAP, 1 denote the maximum a posterior (MAP) estimator of η obtained from the posterior samples of Step 1, then the prior for η is given by

    ηGamma(0.1ˆηMAP, 1,0.1). (3.24)

    These choices in hyper-parameters result in Gamma prior with large spread and mean at the MAP estimator from Step 1.

    Recall, in Step 1 we impose that each segment must have at least d_ observations, so that the change point times satisfy

    d_Δτ1<(M2+d_1)Δ(Mj1+d_)Δτj<(Mj+1+d_1)Δ(Mk1+d_)Δτk<(Nd_+1)Δ (3.25)

    where Mj=τj/Δ is the change point index. To keep consistent between steps, we use a uniform prior constrained by the inequalities given in Equation 3.25 for the change point times. By choosing d_=1, we retrieve the uniform (non-informative) prior used in [46].

    To obtain posterior samples for ν we perform component-wise sampling of νj|{ξ,η,Mj1,Mj} for j=1,k+1. For the jth segment velocity, we use a normal random walk to propose a new value of νj,

    q(νpropj|νcurj)Normal(νcurj,ϵ_2ν) (3.26)

    where ϵ_ν is a tuning parameter. In the below inference, we used the value of ϵ_ν=1/2.

    For the precision parameter, our use of a conjugate prior enables direct sampling from the conditional posterior distribution,

    η|{ξ,ν,M}Gamma(0.1ˆηMAP, 1+N2,0.1+12Δkj=1Mji=Mj1+1(ξiνjΔ)2). (3.27)

    We obtain a sample of τ by component-wise sampling, τj|{νj1,νj,η,τj1,τj+1} for j=1,k using the following proposal function for τj, q(|τcurrj),

    q(τpropj|τcurrj)Uniform(τcurrjϵ_τ,τcurrj+ϵ_τ)for j=1,k, (3.28)

    where ϵ_τ is a tuning parameter. In the inference below, we set ϵ_τ=(1+k)2NΔ, where N is the path length. We note that our proposed values of the change point indices are continuous rather than discrete as to allow for better estimation of τ.

    We assessed the convergence of this sampling algorithm to the target distribution using the potential scale reduction, ˆR, as given in [47] and the effective number of independent samples, ˆneff, as defined in [42]. One concludes that there is evidence for convergence if all inferred model parameters satisfy the two constraints

    ˆR<1.1andˆneff>5×(# chains). (3.29)

    All the above hyper-parameters and tuning parameters can be found in Table 2 and a pseudo code for the above sampling algorithm is described in Algorithm SI-2. For brevity in Algorithm SI-2, we denote the posterior shape and rate of η given in Equations 3.27 by α(ξ,ν,M), and β(ξ,ν,M), respectively.

    Table 2.  Step 2 algorithm hyperparameters. In the following inference the hyperparameters are set to d_=5 and v_max=2μm/s are considered. The parameters of the proposal distribution are set to ϵ_ν=1/2μm/s and ϵ_τ=(1+k)2NΔ, where k is the assumed number of change points and N is the path length.
    Model parameter Prior Prior Hyperparameters Proposal kernel Proposal Kernel Parameters
    νj Uniform v_: maximum allowed velocity Normal random walk ϵ_2ν
    τj Eq. 3.25 d_: minimum state duration Uniform random walk ϵ_τ
    η Gamma α_ν=0.1ˆηMAP,1, β_ν=0.1 Gibbs sampling -

     | Show Table
    DownLoad: CSV

    As defined in [10], we use the term run length to denote "distance a molecular motor moves before detaching from the filament and diffusing away." The run time is the time the molecular motor remains on the filament before detaching and diffusing away. When the cargo displays multiple physical states, we define the jth segment as the time between the τj1 and τj change point. We define a segment duration to be the time the motor stays in the jth state τjτj1, and segment speed to be the average speed of the motor during the segment.

    We used a maximum a posteriori (MAP) estimator for our point estimates of the number of change points, the change points, the segment velocities, and common precision. To estimate the number of change points, we calculated an empirical marginal distribution from posterior distribution samples as described in Equation 3.22. The posterior samples were obtained by merging four independent runs of Step 1 samplers, each for 200,000 iterations with the first half discarded as burn-in and then thinned to every 100th sample. Then the posterior point estimate for the number of change points was set to the number of change points with the highest posterior probability, denoted ˆkMAP. For the segment velocities, change points, and common precision, we set the posterior point estimates to the respective component of the MAP estimate over the full posterior distribution, Equation 3.23. Posterior quantities of the full distributions were obtained using the posterior samples of four independent runs of the Step 2 samplers, each of 40,000 iterations with the first half discarded and thinned to every 50th sample.

    To examine the heterogeneity of velocities within families of molecular motors and across families of molecular motors, we construct and compare posterior velocity distributions that account for the amount of time the motor spends in a given segment. For a given data set the velocity distribution is constructed as follows. For each tracked path, we set the duration of each segment using the posterior point estimate described in Section 3.4.1 for the change point times. Then, we draw posterior samples of the segment velocities, where the number of samples drawn equals the number of observations within each segment, i.e. if a path remains in the jth segment for ten observations, we draw ten samples from the posterior distribution of the jth velocity. Finally, to obtain the velocity distribution for the given data set, we pool together the posterior velocity samples from the previous step over all paths in the data set. Because we do not the polarity of the microtubules in practice, we take the absolute value of the velocity samples and construct speed distributions.

    In order to quantify switches among biophysical states, we defined three categories: Initial, Paused, and Reversed, then assigned each segment to one of these categories. Then, using these assignments, we estimated the 1-step transition probabilities. We defined the three biological states in terms of the estimated segment velocity and the distance traveled in the segment, referred to as segment distance. Without the inclusion of the segment distance, a segment that has a slow but processive motor-cargo complex can be misclassified as paused. Because we did not have information concerning the direction of the plus- and minus- ends of the MT, we considered the initial direction as given by sign(ˆν0), where ˆν0 is the MAP estimate of the first segment velocity that is considered processive, and the reverse direction as the opposing sign. The three state definitions are for segment j are:

    ● Initial: movement in the initial direction (sign(ˆνj)=sign(ˆν0) and |ˆνj|0.1) or (sign(ˆνj)=sign(ˆν0) and |ˆνj|<0.1 and segment distance 0.4μm),

    ● Paused: no directed movement, (|ˆνj|<0.1 and segment distance <0.4μm),

    ● Reversed: movement in the direction opposite of the initial segment, (sign(^νj)sign(^ν0) and |ˆνj|0.1) or (sign(^νj)sign(^ν0) and |ˆνj|<0.1 and segment distance 0.4μm).

    The threshold values 0.1μm/s and 0.4μm were chosen based on an informal analysis of segment durations and velocities. A scatter plot for each of the data sets is shown in Figure SI-1. We found that for the single motor data sets, there were no segments with a velocity of less than 0.1μm/s that nevertheless had an overall displacement of at least 0.4μm. However, in the two-motor data set, there were multiple such segments, and to the eye, these appeared to show consistent slow movement over a long period of time. Because the behavior is not a statistical artifact, we included this type as an "active" segment. We conducted our analysis using 0.2μm/s and 0.5μm and all reported qualitative conclusions held for that choice as well.

    Before applying the described statistical methods, we pre-processed the data as follows.

    The model we proposed for the position of the cargo in Section 3.1.2 and used in our Bayesian inference method is one dimensional, whereas the experimental cargo data is two dimensional, cartesian coordinates. Because we are interested in the position of cargo along (parallel to) the MT, we transformed the x,y-position data into longitudinal (motion parallel to the MT) position as follows. We assume the variation in the x- and y-directions are independent and identically distributed, and used orthogonal least squares regression to approximate the position of the MT. Then we consider the dynamics of the cargo in terms of its projection along the length of the MT.

    To be specific, for orthogonal regression, the sum of square residuals is minimized to obtain the coefficients for the line of best fit, ˆβ0 and ˆβ1, and the predicted x-position, ˆxn:

    ˆβ0=syysxx+(syysxx)2+4s2xy2sxyˆβ1=ˉyˆβ1ˉxˆxn=xn+ˆβ1ˆβ21+1(ynˆβ0ˆβ1xn)for n=1,N,

    where ˉx,sxx is the sample mean and sample standard deviation of x, respectively, ˉy,syy is the sample mean and sample standard deviation of y, respectively, and sxy is the sample covariance of x and y. The predicted y-positions are ˆyn=ˆβ0+ˆβ1ˆxn for n=1,N. Then, for n=1,N, we rotate the predicted positions (ˆxn,ˆyn) onto the x-axis and set the longitudinal position to the rotated x-position

    Xn=cos(β)(ˆxnˆx1)sin(β)(ˆynˆy1)

    where β is the angle between the x-axis and the vector from the origin to (ˆx2ˆx1,ˆy2ˆy1). For those tracked paths that appear to be stepping on a non-linear microtubule we remove the path from the following analysis.

    This process is depicted in the the left and center frame of Figure 2. In the left frame the observed trajectory of a cargo tethered to a Kinesin1-DDB motor complex is shown in black and the estimated position of the microtubule is shown in purple. The black curve in the center frame represents the longitudinal position of the trajectory when projected onto the microtubule.

    An assumption in our inference method that is often violated in practice is observation at uniform time increments. One such cause is that the tagged cargo becomes too dim to track for a couple of frames due to a change in the z-position. We opt to fill in the missing data points using the longitudinal position for a maximum of twenty consecutive missed points, although the maximum number of consecutive missed points for the data presented above in Section 2 was eight. If more than the maximum allowed number of consecutive points are missing, we keep the trajectory up until the period of max missed number of observations.

    In the case of r sequential missing positions we estimate the missing observations by the uniformly spaced positions between the previous and last observed position with some noise. For instance, if the longitudinal position at the (n+1)th time observation, tn+1, to the (n+r)th time observation, tn+r are missing, we estimate Xn+i for i=1,r by

    ˆXn+i=Xn+ir+1(Xn+r+1Xn)+ϵZ (3.30)

    where ZNormal(0,1) and ϵ is chosen so qualitatively the missing points looks similar to the rest of the path. In the below analysis, we set ϵ2=var(X)/10, which in some cases is an underestimation of the true variance. This can potentially lead to a bias in the over-estimation of the length of states.

    We display an example of this procedure in the center frame of Figure 2. The black line is an observed longitudinal trajectory of a tracked cargo tethered by a Kinesin1-DDB motor complex and the three red circles correspond to estimated missing observations as given by Equation 3.30. In the right frame of Figure 2, we display the increment process of the longitudinal trajectory in black and mark the increments that have been calculated with estimated missing observations with red circles.

    In this section, we provide an experimental design for validating inference methods for the number of change points in molecular motor data. This entails constructing simulated data sets of the observed cargo data in which the number of change points is known. First in Section 4.1, we describe the simulation technique for generating 1D molecular motor and cargo processes from the biophysical model, Equations 3.1 and 3.2. Section 4.2 gives a model for the experimental cargo data that arises from the data collection method. Lastly, Section 4.3 outlines and describes the construction of four simulated data sets that vary in complexity.

    We simulated motor-cargo processed from the biophysical model as follows. First, we obtained the motor paths using the Gillespie algorithm where the rate of the motor stepping changes at times τ=(τ1,τ2,). The simulation time grid for the motor positions, denoted {sn}n=1, is set to the ordered union of the uniform time mesh of size Δsim and the change times τ. Defining the time grid in this manner ensures that Z(t) only jumps at the right endpoint of {sn}n=1. After sampling the motor positions, the position of the cargo is determined for time sn using the exact conditional solution to Equation 3.2

    X(sn)|Z(sn1),X(sn1)=eκγ(snsn1)Xn1+(1eκγ(snsn1))Zn1+2Dsnsn1eκγ(snsn1)dW(u)

    where the stochastic term is a normal random variable and using Ito's Isometry

    2Dsnsn1eκγ(snsn1)dW(u)Normal(0,Dκ/γ(1e2κγ(snsn1))). (4.1)

    The output of the algorithm is the trajectory of the motor position and the cargo position with respect to the simulation time grid {sn}n=1. Pseudo code for simulating the motor-cargo trajectory is given in Algorithm SI-3.

    In Figure 3(a), (b), we display a simulated motor (black) and cargo (green) process in which the stepping rate of the motor-cargo complex changes rate twice on a short time scale and long time scale, respectively. To generate the simulation in Figure 3, the simulation time grid was taken to be every Δsim=104s, and the model parameters were set to

    Motor Parameters: ρ=(100,30,100)s1, δstep=0.008μm,Cargo Parameters: κ/γ=1000,D=0.012μm2/s. (4.2)

    Alternatively, the average speed of the motor, |ν| can be specified rather than the stepping rate, in which the stepping rate is computed as ρ=|ν|/δstep.

    Figure 3.  Simulated cargo-motor system from the biophysical model. (a): Simulated motor and cargo process governed by the biophysical model presented in Section 3.1.1, on a short time scale of 0.2 seconds. The motor and simulated cargo position are colored in black, and green, respectively. (b): The same simulated paths as in (a) but over the whole simulation time window of 10 second. The motors change stepping rate twice, marked by the dashed blue lines. (c): The trajectory of the observable cargo over the whole simulation time window. Observations are assumed to occur once per every 0.05s.

    We introduce a model for the experimental cargo data that arises from the data collection method. Position data of a tagged particle is collected at discrete time increments depend on the frame rate of the camera. Let Δ denote the time between observations, and {tn}Nn=0 be the collection of N+1 observation times. For simplicity, we assume that observations are made on a uniform time grid, tn=nΔ.

    Due to limited spatial precision, the collected time-series data captures the position of the cargo plus some observational noise. Under the assumption that the observational error is independent at each measurement and normally distributed, our model for the observed process is given by

    ˆXn=X(tn)+σϵn (4.3)

    where X(tn) is the true position of the cargo, σ is the standard deviation of the observational error, and ϵn for n=0,N are i.i.d standard normal random variables.

    The observed cargo process is obtained from the simulated cargo position, described in Section 4.1, by subsampling the simulated cargo position at times {tn}Nn=1, and adding Normal noise to each observation as given in Equation 4.3. For the simulated motor-cargo process in Figure 3(b) we depict the observed cargo process in Figure 3(c), where observations at taken uniformly every 0.05 seconds (Δ=0.05) and σ=0.003μm.

    To construct a simulated data set that emulates transport by multimotor complex Kin1-DDB, we work under the assumption that each simulated path is from the biophysical model, Equation 3.2, but the observed data is from Equation 4.3. We assume that a change in motor interaction does not result in a switch of direction, but rather the motor's average velocity changes between a slow velocity, νslow, distribution and fast velocity, νfast, distribution. Equivalently, using the relation: λ=|ν|/δstep, the motor changes between a distribution of low and high stepping rate, respectively. We further assume the number of changes follows a Poisson distribution.

    In each simulation study, the strength of the change by altering the length of each segment or the relative difference in the segment velocities. For the former, the change points are either uniformly spaced in time so all segment durations are equal, or ordered uniformly distributed random variables so each segment duration is random. As for the latter, the amount of overlap, negligible or non-negligible, between the fast and slow distribution. The resulting four scenarios are:

    Case 1: change points are uniformly spaced and the distributions for the slow and fast velocities overlap on sets of negligible probability,

    Case 2: change points are uniformly spaced and the distributions for the slow and fast velocities are overlapping,

    Case 3: change points are uniformly distributed and the distributions for the slow and fast velocities overlap on sets of negligible probability,

    Case 4: change points are uniformly distributed and the distributions for the slow and fast velocities are overlapping.

    Case 1 produces paths with the strongest change signal since each segment length is controlled and large velocity changes are probable, whereas Case 4 contains paths with the weakest signal since segments can be shorter than the frame rate and small velocity changes are more probable. Both Case 2 and 3 muddle change point detection, but ultimately the short segment durations (Case 3) makes changes more difficult to detect. We note there was no restriction that segments must be longer than the minimum segment cutoff dr that appears in our NC-CI method. In Section 5.2, we report on the performance of the four methods in two ways. First, we looked at the accuracy of all methods directly on the simulated paths regardless of segment lengths. We do not expect that any of the methods should perform well in identifying extremely short segments. So, in the next analysis, we restricted our study to paths whose segments are all "meaningfully long" in the sense that they are longer than dr observations.

    For each case, we simulated 200 paths from Equation 4.3 with experimental parameters

    Δ=0.05 s,Tfinal=10 s, and σ=0.003μm,

    and physical parameters,

    δstep=0.008μm,D=0.01μm2/s, and κ/γ=1000.

    The remaining parameters: number of change points, time of change points and segment stepping rates varied from path to path.

    For a single path, we first sampled the number of change points from Poisson(λ=3). Then for Case 1 and 2 change points were uniformly spaced, while for Case 3 and 4 the change points were set to an ordered sample from Uniform((0,Tfinal)). Lastly, we obtained the stepping rate for each segment through sampling the segment velocities and using the relation λ=ν/δstep. For each segment velocity, we alternated drawing the velocity from two folded normal distributions with Case 1 and Case 3 parameters:

    νslowfolded Normal(0.1,0.052)νfastfolded Normal(0.6,0.12) (4.4)

    and Case 2 and Case 4 parameters:

    νslowfolded Normal(0.1,0.12)νfastfolded Normal(0.6,0.22), (4.5)

    as displayed in Figure SI-3.

    We considered our proposed algorithm, NC-CI, along with the three other algorithms introduced in Section 1.1: (1) the PPM specified in [30] that has been implemented in R, referred to as BCP, (2) Bai and Perron's dynamic programming algorithm minimizing the RSS, referred to as RSS, and (3) the likelihood ratio test method proposed by [32], referred to as LRT. For all four simulation studies, the hyperparameters for each considered method were kept the same.

    For our proposed sampling scheme, the hyper-parameters were set to those given in the caption of Table 1. For each path we ran two independent Step 1 samplers with different initial values, each for 200,000 iterations, discarding the first 100,000 samples as the burn-in period, and keeping every 100th sample. The remaining posterior samples from each sampler were merged together and used for inference.

    For BCP, there are two model parameters that need to be selected, the maximum probability of change occurring at an observation, and the maximum signal to noise ratio. While the default value for both parameters is 0.2, which has been shown to work well in general settings, [30,49], we chose problem-specific values that increased the method's accuracy. For the probability of a changepoint parameter, we used the form that emerges from modeling the switches as a Poisson process. We set the assumed the rate of changes to 4/10, which yields a hyper-parameter value of 1e(4/10)Δ. As for the maximum signal to noise ratio, we set the variance of the means in the PPM model to (1/2)2, which is consistent with when the Normal priors were tried in the NC-CI. We approximated the measurement variance by the assessing the residual error when fitting a spline to the time-longitudinal data. Note the latter results in an empirical prior rather than a true Bayesian prior. Such hyper-parameters are not necessarily optimal, but express our knowledge of the system and perform better then the default value. The default priors resulted in accuracy of 21.0, 18.5, 5.5, and 3.5% in the Case 1, 2, 3, and 4 respectively. We ran the sampler for a total of 30,000, the first 10,000 as the burn-in period and the latter 20,000 as posterior samples.

    As for the RSS method, we modeled the increment process as an intercept-only linear model with multiple structural changes and set the minimum segment length to 5, which is consistent with the hyper-parameter for the NC-CI method. For the LRT method, we approximated the critical value using ˆμ, see [32,33] value for α=0.05.

    In this section, we report our findings on change point detection on both simulated and experimental molecular motor data. In Section 5.1 we present the results on change point detection when the conjugate prior is placed on segment velocities that led us to use the compact uniform priors in NC-CI. Section 5.2 compares the accuracy of four change point detection methods on the simulated case studies presented in Section 4.3. The results of our prososed NC-CI on the experimental data discussed in Section 2 is given in Section 5.3. To run our proposed method on both the simulated and real data, the High Performance Computing system at Tulane University was used.

    A natural alternative prior for each segment velocity is a Normal distribution because it is a conjugate prior for our choice in likelihood. We encode prior knowledge about motor proteins into this prior by setting the prior mean to 0 and standard deviation to 1/2 so that a motor speed less or equal 1μm/s occurs with 95% probability. When placing such unbounded prior on each of the segment velocities, and enforcing each segment had a minimum of d_r=5 observations, we found most paths had a MAP estimate for the number change points near or at the upper bound on the number of change points. For example, within the DDB data set, there was a tracked cargo path with 283 observations. Visually, it has two apparent change points, but the method with normal (and hence, unbounded) prior estimated it to have 47 change points, shown in Figure SI-4. Such an estimate corresponds to each segment either be 6 observations (0.258 seconds) or 7 observations long (0.301 seconds). By contrast, the same DDB path was estimated to have 2 change points with the compact uniform prior, Figure SI-5.

    In Figure 4, top row, for all three data sets we display the empirical distribution of the MAP estimator for the number of change points with a normal prior on the segment velocities. For comparison when a uniform prior on segment velocities is assumed see the first column in Figure 6. In the middle row of Figure 4 (left to right), we display the path length versus MAP estimate for the number of change point for each path in Kinesin-1, DDB, and Kinesin1-DDB data set, respectively. Within all data sets, a linear trend emerges between the number of change point estimate and the path length. The linear trend has a slope of roughly 6.3 for the Kin1 data set, 5.8 for the DDB data set, and 5.1 for the Kin1-DDB set, indicating each segment is closer to the minimum number of observations required by our choice in hyper-parameter, d_r=5. These trends do not appear when a compact uniform prior is placed on the segment velocities. See the bottom row of Figure 4 for number of observations versus number of change points for each three data sets.

    Figure 4.  Unbounded (Normal) prior on segment velocities result in overestimation of number of change point. Top row: For Normal prior, the frequency of the MAP estimator for the number of change points within the Kinesin1, DDB, and Kinesin1-DDB data set from left to right, respectively. Middle, bottom row: For each tracked path, the number of observations versus the estimate for the number of change points (MAP) within the Kinesin1, DDB, and Kinesin1-DDB data set (left to right) when a Normal prior (middle row) and uniform prior (bottom row) is assumed. The black dashed line denotes the maximum number of allowed change points, which is equal to the number of observations/5).

    A possible explanation is that a large likelihood value arising from overfitting the data with many change points compensates for the low prior probability the Normal prior assigns to unrealistic segment speeds. This does not occur when using the uniform prior because it has finite support. We also conducted some preliminary analysis on simulated data assuming a Normal prior with mean zero and standard deviation 1/2 that we do not report here, which also resulted significant over-estimation.

    We validated the first stage of our proposed method by comparing its accuracy with that obtained from using the three existing methods BCP, RSS, and LRT. First, we considered the accuracy of the four methods on the simulated molecular motor datasets described in Section 4.3. Then, we compared the estimates for the number of change points using the NC-CI and the BCP method on the three molecular motor data sets described in Section 2. The hyper-parameters of the alternative methods are given in Section 4.3.1.

    We estimated the accuracy of a method, with respect to the number of change points, as the fraction of paths with estimator for the number of change points equal to the true number of change points. For the Bayesian methods, NC-CI and BCP, the estimator is taken to be the MAP. Confidence intervals for the accuracy estimates are constructed assuming a normal approximation to the binomial distribution. We note that the analysis of the simulated data sets for the three existing methods can be run on a laptop, whereas the analysis by our proposed method is more computationally intensive and was run on a Tulane University's high power computing system.

    For all four simulation studies, the NC-CI method had the highest accuracy, followed by the LRT method. Within increasing complexity of the the simulation study, less can be concluded about the significance of the improved accuracy when using the NC-CI method because the confidence intervals for all methods overlap. In Table 3, we give the accuracy and the 95% confidence intervals for the point estimate for each case and all four methods. A graphical display of the accuracy of each method is shown in Figure 5. Each frame displays the distribution of the residual for the number of change points, ˆkktrue with the residual value on the x-axis, the relative frequency on the y-axis, and the color denoting the method.

    Table 3.  For each method and each case study consisting of 200 simulated paths, we give the accuracy of identifying the true number of change points along with 95% confidence interval. The accuracy fraction is estimated by the fraction of simulated paths with number of change points correctly estimated and confidence interval are obtained assuming a normal approximation to the binomial distribution.
    % ˆk=ktrue for all simulated paths
    Case 1 Case 2 Case 3 Case 4
    NC-CI 0.90, [0.85, 0.94] 0.75, [0.68, 0.81] 0.61, [0.54, 0.67] 0.57, [0.50, 0.64]
    BCP 0.80, [0.74, 0.86] 0.70, [0.63, 0.76] 0.52, [0.45, 0.59] 0.48, [0.40, 0.55]
    LRT (C0.95=ˆμ) 0.80, [0.74, 0.86] 0.70, [0.63, 0.76] 0.54, [0.46, 0.61] 0.55, [0.48, 0.62]
    RSS 0.77, [0.71, 0.83] 0.61, [0.54, 0.68] 0.53, [0.45, 0.60] 0.49, [0.42, 0.56]

     | Show Table
    DownLoad: CSV
    Figure 5.  The distribution of the residual for the number of change points for the four simulation scenarios. Within a frame, the residual value is on the x-axis, the relative frequency of the residual is on the y-axis, and the method denoted by the color. The top (bottom) row consists of the cases with uniformly spaced (distributed) change points. The left (right) column consist of the cases with fast and slow velocity distributions that overlap with negligible (significant) probability.

    As the change point signal becomes harder to detect, going from Case 1 to Case 4, the accuracy of estimating the number of change points k decreases for all methods but the LRT, which has a similar accuracy for the hardest two cases, Case 3 and 4. For all methods, the frequency of under-estimation systematically increasing as to be expected due to weaker change point signals. This can be seen qualitatively in Figure 5, where the left tail of the residual distribution of the number of change points for all methods, grows as the case number increases.

    Furthermore, for our choices in overlap between the fast and slow velocity distributions having uniformly distributed segment durations has a more negative impact on change point detection. This can be seen by comparing the residual distributions of each method across columns in Figure 5, as compared to those across the rows in Figure 5. There is a larger decrease in accuracy going from Case 1 to Case 3 as compared to Case 1 to Case 2, and going from Case 2 to Case 4 as compared to Case 3 to Case 4. Note the cases in the top row of Figure 5 have uniformly spaced change points whereas the bottom row has uniformly distributed change points, and the cases in the left column have negligibly overlapping velocity distributions, as opposed to the cases in the right column have overlapping velocity distributions.

    The success of the NC-CI method is particularly strong when we consider only those paths whose segments are biophysically meaningful, in the sense that each true segment duration lasts at least 5 observations and true sequential velocity changes are at least 0.1μm/s in magnitude, |νiνi+1|0.1 for all i=1,k. This is to be expected since, at this point, we are only considering those paths that have a segment duration greater than or equal to selected hyperparameter value for the minimum segment duration, dr=5. The increase in accuracy is larger for Case 3 and Case 4 as the former condition is violated more when the change points are uniformly distributed as opposed to uniformly spaced as in Case 1 and Case 2. Specifically, for Case 1 and Case 2 none of the paths violate the condition on the segment length, whereas 34/200 and 32/200 paths violate the condition on the segment length for Case 3 and Case 4 respectively. As for the condition on the magnitude of the velocity change, none of the paths in Case 3 violate the condition whereas 1/200, 14/200, and 19/200 paths violate the condition for Case 1, Case 2, and Case 4, respectively. The accuracy for the subset of meaningful paths are given in Table 4.

    Table 4.  The accuracy of the estimate for the number of change points for the subset of paths with each segment duration lasting at least 5 observations and had subsequent velocity changes at least ±0.1μm/s, |νiνi+1|0.1 for all i=1,k.
    % ˆk=ktrue for the subset with "meaningful" changes
    Case 1 Case 2 Case 3 Case 4
    NC-CI 0.90 0.79 0.80 0.76
    BCP 0.80 0.74 0.67 0.64
    LRT (C0.95=ˆμ) 0.80 0.73 0.68 0.72
    RSS 0.77 0.63 0.68 0.62

     | Show Table
    DownLoad: CSV

    For the Kin-1 and DDB data sets, the estimate for number of change points using the NC-CI and BCP method were in good agreement, whereas the Kin1-DDB data set had more discrepancies between the two methods. In Table 5, we report the fraction of paths whose NC-CI estimate for the number of change points was equal, less than, and greater than that when using the BCP method.

    Table 5.  The fraction of Kin1, DDB, and Kin1-DDB, paths whose NC-CI estimate number of change points, ˆkNC-CI, was equal, less than, and greater than that when using the BCP method, ˆkBCP.
    Kin1 DDB Kin1-DDB
    ˆkNC-CI=ˆkBCP 0.85 0.95 0.68
    ˆkNC-CI<ˆkBCP 0.00 0.02 0.21
    ˆkNC-CI>ˆkBCP 0.15 0.02 0.11

     | Show Table
    DownLoad: CSV

    For the data sets with a single motor, when there was a discrepancy between the two estimates, typically the NC-CI was higher than the BCP estimate. In about half the cases NC-CI detected a single change either towards the beginning or end of the path, whereas BCP detected no changes.

    We found the converse results for the multi-motor data set: when the two estimates disagreed, the BCP estimate was typically larger that the NC-CI estimate. We attribute this to the Kin1-DDB data set having more paths with large spikes in the increment process only lasting for a few observations, either the stochasticity of the motor stepping, or tracker error. The BCP marked such short lived spikes as change points whereas the NC-CI did not due to the assumption of the minimum state duration of at least five observations.

    We found that Kinesin-1 had more stereotypical velocities, whereas there was heterogeneity of velocity for the DDB and Kinesin1-DDB complexes.

    We found that 79.41% (27/34) of the tracked cargo had a MAP estimator for the number of change points as zero, top left frame of Figure 6. At first glance having 17.64% (6/34) of paths with ˆkMAP>0 seems contradictory to Kinesin-1 being known as a processive motor [37,15,38]. Through path-by-path analysis of the paths with ˆkMAP=1, we found four out of the six had a change point estimated towards the end of the trajectory in which the motor protein became less processive. A possible explanation is that the motor reached the end of the MT and remained there before detaching. Moreover for the single path with ˆkMAP=2 visually appears to transition from a processive state to a paused state and back to a processive state, resulting in two change points.

    Figure 6.  Top, middle, bottom row: Reported inference on the Kinesin-1, DDB, and Kin1-DDB data sets. Left column: The empirical distribution of the MAP estimator for the number of change points, ˆkMAP. Center: The empirical posterior density of speed of all the tracked cargo weighted by the duration of each state described in Section 3.4.2. Points on the line y=0 denote the posterior point estimate for each segment speed. Right: The MAP estimate for the jth speed versus that of the duration of the jth state, where the MAP is over the joint posterior distribution, Equation 3.23. The symbol corresponds to the number of changes points, ˆkMAP, for the given path. Note that x-axis scales vary between rows for right column.

    While there was some variation in the posterior samples of the speed over all Kinesin-1 trajectories, the samples were concentrated 0.57μm/s, top center frame of Figure 6. The MAP estimator for each speed segment is denoted by the black circle on the x-axis. There appears to be a slight a correlation between duration of each state and the speed of the cargo during the given state, top right frame in Figure 6. For states only lasting a short duration (bottom left points), there appears to be a decrease in the MAP estimator for the speed.

    Overall, we found that a majority of the paths were estimated as having zero change points, 78.05% (32/41) of the paths. This finding agrees with the experimental results presented in [39], in which trajectories of cargo transported by DDB appeared as diagonal lines. However, there was a significant percentage, 21.95%, of paths with least one change point was detected, middle row left frame of Figure 6.

    We found more variation in the posterior samples of the segment speed for those tracked cargo tethered to DDB, middle row center frame of Figure 6, as opposed to those tethered to Kinesin-1. The empirical posterior distribution appeared to be possibly be tri-modal with peaks around 0.4,0.6,0.1μm/s from largest to smallest. More data is required to make a claim the multi-modality is statistically significant. Moreover, the empirical posterior distribution was supported on speeds larger than 1μm/s, a region within the support of our uniform prior but a less biologically plausible region, and 7 of the 54 MAP estimates of the segment velocities were larger that 1μm/s. In examining these paths, the segment with velocity larger than 1μm/s in magnitude had a low number observations per state, ranging between 5 and 30, (see SI-DDB particle 1, 8, 18, 20, 24, and 39).

    The Kin1-DDB data had more challenging paths to analyze. We removed 9 out of 101 tracked particles because the trajectory was poorly captured by a single straight line. Either the microtubule was curved or bent, or the motor-cargo complex changed microtubule during the observation window, complicating tracking. There were fifteen tracked cargos with a low posterior probability of k, but after visual inspection, there were only two trajectories for which the number of change points and locations were unclear. These fifteen tracked paths are included in the below analysis. Furthermore, there were nine paths in which not all inferred parameters in Stage 2 had posterior samples satisfying the convergence criterion ({SI-{Kin1-DDB particle 8, 11, 13, 26, 36, 41, 44, 74, 77}}). After visual inspection, we opted to include them in analysis below. Four of the nine paths had a large estimate for the number of change points (at least five). Whereas the remaining five had estimated number of changes between two and four, in which it is unclear if the number of changes is mis-estimated resulting in the convergence criterion not being satisfied.

    We found 67.39% (62/92) of the paths were estimated as having at least one change, ˆkMAP>0, as depicted in the empirical distribution of ˆkMAP, bottom left frame of Figure 6. Qualitatively, the empirical posterior density of speed of the tracked Qdot-Kin1-DDB complexes weighted by the duration of each segment appeared to be tri-modal, with peaks around 0.05,0.3,0.5μm/s (largest to smallest) bottom center frame of Figure 6. A possible explanation for the largest peak being 0.05μm/s is the segments with longer duration had such a speed, bottom right frame of Figure 6. While there is a slight increase in run length and time tracked as ˆkMAP increases, there does not appear to be a correlation between run length nor time tracked and the MAP estimator for the segment speeds, bottom center and bottom right frame of Figure 6.

    Preliminary analysis qualitative showed that two motor complexes, Kinesin1-DDB complexes have longer run lengths and run time compared to the single motor complexes. Although there is an imbalance in the number of tracked particles for each type of motor, the median run length and amount of tracked time is largest for the Kin1-DDB data set. This suggests that cargo tethered to Kin1-DDB remain bound to the microtubule longer than when the cargo is only tethered to a single motor.

    Table 6.  The median run length and time tracked of the tracked cargo from the Kinesin-1, DDB, and Kin1-DDB data sets.
    Kinesin-1 DDB Kin1-DDB
    Median run length (μm) 1.163 1.26 2.45
    Median time tracked (s) 1.90 1.72 9.24

     | Show Table
    DownLoad: CSV

    For all tracked cargo, we display the run length versus time tracked on a log10 scale for each cargo in the left panel of Figure 7, where a dot corresponds to a tracked cargo and the color and shape denotes the motor protein complex. Taking the corresponding MAP estimators obtained in Step 2 of our algorithm, we found on average the Kin1-DDB tracked particles had slower velocities but longer segment durations. We display the segment speed versus segment duration on a log10 scale for each cargo in the right panel of Figure 7, where a dot corresponds to a tracked cargo and the color and shape denotes the motor protein complex.

    Figure 7.  Left: Run length versus time tracked on a log10 scale where the color and style of the point corresponds to the type of motor protein. Right: Segment speed versus segment duration on a log10 scale where the color and style of the point corresponds to the type of motor protein.

    There remain important biophysical questions concerning what motor-cargo configurations cause pauses in cargo trajectories [12] and whether reversal in direction is instantaneous or requires a paused tug-of-war intermediate state. As mentioned above, this data set is not sufficiently robust to fully answer these questions, but we do see evidence that paused states are more common when motor antagonism is present and paused states do occur between reverses in trajectories.

    In agreement with our prior expectation that Kinesin-1 is a unidirectional and steadily-processive motor, we only observed 8 transitions among the 34 tracked paths (0.24 transitions per path, 0.09 transitions per second). Of these segments, none involved a reversal of direction. Similarly, for the DDB dataset there were 15 detected transitions among the 41 tracked paths (0.42 transitions per path, 0.16 transitions per second). Only one of these led to a reversed-direction state. However, individual inspection of the associated path indicates that this reversal might be an artifact of the first segment having a small number of data points, leading to a noisy estimation. See SI-DDB particle 16 for the path details. In contrast to the single motor data, state transitions were more commonly observed in for the antagonistic motor complex, Kin1-DDB. This was, in part, due the fact that that we could observe the paths for so much longer durations. Indeed, there were 128 detected transitions for the 92 tracked paths (1.39 transitions per path), but the transition rate was only slightly higher than that of Kinesin-1 (0.10 transitions per second).

    The methods we have described do allow for some insight into the transition properties of Kin1-DDB, but there are limitations in the experimental methods that prevent a full analysis. In particular, since we do not know the polarity of the microtubules in these experiments, we cannot know which motor is driving a particular state. Nevertheless, we can make some general observations about the role of the paused state. Most importantly, we see that direct reversals of direction are rare – usually there is an intermediate paused state, as is commonly predicted by tug-of-war models [10]. Using the method described in Section 3.4.3, we estimated the 1-step transition probabilities for each motor type, visually displayed in Figure 8. The node denotes a different state and the transition probability is given along the corresponding arrow where the thickness of the arrow corresponds to the magnitude of the transition probability. When a transition is marked from Initial to Initial, this means that there was a substantial change in velocity, where "substantial" is defined in Section 3.4.3. In the single motor experiments, the most common transition for both Kinesin-1 and DDB is Initial to Initial. While this is also the case for the for the Kinesin-1/DDB complex the transition from an Initial to a Paused state was slightly more probable than the single motor complexes (estimated probability 0.49). Due to the small sample size, the only statistically significant ordering we can make is the Reversed state is the least likely transition state from Initial. We caution the reader from taking too much from some of the transition probabilities. For example, there were only two transitions out of the reversed state for Kin1-DDB, both of which were to different Reverse-direction state.

    Figure 8.  From left to right, the inferred transition diagram for the Kinesin-1, DDB, and Kin1-DDB data set. The nodes represent the three velocity states of motor-cargo complex's velocity: movement in the initial direction (Initial), approximately no movement (Paused), and movement in the opposite direction (Reversed). The weights of each edge correspond to the observed transition probabilities. The transition probabilities of a state may not add to one due to rounding.

    Identifying changes of means in time series of independent Normal random variables has a rich history. There are numerous techniques for this type on analysis, but few have been tailored for the specific application of intracellular transport by molecular motor data. The biological questions of interest here include i) what biophysical states the cargo is in, ii) how long it remains in each state, and iii) quantifying the characteristic parameters of each state. One important observation is that the distribution of velocities associated with active states are expected to be different for different molecular motor families. With this in mind, we have proposed a Bayesian change point method for detecting velocity changes in cargo trajectories that can faithfully be studied in a one-dimensional projection. We validated the method on simulated data in comparison to three other prominent techniques. Our method compared favorably in a set of simulated test cases meant to present realistic challenges presented by motor-transport. Moreover, we applied the method to three different experimental data sets: quantum dots being transported in vitro by a single Kinesin-1 motor, by a single DDB motor, and by a Kinesin-1/DDB pair. We saw substantial differences among the motor experiments that align with what might be expected in a molecular motor "folklore." That is to say, Kinesin-1 steps processively with a stereotyped behavior while DDB shows more variation in its motility. When both motors are present, the velocity is typically smaller but the cargo stays attached for longer periods of time.

    Our approach can be broken down into three basic steps. First, we identify the most likely number of switches; then conditioned on the number of switches we learn the segment velocities and change points; and lastly, we construct an estimated velocity distribution for the given set of trajectories. In each of these stages we sought to find a balance between statistical rigor and computational feasibility. For example, our approach to stage one is effective for one dimensional data but preliminary work indicates that our choice in likelihood might not be optimal for two dimensional data. Whether there exists an effective alternative to standard change point algorithms (bcp in particular) for 2D remains an open question. Moreover, it is not clear whether there is an optimal method for estimation of the velocity distribution for the purpose of characterizing motor-cargo interactions. Ultimately, the method we introduce here is sensitive enough to detect biophysically important differences in the behavior of different motor-cargo configurations.

    With regard to biological implications, we see differences among the motor types, but more data and refined experimental techniques will be necessary to draw firm conclusions. For example, in this experimental framework it is not possible to know whether a motor run ended because it detached in the middle of a microtubule or at its end. Also, when Kinesin-1 and DDB are both present, we see bi-directional movement but we do not know the microtubule polarity, so we cannot yet attribute a particular state to a given motor or pair of motors. Our methods are readily adaptable to experimental data with higher resolution, and higher information content such as microtubule polarity, as well as to more complex complexes such as numerous kinesins and dyneins working against each other.

    This work was funded in part by the National Institutes of Health (NIH) R01-GM122082. We would like to thank John Fricks, Peter Kramer, Daniel Coombs, Jay Newby, Keisha Cook, and Riley Juenemann for their contributions through fruitful conversations during the development of this work.

    The authors have no conflicts of interest to declare.

    For a fixed η and r the marginal likelihood of observing the data assuming uniform priors on the velocities is given by

    RRLN,Kr(ξ;r,ν,η)p(ν1)p(νKr+1)dν1dνKr+1=RR(η2πΔ)N2exp(η2ΔKr+1j=1Mjn=Mj1+1(ξnνjΔ)2)×12v_1{[v_,v_]}(ν1)12v_1{[v_,v_]}(νKr+1)dν1dνKr+1 (7.1)

    Observing the integrand is integrable, employing Fubini's theorem yields

    =(η2πΔ)N2(12v_)Kr+1Kr+1j=1v_v_exp(η2ΔMjn=Mj1+1(ξnΔνj)2)dνj. (7.2)

    For the jth integral, the argument of the exponential can be expressed as

    Mjn=Mj1+1(ξnΔνj)2=Mjn=Mj1+1((ξnˉξ(Mj1,Mj])+(ˉξ(Mj1,Mj]Δνj))2=Mjn=Mj1+1(ξnˉξ(Mj1,Mj])2+(MjMj1)(ˉξ(Mj1,Mj]Δνj)2. (7.3)

    Hence, jth integral can be written as

    v_v_exp(η2ΔMjn=Mj1+1(ξnΔνj)2)dνj=exp(η2ΔMjn=Mj1+1(ξnˉξ(Mj1,Mj])2)(2π[(MjMj1)Δ]η)1/2×v_v_([(MjMj1)Δ]η2π)1/2exp(12(νjˉξ(Mj1,Mj]/Δ)2((MjMj1)ηΔ)1)dνj. (7.4)

    Viewing the integrand as the density function of a normal random variable with mean ˉξ(Mj1,Mj]/Δ and variance ([(MjMj1)Δ]η)1, the j integral becomes

    exp(η2ΔMjn=Mj1+1(ξnˉξ(Mj1,Mj])2)(2π[(MjMj1)Δ]η)1/2×(Φ(v_ˉξ(Mj1,Mj](ηΔ(MjMj1))1/2)Φ(v_ˉξ(Mj1,Mj](ηΔ(MjMj1))1/2)) (7.5)

    Φ(z) is the cumulative distribution function of a standard normal random variable. Thus for all k+1 integrals we obtain Equation 3.17.



    [1] S. Klumpp, R. Lipowsky, Cooperative cargo transport by several molecular motors, Proc. Natl. Acad. Sci. U. S. A., 102 (2005), 17284–17289. doi: 10.1073/pnas.0507363102
    [2] M. J. Müller, S. Klumpp, R. Lipowsky, Bidirectional transport by molecular motors: Enhanced processivity and response to external forces, Biophys. J., 98 (2010), 2610–2618. doi: 10.1016/j.bpj.2010.02.037
    [3] A. Kunwar, A. Mogilner, Robust transport by multiple motors with nonlinear force–velocity relations and stochastic load sharing, Phys. Biol., 7 (2010), 016012. doi: 10.1088/1478-3975/7/1/016012
    [4] J. J. Klobusicky, J. Fricks, P. R. Kramer, Effective behavior of cooperative and nonidentical molecular motors, Res. Math. Sci., 7 (2020), 1–49. doi: 10.1007/s40687-019-0200-6
    [5] F. Berger, C. Keller, S. Klumpp, R. Lipowsky, Distinct transport regimes for two elastically coupled molecular motors, Phys. Rev. Lett., 108 (2012), 208101. doi: 10.1103/PhysRevLett.108.208101
    [6] S. A. McKinley, A. Athreya, J. Fricks, P. R. Kramer, Asymptotic analysis of microtubule-based transport by multiple identical molecular motors, J. Theor. Biol., 305 (2012), 54–69. doi: 10.1016/j.jtbi.2012.03.035
    [7] J. D. Smith, S. A. McKinley, Assessing the impact of electrostatic drag on processive molecular motor transport, Bull. Math. Biol, 80 (2018), 2088–2123. doi: 10.1007/s11538-018-0448-9
    [8] G. Arpağ, S. R. Norris, S. I. Mousavi, V. Soppina, K. J. Verhey, W. O. Hancock, et al., Motor dynamics underlying cargo transport by pairs of kinesin-1 and kinesin-3 motors, Biophys. J., 116 (2019), 1115–1126. doi: 10.1016/j.bpj.2019.01.036
    [9] S. E. Encalada, L. Szpankowski, C.-h. Xia, L. S. Goldstein, Stable kinesin and dynein assemblies drive the axonal transport of mammalian prion protein vesicles, Cell, 144 (2011), 551–565. doi: 10.1016/j.cell.2011.01.021
    [10] W. O. Hancock, Bidirectional cargo transport: Moving beyond tug of war, Nat. Rev. Mol. Cell Biol., 15 (2014), 615.
    [11] M. J. Müller, S. Klumpp, R. Lipowsky, Tug-of-war as a cooperative mechanism for bidirectional cargo transport by molecular motors, Proc. Natl. Acad. Sci. U. S. A., 105 (2008), 4609–4614. doi: 10.1073/pnas.0706825105
    [12] K. M. Trybus, Intracellular transport: The causes for pauses, Curr. Biol., 23 (2013), R623–R625. doi: 10.1016/j.cub.2013.06.005
    [13] A. Kunwar, S. K. Tripathy, J. Xu, M. K. Mattson, P. Anand, R. Sigua, et al., Mechanical stochastic tug-of-war models cannot explain bidirectional lipid-droplet transport, Proc. Natl. Acad. Sci. U. S. A., 108 (2011), 18960–18965. doi: 10.1073/pnas.1107841108
    [14] K. G. Ohashi, L. Han, B. Mentley, J. Wang, J. Fricks, W. O. Hancock, Load-dependent detachment kinetics plays a key role in bidirectional cargo transport by kinesin and dynein, Traffic, 20 (2019), 284–294. doi: 10.1111/tra.12639
    [15] S. M. Block, L. S. Goldstein, B. J. Schnapp, Bead movement by single kinesin molecules studied with optical tweezers, Nature, 348 (1990), 348. doi: 10.1038/348348a0
    [16] M. A. Welte, S. P. Gross, M. Postner, S. M. Block, E. F. Wieschaus, Developmental regulation of vesicle transport in drosophila embryos: forces and kinetics, Cell, 92 (1998), 547–557. doi: 10.1016/S0092-8674(00)80947-2
    [17] T. L. Fallesen, J. C. Macosko, G. Holzwarth, Force–velocity relationship for multiple kinesin motors pulling a magnetic bead, Eur. Biophys. J., 40 (2011), 1071–1079. doi: 10.1007/s00249-011-0724-1
    [18] N. D. Derr, B. S. Goodman, R. Jungmann, A. E. Leschziner, W. M. Shih, S. L. Reck-Peterson, Tug-of-war in motor protein ensembles revealed with a programmable dna origami scaffold, Science, 338 (2012), 662–665. doi: 10.1126/science.1226734
    [19] V. Belyy, M. A. Schlager, H. Foster, A. E. Reimer, A. P. Carter, A. Yildiz, The mammalian dynein–dynactin complex is a strong opponent to kinesin in a tug-of-war competition, Nat. Cell Biol., 18 (2016), 1018. doi: 10.1038/ncb3393
    [20] Q. Feng, K. J. Mickolajczyk, G.-Y. Chen, W. O. Hancock, Motor reattachment kinetics play a dominant role in multimotor-driven cargo transport, Biophys. J., 114 (2018), 400–409. doi: 10.1016/j.bpj.2017.11.016
    [21] Q. Feng, A. M. Gicking, W. O. Hancock, Dynactin p150 promotes processive motility of ddb complexes by minimizing diffusional behavior of dynein, Mol. Biol. Cell, 31 (2020), 782–792. doi: 10.1091/mbc.E19-09-0495
    [22] S. Neumann, R. Chassefeyre, G. E. Campbell, S. E. Encalada, Kymoanalyzer: A software tool for the quantitative analysis of intracellular transport in neurons, Traffic, 18 (2017), 71–88. doi: 10.1111/tra.12456
    [23] E. S. Page, Continuous inspection schemes, Biometrika, 41 (1954), 100–115. doi: 10.1093/biomet/41.1-2.100
    [24] E. S. Page, A test for a change in a parameter occurring at an unknown point, Biometrika, 42 (1955), 523–527. doi: 10.1093/biomet/42.3-4.523
    [25] E. S. Page, On problems in which a change in a parameter occurs at an unknown point, Biometrika, 44 (1957), 248–252. doi: 10.1093/biomet/44.1-2.248
    [26] H. Chernoff, S. Zacks, Estimating the current mean of a normal distribution which is subjected to changes in time, Ann. Math. Stat., 35 (1964), 999–1018. doi: 10.1214/aoms/1177700517
    [27] J. Bai, P. Perron, Estimating and testing linear models with multiple structural changes, Econometrica, 47–78.
    [28] J. Bai, P. Perron, Computation and analysis of multiple structural change models, J. Appl. Econ., 18 (2003), 1–22. doi: 10.1002/jae.659
    [29] D. Barry, J. A. Hartigan, Product partition models for change point problems, Ann. Stat., 260–279.
    [30] D. Barry, J. A. Hartigan, A bayesian analysis for change point problems, J. Am. Stat. Assoc., 88 (1993), 309–319.
    [31] C. Erdman, J. W. Emerson, bcp: An R package for performing a Bayesian analysis of change point problems, J. Stat. Softw., 23.
    [32] S. Yin, N. Song, H. Yang, Detection of velocity and diffusion coefficient change points in single-particle trajectories, Biophys. J., 115 (2018), 217–229. doi: 10.1016/j.bpj.2017.11.008
    [33] M. Csorgo, L. Horváth, Limit theorems in change-point analysis, John Wiley & Sons Chichester, 1997.
    [34] W. Hua, E. C. Young, M. L. Fleming, J. Gelles, Coupling of kinesin steps to ATP hydrolysis, Nature, 388 (1997), 390. doi: 10.1038/41118
    [35] M. J. Schnitzer, S. M. Block, Kinesin hydrolyses one ATP per 8-nm step, Nature, 388 (1997), 386. doi: 10.1038/41111
    [36] K. Visscher, M. J. Schnitzer, S. M. Block, Single kinesin molecules studied with a molecular force clamp, Nature, 400 (1999), 184. doi: 10.1038/22146
    [37] K. Svoboda, C. F. Schmidt, B. J. Schnapp, S. M. Block, Direct observation of kinesin stepping by optical trapping interferometry, Nature, 365 (1993), 721. doi: 10.1038/365721a0
    [38] J. O. L. Andreasson, Single-molecule biophysics of kinesin family motor proteins, PhD thesis, Stanford University, 2013.
    [39] R. J. McKenney, W. Huynh, M. E. Tanenbaum, G. Bhabha, R. D. Vale, Activation of cytoplasmic dynein motility by dynactin-cargo adapter complexes, Science, 345 (2014), 337–341. doi: 10.1126/science.1254198
    [40] T. G. Kurtz, Approximation of Population Processes, SIAM, 1981.
    [41] M. Lavielle, Optimal segmentation of random processes, IEEE Trans. Signal Process., 46 (1998), 1365–1373. doi: 10.1109/78.668798
    [42] R. E. Kass, B. P. Carlin, A. Gelman, R. M. Neal, Markov chain Monte Carlo in practice: A roundtable discussion, Am. Stat., 52 (1998), 93–100.
    [43] M. Lavielle, E. Lebarbier, An application of MCMC methods for the multiple change-points problem, Signal Processing, 81 (2001), 39–53. doi: 10.1016/S0165-1684(00)00189-4
    [44] P. J. Green, Reversible jump Markov chain Monte Carlo computation and Bayesian model determination, Biometrika, 82 (1995), 711–732. doi: 10.1093/biomet/82.4.711
    [45] B. P. Carlin, A. E. Gelfand, A. F. Smith, Hierarchical Bayesian analysis of changepoint problems, Appl. Stat., 389–405.
    [46] D. Stephens, Bayesian retrospective multiple-changepoint identification, Appl. Stat., 159–178.
    [47] A. Gelman, H. S. Stern, J. B. Carlin, D. B. Dunson, A. Vehtari, D. B. Rubin, Bayesian Data Analysis, Chapman and Hall/CRC, 2013.
    [48] R. E. Kass, A. E. Raftery, Bayes factors, J. Am. Stat. Assoc., 90 (1995), 773–795. doi: 10.1080/01621459.1995.10476572
    [49] Y.-C. Yao, Estimation of a noisy discrete-time step function: Bayes and empirical Bayes approaches, Ann. Stat., 1434–1447.
  • mbe-18-06-442 supplementary.zip
  • This article has been cited by:

    1. Nathan T. Rayens, Keisha J. Cook, Scott A. McKinley, Christine K. Payne, Transport of lysosomes decreases in the perinuclear region: Insights from changepoint analysis, 2022, 121, 00063495, 1205, 10.1016/j.bpj.2022.02.032
    2. Chuying Zhou, You Kure Wu, Fumiyoshi Ishidate, Takahiro K. Fujiwara, Mineko Kengaku, Nesprin-2 coordinates opposing microtubule motors during nuclear migration in neurons, 2024, 223, 0021-9525, 10.1083/jcb.202405032
  • Reader Comments
  • © 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
通讯作者: 陈斌, bchen63@163.com
  • 1. 

    沈阳化工大学材料科学与工程学院 沈阳 110142

  1. 本站搜索
  2. 百度学术搜索
  3. 万方数据库搜索
  4. CNKI搜索

Metrics

Article views(3555) PDF downloads(127) Cited by(2)

Article outline

/

DownLoad:  Full-Size Img  PowerPoint
Return
Return

Catalog