Review Topical Sections

Unique insight into protein-DNA interactions from single molecule atomic force microscopy

  • Protein-DNA interactions are pivotal for many essential biological processes. Atomic force microscopy (AFM) imaging of protein-DNA systems involved in DNA target site search, identification, and processing by proteins has contributed invaluable information to our understanding of the underlying mechanisms. The single molecule 3D resolution of AFM enables us to uncover stoichiometries and conformational properties of protein-DNA complexes. Its molecular resolution places AFM at the interface between the atomic resolution achievable by crystallography and the comparably poor (typically > hundred nanometers) spatial resolution of optical microscopy. Furthermore, the transient character of protein interactions with nonspecific DNA sites, for example during their target site search renders these complexes difficult to resolve by standard ensemble methods. Here, we review current applications and capabilities of as well as novel advances in AFM imaging in protein-DNA interaction studies.

    Citation: Disha Mohan Bangalore, Ingrid Tessmer. Unique insight into protein-DNA interactions from single molecule atomic force microscopy[J]. AIMS Biophysics, 2018, 5(3): 194-216. doi: 10.3934/biophy.2018.3.194

    Related Papers:

    [1] Kittur Philemon Kibiwott , Yanan Zhao , Julius Kogo, Fengli Zhang . Verifiable fully outsourced attribute-based signcryption system for IoT eHealth big data in cloud computing. Mathematical Biosciences and Engineering, 2019, 16(5): 3561-3594. doi: 10.3934/mbe.2019178
    [2] Yong Zhu, Zhipeng Jiang, Xiaohui Mo, Bo Zhang, Abdullah Al-Dhelaan, Fahad Al-Dhelaan . A study on the design methodology of TAC3 for edge computing. Mathematical Biosciences and Engineering, 2020, 17(5): 4406-4421. doi: 10.3934/mbe.2020243
    [3] Zhen Yu, Sheng-Huang Lin, Chia-Ching Cho, Changping Chen . Performance management algorithm of financial shared service center based on Internet of Things public cloud privacy protection. Mathematical Biosciences and Engineering, 2023, 20(7): 12510-12528. doi: 10.3934/mbe.2023557
    [4] Ching-Chun Chang, Chang-Tsun Li . Algebraic secret sharing using privacy homomorphisms for IoT-based healthcare systems. Mathematical Biosciences and Engineering, 2019, 16(5): 3367-3381. doi: 10.3934/mbe.2019168
    [5] Radi P. Romansky, Irina S. Noninska . Challenges of the digital age for privacy and personal data protection. Mathematical Biosciences and Engineering, 2020, 17(5): 5288-5303. doi: 10.3934/mbe.2020286
    [6] Kwabena Owusu-Agyemang, Zhen Qin, Appiah Benjamin, Hu Xiong, Zhiguang Qin . Insuring against the perils in distributed learning: privacy-preserving empirical risk minimization. Mathematical Biosciences and Engineering, 2021, 18(4): 3006-3033. doi: 10.3934/mbe.2021151
    [7] Kwabena Owusu-Agyemang, Zhen Qin, Appiah Benjamin, Hu Xiong, Zhiguang Qin . Guaranteed distributed machine learning: Privacy-preserving empirical risk minimization. Mathematical Biosciences and Engineering, 2021, 18(4): 4772-4796. doi: 10.3934/mbe.2021243
    [8] Sanket Desai, Nasser R Sabar, Rabei Alhadad, Abdun Mahmood, Naveen Chilamkurti . Mitigating consumer privacy breach in smart grid using obfuscation-based generative adversarial network. Mathematical Biosciences and Engineering, 2022, 19(4): 3350-3368. doi: 10.3934/mbe.2022155
    [9] Zhongxue Yang, Yiqin Bao, Yuan Liu, Qiang Zhao, Hao Zheng, Wenbin Xu . Lightweight blockchain fuzzy decision scheme through MQTT and Fibonacci for sustainable transport. Mathematical Biosciences and Engineering, 2022, 19(12): 11935-11956. doi: 10.3934/mbe.2022556
    [10] Kaimeng Chen, Chin-Chen Chang . High-capacity reversible data hiding in encrypted images based on two-phase histogram shifting. Mathematical Biosciences and Engineering, 2019, 16(5): 3947-3964. doi: 10.3934/mbe.2019195
  • Protein-DNA interactions are pivotal for many essential biological processes. Atomic force microscopy (AFM) imaging of protein-DNA systems involved in DNA target site search, identification, and processing by proteins has contributed invaluable information to our understanding of the underlying mechanisms. The single molecule 3D resolution of AFM enables us to uncover stoichiometries and conformational properties of protein-DNA complexes. Its molecular resolution places AFM at the interface between the atomic resolution achievable by crystallography and the comparably poor (typically > hundred nanometers) spatial resolution of optical microscopy. Furthermore, the transient character of protein interactions with nonspecific DNA sites, for example during their target site search renders these complexes difficult to resolve by standard ensemble methods. Here, we review current applications and capabilities of as well as novel advances in AFM imaging in protein-DNA interaction studies.


    In the last few decades, technology has significantly dominated our lives and is currently considered the driving force of recent improvements in the medical health care area. Wireless sensor networks (WSNs) have demonstrated considerable importance due to their usage in many different aspects of human lives, such as medical health care, surveillance, environmental monitoring, military fields, and many other useful applications [1,2,3].

    With the widespread use of smartphones, researchers have concentrated on the use of mobile technology for mobile medical health care, focusing on systems of medical data aggregation that are used to collect and send health data from patient smartphones directly to health care organizations.

    With the exponential growth of health data, the processes of aggregating and analyzing vast amounts of data require immense storage capabilities, powerful computational resources, and fast and secure means of communication. Achieving these requirements by relying only on traditional WSNs is difficult and expensive for health care organizations [4].

    Cloud-based solutions have proliferated in the medical health care field due to their extensive benefits [5]. These benefits include large-scale and on-demand storage, agility, cost-effectiveness and continuous service availability for information processing. Therefore, cloud-based solutions have considerable potential to enhance collaboration among the various participating entities in medical health care, such as patients and health care organizations.

    Despite these benefits, cloud-based solutions are associated with elevated threat levels in terms of security and privacy. These threats include identity spoofing; data tampering; information disclosure; and violations of data integrity, confidentiality, authenticity, and accountability [6,7].

    Mobile health networks (MHNWs) consist of small and inexpensive sensors that are deployed in unsupervised environments and are easily exposed to malfunctions and malicious attacks. Thus, fault tolerance is an important characteristic that must be considered when designing sensor network schemes [8].

    In data aggregation schemes, sensor failures can cause the collection and transfer of incorrect data without a guarantee of excellence. Fault tolerance is defined as "the ability of the network to sustain its functionalities properly, even in the presence of failures in some of its nodes". Fault tolerance aims to eliminate critical privacy threats and assure strong privacy protection for users who contribute their data to aggregators to ensure that the applied technology can deliver excellent service quality [9].

    Two approaches can be employed to achieve fault tolerance. The first is a reactive approach, in which a system can recover from failures when they occur [10]. A small error can be recovered by a state-of-the-art protocol despite failures [11]. Unfortunately, these protocols can tolerate only partial failures and are not efficient in terms of bandwidth and delay. The second is a proactive approach that handles failures using multiple message exchanges between the nodes and the aggregator before faults occur. This approach substantially reduces the required recovery time, as the information needed for fault recovery is available.

    Despite the fact that the state-of-the-art binary proactive protocol achieves a low delay, it suffers from communication overhead, bandwidth costs and large errors [10]. Won et al. [10] presented a novel design for a future ciphertext mechanism; this design supports differential privacy and achieves a higher bandwidth than the state-of-the-art binary proactive protocol. Chen et al. [12] presented a data aggregation scheme that preserves user privacy and guarantees data integrity by adopting a future ciphertext mechanism to provide fault tolerance capability.

    Due to the confidential nature of health information and the importance of protecting and preserving the confidentiality of data, information security systems should be designed and developed with consideration of legal, ethical and security issues. Therefore, to design a workable data aggregation scheme for medical health care, the following issues must be addressed. The first issue is how to protect and preserve the security and privacy of data and maintain data confidentiality. The second issue is how to protect a system against failures.

    Therefore, we propose a novel design for a fault-tolerant privacy-preserving data aggregation scheme. We use the cloud to aggregate, store and process data. Our contributions can be summarized as follows:

    ●  The proposed architecture achieves a fault-tolerant privacy-preserving data aggregation scheme for lightweight health data with end-to-end verification. Moreover, when some failures occur, the cloud can compute the aggregation result, and health care institutions (HCs) can verify the correctness of the aggregated result.

    ●  We modify the future ciphertext mechanism by adding a threshold for the number of faulty nodes. This modification avoids scenarios in which the cloud continues to compute meaningless aggregation when a serious abnormality occurs in the system.

    ●  For secure aggregation and identity protection, we use homomorphic encryption, as it enables aggregation functions to be performed on encrypted data. We use random noise to achieve differential privacy.

    ●  We provide a security and privacy analysis to show that our proposed scheme supports privacy preservation, fault tolerance, and data integrity verification. Additionally, we evaluate the efficiency, robustness and reliability of our scheme to confirm that it has good real-time performance and low aggregation error.

    The remainder of the paper is organized as follows: Section 2 provides a background of health data aggregation and investigates the privacy and security challenges of data aggregation. Section 3 reviews related studies. Preliminaries and the proposed scheme are presented in Sections 4 and 5, respectively. The security analysis is provided in Section 6, followed by the performance analysis in Section 7. The conclusion of the study and future research ideas are discussed in Section 8.

    WSNs are formed by hundreds of thousands of sensor nodes that are used to measure and transmit physical or environmental changes, such as temperature and pressure or motion within a monitoring environment. Each sensor node consists of a sensing unit, memory, a processing unit, a power supply and a wireless communication unit [13]. The characteristics of WSNs include limited power, mobility, ability to cope with node failures, and low cost. These features have prompted researchers to introduce a new research area in the medical health care field: MHNWs. Recently, wearable devices and smartphones have been extensively applied in offering health monitoring services based on health data gathered from users. As these health data are very sensitive, any data leakage may violate user privacy [2]. In this section, we present an overview of the different uses of cloud computing in terms of data aggregation and address the security and privacy issues associated with MHNWs.

    The usage of WSNs has rapidly increased over the past few years in various fields. A massive amount of data is being collected, transmitted and aggregated to perform processing operations. A large number of threats surround WSNs. Consequently, the design of a privacy-preserving data aggregation protocol should address these threats [13], which include the following:

    a) Privacy Preservation and Eavesdropping: Eavesdropping is a type of attack in which the intruder tries to obtain confidential data by listening to transmissions over neighboring wireless links. Therefore, privacy preservation assures data privacy that may be threatened by trusted sensor nodes and adversaries. Some aggregation functions, such as min and max, can also be used to breach data privacy. Therefore, the designed protocol must maintain data privacy while using aggregation functions [13].

    b) Data Integrity and Data Tampering: One of the most common types of attack on data privacy is data tampering, in which the attacker tries to manipulate (with an intermediate result) sensor data at the aggregator level during the data aggregation phase. This type of attack leads to an incorrect aggregation result and eventually to an incorrect decision [2,13].

    c) Efficiency: In WSNs, it is very difficult to avoid communication overhead, but it can be greatly minimized by reducing communication costs, computational costs, and memory and payload sizes. In WSNs, data aggregation must fulfill both bandwidth and energy efficiency requirements throughout network processing.

    d) Accuracy and Dynamism: In WSNs, energy constraints must be properly managed. The data generated from all sensing nodes are important. Therefore, all nodes should have sufficient power to process the collected data [3,13].

    Certain rules and regulations are defined to ensure the privacy of the data within an organization and are called the CIA model (confidentiality, integrity and availability) [5]. Nevertheless, the data managed by third-party vendors require more privacy measures than those existing in the CIA model. Abbas and Khan [7] stated that there are many threats to privacy in the cloud, such as spoofing, masquerading, tampering, replaying and denial of service. The following requirements must be fulfilled to achieve data privacy preservation:

    ●  Confidentiality: The health information of patients must be protected not only in the cloud environment but also from external anomalies and unauthorized users [7].

    ●  Integrity: The data must be protected from illegitimate actions while ensuring that the data have not been altered or tampered with by either authorized or unauthorized users [5,7].

    ●  Anonymity: Health data contain vital information, such as the patient's diseases and name, and this information must be hidden [14,15]. The patient's identity must be protected from intruders, unauthorized users and other internal or external adversaries. Anonymity can be achieved by using a technique known as pseudonymity [5,7].

    ●  Nonrepudiation: These threats are posed by a user who performs tasks and denies them later. In the medical health care area, neither a patient nor a doctor can deny modifying data [7].

    Data aggregation, as a powerful technique for MHNWs, has attracted substantial attention in both academia and industrial fields. Recently, many privacy-preserving data aggregation schemes have been presented. In [11], ACS and Castelluccia proposed a privacy-preserving data aggregation scheme that applies the differential privacy concept by adding Laplace noise to aggregated data. However, the scheme increases network bandwidth and delay. Subsequently, they extended their scheme to support partial fault tolerance.

    Lu et al. [15] introduced an efficient privacy-preserving scheme that reduces the computational overhead and delays in the network with its features, thereby providing fewer calculations, less traffic, higher accuracy and verifiable completeness. Khan et al. [16] proposed a fault-tolerant privacy-preserving data aggregation scheme in a fog-enabled Boneh-Goh-Nissam (BGN) cryptosystem used to preserve privacy. This scheme also reduces communication and computation costs. Zhang et al. [17] presented a privacy-preserving data aggregation scheme for health data monitoring in which the health data were stored and processed in the cloud and various strategies were applied based on the prioritization of the dataset. This scheme reduces the communication overhead but is not tolerant of node failures.

    Won et al. [10] introduced a novel design for future ciphertext buffering to tolerate malfunctioning smart meters and achieved both differential privacy and error optimization. Chen et al. [12] also adopted the future ciphertext buffering mechanism that was proposed in [10] and proposed an aggregation scheme that supports fault tolerance, privacy preservation and data integrity. In addition, confidentiality was guaranteed by using Diffie-Hellman cryptography, while integrity was achieved by attaching a homomorphic message authentication code (HMAC) to each message.

    Han et al. [8] addressed the fault tolerance issue within the health data monitoring framework. They proposed a cloud-based data aggregation scheme that supports additive and nonadditive data aggregation. A BGN cryptosystem was used to protect user privacy. Differential privacy was achieved by using multiple cloud servers. The scheme also guarantees data integrity. Chen et al. [18] presented another multifunctional data aggregation schema (MUDA) that takes advantage of the homomorphic property of the BGN cryptosystem and a bilinear map to provide confidentiality to user data. The MUDA was also extended to support differential privacy [19,20]. Zhu et al. [21] proposed a secure data integrity verification scheme based on a short signature algorithm. They introduced the use of cloud computing to augment computing and storage resources.

    Since health data aggregation requires very high computational capabilities, the privacy of sensitive information can be guaranteed if it is encrypted [22] by the owner. Homomorphic encryption enables the cloud to compute the result of aggregation without knowing the raw data. Another way to provide security for health data is through the use of cryptographic storage [3,8,17].

    Moreover, verification is an extremely crucial step in health data aggregation, as any tampering with the data results in an invalid aggregation, and such interference must therefore be detected and rejected. The message authentication code (MAC) is a protocol that is commonly used to detect false data and to protect and guarantee the integrity of the data [12,23,24,25]. Zhang et al. [25] and Chen et al. [12] took advantage of the homomorphic properties of the MAC to guarantee data integrity. The hash-based MAC was used by Chen et al. [12] and Zhuo et al. [26] to verify user data.

    This section reviews the relevant definitions and terminologies. These definitions are necessary to understand the remainder of this work. The basic notations and symbols are listed in Table 1.

    Table 1.  Descriptions of symbols that are frequently used.
    Variables Description
    $ {\mathbf{G}}_{\bf 1}, {\mathbf{G}}_{\bf 2}, {\mathbf{G}}_{\mathbf{T}} $ Cyclic multiplicative group
    $ {\boldsymbol{g}}_{\bf 1} $ Generator of $ { \ \mathrm{G}}_{1} $
    $ {\boldsymbol{g}}_{\bf 2} $ Generator of $ { \ \mathrm{G}}_{2} $
    $ \boldsymbol{p}, \boldsymbol{q} $ Prime numbers
    $ {\boldsymbol{Z}}_{\boldsymbol{q}}^{\mathbf{*}} $ Multiplicative group of $ { Z}_{q} $
    $ \boldsymbol{e}\left({\boldsymbol{g}}_{1}, {\boldsymbol{g}}_{2}\right) $ The generator of group $ { G}_{T} $
    $ \boldsymbol{A} $ A random mechanism
    $ \boldsymbol{ϵ} $ A parameter expressing the privacy cost
    $ \boldsymbol{S} $ Subset of $ Range\left(A\right) $
    $ \boldsymbol{R}\boldsymbol{a}\boldsymbol{n}\boldsymbol{g}\boldsymbol{e}\left(\boldsymbol{A}\right) $ A domain of the output under mechanism $ A $
    $ \mathbf{ \pmb{\mathsf{ λ}} } $ A security parameter
    $ \boldsymbol{p}\boldsymbol{k} $ Public key
    $ \boldsymbol{s}\boldsymbol{k} $ Secret key
    $ \mathbf{x} $ Message
    $ \boldsymbol{c} $ Ciphertext
    MAC Homomorphic MAC function
    $ \boldsymbol{h} $ Secure hash function

     | Show Table
    DownLoad: CSV

    A bilinear pairing $ \mathrm{e} $ is a map $ \mathrm{e}:{\mathrm{G}}_{1}\times {\mathrm{G}}_{2}\stackrel{}{\to }{\mathrm{G}}_{\mathrm{T}} $, where $ {\mathrm{G}}_{1}, {\mathrm{G}}_{2} $ and $ {\mathrm{G}}_{\mathrm{T}} $ are cyclic multiplicative groups of the same prime order $ \mathrm{q} $, $ {\mathrm{g}}_{1} $ is a generator of $ {\mathrm{G}}_{1} $, and $ {\mathrm{g}}_{2} $ is a generator of $ {\mathrm{G}}_{2} $. The pairing $ \mathrm{e} $ has the following properties:

    Bilinearity: $ \mathrm{e}\left({\mathrm{g}}_{1}^{\mathrm{a}}, {\mathrm{g}}_{2}^{\mathrm{b}}\right) = \ \mathrm{e}\left({\mathrm{g}}_{1}, {\mathrm{g}}_{2}\right)\forall {\mathrm{g}}_{1}\in {\mathrm{G}}_{1}, {\mathrm{g}}_{2}\in {\mathrm{G}}_{2} $ and $ \mathrm{a}, \ \mathrm{b}\in {\mathrm{Z}}_{\mathrm{q}}^{\mathrm{*}} $.

    Computability: $ \forall {\mathrm{g}}_{1}\in {\mathrm{G}}_{1}, {\mathrm{g}}_{2}\in {\mathrm{G}}_{2} $, $ \mathrm{e}\left({\mathrm{g}}_{1}, {\mathrm{g}}_{2}\right) $ can be computed by an efficient algorithm.

    Nondegeneracy: $ \forall {\mathrm{g}}_{1}\in {\mathrm{G}}_{1}, {\mathrm{g}}_{2}\in {\mathrm{G}}_{2} $, $ \mathrm{e}\left({\mathrm{g}}_{1}, {\mathrm{g}}_{2}\right)\ne 1. $

    Definition 1. Discrete Logarithmic Problem [27] (DLP): Assume that $ {\mathrm{G}}_{1}, {\mathrm{G}}_{2} $ are two cyclic multiplicative cyclic groups, $ {\mathrm{G}}_{1} $ is generated by $ {\mathrm{g}}_{1} $, and $ {\mathrm{G}}_{2} $ is generated by $ {\mathrm{g}}_{2} $. Suppose that $ {\mathrm{g}}_{0}, {\mathrm{g}}_{1} $ are two elements in $ {\mathrm{G}}_{1} $. It is computationally intractable to compute $ \mathrm{a} $ such that

    $ {\mathrm{g}}_{1} = {\mathrm{g}}_{0}^{\mathrm{a}} $ (1)

    Definition 2. Computational Diffie-Hellman Problem [28] (CDH): Assume that $ {\mathrm{G}}_{1}, {\mathrm{G}}_{2} $ are two cyclic multiplicative cyclic groups, $ {\mathrm{G}}_{1} $ is generated by $ {\mathrm{g}}_{1} $, and $ {\mathrm{G}}_{2} $ is generated by $ {\mathrm{g}}_{2} $. Given $ \mathrm{e}\left({\mathrm{g}}_{1}, {\mathrm{g}}_{1}^{\mathrm{a}}, {\mathrm{g}}_{1}^{\mathrm{b}}\right) $ and $ \mathrm{a}, \ \mathrm{b}\in {\mathrm{Z}}_{\mathrm{q}}^{\mathrm{*}} $, it is intractable to derive $ {\mathrm{g}}_{1}^{\mathrm{a}\mathrm{b}} $ from the given $ \mathrm{e}\left({\mathrm{g}}_{1}^{\mathrm{a}}, {\mathrm{g}}_{1}^{\mathrm{b}}\right) $ in polynomial time.

    Definition 3. Decisional Diffie-Hellman Problem [26] (DDH): Assume that $ {\mathrm{G}}_{1}, {\mathrm{G}}_{2} $ are two cyclic multiplicative cyclic groups, $ {\mathrm{G}}_{1} $ is generated by $ {\mathrm{g}}_{1} $, and $ {\mathrm{G}}_{2} $ is generated by $ {\mathrm{g}}_{2} $. Given $ \mathrm{e}\left({\mathrm{g}}_{1}, {\mathrm{g}}_{1}^{\mathrm{a}}, {\mathrm{g}}_{1}^{\mathrm{b}}, {\mathrm{g}}_{1}^{\mathrm{c}}\right) $, where a, b, c $ \in {\mathrm{Z}}_{\mathrm{q}}^{\mathrm{*}} $, a DDH determines whether $ \mathrm{c} = \mathrm{a}\mathrm{b} \ \mathrm{m}\mathrm{o}\mathrm{d} \ \mathrm{q} $ by checking as follows:

    $ \mathrm{e}\left({\mathrm{g}}_{1}^{\mathrm{a}}, {\mathrm{g}}_{1}^{\mathrm{b}}\right)\genfrac{}{}{0pt}{}{?}{ = }\mathrm{e}\left({\mathrm{g}}_{1}^{\mathrm{c}}, \ \mathrm{g}\right) $ (2)

    Definition 4. Gap Diffie-Hellman [29] (GDH) Group: A group is Gap Diffie-Hellman if the computational Diffie-Hellman problem is hard but the Decisional Diffe-Hellman problem can be solved in a cyclic multiplicative group $ {\mathrm{G}}_{1}, {\mathrm{G}}_{2} $.

    Definition 1. ($ \mathrm{ϵ}- $ Differential Privacy) [30] A randomized mechanism $ \mathrm{A} $ satisfies $ \mathrm{ϵ}- $ differential privacy if for any two datasets $ {\mathrm{D}}_{1} \ \ \mathrm{a}\mathrm{n}\mathrm{d} \ \ {\mathrm{D}}_{2} $, where $ {\mathrm{D}}_{1}\mathrm{i}\mathrm{s} $ obtained from $ {\mathrm{D}}_{2} $ by adding or removing a single element, and for all $ \mathrm{S} $ ⊆ $ \mathrm{R}\mathrm{a}\mathrm{n}\mathrm{g}\mathrm{e}\left(\mathrm{A}\right), $

    $ \mathrm{P}\mathrm{r}\left(\mathrm{A}\left({\mathrm{D}}_{1}\right)\in \mathrm{S}\right)\le {\mathrm{e}}^{\mathrm{ϵ}}.\mathrm{P}\mathrm{r}\left(\mathrm{A}\left({\mathrm{D}}_{2}\right)\in \mathrm{S}\right) $ (3)

    In the above definition, the parameter $ \mathrm{ϵ} $ represents the privacy cost, which allows us to control the desired privacy level. A smaller value of $ \mathrm{ϵ} $ denotes better privacy protection but implies that more noise is required and that the result will have lower accuracy. The most common mechanism for achieving $ \mathrm{ϵ} $ -differential privacy is to add i.i.d Laplace noise sampled from the Laplace distribution to the aggregated result.

    Definition 2. ($ 2\mathrm{ϵ}- $ Differential Privacy) [31] The noise $ \mathrm{L}\mathrm{a}\mathrm{p}\left({\rm{ \mathsf{ λ} }}\right) $ is sampled from the Laplace noise distribution with mean 0 and variance $ {2{\rm{ \mathsf{ λ} }}}^{2} $. The probability density function of the distribution is given by

    $ \mathrm{L}\mathrm{a}\mathrm{p}\left({\rm{ \mathsf{ λ} }}\right) = \frac{1}{2{\rm{ \mathsf{ λ} }}}{\mathrm{e}}^{\left|\mathrm{x}\right|/{\rm{ \mathsf{ λ} }}} $ (4)

    In our scenario, each participant should generate random noise following a Laplace distribution. The Laplace distribution is infinity divisible, where each random variable is a summation of n other random variables as follows:

    $ \mathrm{L}\mathrm{a}\mathrm{p}\left({\rm{ \mathsf{ λ} }}\right) = {\sum }_{\mathrm{i} = 1}^{\mathrm{n}}({\mathrm{G}}_{\mathrm{i}}\left(\mathrm{n}, \ {\rm{ \mathsf{ λ} }}\right)-{\mathrm{G}'}_{\mathrm{i}}(\mathrm{n}, \ {\rm{ \mathsf{ λ} }}\left)\right) $ (5)

    where $ {\mathrm{G}}_{\mathrm{i}}\left(\mathrm{n}, \ {\rm{ \mathsf{ λ} }}\right) $ and $ {\mathrm{G}}_{\mathrm{i}}'(\mathrm{n}, {\rm{ \mathsf{ λ} }}) $ are gamma-distributed random variables with a gamma density given by

    $ \mathrm{g}\left(\mathrm{x}, \ \mathrm{n}, \ {\rm{ \mathsf{ λ} }}\right) = \frac{{(1/{\rm{ \mathsf{ λ} }})}^{1/\mathrm{n}}}{\mathrm{\Gamma }(1/\mathrm{n})}{\mathrm{x}}^{1/\mathrm{n}-1}{\mathrm{e}}^{-\mathrm{x}/{\rm{ \mathsf{ λ} }}} $ (6)

    Additionally, $ \mathrm{\Gamma }(1/\mathrm{n}) $ is the gamma function evaluated at $ 1/\mathrm{n} $.

    Yet Another Somewhat Homomorphic Encryption (YASHE) is a scheme based on a modified version of n-th degree truncated polynomial ring units (NTRUs) and the multikey homomorphic encryption scheme [32]. It has become a trendy fully homomorphic encryption (FHE) scheme due to its superior performance with lightweight data compared with the performances of other homomorphic schemes [32,33].

    The security of YASHE is based on the hardness of the decisional-ring learning with errors (RLWE) problem [34]: Given sample $ {\mathrm{a}\leftarrow \mathrm{R}}_{\mathrm{q}} $, error term e $ \leftarrow $ χ, and a secret $ {\mathrm{s}\leftarrow \mathrm{R}}_{\mathrm{q}} $ where $ {\mathrm{a}\leftarrow \mathrm{R}}_{\mathrm{q}} $ is drawn uniformly at random, it is computationally hard for an adversary that does not know s and e to distinguish between the distribution of e $ (sa+e, a) $ and that of $ (a, b) $ where $ \left({\mathrm{b}\leftarrow \mathrm{R}}_{\mathrm{q}}\right) $.

    YASHE. ParamGen($ {\rm{ \mathsf{ λ} }} $): Given a set of parameters $ {\rm{ \mathsf{ λ} }} $, $ \mathrm{d} $, $ \mathrm{q} $, $ \mathrm{t} $, $ {\mathrm{x}}_{\mathrm{k}\mathrm{e}\mathrm{y}} $, $ {\mathrm{x}}_{\mathrm{e}\mathrm{r}\mathrm{r}} $ and $ \mathrm{w} $, where $ {\rm{ \mathsf{ λ} }} $ is a security parameter, $ \mathrm{d} $ is a fixed positive integer that determines $ \mathrm{R} $, and moduli $ \mathrm{q} $ and $ \mathrm{t} $ exist, with $ 1<\mathrm{t}<\mathrm{q} $, $ {\mathrm{x}}_{\mathrm{k}\mathrm{e}\mathrm{y}} $ and $ {\mathrm{x}}_{\mathrm{e}\mathrm{r}\mathrm{r}} $ are distributions on $ \mathrm{R} $, and $ \mathrm{w} $ is an integer base where $ \mathrm{w}>1 $. The algorithm generates $ (\mathrm{d} $, $ \mathrm{q} $, $ \mathrm{t} $, $ {\rm{ \mathsf{ λ} }} $, $ {\mathrm{x}}_{\mathrm{k}\mathrm{e}\mathrm{y}} $, $ {\mathrm{x}}_{\mathrm{e}\mathrm{r}\mathrm{r}} $, $ \mathrm{w} $).

    YASHE. keyGen $ (\mathrm{d} $, $ \mathrm{q} $, $ \mathrm{t} $, $ {\rm{ \mathsf{ λ} }} $, $ {\mathrm{x}}_{\mathrm{k}\mathrm{e}\mathrm{y}} $, $ {\mathrm{x}}_{\mathrm{e}\mathrm{r}\mathrm{r}} $, $ \mathrm{w} $): $ \mathrm{h} $, $ \mathrm{f}\leftarrow {\mathrm{x}}_{\mathrm{k}\mathrm{e}\mathrm{y}} $ are computed; then, $ \mathrm{f} = [\mathrm{t}{\mathrm{f}}'+1{]}_{\mathrm{q}} $ and $ \mathrm{h} = [\mathrm{t}\mathrm{g}{\mathrm{f}}'{]}_{\mathrm{q}} $ are set. $ \mathrm{e}, \ \mathrm{s}\leftarrow {\mathrm{x}}_{\mathrm{e}\mathrm{r}\mathrm{r}}^{{\mathrm{l}}_{\mathrm{w}, \ \mathrm{q}}} $ are sampled, and $ \mathrm{\gamma } = \ [{\mathrm{P}}_{\mathrm{w}, \ \mathrm{q}}\left(\mathrm{f}\right)+\mathrm{e}+\mathrm{h}. \ \mathrm{s}{]}_{\mathrm{q}}\in {\mathrm{R}}^{{\mathrm{l}}_{\mathrm{w}}} $ is computed. Then, $ (\mathrm{p}\mathrm{k} $, $ \mathrm{s}\mathrm{k} $, $ \mathrm{e}\mathrm{v}\mathrm{k} $) = $ (\mathrm{h} $, $ \mathrm{f} $, $ \mathrm{\gamma } $) is generated.

    YASHE. Encrypt ($ \mathrm{p}\mathrm{k}, \ \mathrm{x}) $ : $ \mathrm{x}\in \mathrm{R} $ is encrypted, and ciphertext c $ = {\left[∆{\left[\mathrm{x}\right]}_{\mathrm{t}}+\mathrm{e}+\mathrm{h}\mathrm{s}\right]}_{\mathrm{q}}\in \mathrm{R} $ is generated.

    YASHE. Decrypt ($ \mathrm{s}\mathrm{k}, \ \mathrm{c}) $ : A ciphertext c is decrypted by x $ = {\left[⌊\frac{\mathrm{t}}{\mathrm{q}}.[{\mathrm{f}\mathrm{c}]}_{\mathrm{q}}⌉\right]}_{\mathrm{t}}\in \mathrm{R} $.

    YASHE. Add $ ({\mathrm{c}}_{1} $, $ {\mathrm{c}}_{2} $): The ciphertext $ {\mathrm{c}}_{\mathrm{a}\mathrm{d}\mathrm{d}} $ = $ [{{\mathrm{c}}_{1}+{\mathrm{c}}_{2}]}_{\mathrm{q}} $ is output.

    One of the basic methods for ensuring data integrity and preventing tampering attacks is to use a homomorphic MAC function. The homomorphic property means that for two messages $ {\mathrm{x}}_{1} $ and $ {\mathrm{x}}_{2} $, given two homomorphic MACs (MAC ($ {\mathrm{x}}_{1} $) and MAC ($ {\mathrm{x}}_{2} $)), anyone can compute $ {{\mathrm{M}\mathrm{A}\mathrm{C} \ (\mathrm{x}}_{1}+ \ \mathrm{x}}_{2} $) without knowing $ {\mathrm{x}}_{1} $ or $ {\mathrm{x}}_{2} $. The MAC function can be constructed as follows:

    ${\mathrm{M}\mathrm{A}\mathrm{C} \ (\mathrm{x}}_{\mathrm{i}}) = {\mathrm{g}}^{\mathrm{x}}\mathrm{m}\mathrm{o}\mathrm{d}\mathrm{q} $ (7)

    where $ {\mathrm{x}}_{\mathrm{i}}<\mathrm{q} $. This MAC function satisfies the homomorphic property since it follows that

    $ \text{MAC} ( {\mathrm{x}}_{1} ) \mathrm{mo}\mathrm{d} \ \mathrm{q} + \text{MAC} ( {\mathrm{x}}_{2} ) \mathrm{mo}\mathrm{d} \ \mathrm{q} = {\mathrm{g}}^{\mathrm{x}1+\mathrm{x}2}\mathrm{m}\mathrm{o}\mathrm{d} \ \mathrm{q} = \text{MAC} ( {{\mathrm{x}}_{1}+ \ \mathrm{x}}_{2} ) \forall {{\mathrm{x}}_{1}, \ \mathrm{x}}_{2}\in {\mathrm{Z}}_{\mathrm{q}} $ (8)

    The cryptographic hash function is used to check the integrity and source of the given data. This function accepts an input of arbitrary length and maps it to a fixed length with a one-way, collision-resistant mapping. It is computationally infeasible to map two different input maps $ (\mathrm{a}, \ \mathrm{b}) $ to the same output such that $ \mathrm{h}\left(\mathrm{a}\right) = \mathrm{h}\left(\mathrm{b}\right) $, where $ \mathrm{a}\ne \mathrm{b} $. Additionally, it is impossible to infer a from $ \mathrm{h}\left(\mathrm{a}\right) $ [35].

    Data aggregation is an important tool in MHNWs, in which a vast amount of sensitive data is transmitted, processed, and analyzed. Therefore, fault tolerance and privacy have become critical issues for health data aggregation. Without appropriate privacy protection, users may not be willing to share their data. Therefore, we introduce a fault-tolerant privacy-preserving data aggregation scheme for health data.

    In our scheme, the computational overhead is reduced. Privacy is provided by the fully homomorphic YASHE in addition to embedded noise for differential privacy. Fault tolerance is achieved by applying the future message mechanism to properly sustain network operability even in the presence of failures. To enhance the efficiency of the proposed scheme, a health institution can control malfunctioning nodes. The basic notations and symbols of the scheme are listed in Table 2.

    Table 2.  Basic notations for our proposed scheme.
    Variables Description
    $ {\boldsymbol{U}}_{\boldsymbol{i}} $ Participant
    $ {\mathbf{s}\mathbf{k}}_{\mathbf{i}} $ Private key for the participant
    $ {\mathbf{p}\mathbf{k}}_{\mathbf{i}} $ Public key for the participant
    $ {\mathbf{s}\mathbf{k}}_{\mathbf{c}} $ Private key for the health care institution
    $ {\mathbf{p}\mathbf{k}}_{\mathbf{c}} $ Public key for the health care institution
    $ {\mathbf{x}}_{\mathbf{i}, \mathbf{ }\mathbf{t}} $ Health data
    $ \widehat{\mathbf{r}} $ Random noise
    $ {\widehat{\mathbf{x}}}_{\mathbf{i}, \mathbf{t}} $ Noisy data
    $ {\mathbf{c}}_{\mathbf{i}, \mathbf{t}} $ Ciphertext
    $ \boldsymbol{H} $ Secure hash function
    $ {\mathbf{ \pmb{\mathsf{ σ}} }}_{\mathbf{i}, \mathbf{t}} $ Signature generated by the secure hash function for participant $ { U}_{i} $
    $ {\widehat{\mathbf{c}}}_{\mathbf{i}, \mathbf{t}+\mathbf{B}} $ Future ciphertext for time t + B
    $ \boldsymbol{M}\boldsymbol{A}\boldsymbol{C} $ Homomorphic message authentication code
    $ \widehat{\mathbf{M}\mathbf{A}\mathbf{C}} $ Future homomorphic message authentication code
    $ \mathbf{ \pmb{\mathsf{ μ}} } $ Encrypted aggregation result
    $ \mathbf{ \pmb{\mathsf{ σ}} } $ Verification of the correctness of the obtained aggregation result

     | Show Table
    DownLoad: CSV

    Our system model consists of four main entities, as shown in Figure 1: mobile workers (MWs), the health care institution (HC), the cloud (C), and the trusted authority (TA).

    Figure 1.  System model.

    ■  Trusted Authority (TA): The primary responsibility of the TA is the initialization of the entire system, which includes registering the participants, the HCs and the cloud; generating the required public parameters; and distributing the keys.

    ■  Health Care Institution (HC): The HC is the requester that seeks aggregation statistics from patients' data. Due to limited storage and computation capabilities, the HC delegates computations to the cloud.

    ■  Cloud: The cloud server receives encrypted data from MWs and computes the desired statistical results. The cloud server encrypts the computation results and forwards them to the HC.

    ■  Participant (U): Participants refers to users or MWs who have smartphones and contribute their data to an HC. MWs are randomly chosen and encrypt and send their sensing data to the cloud.

    Figure 2 depicts the framework of our proposed scheme, which contains three main entities: the client, the cloud, and the health institution. The cloud is the most prominent of these entities in our proposed scheme and contains three main modules: the data integrity verification module, the fault tolerance module, and the data aggregation module.

    Figure 2.  Our proposed framework.

    The workflow of our framework is as follows: First, the user's encrypted data are sent with two parameters: the first is the future ciphertext, and the second is the verification code. Then, the cloud server will verify the data integrity and calculate the aggregation result. If the aggregator fails to receive the data from one or more users up to m, the aggregator will use the future ciphertext from the buffer memory to calculate the aggregation result and then send the result to the HC. Finally, the HC will decrypt the result.

    Step 1: Setup and key management

    The TA generates the necessary parameters and keys for the system, generates the bilinear parameters ($ \mathrm{q}, \ \mathrm{g}, \ \mathrm{h}, \ \mathrm{e}, {\mathrm{G}}_{1}, {\mathrm{G}}_{2}, {\mathrm{G}}_{\mathrm{T}} $) and encryption parameters for YASHE $ (\mathrm{d} $, $ \mathrm{q} $, $ \mathrm{t} $, $ {\rm{ \mathsf{ λ} }} $, $ {\mathrm{x}}_{\mathrm{k}\mathrm{e}\mathrm{y}} $, $ {\mathrm{x}}_{\mathrm{e}\mathrm{r}\mathrm{r}} $, $ \mathrm{w} $) and chooses a secure hash function $ \mathrm{H}\left(\mathrm{x}\right) $. The TA registers all mobile users, the requester and the cloud in the system by sending them a private/public key pair ($ {\mathrm{s}\mathrm{k}}_{\mathrm{c}}, {\mathrm{p}\mathrm{k}}_{\mathrm{c}}) $. The TA selects N mobile users and registers them. Each registered MW is assigned private/public key pairs ($ {\mathrm{s}\mathrm{k}}_{\mathrm{i}}, {\mathrm{p}\mathrm{k}}_{\mathrm{i}} $). Both the requester and workers are assigned encryption keys (α, β) for the homomorphic MAC.

    Step 2: Sensing and reporting

    During each time period t, each participant $ {\mathrm{U}}_{\mathrm{i}} $ reports his/her sensing data $ {\mathrm{x}}_{\mathrm{i}, \ \mathrm{t}} $ as follows. First, $ {\mathrm{U}}_{\mathrm{i}} $ computes

    $ {\widehat{\mathrm{x}}}_{\mathrm{i}, \mathrm{t}} = {\mathrm{x}}_{\mathrm{i}, \ \mathrm{t}} + {\widehat{\mathrm{r}}}_{\mathrm{i}, \ \mathrm{t}} $ (9)
    $ {\widehat{\mathrm{r}}}_{\mathrm{i}, \mathrm{t}} = {\mathrm{G}}_{\mathrm{i}, \mathrm{t}}\left(\mathrm{n}, \ {\rm{ \mathsf{ λ} }}\right)-{{G}'}_{\mathrm{i}, \mathrm{t}}\left(\mathrm{n}, \ {\rm{ \mathsf{ λ} }}\right) = {\widehat{\mathrm{G}}}_{\mathrm{i}, \mathrm{t}}\left(\mathrm{n}, \ {\rm{ \mathsf{ λ} }}\right) $ (10)

    where $ {\widehat{\mathrm{r}}}_{\mathrm{i}, \ \mathrm{t}} $ represents random noise variables with gamma densities. The sum of all random noise from all participants guarantees differential privacy due to the divisibility of the Laplace distribution, as described in Section 4.1.

    However, adding random noise $ \widehat{\mathrm{r}} $ is not adequate for ensuring the privacy of the data. As a result, the noisy data $ {\widehat{\mathrm{x}}}_{\mathrm{i}, \ \mathrm{t}} $ should be encrypted using the public key $ {\mathrm{p}\mathrm{k}}_{\mathrm{c}} $ of the requester to obtain

    $ {\mathrm{c}}_{\mathrm{i}, \mathrm{t}} = {\mathrm{E}\mathrm{n}\mathrm{c}}_{{\mathrm{p}\mathrm{k}}_{\mathrm{c}}} ( {\widehat{\mathrm{x}}}_{\mathrm{i}, \ \mathrm{t}} ) $ (11)

    Each ciphertext $ {\mathrm{c}}_{\mathrm{i}, \ \mathrm{t}} $ is signed with its corresponding signature $ {{\rm{ \mathsf{ σ} }}}_{\mathrm{i}, \ \mathrm{t}} $ (generated by the secure hash function H() using participants' private keys $ {\mathrm{s}\mathrm{k}}_{\mathrm{i}} $) to prevent tampering attacks and ensure data integrity as follows:

    $ {{\rm{ \mathsf{ σ} }}}_{\mathrm{i}, \mathrm{t}} = {\mathrm{H}\left(\mathrm{t}\right|\left|{\mathrm{c}}_{\mathrm{i}, \mathrm{t}}\right)}^{{\mathrm{s}\mathrm{k}}_{\mathrm{i}}} $ (12)

    To address fault tolerance, we use the proactive aggregation protocol based on the future ciphertext mechanism. Each participant $ {\mathrm{U}}_{\mathrm{i}} $ computes two kinds of ciphertext— $ {\mathrm{c}}_{\mathrm{i}, \ \mathrm{t}} $ for $ {\widehat{\mathrm{x}}}_{\mathrm{i}, \ \mathrm{t}} $ and a future ciphertext $ {\widehat{\mathrm{c}}}_{\mathrm{i}, \ \mathrm{t}} $ adapted from $ {\mathrm{c}}_{\mathrm{i}, \ \mathrm{t}} $ as follows:

    $ {\widehat{\mathrm{c}}}_{\mathrm{i}, \mathrm{t}+\mathrm{B}} = {\mathrm{E}\mathrm{n}\mathrm{c}}_{{\mathrm{p}\mathrm{k}}_{\mathrm{c}}} ( {\widehat{\mathrm{r}}}_{\mathrm{i}, \ \mathrm{t}+\mathrm{B}}+\mathrm{L}\mathrm{a}{\mathrm{p}}_{\mathrm{i}, \ \mathrm{t}+\mathrm{B}}\left({\rm{ \mathsf{ λ} }}\right) ) $ (13)

    We assume that the aggregator has a buffer memory ($ \mathrm{B} $) to store future ciphertexts for each node. In our design, the aggregator is the cloud, which has intensive storage. Each node $ \mathrm{i} $ sends its ciphertext $ {\mathrm{c}}_{\mathrm{i}, \ \mathrm{t}} $ at time $ \mathrm{t} $ and $ \mathrm{B} $ future ciphertexts $ {\widehat{\mathrm{c}}}_{\mathrm{i}, \ \mathrm{t}} $, $ {\widehat{\mathrm{c}}}_{\mathrm{i}, \ \mathrm{t}+1} $, $ {\widehat{\mathrm{c}}}_{\mathrm{i}, \ \mathrm{t}+2} $... $ {\widehat{\mathrm{c}}}_{\mathrm{i}, \ \mathrm{t}+\mathrm{B}-1} $, as shown in Figure 3. In the next iteration, each node sends two ciphertexts: The first ciphertext is the current ciphertext $ { \ \mathrm{c}}_{\mathrm{i}\mathrm{t}} $, and the second ciphertext is the future ciphertext $ {\widehat{\mathrm{c}}}_{\mathrm{i}, \ \mathrm{t}+\mathrm{B}} $ and the corresponding signature $ {{\rm{ \mathsf{ σ} }}}_{\mathrm{i}, \ \mathrm{t}} $. The purpose of a future ciphertext is to replace a given ciphertext if the cloud is unable to receive ciphertexts from the corresponding participant node. For increased efficiency, the HC controls the number of malfunctioning nodes using the parameter factor $ \mathrm{M} $.

    Figure 3.  Future ciphertext mechanism.

    Step 3: Verifying the correctness of the health data aggregation

    To ensure end-to-end verification, we use the HMAC function MAC. Each participant $ {\mathrm{U}}_{\mathrm{i}} $ signs the reported data with the corresponding homomorphic MAC value $ MAC $ ($ {\widehat{x}}_{i, t} $) and calculates the homomorphic MAC value for the future ciphertext $ \widehat{MAC} $ ($ {\widehat{r}}_{i, t+B} $). Participant $ {\mathrm{U}}_{\mathrm{i}} $ sends $ {c}_{i, t} $, $ {\widehat{c}}_{i, t+B } $, $ {\sigma }_{i, t} $, $ MAC $ ($ {\widehat{x}}_{i, t} $) and $ \widehat{MAC} $ ($ {\widehat{r}}_{i, t+B} $) to the cloud.

    Step 4: Data aggregation and verification

    After receiving all reports, the cloud verifies whether the received reports were obtained from the chosen participants for each ciphertext $ {c}_{i, t} $ using participants' public keys $ {pk}_{i} $ by checking

    $ e\left({\sigma }_{i, t}, {g}_{2}\right) = e\left(H\right(t|\left|{c}_{i, t}\right), {pk}_{i} ) $ (14)

    If the above equation is valid, then data integrity is guaranteed, and the cloud proceeds to compute the aggregation result $ \mu $. If not, a breach has occurred.

    $ \mu = \sum \limits_{i = 1, i \notin {U}_{F}}^{n}{c}_{i, t} $ (15)

    However, this equation does not consider fault tolerance. If some reports were not received by the cloud, the cloud cannot verify the received reports or obtain the aggregation results. For a more efficient and reliable schema, we modify the future ciphertext mechanism to enable users to set a preference configuration parameter M and resist the failure of a maximum of M participants out of N total participants.

    $ \mu = \sum _{i = 1, i \notin {U}_{F}}^{n}{c}_{i, t}+ \sum _{j = 1, j ϵ {U}_{F}}^{n}{\widehat{c}}_{j, t} $ (16)

    If the cloud does not receive the ciphertext $ \mathrm{c} $ from between one and M nodes, where HC can specify M, the cloud uses the future ciphertext $ {\widehat{\mathrm{c}}}_{ \ } $, which corresponds to the malfunctioning node from the buffer memory. If the number of malfunctioning nodes exceeds M, then the system is reinitialized to choose new medical health care nodes.

    To verify the correctness of the aggregation result, the cloud computes the corresponding homomorphic message authentication code MAC as follows:

    $ \sigma = \sum _{i = 1}^{n}MAC \left({\widehat{x}}_{i, t}\right) $ (17)

    If the number of participants who fail to send their data is less than M, the cloud verifies the correctness of the aggregation result as follows:

    $ \sigma = \sum _{i = 1, i \notin {U}_{F}}^{n}MAC \left({\widehat{x}}_{i, t}\right) + \sum _{j = 1, j ϵ {U}_{F}}^{n}\widehat{MAC} \left({\widehat{r}}_{j, t}\right) $ (18)

    The cloud forwards the results and the corresponding homomorphic MAC values $\{ {\rm{ \mathsf{ μ} }}, \ {\rm{ \mathsf{ σ} }}\} $ to the requester (the HC).

    Step 5: Decryption and verification of the results

    When HC receives $\{ {\rm{ \mathsf{ μ} }}, \ {\rm{ \mathsf{ σ} }}\} $ from the cloud, it derives the aggregation result $ \sum _{\mathrm{i} = 1}^{\mathrm{n}}{\mathrm{c}}_{\mathrm{i}, \ \mathrm{t}} $ by decrypting $ {\rm{ \mathsf{ μ} }} $ as follows:

    $ \sum _{i = 1}^{n}{{Dec}_{{sk}_{c}}(c}_{i, t}) = {Dec}_{{sk}_{c}}\left(\mu \right) $ (19)

    The HC verifies the correctness of the aggregation result obtained using the homomorphic MAC algorithm by checking

    $ MAC\left({Dec}_{{sk}_{c}}\left(\mu \right)\right)\genfrac{}{}{0pt}{}{?}{ = }\prod\limits _{i = 1}^{n}MAC\left({\widehat{x}}_{i, t}\right)+\prod\limits _{j = 1}^{n}\widehat{MAC}\left({\widehat{r}}_{j, t}\right) $ (20)

    If the verification fails, the HC rejects the results. Otherwise, the HC accepts the results.

    This section analyzes the security and privacy requirements satisfied by our proposed scheme. Moreover, we demonstrate how our proposed scheme resists different types of adversary models.

    In our scheme, health data $ {\mathrm{x}}_{\mathrm{i}} $ are encrypted using YASHE, which is indistinguishable under the chosen ciphertext attack (IND-CPA) and secure under the decisional-RLWE assumption [34]. It is impossible for any time-bounded adversary to decrypt the ciphertext and obtain the health data without the knowledge of the private key, which is known only by the HC.

    ●  Resilience against external attacks:

    Proof: The external adversary cannot eavesdrop on the ciphertext $ {c}_{i, \ t} $ and extract $ {x}_{i, \ t} $ successfully since he/she has no knowledge of $ t, q, f $ or $ e, h $. Such knowledge is impossible because f is held securely by participant $ {U}_{i} $ and $ e, h $ is privately held by the HC.

    During each time period t, the cloud can perform one of the above two types of queries. Both queries provide $ 2ϵ $ -differential privacy, where $ {\rm{ \mathsf{ λ} }} $ = $ GS/ϵ $ and $ GS $ is the global sensitivity of the aggregation result. Although the cloud uses the current and future ciphertexts to infer the sensing data $ {x}_{i, t}, $, i.e., $ {c}_{i, \ t} $ - $ {\widehat{c}}_{i, \ t} $ = $ {x}_{i, t}-La{p}_{i, t}\left({\rm{ \mathsf{ λ} }}\right), $ it also provides $ ϵ $ -differential privacy for the data $ {x}_{i, \ t} $ [12], as the Laplace distribution has a symmetric shape around its mean of zero. Therefore, during each time period, from the participants' perspective, our scheme provides $ 2ϵ $ -differential privacy based on its parallel composition and sequential composition properties. Furthermore, our scheme provides protection against human factor-aware differential aggregation (HAD) [36]. This type of attack aims to break individual privacy. Suppose there are three MWs $ {\mathrm{M}\mathrm{W}}_{1} $, $ { \ \mathrm{M}\mathrm{W}}_{2} $ and $ { \ \mathrm{M}\mathrm{W}}_{3} $, and the sensing data $ {x}_{1}, {x}_{2} $ of $ {\mathrm{M}\mathrm{W}}_{1} $, $ {\mathrm{M}\mathrm{W}}_{2} $, respectively, are stable at time slots $ {t}_{1} $ and $ {t}_{2} $. $ {\mathrm{M}\mathrm{W}}_{3} $ does not report any data at time slot $ {t}_{2} $. From Eqs (5), (9), (10) and (13), the aggregated results for $ {t}_{1} $ and $ {t}_{2} $ are $ {M}_{1} $ = $ \sum _{i = 1}^{3}{x}_{i, 1}+La{p}_{1}\left({\rm{ \mathsf{ λ} }}\right) $ and $ {M}_{2} $ = $ \sum _{i = 1}^{2}{x}_{i, 2}+La{p}_{2}\left({\rm{ \mathsf{ λ} }}\right)+La{p}_{\mathrm{3, 2}}\left({\rm{ \mathsf{ λ} }}\right) $, respectively. It is infeasible for the adversary to derive the sensing data $ {x}_{3} $ of $ {\mathrm{M}\mathrm{W}}_{3} $ at time slot $ {t}_{1} $ by comparing the aggregated result of $ {t}_{1} $ and $ {t}_{2} $ since $ {{M}_{1}-M}_{2} = {x}_{3, 1}-La{p}_{3, 1}\left({\rm{ \mathsf{ λ} }}\right) $.

    In our scheme, the cloud can easily detect if a report has been modified or interrupted by any adversary. Each report will be signed by a secure hash function at each time $ t $.

    ●  Resilience against modification attacks:

    Proof: Assume that the adversary modifies $ {c}_{i, t} $ and $ {\sigma }_{i, t} $ into $ {c}_{i, t}' $ and $ {\sigma }_{i, t}' $, respectively. The modified message passes the verification step if and only if $ {\sigma }_{i, t}' $ is guessed correctly. However, GDH group theory posits that it is infeasible for the adversary to determine $ {\sigma }_{i, t}' $ from $ \mathrm{e}\left({\sigma }_{i, t}', {g}_{2}\right) = e\left(H\left(t|\left|{c}_{i, t}'\right)\right), {pk}_{i}\right) $ since $ {G}_{1} $ is a GDH group. Additionally, for the given $ {\sigma }_{i, t}' $, it is impossible to extract $ {c}_{i, t}' $ from $ e\left({\sigma }_{i, t}', {g}_{2}\right) = e\left(H\right(t|\left|{c}_{i, t}'\right), {pk}_{i} $) due to the features of the secure hash function and GDH group.

    Therefore, when the adversary tries to transmit a modified message $ {c}_{i, t}' $ to the cloud, the modification can be detected by the cloud. As a result, our proposed scheme is resilient against modification attacks.

    ●  Resilience against impersonation attacks:

    Proof: To impersonate $ {\mathrm{U}}_{1} $, the adversary must know the private key $ {\mathrm{s}\mathrm{k}}_{\mathrm{i}} $. Using the public key $ {\mathrm{p}\mathrm{k}}_{\mathrm{i}} $ and the signature $ {{\rm{ \mathsf{ σ} }}}_{\mathrm{i}, \ \mathrm{t}} $ = $ {\mathrm{H} \ \left(\mathrm{t}\right|\left|{\mathrm{c}}_{\mathrm{i}, \ \mathrm{t}}\right)}^{{\mathrm{s}\mathrm{k}}_{\mathrm{i}}} $, it is intractable to find $ {\mathrm{s}\mathrm{k}}_{\mathrm{i}} $ in polynomial time due to the discrete logarithmic assumption in $ {\mathrm{G}}_{1} $.

    ●  Resilience against reply attacks:

    Proof: The adversary launches a reply attack by sending ciphertext $ {\mathrm{c}}_{\mathrm{i} \ } $ with the signature $ {{\rm{ \mathsf{ σ} }}}_{\mathrm{i}, 1} $ at time $ {\mathrm{t}}_{2} $, which has been used at time $ {\mathrm{t}}_{1} $, where ($ {\mathrm{t}}_{1}<{\mathrm{t}}_{2} $). This can be detected by the cloud since $ \mathrm{e}\left({{\rm{ \mathsf{ σ} }}}_{\mathrm{i}, 1}, {\mathrm{g}}_{2}\right) = \mathrm{e} \ \left(\mathrm{H} \ \right({\mathrm{t}}_{1}|\left|{\mathrm{c}}_{\mathrm{i}, 1}\right), {\mathrm{p}\mathrm{k}}_{\mathrm{i}} $).

    To achieve robustness and node failure resistance in our scheme, we utilize a future ciphertext mechanism that requires low memory expenses. In the case of node failure, the cloud can still compute the aggregation and allows the HC to verify the correctness of aggregation. This in turn guarantees fault tolerance and robustness.

    We use HMAC to ensure the correctness of the obtained aggregation result. First, during each time period t, the cloud computes the summation of the HMACs' $ {\rm{ \mathsf{ σ} }} $ for all received data and forwards the sum to the HC with the aggregation result $ {\rm{ \mathsf{ μ} }} $. Then, the HC computes the HMAC for the aggregation result $ {\rm{ \mathsf{ μ} }} $ and checks whether the equation below holds:

    $ MAC\left({Dec}_{{sk}_{c}}\left(\mu \right)\right)\genfrac{}{}{0pt}{}{?}{ = }\sigma $

    Therefore, if the adversary tampers with the aggregation result, this tampering can be detected by the HC. Moreover, Table 3 demonstrates a comparison between the security features of our proposed scheme and those of other works [10,12,26].

    Table 3.  Comparison of the security features of the proposed approach and related works.
    Features Our scheme Won et al. [10] Zhuo et al. [26] Chen et al. [12]
    PPR Yes Yes Yes Yes
    REX Yes No Yes No
    DPR Yes Yes No Yes
    RIM Yes No Yes -
    RMO Yes No Yes -
    RRE Yes No - -
    ROU Yes Yes No Yes
    Correctness of Verification Yes No Yes Yes
    Computation Delegation Yes No Yes No
    PPR: Privacy preservation. RMO: Resilience against modification attacks.
    REX: Resilience against external attacks. RRE: Resilience against reply attacks.
    DPR: Differential privacy. ROU: Robustness.
    RIM: Resilience against impersonation attacks.

     | Show Table
    DownLoad: CSV

    Our proposed scheme is implemented based on the homomorphic scheme developed by Lepoint and Naehrig [32] using the Fast Library for Number Theory (FLINT) arithmetic library and the GNU Multiple Precision (GMP) math library. Our simulation experiments and benchmark tests are executed on a laptop with an Intel core i5 processor, 6 GB of RAM and the Windows 7 (64-bit) operating system. We also implement the scheme of Won et al. [10] for comparison. The performance results are stated in terms of milliseconds.

    Additionally, we consider that the health data are manipulated by the patient's mobile phone (MW). Encryption is performed by the MW before sending the data to the cloud, and decryption is performed by the HC after sending the data to the cloud. The cloud receives the encrypted data, computes the summation of these encrypted data, and forwards the encrypted results to the HC. The size of the encrypted dataset is relatively small, as our scheme focuses on lightweight health data. Our simulation dataset is randomly generated from 35 to 42 human body temperature readings.

    First, we compare the key generation costs of an MW in our scheme with those in the scheme of Won et al. [10] by changing the security bit to examine the key generation costs at different security levels. Table 4 shows the parameter sets used in our benchmarks. We choose these parameters based on [37]. The comparison is shown in Figures 4 and 5 for the real-time and CPU time flags yielded by benchmark testing. For this comparison, we calculate the cost based on a group of 100 MWs. The graphs plotted in Figures 4 and 5 indicate that the required time for key generation in our scheme is lower than that in the scheme of Won et al. [10] in terms of both real time and CPU time. Note that the real time required for key generation in the scheme of Won et al. [10] is three times higher than that required for key generation in our scheme. Thus, our scheme is four times faster than that of Won et al. [10] based on CPU time flags. The key generation cost is critical to the MW, as a lower cost for key generation leads to a longer battery life.

    Table 4.  Key generation cost for different security levels.
    Security Bit
    Our Scheme Scheme of Won et al. [10]
    n q CPU Time (ms) Real Time (ms) CPU Time (ms) Real Time (ms)
    128 2048 54 0.115 0.346 0.960 1.917
    192 4096 75 0.151 0.395 0.985 1.982
    256 8192 118 0.185 0.469 0.995 2.078

     | Show Table
    DownLoad: CSV
    Figure 4.  Comparison of key generation costs in terms of real time.
    Figure 5.  Comparison of key generation costs in terms of CPU time.

    We also simulate the costs of encryption incurred by an MW when each group of our scheme has 100 participants (U) and compare the calculated costs with those of the scheme of Won et al. [10] at different security levels by changing the security bit. The simulation results are shown in Figures 6 and 7 for the real-time and CPU time flags yielded by benchmark testing. As shown in Figures 6 and 7, the encryption time of our scheme is superior to that of the scheme developed by Won et al. [10]. In terms of both real time and CPU time, the encryption cost of our scheme is six times lower than that of the scheme of Won et al. [10]. The low efficiency of the Won et al. [10] scheme is attributed to its encryption mechanism, where each participant in each time period t must communicate with all partners from the same group to exchange the secret keys $ {\boldsymbol{s}\boldsymbol{k}}_{\boldsymbol{i}, \ \boldsymbol{t}} $ to be used as the encryption key. To reduce the encryption cost in the scheme of Won et al. [10], we need to reduce the number of participants (U) in each group. This reduction would cause a decrease in the privacy level of the data and result in a reduced security level. The opportunity for the adversary to attack and disclose the data would then increase.

    Figure 6.  Comparison of encryption costs in real time for different security levels.
    Figure 7.  Comparison of encryption costs in CPU time for different security levels.

    To make our scheme more practical, we utilize the future ciphertext mechanism proposed by Won et al. [10] to guarantee fault tolerance at the expense of two main requirements. If failures occur, the cloud can still calculate the aggregation result and the corresponding data integrity verification value. To evaluate our fault tolerance protocol, we measure the closeness between the actual summation of the sequence of data and the noisy sum calculated using the root mean square error (RMSE). Figure 8 shows the simulation result of our proposed scheme, where p is the probability of failure for MWs. The error in our scheme is significantly lower than that in the scheme developed by Won et al. [10].

    Figure 8.  Comparison of encryption costs in real time yielded by changing the number of participants.

    We propose a fault-tolerant privacy-preserving cloud-based data aggregation scheme for lightweight health data. Our proposed scheme takes advantage of the numerous capabilities of the cloud by enabling an HC to delegate data aggregation tasks to the cloud. In our proposed scheme, we implement YASHE to protect the patient's identity and privacy, which enables the cloud to calculate the aggregation result with encrypted data. For differential privacy, we distribute noise among the MWs. Although our scheme enables the HC to verify the correctness of the aggregation result, our fault tolerance scheme is proactive and based on a future ciphertext mechanism. For increased efficiency, we enable the HC to control the number of acceptable malfunctioning nodes.

    Compared with the aggregation process in the scheme of Won et al. [10], that in our scheme has a lower aggregation error and is not affected by the number of malfunctioning nodes. In addition, the performance evaluation shows that the computational overhead is significantly reduced. Unlike the encryption time in the scheme of Won et al. [10], that in our scheme is not affected by the number of participants utilized. The simulation results demonstrate the efficiency and feasibility of our scheme. In future work, we will improve our scheme to support multifunctional health data aggregation. Additionally, we will apply batch verification instead of individually verifying the reported data, which will improve the performance of the scheme.

    This research project was supported by a grant from the Research Center of the Female Scientific and Medical Colleges, Deanship of Scientific Research, King Saud University.

    All authors declare no conflicts of interest in this paper.

    [1] Binnig G, Quate CF, Gerber C (1986) Atomic Force Microscope. Phys Rev Lett 56: 930–933. doi: 10.1103/PhysRevLett.56.930
    [2] Tessmer I, Kaur P, Lin JG, et al. (2013) Investigating bioconjugation by atomic force microscopy. J Nanobiotechnol 11: 1–17. doi: 10.1186/1477-3155-11-1
    [3] Werten PJL, Remigy HW, de Groot BL, et al. (2002) Progress in the analysis of membrane protein structure and function. FEBS Lett 529: 65–72. doi: 10.1016/S0014-5793(02)03290-8
    [4] Muller DJ, Sapra KT, Scheuring S, et al. (2006) Single-molecule studies of membrane proteins. Curr Opin Struct Biol 16: 489–495. doi: 10.1016/j.sbi.2006.06.001
    [5] Gorle S, Pan YG, Sun ZQ, et al. (2017) Computational Model and Dynamics of Monomeric Full-Length APOBEC3G. ACS Cent Sci 3: 1180–1188. doi: 10.1021/acscentsci.7b00346
    [6] Sander B, Tria G, Shkumatov AV, et al. (2013) Structural characterization of gephyrin by AFM and SAXS reveals a mixture of compact and extended states. Acta Crystallogr 69: 2050–2060.
    [7] Ishino S, Yamagami T, Kitamura M, et al. (2014) Multiple interactions of the intrinsically disordered region between the helicase and nuclease domains of the archaeal Hef protein. J Biol Chem 289: 21627–21639. doi: 10.1074/jbc.M114.554998
    [8] Shinozaki Y, Sumitomo K, Tsuda M, et al. (2009) Direct Observation of ATP-Induced Conformational Changes in Single P2X(4) Receptors. PLos Biol 7: e1000103. doi: 10.1371/journal.pbio.1000103
    [9] Lemaire PA, Tessmer I, Craig R, et al. (2006) Unactivated PKR exists in an open conformation capable of binding nucleotides. Biochemistry 45: 9074–9084. doi: 10.1021/bi060567d
    [10] Kinoshita E, van Rossum-Fikkert S, Sanchez H, et al. (2015) Human RAD50 makes a functional DNA-binding complex. Biochimie 113: 47–53. doi: 10.1016/j.biochi.2015.03.017
    [11] Bonazza K, Rottensteiner H, Seyfried BK, et al. (2014) Visualization of a protein-protein interaction at a single-molecule level by atomic force microscopy. Anal Bioanal Chem 406: 1411–1421. doi: 10.1007/s00216-013-7563-0
    [12] Shlyakhtenko LS, Gall AA, Filonov A, et al. (2003) Silatrane-based surface chemistry for immobilization of DNA, protein-DNA complexes and other biological materials. Ultramicroscopy 97: 279–287. doi: 10.1016/S0304-3991(03)00053-6
    [13] Scheuring S, Muller DJ, Ringler P, et al. (1999) Imaging streptavidin 2D crystals on biotinylated lipid monolayers at high resolution with the atomic force microscope. J Microsc 193: 28–35. doi: 10.1046/j.1365-2818.1999.00434.x
    [14] Whited AM, Park PS (2014) Atomic force microscopy: A multifaceted tool to study membrane proteins and their interactions with ligands. Biochim Biophys Acta 1838: 56–68. doi: 10.1016/j.bbamem.2013.04.011
    [15] Dubrovin EV, Schachtele M, Klinov DV, et al. (2017) Time-Lapse Single-Biomolecule Atomic Force Microscopy Investigation on Modified Graphite in Solution. Langmuir 33: 10027–10034. doi: 10.1021/acs.langmuir.7b02220
    [16] Oliveira Brett AM, Chiorcea Paquim AM (2005) DNA imaged on a HOPG electrode surface by AFM with controlled potential. Bioelectrochemistry 66: 117–124. doi: 10.1016/j.bioelechem.2004.05.009
    [17] Garcia R, Perez R (2002) Dynamic atomic force microscopy methods. Surf Sci Rep 47: 197–301. doi: 10.1016/S0167-5729(02)00077-8
    [18] Dufrene YF, Ando T, Garcia R, et al. (2017) Imaging modes of atomic force microscopy for application in molecular and cell biology. Nat Nanotechnol 12: 295–307. doi: 10.1038/nnano.2017.45
    [19] Hansma HG, Sinsheimer RL, Groppe J, et al. (1993) Recent Advances in Atomic-Force Microscopy of DNA. Scanning 15: 296–299. doi: 10.1002/sca.4950150509
    [20] Balamurugan S, Obubuafo A, Soper SA, et al. (2008) Surface immobilization methods for aptamer diagnostic applications. Anal Bioanal Chem 390: 1009–1021. doi: 10.1007/s00216-007-1587-2
    [21] Knecht S, Ricklin D, Eberle AN, et al. (2009) Oligohis-tags: Mechanisms of binding to Ni2+-NTA surfaces. J Mol Recognit 22: 270–279. doi: 10.1002/jmr.941
    [22] Ritzefeld M, Walhorn V, Anselmetti D, et al. (2013) Analysis of DNA interactions using single-molecule force spectroscopy. Amino Acids 44: 1457–1475. doi: 10.1007/s00726-013-1474-4
    [23] Rief M, Clausen-Schaumann H, Gaub HE (1999) Sequence-dependent mechanics of single DNA molecules. Nat Struct Biol 6: 346–349. doi: 10.1038/7582
    [24] Carrion-Vazquez M, Oberhauser AF, Fowler SB, et al. (1999) Mechanical and chemical unfolding of a single protein: A comparison. Proc Natl Acad Sci U. S. A 96: 3694–3699. doi: 10.1073/pnas.96.7.3694
    [25] Woodside MT, Block SM (2014) Reconstructing Folding Energy Landscapes by Single-Molecule Force Spectroscopy. Annu Rev Biophys 43: 19–39. doi: 10.1146/annurev-biophys-051013-022754
    [26] Hughes ML, Dougan L (2016) The physics of pulling polyproteins: A review of single molecule force spectroscopy using the AFM to study protein unfolding. Rep Prog Phys Phys Soc 79: 076601. doi: 10.1088/0034-4885/79/7/076601
    [27] Fisher TE, Marszalek PE, Fernandez JM (2000) Stretching single molecules into novel conformations using the atomic force microscope. Nat Struct Biol 7: 719–724. doi: 10.1038/78936
    [28] Beckwitt EC, Kong M, Van Houten B (2018) Studying protein-DNA interactions using atomic force microscopy. Semin Cell Dev Biol 73: 220–230. doi: 10.1016/j.semcdb.2017.06.028
    [29] Kasas S, Dietler G (2018) DNA-protein interactions explored by atomic force microscopy. Semin Cell Dev Biol 73: 231–239. doi: 10.1016/j.semcdb.2017.07.015
    [30] Lyubchenko YL, Shlyakhtenko LS (2016) Imaging of DNA and Protein-DNA Complexes with Atomic Force Microscopy. Crit Rev Eukaryot Gene Expr 26: 63–96. doi: 10.1615/CritRevEukaryotGeneExpr.v26.i1.70
    [31] Halford SE, Marko JF (2004) How do site-specific DNA-binding proteins find their targets? Nucleic Acids Res 32: 3040–3052. doi: 10.1093/nar/gkh624
    [32] Schneider R, Grosschedl R (2007) Dynamics and interplay of nuclear architecture, genome organization, and gene expression. Genes Dev 21: 3027–3043. doi: 10.1101/gad.1604607
    [33] Lambert SA, Jolma A, Campitelli LF, et al. (2018) The Human Transcription Factors. Cell 172: 650–665. doi: 10.1016/j.cell.2018.01.029
    [34] Sancar A, Lindsey-Boltz LA, Unsal-Kacmaz K, et al. (2004) Molecular mechanisms of mammalian DNA repair and the DNA damage checkpoints. Annu Rev Biochem 73: 39–85. doi: 10.1146/annurev.biochem.73.011303.073723
    [35] Kapanidis AN, Strick T (2009) Biology, one molecule at a time. Trends Biochem Sci 34: 234–243. doi: 10.1016/j.tibs.2009.01.008
    [36] Larson MH, Landick R, Block SM (2011) Single-Molecule Studies of RNA Polymerase: One Singular Sensation, Every Little Step It Takes. Mol Cell 41: 249–262. doi: 10.1016/j.molcel.2011.01.008
    [37] Ghodke H, Wang H, Hsieh CL, et al. (2014) Single-molecule analysis reveals human UV-damaged DNA-binding protein (UV-DDB) dimerizes on DNA via multiple kinetic intermediates. P Natl Acad Sci USA 111: E1862–E1871. doi: 10.1073/pnas.1323856111
    [38] Buechner CN, Maiti A, Drohat AC, et al. (2015) Lesion search and recognition by thymine DNA glycosylase revealed by single molecule imaging. Nucleic Acids Res 43: 2716–2729. doi: 10.1093/nar/gkv139
    [39] Bewley CA, Gronenborn AM, Clore GM (1998) Minor groove-binding architectural proteins: Structure, function, and DNA recognition. Annu Rev Biophys Biomol Struct 27: 105–131. doi: 10.1146/annurev.biophys.27.1.105
    [40] Perez-Rueda E, Hernandez-Guerrero R, Martinez-Nunez MA, et al. (2018) Abundance, diversity and domain architecture variability in prokaryotic DNA-binding transcription factors. PLos One 13: e0195332. doi: 10.1371/journal.pone.0195332
    [41] Richards FM, Kundrot CE (1988) Identification of Structural Motifs from Protein Coordinate Data-Secondary Structure and 1st-Level Supersecondary Structure. Proteins 3: 71–84. doi: 10.1002/prot.340030202
    [42] Shanahan HP, Garcia MA, Jones S, et al. (2004) Identifying DNA-binding proteins using structural motifs and the electrostatic potential. Nucleic Acids Res 32: 4732–4741. doi: 10.1093/nar/gkh803
    [43] Bustin M, Reeves R (1996) High-mobility-group chromosomal proteins: Architectural components that facilitate chromatin function. Prog Nucleic Acid Res Mol Biol 54: 35–100. doi: 10.1016/S0079-6603(08)60360-8
    [44] Smith NC, Matthews JM (2016) Mechanisms of DNA-binding specificity and functional gene regulation by transcription factors. Curr Opin Struct Biol 38: 68–74. doi: 10.1016/j.sbi.2016.05.006
    [45] Saravanan M, Vasu K, Nagaraja V (2008) Evolution of sequence specificity in a restriction endonuclease by a point mutation. Proc Natl Acad Sci U. S. A 105: 10344–10347. doi: 10.1073/pnas.0804974105
    [46] Ferredamare AR, Prendergast GC, Ziff EB, et al. (1993) Recognition by Max of Its Cognate DNA through a Dimeric B/Hlh/Z Domain. Nature 363: 38–45. doi: 10.1038/363038a0
    [47] Morgunova E, Yin Y, Jolma A, et al. (2015) Structural insights into the DNA-binding specificity of E2F family transcription factors. Nat Commun 6: 10050. doi: 10.1038/ncomms10050
    [48] Li J, Rodriguez JP, Niu F, et al. (2016) Structural basis for DNA recognition by STAT6. P Natl Acad Sci USA 113: 13015–13020. doi: 10.1073/pnas.1611228113
    [49] Rudolph MJ, Gergen JP (2001) DNA-binding by Ig-fold proteins. Nat Struct Biol 8: 384–386. doi: 10.1038/87531
    [50] Doherty AJ, Serpell LC, Ponting CP (1996) The helix-hairpin-helix DNA-binding motif: A structural basis for non-sequence-specific recognition of DNA. Nucleic Acids Res 24: 2488–2497. doi: 10.1093/nar/24.13.2488
    [51] Burgers PMJ, Kunkel TA (2017) Eukaryotic DNA Replication Fork. Annu Rev Biochem 86: 417–438. doi: 10.1146/annurev-biochem-061516-044709
    [52] Raghunathan S, Kozlov AG, Lohman TM, et al. (2000) Structure of the DNA binding domain of E-coli SSB bound to ssDNA. Nat Struct Biol 7: 648–652. doi: 10.1038/77943
    [53] Theis K, Chen PJ, Skorvaga M, et al. (1999) Crystal structure of UvrB, a DNA helicase adapted for nucleotide excision repair. EMBO J 18: 6899–6907. doi: 10.1093/emboj/18.24.6899
    [54] Waters TR, Eryilmaz J, Geddes S, et al. (2006) Damage detection by the UvrABC pathway: Crystal structure of UvrB bound to fluorescein-adducted DNA. FEBS Lett 580: 6423–6427. doi: 10.1016/j.febslet.2006.10.051
    [55] Chai N, Li WX, Wang J, et al. (2015) Structural basis for the Smad5 MH1 domain to recognize different DNA sequences. Nucleic Acids Res 43: 9051–9064. doi: 10.1093/nar/gkv848
    [56] Maiti A, Morgan MT, Pozharski E, et al. (2008) Crystal structure of human thymine DNA glycosylase bound to DNA elucidates sequence-specific mismatch recognition. P Natl Acad Sci USA 105: 8890–8895. doi: 10.1073/pnas.0711061105
    [57] Bruner SD, Norman DPG, Verdine GL (2000) Structural basis for recognition and repair of the endogenous mutagen 8-oxoguanine in DNA. Nature 403: 859–866. doi: 10.1038/35002510
    [58] Ando T, Kodera N, Takai E, et al. (2001) A high-speed atomic force microscope for studying biological macromolecules. P Natl Acad Sci USA 98: 12468–12472. doi: 10.1073/pnas.211400898
    [59] Ando T (2018) High-speed atomic force microscopy and its future prospects. Biophys Rev 10: 285–292. doi: 10.1007/s12551-017-0356-5
    [60] Sawicka M, Aramayo R, Ayala R, et al. (2017) Single-Particle Electron Microscopy Analysis of DNA Repair Complexes. Methods Enzymol 592: 159–186. doi: 10.1016/bs.mie.2017.03.010
    [61] Sun JC, Yuan ZN, Bai L, et al. (2017) Cryo-EM of dynamic protein complexes in eukaryotic DNA replication. Protein Sci 26: 40–51. doi: 10.1002/pro.3033
    [62] Tessmer I, Moore T, Lloyd RG, et al. (2005) AFM studies on the role of the protein RdgC in bacterial DNA recombination. J Mol Biol 350: 254–262. doi: 10.1016/j.jmb.2005.04.043
    [63] Lohr D, Bash R, Wang H, et al. (2007) Using atomic force microscopy to study chromatin structure and nucleosome remodeling. Methods 41: 333–341. doi: 10.1016/j.ymeth.2006.08.016
    [64] Hizume K, Kominam H, Kobayash K, et al. (2017) Flexible DNA Path in the MCM Double Hexamer Loaded on DNA. Biochemistry 56: 2435–2445. doi: 10.1021/acs.biochem.6b00922
    [65] Noort JV, Heijden TVD, Dutta CF, et al. (2004) Initiation of translocation by Type I restriction-modification enzymes is associated with a short DNA extrusion. Nucleic Acids Res 32: 6540–6547. doi: 10.1093/nar/gkh999
    [66] Maurer S, Fritz J, Muskhelishvili G, et al. (2006) RNA polymerase and an activator form discrete subcomplexes in a transcription initiation complex. EMBO J 25: 3784–3790. doi: 10.1038/sj.emboj.7601261
    [67] Verhoeven EEA, Wyman C, Moolenaar GF, et al. (2002) The presence of two UvrB subunits in the UvrAB complex ensures damage detection in both DNA strands. EMBO J 21: 4196–4205. doi: 10.1093/emboj/cdf396
    [68] Cellai S, Mangiarotti L, Vannini N, et al. (2007) Upstream promoter sequences and alpha CTD mediate stable DNA wrapping within the RNA polymerase-promoter open complex. EMBO Rep 8: 271–278. doi: 10.1038/sj.embor.7400888
    [69] Umemura K, Okada T, Kuroda R (2005) Cooperativity and intermediate structures of single-stranded DNA binding-assisted RecA-single-stranded DNA complex formation studied by atomic force microscopy. Scanning 27: 35–43.
    [70] Hamon L, Pastre D, Dupaigne P, et al. (2007) High-resolution AFM imaging of single-stranded DNA-binding (SSB) protein-DNA complexes. Nucleic Acids Res 35: e58. doi: 10.1093/nar/gkm147
    [71] Li BS, Goh MC (2010) Direct visualization of the formation and structure of RecA/dsDNA complexes. Micron 41: 227–231. doi: 10.1016/j.micron.2009.10.011
    [72] Li BS, Wei B, Goh MC (2012) Direct visualization of the formation of RecA/dsDNA complexes at the single-molecule level. Micron 43: 1073–1075. doi: 10.1016/j.micron.2012.04.016
    [73] Tessmer I, Melikishvili M, Fried MG (2012) Cooperative cluster formation, DNA bending and base-flipping by O6-alkylguanine-DNA alkyltransferase. Nucleic Acids Res 40: 8296–8308. doi: 10.1093/nar/gks574
    [74] Miyagi A, Ando T, Lyubchenko YL (2011) Dynamics of Nucleosomes Assessed with Time-Lapse High-Speed Atomic Force Microscopy. Biochemistry 50: 7901–7908. doi: 10.1021/bi200946z
    [75] Sanchez H, Suzuki Y, Yokokawa M, et al. (2011) Protein-DNA interactions in high speed AFM: Single molecule diffusion analysis of human RAD54. Integr Biol 3: 1127–1134. doi: 10.1039/c1ib00039j
    [76] Endo M, Sugiyama H (2014) Single-Molecule Imaging of Dynamic Motions of Biomolecules in DNA Origami Nanostructures Using High-Speed Atomic Force Microscopy. Acc Chem Res 47: 1645–1653. doi: 10.1021/ar400299m
    [77] Lee AJ, Endo M, Hobbs JK, et al. (2018) Direct Single-Molecule Observation of Mode and Geometry of RecA-Mediated Homology Search. Acs Nano 12: 272–278. doi: 10.1021/acsnano.7b06208
    [78] Suzuki Y, Shin M, Yoshida A, et al. (2012) Fast microscopical dissection of action scenes played by Escherichia coli RNA polymerase. FEBS Lett 586: 3187–3192. doi: 10.1016/j.febslet.2012.06.033
    [79] Buechner CN, Tessmer I (2013) DNA substrate preparation for atomic force microscopy studies of protein-DNA interactions. J Mol Recognit 26: 605–617. doi: 10.1002/jmr.2311
    [80] Yang Y, Sass LE, Du C, et al. (2005) Determination of protein-DNA binding constants and specificities from statistical analyses of single molecules: MutS-DNA interactions. Nucleic Acids Res 33: 4322–4334. doi: 10.1093/nar/gki708
    [81] Sukhanova MV, Abrakhi S, Joshi V, et al. (2016) Single molecule detection of PARP1 and PARP2 interaction with DNA strand breaks and their poly(ADP-ribosyl)ation using high-resolution AFM imaging. Nucleic Acids Res 44: e60. doi: 10.1093/nar/gkv1476
    [82] Buechner CN, Heil K, Michels G, et al. (2014) Strand-specific Recognition of DNA Damages by XPD Provides Insights into Nucleotide Excision Repair Substrate Versatility. J Biol Chem 289: 3613–3624. doi: 10.1074/jbc.M113.523001
    [83] Wirth N, Gross J, Roth HM, et al. (2016) Conservation and Divergence in Nucleotide Excision Repair Lesion Recognition. J Biol Chem 291: 18932–18946. doi: 10.1074/jbc.M116.739425
    [84] Doniselli N, Rodriguez-Aliaga P, Amidani D, et al. (2015) New insights into the regulatory mechanisms of ppGpp and DksA on Escherichia coli RNA polymerase-promoter complex. Nucleic Acids Res 43: 5249–5262. doi: 10.1093/nar/gkv391
    [85] Nettikadan S, Tokumasu F, Takeyasu K (1996) Quantitative analysis of the transcription factor AP2 binding to DNA by atomic force microscopy. Biochem Biophys Res Commun 226: 645–649. doi: 10.1006/bbrc.1996.1409
    [86] Timofeeva OA, Chasovskikh S, Lonskaya I, et al. (2012) Mechanisms of Unphosphorylated STAT3 Transcription Factor Binding to DNA. J Biol Chem 287: 14192–14200. doi: 10.1074/jbc.M111.323899
    [87] Zhang P, Xia JH, Zhu J, et al. (2018) High-throughput screening of prostate cancer risk loci by single nucleotide polymorphisms sequencing. Nat Commun 9: 2022. doi: 10.1038/s41467-018-04451-x
    [88] Huang Q, Whitington T, Gao P, et al. (2014) A prostate cancer susceptibility allele at 6q22 increases RFX6 expression by modulating HOXB13 chromatin binding. Nat Genet 46: 126–135. doi: 10.1038/ng.2862
    [89] Gao P, Xia JH, Sipeky C, et al. (2018) Biology and Clinical Implications of the 19q13 Aggressive Prostate Cancer Susceptibility Locus. Cell 174: 576–589. doi: 10.1016/j.cell.2018.06.003
    [90] Crampton N, Bonass WA, Kirkham J, et al. (2006) Collision events between RNA polymerases in convergent transcription studied by atomic force microscopy. Nucleic Acids Res 34: 5416–5425. doi: 10.1093/nar/gkl668
    [91] Countryman P, Fan Y, Gorthi A, et al. (2018) Cohesin SA2 is a sequence-independent DNA-binding protein that recognizes DNA replication and repair intermediates. J Biol Chem 293: 1054–1069. doi: 10.1074/jbc.M117.806406
    [92] Schneider SW, Larmer J, Henderson RM, et al. (1998) Molecular weights of individual proteins correlate with molecular volumes measured by atomic force microscopy. Pflug Arch Eur J Phy 435: 362–367. doi: 10.1007/s004240050524
    [93] Ratcliff GC, Erie DA (2001) A novel single-molecule study to determine protein-protein association constants. J Am Chem Soc 123: 5632–5635. doi: 10.1021/ja005750n
    [94] Wang H, Dellavecchia MJ, Skorvaga M, et al. (2006) UvrB domain 4, an autoinhibitory gate for regulation of DNA binding and ATPase activity. J Biol Chem 281: 15227–15237. doi: 10.1074/jbc.M601476200
    [95] Roth HM, Romer J, Grundler V, et al. (2012) XPB helicase regulates DNA incision by the Thermoplasma acidophilum endonuclease Bax1. DNA Repair 11: 286–293. doi: 10.1016/j.dnarep.2011.12.002
    [96] Fuentes-Perez ME, Dillingham MS, Moreno-Herrero F (2013) AFM volumetric methods for the characterization of proteins and nucleic acids. Methods 60: 113–121. doi: 10.1016/j.ymeth.2013.02.005
    [97] Amidani D, Tramonti A, Canosa AV, et al. (2016) Study of DNA binding and bending by Bacillus subtilis GabR, a PLP-dependent transcription factor. Biochim Biophys Acta Gen Subj 1861: 3474–3489.
    [98] Rivetti C, Guthold M, Bustamante C (1996) Scanning force microscopy of DNA deposited onto mica: Equilibration versus kinetic trapping studied by statistical polymer chain analysis. J Mol Biol 264: 919–932. doi: 10.1006/jmbi.1996.0687
    [99] Cassina V, Manghi M, Salerno D, et al. (2016) Effects of cytosine methylation on DNA morphology: An atomic force microscopy study. Biochim Biophys Acta 1860: 1–7. doi: 10.1016/j.bbagen.2015.10.006
    [100] Scipioni A, Anselmi C, Zuccheri G, et al. (2002) Sequence-dependent DNA curvature and flexibility from scanning force microscopy images. Biophys J 83: 2408–2418. doi: 10.1016/S0006-3495(02)75254-5
    [101] Moukhtar J, Faivre-Moskalenko C, Milani P, et al. (2010) Effect of genomic long-range correlations on DNA persistence length: from theory to single molecule experiments. J Phys Chem B 114: 5125–5143. doi: 10.1021/jp911031y
    [102] Jager MD, Noort JV, Gent DCV, et al. (2001) Human Rad50/Mre11 is a flexible complex that can tether DNA ends. Mol Cell 8: 1129–1135. doi: 10.1016/S1097-2765(01)00381-1
    [103] Tessmer I, Yang Y, Zhai J, et al. (2008) Mechanism of MutS searching for DNA mismatches and signaling repair. J Biol Chem 283: 36646–36654. doi: 10.1074/jbc.M805712200
    [104] Bosch D, Campillo M, Pardo L (2003) Binding of proteins to the minor groove of DNA: What are the structural and energetic determinants for kinking a basepair step? J Comput Chem 24: 682–691. doi: 10.1002/jcc.10200
    [105] Kong MW, Liu LL, Chen XJ, et al. (2016) Single-Molecule Imaging Reveals that Rad4 Employs a Dynamic DNA Damage Recognition Process. Mol Cell 64: 376–387. doi: 10.1016/j.molcel.2016.09.005
    [106] Wang H, Yang Y, Schofield MJ, et al. (2003) DNA bending and unbending by MutS govern mismatch recognition and specificity. Proc Natl Acad Sci U. S. A 100: 14822–14827. doi: 10.1073/pnas.2433654100
    [107] Chen L, Haushalter KA, Lieber CM, et al. (2002) Direct visualization of a DNA glycosylase searching for damage. Chem Biol 9: 345–350. doi: 10.1016/S1074-5521(02)00120-5
    [108] Lamers MH, Perrakis A, Enzlin JH, et al. (2000) The crystal structure of DNA mismatch repair protein MutS binding to a G center dot T mismatch. Nature 407: 711–717. doi: 10.1038/35037523
    [109] Koroleva ON, Dubrovin EV, Yaminsky IV, et al. (2016) Effect of DNA bending on transcriptional interference in the systems of closely spaced convergent promoters. Biochim Biophys Acta 1860: 2086–2096. doi: 10.1016/j.bbagen.2016.06.026
    [110] Fronczek DN, Quammen C, Wang H, et al. (2011) High accuracy FIONA-AFM hybrid imaging. Ultramicroscopy 111: 350–355. doi: 10.1016/j.ultramic.2011.01.020
    [111] Sanchez H, Kertokalio A, van Rossum-Fikkert S, et al. (2013) Combined optical and topographic imaging reveals different arrangements of human RAD54 with presynaptic and postsynaptic RAD51-DNA filaments. P Natl Acad Sci USA 110: 11385–11390. doi: 10.1073/pnas.1306467110
    [112] Frederickx W, Rocha S, Fujita Y, et al. (2018) Orthogonal Probing of Single-Molecule Heterogeneity by Correlative Fluorescence and Force Microscopy. Acs Nano 12: 168–177. doi: 10.1021/acsnano.7b05405
    [113] Schmucker SW, Kumar N, Abelson JR, et al. (2012) Field-directed sputter sharpening for tailored probe materials and atomic-scale lithography. Nat Commun 3: 935. doi: 10.1038/ncomms1907
    [114] Pfreundschuh M, Alsteens D, Hilbert M, et al. (2014) Localizing Chemical Groups while Imaging Single Native Proteins by High-Resolution Atomic Force Microscopy. Nano Lett 14: 2957–2964. doi: 10.1021/nl5012905
    [115] Monig H, Hermoso DR, Arado OD, et al. (2016) Submolecular Imaging by Noncontact Atomic Force Microscopy with an Oxygen Atom Rigidly Connected to a Metallic Probe. Acs Nano 10: 1201–1209. doi: 10.1021/acsnano.5b06513
    [116] Senapati S, Lindsay S (2016) Recent Progress in Molecular Recognition Imaging Using Atomic Force Microscopy. Acc Chem Res 49: 503–510. doi: 10.1021/acs.accounts.5b00533
    [117] Khan Z, Leung C, Tahir BA, et al. (2010) Digitally tunable, wide-band amplitude, phase, and frequency detection for atomic-resolution scanning force microscopy. Rev Sci Instrum 81: 197.
    [118] Calo A, Eleta-Lopez A, Stoliar P, et al. (2016) Multifrequency Force Microscopy of Helical Protein Assembly on a Virus. Sci Rep 6: 21899. doi: 10.1038/srep21899
    [119] Wu D, Kaur P, Li ZM, et al. (2016) Visualizing the Path of DNA through Proteins Using DREEM Imaging. Mol Cell 61: 315–323. doi: 10.1016/j.molcel.2015.12.012
  • This article has been cited by:

    1. Soufiene Ben Othman, Faris A. Almalki, Chinmay Chakraborty, Hedi Sakli, Privacy-preserving aware data aggregation for IoT-based healthcare with green computing technologies, 2022, 101, 00457906, 108025, 10.1016/j.compeleceng.2022.108025
    2. Chinmay Chakraborty, Soufiene Ben Othman, Faris A. Almalki, Hedi Sakli, FC-SEEDA: fog computing-based secure and energy efficient data aggregation scheme for Internet of healthcare Things, 2023, 0941-0643, 10.1007/s00521-023-08270-0
    3. S. P. Rathinaeswari, V. Santhi, An Intelligent Lightweight Signing Signature and Secured Jellyfish Data Aggregation (LS3JDA) Based Privacy Preserving Model in Cloud, 2024, 42, 0288-3635, 911, 10.1007/s00354-024-00263-4
  • Reader Comments
  • © 2018 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
通讯作者: 陈斌, bchen63@163.com
  • 1. 

    沈阳化工大学材料科学与工程学院 沈阳 110142

  1. 本站搜索
  2. 百度学术搜索
  3. 万方数据库搜索
  4. CNKI搜索

Metrics

Article views(6766) PDF downloads(2190) Cited by(7)

Figures and Tables

Figures(8)

Other Articles By Authors

/

DownLoad:  Full-Size Img  PowerPoint
Return
Return

Catalog