
Nanoparticle-based delivery platforms for mRNA vaccine development

  • Conventional vaccines have saved millions of lives, and new vaccines have also been developed; however, an urgent need for an efficient vaccine against SARS-CoV-2 showed us that vaccine development technologies should be improved more to obtain prophylactic agents rapidly during pandemic diseases. One of the next-generation vaccine technologies is utilization of mRNA molecules encoding antigens. The mRNA vaccines offer many advantages compared to conventional and other subunit vaccines. For instance, mRNA vaccines are relatively safe since they do not cause disease and mRNA does not integrate into the genome. mRNA vaccines also provide diverse types of immune responses resulting in the activation of CD4+ and CD8+ T cells. However, utilization of mRNA molecules also has some drawbacks such as degradation by ubiquitous nucleases in vivo. Nanoparticles (NPs) are delivery platforms that carry the desired molecule, a drug or a vaccine agent, to the target cell such as antigen presenting cells in the case of vaccine development. NP platforms also protect mRNA molecules from the degradation by nucleases. Therefore, efficient mRNA vaccines can be obtained via utilization of NPs in the formulation. Although lipid-based NPs are widely preferred in vaccine development due to the nature of cell membrane, there are various types of other NPs used in vaccine formulations, such as virus-like particles (VLPs), polymers, polypeptides, dendrimers or gold NPs. Improvements in the NP delivery technologies will contribute to the development of mRNA vaccines with higher efficiency.

    Citation: Sezer Okay, Öznur Özge Özcan, Mesut Karahan. Nanoparticle-based delivery platforms for mRNA vaccine development[J]. AIMS Biophysics, 2020, 7(4): 323-338. doi: 10.3934/biophy.2020023



    Rainfall has played an important role in the development and maintenance of human civilizations. Rain is one of the most important sources of fresh water, replenishing the groundwater that is the main source of drinking water. Since more than 50% of Australia's land mass is used for agriculture, an accurate rainfall forecasting system can help farmers plan cropping operations, i.e., when to sow seeds, apply fertilizers, and harvest crops. Rainfall prediction [1] can also help farmers decide which crops to plant for maximum harvests and profits. In addition, precipitation plays an important role in the planning and operation of water reservoirs, such as dams that generate hydroelectric power. About half of the renewable energy generated by more than 120 hydropower plants in Australia depends on precipitation. With accurate rainfall forecasts, operators know when to store water and when to release it to avoid flooding, and when to conserve it in places with low rainfall. Precipitation forecasts also play a critical role in aviation from the moment an aircraft starts its engines: an accurate forecast helps plan flight routes and suggests the right time to take off and land to ensure physical and economic safety, since aircraft operations can be seriously affected by lightning, icing, turbulence, thunderstorm activity and more. According to [2], weather is a major factor in aviation accidents, accounting for 23% of accidents worldwide.

    Numerous studies have shown that the duration and intensity of rainfall can cause major weather-related disasters such as floods and droughts. AON's annual weather report shows that seasonal flooding in China from June to September 2020 resulted in an estimated economic loss of $35 billion and a large number of deaths [3]. In addition, rainfall also has a negative impact on the mining industry, as heavy and unpredictable rainfall can disrupt mining activities. For example, the Bowen Basin in Queensland hosts some of Australia's largest coal reserves. The summer rains of 2010–2011 severely impacted mining operations: an estimated 85% of coal mines in Queensland had their operations disrupted as a result (Queensland Flood Commission, 2012) [4,5]. As of May 2011, the Queensland coal mining sector had recovered only 75% of its pre-flood production and had lost $5.7 billion. As a result, rainfall forecasts are becoming increasingly important in developing preventive measures to minimize the impact of such disasters.

    Predicting rainfall is challenging because it involves the study of various natural phenomena such as temperature, humidity, wind speed, wind direction, cloud cover, sunlight, and more. Accurate rainfall forecasts are therefore critical in areas such as energy and agriculture. A report produced by Australia's National Climate Change Adaptation Research Facility examined the impacts of extreme weather events. It states that the weather forecasts currently available to industry are inadequate: they lack location-specific information and other details that would enable risk management and targeted planning. Traditional weather forecasting relies on measurements from a variety of instruments and on numerical models to predict heavy rainfall, and these predictions are sometimes inaccurate. The Australian Bureau of Meteorology currently uses the Australian Predictive Ocean Atmosphere Model (POAMA) to predict rainfall patterns [6]. POAMA is a seasonal prediction model that produces outlooks ranging from several weeks to entire seasons throughout the year; it assimilates ocean, atmosphere, ice, and land data to generate forecasts up to nine months ahead. In this work, we use machine learning and deep learning methods [7,8], which detect complex patterns in historical data, to effectively and accurately predict the occurrence of rainfall. The application of these methods requires accurate historical data, the presence of detectable patterns, and the continuation of those patterns into the future where predictions are sought.

    Several classification algorithms such as Random Forest [9], Naive Bayes [10], Logistic Regression [11], Decision Tree [12], XGBoost [13], and others have been studied for rainfall prediction. However, the effectiveness of these algorithms varies depending on the combination of preprocessing and data-cleaning techniques, feature scaling, data normalization, training parameters, and train-test splitting, leaving room for improvement. The goal of this paper is to provide a customized set of these techniques to train machine learning [14] and deep learning [15] models that provide the most accurate results for rainfall prediction. The models are trained and tested on the Australian rainfall dataset using the proposed approach. The dataset contains records from 49 metropolitan areas over a 10-year period starting December 1, 2008. The research contributions of the proposed work are as follows:

    1) To remove outliers using Inter Quartile Range (IQR).

    2) To balance the data using Synthetic Minority Oversampling Technique (SMOTE) technique.

    3) To apply both classification and regression models, which first predict whether it will rain or not and, if so, estimate the amount of rain.

    4) To apply XGBoost, Random Forest, Kernel SVM, and Long Short-Term Memory (LSTM) for the classification task.

    5) To apply Multiple Linear Regressors, XGBoost, Polynomial Regressor, Random Forest Regressor, and LSTM for the regression task.

    Luk et al. [16] addressed watershed management and flood control. The goal was to accurately predict the temporal and local distribution of rainfall and the amount of water and quality management. The dataset used to train the ML models was collected from the Upper Parramatta River Catchment Trust (UPRCT), Sydney. Three methods were used for modeling features related to rainfall prediction, namely MultiLayer FeedForward Network (MLFN), Partial Recurrent Neural Network (PRNN) and Time Delay Neural Network (TDNN). The main parameters for the above methods were lag, window size and number of hidden nodes.

    Abhishek et al. [17] worked on developing effective, nonlinear ANN-based models for accurate prediction of the maximum temperature 365 days a year. The data used were from the Toronto Lester B. Pearson Int'l A station, Ontario, Canada from 1999–2009. They proposed two models trained with the Levenberg-Marquardt algorithm, one with 5 hidden layers and one with 10 hidden layers. Factors that affected the results were the number of neurons, sampling, hidden layers, transfer function, and overfitting.

    Abhishek et al. [18] performed a regression task to predict average rainfall using a feed forward network trained with the back propagation algorithm, the layer recurrent network, and the feed forward network trained with the cascaded back propagation algorithm for a large number of neurons. The data were collected from www.Indiastat.com and the IMD website. The dataset contains records for the months of April to November from 1960 to 2010 in Udupi district of Karnataka.

    Saba et al. [19] worked on accurate weather predictions using a hybrid neural network model combining MultiLayer Perceptron (MLP) and Radial Basis Function (RBF). The dataset used was from the weather station in Saudi Arabia. They proposed an extended hybrid neural network approach and compared the results of individual neural networks with those of hybrid neural networks. The results showed that hybrid neural network models have greater learning ability and better generalization ability for certain sets of inputs and nodes.

    Biswas et al. [20] focused on the prediction of weather conditions (good or bad) using the classification method. Naive Bayes and chi-square algorithm were used for classification. The main objective was to show that data mining approaches are sufficient for weather prediction. Data was obtained in real time from users and stored in a database. The decision tree generated from the training features is used for classification.

    Basha et al. [21] introduced a machine and deep learning-based rain prediction model. This model uses a Kaggle dataset to train various models, including a Support Vector Regressor, Autoregressive Integrated Moving Average (ARIMA), and a Neural Network. The authors report a Root Mean Squared Error (RMSE) of 0.72.

    Doroshenko et al. [22] worked on refining numerical weather forecasts with a neural network trained on the forecast error, to increase the accuracy of the 2 m temperature forecasts of the regional model COSMO. The dataset was obtained from the Kyiv weather station in Ukraine. The authors chose the gated recurrent unit (GRU) approach because the forecast error forms a time series, and a GRU has fewer parameters than an LSTM. When a lower error history is chosen, better fitting and refinement of the model is possible.

    Appiah-Badu et al. [23] conducted a study to predict the occurrence of rainfall through classification. They employed several classification algorithms, including Decision Tree (DT), Random Forest (RF), Multilayer Perceptron (MLP), Extreme Gradient Boosting (XGB), and K-Nearest Neighbor (KNN). The data for the study was collected from the Ghana Meteorological Agency from 1980 to 2019 and was divided into four ecological zones: Coastal, Forest, Transitional, and Savannah.

    Raval et al. [24] worked on a classification task to predict tomorrow's rain using logistic regression, LDA, KNN, and many other models and compared their metrics. They used a dataset containing daily 10-year weather forecasts from most Australian weather stations. It was found that deep learning models produced the best results.

    Ridwan et al. [25] proposed a rainfall prediction model for Malaysia. The model was trained using a dataset of ten stations and employs both a Neural Network Regressor and Decision Forest Regression (DFR). The authors claim that the R2 score ranges from 0.5 to 0.9. This approach only predicts rainfall and does not perform any classification tasks.

    Adaryani et al. [26] conducted an analysis of short-term rainfall forecasting for applications in hydrologic modeling and flood warning. They compared the performance of PSO Support Vector Regression (PSO-SVR), Long Short-Term Memory (LSTM), and Convolutional Neural Network (CNN). The study considered 5-minute and 15-minute ahead forecast models of rainfall depth based on data from the Niavaran station in Tehran, Iran.

    Fahad et al. [27] conducted a study on forecasting rainfall through the use of a deep forecasting model based on Gated Recurrent Unit (GRU) Neural Network. The study analyzed 30 years of climatic data (1991–2020) in Pakistan, considering both positive and negative impacts of temperature and gas emissions on rainfall. The findings of the study have potential implications for disaster management institutions.

    Tables 1 and 2 compare various state-of-the-art approaches for classification as well as regression tasks respectively.

    Table 1.  Literature review for classification.
    Authors | Dataset | Approach used | Best performance
    Raval et al. (2021) [24] | Daily weather observations from several Australian weather stations for 10 years | Logistic regression, Linear discriminant analysis, Quadratic discriminant analysis, K-Nearest neighbor, Decision tree, Gradient boosting, Random forest, Bernoulli Naïve Bayes, Deep learning model | Precision = 98.26, F1-Score = 88.61
    Appiah-Badu et al. (2021) [23] | Data from the 22 synoptic stations across the four ecological zones of Ghana from 1980–2019 | Decision tree, Multilayer perceptron, Random forest, Extreme gradient boosting, K-Nearest neighbor | Precision = 100, Recall = 96.03, F1-Score = 97.98

    Table 2.  Literature review for regression.
    Authors | Dataset | Approach used | Limitations | Best performance
    Luk et al. (2001) [16] | Data collected from the Upper Parramatta River Catchment Trust (UPRCT), Sydney | Multi-Layer Feedforward Network (MLFN), Partial Recurrent Neural Network (PRNN), Time Delay Neural Network (TDNN) | Only used a regression model to predict the amount of rainfall | NMSE = 0.63
    Abhishek et al. (2012) [17] | Data for the station Toronto Lester B. Pearson Int'l A, Ontario, Canada, 1999–2009 | Single-layer model, 5-hidden-layer model, 10-hidden-layer model | No sequential model used to capture the time-series nature of the data | MSE = 2.75
    Saba et al. (2017) [19] | Data from a Saudi Arabian weather forecasting station | Hybrid model (MLP + RBF) | No time-series model; only a regression model to predict the amount of rainfall | Correlation coefficient = 0.95, RMSE = 146, Scatter Index = 0.61
    Basha et al. (2020) [21] | Kaggle rainfall prediction dataset | Support Vector Regressor, Autoregressive Integrated Moving Average (ARIMA), Neural Network | Trained on a small dataset; no oversampling techniques used to increase the dataset size | RMSE = 0.72


    Here, NMSE stands for Normalized Mean Square Error, which allows us to compare the error across sets with different value ranges. The plain Mean Squared Error (MSE) yields larger errors for sets with larger values, even when the relative error of a set with smaller values is actually greater. For example, if set 1 contains values ranging from 1–100 and set 2 contains values ranging from 1000–10,000, the MSE of set 2 will be higher even if the relative error of set 1 is greater. NMSE addresses this by dividing each set by the maximum value in its range, mapping both sets to the range 0 to 1 for a fair comparison.
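    As an illustration of this point, the following sketch (with made-up values, normalizing by the maximum of the true series as described) contrasts MSE and NMSE on two sets with different ranges but the same relative error:

```python
import numpy as np

def mse(y_true, y_pred):
    """Plain mean squared error."""
    return float(np.mean((y_true - y_pred) ** 2))

def nmse(y_true, y_pred):
    """Normalize both series by the maximum of the true values,
    then compute MSE, so sets with different ranges are comparable."""
    scale = np.max(np.abs(y_true))
    return mse(y_true / scale, y_pred / scale)

# Two sets with very different ranges but the same 10% relative error.
small_true = np.array([1.0, 50.0, 100.0])
small_pred = small_true * 1.1
large_true = small_true * 100          # values range up to 10,000
large_pred = large_true * 1.1

# Plain MSE makes the large-valued set look far worse,
# while NMSE reports the same normalized error for both.
```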

    The dataset we used for our study contains daily observations over a period of 10 years, from 1/12/2008 to 26/04/2017, from 49 different locations across Australia [28]. The dataset contains 23 features: Date, Location, MinTemp, MaxTemp, Rainfall, Evaporation, Sunshine, WindGustDir, WindGustSpeed, WindDir9am, WindDir3pm, WindSpeed9am, WindSpeed3pm, Humidity9am, Humidity3pm, Pressure9am, Pressure3pm, Cloud9am, Cloud3pm, Temp9am, Temp3pm, RainToday and RainTomorrow. The dataset contains around 145 thousand entries.

    For the classification task, RainTomorrow is the target variable, indicating the occurrence of rainfall on the next day: 0 indicates no rain, and 1 indicates rain. For the regression task, Rainfall is the target variable, giving the amount of precipitation in millimeters. We performed exploratory data analysis on the dataset, which is key to gaining confidence in the validity of later results. This analysis helps us look for anomalies in the data, find correlations between features, and check for missing values, all of which improve the outcomes of the machine learning models. Table 3 presents the share of null values in the raw dataset. Most of the attributes contain null values, which need to be addressed carefully before the data is used to train the model; otherwise, the model will not give accurate predictions.

    Table 3.  Null values of attributes in the dataset.
    Attribute | Null values | Attribute | Null values | Attribute | Null values
    Date | 0.0% | WindGustSpeed | 7.06% | Pressure3pm | 10.33%
    Location | 0.0% | WindDir9am | 7.26% | Cloud9am | 38.42%
    MinTemp | 1.02% | WindDir3pm | 2.91% | Cloud3pm | 40.81%
    MaxTemp | 0.87% | WindSpeed9am | 1.21% | Temp9am | 1.21%
    Rainfall | 2.24% | WindSpeed3pm | 2.11% | Temp3pm | 2.48%
    Evaporation | 43.17% | Humidity9am | 1.82% | RainToday | 2.24%
    Sunshine | 48.01% | Humidity3pm | 3.1% | RainTomorrow | 2.25%
    WindGustDir | 7.1% | Pressure9am | 10.36% |


    Figure 1 presents a correlation matrix showing the correlation coefficient between each pair of features, i.e., how strongly the features are correlated with each other. The scale of the correlation matrix runs from –1 to 1, where 1 represents a perfect positive relationship between two factors and –1 represents a perfect negative relationship. A correlation coefficient of 0 represents the absence of a relationship between the two variables.
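    The computation behind such a matrix can be sketched with pandas' `DataFrame.corr`, here on synthetic stand-ins for two temperature readings and rainfall (the values are invented for illustration only):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
temp9am = rng.normal(15, 5, 500)
df = pd.DataFrame({
    "Temp9am": temp9am,
    "Temp3pm": temp9am + 5 + rng.normal(0, 1, 500),  # strongly correlated
    "Rainfall": rng.exponential(2, 500),             # roughly independent
})

# Pairwise Pearson coefficients in [-1, 1]; the diagonal is always 1.
corr = df.corr()
```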

    Figure 1.  Correlation matrix for various features in the dataset.

    From the analysis of the null values, the attributes Evaporation, Sunshine, Cloud9am, and Cloud3pm contain almost 50% NaN values. Therefore, we discarded these 4 columns and did not use them for training our model: even if we used one of the available techniques to impute the missing data, the imputed values might not match the actual weather, which could mislead the learning process of the model. Figure 1 presents a correlation matrix that shows the correlation coefficient between two features. As we removed four features from our dataset, they are not included in the correlation matrix calculation. The Date feature is categorized into its respective months, and the month feature is taken into consideration for better season-wise categorization of the data and the study of correlation. The Location feature is ignored because the dataset is already partitioned by location, so correlating Location with the other features is not beneficial in this analysis. Additionally, the feature RainToday is ignored because the Rainfall feature, from which RainToday is derived, is already included in the study of correlation.

    Figure 2 displays the distribution of the numerical data based on the 0th percentile, 25th percentile (1st quartile), 50th percentile (2nd quartile), 75th percentile (3rd quartile), and 100th percentile. This distribution provides insight into the presence of outliers, which must be eliminated prior to training our predictive model to achieve accurate results. To analyze the distribution of data across quartiles, the data must be continuous. Features such as Location, RainTomorrow, WindDir3pm, WindDir9am, WindGustDir, etc. are categorical, so boxplot-based outlier removal is only feasible for continuous features. As a result, we used only the 10 continuous features that can contain outliers.

    Figure 2.  Distribution of data points w.r.t. quartile.

    Due to various factors such as global warming, deforestation, etc., affecting seasonal variables during the year, uncertainty in rainfall has become one of the most discussed topics among researchers. Therefore, the main objective of this work is to apply different techniques of data preprocessing, machine learning and deep learning models: 1) Data pre-processing to remove uncertainties and anomalies in the provided dataset. 2) Forecasting the occurrence of rainfall. 3) Projecting the amount of rainfall in millimeters. 4) Comparing the results of various models used for classification and regression purposes.

    The comparison of various algorithms for the same task gives us more insights into the problem statement and helps us make decisions regarding the best model to be used for rainfall forecasting. Figure 3 illustrates the flow diagram of the proposed methodology.

    Figure 3.  Flow diagram of the proposed approach.

    Data preprocessing is a data mining step that refers to the cleaning and transformation of raw data collected from various sources into a form suitable for the task at hand, yielding better results.

    As is evident from Figure 2, our data contains several outliers. Thus, we employed the Inter Quartile Range (IQR) approach to remove them. The IQR is the range between the 1st and 3rd quartiles, i.e., the 25th and 75th percentiles. In this approach, data points that fall below (Q1 − 1.5 × IQR) or above (Q3 + 1.5 × IQR) are considered outliers. Removing the outliers discarded approximately 30 thousand rows. Figure 4 represents the IQR approach employed for removing the outliers [29].
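    A minimal pandas sketch of this IQR rule (the column name and values are illustrative, not the full dataset):

```python
import pandas as pd

def remove_outliers_iqr(df, columns, k=1.5):
    """Drop rows where any of `columns` falls outside the boxplot
    fences [Q1 - k*IQR, Q3 + k*IQR]."""
    mask = pd.Series(True, index=df.index)
    for col in columns:
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        mask &= df[col].between(q1 - k * iqr, q3 + k * iqr)
    return df[mask]

# Toy example: 95.0 mm lies far above the upper fence and is removed.
data = pd.DataFrame({"Rainfall": [0.0, 1.2, 0.8, 2.0, 95.0]})
clean = remove_outliers_iqr(data, ["Rainfall"])
```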

    Figure 4.  IQR approach for data cleaning.

    Figures 5 and 6 show the distribution of normalized values and IQR range plot after removing the outliers from the dataset.

    Figure 5.  Distribution of normalized cleaned data points w.r.t. quartiles.
    Figure 6.  Distribution of cleaned data points w.r.t. quartiles.

    From the analysis of the null values, the attributes Evaporation, Sunshine, Cloud9am and Cloud3pm contain almost 50% NaN values. Therefore, we discarded these columns and did not use them for training our model: even if we used one of the available techniques to impute the missing data, the imputed values might not match the actual weather, which could mislead the learning process of the model.

    For the remaining attributes, we filled the numeric features with the mean value of the attribute and the categorical features with the mode of each feature. However, because location and season also play an important role in these measurements, we divided the dataset into 4 seasons, namely summer (January to March), fall (March to June), winter (June to September), and spring (September to December). We then grouped the data by season and location using the month and Location attributes and populated the NaN values of the numeric features with the mean of the corresponding location–season group. Similarly, for categorical data, the most frequent value within the location–season pair is used to populate the NaN values.
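    The group-wise mean imputation can be sketched with a pandas `groupby`/`transform`; this toy frame uses hypothetical values, and the real code would group on the derived season and the Location column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Location": ["Sydney", "Sydney", "Sydney", "Perth"],
    "Season":   ["summer", "summer", "summer", "winter"],
    "MinTemp":  [18.0, np.nan, 20.0, 8.0],
})

# Mean of each (Location, Season) group, broadcast back to row shape.
group_mean = df.groupby(["Location", "Season"])["MinTemp"].transform("mean")

# Fill NaNs from the group mean, falling back to the overall mean
# for groups that are entirely missing.
df["MinTemp"] = df["MinTemp"].fillna(group_mean).fillna(df["MinTemp"].mean())
```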

    For the Rainfall, RainToday, and RainTomorrow attributes, we took a different approach to handling NaN values. For the Rainfall attribute, we replaced the NaN values with 0; filling them with mean values would have hurt the model's ability to generalize. For the RainToday and RainTomorrow features, we dropped the rows with NaN values, because filling them with the most common class value could bias the classification.

    To train a model on categorical features, they must first be converted into a numerical format. For the features RainToday and RainTomorrow, we used LabelEncoder, which replaced the values Yes and No with 1 and 0, respectively. We could also have used LabelEncoder to convert the wind-direction features into numerical format, but for better generalization we replaced each direction with its corresponding compass bearing in degrees:

    'N': 0, 'NNE': 22.5, 'NE': 45.0, 'ENE': 67.5, 'E': 90.0, 'ESE': 112.5, 'SE': 135.0, 'SSE': 157.5, 'S': 180.0, 'SSW': 202.5, 'SW': 225.0, 'WSW': 247.5, 'W': 270.0, 'WNW': 292.5, 'NW': 315.0, 'NNW': 337.5
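    In code, this mapping and its application could look like the following sketch (the `encode_direction` helper is ours, for illustration; with pandas one would typically apply the dictionary via `Series.map`):

```python
# Compass direction → degrees, matching the mapping above.
DIR_TO_DEG = {
    'N': 0.0, 'NNE': 22.5, 'NE': 45.0, 'ENE': 67.5,
    'E': 90.0, 'ESE': 112.5, 'SE': 135.0, 'SSE': 157.5,
    'S': 180.0, 'SSW': 202.5, 'SW': 225.0, 'WSW': 247.5,
    'W': 270.0, 'WNW': 292.5, 'NW': 315.0, 'NNW': 337.5,
}

def encode_direction(direction):
    """Replace a 16-point compass label with its bearing in degrees."""
    return DIR_TO_DEG[direction]
```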

    An important factor affecting the performance of the model is the imbalance of the output classes. If the ratio of the two class frequencies is not close to 1, the model will be biased in favor of the majority class. One of the simplest and most effective solutions is to oversample the minority class using SMOTE (Synthetic Minority Oversampling Technique) [30]. Originally, the ratio of class 0 to class 1 frequencies was about 5:1, and because of this the model did not perform well on unseen data. Figure 7(a),(b) show bar graphs of the class counts before and after balancing. Class balancing is performed only for training the classification models.
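    In Python, SMOTE is typically applied via `imblearn.over_sampling.SMOTE`; its core interpolation idea can be sketched in plain NumPy as follows (a simplified single-nearest-neighbour variant for illustration, not the full algorithm):

```python
import numpy as np

def smote_like_oversample(X_min, n_new, seed=None):
    """Create synthetic minority samples by interpolating between a
    random minority point and its nearest minority neighbour."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # Nearest neighbour of point i (excluding itself).
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        d[i] = np.inf
        j = int(np.argmin(d))
        gap = rng.random()  # random position along the segment
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.vstack(synthetic)

# Toy minority class: new points stay between existing minority points.
X_minority = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
X_new = smote_like_oversample(X_minority, n_new=5, seed=42)
```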

    Figure 7.  Target variable distribution (a) Before class balancing (b) After class balancing.

    Scaling the data is very important for the regression task. By scaling our variables, we can compare variables measured on different scales on a common footing. We used a standard scaler to normalize our features. The standard scaler transforms each feature to zero mean and unit variance, so most values fall within the range of –3 to 3. Equation (1) shows the transformation that the standard scaler applies.

    z = (x_i − μ) / σ (1)
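    A minimal NumPy rendering of Eq. (1), equivalent to fitting scikit-learn's `StandardScaler` on a single feature (the values are arbitrary):

```python
import numpy as np

def standardize(x):
    """Eq. (1): z = (x - mean) / standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

values = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
z = standardize(values)
# The result has zero mean and unit standard deviation.
```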

    After conducting exploratory data analysis and data cleaning, the dataset comprises 110 k rows and 19 features. The approach is divided into two parts: 1) forecasting the occurrence of rainfall and 2) estimating the amount of rainfall. For forecasting the occurrence of rainfall, which is a classification task, the training data consists of approximately 147 k rows and the testing data consists of 36 k rows. The number of rows is higher than the actual dataset because the SMOTE technique was applied to oversample the data and balance both classes for classification tasks. For predicting the amount of rainfall, which is a regression task, the training data consists of 87 k rows and the testing data consists of 22 k rows. In this approach, we did not use any oversampling technique for regression.

    The task here is to predict the occurrence of precipitation in two classes, i.e., whether it will rain tomorrow or not. This is a classification task using the various features and their corresponding target values from the given dataset. The classification approach is divided into three parts: 1) preparing the data for classification, 2) fitting the training data to train a classification model, and 3) evaluating the model performance. Figures 8 and 9 show the flow of the overall implementation of the classification approach.

    Figure 8.  Preparing data for classification.
    Figure 9.  Fitting data for classification.

    For forecasting the occurrence of rainfall, we have implemented four classification models as follows:

    XGBoost Classifier: XGBoost [31] stands for eXtreme Gradient Boosting, a fast and effective boosting algorithm based on gradient-boosted decision trees. XGBoost adds finer regularization, shrinkage, and column subsampling to prevent overfitting, which distinguishes it from plain gradient boosting. The XGBoost classifier takes a 2-dimensional array of training features and target variables as input for training the classification model.

    Random Forest Classifier: Random forest, or random decision forest [32], is an ensemble learning method for classification, regression, and other tasks that works by building a multitude of decision trees during training. It creates each decision tree from a randomly selected subset of the training set and then collects votes from the different trees to determine the final prediction. For classification tasks, the random forest output is the class selected by the most trees. The Random Forest classifier also takes a 2-dimensional array of training features and target variables as input for training the classification model.

    Kernel SVM Classifier: Support Vector Machines [33] are supervised learning models with associated learning algorithms that analyze data for classification and regression. The main motivation of an SVM is to find the best decision boundary separating two or more classes so that data points are placed in the correct class, for which various kernels are used. We chose the Gaussian Radial Basis Function (RBF) kernel for training our SVM model since rainfall data is non-linear. Equation (2) shows the Gaussian radial basis function used by the support vector machine.

    F(x, xj) = exp(−γ ‖x − xj‖²) (2)
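For intuition, Eq (2) can be evaluated directly; a small sketch follows (γ is the kernel-width hyperparameter, exposed as the `gamma` argument in libraries such as scikit-learn):

```python
from math import exp

def rbf_kernel(x, xj, gamma=1.0):
    """Gaussian RBF kernel of Eq (2): exp(-gamma * ||x - xj||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xj))
    return exp(-gamma * sq_dist)

# identical points give similarity 1; distant points decay toward 0
same = rbf_kernel([1.0, 2.0], [1.0, 2.0])
far = rbf_kernel([1.0, 2.0], [5.0, 6.0], gamma=0.5)
```

The kernel measures similarity: it is 1 for identical inputs and falls off exponentially with squared distance, which is what lets the SVM draw non-linear decision boundaries.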

    LSTM Classifier: An LSTM, or Long Short-Term Memory, classifier [34] is an artificial recurrent neural network with feedback connections, often used to classify and make predictions over time-series data. Training an LSTM classifier requires a different data format: the data must first be converted to a 3-D array of shape (number of samples, time steps, number of features). To prevent overfitting and visualize the training progress, callbacks are passed as parameters while training the prediction model. In our approach we used two callbacks:

    ReduceLROnPlateau: Reduces the learning rate by the 'factor' argument if the monitored metric has stopped improving for a 'patience' number of epochs. Reducing the learning rate often benefits the model.

    EarlyStopping: Stops training if the monitored metric has stopped improving for a 'patience' number of epochs.
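The 3-D reshaping that LSTM training requires can be sketched in plain Python (nested lists stand in for arrays here; a 3-day window is used for brevity, whereas the model in this work uses a 15-day window):

```python
def make_windows(rows, time_steps):
    """Convert a 2-D table (one feature row per day) into the 3-D layout
    LSTMs expect, (samples, time_steps, features): each sample is a
    sliding window of `time_steps` consecutive rows."""
    return [rows[i:i + time_steps] for i in range(len(rows) - time_steps + 1)]

# 6 days of 2 features each, windowed into 3-day samples
days = [[d, d * 0.5] for d in range(6)]
X = make_windows(days, time_steps=3)
# X has shape (4 samples, 3 time steps, 2 features)
```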

    Figure 10(a), (b) show the layout of the trained LSTM neural network and the RNN unrolled over a time window of 15 days, respectively.

    Figure 10.  The layout of the trained LSTM neural network and the unrolled RNN. (a) Sequential LSTM model, and (b) Unrolled RNN for the time window of 15 days.

    The algorithm designed to predict the occurrence of rainfall tomorrow is shown in Algorithm 1.

     

    Algorithm 1: Algorithm for Classification
    Input: Rainfall Forecasting Dataset
    I = ['MinTemp', 'MaxTemp', 'Rainfall', 'WindGustDir', 'WindGustSpeed', 'WindDir9am', 'WindDir3pm', 'WindSpeed9am', 'WindSpeed3pm', 'Humidity9am', 'Humidity3pm', 'Pressure9am', 'Pressure3pm', 'Temp9am', 'Temp3pm', 'RainToday', 'RainTomorrow']
    Output: Yes, No
    1 Preprocess the input data and divide it into features and targets.
    2 Balance output classes using SMOTE.
    3 Scale the data using Standard Scaler.
    4 Define classification model.
    5 Train the classifier according to the defined parameters.

    Classification is only one step in precipitation forecasting. It tells us whether tomorrow will be dry or whether an umbrella will be needed, since it only predicts that rain will occur at some point during the day. Predicting the amount of rainfall from the same features is equally valuable, because it supports decisions such as whether to travel or stay home ahead of heavy rain, or whether to hold back or release water stored in a reservoir when heavy rain is predicted in the watershed. Therefore, this part deals with the different regression techniques used to forecast the amount of rainfall in millimeters.

    The regression task is also divided into three parts:

    1) Preparing data for the regression model.

    2) Training a regression model.

    3) Evaluating the regression model.

    For projecting the amount of rainfall, we have implemented various regression algorithms including:

    ● Multiple Linear Regression

    ● XGBoostRegressor

    ● Polynomial Regression

    ● Random Forest Regression

    ● LSTM-based Deep Learning Model

    Multiple Linear Regression: Linear regression [35] is a supervised learning method for regression tasks. It models the linear relationship between the input features and the target variable.

    ŷ = b0 + b1x1 + b2x2 + … + bkxk (3)

    Equation (3) is the statistical equation used for prediction by the multiple linear regression algorithm. To achieve the best-fit line, the model aims to predict ŷ so that the error between the predicted and real values is as small as possible, as shown in Eq (4).

    minimize (1/n) Σi=1..n (predi − yi)² (4)
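For the single-feature case, minimizing Eq (4) has a closed-form solution; a minimal ordinary-least-squares sketch in plain Python (an illustration of the principle, not the multi-feature solver a library would use):

```python
def fit_simple_ols(xs, ys):
    """Fit y = b0 + b1*x by minimizing the mean squared error of Eq (4).
    Closed form: b1 = cov(x, y) / var(x), b0 = mean(y) - b1 * mean(x)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b1 = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
          / sum((x - mx) ** 2 for x in xs))
    b0 = my - b1 * mx
    return b0, b1

# data lying exactly on y = 2 + 3x is recovered exactly
b0, b1 = fit_simple_ols([0.0, 1.0, 2.0, 3.0], [2.0, 5.0, 8.0, 11.0])
```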

    XGBoostRegressor: XGBoost [36] stands for eXtreme Gradient Boosting, a fast and effective boosting algorithm based on gradient-boosted decision trees. XGBoost's objective function consists of a loss function and a regularization term; the loss tells us how far the model's results are from the real values. We used reg:linear as the XGBoost loss function to perform the regression task.

    Polynomial Regression: Polynomial regression [37] is a special case of linear regression in which the relationship between the independent variable x and the target variable y is modeled as an nth-degree polynomial in x. This technique is used to capture a curvilinear relationship between independent and dependent variables.

    ŷ = b0 + b1x1 + b2x1² + b3x1³ + … + bnx1ⁿ (5)

    Equation (5) represents the statistical equation used for prediction by the polynomial regression algorithm. To achieve the best-fit curve, the model aims to predict ŷ so that the error between the predicted and real values is as small as possible, as shown in Eq (4).
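The key mechanism of polynomial regression is the feature expansion: a scalar x is expanded into its powers, and an ordinary linear model is then fit over those columns. A tiny sketch of that expansion (mirroring what e.g. scikit-learn's PolynomialFeatures does for a single variable):

```python
def polynomial_features(x, degree):
    """Expand a scalar x into [1, x, x^2, ..., x^degree] so that a linear
    model over these columns fits the curve of Eq (5)."""
    return [x ** d for d in range(degree + 1)]

row = polynomial_features(2.0, degree=3)  # [1.0, 2.0, 4.0, 8.0]
```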

    Random Forest Regression: Random forest, or random decision forest [38], is an ensemble learning method for classification, regression, and other tasks that works by building a multitude of decision trees during training. Random forest uses many decision trees as base learners; the final result is the mean of the outputs of all the decision trees.

    LSTM-based Deep Learning Model: An LSTM, or Long Short-Term Memory, network [39] is an artificial recurrent neural network with feedback connections; LSTM regression is typically applied to time-series problems. For training the LSTM regression model, the data must first be converted to a 3-D array of shape (number of samples, time steps, number of features). To prevent overfitting and visualize [40] the training progress, callbacks are passed as parameters while training the prediction model. In our approach we used two callbacks:

    ReduceLROnPlateau: Reduces the learning rate by the "factor" argument if the monitored metric has stopped improving for a "patience" number of epochs. Reducing the learning rate often benefits the model.

    EarlyStopping: Stops training if the monitored metric has stopped improving for a "patience" number of epochs.

    Algorithm 2, given below, is used for forecasting the amount of precipitation.

     

    Algorithm 2: Algorithm for Regression
    Input: Rainfall Amount Prediction Dataset
    I = ['MinTemp', 'MaxTemp', 'Rainfall', 'WindGustDir', 'WindGustSpeed', 'WindDir9am', 'WindDir3pm', 'WindSpeed9am', 'WindSpeed3pm', 'Humidity9am', 'Humidity3pm', 'Pressure9am', 'Pressure3pm', 'Temp9am', 'Temp3pm', 'RainToday']
    Output: Amount of precipitation in millimeters
    1 Preprocess the input data and divide it into features and target.
    2 Scale the data using Standard Scaler.
    3 Define the regression model.
    4 Train the regressor according to the defined parameters.

    There are numerous evaluation metrics for measuring model performance. In this paper we evaluated our machine learning and deep learning models using the confusion matrix, accuracy, precision, recall, and F1-score for classification models, and mean absolute error, mean squared error, and R2 score for regression models.

    For Classification

    Confusion Matrix: Confusion matrix yields the output of a classification model in a matrix format. The matrix is defined as shown in Table 4.

    Table 4.  Confusion matrix.
                        Predicted Positive    Predicted Negative
    Actual Positive     TP                    FN
    Actual Negative     FP                    TN


    A list of evaluation metrics used for evaluating trained classifiers is stated in Eqs (6)–(9).

    Accuracy = No. of correct predictions / Total no. of predictions (6)
    Precision = True Positive / (True Positive + False Positive) (7)
    Recall = True Positive / (True Positive + False Negative) (8)
    F1 = 2 · (Precision · Recall) / (Precision + Recall) (9)
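Equations (6)–(9) reduce to simple arithmetic on the confusion-matrix counts of Table 4; a plain-Python sketch (the reports below were produced with library tooling, this is only an illustration of the formulas):

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 (Eqs 6-9) from
    confusion-matrix counts for the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = classification_metrics(tp=80, fp=10, fn=20, tn=90)
```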

    XGBoost Classifier: Figure 11 and Table 5 represent the confusion matrix and the classification report for the XGBoost classifier. The classification report shows that the precision, recall, and f1-score for both the classes, i.e., rain and no rain are 96, 89, 92%, and 89, 96, 93% respectively.

    Figure 11.  Confusion matrix for XGBoost Classifier.
    Table 5.  Classification report for XGBoost Classifier.
    Precision Recall F1-Score Support
    0 89% 96% 93% 18238
    1 96% 89% 92% 18714
    Accuracy 92% 36952
    Macro Avg 93% 92% 92% 36952
    Weighted Avg 93% 92% 92% 36952


    Kernel SVM: Figure 12 and Table 6 represent the confusion matrix and the classification report for the Support Vector Machine classifier with a radial basis kernel function. The classification report shows that the precision, recall, and f1-score for the two classes, i.e., rain and no rain, are 81, 82, 82%, and 82, 81, 81%, respectively.

    Figure 12.  Confusion matrix for SVM Classifier.
    Table 6.  Classification report for SVM Classifier.
    Precision Recall F1-Score Support
    0 82% 81% 81% 18467
    1 81% 82% 82% 18485
    Accuracy 81% 36952
    Macro Avg 81% 81% 81% 36952
    Weighted Avg 81% 81% 81% 36952


    LSTM Classifier: Figure 13 and Table 7 represent the confusion matrix and the classification report for the LSTM classifier. The classification report shows that the precision, recall, and f1-score for both the classes, i.e., rain and no rain are 86, 87, 86%, and 87, 85, 86% respectively.

    Figure 13.  Confusion matrix for LSTM Classifier.
    Table 7.  Classification report for LSTM Classifier.
    Precision Recall F1-Score Support
    0 87% 85% 86% 18531
    1 86% 87% 86% 18531
    Accuracy 86% 37062
    Macro Avg 86% 86% 86% 37062
    Weighted Avg 86% 86% 86% 37062


    Random Forest Classifier: Figure 14 and Table 8 represent the confusion matrix and the classification report for the Random Forest classifier. The classification report shows that the precision, recall, and f1-score for both the classes, i.e., rain and no rain are 92, 91, 91%, and 91, 92, 91% respectively.

    Figure 14.  Confusion matrix for Random Forest Classifier.
    Table 8.  Classification report for Random Forest Classifier.
    Precision Recall F1-Score Support
    0 91% 92% 91% 18467
    1 92% 91% 91% 18485
    Accuracy 91% 36952
    Macro Avg 91% 91% 91% 36952
    Weighted Avg 91% 91% 91% 36952


    Table 9 and Figure 15 present a comparison of the evaluation results of the employed classification models. The XGBoost classifier surpasses all the other classifiers in accuracy (92.2%), precision (95.6%), and F1-score (91.9%), while the Random Forest classifier provides the best recall (91.2%). Kernel SVM with the radial basis function performed worst among the four classifiers, with an accuracy of 81.4%, precision of 80.9%, recall of 82.1%, and F1-score of 81.5%.

    Table 9.  Evaluation results for Classification.
    Approach Accuracy Precision Recall F1-Score
    XGBoost Classifier 92.2% 95.61% 88.4% 91.87%
    Random Forest Classifier 91.3% 91.42% 91.16% 91.29%
    LSTM Classifier 85.84% 85.81% 85.88% 85.85%
    Kernel SVM Classifier (rbf kernel) 81.44% 80.86% 82.06% 81.45%

    Figure 15.  Comparing Evaluation Results for Classification.

    Table 10 shows that our classification method outperforms, in terms of accuracy, various state-of-the-art methods that use the Australian Kaggle rain dataset.

    Table 10.  State-of-the-art Results.
    State-of-the-Art Approach Best Accuracy
    Oswal (2019) [41] 84%
    He (2021) [42] 82%


    After several data processing steps, a cleaned dataset with approximately 110 thousand rows was used for the classification task. Analysis of the class imbalance in the "RainTomorrow" feature showed it to be highly skewed toward the "No" rainfall class, with a 9:1 ratio of "No" to "Yes" rainfall values. Models trained on this highly skewed data produced precision and accuracy values between 0.80 and 0.85. To address the imbalance, we used SMOTE (Synthetic Minority Oversampling Technique) to balance both classes, which increased the data from 110 k rows to 183 k rows. The data was then divided into training and testing sets. With the balanced dataset, precision improved to 95% and accuracy increased to 92%. This improvement is attributed to optimized data preprocessing and cleaning, feature scaling, data normalization, training parameters, and train-test split ratios.

    For Regression

    The evaluation metrics employed for evaluating the trained regression models are Mean Absolute Error (MAE), Mean Squared Error (MSE), and R2 Score (R2) as stated in Eqs (10)–(12).

    MAE = (1/n) Σi=1..n |yi − ŷi| (10)
    MSE = (1/n) Σi=1..n (yi − ŷi)² (11)
    R² = 1 − SSres / SStot (12)
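Equations (10)–(12) can likewise be computed from scratch; a plain-Python sketch on a small hypothetical example (the results below were produced with library tooling, this only illustrates the formulas):

```python
def regression_metrics(y_true, y_pred):
    """MAE, MSE and R2 score (Eqs 10-12) computed from scratch."""
    n = len(y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    mean_t = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)             # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return mae, mse, r2

mae, mse, r2 = regression_metrics([3.0, -0.5, 2.0, 7.0], [2.5, 0.0, 2.0, 8.0])
```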

    Table 11 and Figure 16(a), (b) present the evaluation results of Multiple Linear regression, XGBoostRegressor, Polynomial regression, Random Forest regression, and LSTM-based deep learning model.

    Table 11.  Evaluation Results for Regression.
    Approach MAE MSE R2 Score
    Multiple Linear Regression 0.124 0.039 0.754
    XGBoost Regression 0.121 0.04 0.737
    Polynomial Regression 0.120 0.039 0.746
    Random Forest Regression 0.117 0.036 0.760
    LSTM based deep learning model 0.104 19.83 0.70

    Figure 16.  Comparing (a) MAE, R2 Score (b) MSE for Regression.

    Here, it is observed that the Random Forest regressor outperformed all the other regression models with a mean absolute error of 0.117, mean squared error of 0.036, and R2 score of 0.76. On the other hand, the LSTM-based deep learning model performed worst among the five regression models in terms of mean squared error (19.83) and R2 score (0.70), despite having the lowest mean absolute error (0.104).

    Novelty and discussions: Data processing is a critical aspect of building a machine learning or deep learning model. Instead of filling missing values with the mean or mode of the entire dataset, the proposed solution uses seasonal and location-based data filling to fill numeric values with the mean and categorical values with the mode. LSTM-based models are often considered to be the best for modeling relationships in time series data. However, in the proposed method, the ensemble learning-based random forest model outperforms the LSTM model in both the classification and regression tasks. Random forest leverages class votes from each decision tree it grows, making it less susceptible to the impact of an inconsistent dataset and less prone to overfitting. In contrast, neural network models require more consistent data to make accurate predictions and may not perform well with inconsistent datasets.

    Limitations: Data collection is a major obstacle to accurate rainfall forecasting. Real-time weather data is hard to obtain and must be gathered from multiple meteorological stations, resulting in inconsistent and abnormal data due to incompatible data types and measurements. Therefore, we dropped the "Evaporation", "Sunshine", "Cloud9am", and "Cloud3pm" columns while handling NaN values, as each had about 50% NaN values, even though these features can correlate strongly with the rainfall value.

    In this work, we implemented different machine learning and deep learning models for predicting the occurrence of rainfall the next day and for predicting the amount of rainfall in millimeters. We used the Australian rainfall dataset, which contains weather data from 49 locations on the Australian continent. We managed to obtain more accurate results than various state-of-the-art approaches, achieving an accuracy of 92.2%, precision of 95.6%, F1 score of 91.9%, and recall of 91.1% for next-day rainfall prediction. For the prediction of the amount of precipitation in millimeters, we obtained a mean absolute error of 0.117, a mean squared error of 0.036, and an R2 value of 0.76. To obtain the above results, we applied several data preprocessing techniques, such as analyzing null values, populating null values with seasonal and location-specific values, removing outliers using the interquartile range approach, selecting features by analyzing the correlation matrix, converting categorical values to numerical values so the data can train the predictive model, balancing the classes of the target variable for the classification task, and normalizing the data using standard scalers for the regression task. We also compared different statistical machine learning and deep learning models for both the classification and regression tasks. This work uses publicly available datasets for training the classification and regression models; satellite and radar data could be used to train models and predict rainfall in real time.

    In the future, further robustness can be achieved with the use of more recent and accurate data collected from meteorological departments. Incorporating additional features, such as "time of rainfall", "time of strongest wind gusts", "relative humidity at two points of time in a day" and "atmospheric pressure at sea level", could greatly enhance the model. These additional features are highly correlated with the "RainTomorrow" and "Rainfall" features. If the proposed model is trained with more features, it could lead to an increase in model performance. We would also like to work on transfer learning models to get better results.

    This work is partially supported by Western Norway University of Applied Sciences, Bergen, Norway.

    The authors declare there is no conflict of interest.




    [1] Sharp PA (2009) The centrality of RNA. Cell 136: 577-580.
    [2] Cooper TA, Wan L, Dreyfuss G (2009) RNA and disease. Cell 136: 777-793.
    [3] Fry LE, Patrício MI, Williams J, et al. (2019) Association of messenger RNA level with phenotype in patients with choroideremia: potential implications for gene therapy dose. JAMA Ophthalmol 138: 128-135.
    [4] Li B, Zhang X, Dong Y (2019) Nanoscale platforms for messenger RNA delivery. Wires Nanomed Nanobi 11: e1530.
    [5] Midoux P, Pichon C (2015) Lipid-based mRNA vaccine delivery systems. Expert Rev Vaccines 14: 221-234.
    [6] Dannull J, Haley NR, Archer G, et al. (2013) Melanoma immunotherapy using mature DCs expressing the constitutive proteasome. J Clin Invest 123: 3135-3145.
    [7] Van Lint S, Heirman C, Thielemans K, et al. (2013) mRNA: From a chemical blueprint for protein production to an off-the-shelf therapeutic. Hum Vacc Immunother 9: 265-274.
    [8] Yang J, Arya S, Lung P, et al. (2019) Hybrid nanovaccine for the co-delivery of the mRNA antigen and adjuvant. Nanoscale 11: 21782-21789.
    [9] Le TT, Andreadakis Z, Kumar A, et al. (2020) The COVID-19 vaccine development landscape. Nat Rev Drug Discov 19: 305-306.
    [10] Geall AJ, Mandl CW, Ulmer JB (2013) RNA: the new revolution in nucleic acid vaccines. Semin Immunol 25: 152-159.
    [11] Pardi N, Hogan MJ, Porter FW, et al. (2018) mRNA vaccines—a new era in vaccinology. Nat Rev Drug Discov 17: 261-279.
    [12] Pascolo S (2015) The messenger's great message for vaccination. Expert Rev Vaccines 14: 153-156.
    [13] Deering RP, Kommareddy S, Ulmer JB, et al. (2014) Nucleic acid vaccines: prospects for non-viral delivery of mRNA vaccines. Expert Opin Drug Deliv 11: 885-899.
    [14] Liu MA (2010) Immunologic basis of vaccine vectors. Immunity 33: 504-515.
    [15] Jäschke A, Helm M (2003) RNA sex. Chem Biol 10: 1148-1150.
    [16] Pollard C, Rejman J, De Haes W, et al. (2013) Type I IFN counteracts the induction of antigen-specific immune responses by lipid-based delivery of mRNA vaccines. Mol Ther 21: 251-259.
    [17] Vallazza B, Petri S, Poleganov MA, et al. (2015) Recombinant messenger RNA technology and its application in cancer immunotherapy, transcript replacement therapies, pluripotent stem cell induction, and beyond. Wiley Interdiscip Rev RNA 6: 471-499.
    [18] Gómez-Aguado I, Rodríguez-Castejón J, Vicente-Pascual M, et al. (2020) Nanomedicines to deliver mRNA: State of the art and future perspectives. Nanomaterials 10: 364.
    [19] Versteeg L, Almutairi MM, Hotez PJ, et al. (2019) Enlisting the mRNA vaccine platform to combat parasitic infections. Vaccines 7: 122.
    [20] Hekele A, Bertholet S, Archer J, et al. (2013) Rapidly produced SAM® vaccine against H7N9 influenza is immunogenic in mice. Emerg Microbes Infect 2: e52.
    [21] Lindgren G, Ols S, Liang F, et al. (2017) Induction of robust B cell responses after influenza mRNA vaccination is accompanied by circulating hemagglutinin-specific ICOS+ PD-1+ CXCR3+ T follicular helper cells. Front Immunol 8: 1539.
    [22] Luo F, Zheng L, Hu Y, et al. (2017) Induction of protective immunity against Toxoplasma gondii in mice by nucleoside triphosphate hydrolase-II (NTPase-II) self-amplifying RNA vaccine encapsulated in lipid nanoparticle (LNP). Front Microbiol 8: 605.
    [23] Michel T, Golombek S, Steinle H, et al. (2019) Efficient reduction of synthetic mRNA induced immune activation by simultaneous delivery of B18R encoding mRNA. J Biol Eng 13: 40.
    [24] Appay V, Douek DC, Price DA (2008) CD8+ T cell efficacy in vaccination and disease. Nat Med 14: 623-628.
    [25] Pardi N, Hogan MJ, Naradikian MS, et al. (2018) Nucleoside-modified mRNA vaccines induce potent T follicular helper and germinal center B cell responses. J Exp Med 215: 1571-1588.
    [26] Zarghampoor F, Azarpira N, Khatami SR, et al. (2019) Improved translation efficiency of therapeutic mRNA. Gene 707: 231-238.
    [27] Kowalski PS, Rudra A, Miao L, et al. (2019) Delivering the messenger: Advances in technologies for therapeutic mRNA delivery. Mol Ther 27: 710-728.
    [28] Reichmuth AM, Oberli MA, Jaklenec A, et al. (2016) mRNA vaccine delivery using lipid nanoparticles. Ther Deliv 7: 319-334.
    [29] Lundstrom K (2009) Alphaviruses in gene therapy. Viruses 1: 13-25.
    [30] Chira S, Jackson CS, Oprea I, et al. (2015) Progresses towards safe and efficient gene therapy vectors. Oncotarget 6: 30675-30703.
    [31] Ku SH, Jo SD, Lee YK, et al. (2016) Chemical and structural modifications of RNAi therapeutics. Adv Drug Deliv Rev 104: 16-28.
    [32] Presnyak V, Alhusaini N, Chen YH, et al. (2015) Codon optimality is a major determinant of mRNA stability. Cell 160: 1111-1124.
    [33] Thess A, Grund S, Mui BL, et al. (2015) Sequence-engineered mRNA without chemical nucleoside modifications enables an effective protein therapy in large animals. Mol Ther 23: 1456-1464.
    [34] Wojtczak BA, Sikorski PJ, Fac-Dabrowska K, et al. (2018) 5′-phosphorothiolate dinucleotide cap analogues: Reagents for messenger RNA modification and potent small-molecular inhibitors of decapping enzymes. J Am Chem Soc 140: 5987-5999.
    [35] Li B, Luo X, Dong Y (2016) Effects of chemically modified messenger RNA on protein expression. Bioconjug Chem 27: 849-853.
    [36] Li M, Zhao M, Fu Y, et al. (2016) Enhanced intranasal delivery of mRNA vaccine by overcoming the nasal epithelial barrier via intra- and paracellular pathways. J Control Release 228: 9-19.
    [37] Svitkin YV, Cheng YM, Chakraborty T, et al. (2017) N1-methyl-pseudouridine in mRNA enhances translation through eIF2a-dependent and independent mechanisms by increasing ribosome density. Nucleic Acids Res 45: 6023-6036.
    [38] Oberg AL, Kennedy RB, Li P, et al. (2011) Systems biology approaches to new vaccine development. Curr Opin Immunol 23: 436-443.
    [39] Auffan M, Rose J, Bottero JY, et al. (2009) Towards a definition of inorganic nanoparticles from an environmental, health and safety perspective. Nat Nanotechnol 4: 634-641.
    [40] Treuel L, Jiang X, Nienhaus GU (2013) New views on cellular uptake and trafficking of manufactured nanoparticles. J R Soc Interface 10: 20120939.
    [41] Ulkoski D, Bak A, Wilson JT, et al. (2019) Recent advances in polymeric materials for the delivery of RNA therapeutics. Expert Opin Drug Deliv 16: 1149-1167.
    [42] Pérez-Ortín JE, Alepuz P, Chávez S, et al. (2013) Eukaryotic mRNA decay: Methodologies, pathways, and links to other stages of gene expression. J Mol Biol 425: 3750-3775.
    [43] Pati R, Shevtsov M, Sonawane A (2018) Nanoparticle vaccines against infectious diseases. Front Immunol 9: 2224.
    [44] Means TK, Hayashi F, Smith KD, et al. (2003) The Toll-like receptor 5 stimulus bacterial flagellin induces maturation and chemokine production in human dendritic cells. J Immunol 170: 5165-5175.
    [45] Boraschi D, Italiani P, Palomba R, et al. (2017) Nanoparticles and innate immunity: new perspectives on host defence. Semin Immunol 34: 33-51.
    [46] Chen YS, Hung YC, Lin WH, et al. (2010) Assessment of gold nanoparticles as a size-dependent vaccine carrier for enhancing the antibody response against synthetic foot-and-mouth disease virus peptide. Nanotechnology 21: 195101.
    [47] Wang T, Zou M, Jiang H, et al. (2011) Synthesis of a novel kind of carbon nanoparticle with large mesopores and macropores and its application as an oral vaccine adjuvant. Eur J Pharm Sci 44: 653-659.
    [48] Xu L, Liu Y, Chen Z, et al. (2012) Surface-engineered gold nanorods: promising DNA vaccine adjuvant for HIV-1 treatment. Nano Lett 12: 2003-2012.
    [49] Tao W, Gill HS (2015) M2e-immobilized gold nanoparticles as influenza A vaccine: role of soluble M2e and longevity of protection. Vaccine 33: 2307-2315.
    [50] Li X, Deng X, Huang Z (2001) In vitro protein release and degradation of poly-d-L-lactide-poly(ethylene glycol) microspheres with entrapped human serum albumin: quantitative evaluation of the factors involved in protein release phases. Pharm Res 18: 117-124.
    [51] Chahal JS, Fang T, Woodham AW, et al. (2017) An RNA nanoparticle vaccine against Zika virus elicits antibody and CD8+ T cell responses in a mouse model. Sci Rep 7: 252.
    [52] Chahal JS, Khan OF, Cooper CL, et al. (2016) Dendrimer-RNA nanoparticles generate protective immunity against lethal Ebola, H1N1 influenza, and toxoplasma gondii challenges with a single dose. Proc Natl Acad Sci U S A 113: E4133-E4142.
    [53] Sharifnia Z, Bandehpour M, Hamishehkar H, et al. (2019) In-vitro transcribed mRNA delivery using PLGA/PEI nanoparticles into human monocyte-derived dendritic cells. Iran J Pharm Res 18: 1659-1675.
    [54] Uchida S, Kinoh H, Ishii T, et al. (2016) Systemic delivery of messenger RNA for the treatment of pancreatic cancer using polyplex nanomicelles with a cholesterol moiety. Biomaterials 82: 221-228.
    [55] Kaczmarek JC, Patel AK, Kauffman KJ, et al. (2016) Polymer-lipid nanoparticles for systemic delivery of mRNA to the lungs. Angew Chem Int Ed Engl 55: 13808-13812.
    [56] Patel AK, Kaczmarek JC, Bose S, et al. (2019) Inhaled nanoformulated mRNA polyplexes for protein production in lung epithelium. Adv Mater 31: e1805116.
    [57] Liu Y, Li Y, Keskin D, et al. (2019) Poly(β-amino esters): Synthesis, formulations, and their biomedical applications. Adv Healthc Mater 8: e1801359.
    [58] Capasso Palmiero U, Kaczmarek JC, Fenton OS, et al. (2018) Poly(β-amino ester)-co-poly(caprolactone) terpolymers as nonviral vectors for mRNA delivery in vitro and in vivo. Adv Healthc Mater 7: e1800249.
    [59] Palamà IE, Cortese B, D'Amone S, et al. (2015) mRNA delivery using non-viral PCL nanoparticles. Biomater Sci 3: 144-151.
    [60] Lacroix C, Humanes A, Coiffier C, et al. (2020) Polylactide-based reactive micelles as a robust platform for mRNA delivery. Pharm Res 37: 30.
    [61] Dong Y, Dorkin JR, Wang W, et al. (2016) Poly(glycoamidoamine) brushes formulated nanomaterials for systemic siRNA and mRNA delivery in vivo. Nano Lett 16: 842-848.
    [62] Palmerston Mendes L, Pan J, Torchilin VP (2017) Dendrimers as nanocarriers for nucleic acid and drug delivery in cancer therapy. Molecules 22: 1401.
    [63] Franiak-Pietryga I, Ziemba B, Messmer B, et al. (2018) Dendrimers as drug nanocarriers: the future of gene therapy and targeted therapies in cancer. Dendrimers: Fundamentals and Applications IntechOpen, 7.
    [64] Islam MA, Xu Y, Tao W, et al. (2018) Restoration of tumour-growth suppression in vivo via systemic nanoparticle-mediated delivery of PTEN mRNA. Nat Biomed Eng 2: 850-864.
    [65] Hajam IA, Senevirathne A, Hewawaduge C, et al. (2020) Intranasally administered protein coated chitosan nanoparticles encapsulating influenza H9N2 HA2 and M2e mRNA molecules elicit protective immunity against avian influenza viruses in chickens. Vet Res 51: 37.
    [66] McCullough KC, Bassi I, Milona P, et al. (2014) Self-replicating replicon-RNA delivery to dendritic cells by chitosan-nanoparticles for translation in vitro and in vivo. Mol Ther Nucleic Acids 3: e173.
    [67] Maiyo F, Singh M (2019) Folate-targeted mRNA delivery using chitosan-functionalized selenium nanoparticles: potential in cancer immunotherapy. Pharmaceuticals (Basel) 12: 164.
    [68] Son S, Nam J, Zenkov I, et al. (2020) Sugar-nanocapsules imprinted with microbial molecular patterns for mRNA vaccination. Nano Lett 20: 1499-1509.
    [69] Siewert C, Haas H, Nawroth T, et al. (2019) Investigation of charge ratio variation in mRNA - DEAE-dextran polyplex delivery systems. Biomaterials 192: 612-620.
    [70] Zeng C, Zhang C, Walker PG, et al. (2020) Formulation and delivery technologies for mRNA vaccines. Current Topics in Microbiology and Immunology Berlin: Springer.
    [71] Scheel B, Teufel R, Probst J, et al. (2005) Toll-like receptor-dependent activation of several human blood cell types by protamine-condensed mRNA. Eur J Immunol 35: 1557-1566.
    [72] Schlake T, Thess A, Fotin-Mleczek M, et al. (2012) Developing mRNA-vaccine technologies. RNA Biol 9: 1319-1330.
    [73] Fotin-Mleczek M, Duchardt KM, Lorenz C, et al. (2011) Messenger RNA-based vaccines with dual activity induce balanced TLR-7 dependent adaptive immune responses and provide antitumor activity. J Immunother 34: 1-15.
    [74] Schnee M, Vogel AB, Voss D, et al. (2016) An mRNA vaccine encoding rabies virus glycoprotein induces protection against lethal infection in mice and correlates of protection in adult and newborn pigs. PLoS Negl Trop Dis 10: e0004746.
    [75] Udhayakumar VK, De Beuckelaer A, McCaffrey J, et al. (2017) Arginine-rich peptide-based mRNA nanocomplexes efficiently instigate cytotoxic T cell immunity dependent on the amphipathic organization of the peptide. Adv Healthc Mater 6: 1601412.
    [76] Coolen AL, Lacroix C, Mercier-Gouy P, et al. (2019) Poly(lactic acid) nanoparticles and cell-penetrating peptide potentiate mRNA-based vaccine expression in dendritic cells triggering their activation. Biomaterials 195: 23-37.
    [77] Jekhmane S, De Haas R, Paulino da Silva Filho O, et al. (2017) Virus-like particles of mRNA with artificial minimal coat proteins: particle formation, stability, and transfection efficiency. Nucleic Acid Ther 27: 159-167.
    [78] Li J, Sun Y, Jia T, et al. (2014) Messenger RNA vaccine based on recombinant MS2 virus-like particles against prostate cancer. Int J Cancer 134: 1683-1694.
    [79] Sun S, Li W, Sun Y, et al. (2011) A new RNA vaccine platform based on MS2 virus-like particles produced in Saccharomyces cerevisiae. Biochem Biophys Res Commun 407: 124-128.
    [80] Zhitnyuk Y, Gee P, Lung MSY, et al. (2018) Efficient mRNA delivery system utilizing chimeric VSVG-L7Ae virus-like particles. Biochem Biophys Res Commun 505: 1097-1102.
    [81] Kauffman KJ, Webber MJ, Anderson DG (2016) Materials for non-viral intracellular delivery of messenger RNA therapeutics. J Control Release 240: 227-234.
    [82] Kulkarni JA, Cullis PR, Van Der Meel R (2018) Lipid nanoparticles enabling gene therapies: From concepts to clinical utility. Nucleic Acid Ther 28: 146-157.
    [83] Dimitriadis GJ (1978) Translation of rabbit globin mRNA introduced by liposomes into mouse lymphocytes. Nature 274: 923-924.
    [84] Moon JJ, Suh H, Bershteyn A, et al. (2011) Interbilayer-crosslinked multilamellar vesicles as synthetic vaccines for potent humoral and cellular immune responses. Nat Mater 10: 243-251.
    [85] Tyagi RK, Garg NK, Sahu T (2012) Vaccination strategies against malaria: novel carrier(s) more than a tour de force. J Control Release 162: 242-254.
    [86] Adler-Moore J, Munoz M, Kim H, et al. (2011) Characterization of the murine Th2 response to immunization with liposomal M2e influenza vaccine. Vaccine 29: 4460-4468.
    [87] Monslow MA, Elbashir S, Sullivan NL, et al. (2020) Immunogenicity generated by mRNA vaccine encoding VZV gE antigen is comparable to adjuvanted subunit vaccine and better than live attenuated vaccine in nonhuman primates. Vaccine 38: 5793-5802.
    [88] Erasmus JH, Archer J, Fuerte-Stone J, et al. (2020) Intramuscular delivery of replicon RNA encoding ZIKV-117 human monoclonal antibody protects against Zika virus infection. Mol Ther Methods Clin Dev 18: 402-414.
    [89] Freyn AW, da Silva JR, Rosado VC, et al. (2020) A multi-targeting, nucleoside-modified mRNA influenza virus vaccine provides broad protection in mice. Mol Ther 28: 1569-1584.
    [90] Lo MK, Spengler JR, Welch SR, et al. (2020) Evaluation of a single-dose nucleoside-modified messenger RNA vaccine encoding Hendra virus-soluble glycoprotein against lethal Nipah virus challenge in Syrian hamsters. J Infect Dis 221: S493-S498.
    [91] Yang T, Li C, Wang X, et al. (2020) Efficient hepatic delivery and protein expression enabled by optimized mRNA and ionizable lipid nanoparticle. Bioact Mater 5: 1053-1061.
    [92] Moyo N, Wee EG, Korber B, et al. (2020) Tetravalent immunogen assembled from conserved regions of HIV-1 and delivered as mRNA demonstrates potent preclinical T-cell immunogenicity and breadth. Vaccines (Basel) 8: 360.
    [93] Lou G, Anderluzzi G, Schmidt ST, et al. (2020) Delivery of self-amplifying mRNA vaccines by cationic lipid nanoparticles: The impact of cationic lipid selection. J Control Release 325: 370-379.
    [94] Mai Y, Guo J, Zhao Y, et al. (2020) Intranasal delivery of cationic liposome-protamine complex mRNA vaccine elicits effective anti-tumor immunity. Cell Immunol 354: 104143.
    [95] Eygeris Y, Patel S, Jozic A, et al. (2020) Deconvoluting lipid nanoparticle structure for messenger RNA delivery. Nano Lett 20: 4543-4549.
    [96] Van Hoecke L, Verbeke R, De Vlieger D, et al. (2020) mRNA encoding a bispecific single domain antibody construct protects against influenza A virus infection in mice. Mol Ther Nucleic Acids 20: 777-787.
    [97] Zhong Z, Mc Cafferty S, Combes F, et al. (2018) mRNA therapeutics deliver a hopeful message. Nano Today 23: 16-39.
    [98] Bogers WM, Oostermeijer H, Mooij P, et al. (2015) Potent immune responses in rhesus macaques induced by nonviral delivery of a self-amplifying RNA vaccine expressing HIV type 1 envelope with a cationic nanoemulsion. J Infect Dis 211: 947-955.
    [99] Jackson LA, Anderson EJ, Rouphael NG, et al. (2020) An mRNA vaccine against SARS-CoV-2—preliminary report. N Engl J Med. doi: 10.1056/NEJMoa2022483
    [100] Alberer M, Gnad-Vogt U, Hong HS, et al. (2017) Safety and immunogenicity of a mRNA rabies vaccine in healthy adults: an open-label, non-randomised, prospective, first-in-human phase 1 clinical trial. Lancet 390: 1511-1520.
    [101] Bahl K, Senn JJ, Yuzhakov O, et al. (2017) Preclinical and clinical demonstration of immunogenicity by mRNA vaccines against H10N8 and H7N9 influenza viruses. Mol Ther 25: 1316-1327.
    [102] Feldman RA, Fuhr R, Smolenov I, et al. (2019) mRNA vaccines against H10N8 and H7N9 influenza viruses of pandemic potential are immunogenic and well tolerated in healthy adults in phase 1 randomized clinical trials. Vaccine 37: 3326-3334.
    [103] Ding Y, Jiang Z, Saha K, et al. (2014) Gold nanoparticles for nucleic acid delivery. Mol Ther 22: 1075-1083.
    [104] Liu J, Miao L, Sui J, et al. (2020) Nanoparticle cancer vaccines: Design considerations and recent advances. Asian J Pharm Sci. doi: 10.1016/j.ajps.2019.10.006
    [105] Yeom JH, Ryou SM, Won M, et al. (2013) Inhibition of xenograft tumor growth by gold nanoparticle-DNA oligonucleotide conjugates-assisted delivery of BAX mRNA. PLoS One 8: e75369.
    [106] Azmi F, Ahmad Fuaad AAH, Skwarczynski M, et al. (2014) Recent progress in adjuvant discovery for peptide-based subunit vaccines. Hum Vaccin Immunother 10: 778-796.
    [107] Coffman RL, Sher A, Seder RA (2010) Vaccine adjuvants: Putting innate immunity to work. Immunity 33: 492-503.
    [108] Reed SG, Orr MT, Fox CB (2013) Key roles of adjuvants in modern vaccines. Nat Med 19: 1597-1608.
    [109] Oleszycka E, Lavelle EC (2014) Immunomodulatory properties of the vaccine adjuvant alum. Curr Opin Immunol 28: 1-5.
    [110] Alving CR (2009) Vaccine adjuvants. Vaccines for Biodefense and Emerging and Neglected Diseases London: Elsevier, 115-129.
    [111] Hussein WM, Liu TY, Skwarczynski M, et al. (2014) Toll-like receptor agonists: a patent review (2011–2013). Expert Opin Ther Pat 24: 453-470.
    [112] Montomoli E, Piccirella S, Khadang B, et al. (2011) Current adjuvants and new perspectives in vaccine formulation. Expert Rev Vaccines 10: 1053-1061.
    [113] Mamo T, Poland GA (2012) Nanovaccinology: The next generation of vaccines meets 21st century materials science and engineering. Vaccine 30: 6609-6611.
    [114] Banchereau J, Briere F, Caux C, et al. (2000) Immunobiology of dendritic cells. Annu Rev Immunol 18: 767-811.
    [115] Oyewumi MO, Kumar A, Cui ZR (2010) Nano-microparticles as immune adjuvants: correlating particle sizes and the resultant immune responses. Expert Rev Vaccines 9: 1095-1107.
    [116] Marasini N, Skwarczynski M, Toth I (2014) Oral delivery of nanoparticle-based vaccines. Expert Rev Vaccines 13: 1361-1376.
    [117] Kawai T, Akira S (2011) Toll-like receptors and their crosstalk with other innate receptors in infection and immunity. Immunity 34: 637-650.
    [118] Vasilichin VA, Tsymbal SA, Fakhardo AF, et al. (2020) Effects of metal oxide nanoparticles on Toll-like receptor mRNAs in human monocytes. Nanomaterials (Basel) 10: 127.
    [119] Roy R, Kumar D, Sharma A, et al. (2014) ZnO nanoparticles induced adjuvant effect via toll-like receptors and Src signaling in Balb/c mice. Toxicol Lett 230: 421-433.
    [120] De Temmerman M-L, Rejman J, Demeester J, et al. (2011) Particulate vaccines: on the quest for optimal delivery and immune response. Drug Discov Today 16: 569-582.
    [121] Hafner AM, Corthésy B, Merkle HP (2013) Particulate formulations for the delivery of poly(I: C) as vaccine adjuvant. Adv Drug Deliv Rev 65: 1386-1399.
  • © 2020 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)