Review

Survey on the application of deep learning in algorithmic trading

  • Algorithmic trading is one of the most closely watched directions in financial applications. Compared with traditional trading strategies, algorithmic trading applications perform forecasting and arbitrage with higher efficiency and more stable performance. Numerous studies have applied deep learning to algorithmic trading models for trading forecasting and analysis. In this article, we first summarize several deep learning methods that have shown good performance in algorithmic trading applications, and briefly introduce some applications of deep learning in algorithmic trading. We then provide a snapshot of the latest deep-learning-based algorithmic trading applications and show the different implementations of the developed algorithmic trading models. Finally, some possible directions for future research are suggested. The prime objectives of this paper are to provide a comprehensive overview of research progress on deep learning applications in algorithmic trading, and to benefit subsequent research on computer program trading systems.

    Citation: Yongfeng Wang, Guofeng Yan. Survey on the application of deep learning in algorithmic trading[J]. Data Science in Finance and Economics, 2021, 1(4): 345-361. doi: 10.3934/DSFE.2021019




    Since the emergence of the COVID-19 pandemic around December 2019, the outbreak has snowballed globally [1,2], and there is no clear sign that new confirmed cases and deaths are coming to an end. Although vaccines are being rolled out to curb the spread of the pandemic, mutations of the virus are already under way [3,4,5,6]. While the origin of the pandemic is still under debate [7], many researchers are studying it from different aspects and perspectives. These studies can be categorised mainly into three levels: the SARS-CoV-2 genetic level [8], the COVID-19 individual country level [9,10,11], and the continental level [12,13]. In this study, we focus on the latter two levels, for which many methods and techniques exist. For example, linear and non-linear growth models, together with 2-week-kernel-window regression, are exploited in modelling the exponential growth rate of COVID-19 confirmed cases [14], and are also generalised to non-linear modelling of the COVID-19 pandemic [15,16]. Some works focus on predicting the spread of COVID-19 by estimating the lead-lag effects between different countries via a time warping technique [17], while others use clustering analyses to group countries via epidemiological data such as active cases and active cases per population [18]. In addition, other studies tackle the relationship between economic variables and COVID-19 related variables [19,20], although their results show no relation between economic freedom and COVID-19 deaths and no relation between the performance of equity markets and COVID-19 cases and deaths.

    In this study, we aim to extract the features of daily biweekly growth rates of cases and deaths at the national and continental levels. We devise orthonormal bases based on Fourier analysis [21,22], using the Fourier coefficients as the potential features. For the national level, we import the global time series data and sample 117 countries over 109 days [23,24]. We then calculate the Euclidean distance matrices for the inner products between countries and between days. Based on the distance matrices, we calculate their variabilities to examine the distribution of the data. For the continental level, we import the biweekly changes of cases and deaths for 5 continents as well as the world, with time series data for 447 days. We then calculate their inner products with respect to the temporal frequencies and find the similarities of the extracted features between continents.

    At the national level, the biweekly data bear stronger temporal features than spatial features, i.e., as time goes by, the pandemic evolves more in the time dimension than in the space (or country-wise) dimension. Moreover, there is a strong concurrency between the features for biweekly changes of cases and of deaths, though there is no clear or stable trend in the extracted features. At the continental level, however, one observes a stable trend in the features of the biweekly changes. In addition, the extracted features of the continents are similar to one another, except for Asia, whose features bear no clear similarity to those of the other continents.

    Our approach is based on orthonormal bases, which serve as the potential features for the biweekly change of cases and deaths. This method is straightforward and easy to comprehend. The approach has two limitations: the extracted features are based on the hidden frequencies of the dynamical structure, to which it is hard to assign an interpretable meaning, and the fetched data are incomplete, due to missing data in the database. Nevertheless, the results provided in this study could help one map out the evolutionary features of COVID-19.

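    Let $\delta:\mathbb{N}\to\{0,1\}$ be a function such that $\delta(n)=0$ (or $\delta_n=0$) if $n\in 2\mathbb{N}$, and $\delta(n)=1$ if $n\in 2\mathbb{N}+1$. Given a set of point data $D=\{v\}\subseteq\mathbb{R}^N$, we would like to decompose each $v$ into some frequency-based vectors by Fourier analysis. The features of COVID-19 case and death growth rates are specified by the orthogonal frequency vectors $B_N=\{f_i=(f_{ij}:1\le j\le N)\}_{i=1}^{N}$, which are based on Fourier analysis, in particular Fourier series [22], where

    ● $f_{1j}=\frac{1}{\sqrt{N}}$ for all $1\le j\le N$;

    ● For any $2\le i\le N-1+\delta_N$,

    $$f_{ij}=\sqrt{\frac{2}{N}}\cos\left[\frac{\pi}{2}\delta_i-(i-\delta_i)\frac{\pi}{N}j\right];\tag{2.1}$$

    ● If $N\in 2\mathbb{N}$, then $f_{Nj}=\frac{1}{\sqrt{N}}\cos(j\pi)$ for all $1\le j\le N$.

    Now we have constructed an orthonormal basis $F_N=\{f_1,f_2,\dots,f_N\}$ as features for $\mathbb{R}^N$. Each $v$ can then be written as $v=\sum_{i=1}^{N}\langle v,f_i\rangle f_i$, where $\langle\cdot,\cdot\rangle$ is the inner product. The basis $B_N$ could also be represented by a matrix

    $$F_N=\begin{bmatrix}f_1\\ f_2\\ \vdots\\ f_N\end{bmatrix}=\begin{bmatrix}f_{11}&f_{12}&\cdots&f_{1N}\\ f_{21}&f_{22}&\cdots&f_{2N}\\ \vdots&\vdots&\ddots&\vdots\\ f_{N1}&f_{N2}&\cdots&f_{NN}\end{bmatrix},\tag{2.2}$$

    where each $f_{ij}$ is defined in Eq 2.1.

    Example 1. If $N$ is 5, then the matrix representation of the orthonormal basis $B_5$ is

    $$F_5=\begin{bmatrix}f_1\\ f_2\\ f_3\\ f_4\\ f_5\end{bmatrix}=\begin{bmatrix}0.447&0.447&0.447&0.447&0.447\\ 0.195&-0.512&-0.512&0.195&0.632\\ 0.602&0.372&-0.372&-0.602&0\\ -0.512&0.195&0.195&-0.512&0.632\\ 0.372&-0.602&0.602&-0.372&0\end{bmatrix},$$

    and the representation of a data column vector $v=(-3,14,5,8,-12)^{T}$ with respect to $B_5$ is calculated by $F_5v=[\langle v,f_i\rangle]_{i=1}^{5}$, which is the column vector (or 5-by-1 matrix) $(5.367,-16.334,-3.271,-6.434,-9.503)^{T}$.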
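    The construction in Eq 2.1 and Example 1 can be reproduced with a short sketch. The authors report using R; the Python below is only an illustration, and the function names (`fourier_basis`, `represent`, `is_orthonormal`) are hypothetical rather than taken from the paper.

```python
import math


def fourier_basis(N):
    """Return the rows f_1..f_N of F_N, using 1-based i and j as in Eq 2.1."""
    rows = []
    for i in range(1, N + 1):
        if i == 1:
            # Constant vector.
            row = [1 / math.sqrt(N)] * N
        elif i == N and N % 2 == 0:
            # Extra alternating vector that only exists for even N.
            row = [math.cos(j * math.pi) / math.sqrt(N) for j in range(1, N + 1)]
        else:
            d = i % 2  # delta_i: 0 for even i (cosine), 1 for odd i (sine)
            row = [
                math.sqrt(2 / N) * math.cos(math.pi / 2 * d - (i - d) * math.pi / N * j)
                for j in range(1, N + 1)
            ]
        rows.append(row)
    return rows


def represent(F, v):
    """Inner products <v, f_i> for every row f_i of F, i.e. the product F v."""
    return [sum(f_j * v_j for f_j, v_j in zip(f, v)) for f in F]


def is_orthonormal(F, tol=1e-9):
    """Check that the rows of F have unit length and are mutually orthogonal."""
    n = len(F)
    for a in range(n):
        for b in range(n):
            dot = sum(x * y for x, y in zip(F[a], F[b]))
            if abs(dot - (1.0 if a == b else 0.0)) > tol:
                return False
    return True


F5 = fourier_basis(5)
coeffs = represent(F5, [-3, 14, 5, 8, -12])
print([round(c, 3) for c in coeffs])  # [5.367, -16.334, -3.271, -6.434, -9.503]
```

    The printed coefficients agree with Example 1, and `is_orthonormal` can be used to confirm the orthonormality claim for both odd and even $N$.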

    There are two main parts of data collection and handling: one for individual countries (the national level) and the other for individual continents (the continental level). At both levels, we fetch the daily biweekly growth rates of confirmed COVID-19 cases and deaths from Our World in Data [23,24]. We then use R 4.1.0 to handle the data and implement the procedures.

    Sampled targets: national. After filtering out non-essential data and missing data, the effective sample consists of 117 countries over 109 days, as shown in Results. The days range from December 3rd, 2020 to May 31st, 2021. Though the sampled days are not consecutive (due to the missing data), the biweekly information can still compensate for such loss. The subsequent temporal and spatial analyses are based on these data.

    Sampled targets: continental. As for the continental data, we collect data for the world, Africa, Asia, Europe, North America and South America. The sampled days range from March 22nd, 2020 to June 11th, 2021, giving 447 days in total (this differs from the national level). The subsequent temporal analysis is based on these data (there is no spatial analysis at the continental level, due to the limited sample size).

    Notations: national. To facilitate the exposition, we introduce some notation. Let the sampled countries be indexed by $i=1,\dots,117$ and the sampled days by $t=1,\dots,109$; the days range from December 3rd, 2020 to May 31st, 2021. Let $c_i(t)$ and $d_i(t)$ be the daily biweekly growth rates of confirmed cases and deaths in country $i$ on day $t$, respectively, i.e.,

    $$c_i(t):=\frac{\mathrm{case}_{i,t+13}-\mathrm{case}_{i,t}}{\mathrm{case}_{i,t}};\tag{2.3}$$
    $$d_i(t):=\frac{\mathrm{death}_{i,t+13}-\mathrm{death}_{i,t}}{\mathrm{death}_{i,t}},\tag{2.4}$$

    where $\mathrm{case}_{i,t}$ and $\mathrm{death}_{i,t}$ denote the total confirmed cases and deaths for country $i$ at day $t$, respectively. We form temporal and spatial vectors by

    $$c_i=(c_i(1),\dots,c_i(109));\quad d_i=(d_i(1),\dots,d_i(109));$$
    $$v(t)=(c_1(t),c_2(t),\dots,c_{117}(t));\quad w(t)=(d_1(t),d_2(t),\dots,d_{117}(t)).$$

    The vectors $c_i$ and $d_i$ give the growth rates over time for a given country, and the vectors $v(t)$ and $w(t)$ give the growth rates of all countries at a given time.
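    As a rough illustration of Eqs 2.3 and 2.4 and of how the temporal and spatial vectors relate, the sketch below computes growth rates from made-up cumulative counts. The data, the function name `biweekly_growth`, and the choice to mark zero-base days as missing are all assumptions for illustration; the study itself downloads ready-made growth rates, whose tabulated values may be expressed as percentages rather than fractions.

```python
def biweekly_growth(totals, offset=13):
    """Growth rate (totals[t+offset] - totals[t]) / totals[t] for each valid t.

    Days whose base count is zero are returned as None (treated as missing).
    """
    rates = []
    for t in range(len(totals) - offset):
        base = totals[t]
        rates.append((totals[t + offset] - base) / base if base else None)
    return rates


# Hypothetical cumulative counts for two made-up countries over 20 days.
totals_by_country = {
    "A": [100 + 10 * t for t in range(20)],  # steady linear growth
    "B": [50] * 20,                          # no new cases at all
}

# Temporal vectors c_i: one growth-rate series per country.
temporal = [biweekly_growth(series) for series in totals_by_country.values()]

# Spatial vectors v(t): one entry per country for each day t (the transpose).
spatial = [list(day) for day in zip(*temporal)]

print(temporal[0][0], spatial[0])
```

    The same transposition applies to the death series, giving $d_i$ and $w(t)$.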

    Notations: continental. We use analogous notation at the continental level. Let the sampled continents (including the world) be indexed by $j=1,\dots,6$, and let the 447 sampled days range from March 22nd, 2020 to June 11th, 2021. We form temporal vectors for confirmed cases and deaths by

    $$x_j=(c_j(1),c_j(2),\dots,c_j(447));\quad y_j=(d_j(1),d_j(2),\dots,d_j(447)).$$

    For any $m$-by-$n$ matrix $A$, we use $\min(A)$ to denote the value $\min\{a_{ij}:1\le i\le m;\ 1\le j\le n\}$, and we define $\max(A)$ in the same manner. If $v$ is a vector, we define $\min(v)$ and $\max(v)$ in the same manner. The implementation goes as follows:

    (1) Extract and trim the source data.

    Extraction: national. Extract the daily biweekly growth rates of COVID-19 cases and deaths from the database and trim the data. The trimmed data consist of 109 time series data for 117 countries as shown in Table 1, which consists of two 117-by-109 matrices:

    $$\mathtt{Biweekly\_cases}=[c_i(t)]_{i=1:117}^{t=1:109};\quad \mathtt{Biweekly\_deaths}=[d_i(t)]_{i=1:117}^{t=1:109}.$$
    Table 1.  Time series data of 109 daily biweekly growth rates for 117 countries for confirmed cases (upper block) and deaths (lower block).

    Date     2020/12/3  2020/12/4  2020/12/5  ...  2021/5/29  2021/5/30  2021/5/31
    label    1          2          3          ...  107        108        109
    1        18.64      11.25      42.3       ...  28.2       0.72       58.99
    2        3.6        8.73       40.35      ...  28.25      4.58       73.84
    3        4.01       4.44       37.79      ...  27.62      3.88       94.77
    ...
    115      12.55      46.42      31.97      ...  7.82       21.65      28.91
    116      12.04      45.49      27.99      ...  5.36       41.76      37.63
    117      11.42      43.95      26.28      ...  3.47       51.09      39.43

    Date     2020/12/3  2020/12/4  2020/12/5  ...  2021/5/29  2021/5/30  2021/5/31
    label    1          2          3          ...  107        108        109
    1        101.04     24.1       30.54      ...  6.41       9.93       55.56
    2        65.14      1.01       27.75      ...  6.79       12.12      44.44
    3        56.3       9.31       29.03      ...  7.74       14.54      40
    ...
    115      28.85      35.65      23.24      ...  9.8        9.09       14.29
    116      35.95      36.89      23.75      ...  9.06       16.67      0
    117      36.77      33.49      24.35      ...  8.65       0          33.33


    Row $i$ of each matrix is regarded as the temporal vector $c_i$ or $d_i$, respectively, and column $t$ of each matrix is regarded as the spatial vector $v(t)$ or $w(t)$, respectively.

    Extraction: continental. The continental data are collected in two 6-by-447 matrices:

    $$\mathtt{Biweekly\_cont\_cases}=[x_j(\tau)]_{j=1:6}^{\tau=1:447};$$
    $$\mathtt{Biweekly\_cont\_deaths}=[y_j(\tau)]_{j=1:6}^{\tau=1:447}.$$

    (2) Specify the frequencies (features) for the imported data.

    Basis: national. In order to decompose ci and di into some fundamental features, we specify F109 as the corresponding features, whereas to decompose v(t) and w(t), we specify F117 as the corresponding features. The results are presented in Table 2.

    Table 2.  Orthonormal temporal frequencies for 109 days (upper block or F109) and orthonormal spatial frequencies for 117 countries (lower block or F117).
    temp. freq. ele. 1 ele. 2 ele. 3 ele. 107 ele. 108 ele. 109
    f1 0.1 0.1 0.1 0.1 0.1 0.1
    f2 0.14 0.13 0.13 0.13 0.14 0.14
    f3 0.01 0.02 0.02 0.02 0.01 0
    f107 0.01 0.02 0.03 0.02 0.01 0
    f108 0.14 0.14 0.13 0.14 0.14 0.14
    f109 0 0.01 0.01 0.01 0 0
    spatial. freq. ele. 1 ele. 2 ele. 3 ele. 115 ele. 116 ele. 117
    f1 0.09 0.09 0.09 0.09 0.09 0.09
    f2 0.13 0.13 0.13 0.13 0.13 0.13
    f3 0.01 0.01 0.02 0.01 0.01 0
    f115 0.01 -0.02 0.03 0.02 0.01 0
    f116 0.13 0.13 0.13 0.13 0.13 0.13
    f117 0 0.01 0.01 0.01 0 0


    Basis: continental. In order to decompose xj and yj into some fundamental features, we specify F447 as the corresponding features.

    (3) Compute the sets of the representations with respect to various bases.

    Representation: national. The temporal representations of $c_i$ and $d_i$ with respect to $F_{109}$ are calculated by

    $$\mathtt{IP\_cases\_time}=\{F_{109}c_i\}_{i=1}^{117},$$
    $$\mathtt{IP\_death\_time}=\{F_{109}d_i\}_{i=1}^{117};$$

    and the spatial representations of $v(t)$ and $w(t)$ with respect to $F_{117}$ are calculated by

    $$\mathtt{IP\_cases\_space}=\{F_{117}v(t)\}_{t=1}^{109},$$
    $$\mathtt{IP\_death\_space}=\{F_{117}w(t)\}_{t=1}^{109}.$$

    The results are presented in Results.

    Representation: continental. The temporal representations of $x_j$ and $y_j$ with respect to $F_{447}$ are calculated for the six sampled continents by

    $$\mathtt{IP\_cont\_cases\_time}=\{F_{447}x_j\}_{j=1}^{6},$$
    $$\mathtt{IP\_cont\_death\_time}=\{F_{447}y_j\}_{j=1}^{6}.$$

    (4) Compute the Euclidean distance matrices for the representations.

    Euclidean: national. The distances between temporal representations with respect to cases and deaths are calculated by

    $$\mathtt{dismat\_case\_time}=[d_E(F_{109}c_i,F_{109}c_j)]_{i,j=1}^{117};$$
    $$\mathtt{dismat\_death\_time}=[d_E(F_{109}d_i,F_{109}d_j)]_{i,j=1}^{117}.$$

    The distances between spatial representations with respect to cases and deaths are calculated by

    $$\mathtt{dismat\_case\_space}=[d_E(F_{117}v(t),F_{117}v(\tau))]_{t,\tau=1}^{109};$$
    $$\mathtt{dismat\_death\_space}=[d_E(F_{117}w(t),F_{117}w(\tau))]_{t,\tau=1}^{109},$$

    where $d_E$ is the usual Euclidean distance function. The results are presented in Results.

    Euclidean: continental. The distances between temporal representations with respect to cases and deaths are calculated for the six sampled continents by

    $$\mathtt{dismat\_cont\_case\_time}=[d_E(F_{447}x_j,F_{447}x_k)]_{j,k=1}^{6};$$
    $$\mathtt{dismat\_cont\_death\_time}=[d_E(F_{447}y_j,F_{447}y_k)]_{j,k=1}^{6}.$$
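    A minimal sketch of step (4) is given below, assuming the representations are already available as plain lists of numbers. The toy vectors and the helper names (`euclidean`, `distance_matrix`) are hypothetical and are not the authors' R code.

```python
import math


def euclidean(u, v):
    """Usual Euclidean distance d_E between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def distance_matrix(vectors):
    """Symmetric matrix whose (i, j) cell is d_E(vectors[i], vectors[j])."""
    n = len(vectors)
    return [[euclidean(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


# Toy representations standing in for F c_1, F c_2, F c_3.
reps = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
for row in distance_matrix(reps):
    print(row)
```

    Because the basis is orthonormal, distances between representations should coincide, up to rounding, with distances between the untransformed vectors, which offers a cheap sanity check on an implementation.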

    (5) Compute the average variability based on the above distance matrices.

    Average variability: national. For each country $i$, the temporal variabilities for confirmed cases and deaths are computed by

    $$\mathtt{var\_case\_time}[i]=\frac{\sum_{j=1}^{117}d_E(F_{109}c_i,F_{109}c_j)}{109};$$
    $$\mathtt{var\_death\_time}[i]=\frac{\sum_{j=1}^{117}d_E(F_{109}d_i,F_{109}d_j)}{109};$$

    and for each day $t$, the spatial variabilities for confirmed cases and deaths are computed by

    $$\mathtt{var\_case\_space}[t]=\frac{\sum_{\tau=1}^{109}d_E(F_{117}v(t),F_{117}v(\tau))}{117};$$
    $$\mathtt{var\_death\_space}[t]=\frac{\sum_{\tau=1}^{109}d_E(F_{117}w(t),F_{117}w(\tau))}{117}.$$

    The results are presented in Results.

    Average variability: continental. For each continent $j$, the temporal variabilities for confirmed cases and deaths are computed by

    $$\mathtt{var\_cont\_case\_time}[j]=\frac{\sum_{k=1}^{6}d_E(F_{447}x_j,F_{447}x_k)}{447};$$
    $$\mathtt{var\_cont\_death\_time}[j]=\frac{\sum_{k=1}^{6}d_E(F_{447}y_j,F_{447}y_k)}{447}.$$
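    The variability formulas in step (5) sum each row of a distance matrix and divide by the length of the underlying vectors (109, 117 or 447), not by the number of items being compared. The sketch below keeps that divisor as an explicit parameter; the function name `average_variability` and the toy matrix are illustrative assumptions.

```python
def average_variability(dist, dim):
    """Row sums of a distance matrix divided by `dim`.

    `dim` is the vector length used as the divisor in the text
    (e.g. 109 for the national temporal case), passed in explicitly.
    """
    if dim <= 0:
        raise ValueError("dim must be positive")
    return [sum(row) / dim for row in dist]


# Toy 3-by-3 distance matrix and a made-up vector length of 2.
toy_dist = [
    [0.0, 5.0, 10.0],
    [5.0, 0.0, 5.0],
    [10.0, 5.0, 0.0],
]
print(average_variability(toy_dist, 2))  # [7.5, 5.0, 7.5]
```

    Passing the number of rows as `dim` instead would give a plain row average, should that normalisation be preferred.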

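    (6) Unify the national temporal and spatial variabilities of cases and deaths. For each country $i$ and each day $t$, the unified temporal and spatial variabilities for cases and deaths are defined by

    $$\mathtt{bvar\_case\_time}[i]=\frac{\mathtt{var\_case\_time}[i]-mn_1}{mx_1-mn_1};$$

    $$\mathtt{bvar\_death\_time}[i]=\frac{\mathtt{var\_death\_time}[i]-mn_2}{mx_2-mn_2};$$

    $$\mathtt{bvar\_case\_space}[t]=\frac{\mathtt{var\_case\_space}[t]-mn_3}{mx_3-mn_3};$$

    $$\mathtt{bvar\_death\_space}[t]=\frac{\mathtt{var\_death\_space}[t]-mn_4}{mx_4-mn_4},$$

    where

    $$mn_1=\min(\mathtt{var\_case\_time});\quad mx_1=\max(\mathtt{var\_case\_time});$$

    $$mn_2=\min(\mathtt{var\_death\_time});\quad mx_2=\max(\mathtt{var\_death\_time});$$

    $$mn_3=\min(\mathtt{var\_case\_space});\quad mx_3=\max(\mathtt{var\_case\_space});$$

    $$mn_4=\min(\mathtt{var\_death\_space});\quad mx_4=\max(\mathtt{var\_death\_space}).$$

    The results are shown in Results.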
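    The unification in step (6) is ordinary min-max scaling to the interval [0, 1]. The sketch below applies it to the six `var_case_time` values printed in Table A6; since the minimum and maximum in the study are taken over all 117 countries, the scaled numbers here will not match the paper's figures. The function name `min_max_scale` is hypothetical.

```python
def min_max_scale(values):
    """Map values linearly so that the minimum becomes 0 and the maximum 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant vector")
    return [(x - lo) / (hi - lo) for x in values]


# Only the six var_case_time entries shown in Table A6 (countries 1-3, 115-117).
sample = [1025.64, 1037.49, 779.02, 1145.40, 806.55, 965.68]
scaled = min_max_scale(sample)
print([round(s, 3) for s in scaled])
```

    Scaling each variability vector separately puts cases and deaths on a common range, so their distributions can be compared in one plot.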

    (7) Unify the temporal representations with respect to continental confirmed cases and deaths, by matrices whose $(i,j)$ cells are defined by

    $$\frac{\sigma_{ij}-\min(\mathtt{IP\_cont\_cases\_time})}{\max(\mathtt{IP\_cont\_cases\_time})-\min(\mathtt{IP\_cont\_cases\_time})};$$
    $$\frac{\beta_{ij}-\min(\mathtt{IP\_cont\_death\_time})}{\max(\mathtt{IP\_cont\_death\_time})-\min(\mathtt{IP\_cont\_death\_time})},$$

    where $\sigma_{ij}$ and $\beta_{ij}$ denote the values in the $(i,j)$ cells of $\mathtt{IP\_cont\_cases\_time}$ and $\mathtt{IP\_cont\_death\_time}$, respectively. The results are visualised by figures in Results.

    There are two main parts of results shown in this section: national results and continental results.

    National results. Based on the method mentioned in section 2, we identify the temporal orthonormal frequencies and spatial ones as shown in Table 2.

    The computed inner products at the country level, which serve as the values of the extracted features, for daily biweekly growth rates of cases and deaths with respect to temporal frequencies are shown in Figure 1. Similarly, the computed inner products at the country level for daily biweekly growth rates of cases and deaths with respect to spatial frequencies are shown in Figure 2. Their scaled variabilities are plotted in Figure 3.

    Figure 1.  Inner products between growth rates of cases (in solid line) over 109 temporal frequencies; and inner products between growth rates of deaths (in dotted line) over 109 temporal frequencies for some demonstrative countries: Afghanistan, Albania, Algeria, Uruguay, Zambia, and Zimbabwe.
    Figure 2.  Inner products between growth rates of cases (in solid line) over 117 spatial frequencies; and inner products between growth rates of deaths (in dotted line) over 117 spatial frequencies for some demonstrative dates: 2020/12/3, 2020/12/4, 2020/12/5, 2021/5/29, 2021/5/30, and 2021/5/31.
    Figure 3.  Unified temporal and spatial variabilities of daily biweekly growth rates of cases and deaths.

    Continental results. According to the obtained data, we study and compare the continental features of daily biweekly growth rates of confirmed cases and deaths for Africa, Asia, Europe, North America, South America and the World. Unlike the national data, which contain gaps, the continental data are complete. We take the samples from March 22nd, 2020 to June 11th, 2021, which gives 447 days for the analysis. The cosine values that measure the similarities between the representations of the continents are shown in Table 3. The results of the unified inner products with respect to confirmed cases and deaths are plotted in Figures 4 and 5, respectively.

    Table 3.  Cosine values (similarities) between World, Africa, Asia, Europe, North America (No. Am.), and South America (So. Am.).

    Upper block:
             World  Africa  Asia   Europe  No. Am.  So. Am.
    World    1.000  0.963   0.638  0.923   0.938    0.890
    Africa   0.963  1.000   0.484  0.926   0.965    0.941
    Asia     0.638  0.484   1.000  0.391   0.399    0.356
    Europe   0.923  0.926   0.391  1.000   0.968    0.956
    No. Am.  0.938  0.965   0.399  0.968   1.000    0.983
    So. Am.  0.890  0.941   0.356  0.956   0.983    1.000

    Lower block:
             World  Africa  Asia   Europe  No. Am.  So. Am.
    World    1.000  0.895   0.647  0.936   0.978    0.972
    Africa   0.895  1.000   0.553  0.966   0.843    0.891
    Asia     0.647  0.553   1.000  0.547   0.596    0.615
    Europe   0.936  0.966   0.547  1.000   0.893    0.917
    No. Am.  0.978  0.843   0.596  0.893   1.000    0.967
    So. Am.  0.972  0.891   0.615  0.917   0.967    1.000

    Figure 4.  Unified inner product, or UIP, for World, Africa, Asia, Europe, North and South America with respect to biweekly growth rates of cases.
    Figure 5.  Unified inner product, or UIP, for world, Africa, Asia, Europe, North and South America with respect to daily biweekly growth rates of deaths.
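    The cosine values reported in Table 3 measure the angle between two representation vectors. A minimal sketch of that similarity follows; the text does not spell out the exact computation, so the function `cosine_similarity` and the toy vectors are assumptions for illustration only.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between u and v: <u, v> / (|u| |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy vectors: the first pair is parallel, the second pair is orthogonal.
print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(cosine_similarity([3.0, 0.0, 0.0], [0.0, 4.0, 0.0]))
```

    Values close to 1 indicate representations pointing in nearly the same direction, which is how the high entries in Table 3 can be read.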

    Other auxiliary results that support the plotting of the graphs are also appended in Appendix. The names of the sampled 117 countries are provided in Tables A1 and A2. The dates of the sampled days are provided in Figure A1. The tabulated results for inner product of temporal and spatial frequencies on a national level are provided in Table A3. The tabulated results for inner product of temporal frequencies on a continental level are provided in Table A4. The Euclidean distance matrices for temporal and spatial representations with respect to confirmed cases and deaths are tabulated in Table A5 and their average variabilities are tabulated in Table A6.

    Summaries of results. Based on the previous tables and figures, we have the following results.

    (1) From Figures 1 and 2, one observes that the temporal features are much more distinct than the spatial features, i.e., if one fixes a day and extracts the features from the spatial frequencies, one obtains less distinct features than when fixing a country and extracting the features from the temporal frequencies. This indicates that SARS-CoV-2 evolves and mutates mainly over time rather than across space.

    (2) For individual countries, the features for the biweekly changes of cases are almost concurrent with those of deaths. This indicates that biweekly changes of cases and deaths share similar features. In some sense, the change of deaths is still in tune with the change of confirmed cases, i.e., there is no substantial change in their relationship.

    (3) For individual countries, the extracted features go up and down intermittently and there is no obvious trend. This indicates that the virus is still very versatile and that it is hard to capture fixed features at the country level.

    (4) From Figure 3, one observes clear similarities, in terms of variabilities, between the daily biweekly growth rates of cases and of deaths under temporal frequencies. Moreover, the distribution of the overall data is not condensed: the labelled countries in the middle are scattered across the whole data. This indicates that the diversity of daily biweekly growth rates of cases and deaths across countries is still very high.

    (5) From Figure 3, the daily biweekly growth rates of deaths with respect to the spatial frequencies are fairly concentrated. This indicates that the extracted features regarding deaths are stable, i.e., there are clearer and more stable spatial features for daily biweekly growth rates of deaths.

    (6) Comparing the individual graphs in Figures 4 and 5, they bear much the same shape but on a different scale, with the features for deaths being more pronounced (this is also observed at the country level, as noted in the first result above). This indicates that there is a very clear trend of features regarding daily biweekly growth rates at the continental level (in stark contrast to the third result above).

    (7) From Figures 4 and 5, the higher values of inner products lie at both endpoints for the biweekly change of cases and deaths, i.e., at low and high temporal frequencies, for all the continents, except for the biweekly change of deaths in Asia. This indicates that the evolutionary patterns in Asia are very distinct from those of the other continents.

    (8) From Table 3, the extracted features of the continents are all very similar to one another, except for Asia. This echoes the above result.

    In this study, we identify the features of daily biweekly growth rates of COVID-19 confirmed cases and deaths via orthonormal bases (features) derived from Fourier analysis. We then analyse the inner products, which represent the levels of the chosen features. The variabilities show that the levels of deaths under spatial frequencies are much more concentrated than the others. The generated results are summarised in Section 3. This study has some limitations, which suggest future improvements:

    ● The associated meanings of the orthonormal features from Fourier analysis are not yet fully explored;

    ● We use the Euclidean metric to measure the distances between features, which are then used to calculate the variabilities. The Euclidean metric is noted for its geographical properties, but it may not be the most suitable in the context of frequencies. One could further introduce other metrics and apply machine learning techniques to find the optimal ones;

    ● In this study, we choose the daily biweekly growth rates of confirmed cases and deaths as our research sources. This tells only one side of the story. To obtain a fuller picture of the dynamical features, one could add other variables for comparison.

    This work is supported by the Humanities and Social Science Research Planning Fund Project under the Ministry of Education of China (No. 20XJAGAT001).

    No potential conflict of interest was reported by the authors.

    Figure A1.  Sampled dates.
    Table A1.  Country codes: part one.
    1 Afghanistan, 2 Albania, 3 Algeria, 4 Argentina
    5 Armenia, 6 Austria, 7 Azerbaijan, 8 Bahrain
    9 Bangladesh, 10 Belarus, 11 Belgium, 12 Bolivia
    13 Bosnia and Herzegovina, 14 Brazil, 15 Bulgaria, 16 Cameroon
    17 Canada, 18 Cape Verde, 19 Chile, 20 Colombia
    21 Congo, 22 Costa Rica, 23 Cote d'Ivoire, 24 Croatia
    25 Cuba, 26 Cyprus, 27 Czechia, 28 Congo
    29 Denmark, 30 Dominican Republic, 31 Ecuador, 32 Egypt
    33 El Salvador, 34 Estonia, 35 Ethiopia, 36 Finland
    37 France, 38 Gabon, 39 Georgia, 40 Germany
    41 Ghana, 42 Greece, 43 Guatemala, 44 Guinea
    45 Guyana, 46 Honduras, 47 Hungary, 48 India
    49 Indonesia, 50 Iran, 51 Iraq, 52 Ireland
    53 Israel, 54 Italy, 55 Jamaica, 56 Japan
    57 Jordan, 58 Kazakhstan, 59 Kenya, 60 Kosovo

    Table A2.  Country codes: part two.
    61 Kuwait, 62 Kyrgyzstan, 63 Latvia, 64 Lebanon
    65 Lithuania, 66 Luxembourg, 67 Madagascar, 68 Malaysia
    69 Maldives, 70 Malta, 71 Mauritania, 72 Mexico
    73 Moldova, 74 Morocco, 75 Mozambique, 76 Nepal
    77 Netherlands, 78 Nicaragua, 79 Niger, 80 Nigeria
    81 North Macedonia, 82 Norway, 83 Oman, 84 Pakistan
    85 Palestine, 86 Panama, 87 Paraguay, 88 Peru
    89 Philippines, 90 Poland, 91 Portugal, 92 Qatar
    93 Romania, 94 Russia, 95 Rwanda, 96 Saudi Arabia
    97 Senegal, 98 Serbia, 99 Somalia, 100 South Africa
    101 South Korea, 102 Spain, 103 Sri Lanka, 104 Sudan
    105 Sweden, 106 Switzerland, 107 Syria, 108 Togo
    109 Turkey, 110 Uganda, 111 Ukraine, 112 United Arab Emirates
    113 United Kingdom, 114 United States, 115 Uruguay, 116 Zambia
    117 Zimbabwe

    Table A3.  Inner products w.r.t. temporal (case: upper top and death: upper bottom blocks) and spatial (case: lower top and death: lower bottom blocks) frequencies at a national level.
    1 2 3 107 108 109
    1 110.53 62.66 93.32 77.81 39.04 18.77
    2 112.67 46.17 92.52 87.26 47.41 12.92
    3 12.98 67.61 27.31 51.41 43.11 5.06
    115 17.59 58.27 134.11 114.95 23.12 152.09
    116 19.65 8.52 79.69 35.33 36.47 65.35
    117 2.64 22.75 108.62 92.29 32.71 94.36
    1 366.35 1.46 210.92 116.03 131.95 126.11
    2 326.53 17.45 208.92 133.41 233.80 208.15
    3 0.74 24.33 18.58 8.17 78.58 108.78
    115 25.84 105.22 122.77 199.31 12.66 82.39
    116 32.41 62.45 92.73 137.71 11.78 56.31
    117 39.29 124.70 296.81 359.07 65.38 202.47
    1 2 3 115 116 117
    1 4.15 291.26 152.02 189.07 356.05 209.52
    2 92.29 115.53 61.74 38.31 228.48 215.91
    3 31.84 114.23 93.34 46.84 65.16 26.38
    107 208.69 103.92 32.68 169.51 233.38 219.35
    108 217.87 251.94 125.90 238.90 130.16 52.03
    109 170.73 107.86 416.69 196.56 40.43 163.64
    1 20.39 145.38 106.48 174.21 227.71 182.89
    2 4.29 118.47 61.71 78.04 149.29 94.80
    3 56.22 97.81 90.43 132.01 38.92 2.25
    107 285.28 85.01 1.29 99.33 401.99 407.84
    108 88.85 167.76 108.55 175.93 188.25 86.94
    109 91.87 139.50 262.08 80.87 111.20 10.58

    Table A4.  Temporal inner product for continents (World, Africa, Asia, Europe, North and South America) w.r.t. daily biweekly growth rates of cases (upper block) and deaths (lower block) from March 22nd, 2020 to June 11th, 2021 (447 days).
    1 2 3 445 446 447
    World 653.77 686.84 700.11 27.98 27.77 26.93
    Africa 1551.22 1818.89 2003.92 38.31 43.08 44.42
    Asia 51.47 59.75 68.83 40.74 41.06 41.41
    Europe 1234.23 1118.72 1016.35 29.31 28.57 28.50
    North.America 6319.48 7234.23 6924.07 31.72 33.09 31.97
    South.America 5602.78 6116.46 5568.32 1.87 1.74 4.91
    World 731.76 842.64 854.91 14.38 12.32 11.88
    Africa 4600.00 5500.00 3000.00 5.26 4.38 5.22
    Asia 113.03 145.76 157.64 24.24 19.13 18.54
    Europe 1980.05 1806.36 1531.68 30.64 29.34 28.41
    North.America 2823.81 3439.13 3551.72 21.91 6.88 4.41
    South.America 2533.33 3200.00 4200.00 2.92 1.13 0.67

    Table A5.  Distance matrices for daily biweekly growth rates of cases (uppermost block) and deaths (2nd block) w.r.t. temporal frequencies and the ones of cases (3rd block) and deaths (bottommost block) w.r.t. spatial frequencies.
    Cases, temporal frequencies
         1          2          3          115        116        117
    1    0          106.7561   662.1251   1228.0185  873.9556   1040.8007
    2    106.7561   0          670.5732   1240.0159  892.0054   1054.962
    3    662.1251   670.5732   0          1040.3623  585.0852   813.6218
    115  1228.0185  1240.0159  1040.3623  0          607.239    340.4118
    116  873.9556   892.0054   585.0852   607.239    0          386.7496
    117  1040.8007  1054.962   813.6218   340.4118   386.7496   0
    Deaths, temporal frequencies
         1          2          3          115        116        117
    1    0          701.2076   1977.59    2120.7301  2032.9264  2635.872
    2    701.2076   0          2043.929   2342.249   2269.9992  2802.319
    3    1977.5901  2043.9292  0          1178.0557  1054.5304  1942.125
    115  2120.7301  2342.249   1178.056   0          328.4664   1126.271
    116  2032.9264  2269.9992  1054.53    328.4664   0          1355.298
    117  2635.8716  2802.3189  1942.125   1126.2709  1355.2977  0
    Cases, spatial frequencies
         1          2          3          107        108        109
    1    0          1001.7219  655.3515   898.4473   1291.1967  1139.945
    2    1001.7219  0          524.0962   903.7544   848.0297   1166.972
    3    655.3515   524.0962   0          717.0677   908.4058   1151.471
    107  898.4473   903.7544   717.0677   0          778.5017   1172.124
    108  1291.1967  848.0297   908.4058   778.5017   0          1297.228
    109  1139.9453  1166.9716  1151.4709  1172.1237  1297.2278  0
    Deaths, spatial frequencies
         1          2          3          107        108        109
    1    0          1013.4563  770.5213   1382.64    1138.3808  1099.3026
    2    1013.4563  0          519.0499   1168.676   418.6043   776.2315
    3    770.5213   519.0499   0          1206.915   629.7284   842.2901
    107  1382.6399  1168.6759  1206.915   0          1158.8305  1369.2354
    108  1138.3808  418.6043   629.7284   1158.83    0          808.8633
    109  1099.3026  776.2315   842.2901   1369.235   808.8633   0

    Table A6.  Variability for 117 countries with respect to daily biweekly growth rates of cases and deaths under temporal frequencies, and variability for 109 days with respect to daily biweekly growth rates of cases and deaths under spatial frequencies.
    Country            1        2        3        115      116      117
    var_case_time      1025.64  1037.49  779.02   1145.40  806.55   965.68
    var_death_time     2406.49  2610.72  1546.89  1582.22  1450.46  2249.04
    Day                1        2        3        107      108      109
    var_case_country   923.10   763.91   657.83   893.60   1029.93  1204.01
    var_death_country  1277.72  1019.70  998.01   1477.71  1110.46  1216.24



    [1] Baek Y, Kim H (2018) ModAugNet: A new forecasting framework for stock market index value with an overfitting prevention LSTM module and a prediction LSTM module. Expert Syst Appl 113: 457–480. doi: 10.1016/j.eswa.2018.07.019
    [2] Boehmer E, Fong K, Wu J (2012) International evidence on algorithmic trading. In AFA 2013 San Diego Meetings Paper.
    [3] Chen C, Zhang P, Liu Y, et al. (2020) Financial quantitative investment using convolutional neural network and deep learning technology. Neurocomputing 390: 384–390. doi: 10.1016/j.neucom.2019.09.092
    [4] Chen J, Chen W, Huang C, et al. (2016) Financial Time-Series Data Analysis Using Deep Convolutional Neural Networks. In 2016 7th International Conference on Cloud Computing and Big Data (CCBD), 87–92.
    [5] Chen K, Zhou Y, Dai F (2015) A LSTM-based method for stock returns prediction: A case study of China stock market. In 2015 IEEE International Conference on Big Data (Big Data), 2823–2824.
    [6] Chen S, He H (2018) Stock Prediction Using Convolutional Neural Network. In IOP Conference Series: Materials Science and Engineering, 435: 012026.
    [7] Chen Y, Chen W, Huang S (2018) Developing Arbitrage Strategy in High-frequency Pairs Trading with Filterbank CNN Algorithm. In 2018 IEEE International Conference on Agents (ICA), 113–116.
    [8] Cybenko G (1989) Approximation by superpositions of a sigmoidal function. Math Control Signals Syst 2: 303–314. doi: 10.1007/BF02551274
    [9] Day M, Lee C (2016) Deep learning for financial sentiment analysis on finance news providers. In 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 1127–1134.
    [10] Deng Y, Bao F, Kong Y, et al. (2017) Deep Direct Reinforcement Learning for Financial Signal Representation and Trading. IEEE Trans Neural Networks Learn Syst 28: 653–664. doi: 10.1109/TNNLS.2016.2522401
    [11] Dixon M, Klabjan D, Bang J (2017) Classification-based financial markets prediction using deep neural networks. Algorithmic Financ 6: 67–77. doi: 10.3233/AF-170176
    [12] Doering J, Fairbank M, Markose S (2017) Convolutional neural networks applied to high-frequency market microstructure forecasting. In 2017 9th Computer Science and Electronic Engineering (CEEC), 31–36.
    [13] Fang Y, Chen J, Xue Z (2019) Research on quantitative investment strategies based on deep learning. Algorithms 12: 35. doi: 10.3390/a12020035
    [14] Gudelek M, Boluk S, Ozbayoglu A (2017) A deep learning based stock trading model with 2-D CNN trend detection. In 2017 IEEE Symposium Series on Computational Intelligence (SSCI), 1–8.
    [15] Gunduz H, Yaslan Y, Cataltepe Z (2017) Intraday prediction of borsa Istanbul using convolutional neural networks and feature correlations. Knowl Based Syst 137: 138–148. doi: 10.1016/j.knosys.2017.09.023
    [16] Hendershott T, Jones C, Menkveld A (2011) Does algorithmic trading improve liquidity? J Financ 66: 1–33. doi: 10.1111/j.1540-6261.2010.01624.x
    [17] Hendershott T, Riordan R (2009) Algorithmic trading and information. University of California, Berkeley.
    [18] Hinton G, Salakhutdinov R (2006) Reducing the Dimensionality of Data with Neural Networks. Science 313: 504–507. doi: 10.1126/science.1127647
    [19] Hochreiter S, Schmidhuber J (1997) Long Short-Term Memory. Neural Comput 9: 1735–1780. doi: 10.1162/neco.1997.9.8.1735
    [20] Lee H, Grosse R, Ranganath R, et al. (2009) Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In Proceedings of the 26th Annual International Conference on Machine Learning, 609–616.
    [21] Hoseinzade E, Haratizadeh S (2019) CNNpred: CNN-based stock market prediction using a diverse set of variables. Expert Syst Appl 129: 273–285. doi: 10.1016/j.eswa.2019.03.029
    [22] Hossain M, Karim R, Thulasiram R, et al. (2018) Hybrid Deep Learning Model for Stock Price Prediction. In 2018 IEEE Symposium Series on Computational Intelligence (SSCI), 1837–1844.
    [23] Jeong G, Kim H (2019) Improving financial trading decisions using deep Q-learning: Predicting the number of shares, action strategies, and transfer learning. Expert Syst Appl 117: 125–138. doi: 10.1016/j.eswa.2018.09.036
    [24] Ji S, Kim J, Im H (2019) A Comparative Study of Bitcoin Price Prediction Using Deep Learning. Mathematics 7: 898. doi: 10.3390/math7100898
    [25] Kalman B, Kwasny S (1992) Why tanh: choosing a sigmoidal function. In [Proceedings 1992] IJCNN International Joint Conference on Neural Networks, 4: 578–581.
    [26] Kim S, Kang M (2019) Financial series prediction using Attention LSTM. arXiv preprint arXiv:1902.10877.
    [27] Krauss C, Do X, Huck N (2017) Deep neural networks, gradient-boosted trees, random forests: Statistical arbitrage on the S&P 500. Eur J Oper Res 259: 689–702. doi: 10.1016/j.ejor.2016.10.031
    [28] LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521: 436–444. doi: 10.1038/nature14539
    [29] Li Y, Zheng W, Zheng Z (2019) Deep Robust Reinforcement Learning for Practical Algorithmic Trading. IEEE Access 7: 108014–108022. doi: 10.1109/ACCESS.2019.2932789
    [30] Lin B, Chu W, Wang C (2018) Application of Stock Analysis Using Deep Learning. In 2018 7th International Congress on Advanced Applied Informatics (IIAI-AAI), 612–617.
    [31] Liu S, Zhang C, Ma J (2017) CNN-LSTM neural network model for quantitative strategy analysis in stock markets. In International Conference on Neural Information Processing, Springer, 198–206.
    [32] Lu W, Li J, Li Y, et al. (2020) A CNN-LSTM-Based Model to Forecast Stock Prices. Complexity 2020: 6622927.
    [33] Luo S, Lin X, Zheng Z (2019) A novel CNN-DDPG based AI-trader: Performance and roles in business operations. Transp Res Part E Logist Transp Rev 131: 68–79. doi: 10.1016/j.tre.2019.09.013
    [34] Lv D, Yuan S, Li M, et al. (2019) An Empirical Study of Machine Learning Algorithms for Stock Daily Trading Strategy. Math Probl Eng 2019: 7816154.
    [35] Mikolov T, Karafiát M, Burget L, et al. (2010) Recurrent neural network based language model. In Interspeech, Makuhari, 1045–1048.
    [36] Mudassir M, Bennbaia S, Unal D, et al. (2020) Time-series forecasting of Bitcoin prices using high-dimensional features: a machine learning approach. Neural Comput Appl 2020: 1–15.
    [37] Nair V, Hinton G (2010) Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML), 807–814.
    [38] Nelson D, Pereira A, de Oliveira R (2017) Stock market's price movement prediction with LSTM neural networks. In 2017 International Joint Conference on Neural Networks (IJCNN), 1419–1426.
    [39] Nuti G, Mirghaemi M, Treleaven P, et al. (2011) Algorithmic Trading. Computer 44: 61–69. doi: 10.1109/MC.2011.31
    [40] Ozbayoglu A, Gudelek M, Sezer O (2020) Deep learning for financial applications: A survey. Appl Soft Comput 93: 106384. doi: 10.1016/j.asoc.2020.106384
    [41] Selvin S, Vinayakumar R, Gopalakrishnan E, et al. (2017) Stock price prediction using LSTM, RNN and CNN-sliding window model. In 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), 1643–1647.
    [42] Serrano W (2018) Fintech Model: The Random Neural Network with Genetic Algorithm. Proced Comput Sci 126: 537–546. doi: 10.1016/j.procs.2018.07.288
    [43] Sezer O, Ozbayoglu M, Dogdu E (2017) A Deep Neural-Network Based Stock Trading System Based on Evolutionary Optimized Technical Analysis Parameters. Proced Comput Sci 114: 473–480. doi: 10.1016/j.procs.2017.09.031
    [44] Sezer O, Ozbayoglu A (2018) Algorithmic financial trading with deep convolutional neural networks: Time series to image conversion approach. Appl Soft Comput 70: 525–538. doi: 10.1016/j.asoc.2018.04.024
    [45] Sezer O, Ozbayoglu A (2019) Financial trading model with stock bar chart image time series with deep convolutional neural networks. arXiv preprint arXiv:1903.04610.
    [46] Shah D, Campbell W, Zulkernine F (2018) A Comparative Study of LSTM and DNN for Stock Market Forecasting. In 2018 IEEE International Conference on Big Data (Big Data), 4148–4155.
    [47] Singh R, Srivastava S (2017) Stock prediction using deep learning. Multimed Tools Appl 76: 18569–18584. doi: 10.1007/s11042-016-4159-7
    [48] Sirignano J, Cont R (2019) Universal features of price formation in financial markets: perspectives from deep learning. Quant Financ 19: 1449–1459. doi: 10.1080/14697688.2019.1622295
    [49] Sohangir S, Wang D, Pomeranets A, et al. (2018) Big Data: Deep Learning for financial sentiment analysis. J Big Data 5: 1–25. doi: 10.1186/s40537-017-0111-6
    [50] Sutskever I, Hinton G, Taylor G (2009) The recurrent temporal restricted Boltzmann machine. In Advances in Neural Information Processing Systems, 1601–1608.
    [51] Théate T, Ernst D (2021) An application of deep reinforcement learning to algorithmic trading. Expert Syst Appl 173: 114632. doi: 10.1016/j.eswa.2021.114632
    [52] Treleaven P, Galas M, Lalchand V (2013) Algorithmic trading review. Commun ACM 56: 76–85. doi: 10.1145/2500117
    [53] Troiano L, Villa E, Loia V (2018) Replicating a Trading Strategy by Means of LSTM for Financial Industry Applications. IEEE Trans Ind Inf 14: 3226–3234. doi: 10.1109/TII.2018.2811377
    [54] Wang Z, Lu W, Zhang K, et al. (2021) MCTG: Multi-frequency continuous-share trading algorithm with GARCH based on deep reinforcement learning. arXiv preprint arXiv:2105.03625.
    [55] Xie M, Li H, Zhao Y (2020) Blockchain financial investment based on deep learning network algorithm. J Comput Appl Math 372: 112723. doi: 10.1016/j.cam.2020.112723
    [56] Zhao Z, Rao R, Tu S, et al. (2017) Time-Weighted LSTM Model with Redefined Labeling for Stock Trend Prediction. In 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI), 1210–1217.
    [57] Zou Z, Qu Z (2020) Using LSTM in Stock prediction and Quantitative Trading. CS230: Deep Learning, Winter.
  • © 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)