This study evaluates the performance of Exchange-Traded Funds (ETFs) using several approaches to calculating the tracking error. The aim of the paper is twofold: to evaluate the performance of ETFs relative to their benchmark indexes, and to identify any relationship between this performance and both geographical location and the degree of market development. The research was conducted on 18 different ETFs issued by iShares, six for each of three regions: the Americas, Asia and Europe. The sole criterion for ETF selection was the benchmark. All data were collected at daily frequency and cover the period from January 2013 to December 2019. The results indicate that ETFs do not mimic their corresponding indexes well. The calculated tracking errors do not equal zero and are often significantly negative. Furthermore, the value of the tracking errors depends on the region and the degree of market development.
1. Introduction
The dynamic development of ETFs in recent years supports the position of researchers who consider this type of fund the largest and most successful financial innovation in the field of investment (Deville, 2008; Antoniewicz & Heinrichs, 2014; Amenc et al., 2017). Although ETFs are regarded as relatively young financial products, studies of their functioning appear increasingly often in the international literature. Previous studies fall into one of three thematic categories (Charupat & Miu, 2013). The first analyzes the performance of ETFs, understood as the degree to which the fund achieves its investment objective. In the case of ETFs, which are passive funds, this objective is to replicate as accurately as possible the rate of return of the index on which the fund is based. The second group of studies concerns the efficiency of ETF pricing, that is, the differences between the market valuation of ETF shares and their Net Asset Value (NAV). This approach leads to the identification of the factors affecting those differences and of the rate at which arbitrage opportunities between the ETF share price and the value of the fund's assets disappear (Baş & Sarıoğlu, 2015). Finally, the third group examines the relationship between ETF trading and related financial instruments (shares, futures contracts, etc.) that are included in the index serving as the benchmark of a given fund. With regard to the effects of ETF trading on constituent stocks, for example, these studies examine whether there is any change in their trading features (trading volume, spreads, etc.) after the ETF's introduction (Charupat & Miu, 2013). Available studies focus in particular on attempts to determine the impact of ETFs on the trading volume and the exchange rate margin of related financial instruments (Quadan & Yagil, 2012).
This article is devoted to the first of the areas listed above. The research focuses on the calculation of tracking errors for 18 ETFs based on global stock exchange indexes, half of which represent developed markets and the other half emerging ones. This allows the degree of implementation of the investment objective to be assessed not only in terms of geographical differentiation (18 different national stock indexes listed in 17 countries in Asia-Pacific, Europe and the Americas), but also with regard to the division into emerging and developed markets. The study sample is drawn from one leading ETF issuer, namely iShares. By retaining ETFs from only one fund provider, the sample limits the variability of fund performance due to diverse management styles. In addition, fund valuations are presented in a single currency, the US dollar. This avoids exchange rate differences that might influence the results.
The motivation of this study is thus twofold. Firstly, the study analyzes how closely the selected ETFs replicate the rates of return of the global indexes on which they are based. In this regard, an attempt is made to answer the research question of whether the degree of implementation of the investment objective by the fund depends on the geographical location of the market (for example, does it differ between Asian and European markets?) and, if so, to indicate the potential reason for it. Secondly, although the global success of ETFs has sparked the interest of researchers, the number of studies focused on ETFs in emerging markets is very limited. This study, which deals with ETFs based on the stock exchange indexes of countries that include emerging markets, contributes to filling this gap. Accordingly, relying on the developed-emerging market dichotomy, an effort has been made to shed more light on ETF mispricing. Furthermore, the evidence presented here is of pivotal interest to investors in country ETFs, since it gives them insight into the trading dynamics associated with these ETFs.
The structure of the paper is as follows: section two describes the current state of knowledge on Exchange-Traded Funds as a form of passive funds and the impact of geographical location on the level of ETFs' mispricing. The third section presents the tools for the measurement of ETF performance, namely tracking errors. The fourth section deals with the results of the empirical analysis. In the last section, the main conclusions are discussed and suggestions for future research are made.
2. Materials and method
2.1. The theoretical basics
Although, considering all types of investment funds, the greatest importance should be attributed to collective investment funds, commonly known as mutual funds, it is noteworthy that the increase in popularity of ETFs has been especially visible in recent years. While in 2009, the NAV of ETFs accounted for about 4% of the NAV of collective investment funds, in 2018 it was already over 11% (Investment Company Institute, 2019). This is the consequence of an increase in interest in passive forms of investment, which include the majority of ETFs. As indicated in Kallinterakis et al. (2020), ETFs possess a series of high quality properties, including transparency, dividend-treatment, risk management and tax-efficiency.
The development of passive management of investment portfolios is deeply rooted in the efficient market hypothesis (EMH), which assumes that the valuation of financial instruments reflects all information available at a given moment (Fama, 1970).¹ Applied to the market of investment funds, the EMH implies that, based on all available market information, investments made through actively managed funds cannot achieve higher rates of return than financial instruments which reflect the stock index (Dębski, 2010; Chlebisz, 2018). Such instruments include passively managed ETFs, for which an investment portfolio modelling strategy is employed on a selected index (Nawrot, 2007). The presence of deviations of ETFs' prices from their benchmark has been widely confirmed in research (Deville, 2008; Kallinterakis et al., 2020). These premiums and discounts may be considerable in size, and geographical location has frequently been offered as the rationale behind them. In view of the misvaluation of country ETFs, it might be expected that investors will endeavour to exploit this pricing inefficiency by using investment strategies based on these funds' historical deviation patterns. Indeed, a number of studies confirm the profitability of such strategies (Jares & Lavin, 2004; Ackert & Tian, 2008).
¹ Despite numerous studies, both foreign (Basu, 1977; Malkiel, 2003; Sewell, 2012; Konak & Seker, 2014) and Polish (Czekaj et al., 2001; Szyszka, 2003; Witkowska & Żebrowska-Suchodolska, 2008; Goczek & Kania-Morales, 2015), in which authors assess the efficiency of financial markets on the basis of the EMH, there are voices questioning the validity of the efficient market hypothesis, particularly in view of the changes taking place in the modern world of finance (Straffin, 2001; Evans & Honkapohja, 2005; Ambroziak, 2014; Zawadzki, 2018).
Among the studies on the extent to which a fund achieves its investment objective, understood as the degree to which it mimics the rate of return of the index on which the ETF is based, a number refer in particular to the US market. They draw attention to the lower rates of return generated by holding ETFs compared to the benchmark. The reasons indicated for such underperformance include both transaction costs related to purchasing/selling ETFs (Kotsovetsky, 2003; Bernstein, 2004; Agapova, 2011) and the adoption of passive management strategies by fund managers while attempting to reduce tracking errors (Gastineau, 2004).
Beyond the US market, reference should be made to ETFs operating in Europe (Marszk & Lechman, 2019). In this case too, passive investment products underperform their benchmarks. The main reasons cited for this underperformance are management costs and the fiscal aspect, namely the differentiated methods of income tax settlement by European investors (Blitz et al., 2012).
In addition to the above, some studies concern ETFs listed in the US and targeting overseas markets (Engle & Sarkar, 2006; Kallinterakis et al., 2020). In this case, one reason for mispricing is non-synchronicity in trading between ETFs traded in the United States and their benchmarks in Europe and Asia. Because the US market and other global (European, Asian) markets are not open for trading simultaneously, potential deviations between US ETFs and their underlying benchmarks cannot be arbitraged in real time. The further a market lies geographically from the United States, the smaller the overlap between their trading hours.
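The non-synchronicity argument can be illustrated with a small calculation of session overlap. The following is only a sketch: the session times are rounded, assumed values expressed in UTC (ignoring daylight saving time, lunch breaks and holidays), they are not taken from the study, and the helper names `hm` and `overlap_minutes` are hypothetical.

```python
def hm(hours, minutes=0):
    """Convert a clock time to minutes after 00:00 UTC."""
    return hours * 60 + minutes


def overlap_minutes(session_a, session_b):
    """Minutes during which two same-day sessions (start, end) are both open."""
    start = max(session_a[0], session_b[0])
    end = min(session_a[1], session_b[1])
    return max(0, end - start)


# Assumed, illustrative session times in UTC (not official exchange calendars).
us_session = (hm(14, 30), hm(21, 0))
european_session = (hm(8, 0), hm(16, 30))
asian_session = (hm(0, 0), hm(6, 0))

us_europe = overlap_minutes(us_session, european_session)
us_asia = overlap_minutes(us_session, asian_session)
print(f"US/Europe overlap: {us_europe} min, US/Asia overlap: {us_asia} min")
```

Under these assumed hours, the US and European sessions share part of the trading day, whereas the US and Asian sessions do not overlap at all, which is the situation in which real-time arbitrage against the benchmark is not possible.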
Another rationale offered for the deviations of ETFs from their benchmarks is liquidity. Some studies confirm a considerable relationship between premiums/discounts and the difference in liquidity between the fund share and the underlying assets (Chan et al., 2008; Fletcher, 2013). When the markets targeted by ETFs are less liquid than their home market, which is most often the US market, ETF prices may be more volatile than those of their target markets.
In addition, research on the implementation of the investment objective in emerging countries characterized by highly dynamic economic growth is rare, but it does exist. It covers both ETFs admitted to trading on the stock exchanges of individual countries and funds based on the stock indexes of these countries but listed on the markets of the United States or Western Europe. Studies in this area draw attention to higher levels of tracking errors in emerging countries compared to developed ones. The sources indicated include, among other things, foreign exchange risk and the generally lower liquidity of emerging markets (Shin & Soydemir, 2010; Blitz & Huij, 2012).
So far, however, no research has compared the tracking errors of ETFs in developed and emerging markets while accounting for geographical diversity. Hence, the cornerstone of this paper is to answer the question of whether it is developed or emerging markets that prompt ETF mispricing.
2.2. Measures of the effectiveness of the degree of implementation of an investment objective
The basic tools for measuring the effectiveness of the degree of implementation of an investment objective include those related to the estimation of the tracking difference and the tracking error. Although the assumption is that ETFs should accurately mimic the changes in market prices, in practice the rates of return on investment in ETFs differ from the rates of return on the replicated index (benchmark). The difference between the investment results achieved by an ETF and the results of the replicated index over the same period is referred to as the tracking difference. For example, if the rate of return on a fund's investment is 10 per cent per annum, whereas the rate of return of the benchmark equals 11 per cent, the tracking difference is minus 1 percentage point. The formula for determining the tracking difference (TD) at time t is as follows (Madhavan, 2016):
$$TD_t = (p_t - p_{t-1}) - (I_t - I_{t-1})$$
where:
$p_t$: ln of the NAV of the ETF at the end of period t,
$p_{t-1}$: ln of the NAV of the ETF at the end of period t−1,
$I_t$: ln of the value of the income index (adjusted for dividend payment) at the end of period t,
$I_{t-1}$: ln of the value of the income index (adjusted for dividend payment) at the end of period t−1.
Even if the return rate on the index deviates from the return rate generated by the NAV of the ETF, this should not be a significant difference. The tracking difference is used to identify potential revenues and costs that determine the occurrence of deviations from the index value.
The tracking difference is frequently confused with the tracking error (TE). The two terms are not the same, however: the tracking error measures the volatility of the differences between the rates of return generated by the ETF and those of the index on which the fund is based. It is therefore more a qualitative measure. In addition, the tracking error may be measured both ex post and ex ante, whereas the tracking difference applies only to ex post evaluations. In the analysis of historical data, the tracking error is calculated as the standard deviation of the differences in the rates of return achieved by the ETF and a given benchmark, that is, as the variability of the tracking difference. Usually, calculations are made based on the formula above using daily rates of return (Madhavan, 2016). For tracking error forecasts, the covariance matrix of a particular risk model is used. The ex ante tracking error is defined as the volatility, or standard deviation, of the forecast difference between the returns of the ETF and the benchmark.
It follows from the above that the assessment of the tracking error is a bit more complicated. There is no single, universal method of measuring effectiveness in this area. In practice, several different measures are used (Roll, 1992; Pope & Yadav, 1994; Cresson, Cudd & Lipscomb, 2002). In terms of the tracking error, these include measures described by the following three formulas:
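1. The average difference in return rates between the ETF and the benchmark:
$$TE_{1,i} = \frac{1}{n}\sum_{t=1}^{n} e_{i,t}$$
where:
$e_{i,t}$: tracking error of the i-th ETF at time t,
n: number of observations,
$$e_{i,t} = NR_{i,t} - ER_{i,t}$$
where:
$NR_{i,t}$: logarithmic rate of return of the i-th ETF at time t,
$ER_{i,t}$: logarithmic rate of return at time t of the benchmark (index) on which the i-th ETF is based.
2. The arithmetic average of the absolute values of the daily tracking error levels:
$$TE_{2,i} = \frac{1}{n}\sum_{t=1}^{n} \left| e_{i,t} \right|$$
3. The standard deviation of the differences between the rates of return of the i-th ETF and the rates of return of the benchmark:
$$TE_{3,i} = \sqrt{\frac{1}{n-1}\sum_{t=1}^{n} \left( e_{i,t} - \bar{e}_i \right)^2}$$
where $\bar{e}_i$ is the sample mean of $e_{i,t}$ over the n observations.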
From the investor's point of view, the values characterizing both the tracking difference and the tracking error should be as small as possible. The lower the value of the tracking error, the more closely the fund replicates the benchmark results, which means that the risk is lower. In turn, the higher the tracking error value, the worse the ETF mimics the results achieved by the benchmark, so the risk is higher.
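The three measures listed above (the mean daily difference, the mean absolute daily difference and the standard deviation of the daily differences) can be computed side by side. The sketch below rests on assumptions: the function name is hypothetical, the returns are invented, and the standard deviation uses the n − 1 sample convention, which the text does not specify.

```python
import math


def tracking_errors(fund_returns, benchmark_returns):
    """Return (TE1, TE2, TE3) for paired series of logarithmic returns."""
    if len(fund_returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")
    # Daily differences e_t = fund return - benchmark return.
    e = [nr - er for nr, er in zip(fund_returns, benchmark_returns)]
    n = len(e)
    if n < 2:
        raise ValueError("at least two observations are required")
    te1 = sum(e) / n                          # mean difference (signed)
    te2 = sum(abs(x) for x in e) / n          # mean absolute difference
    te3 = math.sqrt(sum((x - te1) ** 2 for x in e) / (n - 1))  # sample std dev
    return te1, te2, te3


# Hypothetical daily log returns (not data from the study).
fund = [0.010, -0.004, 0.006, 0.001]
benchmark = [0.011, -0.003, 0.006, 0.003]

te1, te2, te3 = tracking_errors(fund, benchmark)
print(f"TE1 = {te1:+.6f}, TE2 = {te2:.6f}, TE3 = {te3:.6f}")
```

Only the first measure carries a sign; the other two are non-negative by construction, so they describe the size of the deviations but not their direction.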
In this study, the tracking error is calculated using each of the above three approaches. Table 1 reports the profiles of the 18 iShares Country Funds (name of the fund, ticker, benchmark stock market index, inception year, total net assets, gross expense ratio and market type), six for each region (Asia-Pacific, Europe and the Americas), divided into developed and emerging markets. This means that for each market type there are 9 ETFs, 3 for each region. The selection criterion was the size of net assets for each location and level of market development: only the markets with the largest net assets were selected. It should be mentioned that the expense ratios of both US ETFs are the lowest in the entire sample. This seems reasonable, since they target their own market and are hence not subject to the risks of cross-border trading.
All data were collected at daily frequency, using logarithmic returns of the ETFs and logarithmic returns of the index values. They range from January 2013 to December 2019. This is due to the inception date of the iShares MSCI India ETF in 2012. Extending the research period to before 2012 would result in differing numbers of observations across funds, which was avoided in this study. If the ETF replicates the benchmark (index) well, the average tracking error is expected to be close to zero. In order to test the relationship between the performance of ETFs and their benchmarks, the t-test was employed. Because the samples are dependent, a paired comparisons test, that is, a t-test based on data arranged in paired observations, was appropriate. The test statistic is:
$$t = \frac{\bar{d} - \mu_{d0}}{s_{\bar{d}}}$$
with n − 1 degrees of freedom, where n is the number of paired observations, $\bar{d}$ is the sample mean difference, $\mu_{d0}$ is the hypothesized value of the population mean difference (most commonly 0), and $s_{\bar{d}}$ is the standard error of $\bar{d}$.
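A paired comparisons t-statistic of this kind, the mean paired difference less its hypothesized value, divided by the standard error of the mean difference, can be sketched as follows. The function name and the data are hypothetical, and the comparison with a critical value from the t distribution is left out of the example.

```python
import math
import statistics


def paired_t_statistic(fund_returns, benchmark_returns, mu_d0=0.0):
    """Return (t, degrees_of_freedom) for the paired comparisons test."""
    if len(fund_returns) != len(benchmark_returns):
        raise ValueError("return series must have the same length")
    d = [f - b for f, b in zip(fund_returns, benchmark_returns)]
    n = len(d)
    d_bar = statistics.mean(d)            # sample mean difference
    s_d = statistics.stdev(d)             # sample standard deviation (n - 1)
    if s_d == 0:
        raise ValueError("differences are constant; t is undefined")
    standard_error = s_d / math.sqrt(n)   # standard error of the mean difference
    return (d_bar - mu_d0) / standard_error, n - 1


# Hypothetical daily log returns (not data from the study).
fund = [0.010, -0.004, 0.006, 0.001]
benchmark = [0.011, -0.003, 0.006, 0.003]

t_stat, dof = paired_t_statistic(fund, benchmark)
print(f"t = {t_stat:.4f} with {dof} degrees of freedom")
```

With the hypothesized mean difference set to zero, a large negative t indicates that the fund's returns fall systematically short of the benchmark's.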
3. Results and discussion
Tracking errors were estimated using the three methods presented in section 2. Table 2 reports the tracking errors for the 18 ETFs and categorizes these funds by region and by the development level of the national economy. Generally, it can be seen that ETFs in the US exhibit the lowest level of tracking errors, since they target their own market and are thus subject neither to different trading times nor to liquidity issues. At the same time, the tracking errors for emerging markets are higher than for developed markets in each of the three regions. In general, the largest problems with index mimicking occur in European markets. The difference between developed and emerging markets may be attributed to both higher foreign exchange risk and generally lower liquidity in emerging markets.
In addition, Table 2 reports the sum of daily tracking errors computed for each fund, including both positive and negative errors, as presented in Harper et al. (2006); the t-statistic was used to test their statistical significance. The average TE1 is negative, suggesting that these funds trade at a discount versus their benchmark index. TE2 and TE3 are positive by construction. All ETFs have a negative sum of TE1 daily tracking errors. This means that negative tracking errors carry more weight than positive ones, irrespective of either the level of market development or the region. Negative tracking errors lead investors to expect a negative risk premium in testing the performance of ETFs. Fourteen out of eighteen ETFs have a statistically significant negative sum of daily errors. The largest negative sums of daily errors appear in European markets, whereas the smallest concern the developed American markets. This confirms the earlier findings based on tracking errors that ETFs underperform their benchmarks in each market, although the extent differs according to both the level of market development and the location.
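The decomposition of the sum of daily tracking errors into its positive and negative parts can be illustrated with a short sketch. The function name and the data are hypothetical, and the exact procedure of Harper et al. (2006) is not reproduced here; the example only shows how a negative total arises when negative daily errors outweigh positive ones.

```python
def summarize_daily_errors(errors):
    """Split the sum of daily tracking errors into positive and negative parts."""
    positive = sum(e for e in errors if e > 0)
    negative = sum(e for e in errors if e < 0)
    return {
        "sum": positive + negative,
        "positive_part": positive,
        "negative_part": negative,
        "negative_days": sum(1 for e in errors if e < 0),
    }


# Hypothetical daily tracking errors (not data from the study).
daily_errors = [-0.0010, 0.0005, -0.0020, 0.0]

summary = summarize_daily_errors(daily_errors)
print(summary)
```

Since the sum equals the number of observations multiplied by the mean daily error, its sign always agrees with the sign of the mean.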
The sources of the deviations may vary, and identifying them would require a regression analysis that takes into account the determinants of the size of tracking errors across developed and emerging markets and geographical locations. Such an analysis was beyond the scope of this study. According to the available literature, it may be assumed that one of the possible reasons is the high price volatility observed in emerging markets (Quadan & Yagil, 2012). Other underlying reasons identified are differences between the trading hours of stock exchanges, exchange rate fluctuations and different transaction costs (Shin & Soydemir, 2010; Baş & Sarıoğlu, 2015).
4. Conclusion
In this study, the performance of 18 exchange-traded funds relative to their benchmark indexes was estimated. The sum of daily tracking errors was negative for all funds and statistically significant for fourteen of them. The findings indicate that investing in ETFs does not provide a considerable benefit compared to their benchmark returns, irrespective of the level of market development and the location.
The findings indicate that a larger divergence between the market prices and the NAVs of ETFs appears in emerging markets than in developed ones. At the same time, in assessing how well the funds achieve their investment objective, namely an accurate replication of the index, it can be stated that the geographical criterion determines the level of the tracking error. American markets classified as developed (USA and Canada) were characterized by the lowest tracking errors among those analyzed. The largest values, reaching 0.3%, appeared in emerging European markets.
The results show that emerging markets prompt ETF mispricing. There are several possible reasons for this. First of all, the liquidity of emerging markets is lower than that of developed markets. Secondly, trading times differ between the US market and other, especially non-American, markets. For Asian markets there is practically no overlap, which suggests that an ETF listed in the US and targeting an overseas market is likely to track a benchmark whose prices date from one day back and are thus irrelevant to the fundamentals of the day on which the ETF itself is trading. Last but not least, emerging markets carry a higher exchange rate risk in comparison with developed ones.
This study is the first to assess the effectiveness with which ETFs achieve their investment objectives in developed and emerging markets, taking into account geographical diversity while attempting to determine the statistical significance of the obtained results. Despite its originality, the work is not free from shortcomings. First of all, the study examines a relatively small number of ETFs, issued by one global institution, iShares. On the one hand, this decision reflected the author's attempt to limit the influence of the various management styles implemented by different ETF issuers. On the other hand, the results obtained give only partial knowledge about the total population of passive funds. Secondly, the paper points out the discrepancies between the ETFs and the value of stock indexes, but it does not allow the reasons for these deviations to be determined. To achieve this aim, the study should be expanded by testing an econometric model that analyzes the impact on the tracking error of explanatory variables such as the exchange rate or the liquidity of a given market.
Acknowledgments
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Conflict of interest
The author declares no conflicts of interest in this paper.