1.
Introduction
An income distribution shows how the national income of a country is shared among its people. It provides an insight into the degree of inequality in the incomes of individuals within a country. Income can take various forms, such as wages, salaries, and capital gains. When income is distributed more equally among individuals, they can benefit from social welfare programs equally [1].
The unequal distribution of income and wealth is a significant challenge faced by most developing countries, particularly those that export petroleum like Iran. Unequal income distribution creates political, moral, and social challenges within human societies. The lower classes, who bear the primary burden of production, can become dissatisfied and angry due to inequality in income distribution. This unrest can ultimately lead to civil war, social instability, and the destruction of the economic, social, and political foundations of a country [2].
Determining the social position and welfare of individuals requires an understanding of both the income distribution status in society and the positions of individuals within different income groups. Several probability models have been proposed to assess the living standards of entire populations in different countries and to compare the living levels of various social classes or regions. To develop a probability model, it is necessary to establish a theoretical distribution function.
The widespread inequality in income distribution leads to pervasive poverty. High levels of inequality can create further gaps between social classes, spreading poverty as the deprived and low-income classes obtain only a small share of available resources. In practice, theorists do not regard the huge income gap between the lower and upper classes in developing countries as an economic advantage. Instead, they view it as an obstacle to economic growth and development. The majority of scholars also emphasize the positive role of a broad middle class in society. They believe that social class structures resemble a pointed pyramid with a broad base in most third-world countries. In other words, a small percentage of the population is very wealthy, while a large percentage is very poor, and the percentage of the middle class is insignificant [3].
Econophysics applies the theories and methods of statistical mechanics to problems in economics. Econophysicists have extensively analyzed inequality in the distribution of income and wealth in capitalist systems [4]. The analysis of income inequality through a physical approach goes back to Vilfredo Pareto. In his book [5], Pareto analyzed the distribution of wealth in society and concluded that wealth among the rich follows a power law [6]. In recent decades, empirical studies have reported that only the tail of the distribution follows the Pareto distribution. In other words, the income of the rich minority (about 3% of the total population) obeys the Pareto power law, whereas the income of the remaining 97% of the population follows an Exponential or Lognormal distribution [7, 8, 9]. There is considerable disagreement about which distribution best describes the income distribution of society as a whole, and economists and statistical physicists have debated the question for years.
Studying the income distribution in Brazil, Moura and Ribeiro [8] developed the Gompertz-Pareto distribution function, which describes the income distribution of society's lower and upper classes simultaneously. They reported that the Pareto distribution function described approximately 1% of Brazil's population, whereas the Gompertz distribution function described the remaining 99%. In a study of the US, Yakovenko and Rosser [9] concluded that the general form of the distribution had the same qualitative characteristics in different years, that nearly 3% of the population fell within the range of the Pareto distribution function, and that the remaining 97% fell within the range of the Gibbs-Boltzmann function. They also concluded that the exponential income distribution of the lower class was stable, whereas the power-law distribution of the upper class was unstable. Moreover, Jagielski and Kutner [10] analyzed low-, middle-, and high-income social classes to study the income distribution of EU member states. They used the Fokker-Planck equation to describe the income of the different social classes within the statistical econophysics approach, in which the Gibbs-Boltzmann, Pareto, and Zipf distributions gave the best results for the low-, middle-, and high-income classes, respectively. Also, Wada and Scarfone [11] found that the relations are the basis and necessary conditions of physical behavior and showed some uses of the Kaniadakis distribution, with results aligned with our use of the distribution.
The income index and sustainability are interconnected in multiple ways. Income is a crucial factor that influences the environmental impact of economic activities and determines the ability of individuals, households, and businesses to adopt sustainable practices [12, 13]. On the one hand, higher income levels can lead to increased consumption and waste generation, leading to a negative impact on the environment [14, 15]. However, higher income levels also provide individuals and businesses with the resources needed to invest in sustainable technologies and practices, thereby reducing their environmental footprint [16, 17]. Moreover, income inequality can directly affect sustainability efforts by creating social tensions, hindering access to education and healthcare, and limiting opportunities for economic growth and development, which in turn can impede the adoption of sustainable practices. Therefore, ensuring equitable access to income and promoting sustainable practices can work together towards achieving long-term environmental, social, and economic sustainability [18, 19].
This study aims to develop a distribution function using the econophysics approach, which can show the income distribution of different classes of society.
2.
Methods
If Y∼Gamma(α,1), the probability density function (pdf) of the gamma distribution is [20]:
The normalization constant for the new distribution function, with Y=(X−μ)/T, is:
Therefore, the pdf of the general Gibbs-Boltzmann distribution is defined as below:
where μ and T are, respectively, the lowest household income and the average income, with T analogous to temperature in the Boltzmann-Gibbs distribution [7, 10, 21], and α is the shape parameter. This density function covers the low-, medium-, and high-income social classes.
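The construction above can be written out explicitly. The following is a sketch in LaTeX, assuming that the transformation in the text is Y = (X − μ)/T and that the standard unit-scale gamma density is intended; it does not claim to reproduce the numbered equations of the source.

```latex
% Standard gamma density with shape \alpha and unit scale
f_Y(y) = \frac{1}{\Gamma(\alpha)}\, y^{\alpha-1} e^{-y}, \qquad y > 0 .

% Change of variables Y = (X-\mu)/T, whose Jacobian is 1/T
f_X(x) = \frac{1}{T\,\Gamma(\alpha)}
         \left(\frac{x-\mu}{T}\right)^{\alpha-1}
         \exp\!\left(-\frac{x-\mu}{T}\right), \qquad x > \mu .
```

Under these assumptions the factor 1/(T Γ(α)) plays the role of the normalization constant, and setting α = 1 recovers the shifted exponential form (1/T) exp(−(x − μ)/T).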
The Gibbs-Boltzmann distribution is the most widely used function in statistical mechanics and physics. In recent years, it has also been used to model income distribution. The Gibbs-Boltzmann law with exponential distribution is defined for the medium- and high-income social classes in [10, 22, 23].
We also use two statistical distributions for comparison, both of which are commonly used to model income distribution [21]. The pdf of the Lognormal distribution function is defined below with parameters µ and σ:
Because of its heavy tail, the Pareto distribution describes many physical, economic, and social phenomena. The Pareto law is valid only for the high-income social class [1]. This distribution is defined as below:
Here k, the scale parameter, is positive, and α (the Pareto index of inequality), the shape parameter, is also positive.
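The three densities discussed in this section can be sketched in code. The following is a minimal Python illustration, assuming the textbook forms of the Lognormal and Pareto densities and assuming that the generalized Gibbs-Boltzmann density is the law of X when (X − μ)/T ∼ Gamma(α, 1). The function names are hypothetical and are not taken from the source.

```python
import math


def generalized_gibbs_boltzmann_pdf(x, alpha, T, mu):
    """Density of X when (X - mu) / T ~ Gamma(alpha, 1) (assumed form)."""
    if x <= mu:
        return 0.0
    y = (x - mu) / T
    # Evaluate in log space so that Gamma(alpha) cannot overflow.
    log_pdf = (alpha - 1.0) * math.log(y) - y - math.lgamma(alpha) - math.log(T)
    return math.exp(log_pdf)


def lognormal_pdf(x, mu, sigma):
    """Textbook Lognormal density with log-mean mu and log-std sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def pareto_pdf(x, k, alpha):
    """Textbook Pareto density with scale k > 0 and shape alpha > 0."""
    if x < k:
        return 0.0
    return alpha * k ** alpha / x ** (alpha + 1.0)
```

With α = 1 and μ = 0, the first function reduces to the exponential density (1/T) exp(−x/T), which gives a quick sanity check on the parameterization.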
In statistical applications, maximum likelihood estimation (MLE) is a powerful technique for estimating the parameters of a specific probability distribution function from observed data. Its objective is to find the parameter values that maximize the likelihood function, which represents the probability of observing the given data for different values of the parameters. MLE is widely used in fields such as economics, biology, and engineering.
The strength of MLE lies in its ability to produce reliable estimates of the parameters of a model, even when the sample size is relatively small. It is also useful for comparing different models and selecting the one that best describes the data. However, it assumes that the data are independent and identically distributed, which may not always be true in practice. Overall, MLE is a versatile and widely used technique for analyzing the statistical behavior of a sequence or dataset.
In parametric distribution functions, the observed data are assumed to be generated by a distribution function that depends on a few unknown parameters [24, 25, 26]. MLE is the method used here to estimate those parameters.
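For the two comparison distributions, the maximum likelihood estimates have well-known closed forms, which makes them a convenient illustration of the method. The sketch below assumes an i.i.d. sample of strictly positive incomes that are not all equal. The function names and the sample values are hypothetical.

```python
import math


def fit_lognormal(incomes):
    """Closed-form MLE of the Lognormal parameters (mu, sigma)."""
    logs = [math.log(x) for x in incomes]
    n = len(logs)
    mu_hat = sum(logs) / n
    # The MLE of the variance divides by n, not by n - 1.
    sigma_hat = math.sqrt(sum((v - mu_hat) ** 2 for v in logs) / n)
    return mu_hat, sigma_hat


def fit_pareto(incomes):
    """Closed-form MLE of the Pareto scale k and shape alpha."""
    k_hat = min(incomes)  # the likelihood increases in k up to the sample minimum
    n = len(incomes)
    alpha_hat = n / sum(math.log(x / k_hat) for x in incomes)
    return k_hat, alpha_hat


# Illustrative sample only; these are not data from the study.
sample = [1.0, math.e, math.e ** 2]
mu_hat, sigma_hat = fit_lognormal(sample)
k_hat, alpha_hat = fit_pareto(sample)
```

The generalized Gibbs-Boltzmann parameters have no such closed form in general, so their likelihood equations are usually solved numerically.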
For the Gibbs-Boltzmann distribution, we have
and
Therefore, the following equations are employed to obtain α, T, and μ:
and
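The likelihood equations referred to above can be sketched as follows. This assumes the three-parameter gamma form of the generalized Gibbs-Boltzmann density, with (X − μ)/T ∼ Gamma(α, 1), and an i.i.d. sample x_1, …, x_n with every x_i > μ. The symbols ℓ (log-likelihood) and ψ (digamma function) are introduced here for illustration and are not taken from the source.

```latex
% Log-likelihood of the sample
\ell(\alpha, T, \mu) = -n \ln\Gamma(\alpha) - n\alpha \ln T
    + (\alpha - 1) \sum_{i=1}^{n} \ln(x_i - \mu)
    - \frac{1}{T} \sum_{i=1}^{n} (x_i - \mu)

% Score equations obtained by setting the partial derivatives to zero
\frac{\partial \ell}{\partial \alpha}
    = -n\,\psi(\alpha) - n \ln T + \sum_{i=1}^{n} \ln(x_i - \mu) = 0

\frac{\partial \ell}{\partial T}
    = -\frac{n\alpha}{T} + \frac{1}{T^{2}} \sum_{i=1}^{n} (x_i - \mu) = 0

\frac{\partial \ell}{\partial \mu}
    = -(\alpha - 1) \sum_{i=1}^{n} \frac{1}{x_i - \mu} + \frac{n}{T} = 0
```

Under these assumptions the second equation gives T = (x̄ − μ)/α, where x̄ is the sample mean, while the remaining two equations generally have to be solved numerically.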
3.
Results
A widespread problem facing many societies is the unequal distribution of income and wealth, which is commonly referred to as the class gap. Inequality and the class gap have significant impacts on all aspects of individual and social life [27]. As discussed earlier, income distribution is a critical topic in economics, and economists have long debated which distribution function best approximates the empirical distribution of income in a country. This section analyzes several distribution functions used in economics, starting with the Pareto and Lognormal distributions. We then adopt the econophysics approach to study the Gibbs-Boltzmann and generalized Gibbs-Boltzmann distributions. To evaluate the goodness of fit of these distributions, we conduct a chi-square goodness of fit test, which is briefly explained below.
Chi-square goodness of fit test: The chi-square goodness of fit test measures how well a specified probability distribution fits a set of observations [28, 29].
To assess the fit, the observed frequency of each group or class is compared with the expected theoretical frequency obtained from the probability distribution. The chi-square test statistic is written as follows:
where k is the number of classes or groups, p is the number of estimated parameters, and f and ft are the observed and expected frequencies, respectively. The null hypothesis of the chi-square test is defined as follows:
H0: The statistical data follow the specified probability distribution.
If the calculated test statistic exceeds the critical value obtained from the chi-square table, then H0 is rejected.
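The test procedure described above can be sketched as follows. This is a minimal Python illustration that assumes the usual Pearson form of the statistic and the common convention of k − p − 1 degrees of freedom. The function names and the frequencies are hypothetical, and the critical value is supplied by the user from a chi-square table.

```python
def chi_square_statistic(observed, expected):
    """Pearson statistic: sum of (f - ft)^2 / ft over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((f - ft) ** 2 / ft for f, ft in zip(observed, expected))


def degrees_of_freedom(num_classes, num_estimated_params):
    """Assumed convention: k - p - 1."""
    return num_classes - num_estimated_params - 1


def reject_h0(observed, expected, critical_value):
    """H0 is rejected when the statistic exceeds the table value."""
    return chi_square_statistic(observed, expected) > critical_value


# Illustrative frequencies only; these are not data from the study.
observed = [18, 22, 20, 40]
expected = [25.0, 25.0, 25.0, 25.0]
statistic = chi_square_statistic(observed, expected)
dof = degrees_of_freedom(num_classes=4, num_estimated_params=1)
# 5.991 is the tabulated 5% critical value for 2 degrees of freedom.
rejected = reject_h0(observed, expected, critical_value=5.991)
```

In this illustration the statistic is well above the critical value, so H0 would be rejected for the hypothetical frequencies.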
According to the chi-square test results, H0 was rejected for the Pareto distribution in all the study years. In other words, the income distribution of the whole country did not follow the Pareto distribution in any of the study years. For the Lognormal distribution, H0 was rejected in all years except 2007 and 2011. Hence, the income distribution throughout Iran does not obey the Lognormal distribution in most study years. It can then be concluded that the Pareto and Lognormal distributions are not good enough to describe the income distribution of the whole country. Therefore, a more suitable distribution function is needed to express the income distribution of the whole country. For this purpose, the generalized Gibbs-Boltzmann distribution is analyzed in the next step.
According to Table 2, H0 was not rejected in any of the study years. Therefore, it is concluded that income distribution across the country follows the generalized Gibbs-Boltzmann distribution. In other words, the generalized Gibbs-Boltzmann distribution can adequately explain the income distribution across the country.
Figures 1 and 2 show the actual distributions as histograms. The Pareto distribution, the Lognormal distribution, and the Gibbs-Boltzmann distribution are highlighted in red, yellow, and green, respectively.
Accordingly, the generalized Gibbs-Boltzmann distribution is a very good fit to the actual data distribution and is able to properly explain income distribution across the country. Moreover, the generalized Gibbs-Boltzmann distribution fits the actual income data better than the Pareto and the Lognormal distributions. The figures therefore confirm the chi-square test results reported in this section.
Iran's income distribution was analyzed in this section. The chi-square goodness of fit test was employed to examine the goodness of fit of the studied distributions. The results show that the income distribution in Iran did not follow the Pareto and Lognormal distributions in most of the study years but followed the Gibbs-Boltzmann distribution in all years. According to the results, the Gibbs-Boltzmann distribution also fits the actual distribution of the data very well and is able to properly explain the distribution of income in Iran. Furthermore, the Gibbs-Boltzmann distribution fits the actual income data better than both the Pareto and Lognormal distributions.
4.
Conclusions
This study analyzed the Pareto and Lognormal distributions, which are among the most well-known income distribution functions [30, 31, 32]. The distribution parameters were estimated by MLE, and the chi-square test was employed to evaluate the goodness of fit.
The results indicate that the income distribution across Iran followed neither the Pareto nor the Lognormal distribution in most of the study years (2006–2018). Therefore, it can be concluded that the Pareto and the Lognormal distributions are not good enough to explain the income distribution in Iran. For this reason, this study explored the known distributions of econophysics.
The Gibbs-Boltzmann distribution function has been widely used in statistical mechanics and physics and has recently been applied to analyze income distribution. However, it had never been used before in Iran to model income distribution. Therefore, this study utilized the generalized Gibbs-Boltzmann distribution (2) to analyze income distribution in Iran. Based on the estimated parameters of the generalized Gibbs-Boltzmann distribution and the chi-square test, the income distribution in Iran was found to follow the generalized Gibbs-Boltzmann distribution. In other words, the generalized Gibbs-Boltzmann distribution provides a better fit for the actual income data in Iran than the Pareto and Lognormal distributions. From a practical perspective, understanding the state of income distribution in society and having accurate information about the positions of individuals in different income groups can help governments and policymakers take the necessary actions to reduce social class gaps through new policies and mechanisms. Applying the Gini and Herfindahl-Hirschman indices would be an interesting direction for future research.