Citation: Chiara D’Alpaos, Michele Moretto. Do smart grid innovations affect real estate market values?[J]. AIMS Energy, 2019, 7(2): 141-150. doi: 10.3934/energy.2019.2.141
[1] | Wenxue Huang, Yuanyi Pan . On Balancing between Optimal and Proportional categorical predictions. Big Data and Information Analytics, 2016, 1(1): 129-137. doi: 10.3934/bdia.2016.1.129 |
[2] | Dongyang Yang, Wei Xu . Statistical modeling on human microbiome sequencing data. Big Data and Information Analytics, 2019, 4(1): 1-12. doi: 10.3934/bdia.2019001 |
[3] | Wenxue Huang, Xiaofeng Li, Yuanyi Pan . Increase statistical reliability without losing predictive power by merging classes and adding variables. Big Data and Information Analytics, 2016, 1(4): 341-348. doi: 10.3934/bdia.2016014 |
[4] | Jianguo Dai, Wenxue Huang, Yuanyi Pan . A category-based probabilistic approach to feature selection. Big Data and Information Analytics, 2018, 3(1): 14-21. doi: 10.3934/bdia.2017020 |
[5] | Amanda Working, Mohammed Alqawba, Norou Diawara, Ling Li . TIME DEPENDENT ATTRIBUTE-LEVEL BEST WORST DISCRETE CHOICE MODELLING. Big Data and Information Analytics, 2018, 3(1): 55-72. doi: 10.3934/bdia.2018010 |
[6] | Xiaoxiao Yuan, Jing Liu, Xingxing Hao . A moving block sequence-based evolutionary algorithm for resource investment project scheduling problems. Big Data and Information Analytics, 2017, 2(1): 39-58. doi: 10.3934/bdia.2017007 |
[7] | Yaguang Huangfu, Guanqing Liang, Jiannong Cao . MatrixMap: Programming abstraction and implementation of matrix computation for big data analytics. Big Data and Information Analytics, 2016, 1(4): 349-376. doi: 10.3934/bdia.2016015 |
[8] | Tao Wu, Yu Lei, Jiao Shi, Maoguo Gong . An evolutionary multiobjective method for low-rank and sparse matrix decomposition. Big Data and Information Analytics, 2017, 2(1): 23-37. doi: 10.3934/bdia.2017006 |
[9] | Wenxue Huang, Qitian Qiu . Forward Supervised Discretization for Multivariate with Categorical Responses. Big Data and Information Analytics, 2016, 1(2): 217-225. doi: 10.3934/bdia.2016005 |
[10] | Yiwen Tao, Zhenqiang Zhang, Bengbeng Wang, Jingli Ren . Motality prediction of ICU rheumatic heart disease with imbalanced data based on machine learning. Big Data and Information Analytics, 2024, 8(0): 43-64. doi: 10.3934/bdia.2024003 |
Multi-nominal data are common in scientific and engineering research, for example in biomedical research, customer behavior analysis, network analysis, search engine marketing optimization, and web mining. When the response variable has more than two levels, the principle of mode-based or distribution-based proportional prediction can be used to construct nonparametric nominal association measures. For example, Goodman and Kruskal [3,4] and others proposed local-to-global association measures aimed at optimal prediction. Both Monte Carlo and discrete Markov chain methods are conceptually based on proportional associations. The association matrix, association vector and association measure in [9] were built on the idea of proportional association. If the categories of the response variable have no ordering, or the ordering is not of interest, they are treated as nominal in the proportional prediction model and in the other association statistics.
In practice, however, different categories of the same response variable often have different values, sometimes very different ones. When selecting a model or explanatory variables, we want those that increase the total revenue, not just the accuracy rate. Similarly, when the explanatory variables carry a cost weight vector, the costs should be taken into account in the model too. The association measure in [9],
To implement the previous adjustments, we need the following assumptions:
It should be noted that the second assumption does not always hold. The law of large numbers suggests that the larger the sample size, the closer the sample estimate is to the true value. The question has been studied for hundreds of years, including how large a sample must be to approximate the real distribution, but it is not the main subject of this article. The assumption serves only to simplify a more complicated discussion.
The article is organized as follows. Section 2 discusses the adjustment to the association measure when the response variable has a revenue weight. Section 3 considers the case where both the explanatory and the response variables have weights. Section 4 shows how the adjusted measure changes the existing feature selection framework. Conclusions and future work are briefly discussed in the last section.
Let's first recall the association matrix
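$$
\begin{aligned}
\gamma^{s,t}(Y|X) &= \frac{E\big(p(Y=s|X)\,p(Y=t|X)\big)}{p(Y=s)} = \sum_{i=1}^{\alpha} p(X=i|Y=s)\,p(Y=t|X=i); \quad s,t = 1,2,\dots,\beta \\
\tau^{Y|X} &= \frac{\omega^{Y|X} - E p(Y)}{1 - E p(Y)} \\
\omega^{Y|X} &= E_X\big(E_Y(p(Y|X))\big) = \sum_{s=1}^{\beta}\sum_{i=1}^{\alpha} p(Y=s|X=i)^2\,p(X=i) = \sum_{s=1}^{\beta} \gamma^{ss}\,p(Y=s)
\end{aligned}
\tag{1}
$$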
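The quantities in (1) can be computed directly from a contingency table. The following is a minimal sketch, assuming the table is given as `counts[i][s]`, the number of observations with X=i and Y=s, and that every category of Y is observed at least once. The function name `association_stats` is hypothetical and not part of the article.

```python
def association_stats(counts):
    """Return (gamma, omega, tau) of equation (1) for a table counts[i][s].

    Assumes every Y category occurs at least once and that Y takes more
    than one value (otherwise tau is undefined).
    """
    n = sum(sum(row) for row in counts)
    beta = len(counts[0])
    row_tot = [sum(row) for row in counts]
    col_tot = [sum(row[s] for row in counts) for s in range(beta)]

    # omega = sum_i p(X=i) * sum_s p(Y=s|X=i)^2
    omega = sum(
        (row_tot[i] / n) * sum((c / row_tot[i]) ** 2 for c in row)
        for i, row in enumerate(counts)
        if row_tot[i] > 0
    )

    # Ep(Y) = sum_s p(Y=s)^2, then tau = (omega - Ep(Y)) / (1 - Ep(Y))
    e_p_y = sum((c / n) ** 2 for c in col_tot)
    tau = (omega - e_p_y) / (1 - e_p_y)

    # gamma[s][t] = sum_i p(X=i|Y=s) * p(Y=t|X=i)
    gamma = [
        [
            sum(
                (row[s] / col_tot[s]) * (row[t] / row_tot[i])
                for i, row in enumerate(counts)
                if row_tot[i] > 0
            )
            for t in range(beta)
        ]
        for s in range(beta)
    ]
    return gamma, omega, tau
```

Each row of `gamma` sums to one, and the diagonal weighted by p(Y=s) reproduces `omega`, which matches the last identity in (1).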
Our discussion begins with only one response variable with revenue weight and one explanatory variable without cost weight. Let
Definition 2.1.
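$$
\hat{\omega}^{Y|X} = \sum_{s=1}^{\beta}\sum_{i=1}^{\alpha} p(Y=s|X=i)^2\, r_s\, p(X=i) = \sum_{s=1}^{\beta} \gamma^{ss}\, p(Y=s)\, r_s, \qquad r_s > 0,\ s = 1,2,3,\dots,\beta
\tag{2}
$$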
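As a rough illustration of Definition 2.1, the sketch below evaluates the revenue-weighted measure from a contingency table `counts[i][s]` and a revenue vector `revenue[s]`. The function name `omega_hat` and the two small tables are hypothetical and chosen only for illustration.

```python
def omega_hat(counts, revenue):
    """Revenue-weighted association: sum_i p(X=i) sum_s p(Y=s|X=i)^2 r_s."""
    n = sum(sum(row) for row in counts)
    total = 0.0
    for row in counts:
        row_tot = sum(row)
        if row_tot == 0:
            continue  # an unobserved X category contributes nothing
        p_x = row_tot / n
        total += p_x * sum((c / row_tot) ** 2 * r for c, r in zip(row, revenue))
    return total


# Two made-up predictors of a three-class response (illustrative data only).
table_a = [[30, 0, 0], [0, 30, 30]]   # isolates class 0
table_b = [[30, 30, 3], [0, 0, 27]]   # mostly isolates class 2

unit = [1, 1, 1]        # unit revenues recover the unweighted omega
weighted = [1, 1, 5]    # class 2 is worth five times the others

print(omega_hat(table_a, unit), omega_hat(table_b, unit))
print(omega_hat(table_a, weighted), omega_hat(table_b, weighted))
```

With unit revenues the first table scores higher, while with the heavier weight on class 2 the second table scores higher, so the revenue vector can reverse the ranking of two predictors.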
Please note that
It is easy to see that
Example. Consider simulated data motivated by a real situation. Suppose that variable
1000 | 100 | 500 | 400 | 500 | 300 | 200 | 1500 | |||
200 | 1500 | 500 | 300 | 500 | 400 | 400 | 50 | |||
400 | 50 | 500 | 500 | 500 | 500 | 300 | 700 | |||
300 | 700 | 500 | 400 | 500 | 400 | 1000 | 100 | |||
200 | 500 | 400 | 200 | 200 | 400 | 500 | 200 |
Let us first consider the association matrix
0.34 | 0.18 | 0.27 | 0.22 | 0.26 | 0.22 | 0.27 | 0.25 | |||
0.13 | 0.48 | 0.24 | 0.15 | 0.25 | 0.24 | 0.29 | 0.23 | |||
0.24 | 0.28 | 0.27 | 0.21 | 0.25 | 0.24 | 0.36 | 0.15 | |||
0.25 | 0.25 | 0.28 | 0.22 | 0.22 | 0.18 | 0.14 | 0.46 |
Please note that
The correct prediction contingency tables of
471 | 6 | 121 | 83 | 98 | 34 | 19 | 926 | |||
101 | 746 | 159 | 107 | 177 | 114 | 113 | 1 | |||
130 | 1 | 167 | 157 | 114 | 124 | 42 | 256 | |||
44 | 243 | 145 | 85 | 109 | 81 | 489 | 6 | |||
21 | 210 | 114 | 32 | 36 | 119 | 206 | 28 |
The total number of the correct predictions by
total revenue | average revenue | |||
0.3406 | 0.456 | 4313 | 0.4714 | |
0.3391 | 0.564 | 5178 | 0.5659 |
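The closeness of the weighted measure to the simulated average revenue can be checked with a small Monte Carlo experiment. The sketch below assumes proportional prediction means drawing the predicted class from p(Y|X=i) for each sampled observation and collecting the revenue of the true class when the guess is correct. The function name, the seed and the sample table are hypothetical.

```python
import random


def simulate_proportional_revenue(counts, revenue, n_draws, seed=0):
    """Simulate proportional prediction and return (total, average) revenue."""
    rng = random.Random(seed)
    # Every (i, s) cell of the table, weighted by its observed count.
    cells = [(i, s) for i, row in enumerate(counts) for s in range(len(row))]
    weights = [counts[i][s] for i, s in cells]
    classes = list(range(len(counts[0])))

    total = 0.0
    for i, s in rng.choices(cells, weights=weights, k=n_draws):
        # Predict Y by sampling from the conditional distribution p(Y|X=i).
        guess = rng.choices(classes, weights=counts[i], k=1)[0]
        if guess == s:
            total += revenue[s]
    return total, total / n_draws


total, average = simulate_proportional_revenue(
    [[30, 30, 3], [0, 0, 27]], [1, 1, 5], n_draws=20000
)
print(total, average)
```

Under this reading, the expected revenue per observation equals the revenue-weighted measure, so the simulated average should approach it as the number of draws grows.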
Given that
In summary, it is possible for an explanatory variable
Let us further discuss the case where the predictors carry a cost weight vector in addition to the revenue weight vector of the dependent variable. The goal is to find the predictor with the larger total profit. We hence define the new association measure as in (3).
Definition 3.1.
$$
\bar{\omega}^{Y|X} = \sum_{i=1}^{\alpha}\sum_{s=1}^{\beta} p(Y=s|X=i)^2\, \frac{r_s}{c_i}\, p(X=i)
\tag{3}
$$
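A minimal sketch of Definition 3.1 follows. It assumes that each term is weighted by the ratio of the revenue of class s to the cost of category i, which is one reading of the flattened formula and should be checked against the published version. The function name `omega_bar` and the sample numbers are hypothetical.

```python
def omega_bar(counts, revenue, cost):
    """Cost- and revenue-weighted association.

    Assumed form: sum_i p(X=i) * sum_s p(Y=s|X=i)^2 * r_s / c_i,
    with revenue[s] = r_s > 0 and cost[i] = c_i > 0.
    """
    n = sum(sum(row) for row in counts)
    total = 0.0
    for row, c_i in zip(counts, cost):
        row_tot = sum(row)
        if row_tot == 0:
            continue  # an unobserved X category contributes nothing
        hit = sum((c / row_tot) ** 2 * r for c, r in zip(row, revenue))
        total += (row_tot / n) * hit / c_i
    return total


table = [[10, 0], [0, 10]]
print(omega_bar(table, revenue=[2, 3], cost=[1, 2]))  # 0.5*2/1 + 0.5*3/2
print(omega_bar(table, revenue=[2, 3], cost=[2, 1]))  # costs swapped
```

Swapping the two costs changes the score even though the table and the revenues are unchanged, which is why the cost vector alone can alter which predictor looks more profitable.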
Example. We first continue the example in the previous section with new cost weight vectors for
total profit | average profit | ||||
0.3406 | 0.3406 | 1.3057 | 12016.17 | 1.3132 | |
0.3391 | 0.3391 | 1.8546 | 17072.17 | 1.8658 |
By
We then investigate how changing the cost weights affects the result. Suppose the new weight vectors are:
total profit | average profit | ||||
0.3406 | 0.3406 | 1.7420 | 15938.17 | 1.7419 | |
0.3391 | 0.3391 | 1.3424 | 12268.17 | 1.3408 |
Hence
Using the updated association measure defined in the previous section, we present in this section the feature selection result for a given data set
First, consider a synthetic data set simulating the factors that contribute to the sales of a certain commodity. In general, many factors can contribute differently to commodity sales: age, career, time, income, personal preference, credit, etc. Each factor can have a different cost vector, and each class within a variable can have a different cost as well. For example, collecting income information might be more difficult than learning the customer's career; determining a dinner waitress's purchase preference is easier than determining that of a high-income lawyer. We therefore simply assume that there are four potential predictors,
total profit | average profit | ||||
7 | 0.3906 | 3.5381 | 35390 | 3.5390 | |
4 | 0.3882 | 3.8433 | 38771 | 3.8771 | |
4 | 0.3250 | 4.8986 | 48678 | 4.8678 | |
8 | 0.3274 | 3.7050 | 36889 | 3.6889 |
The first variable to be selected is
total profit | average profit | ||||
28 | 0.4367 | 1.8682 | 18971 | 1.8971 | |
28 | 0.4025 | 2.1106 | 20746 | 2.0746 | |
56 | 0.4055 | 1.8055 | 17915 | 1.7915 | |
16 | 0.4055 | 2.3585 | 24404 | 2.4404 | |
32 | 0.3385 | 2.0145 | 19903 | 1.9903 |
As we can see, all
In summary, the updated association measure with cost and revenue vectors not only changes the feature selection result according to different profit expectations, it also reflects a practical reality: collecting information on more variables costs more and thus reduces the overall profit, so more variables are not necessarily better on a return-on-investment basis.
We propose a new metric,
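The trade-off described above can be sketched with a greedy forward-selection loop. This is an illustrative procedure, not necessarily the one used in the article. It assumes the measure weights each term by revenue over cost, and that the cost of a joint category is the sum of the per-category costs of its components. All names (`omega_bar`, `greedy_select`, the column names and the toy rows) are hypothetical.

```python
from collections import Counter, defaultdict


def omega_bar(rows, cols, y, revenue, cost):
    """Score the joint variable `cols` as a predictor of column `y`.

    Assumption: cost of a joint category = sum of cost[col][value].
    """
    by_x = defaultdict(Counter)
    for row in rows:
        by_x[tuple(row[c] for c in cols)][row[y]] += 1
    total = 0.0
    for x, y_counts in by_x.items():
        n_x = sum(y_counts.values())
        c_x = sum(cost[c][v] for c, v in zip(cols, x))
        hit = sum((k / n_x) ** 2 * revenue[s] for s, k in y_counts.items())
        total += (n_x / len(rows)) * hit / c_x
    return total


def greedy_select(rows, candidates, y, revenue, cost):
    """Add one predictor at a time while the score keeps improving."""
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        scored = [
            (omega_bar(rows, selected + [c], y, revenue, cost), c)
            for c in remaining
        ]
        top_score, top_col = max(scored)
        if top_score <= best:
            break  # the extra cost outweighs the gain in association
        selected.append(top_col)
        remaining.remove(top_col)
        best = top_score
    return selected, best


rows = [
    {"a": 0, "b": 0, "y": 0},
    {"a": 0, "b": 1, "y": 0},
    {"a": 1, "b": 0, "y": 1},
    {"a": 1, "b": 1, "y": 1},
]
cost = {"a": {0: 1, 1: 1}, "b": {0: 1, 1: 1}}
print(greedy_select(rows, ["a", "b"], "y", {0: 1, 1: 1}, cost))
```

In this toy data, column `a` already predicts `y` perfectly, so adding `b` only doubles the cost and halves the score, and the loop stops after the first variable.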
The presented framework can also be applied to high-dimensional cases, such as national surveys, and to misclassification costs, the association matrix and the association vector [9]. It should help further in assessing the quality of predictors across various response variables.
Given the distinct character of this new statistic, we believe it opens more opportunities for further study of better decision making with categorical data. We are currently investigating the asymptotic properties of the proposed measures, which can also be extended to the symmetric situation. Of course, the synthetic nature of the experiments in this article raises the question of how the measures perform on a real data set or application. It is also arguable that the improvements introduced by the new measures come from randomness. Thus we can use
[1] | D'Alpaos C, Bragolusi P (2018) Buildings energy retrofit valuation approaches: State of the art and future perspectives. Valori e Valutazioni 20: 79–94. |
[2] | D'Alpaos C, Bragolusi P (2018) Multicriteria prioritization of policy instruments in buildings energy retrofit. Valori e Valutazioni 21: 15–25. |
[3] | Barbose G, Darghouth N, Weaver S, et al. (2015) Tracking the Sun VII: An historical summary of the installed price of photovoltaics in the United States from 1998 to 2014. Lawrence Berkeley National Laboratory, Berkeley, CA. Available from: https://emp.lbl.gov/publications/tracking-sun-vii-historical-summary. |
[4] | Hoen B, Wiser R, Adomatis S, et al. (2015) Selling into the sun: Price premium analysis of a Multi-State Dataset of solar homes. Lawrence Berkeley National Laboratory. Available from https://emp.lbl.gov/sites/all/.../lbnl-6942e-fullreport-factsheet.pdf |
[5] | Dastrup SR, Zivin JG, Costa DL, et al. (2012) Understanding the Solar Home price premium: Electricity generation and 'Green' social status. Eur Econ Rev 56: 961–973. doi: 10.1016/j.euroecorev.2012.02.006 |
[6] | Farhar B, Coburn T (2008) A new market paradigm for zero-energy homes: A comparative case study. Environ: Sci Policy Sustainable Dev 50: 18–32. |
[7] | Hoen B, Cappers P, Wiser R, et al. (2013) Residential photovoltaic energy systems in California: The effect on home sales prices. Contemp Econ Policy 31: 708–718. doi: 10.1111/j.1465-7287.2012.00340.x |
[8] | Desmarais L (2013) The impact of photovoltaic systems on market value and marketability: A case study of 30 single‐Family homes in the north and northwest Denver metro area. Available from: https://www.colorado.gov/pacific/energyoffice/atom/14956. |
[9] | Bertolini M, D'Alpaos C, Moretto M (2018) Do Smart Grids boost investments in domestic PV plants? Evidence from the Italian electricity market. Energy 49: 890–902. |
[10] | Biondi T, Moretto M (2015) Solar Grid Parity dynamics in Italy: A real option approach. Energy 80: 293–302. doi: 10.1016/j.energy.2014.11.072 |
[11] | Bertolini M, D'Alpaos C, Moretto M (2018) Electricity prices in Italy: Data registered during photovoltaic activity interval. Data Brief 19: 1428–1431. doi: 10.1016/j.dib.2018.06.018 |
[12] | Canesi R, D'Alpaos C, Marella G (2016) Foreclosed homes market in Italy: Bases of value. Int J Hous Sci Its Appl 40: 201–209. |
[13] | Canesi R, D'Alpaos C, Marella G (2016) Forced sale values vs. Market values in Italy. J R Estate Lit 24: 377–401. |
[14] | Antoniucci V, D'Alpaos C, Marella G (2015) Energy saving in tall buildings: From urban planning regulation to smart grid building solutions. Int J Hous Sci Its Appl 39: 101–110. |
[15] | Eurostat (2015). Data available from: https://ec.europa.eu/eurostat/web/energy/data/database. |
[16] | Gianfreda A, Grossi L (2010), Forecasting Italian electricity zonal prices with exogenous variables. Energy Econ 34: 2228–2239. |
[17] | Gestore Mercati Energetici (GME) (2018) Available from: http://www.mercatoelettrico.org/it/Statistiche/ME/DatiSintesi.aspx. |
[18] | Fernandez P, Aguirreamalloa J, Corres L (2011) Market risk premium used in 56 countries in 2011: a survey with 6,014 answers. IESE Working Paper n. 920. |
[19] | Fernandez P, Aguirreamalloa J, Corres L (2013) Market Risk Premium used in 82 countries in 2012: a survey with 7,192 answers. IESE Working Paper n. 1059-E. |
[20] | Dipartimento del Tesoro (2015). Available from: http://www.dt.tesoro.it/export/sites/sitodt/modules/documenti_it/debito_pubblico/dati_statistici/Principali_tassi_di_interesse_2015.pdf. |
[21] | Ciabattoni L, Grisostomi M, Ippoliti G, et al. (2014) Fuzzy logic home energy consumption modeling for residential photovoltaic plant sizing in the new Italian scenario. Energy 74: 359–367. doi: 10.1016/j.energy.2014.06.100 |
[22] | Kost C, Mayer JN, Thomsen J, et al. (2013) Levelized cost of electricity renewable energy technologies. Fraunhofer ISE. |
[23] | Kastel P, Gilroy-Scott B (2015) Economics of pooling small local electricity prosumers-LCOE & self-consumption. Renewable Sustainable Energy Rev 51: 718–729. doi: 10.1016/j.rser.2015.06.057 |
[24] | Reichelstein S, Sahoo A (2015) Time of day pricing and the levelized cost of intermittent power generation. Energy Econ 48: 97–108. doi: 10.1016/j.eneco.2014.12.005 |
[25] | Huld T, Müller R, Gambardella A (2012) A new solar radiation database for estimating PV performance in Europe and Africa. Sol Energy 86: 1803–1815. doi: 10.1016/j.solener.2012.03.006 |
[26] | Bignucolo F, Coppo M, Crugnola G, et al. (2017) Application of a simplified thermal-electric model of a sodium-nickel chloride battery energy storage system to a real case residential prosumer. Energies 10: 1497. doi: 10.3390/en10101497 |