Research article
Using huge amounts of road sensor data for official statistics
1. Center for Big Data Statistics, Statistics Netherlands, Heerlen, The Netherlands
2. Department of Traffic and Transport Statistics, Statistics Netherlands, The Netherlands
3. Institute for Computing and Information Sciences, Radboud University Nijmegen, Nijmegen, The Netherlands
Received: 24 October 2018
Accepted: 11 December 2018
Published: 19 December 2018
MSC: 62P99, 62M05
Citation: Marco J. H. Puts, Piet J. H. Daas, Martijn Tennekes, Chris de Blois. Using huge amounts of road sensor data for official statistics[J]. AIMS Mathematics, 2019, 4(1): 12-25. doi: 10.3934/Math.2019.1.12
Abstract
The Dutch road network carries about 60,000 road sensors, of which 20,000 are on the highways. Both vehicle counts and average speeds are collected each minute and stored in the National Traffic Database. To produce official traffic statistics from these data, several methodological challenges needed to be solved. The first was developing a method to check and improve data quality, as many sensors lacked data for many minutes of the day. A cleaning and estimation step was implemented that enabled a precise and accurate estimate of the number of vehicles actually passing each sensor in each minute. The second challenge was monitoring the stream of incoming and outgoing data and controlling this fully automatic statistical process, which required defining quality indicators on the raw and processed sensor data. The third challenge was determining calibration weights based on the geographic locations of the sensors on the roads. This was needed because road sensors are not uniformly distributed over the road network. Because the number of active sensors fluctuates over time, the weights need to be redetermined periodically. As a result of these steps, accurate figures could be produced on traffic intensity for various periods and regions in the Netherlands.
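To illustrate why location-based calibration weights matter when sensors are unevenly spread, the following is a minimal sketch and not the procedure used in the article. It assumes a single straight road, sensor positions given in kilometres, and a simple rule in which each sensor represents the stretch of road closest to it. The function names `calibration_weights` and `weighted_intensity` and all parameters are hypothetical.

```python
from typing import List


def calibration_weights(positions_km: List[float],
                        road_start_km: float,
                        road_end_km: float) -> List[float]:
    """Weight each sensor by the share of road length closest to it.

    Assumption for illustration only: one road, and a nearest-sensor
    rule for dividing the road among the active sensors.
    """
    if not positions_km:
        raise ValueError("at least one active sensor is required")
    # Indices of the sensors sorted by position along the road.
    order = sorted(range(len(positions_km)), key=lambda i: positions_km[i])
    pos = [positions_km[i] for i in order]
    # Boundaries between neighbouring sensors lie halfway between them.
    midpoints = [(a + b) / 2 for a, b in zip(pos, pos[1:])]
    bounds = [road_start_km] + midpoints + [road_end_km]
    total = road_end_km - road_start_km
    weights = [0.0] * len(pos)
    for rank, i in enumerate(order):
        weights[i] = (bounds[rank + 1] - bounds[rank]) / total
    return weights


def weighted_intensity(counts: List[float], weights: List[float]) -> float:
    """Weighted mean of per-sensor vehicle counts."""
    return sum(c * w for c, w in zip(counts, weights))


if __name__ == "__main__":
    # Two sensors clustered near the start, one near the end of a 10 km road.
    w = calibration_weights([1.0, 2.0, 9.0], 0.0, 10.0)
    print(w, weighted_intensity([100, 100, 200], w))
```

In this sketch the two clustered sensors share a small part of the road, so the isolated sensor receives a larger weight than it would in an unweighted mean. Since the set of active sensors changes over time, the weights would be recomputed whenever a sensor appears or drops out.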