Spatio-temporal Keywords Queries in HBase
Received: 01 May 2015
Revised: 01 August 2015
Published: 01 January 2016
Citation: Xiaoying Chen, Chong Zhang, Zonglin Shi, Weidong Xiao. Spatio-temporal Keywords Queries in HBase[J]. Big Data and Information Analytics, 2016, 1(1): 81-91. doi: 10.3934/bdia.2016.1.81
Abstract
As accumulated data grows to the tens-of-billions scale, HBase, a distributed key-value database, plays a significant role in providing efficient, high-throughput data service and management. However, HBase offers no good solution for applications involving spatio-temporal data, because it processes such queries inefficiently. In this paper, we pose the spatio-temporal keyword search problem for HBase, which is both meaningful in practice and a new challenge on this platform. To solve it, we design a novel access model for HBase that combines row keys for indexing the spatio-temporal dimensions with Bloom filters for quickly detecting whether query keywords are present. We then develop two algorithms for spatio-temporal keyword queries: one suited to queries with ordinary selectivity, and a parallel algorithm based on MapReduce for large-range queries. We evaluate our algorithms on a real dataset, and the empirical results show that they handle spatio-temporal keyword queries efficiently.
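To make the access model more concrete, the following is a minimal illustrative sketch, not the paper's actual design. The abstract does not specify the row key layout or the Bloom filter parameters, so everything here is an assumption: the row key is taken to be a time bucket followed by a Z-order (bit-interleaved) cell code, and the names `z_order`, `row_key`, and `BloomFilter` are hypothetical. It uses plain Python and does not call any HBase API.

```python
import hashlib


def z_order(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order value."""
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)       # x occupies even bit positions
        z |= ((y >> i) & 1) << (2 * i + 1)   # y occupies odd bit positions
    return z


def row_key(time_bucket: int, x: int, y: int) -> bytes:
    """Assumed layout: 4-byte time bucket, then 4-byte Z-order cell code.

    Big-endian encoding keeps byte-wise (lexicographic) order consistent
    with numeric order, which is how a key-value store sorts row keys.
    """
    return time_bucket.to_bytes(4, "big") + z_order(x, y).to_bytes(4, "big")


class BloomFilter:
    """Small Bloom filter for keyword membership tests (may give false positives)."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, word: str):
        # Derive num_hashes positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{word}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, word: str) -> None:
        for pos in self._positions(word):
            self.bits |= 1 << pos

    def might_contain(self, word: str) -> bool:
        # False means definitely absent; True means possibly present.
        return all((self.bits >> pos) & 1 for pos in self._positions(word))
```

Under these assumptions, a query would scan the row key range covering its time buckets and spatial cells, then skip any row whose Bloom filter reports that a query keyword is definitely absent, so that only the remaining rows need their full keyword lists checked.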