Spatial co-location pattern mining discovers subsets of spatial features whose instances are frequently observed together in nearby geographic space. Traditional co-location pattern mining methods spend considerable time and space checking the clique relationship of row instances. To reduce this cost, existing work adopted density peak clustering to materialize the neighbor relationship between instances instead of judging it by a specific distance threshold. This approach had two drawbacks: first, it did not account for the fuzziness of the distance between a cluster center and other instances when calculating the local density; second, forcing each instance to be divided into every cluster reduced the accuracy of the fuzzy participation index calculation. To solve these problems, this paper proposes three improvement strategies for density peak clustering in co-location pattern mining and puts forward a new prevalence measure for co-location patterns. We then design a spatial co-location pattern mining algorithm based on the improved density peak clustering and the fuzzy neighbor relationship. Experiments on synthetic and real datasets show that, compared with the existing method, the proposed algorithm is more effective and significantly reduces the time and space consumed in the phase of generating prevalent co-location patterns.
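For orientation, the sketch below computes the two quantities used by standard density peak clustering: the local density of each instance and its distance to the nearest instance of higher density. It illustrates the baseline technique only and is not the improved clustering proposed in this paper. The function name `density_peaks`, the cutoff parameter `d_c`, and the sample coordinates are assumptions made for illustration. The Gaussian-kernel density is a common soft alternative to the crisp count and is included only to show how distance can weight an instance's contribution.

```python
import numpy as np

def density_peaks(points, d_c):
    """Return (rho_cutoff, rho_gauss, delta) for standard density peak clustering.

    points: sequence of coordinates; d_c: cutoff distance (hypothetical parameter name).
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise Euclidean distances, shape (n, n).
    dist = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    off_diag = ~np.eye(n, dtype=bool)  # mask that excludes each point itself

    # Crisp local density: number of other points closer than d_c.
    rho_cutoff = ((dist < d_c) & off_diag).sum(axis=1)
    # Soft local density: Gaussian kernel, so nearer points contribute more.
    rho_gauss = (np.exp(-(dist / d_c) ** 2) * off_diag).sum(axis=1)

    # delta: distance to the nearest point of higher density.
    # The densest point has no such neighbor and gets its largest distance instead.
    delta = np.empty(n)
    for i in range(n):
        higher = rho_gauss > rho_gauss[i]
        delta[i] = dist[i, higher].min() if higher.any() else dist[i].max()
    return rho_cutoff, rho_gauss, delta

# Three nearby instances and one isolated instance (made-up coordinates).
sample = [[0, 0], [0, 1], [1, 0], [10, 10]]
rho_cutoff, rho_gauss, delta = density_peaks(sample, d_c=1.5)
```

Instances with both a high local density and a large `delta` are the candidates for cluster centers. In this sample the first instance has the highest soft density, while the isolated instance has a large `delta` but a density close to zero.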
Citation: Meijiao Wang, Yu Chen, Yunyun Wu, Libo He. Spatial co-location pattern mining based on the improved density peak clustering and the fuzzy neighbor relationship[J]. Mathematical Biosciences and Engineering, 2021, 18(6): 8223-8244. doi: 10.3934/mbe.2021408