Among the common formats for representing 3D objects, including depth images, meshes and volumetric grids, the point cloud is the most widely used and preferred, because it preserves the original geometric information in 3D space without any discretization and provides a comprehensive description of the target object. However, because point clouds are unordered and unstructured, conventional deep learning methods such as convolutional neural networks cannot be applied to them directly, which makes extracting semantic features from them challenging. This paper proposes a feature fusion algorithm based on attention graph convolution and error feedback, which accounts for global features, local features and the loss of features during learning. Comparison experiments on the ModelNet40 and ShapeNet datasets verify the performance of the proposed algorithm: it achieves a classification accuracy of 93.1% and a part-segmentation mIoU (mean Intersection over Union) of 85.4%. Our algorithm outperforms state-of-the-art methods, effectively improving the accuracy of point cloud classification and segmentation while converging faster.
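The abstract names attention graph convolution as the core operation. The paper's exact architecture is not given here, so the following is only an illustrative sketch of the general idea: build a k-nearest-neighbor graph over the points, form DGCNN-style edge features [x_i, x_j − x_i], and aggregate neighbor features with softmax attention weights. The weight matrices `w_feat` and `w_attn` and all shapes are assumptions for the example, not the paper's parameters.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest Euclidean neighbors for each point."""
    # points: (N, 3) array of xyz coordinates
    d2 = np.sum((points[:, None, :] - points[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(d2, np.inf)          # exclude self-matches
    return np.argsort(d2, axis=1)[:, :k]  # (N, k)

def attention_graph_conv(points, k, w_feat, w_attn):
    """One attention-weighted graph convolution over a kNN graph.

    Edge features follow the DGCNN convention [x_i, x_j - x_i];
    attention is a softmax over each point's k neighbors.
    """
    idx = knn_indices(points, k)                    # (N, k)
    nbrs = points[idx]                              # (N, k, 3)
    ctr = np.repeat(points[:, None, :], k, axis=1)  # (N, k, 3)
    edges = np.concatenate([ctr, nbrs - ctr], -1)   # (N, k, 6)
    feat = np.maximum(edges @ w_feat, 0)            # (N, k, C) ReLU features
    scores = edges @ w_attn                         # (N, k, 1) attention logits
    alpha = np.exp(scores - scores.max(axis=1, keepdims=True))
    alpha /= alpha.sum(axis=1, keepdims=True)       # softmax over neighbors
    return (alpha * feat).sum(axis=1)               # (N, C) aggregated output

rng = np.random.default_rng(0)
pts = rng.random((1024, 3), dtype=np.float32)       # a toy 1024-point cloud
out = attention_graph_conv(pts, k=20,
                           w_feat=rng.standard_normal((6, 64)),
                           w_attn=rng.standard_normal((6, 1)))
print(out.shape)  # (1024, 64)
```

Because the graph is rebuilt from the current coordinates, the same pattern extends to dynamic-graph variants that recompute neighbors in feature space at each layer.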
Citation: Chengyong Yang, Jie Wang, Shiwei Wei, Xiukang Yu. A feature fusion-based attention graph convolutional network for 3D classification and segmentation[J]. Electronic Research Archive, 2023, 31(12): 7365-7384. doi: 10.3934/era.2023373
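The part-segmentation result above is reported in mIoU. As a reminder of the metric, the sketch below computes mean Intersection over Union for a single shape: per part label, the overlap between predicted and ground-truth point sets divided by their union, averaged over labels. The toy labels are made up for illustration; the convention of scoring an absent part as 1.0 is one common choice, not necessarily the paper's.

```python
import numpy as np

def mean_iou(pred, target, num_parts):
    """Mean Intersection over Union across part labels for one shape."""
    ious = []
    for c in range(num_parts):
        inter = np.sum((pred == c) & (target == c))  # points both call part c
        union = np.sum((pred == c) | (target == c))  # points either calls part c
        if union == 0:
            ious.append(1.0)  # part absent in both: score as perfect
        else:
            ious.append(inter / union)
    return float(np.mean(ious))

pred = np.array([0, 0, 1, 1, 2, 2])    # predicted part label per point
target = np.array([0, 1, 1, 1, 2, 0])  # ground-truth part label per point
print(mean_iou(pred, target, num_parts=3))  # 0.5
```

Dataset-level mIoU is then typically an average of this per-shape score over all test shapes.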