Research article

The 3D-aware image synthesis of prohibited items in the X-ray security inspection by stylized generative radiance fields

  • Received: 11 December 2023 Revised: 24 January 2024 Accepted: 30 January 2024 Published: 29 February 2024
  • The merging of neural radiance fields with generative adversarial networks (GANs) makes it possible to synthesize novel views of objects from latent code (noise). However, the challenge for generative neural radiance fields (NeRFs) is that a single multilayer perceptron (MLP) network represents a scene or object, and the shape and appearance of the generated object are unpredictable owing to the randomness of the latent code. In this paper, we propose a stylized generative radiance field (SGRF) to produce 3D-aware images with explicit control. To achieve this goal, we manipulated the input and output of the MLP in the model to entangle label codes into, and disentangle them from, the latent code, and incorporated an extra discriminator to differentiate the class and color mode of the generated object. Based on the labels provided, the model could generate images of prohibited items varying in class, pose, scale, and color mode, thereby significantly increasing the quantity and diversity of images in the dataset. A systematic analysis of the results demonstrated that the method is effective in improving the detection performance of deep learning algorithms during security screening.
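    The paper's own code is not reproduced here, but the conditioning mechanism the abstract describes (entangling a label code with the latent code at the MLP input) can be illustrated with a minimal PyTorch sketch. All layer sizes, tensor shapes, and names below are illustrative assumptions, and the volume-rendering pipeline and the extra class/color-mode discriminator are omitted.

    ```python
    import torch
    import torch.nn as nn


    class LabelConditionedRadianceMLP(nn.Module):
        """Sketch of a NeRF-style MLP conditioned on a combined latent + label code."""

        def __init__(self, pos_dim=63, z_dim=128, label_dim=8, hidden=256):
            super().__init__()
            self.trunk = nn.Sequential(
                nn.Linear(pos_dim + z_dim + label_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
            )
            self.sigma_head = nn.Linear(hidden, 1)  # volume density at each point
            self.rgb_head = nn.Sequential(nn.Linear(hidden, 3), nn.Sigmoid())  # color

        def forward(self, x_encoded, z, label):
            # "Entangle" the label code into the latent code by concatenation,
            # then condition every sampled 3D point on the combined code.
            code = torch.cat([z, label], dim=-1)        # (1, z_dim + label_dim)
            code = code.expand(x_encoded.shape[0], -1)  # broadcast to all points
            h = self.trunk(torch.cat([x_encoded, code], dim=-1))
            return self.sigma_head(h), self.rgb_head(h)


    # Hypothetical usage: 1024 positionally encoded sample points, one object code.
    mlp = LabelConditionedRadianceMLP()
    x = torch.randn(1024, 63)   # encoded 3D points (63 = 3 + 2 * 3 * 10 frequencies)
    z = torch.randn(1, 128)     # random latent (shape/appearance) code
    c = torch.zeros(1, 8)
    c[0, 2] = 1.0               # one-hot class / color-mode label
    sigma, rgb = mlp(x, z, c)   # (1024, 1) densities, (1024, 3) colors
    ```

    An auxiliary discriminator head in the style of ACGAN would then recover ("disentangle") the class and color-mode labels from rendered patches, which is what allows the supplied labels to steer generation.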

    Citation: Jian Liu, Zhen Yu, Wenyu Guo. The 3D-aware image synthesis of prohibited items in the X-ray security inspection by stylized generative radiance fields[J]. Electronic Research Archive, 2024, 32(3): 1801-1821. doi: 10.3934/era.2024082






  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
