Research article

A transformer-driven framework for multi-label behavioral health classification in police narratives


  • Received: 14 November 2024 Revised: 09 December 2024 Accepted: 12 December 2024 Published: 18 December 2024
  • Transformer-based models have proven highly effective for complex tasks across a wide range of domains due to their robust and flexible architecture. However, their generic nature often limits their effectiveness on domain-specific tasks unless they are substantially fine-tuned. Behavioral health plays a vital role in individual well-being and community safety, as it influences interpersonal interactions and can significantly impact public safety. Identifying and classifying such cases therefore demands an effective framework fine-tuned to context-specific behavioral health issues. In this work, we demonstrated a lightweight, trainable approach to behavioral health analysis that utilizes feature embeddings generated by transformer-based models. To facilitate domain adaptation, we created instruction sets based on annotations by subject matter experts, enabling targeted fine-tuning of a large language model (LLM) for behavioral health applications. Our experiments demonstrated that parameter-frozen transformer-based models capture high-quality feature representations that allow integration with a lightweight framework, making them especially useful in resource-constrained settings.
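    The pipeline outlined above — features extracted by a parameter-frozen transformer feeding a small trainable head — can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the frozen encoder is stood in by a fixed random projection (in the paper's setting it would be a pre-trained transformer whose weights are never updated), and the label names are hypothetical.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Stand-in for a parameter-frozen transformer encoder: a fixed projection
    # that is never updated during training.
    D_IN, D_EMB = 64, 32
    W_frozen = rng.normal(size=(D_IN, D_EMB))  # frozen: never trained

    def embed(x):
        """Extract feature embeddings with the frozen encoder."""
        return np.tanh(x @ W_frozen)

    # Lightweight trainable head: one sigmoid output per label, since
    # multi-label classes are independent rather than mutually exclusive.
    LABELS = ["mental_health", "substance_abuse", "domestic_violence"]  # hypothetical
    W_head = np.zeros((D_EMB, len(LABELS)))
    b_head = np.zeros(len(LABELS))

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def predict(x):
        """Per-label probabilities for a batch of narratives' feature vectors."""
        return sigmoid(embed(x) @ W_head + b_head)

    def train_step(x, y, lr=0.1):
        """One gradient step of binary cross-entropy; only the head is updated."""
        global W_head, b_head
        h = embed(x)
        p = sigmoid(h @ W_head + b_head)
        grad = p - y  # dBCE/dlogits for sigmoid outputs
        W_head -= lr * h.T @ grad / len(x)
        b_head -= lr * grad.mean(axis=0)

    # Toy data: narratives reduced to feature vectors, multi-hot label targets.
    X = rng.normal(size=(128, D_IN))
    Y = (embed(X) @ rng.normal(size=(D_EMB, len(LABELS))) > 0).astype(float)
    for _ in range(200):
        train_step(X, Y)
    acc = ((predict(X) > 0.5) == Y).mean()
    ```

    The point of the design is that only `W_head` and `b_head` receive gradient updates; the expensive encoder runs forward-only, which is what makes the approach attractive in resource-constrained settings.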

    Citation: Francis Nweke, Abm Adnan Azmee, Md Abdullah Al Hafiz Khan, Yong Pei, Dominic Thomas, Monica Nandan. A transformer-driven framework for multi-label behavioral health classification in police narratives[J]. Applied Computing and Intelligence, 2024, 4(2): 234-252. doi: 10.3934/aci.2024014



  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
