Published in: BMC Medical Informatics and Decision Making 1/2023

Open Access 01-12-2023 | Research

Health insurance fraud detection by using an attributed heterogeneous information network with a hierarchical attention mechanism

Authors: Jiangtao Lu, Kaibiao Lin, Ruicong Chen, Min Lin, Xin Chen, Ping Lu

Abstract

Background

With the rapid growth of healthcare services, health insurance fraud detection has become an important measure for ensuring the efficient use of public funds. Traditional fraud detection methods tend to focus on the attributes of a single visit and ignore the behavioural relationships among a patient's multiple visits.

Methods

We propose MHAMFD, a health insurance fraud detection model based on a multilevel attention mechanism. Specifically, we use an attributed heterogeneous information network (AHIN) to model the different types of objects in a healthcare scenario, together with their rich attributes and interactions. MHAMFD selects appropriate neighbour nodes based on the behavioural relationships at different levels of a patient’s visit. We also design a hierarchical attention mechanism to aggregate the complex semantic information that arises from the interweaving of patients’ behavioural relationships at different levels. This enriches the feature representation of objects and makes the model interpretable by identifying the main factors of fraud.
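
The two-level aggregation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the visit features, the relation names (`same_patient`, `same_doctor`), the dot-product scoring and the fixed `RELATION_QUERY` vector are all assumptions chosen for illustration, whereas a trained model would learn its scoring parameters. The sketch first attends over the neighbours within each behavioural relation, then attends over the relations themselves.

```python
import math

def softmax(scores):
    """Normalise raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]

# Hypothetical attribute vectors for three visit nodes in a toy AHIN.
FEATURES = {
    "visit_1": [1.0, 0.0],
    "visit_2": [0.0, 1.0],
    "visit_3": [1.0, 1.0],
}

# Hypothetical neighbours of each target node, grouped by behavioural relation.
NEIGHBOURS = {
    "visit_1": {
        "same_patient": ["visit_2"],
        "same_doctor": ["visit_2", "visit_3"],
    },
}

# Assumed stand-in for a learned relation-level query vector.
RELATION_QUERY = [1.0, 1.0]

def node_level(target, neighbour_ids):
    """Lower level: attend over the neighbours found under one relation."""
    h = FEATURES[target]
    candidates = [FEATURES[n] for n in neighbour_ids]
    weights = softmax([dot(h, c) for c in candidates])
    return weighted_sum(weights, candidates), weights

def embed(target):
    """Upper level: attend over the per-relation embeddings."""
    relations = sorted(NEIGHBOURS[target])
    per_relation = [node_level(target, NEIGHBOURS[target][r])[0] for r in relations]
    rel_weights = softmax([dot(RELATION_QUERY, e) for e in per_relation])
    final = weighted_sum(rel_weights, per_relation)
    return final, dict(zip(relations, rel_weights))

embedding, relation_weights = embed("visit_1")
print(embedding)
print(relation_weights)
```

In a sketch like this, the relation-level weights give a rough view of which behavioural relation contributed most to a node's representation, which is the kind of signal an attention-based model can expose for interpretation.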

Results

Experimental results on real datasets showed that MHAMFD detected health insurance fraud with higher accuracy than existing methods.

Conclusions

The experiments suggest that the behavioural relationships among patients’ multiple visits can be of great help in detecting health care fraud. Future fraud detection methods can also take into account the different behavioural relationships between patients.
Metadata
Publication date
01-12-2023
Publisher
BioMed Central
Published in
BMC Medical Informatics and Decision Making / Issue 1/2023
Electronic ISSN: 1472-6947
DOI
https://doi.org/10.1186/s12911-023-02152-0