Machine Learning-Based Cost Prediction of Health Insurance Claims Using Medical Dataset

Dinesh Yadav

Abstract

Accurate prediction of health insurance claim costs allows insurers to price policies, evaluate risk, and allocate resources across areas of health care. When claim costs are hard to anticipate, healthcare companies are less prepared for financial planning and risk management. As more medical data becomes accessible, machine learning (ML) approaches can model the complex factors that drive cost changes. This paper proposes using the Light Gradient Boosting Machine (LightGBM) to predict the amount paid on health insurance claims from a medical insurance dataset. The dataset, which contains both numerical and categorical features, is preprocessed by encoding the categorical variables and handling missing values, and is then split into training and testing sets. The proposed model outperforms conventional regression models, achieving an R-squared (R²) value of 86.81%, a mean absolute error (MAE) of 2381.57, and a root mean squared error (RMSE) of 4450.43. Experimental results and visualizations show that the model captures non-linear patterns and improves prediction accuracy. The study concludes that LightGBM can make health insurance cost predictions more accurate.
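The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It uses the standard definitions of R², MAE, and RMSE; the function name `regression_metrics` and the sample claim amounts are hypothetical and are not taken from the paper or its dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for paired lists of actual and predicted costs."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    # MAE: average absolute difference between actual and predicted cost.
    mae = sum(abs(e) for e in errors) / n

    # RMSE: square root of the mean squared error; penalizes large misses more.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # R^2: share of the variance in actual costs explained by the predictions.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    r2 = 1.0 - ss_res / ss_tot
    return r2, mae, rmse


# Hypothetical claim amounts, for illustration only (not the paper's data).
actual = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1100.0, 1900.0, 3200.0, 3800.0]
r2, mae, rmse = regression_metrics(actual, predicted)
print(f"R2={r2:.4f} MAE={mae:.2f} RMSE={rmse:.2f}")
```

MAE and RMSE are expressed in the same units as the claim amount, while R² is unitless, so the reported 86.81% corresponds to an R² of about 0.8681 on this scale.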

Article Details

Section

Research Paper

How to Cite

Machine Learning-Based Cost Prediction of Health Insurance Claims Using Medical Dataset. (2025). Journal of Global Research in Multidisciplinary Studies (JGRMS), 1(6), 8-12. https://doi.org/10.5281/