Risk Prediction in the Life Insurance Industry Using Federated Learning Approach
Gupta H.;Puliafito A.
2022-01-01
Abstract
In the life insurance business, evaluating a customer's application to assign a risk level is a task of utmost importance, as it informs policy design and determines the premium the customer must pay. The advent of Machine Learning (ML) has made such tasks easier, and many insurance companies are adopting suitable ML models for this purpose. Building an effective model requires that these companies collaborate and share their data, but the data held by each company are highly sensitive because they contain customers' private information, which makes sharing them very risky. Federated Learning (FL) addresses this problem by training in a distributed fashion without sharing the actual data. In this paper, we exploit FL to provide risk prediction in the life insurance industry. The study uses the dataset from the Kaggle Prudential Life Insurance Assessment. To simulate different distributions of data among clients, we use the Dirichlet Process to partition the data. Different values of the concentration parameter are used to generate distributions that cover a spectrum of similarity, and the number of clients involved in the learning process is also varied. The results show the validity of the proposed approach.
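The partitioning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common label-based scheme in which, for each class, client shares are drawn from a symmetric Dirichlet distribution with concentration parameter alpha. The names `sample_dirichlet`, `dirichlet_partition`, `alpha`, and `num_clients` are hypothetical, and the labels are synthetic stand-ins rather than the Kaggle data.

```python
import random
from collections import Counter, defaultdict


def sample_dirichlet(alpha, k, rng):
    """Draw one sample from a symmetric Dirichlet(alpha) over k categories.

    Uses the standard construction: normalise k independent Gamma(alpha, 1) draws.
    """
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:  # guard against numerical underflow for tiny alpha
        return [1.0 / k] * k
    return [d / total for d in draws]


def dirichlet_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices among clients, class by class (illustrative only).

    Small alpha gives each client a skewed class mix (heterogeneous clients);
    large alpha gives clients similar class mixes.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    clients = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        shares = sample_dirichlet(alpha, num_clients, rng)
        start, cumulative = 0, 0.0
        for client_id in range(num_clients):
            cumulative += shares[client_id]
            # The last client takes the remainder so every index is assigned.
            if client_id == num_clients - 1:
                end = len(indices)
            else:
                end = int(cumulative * len(indices))
            end = max(start, min(end, len(indices)))
            clients[client_id].extend(indices[start:end])
            start = end
    return clients


if __name__ == "__main__":
    # Synthetic labels with 8 classes, used purely as a toy stand-in.
    toy_labels = [i % 8 + 1 for i in range(800)]
    for alpha in (0.1, 100.0):
        parts = dirichlet_partition(toy_labels, num_clients=4, alpha=alpha)
        for client_id, part in enumerate(parts):
            histogram = Counter(toy_labels[i] for i in part)
            print(alpha, client_id, sorted(histogram.items()))
```

Running the script prints the per-client class histogram for a small and a large concentration value, which makes the "spectrum of similarity" visible: under this assumed scheme, lower alpha values concentrate each class on fewer clients, while higher values spread every class more evenly.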


