I have been working on a startup that will use artificial intelligence to rate vehicle service contracts (VSCs). For a VSC provider, increased accuracy means sharper pricing and, potentially, lower reserves. Outsourcing this work to a specialist bureau reduces costs, too. The business model is already established in consumer credit scoring, and the technology is already used to risk-rate auto insurance.
In this article, I present an example using auto insurance data. If you would like to see how our approach works with VSC data, please get in touch. We are currently seeking VSC providers for our pilot program.
The French MTPL dataset is often cited in the AI literature. It gives the claims history for roughly 600,000 auto policies. Of these, I used 90% for training and set aside 10% for testing, so the results shown here are not just curve fitting; they are predictions against data the model never saw.
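A split like this can be sketched with scikit-learn. The data below is a synthetic stand-in (in practice the freMTPL tables would be loaded from CSV), so the column contents and the 5% claim rate are illustrative assumptions, not the real schema:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the MTPL policies (the real dataset has ~600k rows;
# the nine columns and 5% claim rate here are placeholders).
rng = np.random.default_rng(42)
n = 10_000
X = rng.normal(size=(n, 9))             # nine rating features
y = (rng.random(n) < 0.05).astype(int)  # ~5% of policies have a claim

# 90% train / 10% held-out test, as described above.
# stratify=y keeps the claim rate the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, stratify=y, random_state=0
)
```

The stratified split matters with data this imbalanced: without it, a 10% test slice can end up with a noticeably different claim rate than the training set.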
The Gini Coefficient
The challenge with insurance data is that most policies never have a claim. This is known as the imbalanced data problem. If you’re training an AI classifier, it can achieve 95% accuracy simply by predicting “no claim” every time. You will want to use an objective function that heavily penalizes false negatives, and you may also want to oversample the “with claim” cases.
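Both remedies can be sketched in a few lines of NumPy. The 95/5 split and the weighting scheme here are illustrative assumptions, not the exact settings I use:

```python
import numpy as np

rng = np.random.default_rng(0)
y = (rng.random(20_000) < 0.05).astype(int)  # ~5% claims, 95% no-claim

# The naive baseline: always predict "no claim" — about 95% accurate, and useless.
baseline_acc = (y == 0).mean()

# Remedy 1: weight the loss so a missed claim costs ~19x a false alarm.
w_claim = (y == 0).sum() / (y == 1).sum()

# Remedy 2: oversample the "with claim" rows until the classes balance.
claim_idx = np.flatnonzero(y == 1)
extra = rng.choice(claim_idx, size=(y == 0).sum() - len(claim_idx), replace=True)
y_balanced = np.concatenate([y, y[extra]])  # now ~50/50
```

The class weight plugs into most frameworks directly (e.g., `class_weight` in scikit-learn or Keras); oversampling changes the training set itself, so it should never be applied to the test set.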
The dashed line in the chart above represents cumulative actual claims, sorted in order of increasing severity. This is called the Lorenz curve. You can see that it’s zero all the way across and then, at the 95% mark, the claims kick in.
The blue line is the Lorenz curve for the predictive model. A good fit here shows deep concavity, hugging the dashed line: the model estimates low where the actuals are low (zero) and then climbs progressively steeper.
The Gini index is a measure of the Lorenz curve’s concavity. This model’s 0.30 is pretty good; the team that won the Allstate Challenge did it with 0.20. The downside of Gini is that it only tests the model’s ability to rank relative risks, not to predict absolute ones. I have seen models above 0.40 that were still way off on actual dollars.
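One common way to compute a Gini index of this kind is to sort the actual losses by the model’s predicted risk and measure the area between the resulting Lorenz curve and the diagonal. Here is a minimal sketch of that formulation (my own, not necessarily the exact metric behind the chart):

```python
import numpy as np

def gini(actual, predicted):
    """Gini index of a model's Lorenz curve: sort actual losses by
    predicted risk, accumulate, and measure the gap to the diagonal
    (which represents a random ordering)."""
    order = np.argsort(predicted)  # least to most risky, per the model
    lorenz = np.cumsum(np.asarray(actual)[order]) / np.sum(actual)
    n = len(actual)
    diagonal = np.arange(1, n + 1) / n
    # Twice the average gap between the diagonal and the Lorenz curve.
    return 2 * np.mean(diagonal - lorenz)
```

On a 95%-zeros portfolio, a model that ranks the policies perfectly should land near 1, and a random ordering near 0 — which is why a concentrated book makes high Ginis easier to reach than they look.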
Mean Absolute Error
The key metric, to my way of thinking, is the ability to predict the total claims liability. That automatically gives you the mean, and Gini characterizes the distribution. I like MAE because it is denominated in actual dollars and, unlike mean squared error, is not pulled astray by outliers.
Here, you see that the model overestimates by 1.2%.
You may be wondering why MAE is so high, when we are within $1.00 on the average claim. That’s because all of the no-claim people were estimated at an average of $72.50, and they’re 95% of the test set. The average estimate for the group that turned out to have claims (remember, this is out-of-sample data) was $130.70.
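The group-level arithmetic is easy to reproduce on synthetic data. The dollar figures below are the illustrative ones from the text, not model output, and the claim-severity distribution is an assumption:

```python
import numpy as np

# Synthetic test set mirroring the article's shape: ~95% no-claim.
rng = np.random.default_rng(1)
n = 10_000
has_claim = rng.random(n) < 0.05
actual = np.where(has_claim, rng.gamma(2.0, 1000.0, n), 0.0)
# Flat per-group estimates, using the figures quoted in the text.
predicted = np.where(has_claim, 130.70, 72.50)

# MAE is in actual dollars; unlike squared error, it is not dominated
# by a handful of very large claims.
mae = np.abs(actual - predicted).mean()

# For the no-claim group the actual is zero, so each policy's absolute
# error is simply its predicted amount. With 95% of the book in that
# group, MAE stays high even when the portfolio average is spot on.
mae_no_claim = np.abs(actual[~has_claim] - predicted[~has_claim]).mean()
```

Here `mae_no_claim` works out to exactly the group’s average prediction, which is the effect described above.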
For claim severity, I trained a small neural net, including my own custom layers for scaling and encoding. I really like TensorFlow for this because it saves the trained encoders as part of the model. With a small dataset you want a small network; a bigger one can simply memorize the training data and fail to generalize to new policies.
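A minimal Keras sketch of the idea, using the built-in Normalization and StringLookup preprocessing layers as stand-ins for the custom scaling and encoding layers; the feature names, sizes, and severity distribution are all illustrative:

```python
import numpy as np
import tensorflow as tf

# Toy stand-ins for two of the rating features (the real data has nine).
rng = np.random.default_rng(0)
vehicle_age = rng.uniform(0, 20, size=(500, 1)).astype("float32")
region = rng.choice(["A", "B", "C"], size=(500, 1))

# The preprocessing layers are adapted to the training data and then
# live inside the model, so scaling and encoding travel with it on save.
norm = tf.keras.layers.Normalization()
norm.adapt(vehicle_age)
lookup = tf.keras.layers.StringLookup(output_mode="one_hot")
lookup.adapt(region)

num_in = tf.keras.Input(shape=(1,), name="vehicle_age")
cat_in = tf.keras.Input(shape=(1,), dtype=tf.string, name="region")
x = tf.keras.layers.Concatenate()([norm(num_in), lookup(cat_in)])
x = tf.keras.layers.Dense(8, activation="relu")(x)        # deliberately small
out = tf.keras.layers.Dense(1, activation="softplus")(x)  # severity >= 0
model = tf.keras.Model([num_in, cat_in], out)
model.compile(optimizer="adam", loss="mae")
```

Because the adapted layers are part of the graph, `model.save()` carries the scaler means and the category vocabulary along with the weights — no separate encoder files to version.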
This dataset has only nine features and, in fairness, a linear model would fit it just as well. My code repo is now filled with neural nets, random forests, and two-stage combo models. What this means for our startup is that we don’t have to hire a platoon of actuaries. We can get by with a few data scientists using AI as a “force multiplier.”
Earlier this century, I played a key role in moving the industry to electronic origination. At the time, it was clear that the API approach would liberate VSC pricing from the confines of printed rate cards and broad risk classes. Each rate quote could be tailored to the individual vehicle.
As I said earlier, our approach is current, proven, and working elsewhere. It’s just not being used in the VSC industry … yet.