Edtech Unicorns and JIT Training

Udemy had its IPO last week, and PitchBook just published a note on the category, so I thought I’d write about my positive experiences with Coursera.  Online learning is segmented by subject, level, and quality of instruction.  See the research note for a complete rundown.

The edtech boom has not waned now that most schools and universities are again meeting in person. 

Coursera is oriented toward college credit and professional certification.  My instructor for neural nets, Coursera co-founder Andrew Ng, is a professor at Stanford.  Coursera also offers online degree programs in conjunction with major universities.  For example, you can earn a Master’s in Data Science through CU Boulder.

I was intrigued by that, but … I have a specific business problem to solve, and I already have grad-level coursework in statistics.  It doesn’t make sense for me to sit through STAT 561 again.  For me, the “all you can eat” plan is a better value at $50 per month.

What I need, today, is to move this code off my laptop and into the cloud.  For that, I can take the cloud deployment class.  If I run into problems with data wrangling, there’s a class for that, too.  This reminds me of that scene in The Matrix, where Trinity learns to fly a helicopter.

People can gain the skills they need, as and when they need them – not as fast as Trinity, but fast enough to keep up with evolving needs on the job.  I think this is the future of education, and 37 million students agree with me.

Network Effects in Dealer Systems

Last month, I wrote that the recent acquisitions of several Digital Retail vendors were driven by the need to accumulate dealer data for predictive analytics.  Today, I’d like to discuss another of Professor Rogers’ five themes, “network effects,” and how it applies to F&I software.

We’ll consider a hypothetical company that supplies admin software for F&I products, and also sells one or more dealer systems.  Having two distinct, but related, customer groups will allow us to explore “cross-side” network effects.

If the value of being in the network increases with the size of the network, as it often does, then there is a positive network effect.  Social networks are the model case.  The more people who are on Facebook, the more valuable Facebook is to its users (and its advertisers).

This is the textbook definition of “network effects,” but it’s only one part of what Iansiti and Lakhani call Strategic Network Analysis.  Below is a handy outline.  This article will walk through the outline using our hypothetical company – and some real ones from my experience.

Network Strategy Checklist

  1. Network effects (good) – Value grows as the square of the node count … maybe.
  2. Learning effects (good) – There is valuable data to be gleaned from the network.
  3. Clustering (bad) – You can be picked apart, one cluster at a time.
  4. Synergies (good) – Your business includes another network that talks to this one.
  5. Multihoming (bad) – Easy for customers to use multiple networks.
  6. Disintermediation (bad) – Easy for customers to go around your network.
  7. Bridging (good) – Opportunity to connect your network to others.

By the end of this article, you will understand how network effects relate to the data concepts from the earlier article, and how to apply them to your own software.

Speaking of vocabulary, let’s agree that “network” simply means all of the customers connected to your software, even if they aren’t connected to each other.  It will be our job to invent positive network effects for the company.

The early thinking about networks dealt with actual communication networks, where adding the nth telephone made possible n-1 new connections.  This gave rise to Metcalfe’s Law, which says that the value of a network increases with the square of its size.
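
To make the arithmetic concrete, here is a quick sketch – plain Python, not anyone’s valuation model – counting the possible pairwise connections as the network grows:

```python
# Possible pairwise connections in a fully connected network of n nodes.
def possible_connections(n: int) -> int:
    return n * (n - 1) // 2

for n in (10, 100, 1000):
    print(n, possible_connections(n))
# 10 -> 45, 100 -> 4,950, 1,000 -> 499,500: roughly n**2 / 2, hence "square of its size."
```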

Working Your Network

If you are supporting a “peer-to-peer” activity among your dealers, like Bruce Thompson’s auction platform, Car Offer, then you have Metcalfe’s Law working for you.  By the way, Bruce’s company was among those in the aforementioned wave of acquisitions.

Research has shown that naturally occurring networks, like Facebook, do not exhibit Metcalfe-style connectivity.  They exhibit clustering, and have far fewer than O(n²) links.  Clustering is bad – point #3, above – because it makes your network vulnerable to poaching.

Even if you don’t have network effects, per se, you can still organize learning effects using your dealers’ data.  Let’s say you have a reporting system that shows how well each dealer did on PVR last month.  Add some analytics, and you can show that, although a given dealer improved by 10%, he is still in the bottom quintile among medium-sized Ford dealers.
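
Here is a minimal sketch of that kind of peer ranking, using pandas.  The column names and the numbers are invented for illustration; they are not from any real reporting system.

```python
import pandas as pd

# Hypothetical monthly report: one row per dealer, with last month's PVR.
report = pd.DataFrame({
    "dealer_id": [101, 102, 103, 104, 105, 106],
    "brand":     ["Ford"] * 6,
    "size_band": ["medium"] * 6,
    "pvr":       [850, 1200, 990, 1430, 760, 1105],
})

# Quintile rank (1 = bottom, 5 = top) within each brand / size peer group.
report["pvr_quintile"] = (
    report.groupby(["brand", "size_band"])["pvr"]
          .transform(lambda s: pd.qcut(s.rank(method="first"), 5, labels=range(1, 6)))
)
print(report.sort_values("pvr"))
```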

That’s descriptive analytics.  To make it prescriptive, let’s say our hypothetical company also operates a menu system.  Now, we can use historical data to predict which F&I product is most likely to be sold on the next deal.  The same technique can be applied to Digital Retail, desking, choosing a vehicle, etc.
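
A sketch of the predictive piece, using scikit-learn on made-up deal history – the features and product labels here are purely illustrative, not a real menu system’s data model:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical historical deals, pooled across dealers.
deals = pd.DataFrame({
    "ltv":      [1.15, 0.80, 1.30, 0.95, 1.25, 0.70],   # loan-to-value
    "term":     [72, 48, 84, 60, 72, 36],               # months
    "new_used": [1, 0, 1, 1, 0, 0],                      # 1 = new, 0 = used
    "product":  ["GAP", "None", "GAP", "VSC", "GAP", "None"],
})

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(deals[["ltv", "term", "new_used"]], deals["product"])

# Which F&I product is most likely on the next deal?
next_deal = pd.DataFrame({"ltv": [1.20], "term": [72], "new_used": [1]})
print(model.predict(next_deal)[0])
print(dict(zip(model.classes_, model.predict_proba(next_deal)[0])))
```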

Note that data from our reporting system is now driving analytics for our menu system – and it is pooled across dealers.  Any data we can accumulate is fair game.  This is why I recently advised one of my clients to “start hoarding now” for a prospective AI project.

Cross-Side Network Effects

So far, we’ve covered points 1-3 for our hypothetical company’s dealer network.  I’ll leave their provider network as an exercise for the reader, and move on to point #4.  This is where your business serves two groups, and its value to group A increases with the size of group B.

I like to say “cross-side” because that clearly describes the structure.  Iansiti and Lakhani say “synergy.”  Another popular term is “marketplace,” as in Amazon Marketplace, which I don’t like as much because of its end-consumer connotation.

Is there an opportunity for cross-side effects between dealers and F&I providers?  Obviously – this is the business model I devised for Provider Exchange Network ten years ago.  Back then, it was voodoo magic, but a challenger today would face serious problems.

It’s hard to bootstrap a network, and it’s twice as hard to bootstrap a marketplace.  In the early days at PEN, we had exactly one (1) dealer system, which did not attract a lot of providers.  This, in turn, did not attract a lot of dealer systems.  Kudos to Ron Greer for breaking the deadlock.

Worse, while PEN is a “pure play” marketplace, our hypothetical software company sells its own menu system.  This will deter competing menu systems from coming onboard.  I’ll take up another of Professor Rogers’ themes, “working with your competitors,” in a later post.

Finally, network effects are a “winner takes all” proposition.  Once everybody is on Facebook, it’s hard to enroll them into another network.  That’s not to say it can’t be done.  Brian Reed’s F&I Express successfully created a dealer-to-provider marketplace that parallels PEN.

This brings us to point #5, “multihoming.”  Most F&I product providers are willing to be on multiple networks.  When I was doing this job for Safe-Guard, we ran most of our traffic through PEN, but also F&I Express and Stone Eagle, plus a few standalone menu systems.

The cost of multihoming is felt more by the dealer systems, which are often small and struggle to develop multiple connections.  On the other hand, Maxim and Vision insisted on connecting to us directly.  This is point #6, “disintermediation.”

New Kinds of Traffic

Fortunately for our hypothetical company, Digital Retail is driving the need for new kinds of traffic between providers and dealer systems.  This means new transaction types or, technically, new JSON payloads.  Transmitting digital media is one I’ve encountered a few times.  Custom (AI-based) pricing is another.
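
For illustration only, here is what one of those new payloads might look like.  The field names are hypothetical; they are not drawn from any actual PEN, F&I Express, or dealer-system schema.

```python
import json

# Hypothetical "digital media" request a dealer system might send to a provider.
payload = {
    "transaction_type": "product_media_request",
    "dealer_id": "D-00417",
    "provider_id": "P-001",
    "vehicle": {"vin": "EXAMPLEVIN0000000", "mileage": 12000, "condition": "new"},
    "products": ["GAP", "VSC"],
    "media": {"formats": ["mp4", "pdf"], "max_duration_sec": 90},
}
print(json.dumps(payload, indent=2))
```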

Controlling software at both ends of the pipeline would allow our hypothetical company to lead the market with the new transaction types.  Key skills are the ability to manage a network and to develop a compelling interface (yes, an API can be “compelling”).

As before, note that the same concepts apply for a dealer-to-lender network, like Route One.  There is even a provider-to-lender network right here in Dallas.  Two, if you count Express Recoveries.

So, now you have examples of Strategic Network Analysis from real-world F&I software.  This is one of the “formal methods” the Virag Consulting website refers to when it talks about placing your software in its strategic context.

If you’ve read this far, you are probably a practitioner yourself, and I hope this contributes to your success.  It should also advance the ongoing discussion of data and analytics in dealer systems.

Looking for Work

I am ready for my next engagement.  This blog, together with my LinkedIn profile, gives some indication of what I have accomplished and what I can do for your business.  There are also some case studies on my website.

I am currently interested in digital retail, digital marketing, and artificial intelligence.  I generally do contract work, but will consider salaried.  If you have a job that requires my particular set of skills, please get in touch.

What is “Real” AI?

Clients ask me this all the time.  They want to know if a proposed new system has the real stuff, or if it’s snake oil.  It’s a tough question, because the answer is complicated.  Even if I dictate some challenge questions, their discussion with the sales rep is likely to be inconclusive.

The bottom line is that we want to use historical data to make predictions.  Here are some things we might want to predict:

  • Is this customer going to buy a car today? (Yes/No)
  • Which protection product is he going to buy? (Choice)
  • What will be my loss ratio? (Number)

In Predictive Selling for F&I, I discussed some ways to predict product sales.  The classic example is to look at LTV and predict whether the customer will want GAP.  High LTV, more likely.  Low LTV, less likely.  With historical data and a little math, you can write a formula to determine the GAP-sale probability.
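
That “little math” can be as simple as a one-variable logistic regression.  Here is a minimal sketch on made-up data, just to show the shape of the calculation:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up history: loan-to-value ratio vs. whether the customer bought GAP.
ltv = np.array([0.70, 0.85, 0.95, 1.05, 1.15, 1.25, 1.35, 1.45]).reshape(-1, 1)
bought_gap = np.array([0, 0, 0, 1, 0, 1, 1, 1])

model = LogisticRegression().fit(ltv, bought_gap)

# Probability of a GAP sale at a given LTV: higher LTV, higher probability.
for x in (0.80, 1.00, 1.40):
    print(x, round(model.predict_proba([[x]])[0, 1], 2))
```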

What is predictive analytics?

If you’re using statistics and one variable, that’s not AI, but it is a handy predictive model just the same.  What if you’re using a bunch of variables, as with linear regression?  Regression is powerful, but it is still an analytical method.

The technical meaning of analytical is that you can solve the problem directly using math, instead of another approach like iteration or heuristics.  Back when I was designing “payment rollback” for MenuVantage, I proved it was possible to algebraically reverse our payment formulas – possible, but not practical.  It made more sense to run the calculations forward, and use iteration to solve the problem.
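
As a sketch of the iterative approach – greatly simplified, since the real formulas included taxes, fees, and products – you can run a standard payment calculation forward and bisect on the principal until you hit the target payment:

```python
def monthly_payment(principal: float, annual_rate: float, months: int) -> float:
    """Plain amortized payment; the real-world version has taxes, fees, products."""
    r = annual_rate / 12.0
    return principal * r / (1.0 - (1.0 + r) ** -months)

def rollback_principal(target_payment: float, annual_rate: float, months: int) -> float:
    """Find the principal that produces the target payment, by bisection."""
    lo, hi = 0.0, 200_000.0
    for _ in range(60):                       # 60 halvings gives plenty of precision
        mid = (lo + hi) / 2.0
        if monthly_payment(mid, annual_rate, months) < target_payment:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Principal that yields a $550/month payment at 7% APR over 72 months.
print(round(rollback_principal(550.0, 0.07, 72), 2))
```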

You can do simple linear regression on a calculator.  In fact, they made us do this in business school.  If you don’t believe me, HP prints the formulas on the back of the HP-12C.  So, while you can make a damned good predictive model using linear regression, it’s still not AI.  It’s predictive analytics.
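
Those calculator formulas are just the closed-form least-squares solution.  Here is the same algebra in Python, with made-up numbers – no iteration, no “learning”:

```python
# Closed-form (analytical) simple linear regression, computed from running sums,
# the same sums a financial calculator accumulates as you key in data points.
xs = [1, 2, 3, 4, 5]           # e.g., months
ys = [12, 15, 19, 22, 26]      # e.g., units sold

n = len(xs)
sx, sy = sum(xs), sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))

slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n
print(slope, intercept)        # 3.5 and 8.3 for this data
```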

By the way, “analytics” is a singular noun, like “physics.”  No one ever says “physics are fun.”  Take that, spellcheck!

What is machine learning?

The distinctive feature of AI is that the system generates a predictive model that is not reachable through analysis.  It will trundle through your historical data using iteration to determine, say, the factor weights in a neural network, or the split values in a decision tree.

“Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.”  – Arthur Samuel

The model improves with exposure to more data (and tuning), hence Machine Learning.  This is very powerful, and will serve as a working definition of “real” AI.

AI is an umbrella term that includes Machine Learning but also algorithms, like expert systems, that don’t learn from experience.  Analytics includes statistical methods that may make good predictions, but these also do not learn.  There is nothing wrong with these techniques.

Here are some challenge questions:

  • What does your model predict?
  • What variables does it use?
  • What is the predictive model?
  • How accurate is it?

A funny thing I learned reading forums like KDnuggets is that kids today learn neural nets first, and then they learn about linear regression as the special case that can be solved analytically.

What is a neural network?

Yes, the theory is based on how neurons behave in the brain.  Image recognition, in particular, owes a lot to the dorsal pathway of the visual cortex.  Researchers take this very seriously, and continue to draw inspiration from the brain.  So, this is great if your client happens to be a neuroscientist.

My client is more likely to be a technology leader, so I will explain neural nets by analogy with linear regression.  Linear regression takes a bunch of “X” variables and establishes a linear relationship among them, to predict the value of a single dependent “Y” variable.  Schematically, every X variable connects directly to the single Y output, with a weight on each connection.

Now suppose that instead of one linear equation, you use regression to predict eight intermediate “Z” variables, and then feed those into another linear model that predicts the original “Y.” Every link in the network has a factor weight, just as in linear regression.

Apart from some finer points (like nonlinear activation functions), you can think of a neural net as a stack of interlaced regression models.
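
Here is a minimal sketch of that picture in NumPy – random weights, one input record, eight intermediate Z variables, and no training – just to show the “stack of regressions” structure:

```python
import numpy as np

rng = np.random.default_rng(0)

# One input record with, say, five "X" variables.
x = rng.normal(size=5)

# Layer 1: eight "regressions" producing the intermediate Z variables.
W1, b1 = rng.normal(size=(8, 5)), rng.normal(size=8)
z = np.maximum(0.0, W1 @ x + b1)     # the finer point: a nonlinear activation (ReLU)

# Layer 2: one more "regression" combining the Zs into the final Y.
W2, b2 = rng.normal(size=8), rng.normal()
y = W2 @ z + b2
print(y)
```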

You may recall that linear regression works by using partial derivatives to find the minimum of an error function parametrized by the regression coefficients.  Well, that’s exactly what the neural network training process does!
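
Here is a bare-bones illustration of that minimization: fitting a one-variable line by gradient descent on made-up data.  Real training frameworks automate exactly this, at vastly larger scale.

```python
import numpy as np

# Toy data with a known relationship: y is roughly 3x + 1, plus a little noise.
x = np.linspace(0, 1, 50)
y = 3 * x + 1 + np.random.default_rng(1).normal(scale=0.05, size=50)

w, b, lr = 0.0, 0.0, 0.1
for _ in range(2000):
    err = (w * x + b) - y              # residuals
    w -= lr * 2 * np.mean(err * x)     # partial derivative of MSE with respect to w
    b -= lr * 2 * np.mean(err)         # partial derivative of MSE with respect to b

print(round(w, 2), round(b, 2))        # lands close to the true 3 and 1
```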

What is deep learning?

This brings us to one final buzzword, Deep Learning.  The more layers in the stack, the smarter the neural net.  There’s no danger of overdoing it, because the model will learn to skip redundant layers.  The popular image recognition model ResNet152 has – you guessed it – 152 layers.

So, it’s deep.  It also sounds cool, as if the model is learning “deeply” which, technically, I suppose it is.  This is not relevant for our purposes, so ignore it unless it affects accuracy.