Just Enough Estimating

Software developers hate to give estimates, and they’re generally bad at it. I can say this because I came up as a developer, and I understand the attitude. We’re wired to dig in and fix problems, not sit around characterizing them. This is one reason we like to code at night. If we finish early, we’ll go to bed. If not, we’ll just keep coding.

Accurate estimating is a serious challenge for managers and scrum coaches. Ironically, it was less of a problem back in the waterfall days. Longer development cycles meant more risk, so we put more effort into estimating. Also, bigger chunks of work meant that an underestimate on one task might be offset by an overestimate somewhere else.

GMAT data sufficiency questions are surprisingly sophisticated, and most students do not truly understand the game or leverage all the hints present in these complex problems.

Contract development firms would live or die by their estimates, so they developed elaborate quantitative tools. See my earlier post on agile development.

This challenge, sizing up a problem without actually solving it, is a testable cognitive skill. I am referring to the dreaded data sufficiency section of the GMAT exam, which is cleverly designed to kill you if you try to solve each problem. The skill is to work each problem only far enough to determine the data requirements. Here’s a sample:

A dealer offered a service contract at a price that would gross 40 percent over cost. Which, if any, of these facts is needed to determine the dealer’s cost: 1) After reducing his price by 10 percent, the dealer sold the contract at a profit of $403, or 2) The dealer sold the contract for $1,953.
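
For the curious, here is a quick check of that sample as a small Python sketch. It does what the test punishes (it solves for the cost), but only to show why statement 1 is enough by itself. The variable names are mine, and reading statement 2 as the reduced selling price is an assumption, not something the problem states.

```python
from fractions import Fraction

# Offered price is 40 percent over cost: price = 1.4 * cost.
markup = Fraction(140, 100)
discount = Fraction(90, 100)  # price after a 10 percent reduction

# Statement 1: after the reduction, the profit was $403.
#   0.9 * 1.4 * cost - cost = 403   ->   0.26 * cost = 403
profit_rate = markup * discount - 1
cost_from_1 = Fraction(403) / profit_rate

# Statement 2 gives only a selling price. It fixes the cost only if we
# also know which price was paid. ASSUMPTION: it was the reduced price.
cost_from_2_if_reduced = Fraction(1953) / (markup * discount)

print(cost_from_1, cost_from_2_if_reduced)
```

Both routes land on the same cost, but statement 2 only gets there once you assume which price was paid. Spotting that gap, without grinding through the arithmetic, is the skill being tested.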

Sometimes, “we don’t know anything until we know everything.” Every piece of the puzzle must snap together. This is true of the development phase, but not of estimating. Making decisions under uncertainty is the theme of most management training.

The diagram above is the ne plus ultra of data sufficiency problems. I found it, not on a test-prep site, but on Twitter. If you’re handy with geometry, you’ll notice that you don’t have enough measurements to fully determine the shape. You can’t compute its area, for instance.

But that’s not the question! Even though the exact shape isn’t fixed, the length of its perimeter is. This is a great example where you don’t need to know everything to know the answer.

Weighted Factors for Product Selection

Every so often, I will write up a standard quantitative procedure, usually because someone has asked me about it.  For instance, see Pay Plan Math, What Is Accuracy, and Know Your Time Series.  Today, it’s weighted-factor analysis for product selection.  At a high level, this procedure is:

  1. Gather your requirements and selection criteria
  2. Quantify how important each criterion is
  3. Grade the vendor responses
  4. Compute numerical scores

Gather Requirements and Criteria

First, through interviews and maybe some direct observation, discover why we need the product.  In my business, this is generally a software product, but it could be anything.  Next, determine what the requirements and selection criteria are.

Selection criteria are the features we will evaluate to decide which product is the best fit, whereas requirements are features the product must have to even be considered.  Don’t make the mistake of thinking requirements are just extra-special criteria.

If you’re looking to buy a car, and gas mileage is on the list, then a hybrid will score well on that criterion.  If you’re only looking to buy a hybrid, then that’s the category, and you’re not looking at gas cars at all.

The purpose of requirements is to define the category of product we’re looking for. If you’re writing an RFP, the criteria are what the vendors respond to, and the requirements determine which vendors get the RFP.  When in doubt, send them the RFP anyway and let the vendor figure it out.

For example, if I am selecting cybersecurity software, I might want endpoint protection (EPP), endpoint detection and response (EDR), managed detection and response (MDR), or even a security operations center (SOC).  These all address the same problem, but they’re not the same product.

Quantify Importance of the Criteria

In the chart, I show criteria scored on a scale of 1 to 5, which is typical. Then, for the sake of example, I norm these to a total score of 100.  This is probably overkill, but it’s fun to have 100 as a baseline.  Later, we’ll do the same with the final score.  Clients love simple numbers.
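
The norming step can be sketched in a few lines of Python. The criteria names and importance scores below are hypothetical, chosen so the arithmetic comes out clean; they are not the values from the chart.

```python
# Hypothetical criteria with 1-to-5 importance scores (illustrative only).
importance = {
    "detection": 5,
    "response time": 4,
    "ease of deployment": 4,
    "reporting": 3,
    "integrations": 2,
    "vendor support": 2,
}

# Norm the scores so the weights total 100.
total = sum(importance.values())
weights = {name: 100 * score / total for name, score in importance.items()}

for name, weight in weights.items():
    print(f"{name:20s} {weight:5.1f}")
```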

One way to explore the criteria is to do a forced ranking from most important to least important.  This is not amenable to quantitative methods, but it’s a good way to get started.  Spend an hour in front of the whiteboard while the client staff fight it out over the ranking, then let them each do the 1 to 5, and average their responses.

Yet another way is to give each participant 100 points to allocate as desired across the criteria.  This is the most accurate, in terms of understanding tradeoffs, and it makes the math easy.
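
Here is a minimal sketch of the 100-point method, assuming three hypothetical participants and four made-up criteria. Because every participant spends exactly 100 points, the averaged weights already total 100 and no norming is needed.

```python
from statistics import mean

# Hypothetical allocations: each participant spreads exactly 100 points.
allocations = {
    "Ana":   {"detection": 40, "deployment": 20, "reporting": 10, "support": 30},
    "Ben":   {"detection": 30, "deployment": 30, "reporting": 20, "support": 20},
    "Chris": {"detection": 50, "deployment": 10, "reporting": 15, "support": 25},
}

# Catch arithmetic slips before averaging.
for person, points in allocations.items():
    assert sum(points.values()) == 100, f"{person} did not allocate exactly 100"

# Average each criterion across participants to get the committee's weights.
criteria = list(next(iter(allocations.values())))
weights = {c: mean(p[c] for p in allocations.values()) for c in criteria}

print(weights)
```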

I like to keep the cost analysis separate from the features.  It is possible to turn the price proposal into another row among the criteria, but no one really thinks this way.  What you’re shooting for is, “this one scored 84 out of a hundred, and it’s $100,000 more than the one that scored 74,” with traceability back to the features that account for the difference.

Grade the Vendor Responses

Maybe you’ve sent an RFP and are now grading the proposals, or maybe you’re doing your own research. Using an RFP is handy because you can include the criteria and let the vendors tell you how they propose to meet them. In either case, you (and the committee) are responsible for assigning a number to indicate how well the product meets each criterion.

Here again, the 1 to 5 scale is popular and easy to use.  Obviously, grades supported by numbers are best.  For gas mileage, you can assign 1, 2, 3, 4, and 5 to specific ranges of MPG.  Something like “vendor support” can be tied to a service-level agreement in hours or minutes.
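
As a sketch of tying grades to numbers, the function below maps gas mileage to a 1-to-5 grade. The cutoffs are hypothetical; the committee would agree on its own ranges before looking at any vendor.

```python
from bisect import bisect_right

# Hypothetical MPG cutoffs (illustrative only):
#   under 20 -> 1, 20 up to 30 -> 2, 30 up to 40 -> 3, 40 up to 50 -> 4, 50+ -> 5
MPG_CUTOFFS = [20, 30, 40, 50]

def grade_mpg(mpg: float) -> int:
    """Map a gas-mileage figure to a 1-to-5 grade using fixed ranges."""
    # bisect_right counts how many cutoffs the value has reached or passed.
    return bisect_right(MPG_CUTOFFS, mpg) + 1

for mpg in (18, 20, 35, 52):
    print(mpg, grade_mpg(mpg))
```

The same pattern works for a support SLA, with the cutoffs in hours and the grades reversed so that faster response earns the higher grade.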

Compute the Final Score

This is called weighted-factor analysis because each product is scored according to its criteria grades, and the criteria have different weights.  It’s just like computing a weighted average.  Since we’ve normed the weights to 100 and we’re using a 5-point grading scale, we divide the totals by five to produce a score out of 100.  You can present this as a percentage if you want.

In our carefully contrived example, vendor #3 comes out on top even though they had the lowest raw score, because they scored well on the criteria that mattered most.
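
The whole calculation fits in a short Python sketch. The criteria, weights, and grades here are invented for illustration and are not the values from the chart, but they are contrived the same way: the third vendor has the lowest raw total and still wins.

```python
# Hypothetical weights, already normed to total 100.
weights = {"detection": 40, "deployment": 30, "reporting": 15, "support": 15}

# Hypothetical committee grades on the 1-to-5 scale.
grades = {
    "Vendor 1": {"detection": 3, "deployment": 3, "reporting": 5, "support": 5},
    "Vendor 2": {"detection": 4, "deployment": 2, "reporting": 5, "support": 5},
    "Vendor 3": {"detection": 5, "deployment": 5, "reporting": 2, "support": 2},
}

def weighted_score(vendor_grades, weights, scale=5):
    """Sum of weight * grade, divided by the grading scale, giving a score out of 100."""
    return sum(weights[c] * vendor_grades[c] for c in weights) / scale

raw = {vendor: sum(g.values()) for vendor, g in grades.items()}
scores = {vendor: weighted_score(g, weights) for vendor, g in grades.items()}
winner = max(scores, key=scores.get)

for vendor in grades:
    # Whole numbers only: the inputs are subjective, so decimals add nothing.
    print(f"{vendor}: raw {raw[vendor]}, weighted {scores[vendor]:.0f}")
print("Winner:", winner)
```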

When data scientists say that “our precision exceeds our accuracy,” this is what they mean. Do not take this fundamentally subjective numerical score out to two decimal places.  The point of this procedure is not so much to generate a number, but to make the variables explicit.

The idea is that the sum of many small decisions will be more accurate than one big one, particularly if there is consensus among the participants. Everyone on the committee should be able to say why the chosen product scored ten points better than the runner-up.

Also, to be a little bit pragmatic: now everyone has their fingerprints on the decision.  No one can complain that they weren’t consulted, or question how the decision was made.

Funny aside:  One of my first consulting projects was the selection of a networking vendor for Ford Credit. We did the full procedure: interviews, requirements, criteria, an RFP, a selection committee, bidder conferences, sealed bids, etc. Digital Equipment (DEC) won. Remember them? And then some big shot from the Glass House swooped in and gave the contract to IBM. What about our fancy RFP project? Well, it was “defective” because it failed to produce IBM as the winner. There was a saying in those days, “no one gets fired for buying IBM.” It was seen as the safe choice – and the only choice for risk-averse executives.

Inventory Management for Powersports

One of the things I enjoyed about my sojourn in powersports was comparing practices with automotive retail.  The most intriguing finding is a convergence of facts suggesting that inventory management should be centralized:

  • Multiple new franchises in the same rooftop.
  • Many units arrive in a crate and require assembly.
  • Limited space, in the showroom and in the shop.
  • Inconsistent VIN decoding.

I’ll explain each of these, showing how powersports is different from automotive (and more like Dick’s) when it comes to inventory management. Then, I’ll briefly describe how such a setup might operate.

Powersports is Different from Automotive

Unlike an auto group, where stores are segregated by their OEM franchise, stores in a powersports group have much the same make/model mix. My local dealer sells Kawasaki, Polaris, Can-Am, and Yamaha – as they all do.

This means it’s possible to consolidate intake for the group, and allocate distribution based on real-time results. My favorite allocation model is not, “you’ve sold all your Razors and Mavericks, so you get more.” Stores must equally share the slow movers, too, so a bundled restock model is better.

Powersports stores are often small and crowded. They’re much more exciting than car stores, bursting with vehicles and accessories, with the colors and signage of multiple manufacturers. Keeping extra inventory offsite, in cheaper space, makes good sense.

Service is also space constrained, which means that building new units must often compete for bays (and techs) with repair and maintenance for customer vehicles. New build can be delegated to the warehouse, along with recon and custom build. Centralizing this work allows more efficient scheduling.

Centralizing intake also means cleaner model data for planning and analysis. In automotive, we take for granted that we always know the model and trim. In powersports, not so much. If you have multiple people receiving inventory in multiple stores, there can be a lot of variability. 

Distribution Center Operation

Let’s follow some inventory through the distro, and highlight why this is a good idea. We start with new unit intake, where we have a central point to reconcile orders, schedule new unit build, and deal with freight damage. It’s also physically easier to handle freight trucks at a warehouse.

This is also the central point for entering units into your store-level DMS. If you’re running an enterprise inventory system, which I recommend, enter units there in parallel. Or, better yet, enter them there first and push to the DMS. The inventory system can prioritize build requests, track which stores are getting which units, and notify them.

Over in the shop, we are of course building new units but also accessorizing and building customized units, which may be from new or used inventory. There is good margin to be made here. The shop also centralizes recon work for trade units, which are backhauled from the delivery runs.

This is a control point for deciding whether the recon pencils or the unit goes to auction. Here again, the inventory system, operating “above” the store-level DMS, helps route trade units back to their stores. It should interface with your logistics system.

An Expensive Proposition

So, it’s a good idea. On the other hand, let’s be honest about the costs:

  • Cost of renting and operating the warehouse.
  • Cost of running the logistics operation.
  • Opportunity cost of inventory sitting in the warehouse.

The cost-benefit analysis comes down to how many stores are in the group, and how close they are to the warehouse. The rent can be offset against the cost of floorspace in a retail zone. Ditto for the opportunity cost, if this is inventory you were going to keep in the stores anyway. Also, we assume that the group is doing some kind of centralized order consolidation.

As for logistics, I’ve had good luck with Samsara. You probably have some trucks operating already, picking up used units, service units, or just redistributing inventory. Hell, if you want to go full digital retail, you can offer home delivery out of the warehouse – although this is not recommended. That’s another subtle way powersports is different. 

I Don’t Care about UI/UX

Readers with up-to-date Twitter skills will recognize the classic Willem Dafoe meme.  I have been doing web apps for a long time, and it seems everybody is an expert in UI and UX – both!  They have, as Jamie puts it, a “flair” for web design.  Genius-level stuff, like green CTA buttons because green means go.

What will it look like? I don’t care.

On a recent gig, the first thing I did was replace hi-fi mockups with Balsamiq.  The product team had been killing themselves to do mocks in Figma, and failing, and then wrangling with the UI developers well into each sprint.  There’s a good reason why Balsamiq uses scratchy lines and a comic font, which the crew understood instantly.

Is this what it’s going to look like?  No!  What will it look like?  I don’t care!  Well, I do care, but I studiously avoid having an opinion about web design because I respect the professional competence of my UI/UX team.  I have a habit of saying “I don’t care,” when what I really mean is: I don’t want to interfere in a decision better made by actual experts.

We know exactly what the page will do, from business analysis and functional design.  We also know roughly what it will look like, from the style guide.  But, what will it look like, exactly?  I am content to wait and see, and BTW we’re agile and we’re A/B testing – so it will change, anyway.

Brah, where’s your queuing service?

Everybody thinks they’re an expert because UI/UX is the presentation layer.  It’s (seemingly) just visual.  People think, hey, my socks match my pants, so I can play too.  Oddly, no one ever offers advice on how to do the data layer, or what message bus to use.