5 Key Predictive Modeling Techniques: The Building Blocks of Strategic Analytics
Reading time: 5 minutes
“Data”, as we know, can describe an endless variety of things. It can describe text messages with your closest group of friends, the compositional breakdown of wine, or just about any other set of collected information. However, data analysis techniques don’t tend to vary as widely as the data itself. If you work in data analysis long enough, you may find yourself occasionally on autopilot. That’s true for predictive modeling, too.
It’s no wonder that an “autopilot” feeling sometimes creeps in: using the same modeling algorithm, you can build a model to predict whether a fundraising prospect will donate to your organization, whether a customer will stay, or whether a student admitted to a college will enroll. While the tools you use often remain the same across multiple applications of your model, the strategic predictive modeling techniques you use to incorporate that model into your work often differ greatly.
The most significant strategic changes from project to project relate to the basic “building blocks” of predictive modeling. Evaluating these building blocks is a critical first step towards successful modeling:
- Make sure your variables have reasonable impacts on the outcome
- Focus on the simpler model diagnostics
- Interpret your model based on what you need it to tell you
- Look for different indicators based on your audience
- Validate the model in your own business terms
Make sure your variables have reasonable impacts on the outcome
When your modeling process yields an equation, you should always double-check that what it discovered in the data makes sense. I don’t mean that you should ask if a coefficient of 0.98332 feels correct versus a coefficient of 0.87418 though. I mean that if the model determined that having a conversation with a fundraiser decreased someone’s likelihood to donate, or that getting straight A’s decreased a student’s likelihood to successfully complete their degree on time, you should investigate.
Discovering unexpected relationships between your data’s characteristics and your desired outcome is common. But your subject matter expertise is a critical fail-safe. Always investigate when the model surfaces something unexpected.
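One lightweight way to make this review systematic is to compare the sign of each fitted coefficient against what domain expertise says it should be. The sketch below does that for a hypothetical donor-likelihood model; the variable names and coefficient values are illustrative, not from any real model.

```python
# Hypothetical coefficients as they might come out of a fitted
# donor-likelihood model (values are illustrative).
coefficients = {
    "met_with_fundraiser": -0.42,   # suspicious: meetings should help
    "prior_gift_count": 0.98,
    "years_since_last_gift": -0.31,
}

# Domain expectations: +1 means "should increase likelihood", -1 the opposite.
expected_sign = {
    "met_with_fundraiser": +1,
    "prior_gift_count": +1,
    "years_since_last_gift": -1,
}

def flag_surprises(coefs, expectations):
    """Return variables whose fitted sign contradicts domain expectations."""
    surprises = []
    for name, coef in coefs.items():
        expected = expectations.get(name)
        if expected is not None and coef * expected < 0:
            surprises.append(name)
    return surprises

print(flag_surprises(coefficients, expected_sign))  # ['met_with_fundraiser']
```

Anything this check flags is exactly the kind of counterintuitive finding worth investigating before you trust the model.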
Focus on the simpler model diagnostics
The data science world is filled with incredibly dedicated professionals. There’s no shortage of recommendations on which model diagnostics are the most authoritative. There’s a real virtue to finding some indicators that help you and having the confidence to stick with those. Even if you are aware of some test, metric, or data preparation that you could be doing, be confident in designing your project to the level of complexity that you need.
Some models are meant to lend insight into the behaviors in your data. In those cases, you might not need to iron out every wrinkle in your data set after you’ve gained enough information to inform your end-users’ next steps. Even when models have higher stakes, like annual budget or sales projections, you don’t always need to “boil the ocean” by normalizing every last variable or investigating the heteroscedasticity of your residuals!
It’s cliché, but… “Don’t let the perfect be the enemy of the good.” There is almost always another way to improve or measure the model, but you only need to go as far as your use case requires.
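In practice, “simpler diagnostics” can mean computing one or two plain-language measures yourself and stopping there. A minimal sketch, using illustrative numbers, might look like this:

```python
# A "good enough" diagnostic pass: R-squared and mean absolute error,
# computed directly rather than via an exhaustive diagnostic suite.
# The numbers below are illustrative.

actual    = [120, 135, 150, 160, 180]   # e.g., monthly sales
predicted = [118, 140, 149, 155, 185]

n = len(actual)
mean_actual = sum(actual) / n

ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
ss_tot = sum((a - mean_actual) ** 2 for a in actual)
r_squared = 1 - ss_res / ss_tot

mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n

print(f"R^2 = {r_squared:.3f}, MAE = {mae:.1f}")
```

If those two numbers answer your end-users’ question, the heteroscedasticity investigation can wait.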
Interpret your model based on what you need it to tell you
(Note: this is not the same as “Interpret the results as being what you need them to be”!)
Don’t get caught up in the misconception that you need to review every facet of what a model communicates. Keep in mind what you need to learn more about – the reason you built the model in the first place – and focus your interpretation on that topic.
Sometimes predictive modeling takes longer than it needs to because we overthink the results. Do you only need to know about the correlation between your key characteristics? Then there may be no need to complete the entire modeling process. Treat predictive modeling as an à la carte exercise, and you will find yourself far more likely to use it in your analysis efforts.
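For instance, when the question is simply “are these two characteristics related?”, a quick correlation check may be all the modeling you need. The data below is illustrative:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# e.g., email opens vs. gifts per prospect (illustrative values)
opens = [0, 1, 2, 3, 4, 5]
gifts = [0, 0, 1, 1, 2, 3]

print(f"r = {pearson_r(opens, gifts):.2f}")
```

A strong correlation here might fully answer the stakeholder’s question, with no full model build required.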
Look for different indicators based on your audience
This predictive modeling technique has to do with communicating the results of your model to the people who put those results to use: the stakeholders and end-users who benefit from your modeling efforts. You may have always turned to the c-statistic for your own sense of goodness of fit. But some audiences will not have your background. It can be hard to convey the significance of an R² value, a concordance percentage, a c-statistic, or any of the other goodness-of-fit measurements in a modeling process.
Sidestep that difficulty while simultaneously improving trust by identifying an indicator of the model that will resonate with your audience. This requires some creativity, but there are many fun ways to make this happen. Perhaps, for instance, you could apply your model to a holdout sample, then report how closely you’ve approximated your outcome. If you were predicting the likelihood of a web visitor making a purchase, that could look like “we predicted that 19.4% of all site visits would translate to a sale, and the sample demonstrated a 20% purchase rate”.
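That holdout comparison is simple to compute: score the holdout sample, then put the average predicted rate next to the observed rate. The probabilities and outcomes below are illustrative, chosen to mirror the 19.4%-versus-20% example above.

```python
# Model-assigned purchase probabilities for a holdout sample of site visits
# (illustrative values, not from a real model).
predicted_probs = [0.10, 0.25, 0.05, 0.40, 0.17]
# What actually happened (1 = purchase, 0 = no purchase)
actual_outcomes = [0, 0, 0, 1, 0]

predicted_rate = sum(predicted_probs) / len(predicted_probs)
observed_rate = sum(actual_outcomes) / len(actual_outcomes)

print(f"Predicted purchase rate: {predicted_rate:.1%}")
print(f"Observed purchase rate:  {observed_rate:.1%}")
```

Two percentages side by side need no statistical background to interpret, which is exactly the point.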
By communicating in terms of the practical outcome, you’re swapping out technical model metrics with an explanation that resonates with your audience. This would speak to stakeholders who have bottom-line concerns much more directly than a c-statistic would.
Always think about who you are communicating with. How can you help them better understand the model?
Validate the model in your own business terms
This brings things full circle. The same algorithm can (and should!) be used for dozens of projects. That means the algorithm readouts will be generic… until you make them unique. Rather than describing an R² as “the percentage of variability the model explains,” describe it as “the variability in auction price that is explained by the car’s age, manufacturer, and condition” – if you’re modeling the auction price of cars, that is. Instead of reporting that the model predicted the outcome “with less than 2% margin of error,” report that “the model estimated new customer accounts to within 10 of the actual count, less than 2% off.”
The more relevant your description of results, the more directly it will inspire action.
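You can even bake this translation into your reporting code. Here is a minimal sketch of a helper that restates a generic R² in a project’s own business terms; the function name, outcome, and driver names are all hypothetical.

```python
def describe_r_squared(r_squared, outcome, drivers):
    """Restate a generic R-squared in the project's business language.

    Illustrative helper: 'outcome' and 'drivers' come from the project,
    not from the modeling library.
    """
    return (f"{r_squared:.0%} of the variability in {outcome} "
            f"is explained by {', '.join(drivers)}.")

print(describe_r_squared(
    0.81,
    "auction price",
    ["the car's age", "manufacturer", "condition"],
))
```

The same helper can serve every project; only the business vocabulary you feed it changes.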
Takeaways: Why Implement these Strategic Predictive Modeling Techniques?
My position involves supporting a lot of customers and teams who I don’t know well (at first, anyway). That means when I discuss a model with them, I can’t rely on their background knowledge of my technical abilities. The building blocks I spell out above are what help me to quickly and compellingly communicate models. In summary:
1. By encouraging users to consider the practical significance of each variable, we can merge their contextual knowledge with the predictive model’s results in no time at all.
2. By using simple diagnostics as guideposts for the model, we focus on substantive improvements, not esoteric discussions and debates over which metric should be trusted more than the others.
3. I naturally need to orient myself with the end-users’ needs to assist them in building meaningful models. By focusing on their business priorities, the model-building discussion is always geared towards what they need to learn about. Time spent is always relevant to business needs.
4. In many cases, my work with the model-builders morphs into meetings with other stakeholders at their organization. I routinely find that they resonate with different levels of detail when we review the model. This keeps conversations fruitful regardless of the group’s statistical expertise.
5. Finally, when you follow these steps, every single person at the organization should be able to understand the model’s importance within their own perspective of the organization. This is no accident. When you build, review, and validate the model in terms that relate to the business’s needs, you will arrive at a tailored end-result. This improves your ability to implement your model moving forward.
Have any foundational strategic predictive modeling techniques of your own to share? Add them in the comments below!