Valuing Analytics & Predictive Modeling in Higher Ed

As promised, here is part two of my interview with Mike Laracy, Founder, President, and CEO of Rapid Insight. Mike’s 20+ years of data analytics & predictive modeling experience have provided him with many insights. Here’s Mike on becoming more data-driven in higher education, which models produce the highest ROI, and mistakes to avoid:

Where does predictive modeling fit into the analytic ecosystem in higher education?

Within the analytic ecosystem in higher ed, data is analyzed in a range of ways. On one side, you have historical reporting, which our clients do a lot of and which is vital to every institution. Somewhere in the middle is data exploration and analysis, where you’re slicing and dicing data to understand it better or to make more informed decisions based on what happened in the past. On the other side of the spectrum is predictive modeling. Modeling requires looking at all of the variables in a given set of information to make informed predictions about what will happen in the future. What is each applicant’s probability of enrolling, or what is each student’s attrition likelihood? What will the incoming class look like based on the current admit pool? These are the types of questions that are being answered in higher ed with predictive analytics. The resulting probabilities can also be used in the aggregate. For example, enrollment models allow you to predict overall enrollment, enrollment by gender, by program, or by any other factor. The models are also used to project financial outlay based on the financial aid promised to admitted applicants and their individual enrollment probabilities.
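To make the aggregate use of probabilities concrete, here is a minimal sketch in Python. It assumes each admitted applicant already has a model-estimated enrollment probability. The admit pool, the dollar amounts, and the function names are all made up for illustration and are not taken from Rapid Insight's software. The idea is simply that summing individual probabilities gives an expected headcount, and weighting each aid offer by its probability gives an expected outlay.

```python
# Hypothetical admit pool: (enrollment probability, promised aid in dollars, program).
# All values are invented for illustration.
admits = [
    (0.80, 12000, "Nursing"),
    (0.25, 20000, "Biology"),
    (0.50, 8000, "Nursing"),
    (0.10, 15000, "Biology"),
]


def expected_enrollment(pool):
    """Expected number of enrollees: the sum of individual probabilities."""
    return sum(prob for prob, _aid, _program in pool)


def expected_aid_outlay(pool):
    """Expected aid spend: each offer weighted by its enrollment probability."""
    return sum(prob * aid for prob, aid, _program in pool)


def expected_enrollment_by_program(pool):
    """Expected enrollees per program (any other grouping factor works the same way)."""
    totals = {}
    for prob, _aid, program in pool:
        totals[program] = totals.get(program, 0.0) + prob
    return totals


if __name__ == "__main__":
    print(round(expected_enrollment(admits), 2))
    print(round(expected_aid_outlay(admits), 2))
    print(expected_enrollment_by_program(admits))
```

These are expected values only. A real projection would also need some measure of the uncertainty around them, which this sketch leaves out.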

Higher education has come a long way in the last five to ten years in its use of predictive analytics. The entire student life cycle is now being modeled, starting with prospect and inquiry modeling and running all the way through to alumni donor modeling. It used to be that institutions doing this kind of modeling relied on outside consulting companies. Today most are doing their modeling in-house. Colleges and universities view their data as a strategic asset, and they are extracting value from it with the same tools and methodologies as the Fortune 500 companies.

What kinds of resources are needed, and what is the first step for an institution that wants to become more data-driven in its decision making?

It’s important to have somebody who knows the data. As long as a user has an understanding of their data, our software makes it very easy to analyze data and build predictive models very quickly. And our support team is available to answer any analytic questions.

Gaining access to the data is the first step. We see a lot of institutions whose reporting tools don’t allow them to ask new questions of the data. They might have a set of 50 reports that they’re able to run over and over, but anytime someone has a new question, there’s no way to answer it without access to the raw data.

It really helps if the institution is committed to a culture of data-driven decision making. Then all the various stakeholders are more focused on ensuring data access for those doing the predictive modeling.

What do you say to those who are on “the quest for perfect data”?  Is it okay to implement predictive analytics before you have that data warehouse or those perfectly cleansed datasets?

No institution is ever going to have perfect data, so you work with what you have. We suggest seeing what you have, finding any obvious problems in the data, and then fixing those problems the best you can. We’ve designed our solutions such that a data warehouse is not required but, even with a clean data warehouse, the data is never going to be perfect. As long as you have an understanding of the data, you can move forward.

In your experience, which models in higher education produce the highest ROI?

We have a customer, Paul Smith’s College, that has quantified their retention modeling efforts. Using their model results, they put programs into place to help those students who were predicted to be at high risk of attrition. They credit the modeling with helping them identify which students to focus on, saving them $3 million in net tuition revenue so far.

We have other clients that are using predictive modeling on the prospect side to cut the cost of their recruiting efforts. Instead of mailing to 200,000 high school seniors, they’re mailing to 50,000, and realizing significant savings by not mailing and not calling those students who have pretty much zero probability of applying or enrolling.
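The targeting step described here can be sketched in a few lines of Python. This is an illustration only, under the assumption that every prospect already has a model-estimated probability of applying. The prospect IDs, the scores, and the function name `select_mailing_list` are hypothetical.

```python
def select_mailing_list(scores, budget):
    """Return the IDs of the `budget` prospects with the highest predicted probability.

    `scores` maps a prospect ID to a model-estimated probability of applying.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [prospect_id for prospect_id, _prob in ranked[:budget]]


if __name__ == "__main__":
    # Hypothetical scores for a tiny prospect pool.
    prospect_scores = {"P001": 0.02, "P002": 0.41, "P003": 0.00, "P004": 0.17}
    print(select_mailing_list(prospect_scores, budget=2))
```

In practice the cutoff could just as well be a probability threshold as a fixed count. The fixed count mirrors the 200,000-to-50,000 example above.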

Although not as easily quantifiable, enrollment modeling has a pretty big ROI, not only in determining which applicants are likely to enroll, but in predicting class size. If an institution overshoots and enrolls too many applicants, they’ll have dorm, classroom, and other resource issues. If they enroll too few, they’ll have revenue issues. So predicting class size and determining which applicants, and how many, to admit is extremely important.

What are some common mistakes you see when approaching predictive modeling for your higher ed customers?

One mistake that I often see is when information is thrown out as not useful to the models. Zip code is a good example. Zip code looks like a five-digit numeric variable, but you wouldn’t want to use it as a numeric variable in a model. In some cases it can be used categorically to help identify applicants’ origins, but its most useful purpose is for calculating a distance-from-campus variable. This is a variable that we see showing up as a predictor in many prospect/inquiry models, enrollment models, alumni models, and even retention models. Another example of a variable that is often overlooked is application date. Application date often contains a ton of useful information if looked at correctly. It can be used to calculate the number of days between when the application was sent and the application deadline. This piece of information can tell you a lot about an applicant’s intentions. A student who gets their application in the day before the deadline probably has very different intentions than a student who applies nine months before the deadline. This variable ends up participating in many models.
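The two derived variables mentioned above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not Rapid Insight's implementation. The zip-to-coordinates lookup, the campus location, and the function names are hypothetical, and the coordinates are approximate placeholders. Distance is computed with the standard haversine great-circle formula, which is one reasonable choice among several.

```python
import math
from datetime import date

# Hypothetical lookup of zip code -> approximate (latitude, longitude) centroid.
# Zip codes are kept as strings so leading zeros survive and nobody treats them as numbers.
ZIP_CENTROIDS = {
    "02134": (42.36, -71.13),
    "10001": (40.75, -73.99),
}

# Hypothetical campus location (latitude, longitude).
CAMPUS = (44.43, -74.25)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    earth_radius_miles = 3958.8
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_miles * math.asin(math.sqrt(a))


def distance_from_campus(zip_code):
    """Distance-from-campus variable; None when the zip code is not in the lookup."""
    centroid = ZIP_CENTROIDS.get(zip_code)
    if centroid is None:
        return None
    return haversine_miles(centroid[0], centroid[1], CAMPUS[0], CAMPUS[1])


def days_before_deadline(application_date, deadline):
    """Days between submission and the deadline (negative if submitted late)."""
    return (deadline - application_date).days


if __name__ == "__main__":
    print(distance_from_campus("02134"))
    print(days_before_deadline(date(2024, 1, 14), date(2024, 1, 15)))
```

Both functions return plain numbers, so the results can be added to a modeling dataset as ordinary numeric columns, unlike the raw zip code or date they were derived from.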

To get our customers up to speed on best practices in predictive modeling, we’ve created resources like lists of recommended variables for specific models and guides on how to create useful new variables from existing data.

Decentralize analytics.
Harness the power of many.

Create and share reports and datasets across the enterprise, and put analytical power in the hands of everyone. Veera creates a truly data-driven culture. Try it for yourself today.


