Data-Driven Road Trip Stop 4: Modeling the Ideal Visit to Moab

I spent some of my formative years growing up in Moab, Utah. When I heard Lily would be visiting Arches National Park (two miles from Moab) on her road trip, it got me thinking about how the area might have changed since I lived there as a child.

I’d love to check up on it, but thanks to COVID-19, I’m betting the earliest I’ll have a chance to return is sometime in 2022. And everyone will likely be itching to travel after the pandemic becomes less of a factor in daily life. So maybe I’ll hold off until 2023 in hopes that things will have calmed down by then.

With all this time to think about it, I figured I’d put predictive modeling to work and estimate the ideal time to visit Arches in 2023. I have two years to get ready; why not plan ahead?

What does “ideal” look like for me?

I’d like to visit at a time when the crowds are as small as possible and when the temperature is moderately warm. I’d set that range at 70 to 80 degrees Fahrenheit.

This means I need two sets of data to predict when the ideal time to visit might be: visitor entry numbers for Arches and temperature logs for the Moab area.

Lily tracked down and shared a dataset that listed the number of vehicles permitted into Arches National Park every month between 1992 and 2020. Thankfully, the data was well-structured and clean, so it didn’t require additional prep.

The most noteworthy thing in the dataset is that several months in 2020 were extreme outliers. For example, in April 2020, there were ZERO visitors, the only month in the entire 29 years’ worth of data without a single visitor. The reason? The park was closed due to COVID-19. The month of May also saw significantly fewer visitors than average for the same reason. I noted that we might need to exclude this year from the model to get an accurate result.

Next, I downloaded historical data from the NOAA for the same time range: 1992 through 2020.

The data was fairly complete but had some null values in the Temperature at the Time of Observation (TOBS) column.

It was also stored in a different layout than the traffic data and showed daily information rather than monthly. As such, it needed some prep before we could build a model with it.

Prepping the Data

Since I don’t have the expert data background that our analyst support staff does, I asked Lily for a helping hand getting the data ready. She built a quick Construct job to convert the temperature data to monthly averages, transpose it to match the traffic data’s format, and merge the two datasets. The process produced a single clean dataset for predictive modeling.
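For readers who prefer to see the prep steps as code, here is a minimal Python sketch of the same idea: roll daily temperature readings up to monthly averages (skipping nulls), then join them to the monthly vehicle counts. It is not Lily's Construct job. The function names, field names, and sample values are all made up for illustration, and the reshaping step is skipped by assuming both inputs are already keyed by year and month.

```python
from collections import defaultdict
from statistics import mean


def monthly_averages(daily_rows):
    """Average daily readings by (year, month), skipping null values.

    daily_rows: iterable of ("YYYY-MM-DD", temperature or None) pairs.
    """
    buckets = defaultdict(list)
    for date_str, temp in daily_rows:
        if temp is None:  # null reading, e.g. a missing TOBS value
            continue
        year, month, _day = date_str.split("-")
        buckets[(int(year), int(month))].append(temp)
    return {key: mean(values) for key, values in buckets.items()}


def merge_monthly(traffic, temps):
    """Inner-join two dicts keyed by (year, month) into a list of rows."""
    shared_keys = sorted(traffic.keys() & temps.keys())
    return [
        {
            "year": year,
            "month": month,
            "vehicles": traffic[(year, month)],
            "avg_temp_f": temps[(year, month)],
        }
        for (year, month) in shared_keys
    ]


# Hypothetical sample data, not taken from the real datasets.
daily = [
    ("2019-04-01", 68.0),
    ("2019-04-02", 72.0),
    ("2019-04-03", None),
    ("2019-05-01", 80.0),
]
traffic = {(2019, 4): 60000, (2019, 5): 70000, (2019, 6): 75000}

merged = merge_monthly(traffic, monthly_averages(daily))
for row in merged:
    print(row)
```

June drops out of the merged result because the sample has no temperature readings for it. An inner join like this keeps only the months present in both sources.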

One of the great features of Construct is that it’s incredibly easy to follow along with someone else’s process. Thanks to the visual workflow, I understood exactly what Lily’s Construct job accomplished as she walked me through it. We then created a model in Predict with the temperature and traffic data, and Lily explained that the decile analysis indicated that the model was strong. Interestingly, including or excluding 2020, the year with outlier data due to the park’s COVID closure, didn’t impact the model’s results.
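If decile analysis is new to you, the general idea is to rank records by their predicted value, split them into ten equal-sized groups, and compare the average prediction with the average actual value in each group. The Python sketch below shows that general technique only. How Predict computes and presents its decile analysis is not covered here, and the function name and sample numbers are hypothetical.

```python
def decile_table(predicted, actual):
    """Return (decile, mean predicted, mean actual) rows, lowest decile first."""
    pairs = sorted(zip(predicted, actual), key=lambda pair: pair[0])
    n = len(pairs)
    table = []
    for d in range(10):
        # Integer arithmetic keeps the ten groups as even as possible.
        chunk = pairs[d * n // 10:(d + 1) * n // 10]
        if not chunk:  # fewer than ten records can leave a group empty
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        mean_actual = sum(a for _, a in chunk) / len(chunk)
        table.append((d + 1, mean_pred, mean_actual))
    return table


# Hypothetical values: every actual sits exactly one unit above its prediction.
predicted = [float(i) for i in range(20)]
actual = [p + 1.0 for p in predicted]

for decile, mean_pred, mean_actual in decile_table(predicted, actual):
    print(decile, mean_pred, mean_actual)
```

When the average actual value climbs steadily from the lowest decile to the highest and stays close to the average prediction, that is usually read as a sign the model ranks records well.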

To complete the process, we brought the model back into Construct. After a three-minute demo on the Embed Data node, I was up and running, making my predictions.

So, What’s the Best Month to Visit in 2023?

Based on predicted traffic patterns and the expected average temperature, my ideal month to visit Arches is April 2023. The temperature should be in the 70s, and it will be moderately busy, with roughly 65,000 cars entering the park that month. A few months later, in July, it will be in the high 90s, with nearly 20,000 additional vehicles permitted over the month.
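The selection rule behind that answer is simple enough to sketch in a few lines of Python: keep only the months whose expected temperature falls in my 70 to 80 degree range, then pick the one with the fewest predicted vehicles. The forecast rows below are placeholder values invented for the example, not output from the model, and `ideal_month` is a hypothetical helper name.

```python
def ideal_month(forecast, low_f=70.0, high_f=80.0):
    """Pick the least busy month whose temperature is within [low_f, high_f].

    Returns None when no month falls inside the temperature range.
    """
    candidates = [row for row in forecast if low_f <= row["temp_f"] <= high_f]
    if not candidates:
        return None
    return min(candidates, key=lambda row: row["vehicles"])


# Placeholder forecast rows for illustration only.
forecast = [
    {"month": "January", "temp_f": 44.0, "vehicles": 20000},
    {"month": "April", "temp_f": 74.0, "vehicles": 65000},
    {"month": "July", "temp_f": 98.0, "vehicles": 85000},
    {"month": "October", "temp_f": 76.0, "vehicles": 72000},
]

best = ideal_month(forecast)
print(best["month"])
```

Note that January is the quietest month in this sample, but it is filtered out first because it is too cold. The temperature range acts as a hard constraint and crowd size only breaks ties among the months that pass it.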

Now that I know which month I’ll visit, it’s time to pack! But I think I’ll wait till February 2023 to start working on that.

The model adeptly answered my simple question about when to visit Arches in a couple of years.

But by looking further into the future, this model could be a helpful tool for the National Park Service to use in determining policies regarding admittance to the park.

For example, if the last 30 years’ traffic patterns continue, the model predicts that over 204,000 vehicles will be admitted to the park in August 2050. With that in mind, the NPS can make decisions (grounded in data) about when to start infrastructure projects or implement limits on the number of vehicles admitted each day.
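To give a feel for what "if the patterns continue" means, here is a rough Python sketch that fits a straight line to one month's vehicle counts across several years and extends it to 2050. This is a plain least-squares trend line swapped in for illustration. It is not the model built in Predict, which also used temperature, and the yearly counts are invented, so the result will not match the figure above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented August counts that grow by exactly 2,000 vehicles per year.
years = [2015, 2016, 2017, 2018, 2019]
august_vehicles = [100000, 102000, 104000, 106000, 108000]

slope, intercept = fit_line(years, august_vehicles)
projection_2050 = slope * 2050 + intercept
print(round(projection_2050))
```

A projection this far out assumes nothing changes in the meantime, which is exactly why it is more useful as a planning prompt than as a forecast to take literally.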

Data-Informed Decision Making

Ultimately, Rapid Insight’s tools are here to make data-informed decision-making possible for all organizations. Granted, the example in this post was a simple one. But it took less than an hour to prep data and build a model that could be of real value to decision-makers at the Park Service.

Rapid Insight’s efficiency can scale to much larger projects and broader applications, analyzing and cleansing thousands of rows of data and predicting any organizational outcome you need. That’s one of the beautiful things about the tools; they’re designed to work for you, in whatever way you need them to.