As a consumer of regression analysis, you need to keep several things in mind.
First, don’t tell your data analyst to go out and figure out what is affecting sales. “The way most analyses go haywire is the manager hasn’t narrowed the focus on what he or she is looking for,” says Redman. It’s your job to identify the factors that you suspect are having an impact and ask your analyst to look at those. “If you tell a data scientist to go on a fishing expedition, or to tell you something you don’t know, then you deserve what you get, which is bad analysis,” he says. In other words, don’t ask your analysts to look at every variable they can possibly get their hands on all at once. If you do, you’re likely to find relationships that don’t really exist. It’s the same principle as flipping a coin: do it enough times and you’ll eventually see something that looks interesting, like a bunch of heads all in a row, even though it is nothing more than chance.
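To see why a fishing expedition tends to "find" relationships, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the sales figures are random numbers, the candidate variables are pure noise, and the 0.4 cutoff for calling a correlation "interesting" is arbitrary. The names `pearson` and `count_spurious` are hypothetical helpers, not part of any analysis Redman describes.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(n_candidates, n_obs=24, threshold=0.4, seed=0):
    """Count noise variables that look related to sales purely by chance.

    Sales and every candidate variable are independent random draws,
    so any correlation that clears the threshold is a false lead.
    """
    rng = random.Random(seed)
    sales = [rng.gauss(0, 1) for _ in range(n_obs)]  # 24 made-up monthly figures
    hits = 0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
        if abs(pearson(sales, candidate)) >= threshold:
            hits += 1
    return hits


few = count_spurious(n_candidates=5)
many = count_spurious(n_candidates=500)
print(f"Candidates tested: 5   -> apparent relationships: {few}")
print(f"Candidates tested: 500 -> apparent relationships: {many}")
```

None of the candidate variables has any real connection to sales, yet testing hundreds of them will typically turn up a handful that clear the bar. Starting from a short list of factors you actually suspect keeps the number of chance "discoveries" down.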
Also keep in mind whether or not you can do anything about the independent variable you’re considering. You can’t change how much it rains, so how important is it to understand that? “We can’t do anything about weather or our competitor’s promotion but we can affect our own promotions or add features, for example,” says Redman. Always ask yourself what you will do with the data. What actions will you take? What decisions will you make?
Second, “analyses are very sensitive to bad data,” so be careful about the data you collect and how you collect it, and know whether you can trust it. “All the data doesn’t have to be correct or perfect,” explains Redman, but consider what you will be doing with the analysis. If the decisions you’ll make as a result don’t have a huge impact on your business, then it’s OK if the data is “kind of leaky.” But “if you’re trying to decide whether to build 8 or 10 of something and each one costs $1 million to build, then it’s a bigger deal,” he says. The chart below explains how to think about whether to act on the data.
Redman says that some managers who are new to regression analysis make the mistake of ignoring the error term. This is dangerous because they’re treating the relationship between the variables as more certain than it is. “Oftentimes the results spit out of a computer and managers think, ‘That’s great, let’s use this going forward.’” But remember that the results are always uncertain. As Redman points out, “If the regression explains 90% of the relationship, that’s great. But if it explains 10%, and you act like it’s 90%, that’s not good.” The point of the analysis is to quantify the certainty that something will happen. “It’s not telling you how rain will influence your sales, but it’s telling you the probability that rain may influence your sales.”
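One common way to put a number on how much a regression explains is R-squared, the share of the variation in the outcome that the fitted line accounts for. The sketch below is an illustration only, under stated assumptions: the rain and sales figures are simulated, the "true" relationship (a baseline of 200 plus 5 per inch of rain) is invented, and `fit_line`, `r_squared`, and `simulate` are hypothetical helper names. It fits the same underlying relationship twice, once with a small error term and once with a large one.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variation in ys that the fitted line accounts for."""
    mean_y = sum(ys) / len(ys)
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total


def simulate(noise_sd, n_obs=200, seed=1):
    """Simulate sales driven by rain plus an error term of a chosen size."""
    rng = random.Random(seed)
    rain = [rng.uniform(0, 10) for _ in range(n_obs)]  # made-up inches of rain
    # Invented relationship: baseline of 200, plus 5 per inch, plus random error.
    sales = [200 + 5 * r + rng.gauss(0, noise_sd) for r in rain]
    slope, intercept = fit_line(rain, sales)
    return slope, r_squared(rain, sales, slope, intercept)


tight_slope, tight_r2 = simulate(noise_sd=3)   # small error term
loose_slope, loose_r2 = simulate(noise_sd=40)  # large error term
print(f"Small error term: slope = {tight_slope:.2f}, R-squared = {tight_r2:.2f}")
print(f"Large error term: slope = {loose_slope:.2f}, R-squared = {loose_r2:.2f}")
```

Both fits produce a slope, and a manager who looks only at the slope could treat the two results as equally reliable. The R-squared values show how different they are: in the noisy case, most of what drives sales is something other than rain.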
The last mistake that Redman warns against is letting data replace your intuition.
“You always have to lay your intuition on top of the data,” he explains. Ask yourself whether the results fit with your understanding of the situation. And if you see something that doesn’t make sense, ask whether the data was right or whether there is indeed a large error term. Redman suggests you look to more experienced managers or other analyses if you’re getting something that doesn’t make sense. And, he says, never forget to look beyond the numbers to what’s happening outside your office: “You need to pair any analysis with study of real world. The best scientists — and managers — look at both.”