You are currently viewing I Created This Step-By-Step Guide to Using Regression Analysis to Forecast Sales

I Created This Step-By-Step Guide to Using Regression Analysis to Forecast Sales

The term “regression analysis” may seem rigid, but it’s not as scary as you’d think. It‘s basically a fancy way of saying that we’re trying to figure out how two things are related. It‘s like playing detective, but instead of solving mysteries, we’re solving puzzles about data.

However, if you’re anything like me and not at all mathematically inclined, the idea of using regression analysis may seem daunting. Thankfully, this piece will give an easy-to-understand breakdown of how to use regression analysis, when to use it, and how it works when it comes to sales. Plus, I’ll guide you through an easy-to-follow example using Google Sheets.

Table of Contents:

  • What is regression analysis?
  • Regression Analysis Examples
  • What is regression analysis used for?
  • What is regression analysis in sales?
  • How To Use Regression Analysis To Forecast Sales
  • Why use regression analysis?
  • Sales Regression Forecasting Tools
  • Regression Analysis Helps You Better Understand Sales Performance

There are multiple types of regression analysis. Here are some short definitions to get you up to speed:

1. Linear Regression

Linear regression is the most basic and common type of regression analysis that uses the following equation: Y = bX + a. On a graph, linear regressions look like what you’re probably already imagining.

Let’s say you’re trying to figure out the correlation between heat and ice cream consumption. On one side, you’d have the temperature (how hot it is), and on the other, you’d have the number of ice cream cones sold.

Linear regression is like drawing a straight line through the points on this graph. This line helps us predict how many ice cream cones you might sell at a certain temperature.

2. Nonlinear Regression

Sometimes, relationships between two variables aren’t a straight line.

As temperature increases, people buy more ice cream. But maybe, when it gets really hot, people start to slow down and buy less ice cream. Nonlinear regression is like drawing a curved line through the points. This helps us account for more complex relationships.

3. Multiple Linear Regression

What if the number of ice cream cones sold also depends on other things, like the day of the week or whether there’s a nearby festival?

Multiple linear regression lets us consider multiple factors at once. It’s like drawing a line in a 3D space or even higher dimensions.

regression-forecasting-2-20241016-6293569

Regression Analysis Examples

Now that I’ve covered the basics of the different types of regression analyses and how they work, here are some real-world ways that you could utilize them.

1. To evaluate the effectiveness of a marketing campaign.

You just joined a marketing team at a B2C company and they’re looking to assess the efficacy of different channels (i.e., TV, online, social media) on sales.

To help them figure this out, you’d complete a multiple linear regression – utilizing sales data and marketing spend for each channel – to analyze the impact of each marketing channel on sales.

In this example, here’s what your regression analysis equation would consist of:

  • Sales is the dependent variable you’d want to predict.
  • a is the intercept, representing the baseline sales level when all marketing spends are zero.
  • b (which would be multiple coefficients – b1, b2, and b3 – here) represents the impact of each marketing channel on sales.

2. To assess risk in lending.

Perhaps you work at a bank and you’re trying to figure out how much interest to charge customers on their loans. You don’t want to charge too much (because you might lose customers) but, at the same time, you don’t want to charge too little and lose money.

To solve for this, you’d use a linear regression analysis to review data on past loans. You’d look at things like the customer’s credit score, income history, and the amount of the loan. Then, you’d see how these factors relate to the likelihood of the customer defaulting on the loan.

Once you’ve got all this information assembled, your linear regression analysis would predict the risk of repayment failure for new loan applicants, helping you set interest rates that are fair to both the bank and the customer.

In this example, here’s what your regression analysis equation would consist of:

  • Repayment probability is the dependent variable you’d want to predict.
  • a is the intercept, representing the baseline default probability when all other factors are zero.
  • b (which would be multiple coefficients – b1, b2, b3 – here) represents the impact of each factor on the repayment probability.

3. To determine a pricing strategy.

Let’s say you work for a tech company. They’re considering launching a new product and want to determine an optimal pricing strategy before they begin to roll it out to customers.

In this case, if you wanted to use a type of regression analysis to determine the success of a marketing campaign, you could potentially lean on a nonlinear regression analysis to model the relationship between price and demand.

You’d use market research data, competitor pricing, and sales data from previous years, for similar products, to help the company identify a price point that maximizes revenue.

In this example, here’s what your regression analysis equation would consist of:

  • Demand is the dependent variable you’d want to predict.
  • a is the intercept, representing baseline demand when the price is zero.
  • b (which would be multiple coefficients – b1, b2 – here) represents the nonlinear relationship between price and demand.

Now, I can understand if these types of explanations don’t paint the prettiest picture. In reality, regression analyses aren’t the most colorful, especially if you don’t already have a grasp of the full regression analysis process.

Instead of trying to jump the gun, in the next section, I’ll walk you through a more simplified explanation of regression analysis, in a more controlled context: Sales forecasting.

Sports coaches use it to analyze player performance and make strategic decisions during games. Economists use it to predict changes in interest rates. Overall, it’s a pretty helpful tool for making informed decisions based on data.

But since regression analysis is most commonly used by sales folks, I’ll explore what regression analysis looks like when used for sales forecasting.

Now, independent and dependent variables are still at play here, but the dependent variable is always the same: sales performance. Whether it’s total revenue or number of deals closed, your dependent variable will always be sales performance. The independent variable is the factor you are controlling to observe how it affects sales performance, like the number of salespeople you have or how much money is spent on advertising.

Sales regression forecasting results help businesses understand how their sales teams are or are not succeeding and what the future could look like based on past sales performance. The results can also be used to predict future sales based on changes that haven’t yet been made, like if hiring more salespeople would increase business revenue.

You’re probably wondering: What do all these words mean… in math terms?

Like I said before, I’m not good at math. ButI did conduct a simple sales regression analysis that is straightforward and didn’t require many calculations on my part.

Let’s go over this example together below.

How To Use Regression Analysis To Forecast Sales

No successful sales forecast can happen without a query in mind. To test this theory out, let’s run one to understand if having salespeople make more sales calls will mean that they close more deals. We’re going to do so by using a linear regression analysis.

To conduct this forecast with me, we’ll need historical data that depicts the number of sales calls made over a certain period. Mathematically, the number of sales calls is the independent variable, or X value, and the dependent variable is the number of deals closed per month, or Y value.

I made up the data set below to represent monthly sales calls, and a corresponding number of deals closed over a two-year period.

regression-forecasting-3-20241016-3644545d

So, the overall regression equation is Y = bX + a, where:

  • X is the independent variable (number of sales calls)
  • Y is the dependent variable (number of deals closed)
  • b is the slope of the line
  • a is the point of interception, or what Y equals when X is zero

Since we’re using Google Sheets, its built-in functions will do the hard stuff for us, so we won’t need to try and calculate the values of these variables. We’ll use the historical data table and select the correct graph to represent our data.

1. Create your chart.

The first step of the process is to highlight the numbers in the X and Y column and navigate to the toolbar, select ‘Insert’, and click ‘Chart’ from the dropdown menu.

regression-forecasting-4-20241016-9739209

2. Turn your chart into a scatter plot.

The default graph that appears isn’t what we need, so I clicked on the Chart editor tool and selected ‘Scatter plot’, as shown in the image below.

regression-forecasting-5-20241016-6774674

3. Customize your scatter plot.

After selecting the scatter plot, I clicked ‘Customize’, ‘Series’, then scrolled down to select the ‘Trendline’ box (shown below).

regression-forecasting-6-20241016-6729922

And after all of these customizations, I get the following scatter plot.

regression-forecasting-7-20241016-2365977

The Sheets tool did the math for me, and the line in the chart is the b variable from the regression equation, or slope, that creates the line of best fit. The blue dots are the y values, or the number of deals closed based on the number of sales calls.

Now, the scatter plot answers our original question: Will having salespeople make more sales calls will close more deals? By looking at the visual representation of our data, the answer is yes, and we know this because the line of best fit trendline is moving upwards, which indicates a positive relationship.

Even though one month can have 20 sales calls and 10 deals, and the next has 10 calls and 40 deals, the statistical analysis of the historical data in the table assumes that, on average, more sales calls means more deals closed.

I’m fine with this data because it tells a story. It means that having salespeople make more calls per-month will increase deal count. However, this scatter plot does not give us the specific forecast numbers that we’ll need to understand our future sales performance. So, let’s use the same example to obtain that more thorough information.

Now, let’s shift gears a bit. Let’s say that we’re now looking to find out how to generate more quarterly revenue, which is directly related to sales activity. We can assume closing more deals means generating more revenue, but you still want the data to prove that having your salespeople make more calls would actually close more deals.

The built-in FORECAST.LINEAR equation in Sheets will help you understand this, based on the historical data in the first table.

4. Create a new table in the same sheet, using the same columns from the original table.

I made the table below within the same sheet to create my forecast breakdown. In my Sheets document, this new table uses the same columns as the first (A, B, and C) and begins in row 26.

I went with 50 because the highest number of sales calls made in any given month from the original data table is 40 and we want to know what happens to deal totals if that number actually increases. I could’ve only used 50, but I increased the number by 10 each month to get an accurate forecast that is based on statistics, not a one-off occurrence.

regression-forecasting-8-20241016-6418415

5. Insert a FORECAST.LINEAR function into your data set.

After creating this chart, I followed this path within the Insert dropdown menu in the Sheets toolbar: Insert -> Function -> Statistical -> FORECAST.LINEAR.

This part gets a little bit technical, but it’s not as complex as it seems. The instruction menu below tells me that I’ll obtain my forecasts by filling in the relevant column numbers for the target number of sales calls.

regression-forecasting-9-20241016-9202126

Here is the breakdown of what the elements of the FORECAST.LINEAR equation mean:

  • x is the value on the x-axis (in the scatter plot) that we want to forecast, which is the target call volume.
  • data_y uses the first and last row number in column C in the original table, 2 and 24.
  • data_x uses the first and last row number in column B in the original table, 2 and 24.
  • data_y goes before data_x because the dependent variable in column C changes because of the number in column B.

This equation, as the FORECAST.LINEAR instructions tell us, will calculate the expected y value (number of deals closed) for a specific x value based on a linear regression of the original data set.

But, there are two ways to fill out this equation. The first option, shown below, is to manually input the x value for the number of target calls and repeat for each row:

=FORECAST.LINEAR(50, C2:C24, B2:B24)

The second option is to use the corresponding cell number for the first x value and drag the equation down to each subsequent cell. This is what the equation would look like if I used the cell number for 50 in the second data table:

=FORECAST.LINEAR(B27, C2:C24, B2:B24)

To reiterate, I used the number 50 because I want to be sure that making more sales calls results in more closed deals and more revenue, not just a random occurrence. This is what the number of deals closed would be, not rounded up to exact decimal points.

.

If you couldn’t already tell, this is the beauty of completing regression analyses. Because we were able to make solid predictions based on pre-existing data, we can now confidently ask our sales team to make ten more calls per month than the previous month, knowing the number of deals closed will increase and, ultimately, help our business to generate more revenue.

While Google Sheets helped me do the math without any further calculations, other tools are available to streamline and simplify this process. I’ll talk about a few of them in the following section.

Sales Regression Forecasting Tools

A critical factor in conducting a successful regression analysis is having data and having enough data. While you can add and just use two numbers, regression requires enough data to determine if there is a significant relationship between your variables. Without enough data points, it will be challenging to run an accurate forecast. If you don’t yet have enough data, it may be best to wait until you have enough.

Once you have the data you need, the list of tools below that can help you through the process of collecting, storing, and exporting your sales data. Check them out:

1. HubSpot’s Forecasting Software

screenshot of hubspot's forecasting software product demo

HubSpot’s newest Forecasting Software helps businesses of all sizes predict future sales and revenue. Thankfully, it uses a variety of data points – including past sales data, pipeline deals, and sales activities – to create accurate forecasts. Plus, with features like deal probability tracking, customizable forecasting models, and real-time reporting, it empowers sales pros to make well-informed decisons and optimize their sales strategies.

These projections can help you come up with further questions to analyze in your regression analysis to understand what is (or isn’t) going wrong.

screenshot of hubspot's forecasting software product demo

What I like about this tool: When I tell you that HubSpot’s Sales Forecasting tool is extremely customizable, I’m not kidding. You can separate data and pipelines based on quarterly results, monthly results, even teams. If you’re anything like me, having as much control over what I’m seeing and how I’m seeing it is a huge plus, especially for sales pros working collaboratively.

2. InsightSquared

screenshot of insight squared uxInsightSquared is a revenue intelligence platform that uses AI to make accurate forecasting predictions.

While it can’t run a regression analysis, it can give you the data you need to conduct the regression on your own. Specifically, it provides data breakdowns of the teams, representatives, and sales activities that are driving the best results. You can use this insight to come up with further questions to ask in your regression analysis to better understand performance.

What I like about this tool: InsightSquared allows its users to track every moment of sales activities. From the initial prospecting stage to securing an actual lead, you’re not only able to see how your sales are maturing but where exactly they’re doing so (which is super important for gathering accurate sales forecasting info).

3. MethodData

screenshot of method data homepage

Since sorting through data is essential for beginning your analysis, MethodData is a valuable tool. The service can create custom sales reports based on the variables you need for your specific regression, and the automated processes save you time. Instead of digging through your data and cleaning it up enough to be usable, it happens automatically once you create your custom reports.

What I like about this tool: As I briefly mentioned above, MethodData is about to learn about your data as you work with it, which is a timesaver. Plus, its ability to do the hard stuff (like creating custom sales reports) makes the difficult aspects of completing multiple regression analyses simple.

Regression Analysis Helps You Better Understand Sales Performance

A regression analysis will give you statistical insight into the factors that influence sales performance.

If you take the time to come up with a viable regression question that focuses on two business-specific variables and use the right data, you’ll be able to accurately forecast expected sales performance and understand what elements of your strategy can remain the same, or what needs to change to meet new business goals.

Editor’s note: This post was originally published in December 2020 and has been updated for comprehensiveness.