# Basics of the Newsvendor Model

*By Kumar Singh, Research Director, SAPinsider*

### The “Optimal” Dilemma

Assume that you have decided to leverage the power of e-commerce to start your own business. Your classmate during your MBA days, who was from China and has relocated back to China, now owns a factory that manufactures floating ducks for swimming pools (the kind that kids like to play with). He is your good friend so he is willing to ship them to you with minimal margin so you decide to start selling them in the US on an e-commerce platform, and make money. Of course, the assumption here is that people simply love these floating ducks. However, your friend has a condition- you have to order the entire quantity for the year in one single order so that he can manufacture them cost-effectively.

Now you have a dilemma. Your perceived market in northeast US and pools close in September in the region. Any floating ducks that you can’t sell will not be reusable next year and hence will be sunk cost. But if you order too conservatively, you may end up with your stock running out and hence lost demand. What is the quantity that you should order? The newsvendor model, a statistical tool, has been leveraged for decades to address this dilemma. Technology advanced today allows us to leverage this model to enhance existing solutions as well as create new ones. In this article, we will understand how the model works, in a simple way.

### Understanding the Newsvendor Model

Newsvendor formula is a pretty powerful formula if used properly. Before you delve deeper into the strategic ways to leverage the newsvendor model, it makes sense to understand the basics of the newsvendor model. Let us assume that I decide to publish a blog daily in print format, in the form of a newsletter, and price the newsletter $1. Now I don’t know exactly how many newsletters I will be able to sell in a day. But I do need to decide how many I want to print. Assuming that I am selling these newsletters with the intention of making money, I want to maximize my revenue and minimize my losses.

Now let us assume that it costs me $0.1 to print one newsletter. So if I sell one (assuming there are no other costs involved), I make $0.9 ($1-$0.1 printing cost). If there are unsold copies though, I am facing a loss at $0.1 (printing cost) for each copy. So I do want to sell all I print to maximize my revenue.

**Too Many or Too Few?**

So I look at my sales figures for the last six months and see that the minimum that I have sold in the last six months is 10/day. So if I print only 10 copies, the chances of me being left with unsold copies are minimal. However, what about the lost sales? I find out after few days that at least 5 individuals are coming to buy the newsletter after I have sold my ten copies. So essentially, if I had five more copies, I could have easily made $0.9 X 5 more. So printing only 10 copies is not maximizing my revenues.

Essentially, what I have is a problem of “Too much or too little”? So the question then is how do I figure out what is the “optimal” quantity I should print each day, given the following data points that I have:

- Historical sales data
- Cost of losing
**a**sale ($0.9) - Cost of not selling/copy ($0.1)

Given the dilemma, I want to use a scientific methodology to determine what is the quantity that I should print that will help me maximize my profits.

**The Newsvendor Formula**

Below is the newsvendor formula. Now let us go through the components of the formula to understand it better.

As stated in my example above, the formula can help me calculate what is the optimal quantity of newsletters that I should print that is “optimal”. The notation F(Q) represents the probability that the demand is less than or equal to Q. So the Q that you determine using this probability value is the optimal Q. This will become more clear once we go through an example.

Now let us go through each of the parameters indicated in the formula:

** = Cost of underage.**

So if we go back to my newsletter example, if after I sell all my newsletter, there is still demand for my newsletters, for each unit of demand, I am losing $0.9 ($1 selling price – $0.01 printing cost). This $0.9 is my cost of underage, revenue lost due to underproducing

=** Cost of overage.**

Going back to the example again, If there are unsold copies though, I am facing loss at $0.1 (printing cost) for each copy. This cost is the cost of overproducing, hence referred to as cost of the overage. The value of F(Q) thus calculated using the formula above is: 0.9. The value of F(Q) calculated above, commonly known by statisticians as **critical ratio,** is basically providing you your ** optimal in-stock rate or the optimal service level, e**xplained in more detail in a subsequent paragraph.

**Optimal Service Level**

**But what does this optimal service level number mean?**

So what does that 0.9 mean? The reason this parameter is denoted by F(**Q**) is that F(Q) is the probability associated with an optimal stocking (in this case printing) quantity Q (and we will discuss how we can calculate that Q from the F(Q) we calculated in a later section). So, in simple terms, we can decipher 0.9 as *If I print Q newsletters every day, there is 90% probability that my daily demand is less than or equal to Q. Hence, Q is the optimal number of newsletters I should print and the corresponding service level that I will provide with that quantity is 90%.*

**Calculating Optimal Print Copies**

So now, we know that the optimal quantity, based on my underage and overage costs, will help me provide a service level of 90% but we still don’t know what that quantity should be.

This brings us to the topic of * demand distribution*. This is a very intensive topic in itself -there are links at the end of this article to resources you can use to gain fundamentals on this topic. What you need is to leverage your historical demand data to build a normal distribution curve of your demand, like the one shown in the illustration below.

Any specific point on the horizontal axis directly associates with an area of the curve (like the area highlighted in the Red in the diagram below). And this area is the probability that we calculated earlier, so when we calculate the probability of 90%, it essentially means that this probability is associated with a point Q, that will cover 90% of the area of the normal distribution curve. Do not confuse Q with quantity.

As you can see from the explanation and illustrations above, for a probability of 90%, the value of Q on the horizontal axis should be such so that it covers 90% of the area. But how do we get to that Q number? Luckily (at least for an average Joe like me), there are tables that can help determine the number of standard deviations associated with a certain probability level. This number is known as z factor in the statistics world.

**The Z Factor**

Z is the number of standard deviations you need to add to average demand to obtain your desired in-stock rate F(Q). Each service level that we have calculated above corresponds to a value of the z factor (number of standard deviations) and can be determined using a reference table that maps z value and service level/probability *for each distribution type. This z value, in turn, will allow us to calculate the quantity from the distribution parameters like mean and standard deviation. *Since the value of z, derived from the value of F(Q), is dependent on the type of distribution, the key to doing this accurately is your historical demand data. Calculating an accurate empirical distribution is very critical for optimal application of the newsvendor model since the properties of that distribution, along with the probability calculated above, will help us determine the optimal number of newsletters I should print.

In my case, I have determined that my newsletter demand is indeed a normal distribution with a daily Mean of 10 newsletters and a standard deviation of 2 newsletters. Now, I used the z value and probability table for normal distribution to get z value for a probability of 0.9.

z = 1.282

**Finally, Calculating Quantity**

The demand distribution that I created in the prior section is a specific type of distribution (Normal Distribution). As mentioned above, each distribution can be defined by certain parameters. For Normal Distribution, the Quantity for a specific z value is defined by the following formula:

Now we can use the above formula to determine the optimal newsletter quantities that I can print. Substituting the values in the formula above:

Q = 10 + 1.3 X 2 ~13 newsletters daily. Viola! I know what is the optimal number of newsletters I need to publish in order to maximize my profit. Hopefully, you can easily see how this approach can be leveraged on the floating ducks business that we introduced at the beginning of the article.

**What does this mean for SAPinsiders?**

While in the example, we used the model to calculate the final order quantity, this approach can be used to determine service levels in more scientific ways as well. While service levels have been traditionally assigned based on inventory categorization and margins, this model, in tandem with additional data points, can help bring more science to the process.

Integrating this into a multi-echelon inventory optimization model can have benefits too. Fortunately, technology today allows us to build inventory optimization models that are more dynamic, and go beyond safety stock. Embedding a newsvendor model in a dynamic, “learning” inventory optimization model can help you strategize more effective replenishment and stocking policies.

You can build a foundational sense of normal distribution from this article:

https://statisticsbyjim.com/basics/normal-distribution/