Hello all,

In the previous article on history cleansing in SAP IBP Demand Planning, we have seen how we do use Substitute Missing Value Algorithm in detail. Well, there are two other algorithms that are also used often, i.e. Outlier Correction Algorithm and Promotion Sales Lift Elimination Algorithm. In this article, we are going to see how do these algorithms function.

We have already seen how we do create a forecast model in SAP IBP Fiori Launchpad using the Manage Forecast Model app/tile. So in this article, we will directly start by choosing an outlier correction algorithm and see what are the details we use and why we do use them. Here in the given image, we have selected outlier correction as our algorithm.

After selecting the outlier correction algorithm, we get the following tab where there are various fields that we are required to fill according to our business requirement. We will see these fields one by one.

There are two methods SAP has provided to find whether the given value is an outlier or not. These two methods are as follow.

An interquartile range test system checks whether the time series values are within the interquartile range, which is the difference between the third quartile and the first quartile of the data. The values that are not within this range are identified as outliers.

In statistics, the quartiles are the three values that divide the data into four equal groups, each group comprising 25% of observations from the data. The quartiles are determined as follows:

- The first quartile consists of the lowest 25% of data (therefore, it’s also called the 25th percentile).
- The second quartile cuts the data into 2 equal halves (therefore, it’s also called the median or the 50th percentile).

The third quartile consists of the highest 25% of data. - The interquartile range (IQR) is the difference between the third and first quartiles. (IQR = Q3 - Q1)

To detect the outliers using the interquartile range method, the system calculates a lower and an upper bound using the first (Q1) and the third (Q3) quartile: - Lower bound = Q1 – Multiplier × IQR
- Upper bound = Q3 + Multiplier × IQR

The values that fall outside of this tolerance lane are considered outliers. The multiplier influences the sensitivity of the outlier detection; using lower values for the multiplier will probably lead to the detection of more outliers in the sales history because it will contract the tolerance lane.

A multiplier is a decimal number by which the system multiplies the range of accepted values i.e. (Interquartile range in case of the interquartile range test method and standard deviation in case of variance test), thus including additional values in the range or excluding a set of values from it. The most commonly used multipliers are 1.5 and 3.

When we use 1.5 as a multiplier in our model then it forms an inner fence i.e. shorter tolerance lane and more outliers are detected which brings false results. To exclude the false outliers, we need to manually review the detections and for doing so there should not be any forecasting steps or post-processing steps involved in the model. Therefore, it is recommended that we should use 3 as a multiplier. By using 3 as a multiplier we get the outer fence i.e. wider tolerance lane. So only strong outliers are detected which are mostly correct so we don’t need to review them manually

In this variance test, the system checks whether the historical values deviate from the mean of historical data by the standard deviation multiplied by the constant i.e. multiplier.

To detect the outliers using the variance test method, the system calculates a lower and an upper bound using the mean and the standard deviation (SD) of the historical data:

Lower bound = Mean – Multiplier × SD

Upper bound = Mean + Multiplier × SD

The values that fall outside of this tolerance lane are considered outliers.

After finding the outliers in our data we would correct them. For doing the correction of outliers there are various methods, those are as shown in the given image.

- No Correction By using this method, outliers are not corrected. We do use outliers only for information purposes.
- Correction with Mean The system replaces outliers with the average of all key figure values calculated for the historical periods.
- Correction with Mean Excluding Outliers The system replaces outliers with the mean of all key figure values calculated for the historical periods and doesn’t consider the outliers for the calculation. This is the default outlier correction method.
- Correction with Tolerance The system recalculates the tolerance range and changes the outliers so that they are at the limits of the new tolerance range.
- Correction with Tolerance Excluding Outliers The system recalculates the tolerance range without considering the outliers and changes the outliers so that they are at the limits of the new tolerance range.
- Correction with Median The system replaces outliers with the median of all key figure values calculated for the historical periods.
- Correction with Median Excluding Outliers The system replaces outliers with the median of all key figure values calculated for the historical periods and doesn’t consider the outliers for the calculation.

Here in the given image, we have taken Actuals Quantity as the input for the algorithm, so the system will use Actuals Quantity key figure values to detect any outliers and correct them.

For reading the previous article please refer to the below link

https://www.walsystems.com/history_cleasing.html

- SAP Integrated Business Planning for Supply Chain. (SAP Help Portal).
- https://help.sap.com/viewer/feae3cea3cc549aaa9d9de7d363a83e6/2108/en-US/603f8154c2652357e10000000a44176d.html

Amol Khomne

SAP IBP Consultant, Baranwal Consultancy and Services (BCS), Pune, Maharashtra, India