The client is a global pharma major whose finance function generates 24-month look-ahead forecasts for 15,000+ Operational Expenditure (OpEx) time series across ~500 cost centres in different countries and multiple currencies. However, forecasting was entirely manual, which made the process laborious, time-consuming, error-prone and non-standardized. The finance team therefore wanted a machine-learning solution that would generate these forecasts automatically and with a high degree of accuracy.
INXITE OUT APPROACH
As-Is Assessment and EDA
We collaborated with the client to understand the existing forecasting process, the key drivers of the OpEx flows and their variability. We defined mechanisms for uniquely identifying each time series, along with methods for mapping between the actuals, the forecasts and the drivers. A detailed EDA was then undertaken to understand the behaviour of the different OpEx flows and the impact of business events such as mergers, demergers and the introduction of new cost centres.
Time series selection, pre-processing and a validation strategy were then defined. Univariate models such as SARIMA and ETS were benchmarked against the manual forecasts to obtain a baseline accuracy.
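The benchmarking step can be sketched as a simple holdout validation. This is a minimal illustration, not the project's actual pipeline: it swaps in a seasonal-naive forecast as a stand-in for SARIMA or ETS, uses a synthetic monthly series, and assumes mean absolute percentage error as the accuracy measure. The names `seasonal_naive`, `mape` and `series` are hypothetical.

```python
import numpy as np

def seasonal_naive(history, horizon, season=12):
    """Repeat the last observed seasonal cycle over the forecast horizon."""
    last_cycle = np.asarray(history, dtype=float)[-season:]
    reps = int(np.ceil(horizon / season))
    return np.tile(last_cycle, reps)[:horizon]

def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs(forecast - actual) / np.abs(actual)))

# Synthetic monthly OpEx series (assumption): 60 months, linear trend plus yearly seasonality.
months = np.arange(60)
series = 1000 + 5 * months + 100 * np.sin(2 * np.pi * months / 12)

# Hold out the last 24 months to mirror the 24-month forecast horizon.
horizon = 24
train, test = series[:-horizon], series[-horizon:]

baseline = seasonal_naive(train, horizon)
baseline_mape = mape(test, baseline)
print(f"Baseline MAPE over {horizon} months: {baseline_mape:.1%}")
```

Any candidate model, or the manual forecast itself, can be scored on the same holdout window with `mape` so that the comparison is like for like.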
Our initial approach involved creating a Multiple Linear Regression Model
Wages accounted for the largest portion of operational expenditure, so a wage model was developed first, initially using Multiple Linear Regression (MLR). The MLR-based wage model was accurate, but its interpretability, which was critical, was low. Several underlying business dynamics could not be captured in MLR: a pre-defined order in the average wages across workgroups, minimum pre-defined gaps between the average wages of wage groups, and lower and upper thresholds for each wage group. Ordinary regression methods do not allow inequality constraints on the coefficients, so insufficient data (in both volume and variation) and imperfections (in both data and model) led to incorrect and unexplainable coefficients.
Hence, we developed a Constrained Regression Model to incorporate the business constraints
The business dynamics that could not be captured in MLR were incorporated using a constrained regression model. They were converted into a set of inequality constraints on range, non-negativity and ordered percentage gaps. The regression was then formulated as an optimization problem that minimizes the errors subject to those constraints, and the Trust Region Reflective algorithm was used to carry out the optimization.
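One way to express such constraints for a Trust Region Reflective solver, which accepts only simple bounds on the variables, is to reparameterise the coefficients. The sketch below is an illustration under stated assumptions, not the model built for the client: the data is synthetic, the gaps are absolute rather than percentage gaps, and only the lowest wage group has an explicit range. It uses `scipy.optimize.lsq_linear` with `method="trf"`; the names `headcount`, `wage_bill` and `min_gap` are hypothetical.

```python
import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)

# Synthetic data (assumption): headcount per wage group for 12 observations,
# and a noisy total wage bill generated from ordered average wages.
n_obs, n_groups = 12, 3
headcount = rng.integers(5, 50, size=(n_obs, n_groups)).astype(float)
true_avg_wage = np.array([30.0, 33.0, 36.0])
wage_bill = headcount @ true_avg_wage + rng.normal(0, 150, size=n_obs)

# Unconstrained least squares, for comparison; nothing forces its coefficients to be ordered.
ols_wage, *_ = np.linalg.lstsq(headcount, wage_bill, rcond=None)

# Reparameterise: wage_k = base + d_1 + ... + d_k.
# Ordering and minimum gaps then become simple lower bounds on the increments d_j.
C = np.tril(np.ones((n_groups, n_groups)))  # cumulative-sum matrix
A = headcount @ C

min_gap = 2.0                               # minimum gap between adjacent groups
lower = np.array([20.0, min_gap, min_gap])  # base wage >= 20, each increment >= min_gap
upper = np.array([40.0, np.inf, np.inf])    # base wage <= 40, increments unbounded above

res = lsq_linear(A, wage_bill, bounds=(lower, upper), method="trf")
avg_wage = C @ res.x                        # map back to per-group average wages

print("Unconstrained:", np.round(ols_wage, 2))
print("Constrained:  ", np.round(avg_wage, 2))
```

Because the ordering is built into the parameterisation, every feasible solution respects it, which keeps the fitted coefficients explainable even when the data is sparse or noisy. Upper thresholds on the higher groups are linear inequalities across several variables and would need a solver that supports general linear constraints.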
Figure 1: End-to-End Model Approach
Read more about the Constrained Regression model in our blog Interpretable AI For Better Operationalization
- Aggregated Percentage Error of ~1% across all cost centres, and a 10% reduction in Absolute Percentage Error compared to manual forecasts.
- An automated OpEx forecasting business process was established, leading to an 80%+ reduction in manual effort.
- Improved predictability of operating expenses due to higher accuracy.
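The two error measures quoted in the results can behave quite differently, which the following sketch illustrates. The definitions here are assumptions, since the case study does not spell them out: the aggregated error compares totals across cost centres, while the absolute error averages the per-centre percentage errors. The figures are made up for illustration.

```python
import numpy as np

def aggregated_percentage_error(actual, forecast):
    """Error of the total forecast against the total actual (assumed definition)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(abs(forecast.sum() - actual.sum()) / actual.sum())

def mean_absolute_percentage_error(actual, forecast):
    """Average of the per-cost-centre absolute percentage errors (assumed definition)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs(forecast - actual) / np.abs(actual)))

# Made-up actuals and forecasts for three cost centres.
actual = [100.0, 200.0, 300.0]
forecast = [110.0, 190.0, 303.0]

agg = aggregated_percentage_error(actual, forecast)      # 3 / 600 = 0.5%
mape = mean_absolute_percentage_error(actual, forecast)  # (10% + 5% + 1%) / 3
print(f"Aggregated: {agg:.2%}, mean absolute: {mape:.2%}")
```

Over- and under-forecasts cancel in the aggregated figure but not in the absolute one, so the two are best read together.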