To do any sort of business planning, you need forecasting. Our reason for existing is to help our customers drive success from their data, so around two years ago we began to research adding automated forecasting to our cloud analytics platform.
This did, however, present a huge challenge. Take the typical lifecycle of a forecasting project. It’s an iterative process of refining a model, generating a forecast and evaluating the results to understand how to further refine the model. This cycle is repeated until you are happy with the results.
Evaluating the results and refining the model are usually manual tasks. They need an analyst with a good understanding of the model and the business or process that generates the data. That’s a pretty specific skillset.
That’s fine if you are creating one forecast for one organisation. We, however, want to scale our business rapidly, taking on lots of different customers. It’s not practical for us to recruit large numbers of highly skilled analysts in a short space of time. So, we need to automate as much of the process as we can.
We looked at a number of solutions. An obvious first choice, as we already use AWS, was the Amazon AI platform SageMaker, which includes DeepAR. DeepAR is an ML forecasting model based on a recurrent neural network with long short-term memory (LSTM) cells, which allows it to model trends and seasonality. Importantly for us, the platform offers automatic model tuning. So, you run your data through the model numerous times, and it automatically selects hyperparameters to improve the forecast.
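To make the idea of automatic tuning concrete, here is a minimal sketch. It is not SageMaker's tuner and it does not use DeepAR: as a stand-in it uses simple exponential smoothing, which has a single hyperparameter (`alpha`), and a plain random search. The sales figures, function names and search range are all assumptions made up for illustration.

```python
import random


def ses_forecast(series, alpha):
    """One-step-ahead forecasts from simple exponential smoothing."""
    level = series[0]
    forecasts = []
    for actual in series[1:]:
        forecasts.append(level)  # forecast is made before seeing `actual`
        level = alpha * actual + (1 - alpha) * level
    return forecasts


def mean_absolute_error(actuals, forecasts):
    return sum(abs(a - f) for a, f in zip(actuals, forecasts)) / len(actuals)


def tune_alpha(series, n_trials=50, seed=0):
    """Random search: try n_trials values of alpha and keep the best one."""
    rng = random.Random(seed)
    best_alpha, best_error = None, float("inf")
    for _ in range(n_trials):
        alpha = rng.uniform(0.01, 0.99)
        error = mean_absolute_error(series[1:], ses_forecast(series, alpha))
        if error < best_error:
            best_alpha, best_error = alpha, error
    return best_alpha, best_error


# Made-up weekly sales figures, for illustration only.
sales = [100, 104, 99, 110, 115, 108, 120, 125, 119, 130]
best_alpha, best_error = tune_alpha(sales)
print(f"best alpha: {best_alpha:.2f}, error: {best_error:.2f}")
```

The loop is the same "run the data through the model numerous times" cycle described above, with the analyst's judgement replaced by an error score. A managed tuning service would typically search many hyperparameters at once and use a smarter search strategy than random guessing.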
We also looked at Facebook Prophet. Unlike DeepAR, it’s a statistical model, similar to the generalised additive model (GAM). It copes particularly well with trends, seasonality and holidays, each of which is an additive component of the model. It’s also open source, and available in Python or R, so it was easy for us to integrate into our platform.
Finally, we also considered Amazon Forecast. It is a fully managed service on AWS: you upload your data and it uses automated machine learning (AutoML) to choose and refine an appropriate model. It includes a growing number of algorithms, including ARIMA, DeepAR+ (an enhancement of DeepAR), Prophet, ETS and NPTS.
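The additive structure can be sketched in a few lines. This is not Prophet itself, which fits each component from the data. Here the three components are hard-coded with made-up numbers (the starting level, growth rate, amplitude and holiday uplift are all assumptions) purely to show how they sum to give a forecast.

```python
import math


def trend(day, start_level=1000.0, daily_growth=0.5):
    """Linear trend: sales grow by a fixed amount each day."""
    return start_level + daily_growth * day


def yearly_seasonality(day, amplitude=200.0):
    """A single smooth yearly cycle."""
    return amplitude * math.sin(2 * math.pi * day / 365.25)


def holiday_effect(day, holidays):
    """Extra sales on specific days; zero on every other day."""
    return holidays.get(day, 0.0)


def additive_forecast(day, holidays):
    """The forecast is simply the sum of the three components."""
    return trend(day) + yearly_seasonality(day) + holiday_effect(day, holidays)


# Hypothetical uplift on day 358 of the series (around Christmas Eve).
holidays = {358: 1500.0}
print(additive_forecast(0, holidays))    # trend only: seasonality is zero at day 0
print(additive_forecast(358, holidays))  # trend + seasonality + holiday uplift
```

Because the components are simply added together, each one can be inspected or adjusted on its own, which is a large part of what makes this kind of model easy to configure.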
We began the project using real-world data from one of our existing customers, a jewellery retailer. We chose them primarily because they were enthusiastic to take part, but also because their data was particularly challenging to forecast. The nature of their sales means the data is noisy: the outliers are often high-value sales that occur randomly, yet account for a large proportion of the value, so they cannot be discounted. The sales are also highly seasonal, with the 5 weeks leading up to Christmas accounting for as much as 50% of the sales.
We trained the models on three years of data, from 2015 to 2017, then tested the resulting forecasts against the real-world results for 2018. For comparison we had the manual forecasts the company had used at the time, but unfortunately these were only detailed enough for the fairly rudimentary measure of the total difference over 12 months.
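A hold-out test like this comes down to splitting the history by date, so the model never sees the year it is scored on. The sketch below shows one way to do that. The record layout (a date paired with a sales value), the function name and the figures are assumptions for illustration, not the customer's data.

```python
from datetime import date


def split_by_year(records, last_training_year):
    """Split (date, value) records into training and test sets by year."""
    train = [(d, v) for d, v in records if d.year <= last_training_year]
    test = [(d, v) for d, v in records if d.year > last_training_year]
    return train, test


# Made-up records, one per year, for illustration only.
records = [
    (date(2015, 12, 1), 120.0),
    (date(2016, 12, 1), 135.0),
    (date(2017, 12, 1), 150.0),
    (date(2018, 12, 1), 160.0),
]

train, test = split_by_year(records, last_training_year=2017)
print(len(train), "training records,", len(test), "test record")
```

Splitting on time rather than at random matters for forecasting: a random split would let the model train on data from after the period it is being tested on.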
On total error over 12 months, all the methods we tested significantly outperformed the manual forecast. The best results were generated by Prophet, with a total error of under 300k (out of annual sales of over 100 million). This compared to an error of over 15 million with the manual forecast. However, using more meaningful measures, root mean squared error (RMSE) and mean absolute error (MAE), DeepAR gave us the most accurate results.
In the 9 months since we conducted this research, we’ve integrated automated forecasting into our platform. We initially chose to use Prophet, even though DeepAR was arguably more accurate. Our reasoning was that Prophet was significantly easier to configure, and less resource hungry to train. Given the relatively small increase in accuracy DeepAR gave us, the ease with which we could scale the solution with Prophet won the day. In more recent months, with the added complication of forecasting during the Covid-19 crisis, we've begun using a combination of Prophet and LightGBM, with some success.
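The reason total error is a rudimentary measure is that over-forecasts and under-forecasts cancel each other out, whereas MAE and RMSE score every period separately. A minimal sketch with made-up numbers (not the retailer's figures) shows the difference:

```python
import math


def total_error(actuals, forecasts):
    """Absolute difference between the summed forecast and summed actuals."""
    return abs(sum(forecasts) - sum(actuals))


def mae(actuals, forecasts):
    """Mean absolute error: the average size of the per-period miss."""
    return sum(abs(f - a) for a, f in zip(actuals, forecasts)) / len(actuals)


def rmse(actuals, forecasts):
    """Root mean squared error: like MAE, but penalises large misses more."""
    squared = [(f - a) ** 2 for a, f in zip(actuals, forecasts)]
    return math.sqrt(sum(squared) / len(squared))


# Every period is out by 10, but the misses alternate in sign.
actuals = [10.0, 20.0, 30.0, 40.0]
forecasts = [20.0, 10.0, 40.0, 30.0]

print(total_error(actuals, forecasts))  # 0.0
print(mae(actuals, forecasts))          # 10.0
print(rmse(actuals, forecasts))         # 10.0
```

A forecast that is wrong in every period can still score a perfect total error, which is why a model can win on the 12-month total yet lose on RMSE and MAE.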
It has been fascinating to see how quickly this field has developed in the last year. The proliferation of available models, if anything, has gathered pace. We’ve continued to test different algorithms and techniques, and the level of automation and accuracy of our forecasting increases almost daily. We are already much closer to the goal of completely automating forecasting for industry than we thought possible 12 months ago. Who knows where we’ll be in another 12 months!
Business planning can be a challenge at the best of times, but during a period of unprecedented disruption like the current Covid-19 crisis, it’s particularly so. Not only is the disruption on a scale none of us have seen in our lifetimes, it’s unpredictable in the extreme. Governments are changing rules with massive repercussions almost weekly (most of us would agree with good reason), and the estimates of when we might return to something like normal vary alarmingly.
So, we’ve been working with some of our customers to build forecasts to guide their business planning during the crisis. We’ve used data from the UK’s Office for National Statistics' Data Science Lab to help model the fall in business activity due to the crisis, and a number of estimates as to when lockdown will end to model the return to normal activity.
The huge advantage of automated forecasting is that we can re-generate forecasts as often as we like. So, as the situation develops and new information becomes available, we can adjust our models and re-generate the forecasts. It means our customers can respond quickly and effectively to changing circumstances, which is hugely beneficial in such uncertain times.
We’ve seen first-hand how the current crisis is proving a massive challenge for our customers and how automated forecasting can help. So, as far as our own costs allow, we’re offering a few UK businesses free access to our cloud analytics and forecasting platform for a few months to help ride out the crisis.
Our resources are limited, and we will put in some of our own time and effort to help organisations get on board, so we need to limit this offer to a few companies we think we can make the most difference with. The software works best for medium or large organisations that have enough data to make forecasting accurate.
If you think this could help your organisation, please just get in touch at https://inmydata.datapa.com/about/contact-us
Like most people in the IT industry of a similar age (close enough to 50 to see the hairs poking out its nostrils), I’ve seen a lot of change. That said, it’s hard to think of any technology that has the potential to change so many working practices and industries as the current developments in machine learning and data analytics. For an analytics vendor like DataPA, staying at the forefront of such developments is a constant (but exhilarating) challenge.
Bringing fresh, new ideas into the business
It’s one of the reasons we at DataPA often take on students for a summer placement. It not only brings new ideas into the business, but also makes long-term recruitment of fresh talent much easier. Our experience this summer was a perfect example of why embracing new talent is always a good idea.
For many customers on inmydata*, our cloud analytics platform, forecasting is a critical requirement for their business. For instance, our retail customers need sales forecasts to set targets and deliver stock planning. So, we scoped out an R&D project to discover whether we could use machine learning to automate forecasting for our customers.
Luckily for us, The Data Lab runs a fantastic MSc placement project in Scotland for data science, data engineering and analytics that includes a 10 to 12 week industrial placement. Through that programme we engaged with the brilliant MBN Academy, who within a week matched us with Callum, a Data Engineering MSc student from Dundee University.
Delivering automated forecasts with improved accuracy
Things progressed well over the summer. For some of us in the business it took some frantic googling to keep up. RNN, LSTM, DeepAR and Prophet!? At least we’d heard of Facebook. There was, however, no arguing with the results. Testing against data from previous years, the models Callum built could predict annual sales within 2% of the actual figure. That compared with a 15% margin of error for the manual forecasts the customer had used in practice. To put that into context, the manual forecast was over £15 million out on sales of just over £100 million, whereas the automated forecast was less than £300,000 out.
Accurate automatic forecasting on demand
Since Callum finished his dissertation with us, we’ve been integrating the models he developed into our cloud platform. We can now deliver accurate forecast models on demand, automatically building rolling forecasts on a weekly basis. It’s a development that not only offers huge opportunity to our existing customers, but also allows us to compete much more aggressively in the crowded analytics market space.
Taking part in The Data Lab’s MSc Placement Project this summer has been fantastic. MBN Academy made it utterly painless to recruit Callum. In the short time he was with us he was a pleasure to work with and delivered real benefit for both DataPA and our customers.
Find out more about The Data Lab’s MSc programme and how you can get involved in offering a placement opportunity.
* inmydata is an innovative cloud analytics solution. It delivers everything you need to set the right goals for your organisation, and then measure your progress in achieving them.