Thursday, 10 June 2021 13:48

Piecewise Linear Trend with Automated Time Series Forecasting (APL)

Written by Marc DANIAU
Source https://blogs.sap.com/2021/06/11/piecewise-linear-trend-with-automated-time-series-forecasting-apl/
“© 2020. SAP SE or an SAP affiliate company. All rights reserved.” “Used with permission of SAP SE”

If you are a user of APL time series, you have probably seen models fitting a linear trend or a quadratic trend to your data. With version 2113, the Automated Predictive Library introduces an additional method called Piecewise Linear that can detect breakpoints in your series. You don’t have to do anything new to take advantage of this functionality: the trend is detected automatically, as shown in the example below.
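To make the idea of a breakpoint concrete, here is a minimal sketch of a piecewise linear trend with a single breakpoint. It does not reproduce APL's internal detection method, which this article does not describe. It simply tries every candidate breakpoint on a synthetic series and keeps the continuous two-segment fit with the smallest squared error. The function name `fit_one_breakpoint` and the data are hypothetical, chosen only for illustration.

```python
import numpy as np

def fit_one_breakpoint(t, y):
    """Brute-force search for one breakpoint of a continuous piecewise linear trend.

    Illustrative only: this is not the algorithm used by APL.
    """
    best = None
    # Candidate breakpoints: interior points, leaving a couple of points on each side
    for b in t[2:-2]:
        # Columns: intercept, base slope, and a hinge term that changes the slope after b
        X = np.column_stack([np.ones_like(t), t, np.maximum(t - b, 0.0)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = float(np.sum((X @ coef - y) ** 2))
        if best is None or sse < best[0]:
            best = (sse, float(b), coef)
    return best[1], best[2]

# Synthetic series: slope -1.0 until t = 30, then slope -0.1 (made-up values)
t = np.arange(60, dtype=float)
y = 200.0 - 1.0 * t + 0.9 * np.maximum(t - 30.0, 0.0)

bp, coef = fit_one_breakpoint(t, y)
print("Breakpoint found at t =", bp)
print("Intercept, slope, slope change:", coef)
```

The third coefficient is the change in slope at the breakpoint, so the slope after the breakpoint is the sum of the second and third coefficients.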

For SAP Analytics Cloud users, note that Piecewise Linear Trend is coming with the 2021.Q3 QRC (August release).

This article presents two ways of using APL: a Python notebook and a SQL script. Let’s start with Python.

First, we connect to the HANA database.

    # Connect using the HANA secure user store
    from hana_ml import dataframe as hd
    conn = hd.ConnectionContext(userkey='MLMDA_KEY')

You may want to check that the Automated Predictive Library on your HANA server is recent enough.

    import hana_ml.algorithms.apl.apl_base as apl_base
    df = apl_base.get_apl_version(conn)
    v = df[df.name.eq('APL.Version.ServicePack')].iloc[0]['value']
    print('APL Version is ' + v)

Don’t forget to sort the series over time before giving it to APL.

    sql_cmd = 'SELECT * FROM "APL_SAMPLES"."BOSTON_MARATHON" ORDER BY "THE_DATE"'
    hdf_in = hd.DataFrame(conn, sql_cmd)
    df = hdf_in.collect()
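If you want to double-check the ordering on the client side, a small pandas test is enough. The sketch below uses made-up rows rather than the real marathon data, and the helper name `is_ready_for_forecasting` is hypothetical. It only verifies that the time column is strictly increasing; it is not something APL requires you to run.

```python
import pandas as pd

# Made-up rows, deliberately out of order (not real marathon results)
sample = pd.DataFrame({
    "THE_DATE": pd.to_datetime(["2003-01-01", "2001-01-01", "2002-01-01"]),
    "WINNING_TIME": [130.0, 129.0, 131.0],
})

def is_ready_for_forecasting(frame, time_col):
    """Return True when the time column is sorted in increasing order without duplicates."""
    dates = frame[time_col]
    return bool(dates.is_monotonic_increasing and dates.is_unique)

print(is_ready_for_forecasting(sample, "THE_DATE"))

# Sorting on the time column fixes the ordering
sorted_sample = sample.sort_values("THE_DATE").reset_index(drop=True)
print(is_ready_for_forecasting(sorted_sample, "THE_DATE"))
```

With the data collected from HANA, the same helper could be applied to the local pandas frame and its THE_DATE column.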

This is what the series looks like.

    import matplotlib.pyplot as plt
    plt.figure(figsize=(12,5))
    plt.plot(df.THE_DATE, df.WINNING_TIME)
    plt.title('Boston Marathon Winning Time')
    plt.xlabel('Year')
    plt.ylabel('Men Times in minutes')
    plt.grid()
    plt.show()

We ask APL to build a time series model and make a forecast 3 years ahead.

    from hana_ml.algorithms.apl.time_series import AutoTimeSeries
    model = AutoTimeSeries(time_column_name='THE_DATE', target='WINNING_TIME', horizon=3)
    hdf_out = model.fit_predict(hdf_in)

And then we display the forecasted values.

    import pandas as pd
    df = hdf_out.collect()
    df['THE_DATE'] = pd.to_datetime(df['THE_DATE'])
    df = df.set_index('THE_DATE')
    plt.figure(figsize=(12,5))
    ax1 = df.ACTUAL.plot(color='royalblue', label='Actual')
    ax2 = df.PREDICTED.plot(color='darkorange', label='Forecast', linestyle='dashed')
    h1, l1 = ax1.get_legend_handles_labels()
    plt.legend(h1, l1, loc=1)
    plt.title('Boston Marathon Winning Time')
    plt.xlabel('Year')
    plt.ylabel('Men Times in minutes')
    plt.grid()
    plt.show()

The forecast line shows a piecewise trend with two breakpoints. This is confirmed by the components information.

    d = model.get_model_components()
    components_df = pd.DataFrame(list(d.items()), columns=["Component", "Value"])
    components_df.style.hide_index()

We display some forecasting accuracy indicators from our APL model.

    import numpy as np
    d = model.get_performance_metrics()
    # Average each indicator across the horizon time window
    apm = []
    for k, v in d.items():
        apm.append((k, np.mean(v)))
    # Look up each averaged indicator by its name
    metric = [m for m in apm if m[0] == 'L1'][0][1]
    print("MAE is {:0.3f}".format(metric))
    metric = [m for m in apm if m[0] == 'MAPE'][0][1] * 100
    print("MAPE is {:0.2f}%".format(metric))
    metric = [m for m in apm if m[0] == 'L2'][0][1]
    print("RMSE is {:0.3f}".format(metric))
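The code above reports the APL indicators L1, MAPE and L2 as MAE, MAPE and RMSE. For readers who want a reminder of what these three measures mean, here is a minimal sketch of their textbook definitions in NumPy, applied to made-up numbers. Whether APL computes its indicators in exactly this way is an assumption; the sketch only illustrates the standard formulas.

```python
import numpy as np

def mae(actual, predicted):
    """Mean absolute error."""
    return float(np.mean(np.abs(actual - predicted)))

def mape(actual, predicted):
    """Mean absolute percentage error, returned as a fraction (0.05 means 5%)."""
    return float(np.mean(np.abs((actual - predicted) / actual)))

def rmse(actual, predicted):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Made-up values, for illustration only
actual = np.array([100.0, 200.0, 400.0])
predicted = np.array([110.0, 180.0, 400.0])

print("MAE is {:0.3f}".format(mae(actual, predicted)))
print("MAPE is {:0.2f}%".format(mape(actual, predicted) * 100))
print("RMSE is {:0.3f}".format(rmse(actual, predicted)))
```

MAPE is a relative measure, which is why it is multiplied by 100 before being printed as a percentage, while MAE and RMSE are expressed in the unit of the target (here, minutes).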
Continue reading here